International Journal of Information Technology & Computer Science (IJITCS)
Proteins are flexible and undergo structural rearrangements as they function. However, most protein structure alignment methods treat them as rigid bodies. In this paper, we report a linear-encoding-based method for flexible alignment of protein structures. In the first step, the method searches for aligned fragment pairs (AFPs) between two proteins and then uses a text modeling technique to create a topology string for each AFP. The topology strings of the two proteins are compared and aligned using an n-gram modeling technique based on the concept of entropy. Finally, a step-by-step algorithm builds the alignment between the two structures. The method was evaluated in a set of experiments on a dataset of proteins with macromolecular motions, and the results were compared with those of existing flexible alignment methods, e.g., FlexProt, FATCAT, and FlexSnap. The results indicate that the proposed method is efficient and applicable in comparison with these methods.
Keywords: protein structure alignment; flexible alignment; structure comparison
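The abstract compares topology strings with an n-gram model and an entropy measure but does not give the formula. As a rough illustration only, the sketch below scores one string against an add-one-smoothed n-gram model built from another and reports the average cross-entropy in bits. The function names, the smoothing scheme, and the two-letter alphabet in the usage note are assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter


def ngrams(s, n):
    """Return all overlapping substrings of length n in s."""
    return [s[i:i + n] for i in range(len(s) - n + 1)]


def cross_entropy(reference, query, n=2):
    """Average bits per n-gram of `query` under an n-gram model of `reference`.

    Hypothetical scoring function: the model is a conditional n-gram model
    with add-one (Laplace) smoothing. Lower values mean `query` looks more
    like `reference`.
    """
    alphabet_size = len(set(reference) | set(query))
    ref_grams = ngrams(reference, n)
    gram_counts = Counter(ref_grams)
    # Count each context (the first n-1 symbols) as it occurs in an n-gram.
    context_counts = Counter(g[:-1] for g in ref_grams)

    query_grams = ngrams(query, n)
    if not query_grams:
        raise ValueError("query is shorter than n")

    total_bits = 0.0
    for g in query_grams:
        p = (gram_counts[g] + 1) / (context_counts[g[:-1]] + alphabet_size)
        total_bits -= math.log2(p)
    return total_bits / len(query_grams)
```

For example, with made-up strings over a two-letter alphabet, `cross_entropy("HHHHEEEE", "HHHHEEEE")` is lower than `cross_entropy("HHHHEEEE", "EHEHEHEH")`, because the second query contains transitions that are rare or absent in the reference.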
- W. Bennett and R. Huber, “Structural and functional aspects of domain motions in proteins”, Crit. Rev. Biochem., vol. 15, pp. 291–384, 1984.
- D.J. Jacobs, A.J. Rader, L.A. Kuhn, and M.F. Thorpe, “Protein flexibility predictions using graph theory”, Proteins, vol. 44, pp. 150–165, 2001.
- L. Holm, C. Sander, “Protein structure comparison by alignment of distance matrices”, Journal of Molecular Biology, vol. 233, pp. 123-138, 1993.
- J.F. Gibrat, T. Madej, J.L. Spouge, and S.H. Bryant, “The VAST protein structure comparison method”, Biophysical Journal, vol. 72, MP298, 1997.
- I. Shindyalov and P. Bourne, “Protein structure alignment by incremental combinatorial extension (CE) of the optimal path”, Protein Engineering, vol. 11, pp. 739-747, 1998.
- E. Krissinel, K. Henrick, “Secondary-structure matching (SSM), a new tool for fast protein structure alignment in three dimensions”, Acta Crystallographica Section D: Biological Crystallography, vol. 60, pp. 2256-2268, 2004.
- Y. Zhang and J. Skolnick, “TM-align: a protein structure alignment algorithm based on the TM-score”, Nucleic Acids Research, vol. 33, pp. 2302-2309, 2005.
- B. Kolbeck, P. May, T. Schmidt-Goenner, T. Steinke, and E.W. Knapp, “Connectivity independent protein structure alignment: a hierarchical approach”, BMC Bioinformatics, vol. 7, doi:10.1186/1471-2105-7-510, 2006.
- Y. Ye and A. Godzik, “Flexible structure alignment by chaining aligned fragment pairs allowing twists”, Bioinformatics, vol. 19, pp. II246-II255, 2003.
- M. Shatsky, R. Nussinov, and H. Wolfson, “Flexible protein alignment and hinge detection”, Proteins: Structure, Function, and Bioinformatics, vol. 48, pp. 242-256, 2002.
- U. Emekli, D. Schneidman-Duhovny, H. Wolfson, R. Nussinov, and T. Haliloglu, “HingeProt: Automated Prediction of Hinges in Protein Structures”, Proteins, vol. 70, pp. 1219-1227, 2008.
- M. Shatsky, R. Nussinov, and H. Wolfson, “A method for simultaneous alignment of multiple protein structures”, Proteins: Structure, Function, and Bioinformatics, vol. 56, pp. 143-156, 2004.
- R. Mosca, T. Schneider, “RAPIDO: a web server for the alignment of protein structures in the presence of conformational changes”, Nucleic Acids Research, vol. 36, W42-W46, 2008.
- S. Salem, M. J. Zaki, and C. Bystroff, “FlexSnap: Flexible non-sequential protein structure alignment”, Algorithms for Molecular Biology, vol. 5, 2010.
- M. Carpentier, S. Brouillet, and J. Pothier, “YAKUSA: a fast structural database scanning method”, Proteins, vol. 61, pp. 137-151, 2005.
- C.H. Tung, J.W. Huang, J.M. Yang, “Kappa-alpha plot derived structural alphabet and BLOSUM-like substitution matrix for rapid search of protein structure database”, Genome Biology, vol. 8:R31, 2007.
- W.C. Lo, P.J. Huang, C.H. Chang, and P.C. Lyu, “Protein structural similarity search by Ramachandran codes”, BMC Bioinformatics, vol. 8:307, 2007.
- J. Razmara, S. Deris, S. Parvizpour, “TS-AMIR: a topology string alignment method for intensive rapid protein structure comparison”, Algorithms for Molecular Biology, vol. 7, 2012.
- A. Bogan-Marta, A. Hategan and I. Pitas, “Language engineering and information theoretic methods in protein sequence similarity studies”, Studies in Computational Intelligence, vol. 85, pp. 151-183, 2008.
- W. Kabsch, “A discussion of the solution for the best rotation to relate two sets of vectors”, Acta Crystallographica Section A: Crystal Physics, Diffraction, Theoretical and General Crystallography, vol. 34, pp. 827-828, 1978.
- D. Fischer, A. Elofsson, D. Rice, and D. Eisenberg, “Assessing the performance of fold recognition methods by means of a comprehensive benchmark”, in Pacific Symposium on Biocomputing, pp. 300–318, 1996.
- T. Madej, J.F. Gibrat, and S.H. Bryant, “Threading a database of protein cores”, Proteins, vol. 23, pp. 356–369, 1995.
- M. Gerstein and W. Krebs, “A database of macromolecular motions”, Nucleic Acids Research, vol. 26, pp. 4280-4290, 1998.