Phylogenetic clustering of protein sequences using recurrence quantification analysis
A. Yadav, , M. Kale, U. Kulkarni-Kale
Published in
Volume: 19
Issue: 5
Pages: 1336 - 1339
Molecular phylogeny analysis (MPA) investigates differences in molecular sequences to infer the evolutionary relationships of organisms or their bio-macromolecules. Current MPA methods fall into two categories: distance-based and character-based. All of these methods require multiple sequence alignment (MSA) as a prerequisite and are usually followed by bootstrap analysis. MSA is efficient in terms of time, computation and memory only when the sequences are few and short; MSA of whole proteomes is computationally intensive and time-consuming. This paper proposes a new alignment-free approach to MPA using Recurrence Quantification Analysis (RQA). Each protein sequence is converted into a numeric sequence by assigning a unique real-valued score to each amino acid, and various RQA features are extracted from the numeric sequence. These features are used as coordinates for calculating the distance between sequences with an appropriate distance function. The resulting distance matrix is used for clustering with the Neighbor-Joining method. The time required for clustering and tree construction is observed to be significantly lower than with alignment-based algorithms. As a test case, the method is applied to the clustering of 59 polyprotein sequences of the family Flaviviridae; the resulting phylogenetic tree was about 95% accurate, with only 3 of the 59 sequences misclassified. Thus, the proposed method has the potential to be a reasonable alternative to existing MPA methods. © 2013 American Scientific Publishers. All rights reserved.
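The pipeline described in the abstract (numeric encoding, RQA feature extraction, pairwise distances) can be illustrated with a minimal Python sketch. The abstract does not specify the amino acid scores, the embedding parameters, the RQA features, or the distance function, so everything concrete here is an assumption made for illustration: the `SCORE` table is a hypothetical placeholder with evenly spaced unique values, the embedding uses `dim=3`, `delay=1` and threshold `eps=0.1`, the features are recurrence rate, determinism and mean diagonal line length, and the distance is Euclidean. It is not the authors' implementation.

```python
import numpy as np

# Hypothetical scoring scheme: one unique real value per amino acid.
# The paper's actual scores are not given in the abstract.
SCORE = {aa: i / 19 for i, aa in enumerate("ACDEFGHIKLMNPQRSTVWY")}

def encode(seq):
    """Map a protein sequence to a numeric series; unknown residues are skipped."""
    return np.array([SCORE[a] for a in seq if a in SCORE], dtype=float)

def recurrence_matrix(x, dim=3, delay=1, eps=0.1):
    """Time-delay embed x, then mark pairs of embedded points within eps
    (Chebyshev distance) as recurrent."""
    n = len(x) - (dim - 1) * delay
    emb = np.stack([x[i * delay: i * delay + n] for i in range(dim)], axis=1)
    d = np.max(np.abs(emb[:, None, :] - emb[None, :, :]), axis=2)
    return (d <= eps).astype(int)

def diagonal_line_lengths(R):
    """Lengths of runs of recurrent points on each diagonal above the main one."""
    lengths = []
    for k in range(1, R.shape[0]):
        run = 0
        for v in np.diagonal(R, offset=k):
            if v:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return np.array(lengths, dtype=int)

def rqa_features(seq, lmin=2, **kw):
    """Return [recurrence rate, determinism, mean diagonal line length]."""
    R = recurrence_matrix(encode(seq), **kw)
    n = R.shape[0]
    lines = diagonal_line_lengths(R)
    total = lines.sum()                      # recurrent points, upper triangle
    rr = 2 * total / (n * (n - 1)) if n > 1 else 0.0
    long_lines = lines[lines >= lmin]
    det = long_lines.sum() / total if total else 0.0
    mean_len = long_lines.mean() if long_lines.size else 0.0
    return np.array([rr, det, mean_len])

def distance_matrix(seqs, **kw):
    """Euclidean distances between the RQA feature vectors of the sequences."""
    F = np.array([rqa_features(s, **kw) for s in seqs])
    diff = F[:, None, :] - F[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))
```

The matrix returned by `distance_matrix` is symmetric with a zero diagonal, so it could be passed to any Neighbor-Joining implementation to build the tree. In practice the features would likely need scaling before distances are computed, since the mean line length is not bounded to [0, 1] the way the other two features are.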