Journal Article

Model-based prediction of sequence alignment quality

Virpi Ahola, Tero Aittokallio, Mauno Vihinen and Esa Uusipaikka

in Bioinformatics

Volume 24, issue 19, pages 2165-2171
Published in print October 2008 | ISSN: 1367-4803
Published online August 2008 | e-ISSN: 1460-2059 | DOI:
Model-based prediction of sequence alignment quality

More Like This

Show all results sharing this subject:

  • Bioinformatics and Computational Biology


Show Summary Details


Motivation: Multiple sequence alignment (MSA) is an essential prerequisite for many sequence analysis methods and valuable tool itself for describing relationships between protein sequences. Since the success of the sequence analysis is highly dependent on the reliability of alignments, measures for assessing the quality of alignments are highly requisite.

Results: We present a statistical model-based alignment quality score. Unlike other quality scores, it does not require several parallel alignments for the same set of sequences or additional structural information. Our quality score is based on measuring the conservation level of reference alignments in Homstrad. Reference sequences were realigned with the Mafft, Muscle and Probcons alignment programs, and a sum-of-pairs (SP) score was used to measure the quality of the realignments. Statistical modelling of the SP score as a function of conservation level and other alignment characteristics makes it possible to predict the SP score for any global MSA. The predicted SP scores are highly correlated with the correct SP scores, when tested on the Homstrad and SABmark databases. The results are comparable to that of multiple overlap score (MOS) and better than those of normalized mean distance (NorMD) and normalized iRMSD (NiRMSD) alignment quality criteria. Furthermore, the predicted SP score is able to detect alignments with badly aligned or unrelated sequences.

Availability: The method is freely available at


Supplementary information: Supplementary data are available at Bioinformatics online.

Journal Article.  6158 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology

Full text: subscription required

How to subscribe Recommend to my Librarian

Users without a subscription are not able to see the full content. Please, subscribe or login to access all content. subscribe or login to access all content.