Journal Article

Efficient implementation of a generalized pair hidden Markov model for comparative gene finding

W. H. Majoros, M. Pertea and S. L. Salzberg

in Bioinformatics

Volume 21, issue 9, pages 1782-1788
Published in print May 2005 | ISSN: 1367-4803
Published online February 2005 | e-ISSN: 1460-2059 | DOI: https://dx.doi.org/10.1093/bioinformatics/bti297

Motivation: The increased availability of genome sequences of closely related organisms has generated much interest in utilizing homology to improve the accuracy of gene prediction programs. Generalized pair hidden Markov models (GPHMMs) have been proposed as one means to address this need. However, all GPHMM implementations currently available are either closed-source or not fully described in the literature, which leaves a significant hurdle for others wishing to advance the state of the art in GPHMM design.

Results: We have developed an open-source GPHMM gene finder, TWAIN, which performs very well on two related Aspergillus species, A. fumigatus and A. nidulans, finding 89% of the exons and predicting 74% of the gene models exactly in a test set of 147 conserved gene pairs. We describe the implementation of this GPHMM and explicitly address the assumptions and limitations of the system. We suggest possible ways of relaxing those assumptions to improve the utility of the system without sacrificing efficiency beyond what is practical.

Availability: Available at http://www.tigr.org/software/pirate/twain/twain.html under the open-source Artistic License.

Contact: bmajoros@tigr.org

Journal Article.  6264 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology
