Journal Article

Using multi-data hidden Markov models trained on local neighborhoods of protein structure to predict residue–residue contacts

Patrik Björkholm, Pawel Daniluk, Andriy Kryshtafovych, Krzysztof Fidelis, Robin Andersson and Torgeir R. Hvidsten

in Bioinformatics

Volume 25, issue 10, pages 1264-1270
Published in print May 2009 | ISSN: 1367-4803
Published online March 2009 | e-ISSN: 1460-2059 | DOI: http://dx.doi.org/10.1093/bioinformatics/btp149

Motivation: Correct prediction of residue–residue contacts in proteins that lack good templates with known structure would take ab initio protein structure prediction a large step forward. The lack of correct contacts, and in particular long-range contacts, is considered the main reason why these methods often fail.

Results: We propose a novel hidden Markov model (HMM)-based method for predicting residue–residue contacts from protein sequences using as training data homologous sequences, predicted secondary structure and a library of local neighborhoods (local descriptors of protein structure). The library consists of recurring structural entities incorporating short-, medium- and long-range interactions and is general enough to reassemble the cores of nearly all proteins in the PDB. The method is tested on an external test set of 606 domains with no significant sequence similarity to the training set as well as 151 domains with SCOP folds not present in the training set. Considering the top 0.2 · L predictions (L=sequence length), our HMMs obtained an accuracy of 22.8% for long-range interactions in new fold targets, and an average accuracy of 28.6% for long-, medium- and short-range contacts. This is a significant performance increase over currently available methods when comparing against results published in the literature.
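The accuracy figures above are computed over the top 0.2 · L ranked predictions. As a minimal sketch of that kind of measure (not the authors' evaluation code), the Python below ranks predicted residue pairs by score, keeps the top 0.2 · L, and reports the fraction that are true contacts. The function name, the input layout, and the `min_separation` filter are illustrative assumptions; the abstract does not state how contacts are defined or which sequence separations mark short-, medium- and long-range pairs.

```python
def top_fraction_accuracy(scored_pairs, true_contacts, seq_len,
                          fraction=0.2, min_separation=0):
    """Accuracy over the top `fraction * seq_len` predicted pairs.

    scored_pairs: iterable of (i, j, score) residue-index pairs (assumed layout).
    true_contacts: set of (i, j) tuples with i < j (assumed layout).
    min_separation: hypothetical filter on |i - j|; the threshold that
        defines a contact range is not given in the abstract.
    """
    # Keep only pairs whose sequence separation meets the chosen threshold.
    eligible = [(i, j, s) for (i, j, s) in scored_pairs
                if abs(i - j) >= min_separation]
    # Rank by predicted score, highest first.
    eligible.sort(key=lambda t: t[2], reverse=True)
    # Number of predictions to evaluate: 0.2 * L by default, at least one.
    n_top = max(1, int(fraction * seq_len))
    top = eligible[:n_top]
    if not top:
        return 0.0
    # Count top-ranked pairs that appear in the set of true contacts.
    hits = sum(1 for (i, j, _) in top
               if (min(i, j), max(i, j)) in true_contacts)
    return hits / len(top)


# Toy data, invented purely for illustration.
predictions = [(1, 10, 0.9), (2, 15, 0.8), (3, 5, 0.7),
               (4, 18, 0.6), (6, 12, 0.5)]
contacts = {(1, 10), (4, 18), (6, 12)}
print(top_fraction_accuracy(predictions, contacts, seq_len=20))
```

With L = 20 the sketch evaluates the four highest-scoring pairs. Raising `min_separation` restricts the ranking to pairs that are further apart in sequence, which is one way a range-specific accuracy could be obtained.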

Availability: http://predictioncenter.org/Services/FragHMMent/

Contact: torgeir.hvidsten@plantphys.umu.se

Supplementary information: Supplementary data are available at Bioinformatics online.

Journal Article.  4892 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology
