Journal Article

Detecting non-orthology in the COGs database and other approaches grouping orthologs using genome-specific best hits

Christophe Dessimoz, Brigitte Boeckmann, Alexander C. J. Roth and Gaston H. Gonnet

in Nucleic Acids Research

Volume 34, issue 11, pages 3309-3316
Published in print July 2006 | ISSN: 0305-1048
Published online July 2006 | e-ISSN: 1362-4962 | DOI:


Correct orthology assignment is a critical prerequisite for numerous comparative genomics procedures, such as function prediction, construction of phylogenetic species trees and genome rearrangement analysis. We present an algorithm for detecting non-orthologs that are mistakenly grouped together by current orthology classification methods based on genome-specific best hits, such as the COGs database. The algorithm works with pairwise distance estimates rather than with computationally expensive and error-prone tree-building methods. Its accuracy is evaluated by verifying the distribution of predicted cases, by case-by-case phylogenetic analysis and by comparison with predictions from other projects that use independent methods. Our results show that a substantial fraction of the COG groups include non-orthologs: using conservative parameters, the algorithm detects non-orthology in a third of all COG groups. Consequently, sequence analyses that are sensitive to correct orthology assignments will benefit greatly from these findings.
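
The abstract does not spell out the test itself, so the following is only an illustrative sketch of how pairwise distances alone, without building a tree, could flag a pair of genes as non-orthologous. It assumes a "witness" criterion: if a third genome holds two genes z1 and z2 such that x is clearly closer to z1 while y is clearly closer to z2, then x and y probably descend from different copies of an ancient duplication. The function name `has_witness`, the distance dictionary layout and the fixed margin `tol` are hypothetical choices made for this example; in particular, the fixed margin stands in for whatever test of significance the authors apply.

```python
from itertools import permutations


def has_witness(distances, x, y, third_genome, tol=0.0):
    """Return True if the third genome provides evidence that x and y are
    not orthologs (illustrative criterion, not the published algorithm).

    distances    -- dict mapping frozenset({a, b}) to an estimated
                    evolutionary distance between genes a and b
    x, y         -- gene identifiers from two different genomes
    third_genome -- iterable of gene identifiers from a third genome
    tol          -- margin by which one distance must exceed the other
                    (assumed stand-in for a proper significance test)
    """
    def dist(a, b):
        return distances[frozenset((a, b))]

    # Try every ordered pair (z1, z2) of distinct genes in the third genome.
    for z1, z2 in permutations(third_genome, 2):
        x_prefers_z1 = dist(x, z1) + tol < dist(x, z2)
        y_prefers_z2 = dist(y, z2) + tol < dist(y, z1)
        if x_prefers_z1 and y_prefers_z2:
            return True
    return False
```

A pair of true orthologs would be expected to prefer the same gene in the third genome, so no witness would be found for them; a larger `tol` makes the check more conservative.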

Journal Article.  5548 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology
