Journal Article

Detection of Short Protein Coding Regions within the Cyanobacterium Genome: Application of the Hidden Markov Model

Tetsushi Yada and Makoto Hirosawa

in DNA Research

Published on behalf of Kazusa DNA Research Institute

Volume 3, issue 6, pages 355-361
Published in print January 1996 | ISSN: 1340-2838
Published online January 1996 | e-ISSN: 1756-1663 | DOI:

Gene-finding programs developed to date have paid little attention to the detection of short protein coding regions (CDSs), yet detecting short CDSs is important for the study of photosynthesis. We used GeneHacker, a gene-finding program based on the hidden Markov model (HMM), to detect short CDSs (90 to 300 bases) in a 1.0-megabase contiguous sequence of the cyanobacterium Synechocystis sp. strain PCC6803, which carries a complete set of genes for oxygenic photosynthesis. Unlike other HMM-based gene-finding programs, GeneHacker also utilizes di-codon statistics. GeneHacker detected seven of the eight short CDSs annotated in this sequence and was clearly superior to GeneMark in this length range. It also detected 94 potentially new CDSs, 9 of which have counterparts in the genetic databases. Four of these nine CDSs were shorter than 150 bases and were photosynthesis-related genes. The results show the effectiveness of GeneHacker in detecting very short CDSs corresponding to genes.
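To make the term "di-codon statistics" concrete, the following is a minimal Python sketch of how in-frame di-codons (pairs of adjacent codons, six bases in total) might be counted in a candidate CDS, together with a length filter for the 90 to 300 base range mentioned above. It is an illustration only and not GeneHacker's implementation: the function names `dicodon_counts` and `in_short_range` are hypothetical, and treating the length range as inclusive is an assumption, since the abstract does not specify it.

```python
from collections import Counter


def dicodon_counts(cds: str) -> Counter:
    """Count in-frame di-codons (adjacent codon pairs) in a coding sequence.

    Hypothetical helper for illustration; reading frame is assumed to
    start at the first base of `cds`.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # drop any trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # Pair each codon with its successor to form overlapping di-codons.
    return Counter(a + b for a, b in zip(codons, codons[1:]))


def in_short_range(cds: str, lo: int = 90, hi: int = 300) -> bool:
    """Return True if the sequence length falls in the short-CDS range.

    Inclusive bounds are an assumption, not stated in the abstract.
    """
    return lo <= len(cds) <= hi
```

Counts such as these could then be normalized into frequencies and compared between coding and non-coding sequence; how GeneHacker actually incorporates them into its HMM is not described in the abstract.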

Keywords: Cyanobacterium; gene finding; hidden Markov model; short protein coding region; oxygenic photosynthesis genes

Subjects: Genetics and Genomics
