Journal Article

Genes from Nine Genomes Are Separated into Their Organisms in the Dinucleotide Composition Space

Hiroshi Nakashima, Motonori Ota, Ken Nishikawa and Tatsuo Ooi

in DNA Research

Published on behalf of Kazusa DNA Research Institute

Volume 5, issue 5, pages 251-259
Published in print January 1998 | ISSN: 1340-2838
Published online January 1998 | e-ISSN: 1756-1663 | DOI:



The compositions of all 16 dinucleotides were used to analyze the protein-encoding nucleotide sequences in nine complete genomes: Escherichia coli, Haemophilus influenzae, Helicobacter pylori, Mycoplasma genitalium, Mycoplasma pneumoniae, Synechocystis sp., Methanococcus jannaschii, Archaeoglobus fulgidus, and Saccharomyces cerevisiae. The dinucleotide composition differed significantly between organisms. The genes of each organism clustered around that organism's center in the dinucleotide composition space. Genes from closely related organisms, such as the Gram-negative bacteria, the mycoplasma species, and the eukaryotes, showed some overlap in this space. Using the dinucleotide composition alone, the genes from the nine complete genomes, together with those from human, were discriminated into their respective clusters with 80% accuracy. The composition estimated from a whole genome was close to that obtained from its genes, indicating that the characteristic dinucleotide feature holds not only for protein-coding regions but also for noncoding regions. When a dendrogram was constructed from the disposition of the clusters in the dinucleotide space, it resembled the real phylogenetic tree. Thus, the distinct features observed in the dinucleotide composition may reflect the phylogenetic relationships among organisms.
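
The idea of placing each gene in a 16-dimensional dinucleotide composition space and assigning it to the nearest organism cluster can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not state how dinucleotides were counted or which discrimination method was used, so the overlapping-window counting, the Euclidean distance, the nearest-centroid rule, and all function names and toy sequences here are assumptions.

```python
from itertools import product
from math import dist

# The 16 dinucleotides in a fixed order: AA, AC, AG, ..., TT.
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_composition(seq):
    """Return the 16 dinucleotide frequencies of a sequence.

    Assumption: dinucleotides are counted in overlapping windows, and
    windows containing characters other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no countable dinucleotides")
    return [counts[d] / total for d in DINUCLEOTIDES]


def centroid(vectors):
    """Component-wise mean of a list of composition vectors (a cluster center)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def nearest_centroid(vector, centroids):
    """Return the label whose centroid is closest in Euclidean distance.

    Assumption: a nearest-centroid rule stands in for the unspecified
    discrimination method.
    """
    return min(centroids, key=lambda label: dist(vector, centroids[label]))


if __name__ == "__main__":
    # Toy sequences, invented for illustration only (not real genes).
    training = {
        "organism_A": ["ATATTTAA", "TTATAATA"],
        "organism_B": ["GCGCCGGG", "CCGCGGCG"],
    }
    centers = {
        label: centroid([dinucleotide_composition(s) for s in seqs])
        for label, seqs in training.items()
    }
    query = dinucleotide_composition("ATTATAAT")
    print(nearest_centroid(query, centers))
```

With real data, each organism's center would be computed from all of its genes, and the fraction of genes assigned back to their own organism would give an accuracy figure comparable in kind to the one reported.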

Keywords: separation of genes; dinucleotide frequency; phylogenetic tree


Subjects: Genetics and Genomics
