Journal Article

A New Database (GCD) on Genome Composition for Eukaryote and Prokaryote Genome Sequences and Their Initial Analyses

Kirill Kryukov, Kenta Sumiyama, Kazuho Ikeo, Takashi Gojobori and Naruya Saitou

in Genome Biology and Evolution

Published on behalf of Society for Molecular Biology and Evolution

Volume 4, issue 4, pages 501-512
Published in print January 2012 | Published online March 2012 | e-ISSN: 1759-6653 | DOI:

Eukaryote genomes contain many noncoding regions and are highly complex. To understand this complexity, we constructed the Genome Composition Database (GCD), which holds whole-genome composition statistics for 101 eukaryote genomes as well as more than 1,000 prokaryote genomes. Frequencies of all possible oligonucleotides of length 1 to 10 were counted for each genome, and these observed values were compared with expected values computed from the observed frequencies of oligonucleotides of length 1–4. Deviations from expected values were much larger for eukaryotes than for prokaryotes, except for fungal genomes. Mammalian genomes showed the largest deviation among animals. The results of comparison are available online at
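The comparison of observed and expected oligonucleotide counts described in the abstract can be illustrated with a short sketch. The abstract does not state the formula used for the expected values, so the code below assumes one common choice: a Markov-chain estimate built from the observed counts of oligonucleotides of length m (and m-1), with plain base frequencies when m = 1. The function names `count_kmers` and `expected_count` are hypothetical and are not taken from the GCD itself.

```python
from collections import Counter
from math import prod  # requires Python 3.8+

BASES = set("ACGT")


def count_kmers(seq, k):
    """Count overlapping k-mers, skipping any window with a non-ACGT symbol."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= BASES:
            counts[kmer] += 1
    return counts


def expected_count(word, seq, m):
    """Expected count of `word` given observed m-mer counts (assumed model).

    m == 1: independent bases, (number of windows) * product of base frequencies.
    m >= 2: Markov estimate, product of overlapping m-mer counts divided by
            the product of the interior (m-1)-mer counts.
    """
    k = len(word)
    if m > k:
        raise ValueError("m must not exceed the word length")
    cm = count_kmers(seq, m)
    if m == 1:
        total = sum(cm.values())
        if total == 0:
            return 0.0
        freq = prod(cm[b] / total for b in word)
        return max(total - k + 1, 0) * freq
    cm1 = count_kmers(seq, m - 1)
    num = prod(cm[word[i:i + m]] for i in range(k - m + 1))
    den = prod(cm1[word[i:i + m - 1]] for i in range(1, k - m + 1))
    return num / den if den else 0.0


if __name__ == "__main__":
    genome = "ACGTACGTAC"  # toy sequence, not real data
    word = "ACG"
    observed = count_kmers(genome, len(word))[word]
    for m in (1, 2):
        print(m, observed, expected_count(word, genome, m))
```

A ratio of observed to expected counts far from 1 indicates an over- or under-represented oligonucleotide under the assumed model. For real genomes, a streaming or hashed counter would be preferable to slicing strings in a Python loop.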

Keywords: GCD; oligonucleotide frequency; alignment-free sequence comparison

Journal Article.  4111 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology ; Evolutionary Biology ; Genetics and Genomics
