Journal Article

Gene Frequency Distributions Reject a Neutral Model of Genome Evolution

Alexander E. Lobkovsky, Yuri I. Wolf and Eugene V. Koonin

in Genome Biology and Evolution

Published on behalf of Society for Molecular Biology and Evolution

Volume 5, issue 1, pages 233-242
Published in print January 2013 | Published online January 2013 | e-ISSN: 1759-6653 | DOI:

Evolution of prokaryotes involves extensive loss and gain of genes, which lead to substantial differences in the gene repertoires even among closely related organisms. Through a wide range of phylogenetic depths, gene frequency distributions in prokaryotic pangenomes bear a characteristic, asymmetrical U-shape, with a core of (nearly) universal genes, a “shell” of moderately common genes, and a “cloud” of rare genes. We employ mathematical modeling to investigate evolutionary processes that might underlie this universal pattern. Gene frequency distributions for almost 400 groups of 10 bacterial or archaeal species each over a broad range of evolutionary distances were fit to steady-state, infinite allele models based on the distribution of gene replacement rates and the phylogenetic tree relating the species in each group. The fits of the theoretical frequency distributions to the empirical ones yield model parameters and estimates of the goodness of fit. Using the Akaike Information Criterion, we show that the neutral model of genome evolution, with the same replacement rate for all genes, can be confidently rejected. Of the three tested models with purifying selection, the one in which the distribution of replacement rates is derived from a stochastic population model with additive per-gene fitness yields the best fits to the data. The selection strength estimated from the fits declines with evolutionary divergence while staying well outside the neutral regime. These findings indicate that, unlike some other universal distributions of genomic variables, for example, the distribution of paralogous gene family membership, the gene frequency distribution is substantially affected by selection.
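The model comparison described above relies on the Akaike Information Criterion, AIC = 2k - 2 ln L, where k is the number of fitted parameters and L is the maximized likelihood. The following is a minimal sketch of such a comparison, not the authors' implementation. The log-likelihoods, parameter counts, and model names are hypothetical placeholders chosen for illustration and are not results from the article.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L (lower is better)."""
    return 2.0 * n_params - 2.0 * log_likelihood


def akaike_weights(aic_values):
    """Relative support for each model, normalized to sum to 1."""
    best = min(aic_values)
    rel = [math.exp(-0.5 * (a - best)) for a in aic_values]
    total = sum(rel)
    return [r / total for r in rel]


# Hypothetical (maximized log-likelihood, number of parameters) pairs.
# These are illustrative values only, not fits reported in the article.
fits = {
    "neutral": (-120.0, 1),
    "selection": (-100.0, 2),
}

scores = {name: aic(ll, k) for name, (ll, k) in fits.items()}
weights = dict(zip(scores, akaike_weights(list(scores.values()))))
best_model = min(scores, key=scores.get)

for name in scores:
    print(f"{name}: AIC = {scores[name]:.1f}, weight = {weights[name]:.3g}")
print("preferred model:", best_model)
```

With these placeholder inputs, the extra parameter of the selection model is penalized, but the improvement in likelihood outweighs the penalty, so the model with selection is preferred. A large AIC difference of this kind is what allows a simpler model to be rejected with confidence.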

Keywords: gene frequency distribution; steady genome model; goodness of fit; evolution mechanisms

Journal Article.  5895 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology ; Evolutionary Biology ; Genetics and Genomics