Journal Article

Improving phylogenetic analyses by incorporating additional information from genetic sequence databases

Li-Jung Liang, Robert E. Weiss, Benjamin Redelings and Marc A. Suchard

in Bioinformatics

Volume 25, issue 19, pages 2530-2536
Published in print October 2009 | ISSN: 1367-4803
Published online August 2009 | e-ISSN: 1460-2059 | DOI:
Improving phylogenetic analyses by incorporating additional information from genetic sequence databases

More Like This

Show all results sharing this subject:

  • Bioinformatics and Computational Biology


Show Summary Details


Motivation: Statistical analyses of phylogenetic data culminate in uncertain estimates of underlying model parameters. Lack of additional data hinders the ability to reduce this uncertainty, as the original phylogenetic dataset is often complete, containing the entire gene or genome information available for the given set of taxa. Informative priors in a Bayesian analysis can reduce posterior uncertainty; however, publicly available phylogenetic software specifies vague priors for model parameters by default. We build objective and informative priors using hierarchical random effect models that combine additional datasets whose parameters are not of direct interest but are similar to the analysis of interest.

Results: We propose principled statistical methods that permit more precise parameter estimates in phylogenetic analyses by creating informative priors for parameters of interest. Using additional sequence datasets from our lab or public databases, we construct a fully Bayesian semiparametric hierarchical model to combine datasets. A dynamic iteratively reweighted Markov chain Monte Carlo algorithm conveniently recycles posterior samples from the individual analyses. We demonstrate the value of our approach by examining the insertion–deletion (indel) process in the enolase gene across the Tree of Life using the phylogenetic software BALI-PHY; we incorporate prior information about indels from 82 curated alignments downloaded from the BAliBASE database.


Supplementary information: Supplementary data are available at Bioinformatics online.

Journal Article.  6088 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology

Full text: subscription required

How to subscribe Recommend to my Librarian

Users without a subscription are not able to see the full content. Please, subscribe or login to access all content.