Journal Article

Genetic algorithms for simultaneous variable and sample selection in metabonomics

Rachel Cavill, Hector C. Keun, Elaine Holmes, John C. Lindon, Jeremy K. Nicholson and Timothy M. D. Ebbels

in Bioinformatics

Volume 25, issue 1, pages 112-118
Published in print January 2009 | ISSN: 1367-4803
Published online November 2008 | e-ISSN: 1460-2059 | DOI:


Motivation: Metabolic profiles derived from high-resolution 1H-NMR data are complex, so statistical and machine learning approaches are vital for extracting useful information and biological insights. Focused modelling on targeted subsets of metabolites and samples can improve the predictive ability of models, and techniques such as genetic algorithms (GAs) have proven utility in feature selection problems. The Consortium for Metabonomic Toxicology (COMET) obtained temporal NMR spectra of urine from rats treated with model toxins and stressors. Here, we develop a GA approach that simultaneously selects sets of samples and spectral regions from the COMET database to build robust, predictive classifiers of liver and kidney toxicity.
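As an illustration of what simultaneous selection can look like, the following minimal Python sketch encodes one GA chromosome as a variable mask followed by a sample mask and scores it by validation accuracy. It is a sketch under stated assumptions, not the authors' implementation: the nearest-centroid classifier, the tournament selection, the single-point crossover, and every name and parameter (`centroid_accuracy`, `run_ga`, `pop_size`, `mutation_rate`) are hypothetical choices made for this example only.

```python
import random


def centroid_accuracy(train, labels, valid, valid_labels, var_mask, samp_mask):
    """Validation accuracy of a nearest-centroid classifier built only from
    the selected training samples (rows) and variables (columns)."""
    cols = [j for j, keep in enumerate(var_mask) if keep]
    rows = [i for i, keep in enumerate(samp_mask) if keep]
    classes = sorted(set(labels[i] for i in rows))
    if not cols or len(classes) < 2:
        return 0.0  # degenerate selection: no variables, or fewer than two classes
    centroids = {}
    for c in classes:
        members = [train[i] for i in rows if labels[i] == c]
        centroids[c] = [sum(m[j] for m in members) / len(members) for j in cols]
    correct = 0
    for x, y in zip(valid, valid_labels):
        pred = min(
            classes,
            key=lambda c: sum((x[j] - centroids[c][k]) ** 2 for k, j in enumerate(cols)),
        )
        correct += pred == y
    return correct / len(valid)


def run_ga(train, labels, valid, valid_labels,
           pop_size=30, generations=40, mutation_rate=0.02, seed=0):
    """Evolve one chromosome holding both masks: [variables | samples]."""
    rng = random.Random(seed)
    n_vars, n_samples = len(train[0]), len(train)
    length = n_vars + n_samples

    def fitness(chrom):
        return centroid_accuracy(train, labels, valid, valid_labels,
                                 chrom[:n_vars], chrom[n_vars:])

    population = [[rng.random() < 0.5 for _ in range(length)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        next_pop = scored[:2]  # elitism: carry the two best forward unchanged
        while len(next_pop) < pop_size:
            # Tournament selection of two parents (assumed operator).
            p1 = max(rng.sample(scored, 3), key=fitness)
            p2 = max(rng.sample(scored, 3), key=fitness)
            cut = rng.randrange(1, length)  # single-point crossover
            child = p1[:cut] + p2[cut:]
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            next_pop.append(child)
        population = next_pop
    best = max(population, key=fitness)
    return best[:n_vars], best[n_vars:], fitness(best)
```

Because both masks live in one chromosome, a single fitness evaluation reflects the joint effect of the chosen samples and variables, which is the property that distinguishes simultaneous selection from running the two selections separately.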

Results: Simultaneous sample and variable selection improved performance by over 9% compared with either method alone, and also halved computation time. Successful classifiers repeatedly selected particular variables, indicating that this approach can aid in defining biomarkers of toxicity. Novel visualizations of the results from multiple computations were developed to help interpret which samples and variables were frequently selected. This method provides an efficient way to determine the most discriminatory variables and samples for any post-genomic dataset.
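The idea of variables being repeatedly selected across computations can be made concrete with a simple tally. The sketch below assumes each GA run yields a boolean selection mask; the function name `selection_frequency` and the mask representation are hypothetical and are not taken from the article.

```python
from collections import Counter


def selection_frequency(masks):
    """Fraction of runs in which each position (variable or sample) was selected.

    `masks` is a list of equal-length boolean lists, one per GA run.
    """
    counts = Counter()
    for mask in masks:
        counts.update(i for i, keep in enumerate(mask) if keep)
    n_runs = len(masks)
    return [counts[i] / n_runs for i in range(len(masks[0]))]
```

Positions with a frequency near 1 across many independent runs would be the natural candidates to inspect first, for example as potential biomarker regions.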

Availability: GA code available from


Supplementary information: Supplementary data are available at Bioinformatics online.

Journal Article.  5118 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology
