Journal Article

A comparative study of survival models for breast cancer prognostication based on microarray data: does a single gene beat them all?

B. Haibe-Kains, C. Desmedt, C. Sotiriou and G. Bontempi

in Bioinformatics

Volume 24, issue 19, pages 2200-2208
Published in print October 2008 | ISSN: 1367-4803
Published online July 2008 | e-ISSN: 1460-2059 | DOI:

Motivation: Survival prediction of breast cancer (BC) patients independently of treatment, also known as prognostication, is a complex task since clinically similar breast tumors, in addition to being molecularly heterogeneous, may exhibit different clinical outcomes. In recent years, the analysis of gene expression profiles by means of sophisticated data mining tools has emerged as a promising technology to bring additional insights into BC biology and to improve the quality of prognostication. The aim of this work is to assess quantitatively the accuracy of prediction obtained with state-of-the-art data analysis techniques for BC microarray data through an independent and thorough framework.

Results: Due to the large number of variables, the small number of samples and the high degree of noise, complex prediction methods are highly exposed to performance degradation despite the use of cross-validation techniques. Our analysis shows that the most complex methods are not significantly better than the simplest one, a univariate model relying on a single proliferation gene. This result suggests that proliferation might be the most relevant biological process for BC prognostication and that the loss of interpretability deriving from the use of overcomplex methods may not be sufficiently counterbalanced by an improvement in the quality of prediction.
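
As an illustration only: one common way to score a single-gene survival model is a concordance index, the fraction of comparable patient pairs in which the patient with the higher risk score has the shorter survival time. The abstract does not say which accuracy measures the study uses, so the choice of metric here is an assumption. The function name `concordance_index`, the toy data and the simplified handling of tied times are likewise hypothetical. The sketch is written in plain Python and is not the survcomp API.

```python
from itertools import combinations


def concordance_index(times, events, scores):
    """Harrell-style concordance index for right-censored survival data.

    times  : follow-up time per patient
    events : 1 if the event was observed, 0 if the patient was censored
    scores : risk score per patient (higher means higher predicted risk),
             e.g. the expression level of a single gene
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let a be the patient with the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Simplification: skip tied times, and skip pairs where the
        # earlier patient was censored (their ordering is unknown).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if scores[a] > scores[b]:
            concordant += 1.0   # higher risk, earlier event: concordant
        elif scores[a] == scores[b]:
            concordant += 0.5   # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data (made up for illustration, not taken from the study).
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
gene_expression = [9.1, 7.5, 8.0, 5.2]

c_index = concordance_index(times, events, gene_expression)
print(round(c_index, 3))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, so under this metric a single-gene score and a complex multivariate score can be compared on the same scale.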

Availability: The comparison study is implemented in an R package called survcomp and is available from


Supplementary information: Supplementary data are available at Bioinformatics online.

Journal Article.  7641 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology
