Journal Article

On the beta-binomial model for analysis of spectral count data in label-free tandem mass spectrometry-based proteomics

Thang V. Pham, Sander R. Piersma, Marc Warmoes and Connie R. Jimenez

in Bioinformatics

Volume 26, issue 3, pages 363-369
Published in print February 2010 | ISSN: 1367-4803
Published online December 2009 | e-ISSN: 1460-2059 | DOI:

Motivation: Spectral count data generated from label-free tandem mass spectrometry-based proteomic experiments can be used to quantify protein abundances reliably. Comparing spectral count data from different sample groups, such as control and disease, is an essential step in statistical analysis for determining altered protein levels and for biomarker discovery. Fisher's exact test, the G-test, the t-test and the local-pooled-error (LPE) technique are commonly used for differential analysis of spectral count data. However, our initial experiments in two cancer studies show that these methods fail to declare as significant, at the 95% confidence level, a number of protein markers that have been judged to be differential on the basis of the biology of the disease and the spectral count numbers. A shortcoming of these tests is that they do not account for within- and between-sample variation together. Hence, our aim is to improve upon existing techniques by incorporating both sources of variation.

Results: We propose using the beta-binomial distribution to test the significance of differential protein abundance expressed as spectral counts in label-free mass spectrometry-based proteomics. The beta-binomial test naturally normalizes for total sample count. Experimental results show that the beta-binomial test performs favorably in comparison with other methods on several datasets in terms of both true detection rate and false positive rate. In addition, it can be applied to experiments with one or more replicates and to multiple-condition comparisons. Finally, we have implemented a software package for parameter estimation of two beta-binomial models and the associated statistical tests.
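
To make the idea concrete, the following is a minimal Python sketch of how a beta-binomial comparison of two sample groups could be set up. It is not taken from the authors' R package, and several choices are assumptions made purely for illustration: the mean/dispersion parameterization, a dispersion shared by both groups, a crude grid search in place of a proper optimizer, a likelihood-ratio statistic with a chi-square (1 degree of freedom) reference distribution, and the function names `betabinom_logpmf`, `group_loglik`, `log_grid` and `betabinom_lr_test`. Each sample contributes its protein spectral count x together with the sample's total spectral count n, so differences in total count enter the likelihood directly. The beta component captures between-sample variation and the binomial component captures within-sample variation.

```python
import math

def betabinom_logpmf(x, n, alpha, beta):
    """Log P(X = x) for X ~ BetaBinomial(n, alpha, beta)."""
    log_choose = math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
    log_beta_num = (math.lgamma(x + alpha) + math.lgamma(n - x + beta)
                    - math.lgamma(n + alpha + beta))
    log_beta_den = math.lgamma(alpha) + math.lgamma(beta) - math.lgamma(alpha + beta)
    return log_choose + log_beta_num - log_beta_den

def group_loglik(counts, totals, mean, dispersion):
    """Log-likelihood of one group of samples.

    Assumed parameterization: mean = alpha / (alpha + beta) and
    dispersion = 1 / (alpha + beta + 1), with dispersion in (0, 1).
    """
    s = 1.0 / dispersion - 1.0              # s = alpha + beta
    alpha, beta = mean * s, (1.0 - mean) * s
    return sum(betabinom_logpmf(x, n, alpha, beta) for x, n in zip(counts, totals))

def log_grid(lo, hi, k):
    """k log-spaced values from lo to hi."""
    return [lo * (hi / lo) ** (i / (k - 1)) for i in range(k)]

def betabinom_lr_test(group_a, group_b, n_mean=200, n_disp=40):
    """Likelihood-ratio test: one common mean versus one mean per group.

    Each group is a pair (counts, totals). Parameters are fitted by a
    crude grid search (an illustrative stand-in for a real optimizer).
    """
    means = log_grid(1e-5, 0.5, n_mean)
    best_null = best_alt = -math.inf
    for d in log_grid(1e-6, 0.5, n_disp):
        ll_a = [group_loglik(*group_a, m, d) for m in means]
        ll_b = [group_loglik(*group_b, m, d) for m in means]
        # Null model: both groups share the same mean.
        best_null = max(best_null, max(a + b for a, b in zip(ll_a, ll_b)))
        # Alternative model: each group has its own mean.
        best_alt = max(best_alt, max(ll_a) + max(ll_b))
    stat = max(0.0, 2.0 * (best_alt - best_null))
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

As a usage illustration with made-up numbers, `betabinom_lr_test(([2, 3, 1], [10000, 12000, 9000]), ([30, 25, 35], [10000, 11000, 9500]))` compares three control samples against three disease samples and returns the test statistic and an approximate p-value. With so few replicates the chi-square approximation is rough, so the result should be read as indicative only.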

Availability and implementation: A software package implemented in R is freely available for download at


Supplementary information: Supplementary data are available at Bioinformatics online.

Journal Article.  4451 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology
