Journal Article

Fewer permutations, more accurate P-values

Theo A. Knijnenburg, Lodewyk F. A. Wessels, Marcel J. T. Reinders and Ilya Shmulevich

in Bioinformatics

Volume 25, issue 12, pages i161-i168
Published in print June 2009 | ISSN: 1367-4803
Published online May 2009 | e-ISSN: 1460-2059 | DOI: http://dx.doi.org/10.1093/bioinformatics/btp211

Motivation: Permutation tests have become a standard tool to assess the statistical significance of an event under investigation. The statistical significance, as expressed in a P-value, is calculated as the fraction of permutation values that are at least as extreme as the original statistic, which was derived from non-permuted data. This empirical method directly couples both the minimal obtainable P-value and the resolution of the P-value to the number of permutations. Consequently, a very large number of permutations is needed when small P-values are to be accurately estimated. This is computationally expensive and often infeasible.
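
The empirical calculation described above can be illustrated with a minimal Python sketch. The test statistic (a difference in group means), the toy data, and the function names are assumptions made purely for illustration and do not come from the article. With N permutations, the resulting P-value is always a multiple of 1/N, so no nonzero value smaller than 1/N can be reported.

```python
import random
from statistics import mean


def empirical_p_value(observed, permuted_values):
    """Fraction of permutation values at least as extreme as `observed`."""
    n_extreme = sum(1 for v in permuted_values if v >= observed)
    return n_extreme / len(permuted_values)


def permute_mean_difference(group_a, group_b, n_permutations, seed=0):
    """Permutation values of a hypothetical statistic: mean(a) - mean(b)."""
    rng = random.Random(seed)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    values = []
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # randomly reassign the group labels
        values.append(mean(pooled[:n_a]) - mean(pooled[n_a:]))
    return values


# Toy data, invented for this example only.
group_a = [5.1, 4.9, 6.2, 5.8, 6.0]
group_b = [4.2, 4.8, 4.1, 5.0, 4.4]

observed = mean(group_a) - mean(group_b)
perms = permute_mean_difference(group_a, group_b, n_permutations=1000)
p = empirical_p_value(observed, perms)

# The resolution is 1/N: with N = 1000, p can only be 0, 0.001, 0.002, ...
resolution = 1 / len(perms)
print(p, resolution)
```

Halving the smallest reportable nonzero P-value therefore requires doubling the number of permutations, which is the cost the article sets out to reduce.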

Results: A method of computing P-values based on tail approximation is presented. The tail of the distribution of permutation values is approximated by a generalized Pareto distribution. A good fit and thus accurate P-value estimates can be obtained with a drastically reduced number of permutations when compared with the standard empirical way of computing P-values.
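
The tail approximation idea can be sketched as follows. This is not the authors' Matlab code: the method-of-moments fit, the default of 250 exceedances, the midpoint threshold, and all function names are assumptions chosen to keep the example short and free of dependencies. The sketch fits a generalized Pareto distribution (GPD) to the excesses over a threshold and scales the fitted tail probability by the fraction of permutation values above that threshold.

```python
import math
import random
from statistics import mean, variance


def fit_gpd_moments(excesses):
    """Method-of-moments GPD fit (an assumed, simple estimator).

    Uses mean = scale / (1 - shape) and
    variance = scale**2 / ((1 - shape)**2 * (1 - 2 * shape)).
    """
    m = mean(excesses)
    ratio = m * m / variance(excesses)
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * m * (1.0 + ratio)
    return shape, scale


def gpd_survival(z, shape, scale):
    """P(Z > z) for a GPD with the given shape and scale, z measured from 0."""
    if z <= 0.0:
        return 1.0
    if abs(shape) < 1e-12:
        return math.exp(-z / scale)  # exponential limit as shape -> 0
    base = 1.0 + shape * z / scale
    if base <= 0.0:
        return 0.0  # z lies beyond the finite endpoint (negative shape)
    return base ** (-1.0 / shape)


def tail_p_value(observed, permuted_values, n_exceedances=250):
    """Estimate a P-value from a GPD fitted to the largest permutation values."""
    ordered = sorted(permuted_values, reverse=True)
    if len(ordered) <= n_exceedances:
        raise ValueError("need more permutation values than exceedances")
    # Threshold halfway between the last exceedance and the next value.
    threshold = 0.5 * (ordered[n_exceedances - 1] + ordered[n_exceedances])
    if observed <= threshold:
        # Not in the tail: fall back to the empirical fraction.
        return sum(1 for v in ordered if v >= observed) / len(ordered)
    excesses = [v - threshold for v in ordered[:n_exceedances]]
    shape, scale = fit_gpd_moments(excesses)
    tail_fraction = n_exceedances / len(ordered)
    return tail_fraction * gpd_survival(observed - threshold, shape, scale)


# Stand-in "permutation values": standard normal draws, for illustration only.
rng = random.Random(0)
values = [rng.gauss(0.0, 1.0) for _ in range(10000)]
p_tail = tail_p_value(4.0, values)
print(p_tail)
```

In this sketch, an observed statistic that lies beyond most or all of the permutation values can still receive an estimate below 1/N, whereas the empirical fraction would be stuck at zero or a small multiple of 1/N. How trustworthy the estimate is depends on how well the GPD fits the tail, which should be checked before it is relied on.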

Availability: The Matlab code can be obtained from the corresponding author on request.

Contact: tknijnenburg@systemsbiology.org

Supplementary information: Supplementary data are available at Bioinformatics online.

Journal Article.  6151 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology
