Journal Article

Multi-dimensional correlations for gene coexpression and application to the large-scale data of Arabidopsis

Kengo Kinoshita and Takeshi Obayashi

in Bioinformatics

Volume 25, issue 20, pages 2677-2684
Published in print October 2009 | ISSN: 1367-4803
Published online July 2009 | e-ISSN: 1460-2059 | DOI:


Background: Recent improvements in DNA microarray techniques have made a large variety of gene expression data available in public databases. These data can be used to evaluate the strength of gene coexpression by calculating the correlation of expression patterns between genes across many experiments. However, gene expression levels differ significantly across tissues in higher organisms, as well as across cellular locations and cell states in eukaryotes. Thus the usual correlation measure can only evaluate differences among tissues or cellular localizations, and cannot adequately elucidate functional relationships from the coexpression of genes.

Method: We propose a new measure of coexpression by expanding the generally used correlation into a multidimensional one. We used principal component analysis to identify the major factors of gene expression correlation, and then recalculated the correlation after subtracting the major components in order to remove biases caused by a few experiments. The repeated subtractions of the major components yielded a set of correlation values for each pair of genes. We observed how the correlation changed when the first ten principal components were subtracted step by step in large-scale Arabidopsis expression data.
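
The subtraction procedure described in the Method can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `multidimensional_correlation`, the use of Pearson correlation, the centring of each gene across experiments, and the use of a singular value decomposition to obtain the principal components are all assumptions made for illustration. The input is assumed to be a genes-by-experiments matrix of expression values.

```python
import numpy as np


def multidimensional_correlation(expr, n_components=10):
    """Correlation matrices after removing 0, 1, ..., n_components components.

    expr: array of shape (n_genes, n_experiments) (assumed layout).
    Returns an array of shape (n_components + 1, n_genes, n_genes); slice 0
    is the ordinary Pearson correlation between gene profiles.
    """
    expr = np.asarray(expr, dtype=float)

    # Centre each gene's profile across experiments (assumed preprocessing).
    centered = expr - expr.mean(axis=1, keepdims=True)

    # SVD of the centred matrix; each (u[:, k], s[k], vt[k]) triple is one
    # principal component, ordered from largest to smallest variance.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)

    # Never ask for more components than the decomposition provides.
    n_components = min(n_components, len(s))

    residual = centered.copy()
    correlations = [np.corrcoef(residual)]
    for k in range(n_components):
        # Subtract the k-th rank-one component, then recompute correlations.
        residual = residual - np.outer(u[:, k] * s[k], vt[k])
        correlations.append(np.corrcoef(residual))

    return np.stack(correlations)
```

For a gene pair (i, j), the vector `result[:, i, j]` is then the set of correlation values obtained as successive components are removed. In this sketch, a pair whose correlation stays high along that vector would correspond to what the abstract calls stable coexpression, and a pair whose correlation collapses after the first subtractions to fragile coexpression; the thresholds for such a classification are not given in the abstract and are left open here.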

Results: We found two extreme patterns of correlation changes, corresponding to stable and fragile coexpression. Our new indexes provided a good means of determining the functional relationships of genes, as shown by the examination of a few examples and by the higher performance of Gene Ontology term prediction using a support vector machine with the multidimensional correlation.

Availability: The results are available from the expression detail pages in ATTED-II (


Supplementary information: Supplementary data are available at Bioinformatics online.

Journal Article.  5827 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology
