Journal Article

Spectral analysis of two-signed microarray expression data

Desmond J. Higham, Gabriela Kalna and J. Keith Vass

in Mathematical Medicine and Biology: A Journal of the IMA

Published on behalf of Institute of Mathematics and its Applications

Volume 24, issue 2, pages 131-148
Published in print June 2007 | ISSN: 1477-8599
Published online June 2007 | e-ISSN: 1477-8602 | DOI: http://dx.doi.org/10.1093/imammb/dql030

We give a simple and informative derivation of a spectral algorithm for clustering and reordering complementary DNA microarray expression data. Here, expression levels of a set of genes are recorded simultaneously across a number of samples, with a positive weight reflecting up-regulation and a negative weight reflecting down-regulation. We give theoretical support for the algorithm based on a biologically justified hypothesis about the structure of the data, and illustrate its use on public domain data in the context of unsupervised tumour classification. The algorithm is derived by considering a discrete optimization problem and then relaxing to the continuous realm. We prove that in the case where the data have an inherent ‘checkerboard’ sign pattern, the algorithm will automatically reveal that pattern. Further, our derivation shows that the algorithm may be regarded as imposing a random graph model on the expression levels and then clustering from a maximum likelihood perspective. This indicates that the output will be tolerant to perturbations and will reveal ‘near-checkerboard’ patterns when these are present in the data. It is interesting to note that the checkerboard structure is revealed by the first (dominant) singular vectors—previous work on spectral methods has focussed on the case of nonnegative edge weights, where only the second and higher singular vectors are relevant. We illustrate the algorithm on real and synthetic data, and then use it in a tumour classification context on three different cancer data sets. Our results show that respecting the two-signed nature of the data (thereby distinguishing between up-regulation and down-regulation) reveals structures that cannot be gleaned from the absolute value data (where up- and down-regulation are both regarded as ‘changes’).
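As a rough illustration of the central claim, that the first (dominant) singular vectors reveal a checkerboard sign pattern, the following minimal sketch builds a synthetic two-signed matrix and reorders it. It assumes a plain singular value decomposition with no normalization of the data; the abstract does not specify the paper's exact scaling or post-processing, so this should not be read as the authors' implementation. The function name `checkerboard_split`, the matrix sizes and the weight range are hypothetical choices made for the example.

```python
import numpy as np


def checkerboard_split(W):
    """Split genes (rows) and samples (columns) of a two-signed matrix W
    using its dominant singular vector pair.

    Assumption: plain SVD, no normalization of W (not taken from the paper).
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    u, v = U[:, 0], Vt[0, :]      # first (dominant) left/right singular vectors
    row_order = np.argsort(u)     # gene reordering: negative entries first
    col_order = np.argsort(v)     # sample reordering: negative entries first
    return u, v, row_order, col_order


# Synthetic data with an exact checkerboard sign pattern (hypothetical sizes).
rng = np.random.default_rng(0)
gene_group = rng.integers(0, 2, size=40).astype(bool)
sample_group = rng.integers(0, 2, size=12).astype(bool)

# Positive weight where gene and sample groups agree, negative otherwise.
signs = np.where(gene_group[:, None] == sample_group[None, :], 1.0, -1.0)
W = signs * rng.uniform(0.5, 1.5, size=signs.shape)

u, v, row_order, col_order = checkerboard_split(W)

# Reordered matrix: like-signed entries are gathered into four blocks.
W_reordered = W[np.ix_(row_order, col_order)]

# The sign of each entry should match the sign of the rank-one outer product.
pattern_recovered = np.array_equal(np.sign(np.outer(u, v)), np.sign(W))
print(pattern_recovered)
```

For an exact checkerboard, the signs of `u` and `v` split the genes and samples into their two groups, up to a simultaneous sign flip of both vectors. Running the same function on `np.abs(W)` gives singular vectors of a single sign, which is consistent with the abstract's remark that the two-signed structure cannot be gleaned from absolute value data.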

Keywords: bioinformatics; cDNA; checkerboard; clustering; data mining; maximum likelihood; microarray; reordering; singular value decomposition; tumour classification; unsupervised feature extraction

Subjects: Applied Mathematics ; Biomathematics and Statistics
