Journal Article

Correlating multiple SNPs and multiple disease phenotypes: penalized non-linear canonical correlation analysis

Sandra Waaijenborg and Aeilko H. Zwinderman

in Bioinformatics

Volume 25, issue 21, pages 2764-2771
Published in print November 2009 | ISSN: 1367-4803
Published online August 2009 | e-ISSN: 1460-2059 | DOI: http://dx.doi.org/10.1093/bioinformatics/btp491
Correlating multiple SNPs and multiple disease phenotypes: penalized non-linear canonical correlation analysis

More Like This

Show all results sharing this subject:

  • Bioinformatics and Computational Biology

GO

Show Summary Details

Preview

Motivation: Canonical correlation analysis (CCA) can be used to capture the underlying genetic background of a complex disease, by associating two datasets containing information about a patient's phenotypical and genetic details. Often the genetic information is measured on a qualitative scale, consequently ordinary CCA cannot be applied to such data. Moreover, the size of the data in genetic studies can be enormous, thereby making the results difficult to interpret.

Results: We developed a penalized non-linear CCA approach that can deal with qualitative data by transforming each qualitative variable into a continuous variable through optimal scaling. Additionally, sparse results were obtained by adapting soft-thresholding to this non-linear version of the CCA. By means of simulation studies, we show that our method is capable of extracting relevant variables out of high-dimensional sets. We applied our method to a genetic dataset containing 144 patients with glial cancer.

Contact: s.waaijenborg@amc.uva.nl

Journal Article.  6472 words.  Illustrated.

Subjects: Bioinformatics and Computational Biology

Full text: subscription required

How to subscribe Recommend to my Librarian

Users without a subscription are not able to see the full content. Please, subscribe or login to access all content.