Journal Article

Latent class models for joint analysis of disease prevalence and high-dimensional semicontinuous biomarker data

Bo Zhang, Zhen Chen and Paul S. Albert

in Biostatistics

Volume 13, issue 1, pages 74-88
Published in print January 2012 | ISSN: 1465-4644
Published online September 2011 | e-ISSN: 1468-4357 | DOI:

High-dimensional biomarker data are often collected in epidemiological studies to assess the association between biomarkers and human disease. We develop a latent class modeling approach for joint analysis of high-dimensional semicontinuous biomarker data and a binary disease outcome. To model the relationship between complex biomarker expression patterns and disease risk, we use latent risk classes to link the two modeling components. We characterize complex biomarker-specific differences through biomarker-specific random effects, so that different biomarkers can have different baseline (low-risk) values as well as different between-class differences. The proposed approach also accommodates data features that are common in environmental toxicology and other biomarker exposure data, including a large number of biomarkers, numerous zero values, and a complex mean–variance relationship in the biomarker levels. A Monte Carlo EM (MCEM) algorithm is proposed for parameter estimation. Both the MCEM algorithm and the model selection procedures are shown to work well in simulations and applications. In applying the proposed approach to an epidemiological study that examined the relationship between environmental polychlorinated biphenyl (PCB) exposure and the risk of endometriosis, we identified a highly significant overall effect of PCB concentrations on the risk of endometriosis.
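
The data structure described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' model specification. It assumes two latent risk classes, a log-normal distribution for the nonzero biomarker values, and a logistic form for the probability of a nonzero value. The function name `simulate` and every parameter value are hypothetical and chosen only for illustration.

```python
import math
import random


def expit(x):
    """Logistic function mapping a real number to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def simulate(n_subjects=200, n_biomarkers=30, seed=0):
    """Simulate latent classes, a binary disease outcome, and
    semicontinuous biomarkers (illustrative assumptions only)."""
    rng = random.Random(seed)

    # Hypothetical values, not estimates from the paper.
    p_high_risk = 0.4        # P(latent class = high risk)
    p_disease = (0.1, 0.6)   # P(disease | low risk), P(disease | high risk)

    # Biomarker-specific random effects: each biomarker gets its own
    # baseline (low-risk) level and its own between-class difference.
    baseline = [rng.gauss(0.0, 1.0) for _ in range(n_biomarkers)]
    class_shift = [rng.gauss(0.5, 0.3) for _ in range(n_biomarkers)]

    classes, disease, biomarkers = [], [], []
    for _ in range(n_subjects):
        # The latent class links the disease model and the biomarker model.
        c = 1 if rng.random() < p_high_risk else 0
        d = 1 if rng.random() < p_disease[c] else 0

        row = []
        for j in range(n_biomarkers):
            mean = baseline[j] + class_shift[j] * c
            # Two-part (semicontinuous) value: a point mass at zero,
            # otherwise a positive log-normal draw.
            p_nonzero = expit(0.5 + 0.5 * mean)
            if rng.random() < p_nonzero:
                row.append(math.exp(rng.gauss(mean, 0.5)))
            else:
                row.append(0.0)

        classes.append(c)
        disease.append(d)
        biomarkers.append(row)

    return classes, disease, biomarkers


if __name__ == "__main__":
    classes, disease, biomarkers = simulate()
    n_zero = sum(v == 0.0 for row in biomarkers for v in row)
    print("subjects:", len(biomarkers), "zero values:", n_zero)
```

In this sketch the disease outcome and the biomarkers are conditionally independent given the latent class, which is one common way to link two modeling components through a shared latent variable. Fitting such a model, for example by Monte Carlo EM, is not shown here.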

Keywords: Categorical data; Chemical exposure biomarkers; Latent variables; Monte Carlo EM algorithm; Random effects

Journal Article.  5621 words.  Illustrated.

Subjects: Probability and Statistics