Chapter

Cluster Analysis

Steve Selvin

in Epidemiologic Analysis

Published in print August 2001 | ISBN: 9780195146189
Published online September 2009 | e-ISBN: 9780199864720 | DOI: http://dx.doi.org/10.1093/acprof:oso/9780195146189.003.0007
Preview

This chapter identifies similarities and differences among a series of human populations based on variation in genetic frequencies. Gene frequency data from twenty-six race/ethnicity groups provide a basis for classifying these groups into categories that indicate genetic similarity or dissimilarity. With similarity defined by a nearest-neighbor criterion based on Euclidean distance, two approaches to classification are possible: one based on a dendrogram and the other on the first two principal components. Both approaches produce similar results when applied to the genetic variant data, which are made up of thirteen genetic frequencies. Starting from the closest groups (U.S. whites and Germans), the description of closeness continues until the most dissimilar groups are identified (Bantu, Navaho, and Guinean). The two descriptive approaches clearly identify two and perhaps three definite clusters based on gene frequencies. The results are more qualitative (graphic) than quantitative, which is typical of many cluster analysis techniques.
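
The two approaches named in the preview can be illustrated with a minimal sketch. It assumes that the "nearest-neighbor criterion" corresponds to single-linkage agglomerative clustering, which may differ in detail from the chapter's procedure. The data are randomly generated placeholders (six hypothetical groups, thirteen frequencies each), not the chapter's gene frequency data, and the function names are illustrative only.

```python
import numpy as np

def euclidean_distances(X):
    """Pairwise Euclidean distances between the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def single_linkage(X):
    """Nearest-neighbor (single-linkage) agglomeration.

    Returns the merge history as (members_a, members_b, distance) tuples,
    which is the information a dendrogram displays.
    """
    D = euclidean_distances(X)
    clusters = {i: [i] for i in range(len(X))}  # key = smallest member index
    merges = []
    while len(clusters) > 1:
        best = None
        keys = sorted(clusters)
        for pos, a in enumerate(keys):
            for b in keys[pos + 1:]:
                # Distance between clusters = distance of their closest pair.
                d = D[np.ix_(clusters[a], clusters[b])].min()
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((tuple(clusters[a]), tuple(clusters[b]), float(d)))
        clusters[a] = clusters[a] + clusters.pop(b)
    return merges

def first_two_components(X):
    """Scores of each row of X on the first two principal components."""
    Xc = X - X.mean(axis=0)                       # center each column
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:2].T

# Placeholder data (ASSUMPTION, not from the chapter): two well-separated
# sets of three hypothetical groups, thirteen frequencies per group.
rng = np.random.default_rng(0)
freqs = np.vstack([
    rng.uniform(0.1, 0.3, size=(3, 13)),
    rng.uniform(0.6, 0.8, size=(3, 13)),
])

merges = single_linkage(freqs)
scores = first_two_components(freqs)

for members_a, members_b, dist in merges:
    print(members_a, "+", members_b, "at distance", round(dist, 3))
```

Reading the merge history from first to last mirrors the description in the preview: the closest groups join first, and the most dissimilar groups join last. Plotting the two columns of `scores` against each other gives the principal-component view of the same structure.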

Keywords: human populations; genetic frequencies; genetic variants; red blood cell systems

Chapter.  3714 words.  Illustrated.

Subjects: Public Health and Epidemiology
