Genetic association analysis under complex survey sampling: the Hispanic Community Health Study/Study of Latinos.

Title: Genetic association analysis under complex survey sampling: the Hispanic Community Health Study/Study of Latinos.
Publication Type: Publication
Year: 2014
Authors: Lin D-Y, Tao R, Kalsbeek WD, Zeng D, Gonzalez F, Fernández-Rhodes L, Graff M, Koch GG, North KE, Heiss G
Journal: Am J Hum Genet
Volume: 95
Issue: 6
Pagination: 675-88
Date Published: 2014 Dec 04
ISSN: 1537-6605
Keywords: Adolescent, Adult, Aged, Cohort Studies, Computer Simulation, Female, Genetic Association Studies, Genotype, Health Surveys, Hispanic Americans, Humans, Male, Middle Aged, Models, Statistical, Phenotype, Research Design, Sampling Studies, United States, Young Adult
Abstract

The cohort design allows investigators to explore the genetic basis of a variety of diseases and traits in a single study while avoiding major weaknesses of the case-control design. Most cohort studies employ multistage cluster sampling with unequal probabilities to conveniently select participants with desired characteristics, and participants from different clusters might be genetically related. Analysis that ignores the complex sampling design can yield biased estimation of the genetic association and inflation of the type I error. Herein, we develop weighted estimators that reflect unequal selection probabilities and differential nonresponse rates, and we derive variance estimators that properly account for the sampling design and the potential relatedness of participants in different sampling units. We compare, both analytically and numerically, the performance of the proposed weighted estimators with unweighted estimators that disregard the sampling design. We demonstrate the usefulness of the proposed methods through analysis of MetaboChip data in the Hispanic Community Health Study/Study of Latinos, which is the largest health study of the Hispanic/Latino population in the United States aimed at identifying risk factors for various diseases and determining the role of genes and environment in the occurrence of diseases. We provide guidelines on the use of weighted and unweighted estimators, as well as the relevant software.
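
To make the idea of a weighted estimator with a design-aware variance concrete, the following is a minimal sketch and not the authors' method or software. It assumes a quantitative trait regressed on a single genotype, sampling weights equal to inverse selection probabilities, and a sandwich variance that sums score contributions within primary sampling units (PSUs). The function name weighted_fit and all argument names are hypothetical. Stratification, nonresponse adjustment, and relatedness across sampling units, which the abstract says the proposed variance estimators handle, are deliberately left out.

```python
from collections import defaultdict
from math import sqrt


def weighted_fit(y, g, w, psu):
    """Weighted least squares of trait y on genotype g (illustrative only).

    y   : trait values
    g   : genotype codes (e.g. 0, 1, 2 copies of an allele)
    w   : sampling weights (assumed: inverse selection probabilities)
    psu : primary sampling unit label for each participant

    Returns (intercept, slope, se_slope), where se_slope comes from a
    sandwich variance with scores summed within each PSU.
    """
    # Weighted sums that make up the 2x2 normal equations A @ beta = b.
    sw = sum(w)
    swg = sum(wi * gi for wi, gi in zip(w, g))
    swgg = sum(wi * gi * gi for wi, gi in zip(w, g))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swgy = sum(wi * gi * yi for wi, gi, yi in zip(w, g, y))

    det = sw * swgg - swg * swg
    if det == 0:
        raise ValueError("genotype has no weighted variation")

    # Inverse of A = [[sw, swg], [swg, swgg]] (the "bread" of the sandwich).
    a00, a01, a11 = swgg / det, -swg / det, sw / det
    b0 = a00 * swy + a01 * swgy
    b1 = a01 * swy + a11 * swgy

    # Sum weighted score contributions within each PSU.
    scores = defaultdict(lambda: [0.0, 0.0])
    for yi, gi, wi, c in zip(y, g, w, psu):
        r = yi - b0 - b1 * gi
        scores[c][0] += wi * r
        scores[c][1] += wi * r * gi

    # "Meat" of the sandwich: sum over PSUs of the outer product of scores.
    m00 = sum(s[0] * s[0] for s in scores.values())
    m01 = sum(s[0] * s[1] for s in scores.values())
    m11 = sum(s[1] * s[1] for s in scores.values())

    # Slope variance is the (1, 1) entry of inv(A) @ M @ inv(A).
    var_b1 = a01 * a01 * m00 + 2 * a01 * a11 * m01 + a11 * a11 * m11
    return b0, b1, sqrt(max(var_b1, 0.0))


# Usage with made-up data: four participants in two PSUs.
intercept, slope, se = weighted_fit(
    y=[0.0, 2.0, 1.0, 5.0],
    g=[0, 0, 1, 1],
    w=[1.0, 3.0, 1.0, 1.0],
    psu=["a", "a", "b", "b"],
)
```

Setting every weight to 1 gives the unweighted estimator that disregards the sampling design, so the same function can be used to see how far the two estimates diverge on a given data set.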

DOI: 10.1016/j.ajhg.2014.11.005
Alternate Journal: Am J Hum Genet
PubMed ID: 25480034
PubMed Central ID: PMC4259979
Grant List:
R01 CA082659 / CA / NCI NIH HHS / United States
U01 HG007416 / HG / NHGRI NIH HHS / United States
R01CA082659 / CA / NCI NIH HHS / United States
R37 GM047845 / GM / NIGMS NIH HHS / United States
N01HC65236 / HL / NHLBI NIH HHS / United States
U01HG004803 / HG / NHGRI NIH HHS / United States
N01HC65235 / HL / NHLBI NIH HHS / United States
N01HC65234 / HL / NHLBI NIH HHS / United States
R37GM047845 / GM / NIGMS NIH HHS / United States
N01HC65233 / HL / NHLBI NIH HHS / United States
N01HC65237 / HL / NHLBI NIH HHS / United States
U01 HG004803 / HG / NHGRI NIH HHS / United States
N01 HC65233 / HC / NHLBI NIH HHS / United States
N01-HC65234 / HC / NHLBI NIH HHS / United States
N01-HC65236 / HC / NHLBI NIH HHS / United States
P01 CA142538 / CA / NCI NIH HHS / United States
N01 HC65237 / HC / NHLBI NIH HHS / United States
T32 HL007055 / HL / NHLBI NIH HHS / United States
N01-HC65235 / HC / NHLBI NIH HHS / United States
MS#: 0298
Manuscript Lead/Corresponding Author Affiliation: Coordinating Center - Collaborative Studies Coordinating Center - UNC at Chapel Hill
ECI:
Manuscript Status: Published