Improved imputation accuracy in Hispanic/Latino populations with larger and more diverse reference panels: applications in the Hispanic Community Health Study/Study of Latinos (HCHS/SOL).

Publication Type: Publication
Year: 2016
Authors: Nelson SC, Stilp AM, Papanicolaou GJ, Taylor KD, Rotter JI, Thornton TA, Laurie CC
Journal: Hum Mol Genet
Volume: 25
Issue: 15
Pagination: 3245-3254
Date Published: 2016 Aug 01
ISSN: 1460-2083
Keywords: Female, genome-wide association study, Hispanic or Latino, Human Genome Project, Humans, Male, United States
Abstract

Imputation is commonly used in genome-wide association studies to expand the set of genetic variants available for analysis. Larger and more diverse reference panels, such as the final Phase 3 of the 1000 Genomes Project, hold promise for improving imputation accuracy in genetically diverse populations such as Hispanics/Latinos in the USA. Here, we sought to empirically evaluate imputation accuracy when imputing to a 1000 Genomes Phase 3 versus a Phase 1 reference, using participants from the Hispanic Community Health Study/Study of Latinos. Our assessments included calculating the correlation between imputed and observed allelic dosage in a subset of samples genotyped on a supplemental array. We observed that the Phase 3 reference yielded higher accuracy at rare variants, but that the two reference panels were comparable at common variants. At a sample level, the Phase 3 reference improved imputation accuracy in Hispanic/Latino samples from the Caribbean more than for Mainland samples, which we attribute primarily to the additional reference panel samples available in Phase 3. We conclude that a 1000 Genomes Project Phase 3 reference panel can yield improved imputation accuracy compared with Phase 1, particularly for rare variants and for samples of certain genetic ancestry compositions. Our findings can inform imputation design for other genome-wide association studies of participants with diverse ancestries, especially as larger and more diverse reference panels continue to become available.
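The accuracy assessment described in the abstract, correlating imputed with observed allelic dosage, can be illustrated with a minimal sketch. This is not the authors' code: the function name `dosage_correlation` and the input layout (one list of imputed dosages and one list of observed allele counts for a single variant, in the same sample order) are assumptions made for illustration, and the abstract does not state whether the study reported the correlation itself or its square.

```python
from math import sqrt


def dosage_correlation(imputed, observed):
    """Pearson correlation between imputed dosages and observed allele counts.

    Hypothetical helper for a single variant:
    imputed  -- expected allele counts in [0, 2], one per sample
    observed -- genotyped allele counts (0, 1 or 2), same sample order
    Returns None when the correlation is undefined (fewer than two samples,
    or no variation in either vector, e.g. a monomorphic variant).
    """
    if len(imputed) != len(observed):
        raise ValueError("imputed and observed must have the same length")
    n = len(imputed)
    if n < 2:
        return None

    mean_imp = sum(imputed) / n
    mean_obs = sum(observed) / n

    # Sums of squared deviations and cross-products around the means.
    cov = sum((a - mean_imp) * (b - mean_obs) for a, b in zip(imputed, observed))
    var_imp = sum((a - mean_imp) ** 2 for a in imputed)
    var_obs = sum((b - mean_obs) ** 2 for b in observed)

    if var_imp == 0 or var_obs == 0:
        return None
    return cov / sqrt(var_imp * var_obs)
```

Squaring the returned value gives an r-squared style accuracy measure; computing it per variant and summarizing by allele frequency bin would be one way to compare two reference panels, under the same assumptions.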

DOI: 10.1093/hmg/ddw174
Alternate Journal: Hum Mol Genet
PubMed ID: 27346520
PubMed Central ID: PMC5179925
Grant List:
HHSN268201300005C / HL / NHLBI NIH HHS / United States
P01 GM099568 / GM / NIGMS NIH HHS / United States
N01HC65236 / HL / NHLBI NIH HHS / United States
N01HC65235 / HL / NHLBI NIH HHS / United States
UL1 TR000124 / TR / NCATS NIH HHS / United States
N01HC65234 / HL / NHLBI NIH HHS / United States
P30 DK063491 / DK / NIDDK NIH HHS / United States
N01HC65233 / HL / NHLBI NIH HHS / United States
N01HC65237 / HL / NHLBI NIH HHS / United States
MS#: 0401
Manuscript Lead/Corresponding Author Affiliation: HCHS/SOL Genetic Analysis Center - University of Washington, Seattle
ECI: Yes
Manuscript Status: Published