Title | AFA: Ancestry-specific allele frequency estimation in admixed populations: The Hispanic Community Health Study/Study of Latinos. |
Publication Type | Publication |
Year | 2022 |
Authors | Granot-Hershkovitz E, Sun Q, Argos M, Zhou H, Lin X, Browning SR, Sofer T |
Journal | HGG Adv |
Volume | 3 |
Issue | 2 |
Pagination | 100096 |
Date Published | 2022 Apr 14 |
ISSN | 2666-2477 |
Abstract | Allele frequency estimates in admixed populations, such as Hispanics and Latinos, rely on the sample's specific admixture composition and thus may differ between two seemingly similar populations. However, ancestry-specific allele frequencies, i.e., pertaining to the ancestral populations of an admixed group, may be particularly useful for prioritizing genetic variants for genetic discovery and personalized genomic health. We developed a method, ancestry-specific allele frequency estimation in admixed populations (AFA), to estimate the frequencies of biallelic variants in admixed populations with an unlimited number of ancestries. AFA uses maximum-likelihood estimation by modeling the conditional probability of having an allele given proportions of genetic ancestries. It can be applied using either local ancestry interval proportions encompassing the variant (local-ancestry-specific allele frequency estimations in admixed populations [LAFAs]) or global proportions of genetic ancestries (global-ancestry-specific allele frequency estimations in admixed populations [GAFAs]), which are easier to compute and are more widely available. Simulations and comparisons to existing software demonstrated the high accuracy of the method. We implemented AFA on high-quality imputed data of ∼9,000 Hispanics and Latinos from the Hispanic Community Health Study/Study of Latinos (HCHS/SOL), an understudied, admixed population with three predominant continental ancestries: Amerindian, European, and African. Comparison of the European and African estimated frequencies to the respective gnomAD frequencies demonstrated high correlations (Pearson R = 0.97-0.99). We provide a genome-wide dataset of the estimated ancestry-specific allele frequencies for available variants with allele frequency between 5% and 95% in at least one of the three ancestral populations. Association analysis of Amerindian-enriched variants with cardiometabolic traits identified five loci associated with lipid traits in Hispanics and Latinos, demonstrating the utility of ancestry-specific allele frequencies in admixed populations. |
DOI | 10.1016/j.xhgg.2022.100096 |
Alternate Journal | HGG Adv |
PubMed ID | 35300209 |
PubMed Central ID | PMC8920934 |
Grant List | R01 HG010869 / HG / NHGRI NIH HHS / United States R21 AG070644 / AG / NIA NIH HHS / United States |
AFA: Ancestry-specific allele frequency estimation in admixed populations: The Hispanic Community Health Study/Study of Latinos.
MS#:
1177
ECI:
Yes
Manuscript Status:
Published