Population genomic analysis of 962 whole genome sequences of humans reveals natural selection in non-coding regions.

Title: Population genomic analysis of 962 whole genome sequences of humans reveals natural selection in non-coding regions.
Publication Type: Journal Article
Year of Publication: 2015
Authors: Yu F, Lu J, Liu X, Gazave E, Chang D, Raj S, Hunter-Zinck H, Blekhman R, Arbiza L, Van Hout C, Morrison A, Johnson AD, Bis J, Cupples LA, Psaty BM, Muzny D, Yu J, Gibbs RA, Keinan A, Clark AG
Secondary Authors: Boerwinkle E
Journal: PLoS One
Volume: 10
Issue: 3
Pagination: e0121644
Date Published: 2015
ISSN: 1932-6203
Keywords: DNA, Intergenic; Genetic Loci; Humans; Metagenomics; Open Reading Frames; Polymorphism, Single Nucleotide
Abstract

Whole genome analysis in large samples from a single population is needed to provide adequate power to assess relative strengths of natural selection across different functional components of the genome. In this study, we analyzed next-generation sequencing data from 962 European Americans, and found that as expected approximately 60% of the top 1% of positive selection signals lie in intergenic regions, 33% in intronic regions, and slightly over 1% in coding regions. Several detailed functional annotation categories in intergenic regions showed statistically significant enrichment in positively selected loci when compared to the null distribution of the genomic span of ENCODE categories. There was a significant enrichment of purifying selection signals detected in enhancers, transcription factor binding sites, microRNAs and target sites, but not on lincRNA or piRNAs, suggesting different evolutionary constraints for these domains. Loci in "repressed or low activity regions" and loci near or overlapping the transcription start site were the most significantly over-represented annotations among the top 1% of signals for positive selection.

DOI: 10.1371/journal.pone.0121644
Alternate Journal: PLoS One
PubMed ID: 25807536
PubMed Central ID: PMC4373932
Grant List
R01 GM108805 / GM / NIGMS NIH HHS / United States
N01-HC-25195 / HC / NHLBI NIH HHS / United States
N01-HC-85085 / HC / NHLBI NIH HHS / United States
N01-HC-85081 / HC / NHLBI NIH HHS / United States
HHSN268201100005C / / PHS HHS / United States
HL105756 / HL / NHLBI NIH HHS / United States
HHSN268201100009C / / PHS HHS / United States
HL087652 / HL / NHLBI NIH HHS / United States
N01-HC-85086 / HC / NHLBI NIH HHS / United States
HHSN268201100010C / / PHS HHS / United States
R01 HL105756 / HL / NHLBI NIH HHS / United States
N01-HC-85082 / HC / NHLBI NIH HHS / United States
N01-HC-35129 / HC / NHLBI NIH HHS / United States
N01 HC-55222 / HC / NHLBI NIH HHS / United States
HHSN268201100008C / / PHS HHS / United States
HHSN268201100012C / / PHS HHS / United States
N01-HC-85083 / HC / NHLBI NIH HHS / United States
N01-HC-75150 / HC / NHLBI NIH HHS / United States
N01-HC-85080 / HC / NHLBI NIH HHS / United States
N01 HC-15103 / HC / NHLBI NIH HHS / United States
HHSN268201100007C / / PHS HHS / United States
HHSN268201100011C / / PHS HHS / United States
N01-HC-45133 / HC / NHLBI NIH HHS / United States
N01-HC-85079 / HC / NHLBI NIH HHS / United States
HHSN268201200036C / / PHS HHS / United States
HL080295 / HL / NHLBI NIH HHS / United States
N01-HC-85239 / HC / NHLBI NIH HHS / United States
HHSN268201100006C / / PHS HHS / United States
N01-HC-85084 / HC / NHLBI NIH HHS / United States