Mining gold dust under the genome wide significance level: a two-stage approach to analysis of GWAS.

Title: Mining gold dust under the genome wide significance level: a two-stage approach to analysis of GWAS.
Publication Type: Journal Article
Year of Publication: 2011
Authors: Shi G, Boerwinkle E, Morrison AC, Gu CC, Chakravarti A, Rao DC
Journal: Genet Epidemiol
Volume: 35
Issue: 2
Pagination: 111-8
Date Published: 2011 Feb
ISSN: 1098-2272
Keywords: Computer Simulation; False Negative Reactions; False Positive Reactions; Genome-Wide Association Study; Humans; Linkage Disequilibrium; Models, Statistical; Molecular Epidemiology; Polymorphism, Single Nucleotide; Regression Analysis; Reproducibility of Results
Abstract

We propose a two-stage approach for analyzing genome-wide association data to identify a set of promising single-nucleotide polymorphisms (SNPs). In stage one, we select a list of top signals from single-SNP analyses by controlling the false discovery rate. In stage two, we use least absolute shrinkage and selection operator (LASSO) regression to reduce false positives. The proposed approach was evaluated using simulated quantitative traits based on genome-wide SNP data on 8,861 Caucasian individuals from the Atherosclerosis Risk in Communities (ARIC) Study. Our first stage, targeted at controlling false negatives, yields better power than using a Bonferroni-corrected significance level. The LASSO regression reduces the number of significant SNPs in stage two: it removes false-positive SNPs and, because of linkage disequilibrium, also some true-positive SNPs at simulated causal loci. Interestingly, the LASSO regression preserves the power from stage one: the number of causal loci detected by the LASSO regression in stage two is almost the same as in stage one, while false positives are reduced further. Real data on systolic blood pressure in the ARIC Study were analyzed using our two-stage approach, which identified two significant SNPs, one of which was reported to be genome-wide significant in a meta-analysis with a much larger sample size. By contrast, a single-SNP association scan did not yield any significant results.
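
The two stages described in the abstract can be sketched on toy data as follows. This is an illustration only, not the authors' implementation: the abstract does not say which FDR procedure, single-SNP test, or LASSO penalty was used, so the Benjamini-Hochberg step-up rule, the normal-approximation p-values, the coordinate-descent solver, the penalty `lam=0.1`, and all function names and simulation settings below are assumptions chosen for a self-contained example.

```python
import math
import numpy as np

def single_snp_pvalues(G, y):
    """Marginal test per SNP: t statistic of the simple-regression slope,
    with a two-sided p-value from the normal approximation (assumes large n
    and no monomorphic SNPs)."""
    n = len(y)
    Gc = G - G.mean(axis=0)
    yc = y - y.mean()
    r = (Gc.T @ yc) / (np.linalg.norm(Gc, axis=0) * np.linalg.norm(yc))
    t = r * np.sqrt((n - 2) / (1 - r ** 2))
    return np.array([math.erfc(abs(v) / math.sqrt(2)) for v in t])

def benjamini_hochberg(p, q=0.05):
    """Indices of tests rejected at FDR level q (BH step-up; an assumed choice)."""
    m = len(p)
    order = np.argsort(p)
    passed = p[order] <= q * np.arange(1, m + 1) / m
    if not passed.any():
        return np.array([], dtype=int)
    k = np.max(np.nonzero(passed)[0])  # largest rank meeting the BH bound
    return np.sort(order[:k + 1])

def lasso_cd(X, y, lam, n_iter=200):
    """LASSO by cyclic coordinate descent on standardized columns.
    Minimizes (1/2n)*||y - Xb||^2 + lam*||b||_1."""
    n, p = X.shape
    X = (X - X.mean(axis=0)) / X.std(axis=0)
    resid = y - y.mean()
    beta = np.zeros(p)
    for _ in range(n_iter):
        for j in range(p):
            resid += X[:, j] * beta[j]           # partial residual without SNP j
            rho = X[:, j] @ resid / n
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0)  # soft threshold
            resid -= X[:, j] * beta[j]
    return beta

# Toy data (hypothetical settings): 2000 individuals, 500 independent SNPs,
# two causal SNPs with additive effects on a quantitative trait.
rng = np.random.default_rng(0)
n, m = 2000, 500
G = rng.binomial(2, 0.3, size=(n, m)).astype(float)
causal = [10, 200]
y = 0.4 * G[:, causal].sum(axis=1) + rng.normal(size=n)

# Stage one: single-SNP scan, keep top signals under FDR control.
stage1 = benjamini_hochberg(single_snp_pvalues(G, y), q=0.05)

# Stage two: joint LASSO on the stage-one SNPs; keep nonzero coefficients.
beta = lasso_cd(G[:, stage1], y, lam=0.1)
stage2 = stage1[beta != 0]

print("stage 1 SNPs:", stage1)
print("stage 2 SNPs:", stage2)
```

Because stage two only fits SNPs that survived stage one, its selections are always a subset of the stage-one list; the penalty then decides how many of those are dropped. In practice the penalty would be tuned (for example by cross-validation) rather than fixed, and correlated SNPs in linkage disequilibrium, which this toy simulation does not include, are where the joint model is expected to prune the most.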

DOI: 10.1002/gepi.20556
Alternate Journal: Genet Epidemiol
PubMed ID: 21254218
PubMed Central ID: PMC3624896
Grant List: 5U01HL054473 / HL / NHLBI NIH HHS / United States
R01 HL086694 / HL / NHLBI NIH HHS / United States
5R01HL086694 / HL / NHLBI NIH HHS / United States
5R01GM028719 / GM / NIGMS NIH HHS / United States
R01 GM028719 / GM / NIGMS NIH HHS / United States
U01 HL054473 / HL / NHLBI NIH HHS / United States