A variable selection method for genome-wide association studies.

Title: A variable selection method for genome-wide association studies.
Publication Type: Journal Article
Year of Publication: 2011
Authors: He, Qianchuan, and Dan-Yu Lin
Journal: Bioinformatics
Volume: 27
Issue: 1
Pagination: 1-8
Date Published: 2011 Jan 01
ISSN: 1367-4811
Keywords: Computer Simulation; Genome-Wide Association Study; Linkage Disequilibrium; Logistic Models; Polymorphism, Single Nucleotide; Software
Abstract

MOTIVATION: Genome-wide association studies (GWAS) involving half a million or more single nucleotide polymorphisms (SNPs) allow genetic dissection of complex diseases in a holistic manner. The common practice of analyzing one SNP at a time does not fully realize the potential of GWAS to identify multiple causal variants and to predict risk of disease. Existing methods for joint analysis of GWAS data tend to miss causal SNPs that are marginally uncorrelated with disease and have high false discovery rates (FDRs).

RESULTS: We introduce GWASelect, a statistically powerful and computationally efficient variable selection method designed to tackle the unique challenges of GWAS data. This method searches iteratively over the potential SNPs conditional on previously selected SNPs and is thus capable of capturing causal SNPs that are marginally correlated with disease as well as those that are marginally uncorrelated with disease. A special resampling mechanism is built into the method to reduce false positive findings. Simulation studies demonstrate that the GWASelect performs well under a wide spectrum of linkage disequilibrium patterns and can be substantially more powerful than existing methods in capturing causal variants while having a lower FDR. In addition, the regression models based on the GWASelect tend to yield more accurate prediction of disease risk than existing methods. The advantages of the GWASelect are illustrated with the Wellcome Trust Case-Control Consortium (WTCCC) data.

AVAILABILITY: The software implementing GWASelect is available at http://www.bios.unc.edu/~lin. Access to WTCCC data: http://www.wtccc.org.uk/.

DOI: 10.1093/bioinformatics/btq600
Alternate Journal: Bioinformatics
Original Publication: A variable selection method for genome-wide association studies.
PubMed ID: 21036813
PubMed Central ID: PMC3025714
Grant List:
P01 CA142538 / CA / NCI NIH HHS / United States
R01 CA082659 / CA / NCI NIH HHS / United States
R01 CA082659-13 / CA / NCI NIH HHS / United States
1-P01-CA142538-01 / CA / NCI NIH HHS / United States