Title | A general framework for studying genetic effects and gene-environment interactions with missing data. |
Publication Type | Journal Article |
Year of Publication | 2010 |
Authors | Hu, Y J., D Y. Lin, and D Zeng |
Journal | Biostatistics |
Volume | 11 |
Issue | 4 |
Pagination | 583-98 |
Date Published | 2010 Oct |
ISSN | 1468-4357 |
Keywords | Algorithms; Biostatistics; Carcinoma, Non-Small-Cell Lung; Case-Control Studies; Cohort Studies; Computer Simulation; Cross-Sectional Studies; Cysteine Endopeptidases; Disease; Environment; Genetic Association Studies; Genotype; Haplotypes; Humans; Likelihood Functions; Nerve Tissue Proteins; Odds Ratio; Phenotype; Polymorphism, Single Nucleotide; Receptors, Nicotinic; Regression Analysis; Smoking |
Abstract | Missing data arise in genetic association studies when genotypes are unknown or when haplotypes are of direct interest. We provide a general likelihood-based framework for making inference on genetic effects and gene-environment interactions with such missing data. We allow genetic and environmental variables to be correlated while leaving the distribution of environmental variables completely unspecified. We consider 3 major study designs (cross-sectional, case-control, and cohort designs) and construct appropriate likelihood functions for all common phenotypes (e.g. case-control status, quantitative traits, and potentially censored ages at onset of disease). The likelihood functions involve both finite- and infinite-dimensional parameters. The maximum likelihood estimators are shown to be consistent, asymptotically normal, and asymptotically efficient. Expectation-Maximization (EM) algorithms are developed to implement the corresponding inference procedures. Extensive simulation studies demonstrate that the proposed inferential and numerical methods perform well in practical settings. Illustration with a genome-wide association study of lung cancer is provided. |
DOI | 10.1093/biostatistics/kxq015 |
Alternate Journal | Biostatistics |
Original Publication | A general framework for studying genetic effects and gene-environment interactions with missing data. |
PubMed ID | 20348396 |
PubMed Central ID | PMC3294269 |
Grant List | R01 CA082659 / CA / NCI NIH HHS / United States; P01 CA142538-01 / CA / NCI NIH HHS / United States; R01 CA133996 / CA / NCI NIH HHS / United States; R37 GM047845 / GM / NIGMS NIH HHS / United States; R01 CA055769 / CA / NCI NIH HHS / United States; R01CA55769 / CA / NCI NIH HHS / United States; R01CA133996 / CA / NCI NIH HHS / United States; P01 CA142538 / CA / NCI NIH HHS / United States |