TEAM: efficient two-locus epistasis tests in human genome-wide association study.

Publication Type: Journal Article
Year of Publication: 2010
Authors: Zhang, Xiang, Shunping Huang, Fei Zou, and Wei Wang
Date Published: 2010 Jun 15
Keywords: Algorithms, Epistasis, Genetic, Genome, Human, Genome-Wide Association Study, Genomics, Humans, Population Groups, Software

As a promising tool for identifying genetic markers underlying phenotypic differences, the genome-wide association study (GWAS) has been extensively investigated in recent years. In GWAS, detecting epistasis (gene-gene interaction) is preferable to single-locus analysis, since many diseases are known to be complex traits. A brute-force search is infeasible for epistasis detection at the genome-wide scale because of the intensive computational burden. Existing epistasis detection algorithms are designed for datasets consisting of homozygous markers and small sample sizes. In human studies, however, genotypes may be heterozygous, and the number of individuals can reach the thousands. Thus, existing methods are not readily applicable to human datasets. In this article, we propose an efficient algorithm, TEAM, which significantly speeds up epistasis detection for human GWAS. Our algorithm is exhaustive, i.e. it does not ignore any epistatic interaction. Utilizing the minimum spanning tree structure, the algorithm incrementally updates the contingency tables for epistatic tests without scanning all individuals. Our algorithm has broader applicability and is more efficient than existing methods for large-sample studies. It supports any statistical test that is based on contingency tables, and enables control of both the family-wise error rate and the false discovery rate. Extensive experiments show that our algorithm only needs to examine a small portion of the individuals to update the contingency tables, and that it achieves at least an order of magnitude speedup over the brute-force approach.
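
The incremental update described in the abstract can be illustrated with a minimal Python sketch. This is not the TEAM implementation. It assumes genotypes coded 0/1/2, a binary phenotype coded 0/1, and a precomputed list `diff` of the individuals at which two SNPs differ (in a tree-based scheme, such a list could be stored once per tree edge). The names `full_table`, `update_table`, and `diff` are hypothetical and chosen only for illustration.

```python
import random

def full_table(a, b, y):
    """Build a 3x3x2 table (genotype a, genotype b, phenotype) by scanning everyone."""
    table = [[[0, 0] for _ in range(3)] for _ in range(3)]
    for ai, bi, yi in zip(a, b, y):
        table[ai][bi][yi] += 1
    return table

def update_table(table_ab, a, b, c, y, diff):
    """Derive the (a, c) table from the (a, b) table.

    Only individuals listed in `diff` (where b and c differ) are touched;
    everyone else contributes to the same cell in both tables.
    """
    table_ac = [[cell[:] for cell in row] for row in table_ab]
    for i in diff:
        table_ac[a[i]][b[i]][y[i]] -= 1  # remove the old contribution
        table_ac[a[i]][c[i]][y[i]] += 1  # add the new contribution
    return table_ac

# Synthetic data (illustrative only, not from the paper).
random.seed(0)
n = 1000
a = [random.randrange(3) for _ in range(n)]
b = [random.randrange(3) for _ in range(n)]
y = [random.randrange(2) for _ in range(n)]

# SNP c equals SNP b except at 50 individuals, so the update touches 5% of the sample.
diff = random.sample(range(n), 50)
c = list(b)
for i in diff:
    c[i] = (b[i] + 1) % 3

table_ab = full_table(a, b, y)
table_ac = update_table(table_ab, a, b, c, y, diff)
```

In this sketch the cost of moving from one SNP pair to the next is proportional to the number of differing individuals rather than to the sample size, which is why ordering SNPs so that neighbours differ at few individuals (for example along a minimum spanning tree) would reduce the total work.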

Alternate Journal: Bioinformatics
PubMed ID: 20529910
PubMed Central ID: PMC2881371
Grant List: P01 CA142538 / CA / NCI NIH HHS / United States
P30 ES010126 / ES / NIEHS NIH HHS / United States