Publications
"Comparison of adaptive treatment strategies based on longitudinal outcomes in sequential multiple assignment randomized trials." Stat Med 36, no. 3 (2017): 403-415.
"Efficient estimation of grouped survival models." BMC Bioinformatics 20, no. 1 (2019): 269.
"A global logrank test for adaptive treatment strategies based on observational studies." Stat Med 33, no. 5 (2014): 760-71.
"Secondary analysis of case-control association studies: Insights on weighting-based inference motivate a new specification test." Stat Med 39, no. 22 (2020): 2869-2882.
"Sample size calculation for studies with grouped survival data." Stat Med 37, no. 27 (2018): 3904-3917.
" SEMIPARAMETRIC REGRESSION ANALYSIS OF REPEATED CURRENT STATUS DATA." Stat Sin 27, no. 3 (2017): 1079-1100.
"Sparse concordance-assisted learning for optimal treatment decision." J Mach Learn Res 18 (2018).
"Deep advantage learning for optimal dynamic treatment regime." Stat Theory Relat Fields 2, no. 1 (2018): 80-88.
"groupedSurv: Efficient Estimation of Grouped Survival Models Using the Exact Likelihood Function (R). 1.0.0 ed., 2018.
A simple and accurate method to determine genomewide significance for association tests in sequencing studies." Genet Epidemiol 43, no. 4 (2019): 365-372.
" Nonparametric estimation of the mean function for recurrent event data with missing event category." Biometrika 100, no. 3 (2013).
"A general framework for detecting disease associations with rare variants in sequencing studies." Am J Hum Genet 89, no. 3 (2011): 354-67.
"Genetic association analysis under complex survey sampling: the Hispanic Community Health Study/Study of Latinos." Am J Hum Genet 95, no. 6 (2014): 675-88.
" Functional-mixed effects models for candidate genetic mapping in imaging genetic studies." Genet Epidemiol 38, no. 8 (2014): 680-91.
" fastJT: Efficient Jonckheere-Terpstra Test Statistics for Robust Machine Learning and Genome-Wide Association Studies (R). 1.0.4 ed., 2017.
Discussion of the Paper by R. L. Prentice and Y. Huang - Optimal Designs and Efficient Inference for Biomarker Studies." Stat Theory Relat Fields 2, no. 1 (2018): 21-22.
"SCORE-Seq: Score tests for detecting disease associations with rare variants in sequencing studies (C).. 5.0 ed., 2013.
A general framework for integrative analysis of incomplete multiomics data." Genet Epidemiol 44, no. 7 (2020): 646-664.
"jtGWAS: Efficient Jonckheere-Terpstra test statistics (R).. 1.0 ed., 2016.
Meta-analysis of genome-wide association studies: no efficiency gain in using individual participant data." Genet Epidemiol 34, no. 1 (2010): 60-6.
"fastJT: An R package for robust and efficient feature selection for machine learning and genome-wide association studies." BMC Bioinformatics 20, no. 1 (2019): 333.
"On the relative efficiency of using summary statistics versus individual-level data in meta-analysis." Biometrika 97, no. 2 (2010): 321-332.
"Quantitative trait analysis in sequencing studies under trait-dependent sampling." Proc Natl Acad Sci U S A 110, no. 30 (2013): 12247-52.
"bcSeq: an R package for fast sequence mapping in high-throughput shRNA and CRISPR screens." Bioinformatics 34, no. 20 (2018): 3581-3583.
"intcensROC: Fast Spline Function Based Constrained Maximum Likelihood Estimator for AUC Estimation of Interval Censored Survival Data (R). 0.1.1 ed., 2018.
Variable Selection for Nonparametric Quantile Regression via Smoothing Spline AN OVA." Stat 2, no. 1 (2013): 255-268.
" DOVE: Durability of Vaccine Efficacy. v1.2 ed., 2021.
Estimation of dynamic treatment regimes for complex outcomes: Balancing benefits and risks." In Adaptive Treatment Strategies in Practice: Planning Trials and Analyzing Data for Personalized Medicine, 249-262. Philadelphia: ASA-SIAM, 2016.
" IQ-Learning., 2012.
iqLearn: Interactive Q-Learning in R." J Stat Softw 64, no. 1 (2015).
"Interactive Q-learning for Quantiles." J Am Stat Assoc 112, no. 518 (2017): 638-649.
"Associating somatic mutations to clinical outcomes: a pan-cancer study of survival time." Genome Med 11, no. 1 (2019): 37.
"Look before you leap: systematic evaluation of tree-based statistical methods in subgroup identification." J Biopharm Stat 29, no. 6 (2019): 1082-1102.
"Hard or Soft Classification? Large-margin Unified Machines." J Am Stat Assoc 106, no. 493 (2011): 166-177.
"Association analysis using somatic mutations." PLoS Genet 14, no. 11 (2018): e1007746.
"A unification of models for meta-analysis of diagnostic accuracy studies without a gold standard." Biometrics 71, no. 2 (2015): 538-47.
"Joint skeleton estimation of multiple directed acyclic graphs for heterogeneous population." Biometrics 75, no. 1 (2019): 36-47.
"SMAC: Spatial multi-category angle-based classifier for high-dimensional neuroimaging data." Neuroimage 175 (2018): 230-245.
"DTRlearn: Learning Algorithms for Dynamic Treatment Regimes (R). 1.3 ed., 2018.
Augmented outcome-weighted learning for estimating optimal dynamic treatment regimens." Stat Med 37, no. 26 (2018): 3776-3788.
"Principal Components Adjusted Variable Screening." Comput Stat Data Anal 110 (2017): 134-144.
"Marginal hazard regression for correlated failure time data with auxiliary covariates." Lifetime Data Anal 18, no. 1 (2012): 116-38.
"Sequential multiple assignment randomization trials with enrichment design." Biometrics 73, no. 2 (2017): 378-390.
"Utility-based Weighted Multicategory Robust Support Vector Machines." Stat Interface 3, no. 4 (2010): 465-476.
"Accelerated intensity frailty model for recurrent events data." Biometrics 70, no. 3 (2014): 579-87.
"A Comparison of Monte Carlo Methods for Computing Marginal Likelihoods of Item Response Theory Models." J Korean Stat Soc 48, no. 4 (2019): 503-512.
"Estimating personalized diagnostic rules depending on individualized characteristics." Stat Med 36, no. 7 (2017): 1099-1117.
"Multi-Objective Markov Decision Processes for Data-Driven Decision Support." J Mach Learn Res 17 (2016).
"Tximeta: Reference sequence checksums for provenance identification in RNA-seq." PLoS Comput Biol 16, no. 2 (2020): e1007664.
"Swimming downstream: statistical analysis of differential transcript usage following Salmon quantification." F1000Res 7 (2018): 952.
"Bayesian longitudinal low-rank regression models for imaging genetic data from longitudinal studies." Neuroimage 149 (2017): 305-322.
"Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design." Stat Med 36, no. 6 (2017): 985-997.
"Variable selection for optimal treatment decision." Stat Methods Med Res 22, no. 5 (2013): 493-504.
"OTRselect: Variable selection for optimal treatment decision (R).. 1.0 ed. CRAN Repository, 2016.
Semiparametric estimation of treatment effect with time-lagged response in the presence of informative censoring." Lifetime Data Anal 17, no. 4 (2011): 566-93.
"Estimating Dynamic Treatment Regimes in Mobile Health Using V-learning." J Am Stat Assoc 115, no. 530 (2020): 692-706.
"Receiver operating characteristic curves and confidence bands for support vector machines." Biometrics 77, no. 4 (2021): 1422-1430.
"On the substructure controls in rare variant analysis: Principal components or variance components?" Genet Epidemiol 42, no. 3 (2018): 276-287.
"mmeta: Multivariate meta-analysis (R).. 2.2 ed., 2014.
mmeta: An R Package for Multivariate Meta-Analysis." J Stat Softw 56, no. 11 (2014): 11.
"FSEM: Functional Structural Equation Models for Twin Functional Data." J Am Stat Assoc 114, no. 525 (2019): 344-357.
"A hybrid Bayesian hierarchical model combining cohort and case-control studies for meta-analysis of diagnostic tests: Accounting for partial verification bias." Stat Methods Med Res 25, no. 6 (2016): 3015-3037.
"Assessing Similarity to Existing Drugs to Decide Whether to Continue Drug Development." Stat Biopharm Res 4, no. 3 (2012): 293-300.
"A trivariate meta-analysis of diagnostic studies accounting for prevalence and non-evaluable subjects: re-evaluation of the meta-analysis of coronary CT angiography studies." BMC Med Res Methodol 14 (2014): 128.
"Incorporating higher-order representative features improves prediction in network-based cancer prognosis analysis." BMC Med Genomics 4 (2011): 5.
"Statistical methods for multivariate meta-analysis of diagnostic tests: An overview and tutorial." Stat Methods Med Res 25, no. 4 (2016): 1596-619.
" Drug safety in spontaneous reports, observational databases, and clinical trials: Can we do better?., 2011.
Multiplicative rates model for recurrent events in case-cohort studies." Lifetime Data Anal 26, no. 1 (2020): 134-157.
"Multivariate phenotype association analysis by marker-set kernel machine regression." Genet Epidemiol 36, no. 7 (2012): 686-95.
"Inference on phenotype-specific effects of genes using multivariate kernel machine regression." Genet Epidemiol 42, no. 1 (2018): 64-79.
"Semiparametric regression for the weighted composite endpoint of recurrent and terminal events." Biostatistics 17, no. 2 (2016): 390-403.
"Semiparametric regression analysis of interval-censored competing risks data." Biometrics 73, no. 3 (2017): 857-865.
"Efficient Estimation of Semiparametric Transformation Models for the Cumulative Incidence of Competing Risks." J R Stat Soc Series B Stat Methodol 79, no. 2 (2017): 573-587.
"A Fast Multiple-Kernel Method With Applications to Detect Gene-Environment Interaction." Genet Epidemiol 39, no. 6 (2015): 456-68.
" Robust kernel association testing (RobKAT)." Genet Epidemiol 44, no. 3 (2020): 272-282.
"Gene set analysis methods: a systematic comparison." BioData Min 11 (2018): 8.
"Maximum likelihood estimation in generalized linear models with multiple covariates subject to detection limits." Stat Med 30, no. 20 (2011): 2551-61.
"Consistency and overfitting of multi-omics methods on experimental data." Brief Bioinform 21, no. 4 (2020): 1277-1284.
"Localized differences in caudate and hippocampal shape are associated with schizophrenia but not antipsychotic type." Psychiatry Res 211, no. 1 (2013): 1-10.
"Cancer pharmacogenomics: early promise, but concerted effort needed." Science 339, no. 6127 (2013): 1563-6.
"Clinical characteristics, response to exercise training, and outcomes in patients with heart failure and chronic obstructive pulmonary disease: findings from Heart Failure and A Controlled Trial Investigating Outcomes of Exercise TraiNing (HF-ACTION)." Am Heart J 165, no. 2 (2013): 193-9.
"Data for cancer comparative effectiveness research: past, present, and future potential." Cancer 118, no. 21 (2012): 5186-97.
"Multiple testing of treatment-effect-modifying biomarkers in a randomized clinical trial with a survival endpoint." Stat Med 30, no. 13 (2011): 1502-18.
"Estimation After a Group Sequential Trial." Stat Biosci 7, no. 2 (2015): 187-205.
"Properties of Estimators in Exponential Family Settings with Observation-based Stopping Rules." J Biom Biostat 7, no. 1 (2016).
"Active Clinical Trials for Personalized Medicine." J Am Stat Assoc 111, no. 514 (2016): 875-887.
"Bayesian spatial transformation models with applications in neuroimaging data." Biometrics 69, no. 4 (2013): 1074-83.
" SNPpy--database management for SNP data from genome wide association studies." PLoS One 6, no. 10 (2011): e24982.
"On random sample size, ignorability, ancillarity, completeness, separability, and degeneracy: sequential trials, random sample sizes, and missing data." Stat Methods Med Res 23, no. 1 (2014): 11-41.
"