Publications
"Weighted Area Under the Receiver Operating Characteristic Curve and Its Application to Gene Selection." J R Stat Soc Ser C Appl Stat 59, no. 4 (2010): 673-692.
"A global logrank test for adaptive treatment strategies based on observational studies." Stat Med 33, no. 5 (2014): 760-71.
"Incorporating covariates in skewed functional data models." Biostatistics 16, no. 3 (2015): 413-26.
"Nucleotide excision repair capacity increases during differentiation of human embryonic carcinoma cells into neurons and muscle cells." J Biol Chem 294, no. 15 (2019): 5914-5922.
"Sample size calculation for cluster randomization trials with a time-to-event endpoint." Stat Med 39, no. 25 (2020): 3608-3623.
"cSFM: Covariate-adjusted skewed functional model (R).. 1.1 ed., 2014.
Deep advantage learning for optimal dynamic treatment regime." Stat Theory Relat Fields 2, no. 1 (2018): 80-88.
"SEMIPARAMETRIC REGRESSION ANALYSIS OF REPEATED CURRENT STATUS DATA." Stat Sin 27, no. 3 (2017): 1079-1100.
"Sparse concordance-assisted learning for optimal treatment decision." J Mach Learn Res 18 (2018).
"A general framework for integrative analysis of incomplete multiomics data." Genet Epidemiol 44, no. 7 (2020): 646-664.
"fastJT: An R package for robust and efficient feature selection for machine learning and genome-wide association studies." BMC Bioinformatics 20, no. 1 (2019): 333.
"jtGWAS: Efficient Jonckheere-Terpstra test statistics (R).. 1.0 ed., 2016.
A general framework for detecting disease associations with rare variants in sequencing studies." Am J Hum Genet 89, no. 3 (2011): 354-67.
"Nonparametric estimation of the mean function for recurrent event data with missing event category." Biometrika 100, no. 3 (2013).
"bcSeq: an R package for fast sequence mapping in high-throughput shRNA and CRISPR screens." Bioinformatics 34, no. 20 (2018): 3581-3583.
" intcensROC: Fast Spline Function Based Constrained Maximum Likelihood Estimator for AUC Estimation of Interval Censored Survival Data (R). 0.1.1 ed., 2018.
SCORE-Seq: Score tests for detecting disease associations with rare variants in sequencing studies (C).. 5.0 ed., 2013.
Meta-analysis of genome-wide association studies: no efficiency gain in using individual participant data." Genet Epidemiol 34, no. 1 (2010): 60-6.
"groupedSurv: Efficient Estimation of Grouped Survival Models Using the Exact Likelihood Function (R). 1.0.0 ed., 2018.
On the relative efficiency of using summary statistics versus individual-level data in meta-analysis." Biometrika 97, no. 2 (2010): 321-332.
"A simple and accurate method to determine genomewide significance for association tests in sequencing studies." Genet Epidemiol 43, no. 4 (2019): 365-372.
"Quantitative trait analysis in sequencing studies under trait-dependent sampling." Proc Natl Acad Sci U S A 110, no. 30 (2013): 12247-52.
" Genetic association analysis under complex survey sampling: the Hispanic Community Health Study/Study of Latinos." Am J Hum Genet 95, no. 6 (2014): 675-88.
"Functional-mixed effects models for candidate genetic mapping in imaging genetic studies." Genet Epidemiol 38, no. 8 (2014): 680-91.
" fastJT: Efficient Jonckheere-Terpstra Test Statistics for Robust Machine Learning and Genome-Wide Association Studies (R). 1.0.4 ed., 2017.
Discussion of the Paper by R. L. Prentice and Y. Huang - Optimal Designs and Efficient Inference for Biomarker Studies." Stat Theory Relat Fields 2, no. 1 (2018): 21-22.
"Variable Selection for Nonparametric Quantile Regression via Smoothing Spline AN OVA." Stat 2, no. 1 (2013): 255-268.
"DOVE: Durability of Vaccine Efficacy. v1.2 ed., 2021.
Interactive Q-learning for Quantiles." J Am Stat Assoc 112, no. 518 (2017): 638-649.
"IQ-Learning., 2012.
Estimation of dynamic treatment regimes for complex outcomes: Balancing benefits and risks." In Adaptive Treatment Strategies in Practice: Planning Trials and Analyzing Data for Personalized Medicine, 249-262. Philadelphia: ASA-SIAM, 2016.
"iqLearn: Interactive Q-Learning in R." J Stat Softw 64, no. 1 (2015).
"Associating somatic mutations to clinical outcomes: a pan-cancer study of survival time." Genome Med 11, no. 1 (2019): 37.
"Principal Components Adjusted Variable Screening." Comput Stat Data Anal 110 (2017): 134-144.
"Hard or Soft Classification? Large-margin Unified Machines." J Am Stat Assoc 106, no. 493 (2011): 166-177.
"Sequential multiple assignment randomization trials with enrichment design." Biometrics 73, no. 2 (2017): 378-390.
"Utility-based Weighted Multicategory Robust Support Vector Machines." Stat Interface 3, no. 4 (2010): 465-476.
"A Comparison of Monte Carlo Methods for Computing Marginal Likelihoods of Item Response Theory Models." J Korean Stat Soc 48, no. 4 (2019): 503-512.
"Estimating personalized diagnostic rules depending on individualized characteristics." Stat Med 36, no. 7 (2017): 1099-1117.
"Marginal hazard regression for correlated failure time data with auxiliary covariates." Lifetime Data Anal 18, no. 1 (2012): 116-38.
"Look before you leap: systematic evaluation of tree-based statistical methods in subgroup identification." J Biopharm Stat 29, no. 6 (2019): 1082-1102.
"Joint skeleton estimation of multiple directed acyclic graphs for heterogeneous population." Biometrics 75, no. 1 (2019): 36-47.
"Association analysis using somatic mutations." PLoS Genet 14, no. 11 (2018): e1007746.
"A unification of models for meta-analysis of diagnostic accuracy studies without a gold standard." Biometrics 71, no. 2 (2015): 538-47.
"SMAC: Spatial multi-category angle-based classifier for high-dimensional neuroimaging data." Neuroimage 175 (2018): 230-245.
"DTRlearn: Learning Algorithms for Dynamic Treatment Regimes (R). 1.3 ed., 2018.
Accelerated intensity frailty model for recurrent events data." Biometrics 70, no. 3 (2014): 579-87.
" Augmented outcome-weighted learning for estimating optimal dynamic treatment regimens." Stat Med 37, no. 26 (2018): 3776-3788.
"Multi-Objective Markov Decision Processes for Data-Driven Decision Support." J Mach Learn Res 17 (2016).
"Swimming downstream: statistical analysis of differential transcript usage following Salmon quantification." F1000Res 7 (2018): 952.
"Tximeta: Reference sequence checksums for provenance identification in RNA-seq." PLoS Comput Biol 16, no. 2 (2020): e1007664.
"OTRselect: Variable selection for optimal treatment decision (R).. 1.0 ed. CRAN Repository, 2016.
Variable selection for optimal treatment decision." Stat Methods Med Res 22, no. 5 (2013): 493-504.
"Bayesian longitudinal low-rank regression models for imaging genetic data from longitudinal studies." Neuroimage 149 (2017): 305-322.
"Semiparametric estimation of treatment effect with time-lagged response in the presence of informative censoring." Lifetime Data Anal 17, no. 4 (2011): 566-93.
"Statistical inferences for data from studies conducted with an aggregated multivariate outcome-dependent sample design." Stat Med 36, no. 6 (2017): 985-997.
"Estimating Dynamic Treatment Regimes in Mobile Health Using V-learning." J Am Stat Assoc 115, no. 530 (2020): 692-706.
"Receiver operating characteristic curves and confidence bands for support vector machines." Biometrics 77, no. 4 (2021): 1422-1430.
"mmeta: Multivariate meta-analysis (R).. 2.2 ed., 2014.
FSEM: Functional Structural Equation Models for Twin Functional Data." J Am Stat Assoc 114, no. 525 (2019): 344-357.
"mmeta: An R Package for Multivariate Meta-Analysis." J Stat Softw 56, no. 11 (2014): 11.
"On the substructure controls in rare variant analysis: Principal components or variance components?" Genet Epidemiol 42, no. 3 (2018): 276-287.
"Statistical methods for multivariate meta-analysis of diagnostic tests: An overview and tutorial." Stat Methods Med Res 25, no. 4 (2016): 1596-619.
"A trivariate meta-analysis of diagnostic studies accounting for prevalence and non-evaluable subjects: re-evaluation of the meta-analysis of coronary CT angiography studies." BMC Med Res Methodol 14 (2014): 128.
"A hybrid Bayesian hierarchical model combining cohort and case-control studies for meta-analysis of diagnostic tests: Accounting for partial verification bias." Stat Methods Med Res 25, no. 6 (2016): 3015-3037.
"Assessing Similarity to Existing Drugs to Decide Whether to Continue Drug Development." Stat Biopharm Res 4, no. 3 (2012): 293-300.
"Incorporating higher-order representative features improves prediction in network-based cancer prognosis analysis." BMC Med Genomics 4 (2011): 5.
" Drug safety in spontaneous reports, observational databases, and clinical trials: Can we do better?., 2011.
Multiplicative rates model for recurrent events in case-cohort studies." Lifetime Data Anal 26, no. 1 (2020): 134-157.
"Multivariate phenotype association analysis by marker-set kernel machine regression." Genet Epidemiol 36, no. 7 (2012): 686-95.
"Inference on phenotype-specific effects of genes using multivariate kernel machine regression." Genet Epidemiol 42, no. 1 (2018): 64-79.
"Efficient Estimation of Semiparametric Transformation Models for the Cumulative Incidence of Competing Risks." J R Stat Soc Series B Stat Methodol 79, no. 2 (2017): 573-587.
"Semiparametric regression for the weighted composite endpoint of recurrent and terminal events." Biostatistics 17, no. 2 (2016): 390-403.
"Semiparametric regression analysis of interval-censored competing risks data." Biometrics 73, no. 3 (2017): 857-865.
"A Fast Multiple-Kernel Method With Applications to Detect Gene-Environment Interaction." Genet Epidemiol 39, no. 6 (2015): 456-68.
" Robust kernel association testing (RobKAT)." Genet Epidemiol 44, no. 3 (2020): 272-282.
"Gene set analysis methods: a systematic comparison." BioData Min 11 (2018): 8.
"Maximum likelihood estimation in generalized linear models with multiple covariates subject to detection limits." Stat Med 30, no. 20 (2011): 2551-61.
"Consistency and overfitting of multi-omics methods on experimental data." Brief Bioinform 21, no. 4 (2020): 1277-1284.
"Localized differences in caudate and hippocampal shape are associated with schizophrenia but not antipsychotic type." Psychiatry Res 211, no. 1 (2013): 1-10.
"Cancer pharmacogenomics: early promise, but concerted effort needed." Science 339, no. 6127 (2013): 1563-6.
"Clinical characteristics, response to exercise training, and outcomes in patients with heart failure and chronic obstructive pulmonary disease: findings from Heart Failure and A Controlled Trial Investigating Outcomes of Exercise TraiNing (HF-ACTION)." Am Heart J 165, no. 2 (2013): 193-9.
"Data for cancer comparative effectiveness research: past, present, and future potential." Cancer 118, no. 21 (2012): 5186-97.
"Multiple testing of treatment-effect-modifying biomarkers in a randomized clinical trial with a survival endpoint." Stat Med 30, no. 13 (2011): 1502-18.
"Properties of Estimators in Exponential Family Settings with Observation-based Stopping Rules." J Biom Biostat 7, no. 1 (2016).
"Estimation After a Group Sequential Trial." Stat Biosci 7, no. 2 (2015): 187-205.
"Active Clinical Trials for Personalized Medicine." J Am Stat Assoc 111, no. 514 (2016): 875-887.
"Bayesian spatial transformation models with applications in neuroimaging data." Biometrics 69, no. 4 (2013): 1074-83.
"SNPpy--database management for SNP data from genome wide association studies." PLoS One 6, no. 10 (2011): e24982.
" On random sample size, ignorability, ancillarity, completeness, separability, and degeneracy: sequential trials, random sample sizes, and missing data." Stat Methods Med Res 23, no. 1 (2014): 11-41.
"