Double Sparsity Kernel Learning with Automatic Variable Selection and Data Extraction. | Innovative Methods Program for Advancing Clinical Trials (IMPACT)

Title	Double Sparsity Kernel Learning with Automatic Variable Selection and Data Extraction.
Publication Type	Journal Article
Year of Publication	2018
Authors	Chen, Jingxiang, Chong Zhang, Michael R. Kosorok, and Yufeng Liu
Journal	Stat Interface
Volume	11
Issue	3
Pagination	401-420
Date Published	2018
ISSN	1938-7989
Abstract	Learning in the Reproducing Kernel Hilbert Space (RKHS) has been widely used in many scientific disciplines. Because an RKHS can be very flexible, it is common to impose a regularization term in the optimization to prevent overfitting. Standard RKHS learning employs the squared norm penalty of the learning function. Despite its success, many challenges remain. In particular, one cannot directly use the squared norm penalty for variable selection or data extraction. Therefore, when there exist noise predictors, or the underlying function has a sparse representation in the dual space, the performance of standard RKHS learning can be suboptimal. In the literature, work has been proposed on how to perform variable selection in RKHS learning, and a data sparsity constraint was considered for data extraction. However, how to learn in an RKHS with both variable selection and data extraction simultaneously remains unclear. In this paper, we propose a unified RKHS learning method, namely, DOuble Sparsity Kernel (DOSK) learning, to overcome this challenge. An efficient algorithm is provided to solve the corresponding optimization problem. We prove that under certain conditions, our new method can asymptotically achieve variable selection consistency. Simulated and real data results demonstrate that DOSK is highly competitive among existing approaches for RKHS learning.
DOI	10.4310/SII.2018.v11.n3.a1
Alternate Journal	Stat Interface
Original Publication	Double sparsity kernel learning with automatic variable selection and data extraction.
PubMed ID	30294406
PubMed Central ID	PMC6168218
Grant List	P01 CA142538 / CA / NCI NIH HHS / United States; R01 GM126550 / GM / NIGMS NIH HHS / United States

Project:	Project 2.3