Double Sparsity Kernel Learning with Automatic Variable Selection and Data Extraction.

Title: Double Sparsity Kernel Learning with Automatic Variable Selection and Data Extraction.
Publication Type: Journal Article
Year of Publication: 2018
Authors: Chen, Jingxiang, Chong Zhang, Michael R. Kosorok, and Yufeng Liu
Journal: Stat Interface
Date Published: 2018

Learning in the Reproducing Kernel Hilbert Space (RKHS) has been widely used in many scientific disciplines. Because an RKHS can be very flexible, it is common to impose a regularization term in the optimization to prevent overfitting. Standard RKHS learning employs the squared norm penalty of the learning function. Despite its success, many challenges remain. In particular, one cannot directly use the squared norm penalty for variable selection or data extraction. Therefore, when there exist noise predictors, or when the underlying function has a sparse representation in the dual space, the performance of standard RKHS learning can be suboptimal. In the literature, methods have been proposed for variable selection in RKHS learning, and a data sparsity constraint has been considered for data extraction. However, how to learn in an RKHS with both variable selection and data extraction simultaneously remains unclear. In this paper, we propose a unified RKHS learning method, namely, DOuble Sparsity Kernel (DOSK) learning, to overcome this challenge. An efficient algorithm is provided to solve the corresponding optimization problem. We prove that under certain conditions, our new method can asymptotically achieve variable selection consistency. Simulated and real data results demonstrate that DOSK is highly competitive among existing approaches for RKHS learning.
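
For orientation, the standard squared-norm-penalized RKHS learning problem mentioned in the abstract is commonly written as below. This is the textbook formulation, not the paper's own notation or the DOSK objective; the symbols $L$ (loss), $\lambda$ (tuning parameter), $K$ (kernel), and $\alpha$ (dual coefficients) are assumptions introduced here for illustration.

```latex
\begin{align*}
\hat{f} &= \operatorname*{arg\,min}_{f \in \mathcal{H}_K}
  \; \frac{1}{n} \sum_{i=1}^{n} L\bigl(y_i, f(x_i)\bigr)
  + \lambda \, \|f\|_{\mathcal{H}_K}^{2}, \\
\hat{f}(x) &= \sum_{i=1}^{n} \hat{\alpha}_i \, K(x, x_i),
  \qquad
  \|\hat{f}\|_{\mathcal{H}_K}^{2} = \hat{\alpha}^{\top} \mathbf{K} \, \hat{\alpha},
  \quad \mathbf{K}_{ij} = K(x_i, x_j).
\end{align*}
```

The second line is the representer theorem form of the solution. Because the penalty is quadratic in $\alpha$, it shrinks the coefficients but generally does not set any $\hat{\alpha}_i$ exactly to zero, and it does not act on individual predictors. This is one way to see why the squared norm penalty alone yields neither data extraction nor variable selection.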

Alternate Journal: Stat Interface
Original Publication: Double sparsity kernel learning with automatic variable selection and data extraction.
PubMed ID: 30294406
PubMed Central ID: PMC6168218
Grant List:
P01 CA142538 / CA / NCI NIH HHS / United States
R01 GM126550 / GM / NIGMS NIH HHS / United States