A Massive Data Framework for M-Estimators with Cubic-Rate.

Title: A Massive Data Framework for M-Estimators with Cubic-Rate.
Publication Type: Journal Article
Year of Publication: 2018
Authors: Shi, Chengchun, Wenbin Lu, and Rui Song
Journal: J Am Stat Assoc
Date Published: 2018

The divide and conquer method is a common strategy for handling massive data. In this article, we study the divide and conquer method for cubic-rate estimators under the massive data framework. We develop a general theory for the asymptotic distribution of aggregated M-estimators formed as a weighted average, with weights depending on the subgroup sample sizes. Under certain conditions on the growth rate of the number of subgroups, the resulting aggregated estimators are shown to have a faster convergence rate and an asymptotically normal distribution, making them more tractable in both computation and inference than the original M-estimators based on pooled data. Our theory applies to a wide class of M-estimators with cube-root convergence rate, including the location estimator, the maximum score estimator, and the value search estimator. Simulations and a real data application also support our theoretical findings.
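The aggregation step described in the abstract can be illustrated with a minimal Python sketch for the location estimator. Everything in it is an assumption made for illustration and is not taken from the article: the function names (`location_estimate`, `aggregate`, `divide_and_conquer`), the fixed interval half-width of 1, the random equal-sized split, and in particular the weights proportional to n_j^(2/3), which are chosen here only because the variance of a cube-root-rate estimator scales like n^(-2/3). The article's actual weighting scheme and conditions may differ.

```python
import numpy as np


def location_estimate(x, half_width=1.0):
    """Centre of the fixed-width interval that contains the most points.

    Illustrative cube-root-rate location estimator (assumed form).
    """
    x = np.sort(np.asarray(x, dtype=float))
    # An optimal interval can always be shifted so its left end is a data point,
    # so it suffices to check intervals [x_i, x_i + 2 * half_width].
    right = np.searchsorted(x, x + 2.0 * half_width, side="right")
    counts = right - np.arange(x.size)  # points falling in each candidate interval
    best = int(np.argmax(counts))
    return float(x[best] + half_width)


def aggregate(estimates, sizes, power=2.0 / 3.0):
    """Weighted average of subgroup estimates.

    Weights proportional to n_j ** power are an assumption of this sketch.
    """
    weights = np.asarray(sizes, dtype=float) ** power
    return float(np.sum(weights * np.asarray(estimates, dtype=float)) / np.sum(weights))


def divide_and_conquer(x, n_groups, rng):
    """Randomly split the data, estimate on each subgroup, then aggregate."""
    shuffled = rng.permutation(np.asarray(x, dtype=float))
    groups = np.array_split(shuffled, n_groups)
    estimates = [location_estimate(g) for g in groups]
    sizes = [g.size for g in groups]
    return aggregate(estimates, sizes)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    data = rng.normal(loc=0.0, scale=1.0, size=20_000)
    print("pooled estimate:    ", location_estimate(data))
    print("aggregated estimate:", divide_and_conquer(data, n_groups=20, rng=rng))
```

For standard normal data the target of this estimator is 0, so both printed values should be close to 0. Each subgroup estimate can be computed independently, which is what makes the approach attractive for data too large to process in one pass.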

Alternate Journal: J Am Stat Assoc
Original Publication: A massive data framework for M-estimators with cubic-rate.
PubMed ID: 30739966
PubMed Central ID: PMC6364750
Grant List: P01 CA142538 / CA / NCI NIH HHS / United States