Title | Determining the Number of Latent Factors in Statistical Multi-Relational Learning. |
Publication Type | Journal Article |
Year of Publication | 2019 |
Authors | Shi, Chengchun, Wenbin Lu, and Rui Song |
Journal | J Mach Learn Res |
Volume | 20 |
Date Published | 2019 |
ISSN | 1532-4435 |
Abstract | Statistical relational learning is primarily concerned with learning and inferring relationships between entities in large-scale knowledge graphs. Nickel et al. (2011) proposed a RESCAL tensor factorization model for statistical relational learning, which achieves better or at least comparable results on common benchmark data sets when compared to other state-of-the-art methods. Given a positive integer s, RESCAL computes an s-dimensional latent vector for each entity. The latent factors can be further used for solving relational learning tasks, such as collective classification, collective entity resolution and link-based clustering. The focus of this paper is to determine the number of latent factors in the RESCAL model. Due to the structure of the RESCAL model, its log-likelihood function is not concave. As a result, the corresponding maximum likelihood estimators (MLEs) may not be consistent. Nonetheless, we design a specific pseudometric, prove the consistency of the MLEs under this pseudometric and establish its rate of convergence. Based on these results, we propose a general class of information criteria and prove their model selection consistencies when the number of relations is either bounded or diverges at a proper rate of the number of entities. Simulations and real data examples show that our proposed information criteria have good finite sample properties. |
DOI | 10.1214/009053606000001217 |
Alternate Journal | J Mach Learn Res |
Original Publication | Determining the number of latent factors in statistical multi-relational learning. |
PubMed ID | 31983896 |
PubMed Central ID | PMC6980192 |
Grant List | P01 CA142538 / CA / NCI NIH HHS / United States |