Title | Consistency and overfitting of multi-omics methods on experimental data. |
Publication Type | Journal Article |
Year of Publication | 2020 |
Authors | McCabe, Sean D., Dan-Yu Lin, and Michael I. Love |
Journal | Brief Bioinform |
Volume | 21 |
Issue | 4 |
Pagination | 1277-1284 |
Date Published | 2020 Jul 15 |
ISSN | 1477-4054 |
Keywords | Computational Biology, Genomics |
Abstract | Knowledge on the relationship between different biological modalities (RNA, chromatin, etc.) can help further our understanding of the processes through which biological components interact. The ready availability of multi-omics datasets has led to the development of numerous methods for identifying sources of common variation across biological modalities. However, evaluation of the performance of these methods, in terms of consistency, has been difficult because most methods are unsupervised. We present a comparison of sparse multiple canonical correlation analysis (Sparse mCCA), angle-based joint and individual variation explained (AJIVE) and multi-omics factor analysis (MOFA) using a cross-validation approach to assess overfitting and consistency. Both large and small-sample datasets were used to evaluate performance, and a permuted null dataset was used to identify overfitting through the application of our framework and approach. In the large-sample setting, we found that all methods demonstrated consistency and lack of overfitting; however, in the small-sample size setting, AJIVE provided the most stable results. We provide an R package so that our framework and approach can be applied to evaluate other methods and datasets. |
DOI | 10.1093/bib/bbz070 |
Alternate Journal | Brief Bioinform |
Original Publication | Consistency and overfitting of multi-omics methods on experimental data. |
PubMed ID | 31281919 |
PubMed Central ID | PMC7373174 |
Grant List | P30 ES010126 / ES / NIEHS NIH HHS / United States; R01 HG009974 / HG / NHGRI NIH HHS / United States; R01 HG009125 / HG / NHGRI NIH HHS / United States; T32 CA106209 / CA / NCI NIH HHS / United States; P01 CA142538 / CA / NCI NIH HHS / United States; R01 HL149683 / HL / NHLBI NIH HHS / United States |