Independent Component Analysis for Unraveling the Complexity of Cancer Omics Datasets

Open archive

Sompairac, Nicolas | Nazarov, Petr V. | Czerwinska, Urszula | Cantini, Laura | Biton, Anne | Molkenov, Askhat | Zhumadilov, Zhaxybay | Barillot, Emmanuel | Radvanyi, François | Gorban, Alexander N. | Kairov, Ulykbek | Zinovyev, Andrei

Published by CCSD; MDPI

International audience. Independent component analysis (ICA) is a matrix factorization approach in which the signals captured by each individual matrix factor are optimized to become as mutually independent as possible. Initially suggested for solving blind source separation problems in various fields, ICA was shown to be successful in analyzing functional magnetic resonance imaging (fMRI) and other types of biomedical data. In the last twenty years, ICA has become part of the standard machine learning toolbox, together with other matrix factorization methods such as principal component analysis (PCA) and non-negative matrix factorization (NMF). Here, we review a number of recent works where ICA was shown to be a useful tool for unraveling the complexity of cancer biology from the analysis of different types of omics data, mainly collected from tumor samples. These works highlight the use of ICA in dimensionality reduction, deconvolution, data pre-processing, and meta-analysis, among other tasks, applied to different data types (transcriptome, methylome, proteome, single-cell data). We particularly focus on the technical aspects of applying ICA in omics studies, such as the use of different protocols, determining the optimal number of components, assessing and improving the reproducibility of ICA results, and comparison with other popular matrix factorization techniques. We discuss emerging applications of ICA to the integrative analysis of multi-level omics datasets and introduce a conceptual view of ICA as a tool for defining the functional subsystems of a complex biological system and their interactions under various conditions. Our review is accompanied by a Jupyter notebook that illustrates the discussed concepts and provides a practical tool for applying ICA to the analysis of cancer omics datasets.
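
As a rough illustration of the factorization idea described in the abstract, the sketch below mixes three non-Gaussian toy signals and then unmixes them with a small FastICA-style fixed-point iteration written in NumPy. It is not the code from the accompanying Jupyter notebook: the function names (`whiten`, `sym_decorrelate`, `fast_ica`), the toy sources, the mixing matrix, and the choice of a tanh nonlinearity are all assumptions made for this example, and real omics data would additionally need a choice of the number of components.

```python
import numpy as np


def whiten(X):
    """Center each row of X (signals x observations) and decorrelate to unit variance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    K = (eigvecs / np.sqrt(eigvals)).T  # whitening matrix D^(-1/2) E^T
    return K @ Xc, K


def sym_decorrelate(W):
    """Symmetric orthogonalization: W <- (W W^T)^(-1/2) W."""
    s, u = np.linalg.eigh(W @ W.T)
    return (u / np.sqrt(s)) @ u.T @ W


def fast_ica(X, n_iter=500, tol=1e-8, seed=0):
    """Hypothetical minimal FastICA (tanh contrast, symmetric updates).

    Returns the estimated sources and the full unmixing matrix,
    so that sources = unmixing @ (X - row means).
    """
    rng = np.random.default_rng(seed)
    Z, K = whiten(X)
    n = Z.shape[0]
    W = sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(n_iter):
        G = np.tanh(W @ Z)
        G_prime = 1.0 - G ** 2
        # Fixed-point step: E[z g(w^T z)] - E[g'(w^T z)] w, for all rows at once
        W_new = G @ Z.T / Z.shape[1] - G_prime.mean(axis=1)[:, None] * W
        W_new = sym_decorrelate(W_new)
        # Converged when every row of W is unchanged up to sign
        change = np.max(np.abs(np.abs(np.sum(W_new * W, axis=1)) - 1.0))
        W = W_new
        if change < tol:
            break
    return W @ Z, W @ K


# Toy data (assumption: three independent, non-Gaussian hidden factors)
rng = np.random.default_rng(42)
n_obs = 5000
S_true = np.vstack([
    rng.laplace(size=n_obs),               # heavy-tailed source
    rng.uniform(-1.0, 1.0, size=n_obs),    # flat, sub-Gaussian source
    rng.exponential(size=n_obs) - 1.0,     # skewed source
])
A = np.array([[1.0, 0.5, 0.3],
              [0.4, 1.0, 0.6],
              [0.2, 0.7, 1.0]])            # arbitrary mixing matrix
X = A @ S_true                             # observed mixtures

S_est, unmixing = fast_ica(X)

# Components come back in arbitrary order and sign, so compare by |correlation|
corr = np.abs(np.corrcoef(S_est, S_true)[:3, 3:])
print(np.round(corr, 2))
```

Each column of the printed matrix should contain one entry close to 1, which marks the estimated component matching that true source. The order and sign of ICA components are not identifiable, which is one reason the reproducibility of components has to be assessed explicitly.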

Suggestions

By the same author

Determining the optimal number of independent components for reproducible transcriptomic data analysis

Open archive | Kairov, Ulykbek | CCSD

International audience. BACKGROUND: Independent Component Analysis (ICA) is a method that models gene expression data as an action of a set of statistically independent hidden factors. The output of ICA depends on a...

Assessing reproducibility of matrix factorization methods in independent transcriptomes

Open archive | Cantini, Laura | CCSD

International audience. MOTIVATION: Matrix factorization (MF) methods are widely used in order to reduce dimensionality of transcriptomic datasets to the action of few hidden factors (metagenes). MF algorithms have ...

BIODICA: a computational environment for Independent Component Analysis of omics data

Open archive | Captier, Nicolas | CCSD

International audience. We developed BIODICA, an integrated computational environment for application of Independent Component Analysis (ICA) to bulk and single-cell molecular profiles, interpretation of the results...
