Analysis of -omics data: Graphical interpretation- and validation tools in multi-block methods

Open archive

Hassani, Sahar | Martens, Harald | Qannari, El Mostafa | Hanafi, Mohamed | Borge, Grethe Iren | Kohler, Achim

Published by CCSD; Elsevier

International audience. As systems biology develops, various types of high-throughput -omics data are rapidly becoming available. An increasing challenge is to analyze such massive data, interpret the results and validate the findings. Data analysis for most -omics techniques is still at an early stage. The dimensionality of the data tables alone calls for new ways to reveal structure in the data without cognitive overload or an excessive false discovery rate. Multi-block methods have been developed and adapted to find common variation patterns in data and to depict these findings in graphical displays, while providing tools that aid interpretation of the outcomes. In particular, multi-block methods based on latent variables are powerful tools for studying block and global variation patterns, e.g. by inspecting block and global score plots. These methods give an efficient graphical overview of sample and variable variation patterns. However, visual detection of patterns may be subjective, so validation tools are needed. This paper presents tools for validating visually identified patterns in multi-block results. Cross-validated estimates of the Root Mean Square Error (RMSE) for block results are introduced for estimating the number of relevant principal components (PCs) in Consensus Principal Component Analysis (CPCA) models. Furthermore, important variables are identified by approximate t-tests based on Procrustes-corrected jackknifing. Block stability plots are introduced for assessing the stability of score patterns. Stability plots can also reveal outliers graphically at block and global level.
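To make the idea of approximate t-tests from Procrustes-corrected jackknifing more concrete, here is a minimal sketch. It is not the authors' implementation and it makes several simplifying assumptions: ordinary PCA on a single data block stands in for CPCA, the resampling is leave-one-sample-out, and the standard error is computed from squared deviations around the full-model loadings scaled by (n - 1)/n, which is one common jackknife variant. The function names (`pca_loadings`, `procrustes_rotation`, `jackknife_loading_t`) and the simulated data are hypothetical.

```python
import numpy as np


def pca_loadings(X, n_components):
    """Loadings (variables x components) of column-centered X, via SVD."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:n_components].T


def procrustes_rotation(A, B):
    """Orthogonal matrix R that minimizes the Frobenius norm of A @ R - B."""
    U, _, Vt = np.linalg.svd(A.T @ B)
    return U @ Vt


def jackknife_loading_t(X, n_components):
    """Approximate t-values for PCA loadings from leave-one-out jackknifing.

    Each sub-model is rotated towards the full model before comparison, so
    that sign flips and rotations of the components do not inflate the
    estimated uncertainty.
    """
    n = X.shape[0]
    P_full = pca_loadings(X, n_components)
    sq_dev = np.zeros_like(P_full)
    for i in range(n):
        X_sub = np.delete(X, i, axis=0)           # leave sample i out
        P_sub = pca_loadings(X_sub, n_components)
        R = procrustes_rotation(P_sub, P_full)    # align sub-model to full model
        sq_dev += (P_sub @ R - P_full) ** 2
    std_err = np.sqrt(sq_dev * (n - 1) / n)       # assumed jackknife scaling
    t_values = P_full / std_err
    return P_full, std_err, t_values


# Simulated example: variables 0-2 share one latent factor, 3-5 are pure noise.
rng = np.random.default_rng(0)
latent = 3.0 * rng.standard_normal((30, 1))
X = 0.3 * rng.standard_normal((30, 6))
X[:, :3] += latent

loadings, std_err, t_values = jackknife_loading_t(X, n_components=2)
print(np.round(t_values[:, 0], 1))  # approximate t-values on the first PC
```

In this simulated setting the variables driven by the latent factor should show much larger absolute t-values on the first component than the noise variables, which is the kind of variable screening the abstract describes. A multi-block version would apply the same resampling to block and global loadings of the CPCA model.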


Suggestions

By the same author

Model validation and error estimation in multi-block partial least squares regression

Open archive | Hassani, Sahar | CCSD

Document Type: Proceedings Paper. Conference Date: SEP 20-24, 2010. Conference Location: Rabat, MOROCCO. Conference Host: Min Sch Rabat. International audience. While validation of Partial Least Squares Regression ...

Degrees of freedom estimation in Principal Component Analysis and Consensus Principal Component Analysis

Open archive | Hassani, Sahar | CCSD

International audience. The concept of degrees of freedom (DF) is an important issue in statistical model assessment and parameter estimation. In this paper, we investigate this concept within the context of data mod...

Deflation strategies for multi-block principal component analysis revisited

Open archive | Hassani, Sahar | CCSD

International audience. Within the framework of multi-block data sets, multi-block principal component analysis has been successfully used as a tool to investigate the structure of spectroscopy, -omics and sensory d...
