Degrees of freedom estimation in Principal Component Analysis and Consensus Principal Component Analysis

Open archive

Hassani, Sahar | Martens, Harald | Qannari, El Mostafa | Kohler, Achim

Published by CCSD; Elsevier

International audience. The concept of degrees of freedom (DF) plays an important role in statistical model assessment and parameter estimation. In this paper, we investigate this concept in the context of data modeling by Principal Component Analysis (PCA) and its multi-block extension, Consensus Principal Component Analysis (CPCA). We run simulation studies and assess the degrees of freedom by comparing cross-validated error estimates with error estimates from uncorrected model fits. These simulation studies reveal that the DF consumption in PCA and CPCA depends on the eigenvalue structure of the data at hand. We also show that the resulting DF estimates can be used to obtain realistic error estimates without performing cross-validation. Furthermore, we show how different cross-validation strategies and the use of an independent test set affect the estimate of the degrees of freedom and the estimate of the model error. © 2012 Elsevier B.V. All rights reserved.
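The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the record does not give their formulas, so the code assumes one plausible reading, in which the DF estimate is the value that makes the DF-corrected fit error, SS_fit / (N·K − DF), equal to the cross-validated mean squared error. The function names (`pca_loadings`, `residual_ss`, `estimate_df`), the row-wise K-fold scheme, and the simulated data are all hypothetical choices made for illustration.

```python
import numpy as np


def pca_loadings(X, n_components):
    """Return the column means and the first n_components loadings (K x A)."""
    mean = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components].T


def residual_ss(X, mean, P):
    """Sum of squared residuals after projecting centered X onto loadings P."""
    Xc = X - mean
    E = Xc - Xc @ P @ P.T
    return float(np.sum(E ** 2))


def estimate_df(X, n_components, n_folds=5, seed=0):
    """Assumed DF estimate: solve SS_fit / (N*K - DF) = MSE_cv for DF."""
    n, k = X.shape

    # Uncorrected fit error: the model is fitted and evaluated on all rows.
    mean, P = pca_loadings(X, n_components)
    mse_fit = residual_ss(X, mean, P) / (n * k)

    # Row-wise K-fold cross-validation (an assumption; other strategies exist).
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), n_folds)
    press = 0.0
    for test_idx in folds:
        train = np.ones(n, dtype=bool)
        train[test_idx] = False
        m, P_train = pca_loadings(X[train], n_components)
        press += residual_ss(X[~train], m, P_train)
    mse_cv = press / (n * k)

    df = n * k * (1.0 - mse_fit / mse_cv)
    return df, mse_fit, mse_cv


if __name__ == "__main__":
    # Simulated data: two latent components plus noise (illustrative only).
    rng = np.random.default_rng(1)
    scores = rng.normal(size=(40, 2))
    loadings = rng.normal(size=(2, 10))
    X = scores @ loadings + 0.5 * rng.normal(size=(40, 10))
    df, mse_fit, mse_cv = estimate_df(X, n_components=2)
    print(f"fit MSE={mse_fit:.4f}  CV MSE={mse_cv:.4f}  estimated DF={df:.1f}")
```

Changing the relative sizes of the latent components in the simulated data is one way to explore how such an estimate responds to the eigenvalue structure. The row-wise scheme projects each left-out row onto the training loadings using that row's own values, so its error estimate may be optimistic compared with other cross-validation strategies.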


Suggestions

By the same author

Analysis of -omics data: Graphical interpretation and validation tools in multi-block methods

Open archive | Hassani, Sahar | CCSD

International audience. As systems biology develops, various types of high-throughput -omics data become rapidly available. An increasing challenge is to analyze such massive data, interpret the results and validate...

Model validation and error estimation in multi-block partial least squares regression

Open archive | Hassani, Sahar | CCSD

Document Type: Proceedings Paper. Conference Date: September 20-24, 2010. Conference Location: Rabat, Morocco. Conference Host: Min Sch Rabat. International audience. While validation of Partial Least Squares Regression ...

Deflation strategies for multi-block principal component analysis revisited

Open archive | Hassani, Sahar | CCSD

International audience. Within the framework of multi-block data sets, multi-block principal component analysis has been successfully used as a tool to investigate the structure of spectroscopy, -omics and sensory d...
