Principal Component Analysis for Interval-Valued Observations

Open archive

Douzal-Chouakria, Ahlame | Billard, Lynne | Diday, Edwin

Published by CCSD; Wiley

International audience. One feature of contemporary datasets is that, instead of the single point value in the p-dimensional space ℝ^p seen in classical data, the data may take interval values, thus producing hypercubes in ℝ^p. This paper studies the vertices principal components methodology for interval-valued data, and provides enhancements to allow for so-called 'trivial' intervals and generalized weight functions. It also introduces the concept of vertex contributions to the underlying principal components, a concept not possible for classical data, but one which provides a visualization method that further aids in the interpretation of the methodology. The method is illustrated on a dataset of facial-characteristic measurements obtained from a study of face recognition patterns for surveillance purposes. A comparison with analyses in which classical surrogates replace the intervals shows how the symbolic analysis gives more informative conclusions. A second example illustrates how the method can be applied even when the number of parameters exceeds the number of observations, as well as how uncertainty data can be accommodated. © 2011 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 4: 229-246, 2011
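As a rough illustration of the vertices idea summarized in the abstract, the following Python sketch expands each interval-valued observation into the 2^p vertices of its hypercube, runs an ordinary correlation-based PCA on the stacked vertices, and reports each observation's principal component as the interval spanned by its projected vertices. This is a minimal sketch under stated assumptions, not the paper's implementation: the function names `vertices_matrix` and `vertices_pca` are hypothetical, all vertices receive equal weight, and the paper's enhancements (its treatment of trivial intervals, generalized weight functions, vertex contributions) are not reproduced.

```python
import itertools

import numpy as np


def vertices_matrix(lower, upper):
    """Stack the 2**p vertices of each of the n hypercubes.

    Returns an (n * 2**p, p) matrix and, for each row, the index of the
    observation that owns the vertex.
    """
    n, p = lower.shape
    bounds = np.stack([lower, upper], axis=0)  # shape (2, n, p)
    rows, owner = [], []
    for i in range(n):
        # Each choice picks the lower (0) or upper (1) endpoint per variable.
        for choice in itertools.product((0, 1), repeat=p):
            rows.append([bounds[c, i, j] for j, c in enumerate(choice)])
            owner.append(i)
    return np.asarray(rows, dtype=float), np.asarray(owner)


def vertices_pca(lower, upper, n_components=2):
    """Equal-weight vertices PCA (illustrative assumption, see lead-in)."""
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    X, owner = vertices_matrix(lower, upper)

    # Standardize the vertex cloud, guarding against constant columns.
    std = X.std(axis=0)
    std[std == 0] = 1.0
    Z = (X - X.mean(axis=0)) / std

    # Classical PCA on the correlation matrix of the vertices.
    corr = Z.T @ Z / Z.shape[0]
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_components]
    scores = Z @ eigvecs[:, order]

    # Interval-valued component: min and max over each observation's vertices.
    n = lower.shape[0]
    pc_lower = np.array([scores[owner == i].min(axis=0) for i in range(n)])
    pc_upper = np.array([scores[owner == i].max(axis=0) for i in range(n)])
    return eigvals[order], pc_lower, pc_upper


# Made-up toy intervals (3 observations, 2 variables), for illustration only.
toy_lower = np.array([[0.0, 0.0], [2.0, 1.0], [4.0, 3.0]])
toy_upper = np.array([[1.0, 1.0], [3.0, 2.0], [5.0, 5.0]])
toy_eigvals, toy_pc_lower, toy_pc_upper = vertices_pca(toy_lower, toy_upper)
```

Because the number of vertices grows as 2^p per observation, a direct enumeration like this one is only practical for a small number of variables.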

Suggestions

By the same author

Classification and Regression Trees on Aggregate Data Modeling: An Application in Acute Myocardial Infarction

Open archive | Quantin, C. | CCSD

International audience. Cardiologists are interested in determining whether the type of hospital pathway followed by a patient is predictive of survival. The study objective was to determine whether accounting for h...

Copula analysis of mixture models

Open archive | Vrac, Mathieu | CCSD

International audience. Contemporary computers collect databases that can be too large for classical methods to handle. The present work takes data whose observations are distribution functions (rather than the sing...

An Exploratory Analysis of Multiple Multivariate Time Series

Open archive | Billard, Lynne | CCSD

International audience. Our aim is to extend standard principal component analysis for non-time series data to explore and highlight the main structure of multiple sets of multivariate time series. To this end, stan...
