Variable Selection in Model-based Clustering: A General Variable Role Modeling

Open archive

Maugis, Cathy | Celeux, Gilles | Martin-Magniette, Marie-Laure

Published by CCSD

Currently available variable selection procedures in model-based clustering assume that the irrelevant clustering variables are either all independent or all linked with the relevant clustering variables. We propose a more versatile variable selection model that describes three possible roles for each variable: relevant clustering variables, irrelevant clustering variables that depend on a subset of the relevant clustering variables, and irrelevant clustering variables that are totally independent of all the relevant variables. A model selection criterion and a variable selection algorithm are derived for this new variable role modeling. The model identifiability and the consistency of the variable selection criterion are also established. Numerical experiments on simulated datasets and on a real dataset illustrate the benefits of this new modeling.
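As an illustration only, the three variable roles described in the abstract can be sketched by simulating a small dataset. This is not the authors' model or code: the function name `simulate_variable_roles`, the dimensions, the cluster means, and the regression coefficients are all arbitrary assumptions chosen for the example.

```python
import numpy as np


def simulate_variable_roles(n=300, seed=0):
    """Simulate data with three variable roles (all settings are hypothetical)."""
    rng = np.random.default_rng(seed)

    # Role 1: relevant clustering variables, drawn from a two-component
    # Gaussian mixture, so their distribution depends on the cluster label.
    labels = rng.integers(0, 2, size=n)
    means = np.array([[0.0, 0.0], [4.0, 4.0]])
    x_relevant = means[labels] + rng.normal(size=(n, 2))

    # Role 2: irrelevant variables that depend on a subset of the relevant
    # ones, here through a linear regression on the first relevant variable.
    coef = np.array([[0.5, -1.0]])  # shape (1, 2), arbitrary values
    x_dependent = 1.0 + x_relevant[:, :1] @ coef + 0.5 * rng.normal(size=(n, 2))

    # Role 3: irrelevant variables independent of all relevant variables.
    x_independent = rng.normal(size=(n, 3))

    return labels, x_relevant, x_dependent, x_independent


labels, x_relevant, x_dependent, x_independent = simulate_variable_roles()
# Column-wise stacking gives the observed data matrix; a variable selection
# procedure would have to recover which columns play which role.
data = np.hstack([x_relevant, x_dependent, x_independent])
```

In this sketch the second group carries cluster information only indirectly, through its link with the relevant variables, while the third group carries none.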

Suggestions

By the same author

Letter to the Editor

Open archive | Celeux, Gilles | CCSD

International audience. No abstract.

Mixture models as a useful tool for identifying co-expressed genes from RNA-seq data

Open archive | Rau, Andrea | CCSD

National audience

Sélection de variables pour la classification par mélanges gaussiens pour prédire la fonction des gènes orphelins

Open archive | Maugis, Cathy | CCSD

Biologists are interested in predicting gene functions in organisms with sequenced genomes from microarray transcriptome data. The development of microarray technology allows one to study the whole genome in different experimen...
