High–Dimensional Sparse Matched Case–Control and Case–Crossover Data: A Review of Recent Works, Description of an R Tool and an Illustration of the Use in Epidemiological Studies

Open archive

Avalos, Marta | Grandvalet, Yves | Pouyes, Hélène | Orriols, Ludivine | Lagarde, Emmanuel

Published by CCSD; Springer

International audience. The conditional logistic regression model is the standard tool for analyzing epidemiological studies in which one or more cases (subjects experiencing the event of interest) are matched with one or more controls (subjects not showing the event). These situations arise, for example, in matched case–control and case–crossover studies. In sparse and high-dimensional settings, penalized methods such as the Lasso have emerged as an alternative to conventional estimation and variable selection procedures. We describe the R package clogitLasso, which brings together algorithms to estimate the parameters of conditional logistic models using sparsity-inducing penalties. Most individually matched designs are covered and, besides the Lasso, Elastic Net, adaptive Lasso and bootstrapped versions are available. Different criteria for choosing the regularization term are implemented, accounting for the dependency of the data. Finally, stability is assessed by resampling methods. We first review the recent works pertaining to clogitLasso. We also report its use in the exploratory analysis of a large pharmacoepidemiological study.
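To make the idea of a sparsity-inducing penalty on a conditional logistic model concrete, here is a minimal sketch in Python. It is not the clogitLasso implementation or its API. It assumes the simplest design, 1:1 matching, where the conditional likelihood reduces to an intercept-free logistic model on the within-pair differences (case minus control), and it uses plain proximal gradient descent. The function names `soft_threshold` and `fit_pairs_lasso`, the step size and the iteration count are all hypothetical choices made for illustration.

```python
import math


def soft_threshold(z, t):
    """Proximal operator of t * |z| (the Lasso shrinkage step)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_pairs_lasso(diffs, lam, step=0.1, n_iter=2000):
    """L1-penalized conditional logistic fit for 1:1 matched pairs.

    diffs[i] is the covariate vector x_case - x_control of pair i.
    Minimizes (1/n) * sum_i log(1 + exp(-diffs[i] . beta)) + lam * ||beta||_1
    by proximal gradient descent. The fixed step size is an assumption:
    it must be small enough for the data at hand.
    """
    n, p = len(diffs), len(diffs[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        for d in diffs:
            eta = sum(dj * bj for dj, bj in zip(d, beta))
            # w = d/d(eta) of log(1 + exp(-eta)), computed without overflow
            if eta >= 0:
                e = math.exp(-eta)
                w = -e / (1.0 + e)
            else:
                w = -1.0 / (1.0 + math.exp(eta))
            for j in range(p):
                grad[j] += w * d[j] / n
        # gradient step on the smooth part, then shrinkage for the penalty
        beta = [soft_threshold(b - step * g, step * lam)
                for b, g in zip(beta, grad)]
    return beta


# Toy data (hypothetical): four pairs, two binary exposures.
toy_pairs = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]]
for lam in (0.0, 0.1, 10.0):
    print(lam, fit_pairs_lasso(toy_pairs, lam))
```

On this toy data, increasing `lam` shrinks the first coefficient toward zero, and a large enough value sets every coefficient exactly to zero, which is the variable selection behaviour the abstract refers to. Designs with several controls per case require the general conditional likelihood and are not covered by this sketch.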


Suggestions

By the same author

Sparse conditional logistic regression for analyzing large-scale matched data from epidemiological studies: a simple algorithm

Open archive | Avalos, Marta | CCSD

International audience. This paper considers the problem of estimation and variable selection for large high-dimensional data (high number of predictors p and large sample size N, without excluding the possibility t...

Variable selection on large case-crossover data: application to a registry-based study of prescription drugs and road traffic crashes.

Open archive | Avalos, Marta | CCSD

International audience. PURPOSE: In exploratory analyses of pharmacoepidemiological data from large populations with large number of exposures, both a conceptual and computational problem is how to screen hypotheses...

clogitLasso: an R package for high–dimensional analysis of matched case–control and case–crossover data

Open archive | Avalos, Marta, Fernandez | CCSD

International audience. The conditional logistic regression model is the standard tool for the analysis of epidemiological studies in which one or more cases (the event of interest), are individually matched with on...
