Sparse conditional logistic regression for analyzing large-scale matched data from epidemiological studies: a simple algorithm

Open archive

Avalos, Marta | Pouyes, Hélène | Grandvalet, Yves | Orriols, Ludivine | Lagarde, Emmanuel

Published by CCSD; BioMed Central

International audience. This paper addresses estimation and variable selection for large, high-dimensional data (a large number of predictors p and a large sample size N, possibly with N < p) arising from an individually matched case-control study. We develop a simple algorithm that adapts the Lasso and related methods to the conditional logistic regression model. Our proposal relies on simplifying the calculations involved in the likelihood function. The proposed algorithm then iteratively solves reweighted Lasso problems using cyclical coordinate descent, computed along a regularization path. The method can handle large problems and deals efficiently with sparse features. We discuss its benefits and drawbacks relative to existing implementations. We also illustrate the relevance and use of these techniques on a pharmacoepidemiological study of medication use and traffic safety.
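As a rough illustration of the kind of procedure the abstract describes, the sketch below fits an L1-penalized conditional logistic model for the special case of 1:1 matching. It is not the authors' implementation. It assumes that, with one control per case, the conditional likelihood of a pair reduces to a logistic model without intercept on the case-minus-control covariate differences. The function names (`clogit_lasso_pairs`, `lasso_path`, `lambda_max`), the penalty scaling, the weight floor, and the stopping rules are all hypothetical choices made for this example.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Soft-thresholding operator used by the Lasso coordinate update."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lambda_max(diffs):
    """Penalty at or above which the all-zero solution is optimal (up to rounding)."""
    n, p = len(diffs), len(diffs[0])
    return max(abs(sum(d[j] for d in diffs)) / (2.0 * n) for j in range(p))


def clogit_lasso_pairs(diffs, lam, beta_init=None, n_outer=50, n_inner=100, tol=1e-8):
    """Sketch of L1-penalized conditional logistic regression for 1:1 matched pairs.

    diffs: list of rows, each row = case covariates minus control covariates.
    lam:   penalty on the mean negative conditional log-likelihood (assumed scaling).
    """
    n, p = len(diffs), len(diffs[0])
    cols = [[d[j] for d in diffs] for j in range(p)]
    beta = list(beta_init) if beta_init is not None else [0.0] * p
    for _ in range(n_outer):
        # Quadratic (reweighted least squares) approximation at the current beta.
        eta = [sum(d[j] * beta[j] for j in range(p)) for d in diffs]
        prob = [sigmoid(e) for e in eta]
        weight = [max(q * (1.0 - q), 1e-5) for q in prob]  # floor is an assumption
        # Working residual z - eta; every pair has "outcome" 1 in difference form.
        resid = [(1.0 - prob[i]) / weight[i] for i in range(n)]
        beta_start = beta[:]
        # Cyclical coordinate descent on the weighted Lasso subproblem.
        for _ in range(n_inner):
            max_change = 0.0
            for j in range(p):
                col = cols[j]
                denom = sum(w * c * c for w, c in zip(weight, col)) / n
                if denom == 0.0:
                    continue
                rho = sum(w * c * (r + c * beta[j])
                          for w, c, r in zip(weight, col, resid)) / n
                new_bj = soft_threshold(rho, lam) / denom
                delta = new_bj - beta[j]
                if delta != 0.0:
                    resid = [r - c * delta for r, c in zip(resid, col)]
                    beta[j] = new_bj
                    max_change = max(max_change, abs(delta))
            if max_change < tol:
                break
        if max(abs(a - b) for a, b in zip(beta, beta_start)) < tol:
            break
    return beta


def lasso_path(diffs, n_lambda=10, ratio=0.01):
    """Fit along a decreasing geometric grid of penalties, warm-starting each fit."""
    lam_hi = lambda_max(diffs)
    path, beta = [], None
    for k in range(n_lambda):
        lam = lam_hi * ratio ** (k / max(n_lambda - 1, 1))
        beta = clogit_lasso_pairs(diffs, lam, beta_init=beta)
        path.append((lam, beta))
    return path
```

Calling `lasso_path(diffs)` on a list of case-minus-control rows returns `(lambda, coefficients)` pairs from the strongest to the weakest penalty. This pure-Python version favors readability over speed and does not exploit sparse features; it only illustrates the nesting of the reweighting loop, the coordinate descent loop, and the regularization path.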

Suggestions

By the same author

High–Dimensional Sparse Matched Case–Control and Case–Crossover Data: A Review of Recent Works, Description of an R Tool and an Illustration of the Use in Epidemiological Studies

Open archive | Avalos, Marta | CCSD

International audience. The conditional logistic regression model is the standard tool for the analysis of epidemiological studies in which one or more cases (the event of interest) are matched with one or more con...

Variable selection on large case-crossover data: application to a registry-based study of prescription drugs and road traffic crashes.

Open archive | Avalos, Marta | CCSD

International audience. PURPOSE: In exploratory analyses of pharmacoepidemiological data from large populations with a large number of exposures, both a conceptual and computational problem is how to screen hypotheses...

clogitLasso: an R package for high–dimensional analysis of matched case–control and case–crossover data

Open archive | Avalos, Marta, Fernandez | CCSD

International audience. The conditional logistic regression model is the standard tool for the analysis of epidemiological studies in which one or more cases (the event of interest) are individually matched with on...
