An informed machine learning based environmental risk score for hypertension in European adults

Open archive

Guimbaud, Jean-Baptiste | Calabre, Emilie | de Cid, Rafael | Lassale, Camille | Kogevinas, Manolis | Maître, Léa | Cazabet, Rémy

Published by CCSD; Elsevier

International audience. Background: The exposome framework seeks to unravel the cumulative effects of environmental exposures on health. However, existing methods struggle with challenges including multicollinearity, non-linearity and confounding. To address these limitations, we introduce SEANN (Summary Effect Adjusted Neural Network), a novel approach that integrates pooled effect sizes, a form of domain knowledge, with neural networks to improve the analysis and interpretation of hypertension risk factors.
Methods: Using data from 18,337 participants aged 40-65 years in the GCAT cohort in Catalonia, covering a diverse selection of 53 environmental factors, we computed two environmental risk scores for hypertension prevalence using deep neural networks: an informed risk score using SEANN, which integrates 11 different pooled effect size estimates from meta-analyses, and an agnostic counterpart for comparison. For each score, we computed Shapley values to extract and compare the exposure-outcome relationships learnt by each neural network model.
Results: Predictive performance was similarly good for the agnostic NN and SEANN (AUC 0.7). However, we demonstrate substantial improvements in the scientific validity of the relationships captured by the informed risk score. For directly informed variables, the learnt relationships were closer to the corresponding relationships observed in the literature, and non-informed variables were also successfully adjusted, with directions of association more in line with previous studies. The mean delta SHAP distance between the relationships extracted from each model and those observed in the literature, averaged over all variables, was 6 times lower with SEANN than with the agnostic NN. The most influential environmental variables within the informed risk score included smoking intensity, Mediterranean diet adherence, coffee consumption and sedentary behaviour.
Conclusions: This study demonstrates the added value of SEANN over conventional, purely data-driven machine learning approaches. By aligning learned relationships with established literature-based effect sizes, SEANN improves the disentanglement of exposure effects on hypertension.

Suggestions

By the same author

State-of-the-art methods for exposure-health studies: Results from the exposome data challenge event

Open archive | Maitre, Léa | CCSD

International audience. The exposome recognizes that individuals are exposed simultaneously to a multitude of different environmental factors and takes a holistic approach to the discovery of etiological factors for...

Machine learning-based health environmental-clinical risk scores in European children

Open archive | Guimbaud, Jean-Baptiste | CCSD

International audience. Background: Early life environmental stressors play an important role in the development of multiple chronic disorders. Previous studies that used environmental risk scores (ERS) to assess the...

Climate anxiety and its association with health behaviours and generalized anxiety: An intensive longitudinal study

Open archive | Williams, Marc | CCSD

International audience. Objectives: The United Nations recognize the importance of balancing the needs of people and the planetary systems on which human health relies. This paper investigates the role that ...
