Testing for association with rare variants in the coding and non-coding genome: RAVA-FIRST, a new approach based on CADD deleteriousness score

Open archive

Bocher, Ozvan | Ludwig, Thomas E | Oglobinsky, Marie-Sophie | Marenne, Gaelle | Deleuze, Jean-Francois | Suryakant, Suryakant | Odeberg, Jacob | Morange, Pierre-Emmanuel | Tregouet, David-Alexandre | Perdry, Herve | Genin, Emmanuelle

Published by CCSD; Public Library of Science

International audience. Rare variant association tests (RVAT) have been developed to study the contribution of rare variants, which are now widely accessible through high-throughput sequencing technologies. RVAT require aggregating rare variants into testing units and filtering variants to retain only those most likely to be causal. In the exome, genes are natural testing units and variants are usually filtered based on their functional consequences. However, when dealing with whole-genome sequence (WGS) data, both steps are challenging. No natural biological unit is available for aggregating rare variants. Sliding-window procedures have been proposed to circumvent this difficulty; however, they are blind to biological information and result in a large number of tests. We propose a new strategy to perform RVAT on WGS data: "RAVA-FIRST" (RAre Variant Association using Functionally-InfoRmed STeps), comprising three steps. (1) New testing units, referred to as "CADD regions", are defined genome-wide based on functionally-adjusted Combined Annotation Dependent Depletion (CADD) scores of variants observed in the gnomAD populations. (2) A region-dependent filtering of rare variants is applied in each CADD region. (3) A functionally-informed burden test is performed with sub-scores computed for each genomic category within each CADD region. On both simulations and real data, RAVA-FIRST was found to outperform other WGS-based RVAT. Applying it to a WGS dataset of venous thromboembolism patients, we identified an intergenic region on chromosome 18 enriched for rare variants in early-onset patients. This region, which was missed by standard sliding-window procedures, lies within a topologically associating domain (TAD) that contains a strong candidate gene. RAVA-FIRST enables new investigations of rare non-coding variants in complex diseases, facilitated by its implementation in the R package Ravages.
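Steps (2) and (3) of the abstract can be illustrated with a minimal Python sketch. It is not the Ravages implementation, which is in R: the variant records, region names, category labels, thresholds and genotype counts below are all invented for illustration, and the abstract does not say how the per-region thresholds are derived, so they are simply given as inputs here. The sketch keeps only variants whose score reaches the threshold of their own region, then sums allele counts per individual into one sub-score per (region, category) pair.

```python
from collections import defaultdict

# Hypothetical toy data: every name and value here is invented for illustration.
# Each rare variant carries its CADD region, a genomic category and a score.
variants = {
    "v1": {"region": "R1", "category": "coding", "score": 25.0},
    "v2": {"region": "R1", "category": "regulatory", "score": 8.0},
    "v3": {"region": "R1", "category": "intergenic", "score": 15.0},
    "v4": {"region": "R2", "category": "intergenic", "score": 12.0},
}

# Assumed per-region score thresholds (step 2). How they are derived is
# not stated in the abstract, so they are taken as given.
thresholds = {"R1": 10.0, "R2": 20.0}

# Rare-allele counts per individual, keyed by variant id.
genotypes = {
    "ind1": {"v1": 1, "v3": 2},
    "ind2": {"v2": 1, "v4": 1},
}


def filter_variants(variants, thresholds):
    """Keep variants whose score reaches the threshold of their own region."""
    return {
        vid: v
        for vid, v in variants.items()
        if v["score"] >= thresholds[v["region"]]
    }


def sub_scores(genotypes, kept):
    """Sum allele counts of kept variants per (region, category) for each individual."""
    scores = {}
    for ind, calls in genotypes.items():
        per_ind = defaultdict(int)
        for vid, count in calls.items():
            if vid in kept:
                v = kept[vid]
                per_ind[(v["region"], v["category"])] += count
        scores[ind] = dict(per_ind)
    return scores


kept = filter_variants(variants, thresholds)
scores = sub_scores(genotypes, kept)
print(sorted(kept))
print(scores)
```

In an actual analysis, sub-scores of this kind would then be tested for association with the phenotype; that step is omitted here because the abstract does not specify the test statistic.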

Suggestions

By the same author

Plasma levels of complement components C5 and C9 are associated with thrombin generation

Open archive | Diaz, Rocio Vacik | CCSD

International audience. BACKGROUND: The thrombin generation assay (TGA) evaluates the potential of plasma to generate thrombin over time, providing a global picture of an individual's hemostatic balance. OBJECTIVES:...

An artificial neural network approach integrating plasma proteomics and genetic data identifies PLXNA4 as a new susceptibility locus for pulmonary embolism

Open archive | Razzaq, Misbah | CCSD

International audience. Venous thromboembolism is the third most common cardiovascular disease and is composed of two entities, deep vein thrombosis (DVT) and its potentially fatal form, pulmonary embolism (PE). While PE i...

Author Correction: Elevated plasma complement factor H related 5 protein is associated with venous thromboembolism

Open archive | Iglesias, Maria Jesus | CCSD

International audience
