CroCoDeEL: accurate control-free detection of cross-sample contamination in metagenomic data

Open archive

Goulet, Lindsay | Oñate, Florian Plaza | Famechon, Alexandre | Quinquis, Benoît | Belda, Eugeni | Prifti, Edi | Le Chatelier, Emmanuelle | Gautreau, Guillaume

Published by CCSD

Metagenomic sequencing provides profound insights into microbial communities, but it is often compromised by technical biases, including cross-sample contamination. This underexplored phenomenon arises when microbial content is inadvertently exchanged among concurrently processed samples. Such contamination distorts microbial profiles and poses significant risks to the reliability of metagenomic data and downstream analyses. Despite its critical impact, this issue remains insufficiently addressed. To fill this gap, we introduce CroCoDeEL, a decision-support tool for detecting and quantifying cross-sample contamination. Leveraging a pre-trained supervised model, CroCoDeEL identifies contamination patterns in species abundance profiles with high accuracy. Unlike existing tools, it requires no negative controls or prior knowledge of sample processing positions, offering improved accuracy and versatility. Benchmarks across three public datasets demonstrate that CroCoDeEL accurately detects contaminated samples and identifies their contamination sources, even at low rates (<0.1%), provided that sequencing depth is sufficient. Our findings suggest that cross-sample contamination is prevalent in metagenomics and emphasize the necessity of systematically integrating contamination detection into sequencing data quality control.
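
The dependence on sequencing depth noted in the abstract can be illustrated with a back-of-the-envelope calculation. The sketch below is not CroCoDeEL's model. It assumes a simple linear mixing in which a species found only in the contamination source appears in the contaminated sample at (source abundance × contamination rate), and that read counts scale proportionally with relative abundance. The function name and all numeric values are hypothetical.

```python
def expected_contaminant_reads(source_abundance, contamination_rate, sequencing_depth):
    """Expected number of reads from a source-specific species in the
    contaminated sample, under a simple linear-mixing assumption.

    source_abundance   : relative abundance of the species in the source (0..1)
    contamination_rate : fraction of the contaminated sample coming from the source (0..1)
    sequencing_depth   : total number of reads sequenced for the contaminated sample
    """
    return source_abundance * contamination_rate * sequencing_depth


# Hypothetical scenario: a species at 1% in the source, contamination rate of 0.1%.
for depth in (100_000, 1_000_000, 10_000_000):
    reads = expected_contaminant_reads(0.01, 0.001, depth)
    print(f"{depth:>10,} total reads: about {reads:g} contaminant reads expected")
```

Under these assumptions, the species is expected to contribute roughly one read at 100,000 total reads but roughly one hundred at 10 million, which suggests why low contamination rates may only become detectable in deeply sequenced samples.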

Suggestions

By the same author

CroCoDeEL: accurate detection of cross-sample contaminations in metagenomic data

Open archive | Goulet, Lindsay | CCSD

International audience. Metagenomic sequencing provides deep insights into microbial communities but is subject to various experimental biases such as cross-sample contamination where microbial contents from simulta...

Towards a machine learning approach for automated detection of well-to-well contamination in metagenomics data

Open archive | Goulet, Lindsay | CCSD

International audience. Combining advances in high-throughput sequencing and Big Data, metagenomic sequencing has revolutionized our vision of the microscopic world and microbiology by allowing the characterization,...

Towards a machine learning approach for automated detection of well-to-well contamination in metagenomic data

Open archive | Goulet, Lindsay | CCSD

International audience. Samples subjected to metagenomic sequencing can be accidentally contaminated during wet lab steps (DNA extraction, library preparation) by DNA from an external source (e.g.: lab reagents) or ...
