Towards a machine learning approach for automated detection of well-to-well contamination in metagenomic data

Open archive

Goulet, Lindsay | Plaza Oñate, Florian | Prifti, Edi | Belda, Eugeni | Le Chatelier, Emmanuelle | Gautreau, Guillaume

Published by CCSD

International audience. Samples subjected to metagenomic sequencing can be accidentally contaminated during wet-lab steps (DNA extraction, library preparation) by DNA from an external source (e.g., lab reagents) or from other samples processed on the same plate (well-to-well contamination). Such contamination can bias results and ultimately lead to false conclusions if it goes undetected. Although it is a critical issue, well-to-well contamination remains understudied. A few tools have been developed, but they suffer from several limitations, such as a lack of sensitivity. By inspecting species abundance profiles of published cohort samples, we identified specific patterns associated with well-to-well contamination. Here, we propose an original method based on the recognition of such patterns that accurately detects contamination events even at low rates (up to 1%). Our approach does not require negative controls, works with related samples that may naturally share strains (e.g., mother/child), distinguishes contamination sources from contaminated samples, and estimates contamination rates. However, this method is time-consuming and requires human expertise to manually inspect suspect cases. We are developing a fully automated tool, based on deep learning and trained on semi-simulated sequencing data, to classify contaminated samples. As preliminary results are promising, we believe this method will significantly impact the field by making metagenomic experiments more robust.

Suggestions

By the same author

Towards a machine learning approach for automated detection of well-to-well contamination in metagenomics data

Open archive | Goulet, Lindsay | CCSD

International audience. Combining advances in high-throughput sequencing and Big Data, metagenomic sequencing has revolutionized our vision of the microscopic world and microbiology by allowing the characterization,...

CroCoDeEL: accurate control-free detection of cross-sample contamination in metagenomic data

Open archive | Goulet, Lindsay | CCSD

Metagenomic sequencing provides profound insights into microbial communities, but it is often compromised by technical biases, including cross-sample contamination. This underexplored phenomenon arises when microbial content is in...

CroCoDeEL: accurate detection of cross-sample contaminations in metagenomic data

Open archive | Goulet, Lindsay | CCSD

International audience. Metagenomic sequencing provides deep insights into microbial communities but is subject to various experimental biases such as cross-sample contamination where microbial contents from simulta...
