Quality control of microbiota metagenomics by k-mer analysis

Open archive

Plaza Onate, Florian | Batto, Jean-Michel | Juste, Catherine | Fadlallah, Jehane | Fougeroux, Cyrielle | Gouas, Doriane | Pons, Nicolas | Kennedy, Sean | Levenez, Florence | Dore, Joel | Dusko Ehrlich, S. | Gorochov, Guy | Larsen, Martin

Published by CCSD; BioMed Central

International audience.

Background: The biological and clinical consequences of the tight interactions between host and microbiota are rapidly being unraveled by next-generation sequencing technologies and sophisticated bioinformatics, also referred to as microbiota metagenomics. The recent success of metagenomics has created a demand to rapidly apply the technology to large case–control cohort studies and to studies of microbiota from various habitats, including habitats relatively poor in microbes. It is therefore of foremost importance to enable a robust and rapid quality assessment of metagenomic data from samples that challenge present technological limits (sample numbers and size). Here we demonstrate that the distribution of overlapping k-mers of metagenome sequence data predicts sequence quality as defined by gene distribution and efficiency of sequence mapping to a reference gene catalogue.

Results: We used serial dilutions of gut microbiota metagenomic datasets to generate well-defined high- to low-quality metagenomes. We also analyzed a collection of 52 microbiota-derived metagenomes. We demonstrate that k-mer distributions of metagenomic sequence data identify sequence contaminations, such as sequences derived from "empty" ligation products. Of note, k-mer distributions were also able to predict the frequency of sequences mapping to a reference gene catalogue, not only for the well-defined serial dilution datasets but also for 52 human gut microbiota-derived metagenomic datasets.

Conclusions: We propose that k-mer analysis of raw metagenome sequence reads should be implemented as a first quality assessment prior to more extensive bioinformatics analysis, such as sequence filtering and gene mapping. With the rising demand for metagenomic analysis of microbiota, it is crucial to provide tools for rapid and efficient decision making. This will eventually lead to a faster turnaround time, improved analytical quality including sample quality metrics, and a significant cost reduction. Finally, improved quality assessment will have a major impact on the robustness of biological and clinical conclusions drawn from metagenomic studies.
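The idea of a k-mer distribution computed from raw reads can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not state the value of k, the summary statistic, or the software used, so the function names, the choice of k, and the "fraction of occurrences in repeated k-mers" metric below are all assumptions made for illustration only.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every overlapping k-mer across a collection of reads.

    Hypothetical helper: k-mers containing the ambiguous base 'N' are skipped.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:
                counts[kmer] += 1
    return counts


def kmer_spectrum(counts):
    """Map each multiplicity to the number of distinct k-mers seen that often."""
    return Counter(counts.values())


def fraction_in_repeated_kmers(counts, min_count=2):
    """Share of all k-mer occurrences that belong to k-mers seen >= min_count times.

    Illustrative metric only (an assumption, not taken from the paper).
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    repeated = sum(c for c in counts.values() if c >= min_count)
    return repeated / total


if __name__ == "__main__":
    # Toy reads; real input would be raw reads parsed from FASTQ files.
    toy_reads = ["ACGTACGT", "ACGTTTTT"]
    toy_counts = count_kmers(toy_reads, k=4)
    print(kmer_spectrum(toy_counts))
    print(fraction_in_repeated_kmers(toy_counts))
```

In a sketch like this, a dataset in which a handful of k-mers account for most occurrences would stand out from one with a broad spectrum, which is the kind of signal one might expect from contaminating sequences such as "empty" ligation products. For real datasets a dedicated k-mer counter would be needed, since an in-memory `Counter` does not scale to full sequencing runs.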

Suggestions

By the same author

Abundance-based reconstitution of microbial pan-genomes from whole-metagenome shotgun sequencing data : Application to the study the human gut microbiota. Reconstitution de pan-génomes microbiens par séquençage métagénomique aléatoire : Application à l’étude du microbiote intestinal humain

Open archive | Plaza Onate, Florian | CCSD

The advent of shotgun metagenomic sequencing has revolutionized microbiology by allowing culture-independent characterization of complex microbial communities such as the human gut microbiota. Recently developed bioinformatics too...

Meteor2: accurate taxonomic profiling of host-associated microbial communities for shotgun metagenomics

Open archive | Ghozlane, Amine | CCSD

International audience. Objective: Detection and quantification of microbial species is a fundamental task in metagenomics. Many taxonomic profilers are available for this purpose, each with their own strengths and ...

Benchmarking second and third-generation sequencing platforms for microbial metagenomics

Open archive | Meslier, Victoria | CCSD

International audience. Shotgun metagenomic sequencing is a common approach for studying the taxonomic diversity and metabolic potential of complex microbial communities. Current methods primarily use secon...
