Microseek: A Protein-Based Metagenomic Pipeline for Virus Diagnostic and Discovery

Open archive

Pérot, Philippe | Bigot, Thomas | Temmam, Sarah | Regnault, Béatrice | Eloit, Marc

Published by CCSD; MDPI

International audience. We present Microseek, a pipeline for virus identification and discovery based on RVDB-prot, a comprehensive, curated and regularly updated database of viral proteins. Microseek analyzes raw metagenomic Next Generation Sequencing (mNGS) data by performing quality-control steps and de novo assembly, and by scoring the Lowest Common Ancestor (LCA) from translated reads and contigs. Microseek runs on a local computer. The output of the pipeline is displayed through a user-friendly, dynamic graphical interface. Using two representative mNGS datasets derived from human tissue and plasma specimens, we illustrate how Microseek works and report its performance. In silico spikes of known viral sequences were used, as well as spikes of fake Neo-pneumovirus viral sequences generated at variable evolutionary distances from known members of the Pneumoviridae family. Results were compared to Chan Zuckerberg ID (CZ ID), a reference cloud-based mNGS pipeline. We show that Microseek reliably identifies known viral sequences and performs well in detecting distant pseudoviral sequences, especially in complex samples such as human plasma, while minimizing non-relevant hits.
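
The Lowest Common Ancestor idea mentioned in the abstract can be illustrated with a minimal sketch. This is not Microseek's implementation: the toy taxonomy, the `PARENT` table, and the function names below are all hypothetical, and the abstract's scoring step is not reproduced. The sketch only shows how several taxa hit by one translated read could be collapsed to the deepest rank they share.

```python
from typing import Dict, List, Optional

# Hypothetical toy taxonomy (child -> parent); the root has no parent.
PARENT: Dict[str, Optional[str]] = {
    "root": None,
    "Viruses": "root",
    "Pneumoviridae": "Viruses",
    "Orthopneumovirus": "Pneumoviridae",
    "Metapneumovirus": "Pneumoviridae",
    "Paramyxoviridae": "Viruses",
}


def lineage(taxon: str, parent: Dict[str, Optional[str]]) -> List[str]:
    """Return the path from the root down to `taxon`."""
    path: List[str] = []
    node: Optional[str] = taxon
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


def lowest_common_ancestor(
    taxa: List[str], parent: Dict[str, Optional[str]]
) -> Optional[str]:
    """Return the deepest taxon shared by the lineages of all `taxa`."""
    if not taxa:
        return None
    lineages = [lineage(t, parent) for t in taxa]
    lca: Optional[str] = None
    # Walk the lineages level by level until they diverge.
    for level in zip(*lineages):
        if all(name == level[0] for name in level):
            lca = level[0]
        else:
            break
    return lca


if __name__ == "__main__":
    # A read whose hits span two genera of the same family.
    print(lowest_common_ancestor(["Orthopneumovirus", "Metapneumovirus"], PARENT))
```

In this toy setting, a read matching proteins from two genera of one family is assigned at the family level, which is the general reason an LCA approach can still place a distant sequence in a known family when no close relative is in the database.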
