Omnicrobe, an open-access database of microbial habitats, phenotypes and uses extracted from text

Open archive

Dérozier, Sandra | Bossy, Robert | Deléger, Louise | Ba, Mouhamadou | Chaix, Estelle | Loux, Valentin | Falentin, Hélène | Nédellec, Claire

Published by CCSD

National audience. The drastic increase in descriptions of microbes, their habitats, phenotypes and uses in databases, reports and papers presents a twofold challenge for accessing this information. The integration of heterogeneous data requires a standardized representation, and the normalization of textual descriptions requires semantic analysis. Recent information extraction technologies from the text mining domain offer a powerful way to detect and structure textual information along ontology-based representations. The Omnicrobe application (https://omnicrobe.migale.inrae.fr) uses an information extraction workflow to populate its database. The workflow is designed to (1) extract microorganism taxa, their habitats, their phenotypes and their uses and (2) categorize the extracted information with taxa from the NCBI (National Center for Biotechnology Information) taxonomy and concepts from the OntoBiotope ontology. The Omnicrobe database contains around 1 million descriptions of microbe properties that are created by analyzing and combining six information sources, namely biological resource catalogues (e.g. INRAE CIRM, DSMZ through BacDive), a sequence database (GenBank) and scientific literature (PubMed abstracts). Omnicrobe offers powerful ways to express simple and complex ontology-based queries to support studies in various domains of microbiology. Omnicrobe also exposes an API (Application Programming Interface) that allows users to automatically integrate microbe biodiversity knowledge into external information systems. The use of Omnicrobe to quickly target useful strains in a food innovation application illustrates how it can provide easy-to-use support in resolving scientific questions related to the habitats, phenotypes and uses of microbes.
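As a rough illustration of what an ontology-based query means in this context, the following minimal Python sketch expands a queried habitat concept to all of its more specific concepts before matching extracted relations. Everything in it is an assumption made for illustration: the hierarchy, the concept labels, the taxon names and the function names are hypothetical toy data, not actual OntoBiotope concepts, Omnicrobe records or the Omnicrobe API.

```python
from collections import defaultdict

# Toy "is-a" hierarchy (child -> parent). Labels are illustrative only,
# not actual OntoBiotope concepts or identifiers.
PARENT = {
    "cheese": "dairy product",
    "yogurt": "dairy product",
    "dairy product": "food",
    "soil": "environment",
}

# Toy extracted relations: (taxon, habitat concept). Hypothetical data.
RELATIONS = [
    ("taxon A", "cheese"),
    ("taxon B", "yogurt"),
    ("taxon C", "soil"),
    ("taxon D", "food"),
]


def descendants(concept, parent=PARENT):
    """Return the concept plus every concept below it in the hierarchy."""
    children = defaultdict(list)
    for child, par in parent.items():
        children[par].append(child)
    found, stack = set(), [concept]
    while stack:
        node = stack.pop()
        if node not in found:
            found.add(node)
            stack.extend(children[node])
    return found


def taxa_living_in(concept, relations=RELATIONS):
    """Taxa whose habitat is the concept or any more specific concept."""
    scope = descendants(concept)
    return sorted(taxon for taxon, habitat in relations if habitat in scope)
```

In this sketch, a query on the general concept "food" also retrieves taxa annotated with "cheese" or "yogurt", which is the kind of behaviour that categorizing descriptions with ontology concepts makes possible.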

Suggestions

By the same author

Omnicrobe, an open-access database of microbial habitats and phenotypes using a comprehensive text mining and data fusion approach

Open archive | Dérozier, Sandra | CCSD

International audience. The dramatic increase in the number of microbe descriptions in databases, reports, and papers presents a two-fold challenge for accessing the information: integration of heterogeneous data in...

Omnicrobe, an open-access database of microbial habitats and phenotypes using a comprehensive text mining and data fusion approach

Open archive | Dérozier, Sandra | CCSD

The authors thank the Migale platform for providing the resources to run Omnicrobe services (MIGALE, INRAE, 2020. Migale bioinformatics Facility, doi:10.15454/1.5572390655343293E12). The current affiliation of Estelle Chaix is the...

Omnicrobe : une base de données d’habitats et de phénotypes microbiens

Open archive | Falentin, Hélène | CCSD

The CBL (Club des Bactéries Lactiques) is a scientific event that brings together researchers, academic staff and industrial R&D professionals to discuss the scientific and technical advances made in the field of ba...
