Word embedding for French natural language in healthcare: a comparative study (Preprint)

Open archive

Dynomant, Emeric | Lelong, Romain | Dahamna, Badisse | Massonnaud, Clément | Kerdelhué, Gaétan | Grosjean, Julien | Canu, Stephane | Darmoni, Stéfan

Published by CCSD

International audience. Word embedding technologies, a set of language modeling and feature learning techniques in natural language processing (NLP), are now used in a wide range of applications. However, no formal evaluation or comparison has been made of how well each of the three best-known unsupervised implementations (Word2Vec, GloVe, and FastText) preserves the semantic similarities between words when all are trained on the same dataset.

Suggestions

By the same author

Multi-lingual Search Engine to Access PubMed Monolingual Subsets: A Feasibility Study.

Open archive | Darmoni, Stéfan | CCSD

International audience. PubMed contains many articles in languages other than English but it is difficult to find them using the English version of the Medical Subject Headings (MeSH) Thesaurus. The aim of this work...

A Search Engine to Access PubMed Monolingual Subsets: Proof of Concept and Evaluation in French

Open archive | Griffon, Nicolas | CCSD

International audience. Background: PubMed contains numerous articles in languages other than English. However, existing solutions to access these articles in the language in which they were written remain unconvinc...

When Context Matters for Credible Measurement of Drug-Drug Interactions Based on Real-World Data

Open archive | Lelong, Romain | CCSD

International audience. The frequency of potential drug-drug interactions (DDI) in published studies on real-world data varies considerably due to the methodological framework. Contextualization of DDI has a proven ...
