On the evaluation of retrofitting for supervised short-text classification

Open archive

Ghazi, Kaoutar | Tchechmedjiev, Andon | Harispe, Sébastien | Sutton-Charani, Nicolas | Gildas, Tagny

Published by CCSD

International audience. Current NLP systems rely heavily on embedding techniques that automatically encode relevant information about linguistic entities of interest (e.g., words, sentences) into latent spaces. These embeddings are the cornerstone of the best machine learning systems for a large variety of problems, such as text classification. Interestingly, state-of-the-art embeddings are commonly computed from large corpora alone and generally do not use the additional knowledge expressed in established knowledge resources (e.g., WordNet). In this paper, we empirically study whether retrofitting, a class of techniques that update word vectors to take into account the knowledge expressed in such resources, is beneficial for short-text classification. To this end, we compared the performance of several state-of-the-art classification techniques with and without retrofitting on a selection of benchmarks. Our results show that retrofitting is beneficial only for some classifier settings, and only for datasets whose domain is similar to that of the semantic lexicon used for retrofitting.
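As an illustration of what retrofitting does to word vectors, here is a minimal sketch of one well-known variant, the iterative update of Faruqui et al. (2015), in which each vector is repeatedly averaged with its lexicon neighbours while staying anchored to its original value. The abstract does not say which retrofitting variant or weights the paper uses, so the function name `retrofit`, the uniform weights (alpha = 1, beta = 1 / number of neighbours), and the toy vectors and lexicon are assumptions made for this example only.

```python
def retrofit(vectors, lexicon, iterations=10):
    """Pull each word vector toward its lexicon neighbours.

    vectors: dict mapping word -> list of floats (original embeddings).
    lexicon: dict mapping word -> list of related words (e.g. synonyms).
    Assumed weights: alpha = 1 for the original vector and
    beta = 1 / (number of neighbours) for each neighbour.
    """
    original = {w: list(v) for w, v in vectors.items()}
    updated = {w: list(v) for w, v in vectors.items()}
    for _ in range(iterations):
        for word, related in lexicon.items():
            if word not in updated:
                continue
            # Only neighbours that have an embedding can contribute.
            neighbours = [n for n in related if n in updated and n != word]
            if not neighbours:
                continue  # no usable neighbours: the vector stays unchanged
            k = len(neighbours)
            dim = len(original[word])
            # Sum of the neighbours' current (already updated) vectors.
            total = [sum(updated[n][d] for n in neighbours) for d in range(dim)]
            # With the assumed weights the update reduces to
            # (sum of neighbours + k * original) / (2 * k).
            updated[word] = [
                (total[d] + k * original[word][d]) / (2 * k) for d in range(dim)
            ]
    return updated


# Toy data (hypothetical): two synonyms and one word absent from the lexicon.
toy_vectors = {"happy": [1.0, 0.0], "glad": [0.0, 1.0], "car": [5.0, 5.0]}
toy_lexicon = {"happy": ["glad"], "glad": ["happy"]}
retrofitted = retrofit(toy_vectors, toy_lexicon)
```

In this toy setting the two synonyms move toward each other, while "car", which has no lexicon entry, keeps its original vector. The retrofitted vectors could then replace the original embeddings as input to a classifier, which is the kind of with/without comparison the abstract describes.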
