Literal Occurrences of Multiword Expressions: Rare Birds That Cause a Stir

Open archive

Savary, Agata | Cordeiro, Silvio | Lichte, Timm | Ramisch, Carlos | Iñurrieta, Uxoa | Giouli, Voula

Published by CCSD; Univerzita Karlova v Praze

International audience. Multiword expressions (MWEs) can have both idiomatic and literal occurrences. For instance, "pulling strings" can be understood either as making use of one's influence or literally. Distinguishing these two cases has been addressed in linguistic and psycholinguistic studies, and is also considered one of the major challenges in MWE processing. We suggest that literal occurrences should be considered in both semantic and syntactic terms, which motivates their study in a treebank. We propose heuristics to automatically pre-identify candidate sentences that might contain literal occurrences of verbal MWEs (VMWEs), and we apply them to existing treebanks in five typologically different languages: Basque, German, Greek, Polish and Portuguese. We also perform a linguistic study of the literal occurrences extracted by the different heuristics. The results suggest that literal occurrences constitute a rare phenomenon. We also identify some properties that may distinguish them from their idiomatic counterparts. This article is a largely extended version of Savary and Cordeiro (2018).

Suggestions

By the same author

PARSEME corpus release 1.3

Open archive | Savary, Agata | CCSD

International audience

Advances in Multiword Expression Identification for the Italian language: The PARSEME shared task edition 1.1

Open archive | Monti, Johanna | CCSD

International audience. This contribution describes the results of the second edition of the shared task on automatic identification of verbal multiword expressions, organized as part of the LAW-MWE-CxG 2018 worksho...

UniDive: A COST Action on Universality, Diversity and Idiosyncrasy in Language Technology

Open archive | Savary, Agata | CCSD

International audience. This paper presents the objectives, organization and activities of the UniDive COST Action, a scientific network dedicated to universality, diversity and idiosyncrasy in language technology. ...
