Natural Language Processing Faculty Publications

Multi-lingual and Multi-cultural Figurative Language Understanding

Anubha Kabra, Carnegie Mellon University
Emmy Liu, Carnegie Mellon University
Simran Khanuja, Carnegie Mellon University
Alham Fikri Aji, Mohamed bin Zayed University of Artificial Intelligence
Genta Indra Winata, Bloomberg
Samuel Cahyawijaya, Hong Kong University of Science and Technology
Anuoluwapo Aremu, Masakhane
Perez Ogayo, Carnegie Mellon University

Document Type

Conference Proceeding

Publication Title

Proceedings of the Annual Meeting of the Association for Computational Linguistics

Abstract

Figurative language permeates human communication, but at the same time is relatively understudied in NLP. Datasets have been created in English to accelerate progress towards measuring and improving figurative language processing in language models (LMs). However, the use of figurative language is an expression of our cultural and societal experiences, making it difficult for these phrases to be universally applicable. In this work, we create a figurative language inference dataset, MABL, for seven diverse languages associated with a variety of cultures: Hindi, Indonesian, Javanese, Kannada, Sundanese, Swahili and Yoruba. Our dataset reveals that each language relies on cultural and regional concepts for figurative expressions, with the highest overlap between languages originating from the same region. We assess multilingual LMs' abilities to interpret figurative language in zero-shot and few-shot settings. All languages exhibit a significant deficiency compared to English, with variations in performance reflecting the availability of pre-training and fine-tuning data, emphasizing the need for LMs to be exposed to a broader range of linguistic and cultural variation during training.

First Page

8269

Last Page

8284

Publication Date

1-1-2023

Recommended Citation

A. Kabra et al., "Multi-lingual and Multi-cultural Figurative Language Understanding," Proceedings of the Annual Meeting of the Association for Computational Linguistics, pp. 8269-8284, Jan 2023.

