Document Type

Conference Proceeding

Publication Title

17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop

Abstract

The naïve approach to fine-tuning pretrained deep learning models on downstream tasks feeds them mini-batches of randomly sampled data. In this paper, we propose a more elaborate method for fine-tuning Pretrained Multilingual Transformers (PMTs) on multilingual data. Inspired by the success of curriculum learning approaches, we investigate the significance of fine-tuning PMTs on multilingual data sequentially, language by language. Unlike the curriculum learning paradigm, in which the model is presented with increasingly complex examples, we do not adopt a notion of “easy” and “hard” samples. Instead, our experiments draw on psychological findings about how the human brain processes new information and how long newly learned concepts persist. We perform our experiments on a challenging news-framing dataset that contains texts in six languages. Our proposed method outperforms the naïve approach, improving the F1 score by 2.57%. Even when we supplement the naïve approach with recency fine-tuning, we still achieve an improvement of 1.34% with a 3.63% convergence speed-up. Moreover, we are the first to observe an interesting pattern in which deep learning models exhibit a human-like primacy-recency effect.
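
The contrast between the two data-ordering schemes described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the placeholder language codes (`lang_a`, `lang_b`, `lang_c`), the toy samples, and the chosen language order are all assumptions made for illustration, and the abstract does not state which order the paper uses. Only the batch ordering is shown; the model and training loop are omitted.

```python
import random
from collections import defaultdict


def naive_batches(samples, batch_size, seed=0):
    """Naive scheme: shuffle all languages together, then cut mini-batches."""
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    return [shuffled[i:i + batch_size]
            for i in range(0, len(shuffled), batch_size)]


def sequential_language_batches(samples, batch_size, language_order, seed=0):
    """Sequential scheme: emit every batch of one language before the next.

    `language_order` is a hypothetical parameter; the ordering used in the
    paper is not given in the abstract.
    """
    rng = random.Random(seed)
    by_language = defaultdict(list)
    for text, language in samples:
        by_language[language].append((text, language))

    batches = []
    for language in language_order:
        group = by_language[language]
        rng.shuffle(group)  # shuffle only within a single language
        batches.extend(group[i:i + batch_size]
                       for i in range(0, len(group), batch_size))
    return batches


# Toy data with placeholder language codes (assumed, not from the paper).
languages = ["lang_a", "lang_b", "lang_c"]
samples = [(f"doc-{language}-{i}", language)
           for language in languages for i in range(4)]

mixed = naive_batches(samples, batch_size=2)
ordered = sequential_language_batches(samples, batch_size=2,
                                      language_order=languages)

# Language of each batch in the sequential scheme, in presentation order.
ordered_languages = [batch[0][1] for batch in ordered]
```

In the sequential scheme every batch contains a single language and the languages appear in contiguous blocks, whereas the naive scheme may mix languages within a batch.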

First Page

58

Last Page

63

DOI

10.18653/v1/2023.semeval-1.7

Publication Date

7-2023

Keywords

Deep learning, Learning systems, Curricula

Comments

Archived thanks to ACL Anthology

License: CC BY 4.0

Uploaded: April 03, 2024
