Document Type
Conference Proceeding
Publication Title
17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop
Abstract
The naïve approach to fine-tuning pretrained deep learning models on downstream tasks involves feeding them mini-batches of randomly sampled data. In this paper, we propose a more elaborate method for fine-tuning Pretrained Multilingual Transformers (PMTs) on multilingual data. Inspired by the success of curriculum learning approaches, we investigate the significance of fine-tuning PMTs on multilingual data in a sequential fashion, language by language. Unlike the curriculum learning paradigm, where the model is presented with increasingly complex examples, we do not adopt a notion of “easy” and “hard” samples. Instead, our experiments draw insight from psychological findings on how the human brain processes new information and on the persistence of newly learned concepts. We perform our experiments on a challenging news-framing dataset that contains texts in six languages. Our proposed method outperforms the naïve approach, achieving an improvement of 2.57% in terms of F1 score. Even when we supplement the naïve approach with recency fine-tuning, we still achieve an improvement of 1.34%, with a 3.63% convergence speed-up. Moreover, we are the first to observe an interesting pattern in which deep learning models exhibit a human-like primacy-recency effect.
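For concreteness, the contrast between the naïve approach (random mini-batches pooled across languages) and the sequential, language-by-language schedule described in the abstract might look like the following PyTorch sketch. The helper names, batching details, and hyperparameters are illustrative assumptions and are not taken from the paper.

    # Sketch: naive pooled fine-tuning vs. sequential language-by-language fine-tuning.
    # Assumes `model(**batch)` returns an object with a `.loss` attribute
    # (e.g., a HuggingFace-style sequence classifier) and that each dataset
    # yields dict-of-tensor batches. All names here are hypothetical.
    import torch
    from torch.utils.data import DataLoader, ConcatDataset

    def fine_tune(model, dataset, optimizer, epochs=1, batch_size=16):
        loader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
        model.train()
        for _ in range(epochs):
            for batch in loader:
                optimizer.zero_grad()
                loss = model(**batch).loss
                loss.backward()
                optimizer.step()

    # Naive baseline: one pool of mini-batches sampled at random across all languages.
    def fine_tune_naive(model, datasets_by_lang, optimizer, epochs=1):
        pooled = ConcatDataset(list(datasets_by_lang.values()))
        fine_tune(model, pooled, optimizer, epochs=epochs)

    # Sequential variant: present the data one language at a time, in a fixed order,
    # so the order of languages (not sample difficulty) defines the schedule.
    def fine_tune_sequential(model, datasets_by_lang, optimizer, language_order, epochs=1):
        for lang in language_order:
            fine_tune(model, datasets_by_lang[lang], optimizer, epochs=epochs)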
First Page
58
Last Page
63
DOI
10.18653/v1/2023.semeval-1.7
Publication Date
7-2023
Keywords
Deep learning, Learning systems, Curricula
Recommended Citation
T. Mahmoud and P. Nakov, "BERTastic at SemEval-2023 Task 3: Fine-Tuning Pretrained Multilingual Transformers – Does Order Matter?," 17th International Workshop on Semantic Evaluation, SemEval 2023 - Proceedings of the Workshop, pp. 58-63, Jul. 2023.
The definitive version is available at https://doi.org/10.18653/v1/2023.semeval-1.7
Comments
Archived thanks to ACL Anthology
License: CC by 4.0
Uploaded: April 03, 2024