Submissions from 2024
Ethical Reasoning and Moral Value Alignment of LLMs Depend on the Language we Prompt them in, Utkarsh Agarwal, Kumar Tanmay, Aditi Khandelwal, and Monojit Choudhury
OSACT 2024 Task 2: Arabic Dialect to MSA Translation, Hanin Atwany, Nour Rabih, Ibrahim Mohammed, Abdul Waheed, and Bhiksha Raj
The CLEF-2024 CheckThat! Lab: Check-Worthiness, Subjectivity, Persuasion, Roles, Authorities, and Adversarial Robustness, Alberto Barrón-Cedeño, Firoj Alam, Tanmoy Chakraborty, Tamer Elsayed, Preslav Nakov, Piotr Przybyła, Julia Maria Struß, and Fatima Haouari
Preface, Bharathi Raja Chakravarthi, B. Bharathi, Miguel Ángel García Cumbreras, Salud María Jiménez Zafra, Malliga Subramanian, Kogilavani Shanmugavadivel, and Preslav Nakov
Unleashing the Power of Discourse-Enhanced Transformers for Propaganda Detection, Alexander Chernyavskiy, Dmitry Ilvovsky, and Preslav Nakov
Why do we not stand up to misinformation? Factors influencing the likelihood of challenging misinformation on social media and the role of demographics, Selin Gurgun, Deniz Cemiloglu, Emily Arden Close, Keith Phalp, Preslav Nakov, and Raian Ali
Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation?, Rishav Hada, Varun Gumma, Adrian de Wynter, Harshita Diddee, Mohamed Ahmed, Monojit Choudhury, Kalika Bali, and Sunayana Sitaram
Dynamic-SUPERB: Towards a Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark for Speech, Chien Yu Huang, Ke Han Lu, Shih Heng Wang, Chi Yuan Hsiao, Chun Yi Kuan, Haibin Wu, Siddhant Arora, and Kai Wei Chang
REFeREE: A REference-FREE Model-Based Metric for Text Simplification, Yichen Huang and Ekaterina Kochmar
Using natural language processing and patient journey clustering for temporal phenotyping of antimicrobial therapies for cat bite abscesses, Brian Hur, Karin M. Verspoor, Timothy Baldwin, Laura Y. Hardefeldt, Caitlin Pfeiffer, Caroline Mansfield, Riati Scarborough, and James R. Gilkerson
CAMERA3: An Evaluation Dataset for Controllable Ad Text Generation in Japanese, Go Inoue, Akihiko Kato, Masato Mita, Ukyo Honda, and Peinan Zhang
To Drop or Not to Drop? Predicting Argument Ellipsis Judgments: A Case Study in Japanese, Yukiko Ishizuki, Tatsuki Kuribayashi, Yuichiroh Matsubayashi, Ryohei Sasano, and Kentaro Inui
Saliency-Aware Interpolative Augmentation for Multimodal Financial Prediction, Samyak Jain, Parth Chhabra, Atula Neerkaje, Puneet Mathur, Ramit Sawhney, Shivam Agarwal, Preslav Nakov, and Sudheer Chava
Applications and Related Tasks, Tommi Jauhiainen, Marcos Zampieri, Timothy Baldwin, and Krister Lindén
Conclusion and Future Directions, Tommi Jauhiainen, Marcos Zampieri, Timothy Baldwin, and Krister Lindén
Evaluation and Measurement, Tommi Jauhiainen, Marcos Zampieri, Timothy Baldwin, and Krister Lindén
Features and Methods, Tommi Jauhiainen, Marcos Zampieri, Timothy Baldwin, and Krister Lindén
Introduction to Language Identification, Tommi Jauhiainen, Marcos Zampieri, Timothy Baldwin, and Krister Lindén
Large Scale, Multi-domain Language Identification, Tommi Jauhiainen, Marcos Zampieri, Timothy Baldwin, and Krister Lindén
Specific Challenges of Variation and Text Types, Tommi Jauhiainen, Marcos Zampieri, Timothy Baldwin, and Krister Lindén
Do Moral Judgment and Reasoning Capability of LLMs Change with Language? A Study using the Multilingual Defining Issues Test, Aditi Khandelwal, Utkarsh Agarwal, Kumar Tanmay, and Monojit Choudhury
Zero-shot Sentiment Analysis in Low-Resource Languages Using a Multilingual Sentiment Lexicon, Fajri Koto, Tilman Beck, Zeerak Talat, Iryna Gurevych, and Timothy Baldwin
A Paradigm Shift: The Future of Machine Translation Lies with Large Language Models, Chenyang Lyu, Zefeng Du, Jitao Xu, Yitao Duan, Minghao Wu, Teresa Lynn, Alham Fikri Aji, and Derek F. Wong
DocScript: Document-Level Script Event Prediction, Puneet Mathur, Rajiv Jain, Vlad Morariu, Aparna Garimella, Franck Dernoncourt, Jiuxiang Gu, Ramit Sawhney, and Preslav Nakov
Considering the IMPACT framework to understand the AI-well-being-complex from an interdisciplinary perspective, Christian Montag, Preslav Nakov, and Raian Ali
Challenging others when posting misinformation: a UK vs. Arab cross-cultural comparison on the perception of negative consequences and injunctive norms, Muaadh Noman, Selin Gurgun, Keith Phalp, Preslav Nakov, and Raian Ali
In-Contextual Gender Bias Suppression for Large Language Models, Daisuke Oba, Masahiro Kaneko, and Danushka Bollegala
Tricking LLMs into Disobedience: Formalizing, Analyzing, and Detecting Jailbreaks, Abhinav Rao, Sachin Vashistha, Atharva Naik, Somak Aditya, and Monojit Choudhury
FRAPPE: FRAming, Persuasion, and Propaganda Explorer, Ahmed Sajwani, Alaa El Setohy, Ali Mekky, Diana Turmakhan, Lara Hassan, Mohamed El Zeftawy, Omar El Herraoui, and Osama Mohammed Afzal
RISE: Robust Early-exiting Internal Classifiers for Suicide Risk Evaluation, Ritesh Soun, Atula Neerkaje, Ramit Sawhney, Nikolaos Aletras, and Preslav Nakov
Temporal dynamics of coordinated online behavior: Stability, archetypes, and influence, Serena Tardelli, Leonardo Nizzoli, Maurizio Tesconi, Mauro Conti, Preslav Nakov, Giovanni Da San Martino, and Stefano Cresci
OATS: Opinion Aspect Target Sentiment Quadruple Extraction Dataset for Aspect-Based Sentiment Analysis, Siva Uday Sampreeth Chebolu, Franck Dernoncourt, Nedim Lipka, and Thamar Solorio
Do-Not-Answer: Evaluating Safeguards in LLMs, Yuxia Wang, Haonan Li, Xudong Han, Preslav Nakov, and Timothy Baldwin
M4: Multi-Generator, Multi-Domain, and Multi-Lingual Black-Box Machine-Generated Text Detection, Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Chenxi Whitehouse, and Osama Mohammed Afzal
Rethinking STS and NLI in Large Language Models, Yuxia Wang, Minghan Wang, and Preslav Nakov
Submissions from 2023
Mirages. On Anthropomorphism in Dialogue Systems, Gavin Abercrombie, Amanda Cercas Curry, Tanvi Dinkar, Verena Rieser, and Zeerak Talat
Team TheSyllogist at SemEval-2023 Task 3: Language-Agnostic Framing Detection in Multi-Lingual Online News: A Zero-Shot Transfer Approach, Osama Mohammed Afzal and Preslav Nakov
THINK: Temporal Hypergraph Hyperbolic Network, Shivam Agarwal, Ramit Sawhney, Megh Thakkar, Preslav Nakov, Jiawei Han, and Tyler Derr
Handling Realistic Label Noise in BERT Text Classification, Maha Tufail Agro and Hanan Al Darmaki
Overview of the CLEF-2023 CheckThat! Lab Task 1 on Check-Worthiness of Multimodal and Multigenre Content, Firoj Alam, Alberto Barrón-Cedeño, Gullal S. Cheema, Gautam Kishore Shahi, Sherzod Hakimov, Maram Hasanain, Chengkai Li, Rubén Míguez, Hamdy Mubarak, Wajdi Zaghouani, and Preslav Nakov
Diacritic Recognition Performance in Arabic ASR, Hanan Aldarmaki and Ahmad Ghannam
Advancements in Arabic Grammatical Error Detection and Correction: An Empirical Investigation, Bashar Alhafni, Go Inoue, Christian Khairallah, and Nizar Habash
Text augmentation for semantic frame induction and parsing, Saba Anwar, Artem Shelmanov, Nikolay Arefyev, Alexander Panchenko, and Chris Biemann
The CLEF-2023 CheckThat! Lab: Checkworthiness, Subjectivity, Political Bias, Factuality, and Authority, Alberto Barrón-Cedeño, Firoj Alam, Tommaso Caselli, Giovanni Da San Martino, Tamer Elsayed, Andrea Galassi, Fatima Haouari, Federico Ruggeri, Julia Maria Struß, Rabindra Nath Nandi, Gullal S. Cheema, Dilshod Azizov, and Preslav Nakov
Overview of the CLEF-2023 CheckThat! Lab on Checkworthiness, Subjectivity, Political Bias, Factuality, and Authority of News Articles and Their Source, Alberto Barrón-Cedeño, Firoj Alam, Andrea Galassi, Giovanni Da San Martino, Preslav Nakov, Tamer Elsayed, Dilshod Azizov, Tommaso Caselli, Gullal S. Cheema, Fatima Haouari, Maram Hasanain, Mucahid Kutlu, Chengkai Li, Federico Ruggeri, Julia Maria Struß, and Wajdi Zaghouani
Grammatical Error Correction: A Survey of the State of the Art, Christopher Bryant, Zheng Yuan, Muhammad Reza Qorib, Hannan Cao, Hwee Tou Ng, and Ted Briscoe
NusaCrowd: Open Source Initiative for Indonesian NLP Resources, Samuel Cahyawijaya, Holy Lovenia, Alham Fikri Aji, Genta Indra Winata, Bryan Wilie, Fajri Koto, Rahmad Mahendra, and Christian Wibisono
Ten years after ImageNet: a 360° perspective on artificial intelligence, Sanjay Chawla, Preslav Nakov, Ahmed Ali, Wendy Hall, Issa Khalil, Xiaosong Ma, Husrev Taha Sencar, Ingmar Weber, Michael Wooldridge, and Ting Yu
Overview of GUA-SPA at IberLEF 2023: Guarani-Spanish Code Switching Analysis, Luis Chiruzzo, Marvin Agüero-Torales, Gustavo Giménez-Lugo, Aldo Alvarez, Yliana Rodríguez, Santiago Góngora, and Thamar Solorio
Overview of the CLEF-2023 CheckThat! Lab Task 3 on Political Bias of News Articles and News Media, Giovanni Da San Martino, Firoj Alam, Maram Hasanain, Rabindra Nath Nandi, Dilshod Azizov, Preslav Nakov, Tamer Elsayed, Tommaso Caselli, Gullal S. Cheema, Fatima Haouari, Mucahid Kutlu, Chengkai Li, Federico Ruggeri, Julia Maria Struß, and Wajdi Zaghouani
Self-supervised learning with diffusion-based multichannel speech enhancement for speaker verification under noisy conditions, Sandipana Dowerah, Ajinkya Kulkarni, Romain Serizel, and Denis Jouvet
How Useful Are Educational Questions Generated by Large Language Models?, Sabina Elkins, Ekaterina Kochmar, Iulian Serban, and Jackie C.K. Cheung
Enhancing Arabic Content Generation with Prompt Augmentation Using Integrated GPT and Text-to-Image Models, Wala Elsharif, James She, Preslav Nakov, and Simon Wong
A Federated Approach for Hate Speech Detection, Jay Gala, Deep Gandhi, Jash Mehta, and Zeerak Talat
Understanding political polarization using language models: A dataset and method, Samiran Gode, Supreeth Bare, Bhiksha Raj, and Hyungon Yoo
Fair Enough: Standardizing Evaluation and Model Selection for Fairness Research in NLP, Xudong Han, Timothy Baldwin, and Trevor Cohn
bgGLUE: A Bulgarian General Language Understanding Evaluation Benchmark, Momchil Hardalov, Pepa Atanasova, Todor Mihaylov, Galia Angelova, Kiril Simov, Petya Osenova, Ves Stoyanov, and Ivan Koychev
Enriched Pre-trained Transformers for Joint Slot Filling and Intent Detection, Momchil Hardalov, Ivan Koychev, and Preslav Nakov
ArAIEval Shared Task: Persuasion Techniques and Disinformation Detection in Arabic Text, Maram Hasanain, Firoj Alam, Hamdy Mubarak, Samir Abdaljalil, Wajdi Zaghouani, Preslav Nakov, Giovanni Da San Martino, and Abed Alhakim Freihat
QCRI at SemEval-2023 Task 3: News Genre, Framing and Persuasion Techniques Detection using Multilingual Models, Maram Hasanain, Ahmed Oumar El-Shangiti, Rabindra Nath Nandi, Preslav Nakov, and Firoj Alam
Faking Fake News for Real Fake News Detection: Propaganda-Loaded Training Data Generation, Kung Hsiang Huang, Kathleen McKeown, Preslav Nakov, Yejin Choi, and Heng Ji
Robustness Tests for Automatic Machine Translation Metrics with Adversarial Attacks, Yichen Huang and Timothy Baldwin
Low-Resource Clickbait Spoiling for Indonesian via Question Answering, Ni Putu Intan Maharani, Ayu Purwarianti, and Alham Fikri Aji
Multi-lingual and Multi-cultural Figurative Language Understanding, Anubha Kabra, Emmy Liu, Simran Khanuja, Alham Fikri Aji, Genta Indra Winata, Samuel Cahyawijaya, Anuoluwapo Aremu, and Perez Ogayo
TARJAMAT: Evaluation of Bard and ChatGPT on Machine Translation of Ten Arabic Varieties, Karima Kadaoui, Samar M. Magdy, Abdul Waheed, Md Tawkat Islam Khondaker, Ahmed Oumar El-Shangiti, El Moatez Billah Nagoudi, and Muhammad Abdul-Mageed
Thorny Roses: Investigating the Dual Use Dilemma in Natural Language Processing, Lucie Aimée Kaffee, Arnav Arora, Zeerak Talat, and Isabelle Augenstein
Reducing Sequence Length by Predicting Edit Spans with Large Language Models, Masahiro Kaneko and Naoaki Okazaki
GPTAraEval: A Comprehensive Evaluation of ChatGPT on Arabic NLP, Md Tawkat Islam Khondaker, Abdul Waheed, El Moatez Billah Nagoudi, and Muhammad Abdul-Mageed
Introduction, Ekaterina Kochmar, Jill Burstein, Andrea Horbach, Ronja Laarmann-Quante, Nitin Madnani, Anaïs Tack, Victoria Yaneva, and Zheng Yuan
Large Language Models Only Pass Primary School Exams in Indonesia: A Comprehensive Test on IndoMMLU, Fajri Koto, Nurul Aisyah, Haonan Li, and Timothy Baldwin
Yet Another Model for Arabic Dialect Identification, Ajinkya Kulkarni and Hanan Al Darmaki
Adapting the adapters for code-switching in multilingual ASR, Atharva Kulkarni, Ajinkya Kulkarni, Miguel Couceiro, and Hanan Al Darmaki
MarsEclipse at SemEval-2023 Task 3: Multi-Lingual and Multi-Label Framing Detection with Contrastive Learning, Qisheng Liao, Meiting Lai, and Preslav Nakov
Arabic Fine-Grained Entity Recognition, Haneen Abdallatif Liqreina, Mustafa Jarrar, Mohammed Khalilia, Ahmed Oumar El-Shangiti, and Muhammad Abdul-Mageed
NEREL: a Russian information extraction dataset with rich annotation for nested entities, relations, and wikidata entity links, Natalia Loukachevitch, Ekaterina Artemova, Tatiana Batura, Pavel Braslavski, Vladimir Ivanov, Suresh Manandhar, Alexander Pugachev, Igor Rozhkov, Artem Shelmanov, Elena Tutubalina, and Alexey Yandutov
SCITAB: A Challenging Benchmark for Compositional Reasoning and Claim Verification on Scientific Tables, Xinyuan Lu, Liangming Pan, Qian Liu, Preslav Nakov, and Min Yen Kan
On the Effectiveness of Images in Multi-modal Text Classification: An Annotation Study, Chunpeng Ma, Aili Shen, Hiyori Yoshikawa, Tomoya Iwakura, Daniel Beck, and Timothy Baldwin
BERTastic at SemEval-2023 Task 3: Fine-Tuning Pretrained Multilingual Transformers – Does Order Matter?, Tarek Mahmoud and Preslav Nakov
Multimodality Representation Learning: A Survey on Evolution, Pretraining and Its Applications, Muhammad Arslan Manzoor, Sarah Albarri, Ziting Xian, Zaiqiao Meng, Preslav Nakov, and Shangsong Liang
Violet: A Vision-Language Model for Arabic Image Captioning with Gemini Decoder, Abdelrahman Mohamed, Fakhraddin Alwajih, El Moatez Billah Nagoudi, Alcides Alcoba Inciarte, and Muhammad Abdul-Mageed
Crosslingual Generalization through Multitask Finetuning, Niklas Muennighoff, Thomas Wang, Lintang Sutawika, Adam Roberts, Stella Biderman, Teven Le Scao, M. Saiful Bari, and Sheng Shen
DOLPHIN: A Challenging and Diverse Benchmark for Arabic NLG, El Moatez Billah Nagoudi, Abdel Rahim Elmadany, Ahmed Oumar El-Shangiti, and Muhammad Abdul-Mageed
Overview of the CLEF-2023 CheckThat! Lab Task 4 on Factuality of Reporting of News Media, Preslav Nakov, Firoj Alam, Giovanni Da San Martino, Maram Hasanain, Dilshod Azizov, Rabindra Nath Nandi, and Panayot Panayotov
On “Scientific Debt” in NLP: A Case for More Rigour in Language Model Pre-Training Research, Made Nindyatama Nityasya, Haryo Akbarianto Wibowo, Alham Fikri Aji, Genta Indra Winata, Radityo Eko Prasojo, Phil Blunsom, and Adhiguna Kuncoro
Second Language Acquisition of Neural Language Models, Miyu Oba, Tatsuki Kuribayashi, Hiroki Ouchi, and Taro Watanabe
Gpachov at CheckThat! 2023: A Diverse Multi-Approach Ensemble for Subjectivity Detection in News Articles, Georgi Pachov, Dimitar Dimitrov, Ivan Koychev, and Preslav Nakov
QACHECK: A Demonstration System for Question-Guided Multi-Hop Fact-Checking, Liangming Pan, Xinyuan Lu, Min Yen Kan, and Preslav Nakov
Fact-Checking Complex Claims with Program-Guided Reasoning, Liangming Pan, Xiaobao Wu, Xinyuan Lu, Anh Tuan Luu, William Yang Wang, Min Yen Kan, and Preslav Nakov
On the Risk of Misinformation Pollution with Large Language Models, Yikang Pan, Liangming Pan, Wenhu Chen, Preslav Nakov, Min Yen Kan, and William Yang Wang
SemEval-2023 Task 3: Detecting the Category, the Framing, and the Persuasion Techniques in Online News in a Multi-lingual Setup, Jakub Piskorski, Nicolas Stefanovitch, Giovanni Da San Martino, and Preslav Nakov
Multilingual Multifaceted Understanding of Online News in Terms of Genre, Framing and Persuasion Techniques, Jakub Piskorski, Nicolas Stefanovitch, Nikolaos Nikolaidis, Giovanni Da San Martino, and Preslav Nakov
The Hidden Dance of Phonemes and Visage: Unveiling the Enigmatic Link between Phonemes and Facial Features, Liao Qu, Xianwei Zou, Xiang Li, Yandong Wen, Rita Singh, and Bhiksha Raj
QASiNa: Religious Domain Question Answering Using Sirah Nabawiyah, Muhammad Razif Rizqullah, Ayu Purwarianti, and Alham Fikri Aji
Disease progression modelling of Alzheimer's disease using probabilistic principal components analysis, Martin Saint-Jalmes, Victor Fedyashov, Daniel Beck, Timothy Baldwin, Noel G. Faux, Pierrick Bourgeat, Jurgen Fripp, Colin L. Masters, and Benjamin Goudey
On the effect of dropping layers of pre-trained transformer models, Hassan Sajjad, Fahim Dalvi, Nadir Durrani, and Preslav Nakov
Can You Answer This? - Exploring Zero-Shot QA Generalization Capabilities in Large Language Models, Saptarshi Sengupta, Shreya Ghosh, Preslav Nakov, and Prasenjit Mitra
What Do You MEME? Generating Explanations for Visual Semantic Role Labelling in Memes, Shivam Sharma, Siddhant Agarwal, Tharun Suresh, Preslav Nakov, Md Shad Akhtar, and Tanmoy Chakraborty
Enhancing Video-based Learning Using Knowledge Tracing: Personalizing Students’ Learning Experience with ORBITS, Shady Shehata, David Santandreu, Philip Purnell, and Mark Thompson
GlobalBench: A Benchmark for Global Progress in Natural Language Processing, Yueqi Song, Catherine Cui, Simran Khanuja, Pengfei Liu, Fahim Faisal, Alissa Ostapenko, Genta Indra Winata, and Alham Fikri Aji
DetectLLM: Leveraging Log-Rank Information for Zero-Shot Detection of Machine-Generated Text, Jinyan Su, Terry Yue Zhuo, Di Wang, and Preslav Nakov