Natural Language Processing Faculty Publications

OffensEval 2023: Offensive language identification in the age of Large Language Models

Marcos Zampieri, George Mason University
Sara Rosenthal, IBM Research
Preslav Nakov, Mohamed Bin Zayed University of Artificial Intelligence
Alphaeus Dmonte, George Mason University
Tharindu Ranasinghe, Aston University

Document Type

Article

Publication Title

Infection Control and Hospital Epidemiology

Abstract

The OffensEval shared tasks, organized as part of SemEval-2019 and SemEval-2020, were very popular, attracting over 1300 participating teams. The two editions of the shared task helped advance the state of the art in offensive language identification by providing the community with benchmark datasets in Arabic, Danish, English, Greek, and Turkish. The datasets were annotated using the OLID hierarchical taxonomy, which has since become the de facto standard in general offensive language identification research and has been widely used beyond OffensEval. We present a survey of OffensEval and related competitions, and we discuss the main lessons learned. We further evaluate the performance of Large Language Models (LLMs), which have recently revolutionized the field of Natural Language Processing. We use zero-shot prompting with six popular LLMs and zero-shot learning with two task-specific fine-tuned BERT models, and we compare the results against those of the top-performing teams at the OffensEval competitions. Our results show that while some LLMs, such as Flan-T5, achieve competitive performance, LLMs in general lag behind the best OffensEval systems.

First Page

1737

Last Page

1747

DOI

10.1017/ice.2023.69

Publication Date

11-28-2023

Keywords

Machine learning, Text classification

Comments

IR Deposit conditions:

OA version (pathway b) Accepted version

6 months embargo

License: CC BY-NC-ND

Must state accepted for publication

Should link to publisher version or journal website

Recommended Citation

M. Zampieri et al., "OffensEval 2023: Offensive language identification in the age of Large Language Models," Infection Control and Hospital Epidemiology, vol. 44, no. 11, pp. 1737-1747, Nov 2023.

The definitive version is available at https://doi.org/10.1017/ice.2023.69

Additional Links

DOI link: https://doi.org/10.1017/S1351324923000517

Included in

Artificial Intelligence and Robotics Commons
