Document Type

Article

Publication Title

arXiv

Abstract

Negation is a common linguistic feature that is crucial in many language understanding tasks, yet it remains a hard problem due to diversity in its expression in different types of text. Recent work has shown that state-of-the-art NLP models underperform on samples containing negation in various tasks, and that negation detection models do not transfer well across domains. We propose a new negation-focused pre-training strategy, involving targeted data augmentation and negation masking, to better incorporate negation information into language models. Extensive experiments on common benchmarks show that our proposed approach improves negation detection performance and generalizability over the strong baseline NegBERT (Khandelwal and Sawant, 2020). © 2022, CC BY.
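The abstract mentions negation masking as part of the pre-training strategy. The sketch below is only an illustrative assumption of what negation-cue masking for masked language modelling could look like; the cue list, masking probabilities, and the mask_tokens function are hypothetical and not the authors' released implementation.

# Hypothetical sketch: mask negation cues more aggressively than other tokens
# during MLM-style pre-training, so the model must infer negation from context.
# Cue list, probabilities, and names are illustrative assumptions.
import random

NEGATION_CUES = {"not", "no", "never", "without", "n't", "neither", "nor"}

def mask_tokens(tokens, cue_mask_prob=0.8, base_mask_prob=0.15, mask_token="[MASK]"):
    masked, labels = [], []
    for tok in tokens:
        p = cue_mask_prob if tok.lower() in NEGATION_CUES else base_mask_prob
        if random.random() < p:
            masked.append(mask_token)
            labels.append(tok)      # original token becomes the MLM target
        else:
            masked.append(tok)
            labels.append(None)     # position ignored in the MLM loss
    return masked, labels

if __name__ == "__main__":
    sentence = "The scan did not show any evidence of infection".split()
    print(mask_tokens(sentence))

In practice such a masking scheme would replace the uniform masking step of a standard masked-language-model data collator; the probabilities here are placeholders rather than values reported in the paper.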

DOI

10.48550/arXiv.2205.04012

Publication Date

4-8-2022

Keywords

Computational linguistics, Data augmentation, Detection models, Detection performance, Hard problems, Language model, Language understanding, Linguistic features, Pre-training, State of the art, Training strategy, Benchmarking, Computation and Language (cs.CL)

Comments

Preprint: arXiv

Archived with thanks to arXiv

Preprint License: CC BY 4.0

Uploaded 01 July 2022
