M4: Multi-Generator, Multi-Domain, and Multi-Lingual Black-Box Machine-Generated Text Detection

Document Type

Conference Proceeding

Publication Title

ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings

Abstract

Pre-trained models such as BERT have achieved remarkable results on text matching tasks. However, existing models still struggle to capture subtle local differences when modeling complex semantic matching relationships. In this work, we find that integrating local syntax awareness with global semantics is crucial for text matching, and we propose the Local and Global Syntax Graph Calibration (LG-SGC) module, which exploits both local syntactic and global semantic information for the matching task. Specifically, we first introduce an auxiliary task inside BERT to capture subtle local grammatical differences. We then retain the original attention operation to capture global matching features. Finally, we design an information fusion layer that effectively combines local and global information to deepen the model's understanding of the matching task. We conduct extensive experiments on 10 benchmarks, where LG-SGC significantly outperforms previous models.
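The abstract describes the architecture only at a high level: a local, syntax-aware branch added inside BERT, the original global attention kept as-is, and a fusion layer that combines the two views. The sketch below is a hypothetical illustration of that local/global split under those assumptions; the class name `LocalGlobalFusion`, the syntax-mask construction, and the gated fusion are my own placeholders, not the authors' implementation.

```python
# Hypothetical sketch (not the authors' code): a local attention branch
# restricted by a syntax-derived mask, the standard global attention kept
# from BERT, and a learned gate fusing the two representations.
import torch
import torch.nn as nn


class LocalGlobalFusion(nn.Module):
    def __init__(self, hidden_size: int = 768, num_heads: int = 12):
        super().__init__()
        # Global branch: ordinary full self-attention, as in a BERT layer.
        self.global_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        # Local branch: attention restricted to syntax-graph neighbours
        # (e.g. tokens linked in a dependency parse) via an attention mask.
        self.local_attn = nn.MultiheadAttention(hidden_size, num_heads, batch_first=True)
        # Fusion layer: a gate that blends the local and global views per token.
        self.gate = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, hidden: torch.Tensor, syntax_mask: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, seq_len, hidden_size) token representations.
        # syntax_mask: (batch * num_heads, seq_len, seq_len) boolean mask,
        # True where attention is blocked (token pairs not adjacent in the syntax graph).
        global_out, _ = self.global_attn(hidden, hidden, hidden)
        local_out, _ = self.local_attn(hidden, hidden, hidden, attn_mask=syntax_mask)
        gate = torch.sigmoid(self.gate(torch.cat([local_out, global_out], dim=-1)))
        return gate * local_out + (1.0 - gate) * global_out
```

In this reading, the syntax mask carries the "local subtle grammatical differences", the unmasked branch preserves the global matching signal, and the gate plays the role of the information fusion layer mentioned in the abstract.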

First Page

11571

Last Page

11575

DOI

10.1109/ICASSP48485.2024.10446461

Publication Date

1-1-2024

Keywords

attention calibration, natural language processing, semantic matching, syntax graph
