Unveiling Vulnerabilities: Robustness Analysis of Black-Box Machine-Generated Text Detectors

Date of Award

4-30-2024

Document Type

Thesis

Degree Name

Master of Science in Natural Language Processing

Department

Natural Language Processing

First Advisor

Timothy Baldwin

Second Advisor

Hanan Aldarmaki

Abstract

In an era characterized by rapid technological advancement, artificial intelligence has become smoothly integrated into people’s daily lives. Among its significant achievements are large language models (LLMs), which have demonstrated remarkable abilities in text generation across diverse domains and have become increasingly accessible through simple APIs. However, alongside this rapid progress, concerns have been raised regarding the safety and ethical use of generated text, particularly in fields such as research, education, and news dissemination. Moreover, the spread of generated text across the web poses challenges for its use in further research, as distinguishing between authentic and synthetic content becomes very difficult. In this study, we conduct a comprehensive analysis of the robustness of black-box detectors under various scenarios and of their capability for generalization. We show the challenges these detectors encounter in adapting to new domains, generators, languages, or instructions. Furthermore, we demonstrate that detectors trained on data generated by instruction-tuned generators are less robust than those trained on generations from untuned generators. Through empirical analysis, we reveal the vulnerabilities inherent in current detection mechanisms and emphasize the critical need for further advancements in this domain. We anticipate that the insights from this study will inform efforts aimed at enhancing the efficacy of existing detectors, thereby contributing to a safer and more reliable environment.

Comments

Thesis submitted to the Deanship of Graduate and Postdoctoral Studies in partial fulfilment of the requirements for the M.Sc. degree in Natural Language Processing. Advisors: Timothy Baldwin, Hanan Aldarmaki. Subject to a 2-year embargo period.

This document is currently not available here.
