Unveiling Vulnerabilities: Robustness Analysis of Black-Box Machine-Generated Text Detectors
Date of Award
4-30-2024
Document Type
Thesis
Degree Name
Master of Science in Natural Language Processing
Department
Natural Language Processing
First Advisor
Timothy Baldwin
Second Advisor
Hanan Aldarmaki
Abstract
In an era characterized by rapid technological advancement, artificial intelligence has become smoothly integrated into people’s daily lives. Among its significant achievements are large language models (LLMs), which have demonstrated remarkable abilities in text generation across diverse domains and have become increasingly accessible through simple APIs. However, alongside this rapid progress, concerns have been raised regarding the safety and ethical use of generated texts, particularly in fields such as research, education, and news dissemination. Moreover, the spread of generated text across the web poses challenges for its use in further research, as distinguishing between authentic and synthetic content becomes very difficult. In this study, we conduct a comprehensive analysis of the robustness of black-box detectors under various scenarios and of their capability for generalization. We show the challenges these detectors encounter in adapting to new domains, generators, languages, or instructions. Furthermore, we demonstrate that training detectors on data generated by instruction-tuned generators leads to lower robustness than training them on generations from untuned generators. Through empirical analysis, we reveal the vulnerabilities inherent in current detection mechanisms and emphasize the critical need for further advancements in this domain. We anticipate that the insights from this study will inform efforts aimed at enhancing the efficacy of existing detectors, thereby contributing to a safer and more reliable environment.
Recommended Citation
J. Mansurov, "Unveiling Vulnerabilities: Robustness Analysis of Black-Box Machine-Generated Text Detectors," Apr. 2024.
Comments
Thesis submitted to the Deanship of Graduate and Postdoctoral Studies in partial fulfilment of the requirements for the M.Sc. degree in Natural Language Processing. Advisors: Timothy Baldwin, Hanan Aldarmaki. Embargo period: 2 years.