Towards Equal Opportunity Fairness through Adversarial Learning
Document Type
Article
Publication Title
arXiv
Abstract
Adversarial training is a common approach for bias mitigation in natural language processing. Although most work on debiasing is motivated by equal opportunity, this criterion is not explicitly captured in standard adversarial training. In this paper, we propose an augmented discriminator for adversarial training, which takes the target class as input to create richer features and to model equal opportunity more explicitly. Experimental results on two datasets show that our method substantially improves over standard adversarial debiasing methods in terms of the performance-fairness trade-off.
Copyright © 2022, The Authors. All rights reserved.
DOI
10.48550/arXiv.2203.06317
Publication Date
3-11-2022
Keywords
Machine learning, Natural language processing systems, Adversarial learning, De-biasing, Equal opportunity, Performance, Rich features, Target class, Trade off, Economic and social effects, Artificial Intelligence (cs.AI), Computation and Language (cs.CL)
Recommended Citation
X. Han, T. Baldwin, and T. Cohn, "Towards Equal Opportunity Fairness through Adversarial Learning," 2022, arXiv:2203.06317.
Comments
IR Deposit conditions: not described
Preprint available on arXiv