FairGauge: A Modularized Evaluation of Bias in Masked Language Models
Document Type
Conference Proceeding
Publication Title
Proceedings of the 2023 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2023
Abstract
Prejudice is a preconceived depiction of an entity within a person's mind. It tends to devalue people as a consequence of their perceived membership in a social group. The origin of prejudice can be traced back to the categorization process people use to form a plausible perception of their surroundings. The process of constructing these perceptions generally results in prejudices, which allow inequalities to develop across a variety of social groups. In all their forms, biases can be relayed in language by generalizing a negative adjective onto a social group as a function of prejudgment. Using this reduced linguistic formulation, we set out to (1) create a benchmark of 23,736 prejudiced sentences that encompass a plethora of bias types, including racism, sexism, classism, ethnic discrimination, and religious discrimination; (2) propose a prejudice score that incorporates both the masked prediction probability and the top-k index (rank) of the matched word; (3) conduct a case study, using our benchmark, to evaluate bias in three pre-trained language models: BERT, DistilBERT, and Context-Debias DistilBERT.
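The record does not give the paper's exact formula for the prejudice score; the sketch below shows one plausible way to combine the two ingredients the abstract names, the masked-token prediction probability and its top-k rank, using a hypothetical log-rank discount. The function name, the discount, and the cutoff behavior are all assumptions for illustration, not the authors' method.

```python
import math

def prejudice_score(prob: float, rank: int, k: int = 10) -> float:
    """Hypothetical prejudice score (not the paper's formula).

    Weights the masked-prediction probability of the matched word by a
    rank-based discount, so a prejudiced completion the model ranks
    highly (low rank index) contributes more than one ranked lower.
    """
    if rank >= k:
        # Matched word falls outside the top-k predictions: treat as
        # no evidence of bias for this sentence.
        return 0.0
    # Log-rank discount over the 1-based rank, DCG-style.
    return prob / math.log2(rank + 2)

# A top-ranked prejudiced completion with probability 0.4:
high = prejudice_score(0.4, rank=0)   # 0.4 / log2(2) = 0.4
# Same probability, but ranked 8th (index 7):
low = prejudice_score(0.4, rank=7)    # 0.4 / log2(9) ≈ 0.126
```

Under this sketch, higher scores indicate that the masked language model both assigns high probability to the prejudiced word and ranks it near the top of its predictions.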
First Page
131
Last Page
135
DOI
10.1145/3625007.3627592
Publication Date
11-6-2023
Keywords
benchmark, bias evaluation, language models
Recommended Citation
J. Doughman et al., "FairGauge: A Modularized Evaluation of Bias in Masked Language Models," Proceedings of the 2023 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2023, pp. 131–135, Nov. 2023.
The definitive version is available at https://doi.org/10.1145/3625007.3627592