Measuring Fairness of Text Classifiers via Prediction Sensitivity
Satyapriya Krishna, Rahul Gupta, Apurv Verma, Jwala Dhamala, Yada Pruksachatkun, and Kai-Wei Chang, in ACL, 2022.
Abstract
With the rapid growth in language processing applications, fairness has emerged as an important consideration in data-driven solutions. Although various fairness definitions have been explored in the recent literature, there is a lack of consensus on which metrics most accurately reflect the fairness of a system. In this work, we propose a new formulation, ACCUMULATED PREDICTION SENSITIVITY, which measures fairness in machine learning models based on the model’s prediction sensitivity to perturbations in input features. The metric attempts to quantify the extent to which a single prediction depends on a protected attribute, where the protected attribute encodes the membership status of an individual in a protected group. We show that the metric can be theoretically linked with a specific notion of group fairness (statistical parity) and with individual fairness. It also correlates well with humans’ perception of fairness. We conduct experiments on two text classification datasets, JIGSAW TOXICITY and BIAS IN BIOS, and evaluate the correlations between metrics and manual annotations on whether the model produced a fair outcome. We observe that the proposed fairness metric based on prediction sensitivity is statistically significantly more correlated with human annotation than the existing counterfactual fairness metric.
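To make the idea of prediction sensitivity concrete, the following is a minimal sketch and not the paper's exact formulation. It assumes a toy logistic-regression classifier over a small numeric feature vector, estimates how strongly the predicted probability reacts to a small perturbation of each input feature, and accumulates those sensitivities with a weight vector that emphasizes the feature encoding the protected attribute. The function names, the weight vector `protected_weights`, and the example models are all hypothetical and chosen only for illustration; a real text classifier would operate on token embeddings rather than three hand-picked features.

```python
import math


def predict(weights, bias, x):
    """Toy logistic-regression classifier: returns P(y = 1 | x)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def prediction_sensitivity(weights, bias, x, eps=1e-5):
    """Central finite-difference estimate of |d predict / d x_i| per feature."""
    sensitivities = []
    for i in range(len(x)):
        x_up = list(x)
        x_down = list(x)
        x_up[i] += eps
        x_down[i] -= eps
        delta = predict(weights, bias, x_up) - predict(weights, bias, x_down)
        sensitivities.append(abs(delta) / (2 * eps))
    return sensitivities


def accumulated_sensitivity(sensitivities, feature_weights):
    """Weighted sum of per-feature sensitivities (weights are an assumption)."""
    return sum(v * s for v, s in zip(feature_weights, sensitivities))


# Hypothetical input: the last feature is an indicator for a protected attribute.
x = [1.0, 0.0, 1.0]

# Hypothetical weighting that focuses only on the protected-attribute feature.
protected_weights = [0.0, 0.0, 1.0]

# Model A relies on the protected feature; model B ignores it entirely.
model_a = ([0.8, -0.5, 2.0], 0.0)
model_b = ([0.8, -0.5, 0.0], 0.0)

score_a = accumulated_sensitivity(
    prediction_sensitivity(model_a[0], model_a[1], x), protected_weights
)
score_b = accumulated_sensitivity(
    prediction_sensitivity(model_b[0], model_b[1], x), protected_weights
)

print("model A:", score_a)
print("model B:", score_b)
```

Under these assumptions, the model that uses the protected feature receives a clearly positive score, while the model that ignores it scores zero, which is the qualitative behavior the abstract describes: a higher score indicates that a single prediction depends more heavily on the protected attribute.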
Bib Entry
@inproceedings{krishna2022measuring,
title = {Measuring Fairness of Text Classifiers via Prediction Sensitivity},
author = {Krishna, Satyapriya and Gupta, Rahul and Verma, Apurv and Dhamala, Jwala and Pruksachatkun, Yada and Chang, Kai-Wei},
booktitle = {ACL},
year = {2022}
}
Related Publications
- Does Robustness Improve Fairness? Approaching Fairness with Word Substitution Robustness Methods for Text Classification, ACL Findings, 2021
- LOGAN: Local Group Bias Detection by Clustering, EMNLP (short), 2020
- Towards Understanding Gender Bias in Relation Extraction, ACL, 2020
- Mitigating Gender Bias Amplification in Distribution by Posterior Regularization, ACL (short), 2020
- Mitigating Gender Bias in Natural Language Processing: Literature Review, ACL, 2019
- Gender Bias in Coreference Resolution: Evaluation and Debiasing Methods, NAACL (short), 2018
- Men Also Like Shopping: Reducing Gender Bias Amplification using Corpus-level Constraints, EMNLP, 2017