Mitigating Bias for Question Answering Models by Tracking Bias Influence
Mingyu Derek Ma, Jiun-Yu Kao, Arpit Gupta, Yu-Hsiang Lin, Wenbo Zhao, Tagyoung Chung, Wei Wang, Kai-Wei Chang, and Nanyun Peng, in NAACL, 2024.
Bib Entry
@inproceedings{ma2024mitigating,
  title = {Mitigating Bias for Question Answering Models by Tracking Bias Influence},
  author = {Ma, Mingyu Derek and Kao, Jiun-Yu and Gupta, Arpit and Lin, Yu-Hsiang and Zhao, Wenbo and Chung, Tagyoung and Wang, Wei and Chang, Kai-Wei and Peng, Nanyun},
  booktitle = {NAACL},
  year = {2024}
}
Related Publications
- A Meta-Evaluation of Measuring LLM Misgendering, COLM, 2025
- White Men Lead, Black Women Help? Benchmarking Language Agency Social Biases in LLMs, ACL, 2025
- Controllable Generation via Locally Constrained Resampling, ICLR, 2025
- On Localizing and Deleting Toxic Memories in Large Language Models, NAACL-Findings, 2025
- Attribute Controlled Fine-tuning for Large Language Models: A Case Study on Detoxification, EMNLP-Findings, 2024
- Are you talking to ['xem'] or ['x', 'em']? On Tokenization and Addressing Misgendering in LLMs with Pronoun Tokenization Parity, NAACL-Findings, 2024
- Are Personalized Stochastic Parrots More Dangerous? Evaluating Persona Biases in Dialogue Systems, EMNLP-Findings, 2023
- Kelly is a Warm Person, Joseph is a Role Model: Gender Biases in LLM-Generated Reference Letters, EMNLP-Findings, 2023
- The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks, ACL (short), 2023
- Factoring the Matrix of Domination: A Critical Review and Reimagination of Intersectionality in AI Fairness, AIES, 2023
- How well can Text-to-Image Generative Models understand Ethical Natural Language Interventions?, EMNLP (short), 2022
- On the Intrinsic and Extrinsic Fairness Evaluation Metrics for Contextualized Language Representations, ACL (short), 2022
- Societal Biases in Language Generation: Progress and Challenges, ACL, 2021
- "Nice Try, Kiddo": Investigating Ad Hominems in Dialogue Responses, NAACL, 2021
- BOLD: Dataset and metrics for measuring biases in open-ended language generation, FAccT, 2021
- Towards Controllable Biases in Language Generation, EMNLP-Findings, 2020
- The Woman Worked as a Babysitter: On Biases in Language Generation, EMNLP (short), 2019