Examining Gender Bias in Languages with Grammatical Gender

Pei Zhou, Weijia Shi, Jieyu Zhao, Kuan-Hao Huang, Muhao Chen, Ryan Cotterell, and Kai-Wei Chang, in EMNLP, 2019.

Abstract

Recent studies have shown that word embeddings exhibit gender bias inherited from the training corpora. However, most studies to date have focused on quantifying and mitigating such bias only in English. These analyses cannot be directly extended to languages that exhibit morphological agreement on gender, such as Spanish and French. In this paper, we propose new metrics for evaluating gender bias in word embeddings of these languages and further demonstrate evidence of gender bias in bilingual embeddings that align these languages with English. Finally, we extend an existing approach to mitigate gender bias in word embeddings under both monolingual and bilingual settings. Experiments on a modified Word Embedding Association Test, word similarity, word translation, and word pair translation tasks show that the proposed approaches effectively reduce the gender bias while preserving the utility of the embeddings.
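
For orientation, the Word Embedding Association Test named in the abstract can be sketched in its standard form (Caliskan et al., 2017). This is only the unmodified baseline, not the modified test or the new metrics proposed in the paper. The function names and the 2-D toy vectors are illustrative assumptions, and the use of the sample standard deviation is one common convention rather than something stated on this page.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, attrs_a, attrs_b):
    """s(w, A, B): mean similarity of w to set A minus mean similarity to set B."""
    sim_a = [cosine(w, a) for a in attrs_a]
    sim_b = [cosine(w, b) for b in attrs_b]
    return mean(sim_a) - mean(sim_b)


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Standard WEAT effect size: difference of mean associations of the two
    target sets, normalized by the sample standard deviation over X and Y."""
    s_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Hypothetical toy vectors (assumption): axis 0 stands in for "male-associated",
# axis 1 for "female-associated". Real use would load trained embeddings.
attrs_a = [(1.0, 0.0)]                     # attribute set A
attrs_b = [(0.0, 1.0)]                     # attribute set B
targets_x = [(0.9, 0.1), (0.8, 0.3)]       # target set X, leaning toward A
targets_y = [(0.1, 0.9), (0.3, 0.8)]       # target set Y, leaning toward B

effect = weat_effect_size(targets_x, targets_y, attrs_a, attrs_b)
effect_swapped = weat_effect_size(targets_y, targets_x, attrs_a, attrs_b)
```

A positive effect size here means the X targets sit closer to attribute set A than the Y targets do; swapping the two target sets flips the sign. In languages with grammatical gender, a score like this would mix grammatical and semantic gender signals, which is why the abstract says English-centric analyses cannot be extended directly.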

Our EMNLP paper "Examining Gender Bias in Languages with Grammatical Gender" is on https://t.co/qVLvrfXv8O (w/@WeijiaShi2 @jieyuzhao11 @kuanhao_ @muhao_chen @ryandcotterell @kaiwei_chang). We separate semantic and grammatical gender info and found asymmetry between genders. pic.twitter.com/qTSN2RpgCE
— Pei Zhou (@peizNLP) September 6, 2019

Bib Entry

@inproceedings{zhou2019examining,
  author = {Zhou, Pei and Shi, Weijia and Zhao, Jieyu and Huang, Kuan-Hao and Chen, Muhao and Cotterell, Ryan and Chang, Kai-Wei},
  title = {Examining Gender Bias in Languages with Grammatical Gender},
  booktitle = {EMNLP},
  year = {2019}
}

Related Publications

Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal, ACL Findings, 2022
Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Technologies, EMNLP, 2021
Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer, ACL, 2020
Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations, ICCV, 2019
Gender Bias in Contextualized Word Embeddings, NAACL (short), 2019
Learning Gender-Neutral Word Embeddings, EMNLP (short), 2018
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings, NeurIPS, 2016