UCLA-NLP (Chang's and PLUS lab) @ EMNLP 2021

At UCLA-NLP, our mission is to develop reliable, fair, accountable, robust natural language understanding and generation technology to benefit everyone.

Please see our recent papers at

Below, we highlight our research papers at EMNLP 2021 on the following topics:

Language Generation

Fairness and Robustness

Tutorial on Robustness and Adversarial Examples in NLP, Kai-Wei Chang, He He, Robin Jia, Sameer Singh

Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution, Zongyi Li, Jianhan Xu, Jiehang Zeng, Linyang Li, Xiaoqing Zheng, Qi Zhang, Kai-Wei Chang, and Cho-Jui Hsieh, in EMNLP, 2021. Details
Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Technologies, Sunipa Dev, Masoud Monajatipoor, Anaelia Ovalle, Arjun Subramonian, Jeff Phillips, and Kai-Wei Chang, in EMNLP, 2021. Details
On the Transferability of Adversarial Attacks against Neural Text Classifier, Liping Yuan, Xiaoqing Zheng, Yi Zhou, Cho-Jui Hsieh, and Kai-Wei Chang, in EMNLP, 2021. Details

Multimodal, Multilingual, and Culture Diversity

Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training, Kuan-Hao Huang, Wasi Ahmad, Nanyun Peng, and Kai-Wei Chang, in EMNLP, 2021. Details
Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning, Da Yin, Liunian Harold Li, Ziniu Hu, Nanyun Peng, and Kai-Wei Chang, in EMNLP, 2021. Details
Retrieval Augmented Code Generation and Summarization, Md Rizwan Parvez, Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, and Kai-Wei Chang, in EMNLP-Finding, 2021. Details

Information Extraction & Question Answering

Relation-Guided Pre-Training for Open-Domain Question Answering, Ziniu Hu, Yizhou Sun, and Kai-Wei Chang, in EMNLP-Finding, 2021. Details

Language Generation

Fairness and Robustness

Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution
Zongyi Li, Jianhan Xu, Jiehang Zeng, Linyang Li, Xiaoqing Zheng, Qi Zhang, Kai-Wei Chang, and Cho-Jui Hsieh, in EMNLP, 2021.

QA Sessions: VIRTUAL POSTER SESSION I: MACHINE LEARNING FOR NLP

Recent studies have shown that deep neural networks are vulnerable to intentionally crafted adversarial examples, and various methods have been proposed to defend against adversarial word-substitution attacks for neural NLP models. However, there is a lack of systematic study comparing different defense approaches under the same attacking setting. In this paper, we seek to fill this gap through a comprehensive study of the behavior of neural text classifiers trained by various defense methods under representative adversarial attacks. In addition, we propose an effective method to further improve the robustness of neural text classifiers against such attacks and achieve the highest accuracy on both clean and adversarial examples on the AGNEWS and IMDB datasets by a significant margin.
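To make the attack setting concrete, here is a minimal sketch of a greedy word-substitution attack against a toy classifier. It is an illustration only and not the attack or defense benchmarked in the paper: the synonym table, the word weights, and the names `toy_score`, `toy_predict`, and `greedy_substitution_attack` are all hypothetical, and the sketch assumes the original prediction is the positive class.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical synonym table and word weights, invented for this illustration.
SYNONYMS: Dict[str, List[str]] = {"great": ["fine", "decent"], "love": ["like", "enjoy"]}
WEIGHTS: Dict[str, float] = {"great": 2.0, "love": 2.0, "enjoy": 1.0,
                             "fine": 0.5, "decent": 0.5, "like": 0.5}


def toy_score(tokens: List[str]) -> float:
    """Positive-class score of a toy bag-of-words model (stand-in for a neural classifier)."""
    return sum(WEIGHTS.get(tok, 0.0) for tok in tokens)


def toy_predict(tokens: List[str]) -> int:
    """Label 1 (positive) when the score reaches the threshold, else 0."""
    return 1 if toy_score(tokens) >= 2.0 else 0


def greedy_substitution_attack(
    tokens: List[str],
    score: Callable[[List[str]], float],
    predict: Callable[[List[str]], int],
    synonyms: Dict[str, List[str]],
    max_swaps: int = 3,
) -> Optional[List[str]]:
    """Try to flip a positive prediction by swapping one word per step.

    Each step applies the single synonym swap that lowers the positive-class
    score the most. Returns the perturbed tokens once the label flips, or
    None if the swap budget runs out or no candidate lowers the score.
    """
    original_label = predict(tokens)
    current = list(tokens)
    for _ in range(max_swaps):
        best_trial, best_score = None, score(current)
        for i, tok in enumerate(current):
            for cand in synonyms.get(tok, []):
                trial = current[:i] + [cand] + current[i + 1:]
                trial_score = score(trial)
                if trial_score < best_score:
                    best_trial, best_score = trial, trial_score
        if best_trial is None:
            return None
        current = best_trial
        if predict(current) != original_label:
            return current
    return None


adversarial = greedy_substitution_attack(
    "i love this great movie".split(), toy_score, toy_predict, SYNONYMS
)
```

On this toy input the attack needs two swaps before the predicted label changes. A defense in this setting would aim to keep the prediction stable under such meaning-preserving substitutions.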
```
@inproceedings{li2021searching,
  title = {Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution},
  author = {Li, Zongyi and Xu, Jianhan and Zeng, Jiehang and Li, Linyang and Zheng, Xiaoqing and Zhang, Qi and Chang, Kai-Wei and Hsieh, Cho-Jui},
  presentation_id = {https://underline.io/events/192/posters/8225/poster/38025-searching-for-an-effective-defender-benchmarking-defense-against-adversarial-word-substitution},
  booktitle = {EMNLP},
  year = {2021}
}
```
Related Publications
1. VideoCon: Robust video-language alignment via contrast captions, CVPR, 2024
2. CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning, ICCV, 2023
3. Red Teaming Language Model Detectors with Language Models, TACL, 2023
4. ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation, EMNLP, 2022
5. Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers, EMNLP-Finding (short), 2022
6. Unsupervised Syntactically Controlled Paraphrase Generation with Abstract Meaning Representations, EMNLP-Finding (short), 2022
7. Improving the Adversarial Robustness of NLP Models by Information Bottleneck, ACL-Finding, 2022
8. On the Transferability of Adversarial Attacks against Neural Text Classifier, EMNLP, 2021
9. Defense against Synonym Substitution-based Adversarial Attacks via Dirichlet Neighborhood Ensemble, ACL, 2021
10. Double Perturbation: On the Robustness of Robustness and Counterfactual Bias Evaluation, NAACL, 2021
11. Provable, Scalable and Automatic Perturbation Analysis on General Computational Graphs, NeurIPS, 2020
12. On the Robustness of Language Encoders against Grammatical Errors, ACL, 2020
13. Robustness Verification for Transformers, ICLR, 2020
14. Learning to Discriminate Perturbations for Blocking Adversarial Attacks in Text Classification, EMNLP, 2019
15. Retrofitting Contextualized Word Embeddings with Paraphrases, EMNLP (short), 2019
16. Generating Natural Language Adversarial Examples, EMNLP (short), 2018

Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Technologies

Sunipa Dev, Masoud Monajatipoor, Anaelia Ovalle, Arjun Subramonian, Jeff Phillips, and Kai-Wei Chang, in EMNLP, 2021.

QA Sessions: 4D: ETHICS AND NLP

Gender is widely discussed in the context of language tasks and when examining the stereotypes propagated by language models. However, current discussions primarily treat gender as binary, which can perpetuate harms such as the cyclical erasure of non-binary gender identities. These harms are driven by model and dataset biases, which are consequences of the non-recognition and lack of understanding of non-binary genders in society. In this paper, we explain the complexity of gender and language around it, and survey non-binary persons to understand harms associated with the treatment of gender as binary in English language technologies. We also detail how current language representations (e.g., GloVe, BERT) capture and perpetuate these harms and related challenges that need to be acknowledged and addressed for representations to equitably encode gender information.

@inproceedings{dev2021harms,
  title = {Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Technologies},
  author = {Dev, Sunipa and Monajatipoor, Masoud and Ovalle, Anaelia and Subramonian, Arjun and Phillips, Jeff and Chang, Kai-Wei},
  presentation_id = {https://underline.io/events/192/sessions/7788/lecture/37320-harms-of-gender-exclusivity-and-challenges-in-non-binary-representation-in-language-technologies},
  blog_url = {https://uclanlp.medium.com/harms-of-gender-exclusivity-and-challenges-in-non-binary-representation-in-language-technologies-5f89891b5aee},
  booktitle = {EMNLP},
  year = {2021}
}

🌈 Harms of Gender Exclusivity and Challenges in Non-Binary Representation in Language Technologies 🏳️‍⚧️ #EMNLP2021 paper w/ @sunipa17 @MMonajatipoor @ovalle_elia @probablyjeff @kaiwei_chang @uclanlp

👉 paper: https://t.co/n4TyuhzSkX
👉 blog post: https://t.co/yX41TbS9no
👇 🧵
— Arjun Subramonian (th🦦y/th🐨m, அவங்க/இவங்க) (@arjunsubgraph) September 20, 2021

Related Publications

Mitigating Gender Bias in Distilled Language Models via Counterfactual Role Reversal, ACL Finding, 2022
Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer, ACL, 2020
Examining Gender Bias in Languages with Grammatical Gender, EMNLP, 2019
Balanced Datasets Are Not Enough: Estimating and Mitigating Gender Bias in Deep Image Representations, ICCV, 2019
Gender Bias in Contextualized Word Embeddings, NAACL (short), 2019
Learning Gender-Neutral Word Embeddings, EMNLP (short), 2018
Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings, NeurIPS, 2016

On the Transferability of Adversarial Attacks against Neural Text Classifier
Liping Yuan, Xiaoqing Zheng, Yi Zhou, Cho-Jui Hsieh, and Kai-Wei Chang, in EMNLP, 2021.

QA Sessions: VIRTUAL POSTER SESSION I: INTERPRETABILITY AND ANALYSIS OF MODELS FOR NLP

Deep neural networks are vulnerable to adversarial attacks, where a small perturbation to an input alters the model prediction. In many cases, malicious inputs intentionally crafted for one model can fool another model. In this paper, we present the first study to systematically investigate the transferability of adversarial examples for text classification models and explore how various factors, including network architecture, tokenization scheme, word embedding, and model capacity, affect the transferability of adversarial examples. Based on these studies, we propose a genetic algorithm to find an ensemble of models that can be used to induce adversarial examples to fool almost all existing models. Such adversarial examples reflect the defects of the learning process and the data bias in the training set. Finally, we derive word replacement rules that can be used for model diagnostics from these adversarial examples.
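As a rough illustration of what "transferability" measures, the sketch below computes a transfer rate: among adversarial examples that fool a source model, the fraction that also fool a target model. This is one common way to define the quantity and is an assumption here, since the abstract does not spell out the exact metric. The two keyword-based models and the function name `transfer_rate` are hypothetical.

```python
from typing import Callable, List, Tuple


def transfer_rate(
    examples: List[Tuple[str, int]],
    source_predict: Callable[[str], int],
    target_predict: Callable[[str], int],
) -> float:
    """Fraction of examples fooling the source model that also fool the target.

    `examples` holds (adversarial_text, true_label) pairs. Returns 0.0 when
    no example fools the source model.
    """
    fooled_source = [(x, y) for x, y in examples if source_predict(x) != y]
    if not fooled_source:
        return 0.0
    fooled_both = sum(1 for x, y in fooled_source if target_predict(x) != y)
    return fooled_both / len(fooled_source)


# Hypothetical keyword classifiers standing in for two trained models.
def source_model(text: str) -> int:
    return 1 if "good" in text else 0


def target_model(text: str) -> int:
    return 1 if ("good" in text or "nice" in text) else 0


examples = [("a nice film", 1), ("a fine film", 1), ("a good film", 1)]
rate = transfer_rate(examples, source_model, target_model)
```

Here two of the three examples fool the source model, and one of those two also fools the target, so the rate is 0.5.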
```
@inproceedings{yuan2021on,
  title = {On the Transferability of Adversarial Attacks against Neural Text Classifier},
  author = {Yuan, Liping and Zheng, Xiaoqing and Zhou, Yi and Hsieh, Cho-Jui and Chang, Kai-Wei},
  presentation_id = {https://underline.io/events/192/posters/8223/poster/38067-on-the-transferability-of-adversarial-attacks-against-neural-text-classifier},
  booktitle = {EMNLP},
  year = {2021}
}
```
Related Publications
1. VideoCon: Robust video-language alignment via contrast captions, CVPR, 2024
2. CleanCLIP: Mitigating Data Poisoning Attacks in Multimodal Contrastive Learning, ICCV, 2023
3. Red Teaming Language Model Detectors with Language Models, TACL, 2023
4. ADDMU: Detection of Far-Boundary Adversarial Examples with Data and Model Uncertainty Estimation, EMNLP, 2022
5. Investigating Ensemble Methods for Model Robustness Improvement of Text Classifiers, EMNLP-Finding (short), 2022
6. Unsupervised Syntactically Controlled Paraphrase Generation with Abstract Meaning Representations, EMNLP-Finding (short), 2022
7. Improving the Adversarial Robustness of NLP Models by Information Bottleneck, ACL-Finding, 2022
8. Searching for an Effective Defender: Benchmarking Defense against Adversarial Word Substitution, EMNLP, 2021
9. Defense against Synonym Substitution-based Adversarial Attacks via Dirichlet Neighborhood Ensemble, ACL, 2021
10. Double Perturbation: On the Robustness of Robustness and Counterfactual Bias Evaluation, NAACL, 2021
11. Provable, Scalable and Automatic Perturbation Analysis on General Computational Graphs, NeurIPS, 2020
12. On the Robustness of Language Encoders against Grammatical Errors, ACL, 2020
13. Robustness Verification for Transformers, ICLR, 2020
14. Learning to Discriminate Perturbations for Blocking Adversarial Attacks in Text Classification, EMNLP, 2019
15. Retrofitting Contextualized Word Embeddings with Paraphrases, EMNLP (short), 2019
16. Generating Natural Language Adversarial Examples, EMNLP (short), 2018

Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training

Kuan-Hao Huang, Wasi Ahmad, Nanyun Peng, and Kai-Wei Chang, in EMNLP, 2021.

QA Sessions: 3G - IN PERSON POSTER SESSION

Pre-trained multilingual language encoders, such as multilingual BERT and XLM-R, show great potential for zero-shot cross-lingual transfer. However, these multilingual encoders do not precisely align words and phrases across languages. In particular, learning alignments in the multilingual embedding space usually requires sentence-level or word-level parallel corpora, which are expensive to obtain for low-resource languages. An alternative is to make the multilingual encoders more robust; when fine-tuning the encoder on a downstream task, we train the encoder to tolerate noise in the contextual embedding spaces such that even if the representations of different languages are not aligned well, the model can still achieve good performance on zero-shot cross-lingual transfer. In this work, we propose a learning strategy for training robust models by drawing connections between adversarial examples and the failure cases of zero-shot cross-lingual transfer. We adopt two widely used robust training methods, adversarial training and randomized smoothing, to train the desired robust model. The experimental results demonstrate that robust training improves zero-shot cross-lingual transfer on text classification tasks. The improvement is more significant in the generalized cross-lingual transfer setting, where the pair of input sentences belong to two different languages.
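The idea of tolerating noise in the embedding space can be sketched with randomized smoothing in its generic prediction-time form: perturb an embedding with Gaussian noise many times and take a majority vote over the classifier's outputs. This is a minimal sketch under stated assumptions, not the training procedure used in the paper. The function `smoothed_predict`, the toy classifier, and the parameter values are hypothetical.

```python
import random
from collections import Counter
from typing import Callable, List


def smoothed_predict(
    embedding: List[float],
    classify: Callable[[List[float]], int],
    sigma: float = 0.1,
    num_samples: int = 100,
    seed: int = 0,
) -> int:
    """Majority vote of `classify` over Gaussian-perturbed copies of `embedding`."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    votes: Counter = Counter()
    for _ in range(num_samples):
        noisy = [value + rng.gauss(0.0, sigma) for value in embedding]
        votes[classify(noisy)] += 1
    return votes.most_common(1)[0][0]


def toy_classifier(embedding: List[float]) -> int:
    """Hypothetical linear classifier: label 1 when the first coordinate is positive."""
    return 1 if embedding[0] > 0 else 0


label_pos = smoothed_predict([1.0, 0.0], toy_classifier)
label_neg = smoothed_predict([-1.0, 0.0], toy_classifier)
```

A point far from the decision boundary relative to `sigma` keeps its label under the vote, which is the kind of stability that helps when representations of different languages are only roughly aligned.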

@inproceedings{huang2021improving,
  title = {Improving Zero-Shot Cross-Lingual Transfer Learning via Robust Training},
  author = {Huang, Kuan-Hao and Ahmad, Wasi and Peng, Nanyun and Chang, Kai-Wei},
  presentation_id = {https://underline.io/events/192/posters/7783/poster/40656-improving-zero-shot-cross-lingual-transfer-learning-via-robust-training},
  booktitle = {EMNLP},
  year = {2021}
}

Related Publications

LiveCLKTBench: Towards Reliable Evaluation of Cross-Lingual Knowledge Transfer in Multilingual LLMs, ACL, 2026
Contextual Label Projection for Cross-Lingual Structured Prediction, NAACL, 2024
Multilingual Generative Language Models for Zero-Shot Cross-Lingual Event Argument Extraction, ACL, 2022
Evaluating the Values of Sources in Transfer Learning, NAACL, 2021
Syntax-augmented Multilingual BERT for Cross-lingual Transfer, ACL, 2021
GATE: Graph Attention Transformer Encoder for Cross-lingual Relation and Event Extraction, AAAI, 2021
Cross-Lingual Dependency Parsing by POS-Guided Word Reordering, EMNLP-Finding, 2020
Cross-lingual Dependency Parsing with Unlabeled Auxiliary Languages, CoNLL, 2019
Target Language-Aware Constrained Inference for Cross-lingual Dependency Parsing, EMNLP, 2019
On Difficulties of Cross-Lingual Transfer with Order Differences: A Case Study on Dependency Parsing, NAACL, 2019

Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning

Da Yin, Liunian Harold Li, Ziniu Hu, Nanyun Peng, and Kai-Wei Chang, in EMNLP, 2021.

QA Sessions: 4F: SPEECH, VISION, ROBOTICS, MULTIMODAL GROUNDING 1

Commonsense is defined as the knowledge that is shared by everyone. However, certain types of commonsense knowledge are correlated with culture and geographic locations and they are only shared locally. For example, the scenarios of wedding ceremonies vary across regions due to different customs influenced by historical and religious factors. Such regional characteristics, however, are generally omitted in prior work. In this paper, we construct a Geo-Diverse Visual Commonsense Reasoning dataset (GD-VCR) to test vision-and-language models’ ability to understand cultural and geo-location-specific commonsense. In particular, we study two state-of-the-art vision-and-language models, VisualBERT and ViLBERT, trained on VCR, a standard multimodal commonsense benchmark with images primarily from Western regions. We then evaluate how well the trained models can generalize to answering the questions in GD-VCR. We find that the performance of both models for non-Western regions, including East Asia, South Asia, and Africa, is significantly lower than that for Western regions. We analyze the reasons behind the performance disparity and find that the performance gap is larger on QA pairs that: 1) are concerned with culture-related scenarios, e.g., weddings, religious activities, and festivals; 2) require high-level geo-diverse commonsense reasoning rather than low-order perception and recognition.

@inproceedings{yin2021broaden,
  title = {Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning},
  author = {Yin, Da and Li, Liunian Harold and Hu, Ziniu and Peng, Nanyun and Chang, Kai-Wei},
  booktitle = {EMNLP},
  presentation_id = {https://underline.io/events/192/sessions/7790/lecture/37514-broaden-the-vision-geo-diverse-visual-commonsense-reasoning},
  year = {2021}
}

(1/n) Commonsense is more diverse than we think! Excited to share that our paper “Broaden the Vision: Geo-Diverse Visual Commonsense Reasoning” got accepted at #EMNLP2021.

Paper: https://t.co/Is9JWB7IdX
Data and Code: https://t.co/nOj9bVdIcK
— Da Yin (@Wade_Yin9712) September 23, 2021

Related Publications

Unsupervised Vision-and-Language Pre-training Without Parallel Images and Captions, NAACL, 2021
What Does BERT with Vision Look At?, ACL (short), 2020
VisualBERT: A Simple and Performant Baseline for Vision and Language, Arxiv, 2019

Retrieval Augmented Code Generation and Summarization

Md Rizwan Parvez, Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, and Kai-Wei Chang, in EMNLP-Finding, 2021.

QA Sessions: FINDINGS PAPERS - GENERATION

Software developers write a lot of source code and documentation during software development. Intrinsically, developers often recall parts of source code or code summaries that they had written in the past while implementing software or documenting it. To mimic developers’ code or summary generation behavior, we propose a retrieval augmented framework, REDCODER, that retrieves relevant code or summaries from a retrieval database and provides them as a supplement to code generation or summarization models. REDCODER is unique in two respects. First, it extends the state-of-the-art dense retrieval technique to search for relevant code or summaries. Second, it can work with retrieval databases that include unimodal (only code or natural language description) or bimodal instances (code-description pairs). We conduct experiments and extensive analysis on two benchmark datasets of code generation and summarization in Java and Python, and the promising results endorse the effectiveness of our proposed retrieval augmented framework.
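The retrieve-then-supplement pattern described above can be sketched in a few lines. To stay self-contained, this sketch swaps the paper's dense retriever for plain bag-of-words cosine similarity, so it shows only the overall data flow. The function names, the `[SEP]` separator, and the tiny database are hypothetical.

```python
import math
from collections import Counter
from typing import List


def bow(text: str) -> Counter:
    """Bag-of-words vector from lowercase whitespace tokens."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity of two sparse count vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, database: List[str], k: int = 1) -> List[str]:
    """Return the k database entries most similar to the query."""
    q = bow(query)
    ranked = sorted(database, key=lambda doc: cosine(q, bow(doc)), reverse=True)
    return ranked[:k]


def augment(query: str, database: List[str], k: int = 1, sep: str = " [SEP] ") -> str:
    """Append the retrieved entries to the query to form the generator's input."""
    return query + sep + sep.join(retrieve(query, database, k))


# Hypothetical unimodal retrieval database containing code only.
database = [
    "def add(a, b): return a + b",
    "def read_file(path): return open(path).read()",
]
augmented_input = augment("add two numbers a and b", database)
```

The augmented string would then be passed to a code generation or summarization model in place of the bare query. A bimodal database would store code-description pairs and could match the query against either side.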

@inproceedings{parvez2021retrieval,
  title = {Retrieval Augmented Code Generation and Summarization},
  author = {Parvez, Md Rizwan and Ahmad, Wasi and Chakraborty, Saikat and Ray, Baishakhi and Chang, Kai-Wei},
  booktitle = {EMNLP-Finding},
  presentation_id = {https://underline.io/events/192/sessions/7923/lecture/38314-retrieval-augmented-code-generation-and-summarization},
  year = {2021}
}

Related Publications

AutoSUIT Bench - Automated Security UnIt Test Benchmark for LLM Coding, ACL-Findings, 2026
METAL: A Multi-Agent Framework for Chart Generation with Test-Time Scaling, ACL, 2025
MQT-LLaVA: Matryoshka Query Transformer for Large Vision-Language Models, NeurIPS, 2024
DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation, NeurIPS (Datasets and Benchmarks Track), 2024
VDebugger: Harnessing Execution Feedback for Debugging Visual Programs, EMNLP-Finding, 2024
AVATAR: A Parallel Corpus for Java-Python Program Translation, ACL-Finding (short), 2023
Unified Pre-training for Program Understanding and Generation, NAACL, 2021

Information Extraction and Question Answering

Relation-Guided Pre-Training for Open-Domain Question Answering

Ziniu Hu, Yizhou Sun, and Kai-Wei Chang, in EMNLP-Finding, 2021.

QA Sessions: FINDINGS PAPERS - QUESTION ANSWERING

Answering complex open-domain questions requires understanding the latent relations between the entities involved. However, we found that the existing QA datasets are extremely imbalanced in some types of relations, which hurts the generalization performance over questions with long-tail relations. To remedy this problem, in this paper, we propose a Relation-Guided Pre-Training (RGPT-QA) framework. We first generate a relational QA dataset covering a wide range of relations from both the Wikidata triplets and Wikipedia hyperlinks. We then pre-train a QA model to infer the latent relations from the question, and then conduct extractive QA to get the target answer entity. We demonstrate that by pre-training with the proposed RGPT-QA technique, the popular open-domain QA model, Dense Passage Retriever (DPR), achieves 2.2%, 2.4%, and 6.3% absolute improvement in Exact Match accuracy on Natural Questions, TriviaQA, and WebQuestions, respectively. In particular, we show that RGPT-QA improves significantly on questions with long-tail relations.
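For readers unfamiliar with the Exact Match metric mentioned above, here is a minimal sketch of how it is commonly computed: a prediction counts as correct when, after normalization, it equals any of the gold answers. The normalization steps (lowercasing, stripping punctuation and English articles, collapsing whitespace) follow the widely used SQuAD-style convention and are an assumption here, as are the function names and the sample data.

```python
import re
import string
from typing import List

_PUNCTUATION = set(string.punctuation)


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in _PUNCTUATION)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold_answers: List[str]) -> bool:
    """True when the normalized prediction equals any normalized gold answer."""
    pred = normalize_answer(prediction)
    return any(pred == normalize_answer(gold) for gold in gold_answers)


def em_accuracy(predictions: List[str], references: List[List[str]]) -> float:
    """Exact Match accuracy in percent over paired predictions and gold answer lists."""
    hits = sum(exact_match(p, refs) for p, refs in zip(predictions, references))
    return 100.0 * hits / len(predictions)


# Hypothetical predictions and gold answers, for illustration only.
predictions = ["The Eiffel Tower", "paris", "1990"]
references = [["Eiffel Tower"], ["Paris, France"], ["1990", "in 1990"]]
score = em_accuracy(predictions, references)
```

An "absolute improvement" is the difference between two such percentages in points, not a relative change.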

@inproceedings{hu2021relation,
  title = {Relation-Guided Pre-Training for Open-Domain Question Answering},
  author = {Hu, Ziniu and Sun, Yizhou and Chang, Kai-Wei},
  presentation_id = {https://underline.io/events/192/sessions/7932/lecture/38507-relation-guided-pre-training-for-open-domain-question-answering},
  booktitle = {EMNLP-Finding},
  year = {2021}
}

Related Publications

An Integer Linear Programming Framework for Mining Constraints from Data, ICML, 2021
Generating Syntactically Controlled Paraphrases without Using Annotated Parallel Pairs, EACL, 2021
Clinical Temporal Relation Extraction with Probabilistic Soft Logic Regularization and Global Inference, AAAI, 2021
GPT-GNN: Generative Pre-Training of Graph Neural Networks, KDD, 2020
PolicyQA: A Reading Comprehension Dataset for Privacy Policies, EMNLP-Finding (short), 2020
SentiBERT: A Transferable Transformer-Based Architecture for Compositional Sentiment Semantics, ACL, 2020
Building Language Models for Text with Named Entities, ACL, 2018
Learning from Explicit and Implicit Supervision Jointly For Algebra Word Problems, EMNLP, 2016
