At UCLA-NLP, our mission is to develop reliable, fair, accountable, and robust natural language understanding and generation technology that benefits everyone. See our recent papers here.

ACL is one of the major conferences in the field of natural language processing (NLP). We will participate in the following activities this year.

Tutorial

We will present a tutorial on “Indirectly Supervised Natural Language Processing” from 14:00 to 17:30 on July 9.

Workshop

We will host the 3rd Trustworthy Natural Language Processing Workshop on July 14.

Accepted Papers

We will present papers on the following topics:

  • Trustworthy NLP
    1. Resolving Ambiguities in Text-to-Image Generative Models, Ninareh Mehrabi, Palash Goyal, Apurv Verma, Jwala Dhamala, Varun Kumar, Qian Hu, Kai-Wei Chang, Richard Zemel, Aram Galstyan, and Rahul Gupta, in ACL, 2023. Details
    2. Efficient Shapley Values Estimation by Amortization for Text Classification, Chenghao Yang, Fan Yin, He He, Kai-Wei Chang, Xiaofei Ma, and Bing Xiang, in ACL, 2023. Details
    3. PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English, Jianfeng Chi, Wasi Uddin Ahmad, Yuan Tian, and Kai-Wei Chang, in ACL (short), 2023. Details
    4. The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks, Nikil Roashan Selvam, Sunipa Dev, Daniel Khashabi, Tushar Khot, and Kai-Wei Chang, in ACL (short), 2023. Details
  • Vision-Language and Multimodal Models
    1. MetaVL: Transferring In-Context Learning Ability From Language Models to Vision-Language Models, Masoud Monajatipoor, Liunian Harold Li, Mozhdeh Rouhsedaghat, Lin Yang, and Kai-Wei Chang, in ACL (short), 2023. Details
    2. UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding, Rui Sun, Zhecan Wang, Haoxuan You, Noel Codella, Kai-Wei Chang, and Shih-Fu Chang, in ACL-Finding, 2023. Details
    3. AVATAR: A Parallel Corpus for Java-Python Program Translation, Wasi Ahmad, Md Golam Rahman Tushar, Saikat Chakraborty, and Kai-Wei Chang, in ACL-Finding (short), 2023. Details
  • Language and Reasoning
    1. A Survey of Deep Learning for Mathematical Reasoning, Pan Lu, Liang Qiu, Wenhao Yu, Sean Welleck, and Kai-Wei Chang, in ACL, 2023. Details
    2. Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step, Liunian Harold Li, Jack Hessel, Youngjae Yu, Xiang Ren, Kai-Wei Chang, and Yejin Choi, in ACL, 2023. Details
  • Semantic and Syntactic Analysis
    1. ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset by AMR Back-Translation, Kuan-Hao Huang, Varun Iyer, I.-Hung Hsu, Anoop Kumar, Kai-Wei Chang, and Aram Galstyan, in ACL, 2023. Details
    2. TAGPRIME: A Unified Framework for Relational Structure Extraction, I.-Hung Hsu, Kuan-Hao Huang, Shuning Zhang, Wenxin Cheng, Prem Natarajan, Kai-Wei Chang, and Nanyun Peng, in ACL, 2023. Details
    3. GENEVA: Pushing the Limit of Generalizability for Event Argument Extraction with 100+ Event Types, Tanmay Parekh, I.-Hung Hsu, Kuan-Hao Huang, Kai-Wei Chang, and Nanyun Peng, in ACL, 2023. Details
    4. PIP: Parse-Instructed Prefix for Syntactically Controlled Paraphrase Generation, Yixin Wan, Kuan-Hao Huang, and Kai-Wei Chang, in ACL-Finding (short), 2023. Details
    5. Enhancing Unsupervised Semantic Parsing with Distributed Contextual Representations, Zixuan Ling, Xiaoqing Zheng, Jianhan Xu, Jinshu Lin, Kai-Wei Chang, Cho-Jui Hsieh, and Xuanjing Huang, in ACL-Finding, 2023. Details

    Trustworthy NLP: Fairness, Robustness, Model Explanation, and Social Good Applications

    1. Resolving Ambiguities in Text-to-Image Generative Models

      Ninareh Mehrabi, Palash Goyal, Apurv Verma, Jwala Dhamala, Varun Kumar, Qian Hu, Kai-Wei Chang, Richard Zemel, Aram Galstyan, and Rahul Gupta, in ACL, 2023.
      QA Sessions: POSTER SESSION 4: July 11 11:00 AM - 12:30 PM. Paper link in the virtual conference
      Full Text BibTeX Details
      Natural language often contains ambiguities that can lead to misinterpretation and miscommunication. While humans can handle ambiguities effectively by asking clarifying questions and/or relying on contextual cues and common-sense knowledge, resolving ambiguities can be notoriously hard for machines. In this work, we study ambiguities that arise in text-to-image generative models. We curate a benchmark dataset covering different types of ambiguities that occur in these systems. We then propose a framework to mitigate ambiguities in the prompts given to the systems by soliciting clarifications from the user. Through automatic and human evaluations, we show the effectiveness of our framework in generating more faithful images aligned with human intention in the presence of ambiguities.
      @inproceedings{mehrabi2023resolving,
        author = {Mehrabi, Ninareh and Goyal, Palash and Verma, Apurv and Dhamala, Jwala and Kumar, Varun and Hu, Qian and Chang, Kai-Wei and Zemel, Richard and Galstyan, Aram and Gupta, Rahul},
        booktitle = {ACL},
        title = {Resolving Ambiguities in Text-to-Image Generative Models},
        presentation_id = {https://underline.io/events/395/posters/15237/poster/76575-resolving-ambiguities-in-text-to-image-generative-models},
        year = {2023}
      }
      
      Details
    2. Efficient Shapley Values Estimation by Amortization for Text Classification

      Chenghao Yang, Fan Yin, He He, Kai-Wei Chang, Xiaofei Ma, and Bing Xiang, in ACL, 2023.
      QA Sessions: Interpretability and Analysis of Models for NLP 2: 7/11 5:45PM Paper link in the virtual conference
      Full Text BibTeX Details
      Despite the popularity of Shapley Values in explaining neural text classification models, computing them is prohibitive for large pretrained models, as it requires a large number of model evaluations over various perturbed text inputs. In practice, Shapley Values are often estimated stochastically with a smaller number of model evaluations. However, we find that the estimated Shapley Values are quite sensitive to random seeds – the top-ranked features often have little overlap under two different seeds, especially on examples with longer input text. As a result, a much larger number of model evaluations is needed to reduce the sensitivity to an acceptable level. To mitigate the trade-off between stability and efficiency, we develop an amortized model that directly predicts Shapley Values of each input feature without additional model evaluation. It is trained on a set of examples with Shapley Values estimated from a large number of model evaluations to ensure stability. Experimental results on two text classification datasets demonstrate that the proposed amortized model can estimate black-box explanation scores in milliseconds per sample in inference time and is up to 60 times more efficient than traditional methods.
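      (A minimal sketch of the permutation-sampling estimation that this work amortizes appears after this section's paper list.)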
      @inproceedings{yang2023efficient,
        title = {Efficient Shapley Values Estimation by Amortization for Text Classification},
        author = {Yang, Chenghao and Yin, Fan and He, He and Chang, Kai-Wei and Ma, Xiaofei and Xiang, Bing},
        year = {2023},
        presentation_id = {https://underline.io/events/395/sessions/15249/lecture/76179-efficient-shapley-values-estimation-by-amortization-for-text-classification},
        booktitle = {ACL}
      }
      

      Details
    3. PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English

      Jianfeng Chi, Wasi Uddin Ahmad, Yuan Tian, and Kai-Wei Chang, in ACL (short), 2023.
      QA Sessions: VIRTUAL POSTER SESSION 3: July 12 11:00 AM - 12:30 PM. Paper link in the virtual conference
      Full Text BibTeX Details
      Privacy policies provide individuals with information about their rights and how their personal information is handled. Natural language understanding (NLU) technologies can help individuals and practitioners better understand the privacy practices described in lengthy and complex documents. However, existing efforts that use NLU technologies are limited to processing the language for a single task focused on certain privacy practices. To this end, we introduce the Privacy Policy Language Understanding Evaluation (PLUE) benchmark, a multi-task benchmark for evaluating privacy policy language understanding across various tasks. We also collect a large corpus of privacy policies to enable privacy policy domain-specific language model pre-training. We evaluate several generic pre-trained language models and continue pre-training them on the collected corpus. We demonstrate that domain-specific continual pre-training offers performance improvements across all tasks.
      @inproceedings{chi2023plue,
        author = {Chi, Jianfeng and Ahmad, Wasi Uddin and Tian, Yuan and Chang, Kai-Wei},
        title = {PLUE: Language Understanding Evaluation Benchmark for Privacy Policies in English},
        presentation_id = {https://underline.io/events/395/posters/15279/poster/76751-plue-language-understanding-evaluation-benchmark-for-privacy-policies-in-english},
        booktitle = {ACL (short)},
        year = {2023}
      }
      
      Details
    4. The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks

      Nikil Roashan Selvam, Sunipa Dev, Daniel Khashabi, Tushar Khot, and Kai-Wei Chang, in ACL (short), 2023.
      QA Sessions: POSTER SESSION 2: July 10 14:00 - 15:30. Paper link in the virtual conference
      Full Text BibTeX Details Outstanding Paper Award
      How reliably can we trust the scores obtained from social bias benchmarks as faithful indicators of problematic social biases in a given language model? In this work, we study this question by contrasting social biases with non-social biases stemming from choices made during dataset construction that might not even be discernible to the human eye. To do so, we empirically simulate various alternative constructions for a given benchmark based on innocuous modifications (such as paraphrasing or random-sampling) that maintain the essence of their social bias. On two well-known social bias benchmarks (Winogender and BiasNLI) we observe that these shallow modifications have a surprising effect on the resulting degree of bias across various models. We hope these troubling observations motivate more robust measures of social biases.
      @inproceedings{roashan2023tail,
        author = {Selvam, Nikil Roashan and Dev, Sunipa and Khashabi, Daniel and Khot, Tushar and Chang, Kai-Wei},
        title = {The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks},
        presentation_id = {https://underline.io/events/395/posters/15337/poster/76963-the-tail-wagging-the-dog-dataset-construction-biases-of-social-bias-benchmarks},
        booktitle = {ACL (short)},
        year = {2023}
      }
      
      Details
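
    The stability-versus-cost trade-off described in "Efficient Shapley Values Estimation by Amortization" (entry 2 above; see the note under its abstract) is easy to make concrete. Below is a minimal sketch of the standard permutation-sampling estimator that such work starts from. It is an illustration under assumed interfaces rather than the paper's implementation: model_score stands in for any black-box classifier score, and masking with [MASK] is an assumed perturbation scheme.

      import random
      from typing import Callable, List

      def shapley_permutation_estimate(
          tokens: List[str],
          model_score: Callable[[List[str]], float],  # assumed black-box scorer
          num_permutations: int = 100,
          mask_token: str = "[MASK]",
      ) -> List[float]:
          """Estimate per-token Shapley values by permutation sampling.

          Each permutation costs len(tokens) + 1 model evaluations, so
          stabilizing the estimate with more permutations is expensive;
          this is the cost an amortized explainer replaces with a single
          forward pass of a trained value predictor.
          """
          n = len(tokens)
          values = [0.0] * n
          for _ in range(num_permutations):
              order = random.sample(range(n), n)  # a random feature ordering
              masked = [mask_token] * n           # start from a fully masked input
              prev = model_score(masked)
              for i in order:
                  masked[i] = tokens[i]           # reveal one token
                  cur = model_score(masked)
                  values[i] += (cur - prev) / num_permutations
                  prev = cur
          return values

    With few permutations, the top-ranked features can change markedly across random seeds, which is the sensitivity the paper documents; the paper's amortized model is instead trained to regress onto high-evaluation-count estimates and predicts them directly at inference time.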

    Multi-modal Models: Vision+Language, Program+Language

    1. MetaVL: Transferring In-Context Learning Ability From Language Models to Vision-Language Models

      Masoud Monajatipoor, Liunian Harold Li, Mozhdeh Rouhsedaghat, Lin Yang, and Kai-Wei Chang, in ACL (short), 2023.
      QA Sessions: POSTER SESSION 2: July 10 14:00 - 15:30. Paper link in the virtual conference
      Full Text BibTeX Details
      Large-scale language models have shown the ability to adapt to a new task via conditioning on a few demonstrations (i.e., in-context learning). However, in the vision-language domain, most large-scale pre-trained vision-language (VL) models do not possess the ability to conduct in-context learning. How can we enable in-context learning for VL models? In this paper, we study an interesting hypothesis: can we transfer the in-context learning ability from the language domain to the VL domain? Specifically, we first meta-train a language model to perform in-context learning on NLP tasks (as in MetaICL); then we transfer this model to perform VL tasks by attaching a visual encoder. Our experiments suggest that in-context learning ability can indeed be transferred across modalities: our model considerably improves the in-context learning capability on VL tasks and can even compensate significantly for the size of the model. On VQA, OK-VQA, and GQA, our method could outperform the baseline model while having 20 times fewer parameters.
      @inproceedings{monajatipoor2023metavl,
        author = {Monajatipoor, Masoud and Li, Liunian Harold and Rouhsedaghat, Mozhdeh and Yang, Lin and Chang, Kai-Wei},
        title = {MetaVL: Transferring In-Context Learning Ability From Language Models to Vision-Language Models},
        booktitle = {ACL (short)},
        presentation_id = {https://underline.io/events/395/posters/15337/poster/76709-metavl-transferring-in-context-learning-ability-from-language-models-to-vision-language-models},
        year = {2023}
      }
      
      Details
    2. UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding

      Rui Sun, Zhecan Wang, Haoxuan You, Noel Codella, Kai-Wei Chang, and Shih-Fu Chang, in ACL-Finding, 2023.
      QA Sessions: VIRTUAL POSTER SESSION 3: July 12 11:00 AM - 12:30 PM. Paper link in the virtual conference
      Full Text BibTeX Details
      Vision-language tasks, such as VQA, SNLI-VE, and VCR are challenging because they require the model’s reasoning ability to understand the semantics of the visual world and natural language. Supervised methods working for vision-language tasks have been well-studied. However, solving these tasks in a zero-shot setting is less explored. Since Contrastive Language-Image Pre-training (CLIP) has shown remarkable zero-shot performance on image-text matching, previous works utilized its strong zero-shot ability by converting vision-language tasks into an image-text matching problem, and they mainly consider global-level matching (e.g., the whole image or sentence). However, we find visual and textual fine-grained information, e.g., keywords in the sentence and objects in the image, can be fairly informative for semantics understanding. Inspired by this, we propose a unified framework to take advantage of the fine-grained information for zero-shot vision-language learning, covering multiple tasks such as VQA, SNLI-VE, and VCR. Our experiments show that our framework outperforms former zero-shot methods on VQA and achieves substantial improvement on SNLI-VE and VCR. Furthermore, our ablation studies confirm the effectiveness and generalizability of our proposed method.
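      (A generic example of the CLIP-style image-text matching described here appears after this section's paper list.)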
      @inproceedings{sun2023unifine,
        author = {Sun, Rui and Wang, Zhecan and You, Haoxuan and Codella, Noel and Chang, Kai-Wei and Chang, Shih-Fu},
        title = {UniFine: A Unified and Fine-grained Approach for Zero-shot Vision-Language Understanding},
        booktitle = {ACL-Finding},
        year = {2023},
        presentation_id = {https://underline.io/events/395/posters/15279/poster/78004-unifine-a-unified-and-fine-grained-approach-for-zero-shot-vision-language-understanding}
      }
      
      Details
    3. AVATAR: A Parallel Corpus for Java-Python Program Translation

      Wasi Ahmad, Md Golam Rahman Tushar, Saikat Chakraborty, and Kai-Wei Chang, in ACL-Finding (short), 2023.
      Full Text Code BibTeX Details
      Program translation refers to migrating source code from one programming language to another. It has tremendous practical value in software development, as porting software across different languages is time-consuming and costly. Automating program translation is of paramount importance in software migration, and recently researchers explored unsupervised approaches due to the unavailability of parallel corpora. However, the availability of pre-trained language models for programming languages enables supervised fine-tuning with a small number of labeled examples. In this work, we present a corpus of 8,475 programming problems and their solutions written in two popular languages, Java and Python. We collect the dataset from competitive programming sites, online platforms, and open source repositories. We present several baselines, including models trained from scratch or pre-trained on large-scale source code collections and fine-tuned on our proposed dataset. Experimental results show that while the models perform relatively well in terms of the lexical match, they fall short in generating code that is accurate in terms of syntax and data-flow match.
      @inproceedings{ahmad2021avatar,
        title = {AVATAR: A Parallel Corpus for Java-Python Program Translation},
        author = {Ahmad, Wasi and Tushar, Md Golam Rahman and Chakraborty, Saikat and Chang, Kai-Wei},
        booktitle = {ACL-Finding (short)},
        year = {2023}
      }
      

      Related Publications

      1. Retrieval Augmented Code Generation and Summarization

        Md Rizwan Parvez, Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, and Kai-Wei Chang, in EMNLP-Finding, 2021.
        Full Text Abstract BibTeX Details
        Software developers write a lot of source code and documentation during software development. Intrinsically, developers often recall parts of source code or code summaries that they had written in the past while implementing software or documenting them. To mimic developers’ code or summary generation behavior, we propose a retrieval augmented framework, REDCODER, that retrieves relevant code or summaries from a retrieval database and provides them as a supplement to code generation or summarization models. REDCODER has two distinguishing features. First, it extends the state-of-the-art dense retrieval technique to search for relevant code or summaries. Second, it can work with retrieval databases that include unimodal (only code or natural language description) or bimodal instances (code-description pairs). We conduct experiments and extensive analysis on two benchmark datasets of code generation and summarization in Java and Python, and the promising results endorse the effectiveness of our proposed retrieval augmented framework.
        @inproceedings{parvez2021retrieval,
          title = {Retrieval Augmented Code Generation and Summarization},
          author = {Parvez, Md Rizwan and Ahmad, Wasi and Chakraborty, Saikat and Ray, Baishakhi and Chang, Kai-Wei},
          booktitle = {EMNLP-Finding},
          presentation_id = {https://underline.io/events/192/sessions/7923/lecture/38314-retrieval-augmented-code-generation-and-summarization},
          year = {2021}
        }
        
        Details
      2. Unified Pre-training for Program Understanding and Generation

        Wasi Ahmad, Saikat Chakraborty, Baishakhi Ray, and Kai-Wei Chang, in NAACL, 2021.
        Full Text Video Code Abstract BibTeX Details Top-10 cited paper at NAACL 21
        Code summarization and generation empower conversion between programming language (PL) and natural language (NL), while code translation avails the migration of legacy code from one PL to another. This paper introduces PLBART, a sequence-to-sequence model capable of performing a broad spectrum of program and language understanding and generation tasks. PLBART is pre-trained on an extensive collection of Java and Python functions and associated NL text via denoising autoencoding. Experiments on code summarization in the English language, code generation, and code translation in seven programming languages show that PLBART outperforms or rivals state-of-the-art models. Moreover, experiments on discriminative tasks, e.g., program repair, clone detection, and vulnerable code detection, demonstrate PLBART’s effectiveness in program understanding. Furthermore, analysis reveals that PLBART learns program syntax, style (e.g., identifier naming convention), and logical flow (e.g., an if block inside an else block is equivalent to an else if block) that are crucial to program semantics and thus excels even with limited annotations.
        @inproceedings{ahmad2021unified,
          title = {Unified Pre-training for Program Understanding and Generation},
          author = {Ahmad, Wasi and Chakraborty, Saikat and Ray, Baishakhi and Chang, Kai-Wei},
          booktitle = {NAACL},
          presentation_id = {https://underline.io/events/122/sessions/4197/lecture/20024-unified-pre-training-for-program-understanding-and-generation},
          year = {2021}
        }
        
        Details

      Details
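
    Several entries in this section build on CLIP-style image-text matching; UniFine (entry 2 above; see the note under its abstract) converts vision-language tasks into exactly this matching problem before adding fine-grained signals. As background, here is generic zero-shot matching with the Hugging Face transformers implementation of CLIP. This is standard CLIP usage, not any paper's released code; the image path and candidate captions are placeholders.

      import torch
      from PIL import Image
      from transformers import CLIPModel, CLIPProcessor

      # Generic CLIP zero-shot image-text matching (illustrative only).
      model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
      processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

      image = Image.open("example.jpg")  # placeholder image path
      captions = ["a dog on a couch", "a cat on a couch"]  # candidate hypotheses

      inputs = processor(text=captions, images=image, return_tensors="pt", padding=True)
      with torch.no_grad():
          logits = model(**inputs).logits_per_image  # image-to-text similarity scores

      probs = logits.softmax(dim=-1)
      print(dict(zip(captions, probs[0].tolist())))

    Global matching like this considers only the whole image and sentence; approaches such as UniFine additionally score fine-grained evidence (keywords, detected objects) to improve zero-shot decisions.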

    Language and Reasoning

    1. A Survey of Deep Learning for Mathematical Reasoning

      Pan Lu, Liang Qiu, Wenhao Yu, Sean Welleck, and Kai-Wei Chang, in ACL, 2023.
      QA Sessions: POSTER SESSION 2: July 10 14:00 - 15:30. Paper link in the virtual conference
      Full Text BibTeX Details
      Mathematical reasoning is a fundamental aspect of human intelligence and is applicable in various fields, including science, engineering, finance, and everyday life. The development of artificial intelligence (AI) systems capable of solving math problems and proving theorems has garnered significant interest in the fields of machine learning and natural language processing. For example, mathematics serves as a testbed for aspects of reasoning that are challenging for powerful deep learning models, driving new algorithmic and modeling advances. On the other hand, recent advances in large-scale neural language models have opened up new benchmarks and opportunities to use deep learning for mathematical reasoning. In this survey paper, we review the key tasks, datasets, and methods at the intersection of mathematical reasoning and deep learning over the past decade. We also evaluate existing benchmarks and methods, and discuss future research directions in this domain.
      @inproceedings{lu2023survey,
        author = {Lu, Pan and Qiu, Liang and Yu, Wenhao and Welleck, Sean and Chang, Kai-Wei},
        title = {A Survey of Deep Learning for Mathematical Reasoning},
        booktitle = {ACL},
        year = {2023},
        presentation_id = {https://underline.io/events/395/posters/15337/poster/76360-a-survey-of-deep-learning-for-mathematical-reasoning}
      }
      

      Related Publications

      1. AVIS: Autonomous Visual Information Seeking with Large Language Models

        Ziniu Hu, Ahmet Iscen, Chen Sun, Kai-Wei Chang, Yizhou Sun, Cordelia Schmid, David A. Ross, and Alireza Fathi, in NeurIPS, 2023.
        Full Text Abstract BibTeX Details
        In this paper, we propose an autonomous information seeking visual question answering framework, AVIS. Our method leverages a Large Language Model (LLM) to dynamically strategize the utilization of external tools and to investigate their outputs, thereby acquiring the indispensable knowledge needed to provide answers to the posed questions. Responding to visual questions that necessitate external knowledge, such as "What event is commemorated by the building depicted in this image?", is a complex task. This task presents a combinatorial search space that demands a sequence of actions, including invoking APIs, analyzing their responses, and making informed decisions. We conduct a user study to collect a variety of instances of human decision-making when faced with this task. This data is then used to design a system comprised of three components: an LLM-powered planner that dynamically determines which tool to use next, an LLM-powered reasoner that analyzes and extracts key information from the tool outputs, and a working memory component that retains the acquired information throughout the process. The collected user behavior serves as a guide for our system in two key ways. First, we create a transition graph by analyzing the sequence of decisions made by users. This graph delineates distinct states and confines the set of actions available at each state. Second, we use examples of user decision-making to provide our LLM-powered planner and reasoner with relevant contextual instances, enhancing their capacity to make informed decisions. We show that AVIS achieves state-of-the-art results on knowledge-intensive visual question answering benchmarks such as Infoseek and OK-VQA.
        @inproceedings{hu2023avis,
          author = {Hu, Ziniu and Iscen, Ahmet and Sun, Chen and Chang, Kai-Wei and Sun, Yizhou and Schmid, Cordelia and Ross, David A and Fathi, Alireza},
          booktitle = {NeurIPS},
          title = {AVIS: Autonomous Visual Information Seeking with Large Language Models},
          year = {2023}
        }
        
        Details
      2. Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models

        Pan Lu, Baolin Peng, Hao Cheng, Michel Galley, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, and Jianfeng Gao, in NeurIPS, 2023.
        Full Text Code Abstract BibTeX Details
        Large language models (LLMs) have achieved remarkable progress in solving various natural language processing tasks due to emergent reasoning abilities. However, LLMs have inherent limitations as they are incapable of accessing up-to-date information (stored on the Web or in task-specific knowledge bases), using external tools, and performing precise mathematical and logical reasoning. In this paper, we present Chameleon, an AI system that mitigates these limitations by augmenting LLMs with plug-and-play modules for compositional reasoning. Chameleon synthesizes programs by composing various tools (e.g., LLMs, off-the-shelf vision models, web search engines, Python functions, and heuristic-based modules) for accomplishing complex reasoning tasks. At the heart of Chameleon is an LLM-based planner that assembles a sequence of tools to execute to generate the final response. We showcase the effectiveness of Chameleon on two multi-modal knowledge-intensive reasoning tasks: ScienceQA and TabMWP. Chameleon, powered by GPT-4, achieves an 86.54% overall accuracy on ScienceQA, improving the best published few-shot result by 11.37%. On TabMWP, GPT-4-powered Chameleon improves the accuracy by 17.0%, lifting the state of the art to 98.78%. Our analysis also shows that the GPT-4-powered planner exhibits more consistent and rational tool selection via inferring potential constraints from instructions, compared to a ChatGPT-powered planner.
        @inproceedings{lu2023chameleon,
          author = {Lu, Pan and Peng, Baolin and Cheng, Hao and Galley, Michel and Chang, Kai-Wei and Wu, Ying Nian and Zhu, Song-Chun and Gao, Jianfeng},
          title = {Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models},
          booktitle = {NeurIPS},
          year = {2023}
        }
        
        Details
      3. Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step

        Liunian Harold Li, Jack Hessel, Youngjae Yu, Xiang Ren, Kai-Wei Chang, and Yejin Choi, in ACL, 2023.
        Full Text Abstract BibTeX Details
        Chain-of-thought prompting (e.g., "Let’s think step-by-step") primes large language models to verbalize rationalization for their predictions. While chain-of-thought can lead to dramatic performance gains, benefits appear to emerge only for sufficiently large models (beyond 50B parameters). We show that orders-of-magnitude smaller models (125M – 1.3B parameters) can still benefit from chain-of-thought prompting. To achieve this, we introduce Symbolic Chain-of-Thought Distillation (SCoTD), a method to train a smaller student model on rationalizations sampled from a significantly larger teacher model. Experiments across several commonsense benchmarks show that: 1) SCoTD enhances the performance of the student model in both supervised and few-shot settings, and especially for challenge sets; 2) sampling many reasoning chains per instance from the teacher is paramount; and 3) after distillation, student chain-of-thoughts are judged by humans as comparable to the teacher, despite orders of magnitude fewer parameters. We test several hypotheses regarding what properties of chain-of-thought samples are important, e.g., diversity vs. teacher likelihood vs. open-endedness. We release our corpus of chain-of-thought samples and code.
        @inproceedings{li2023symbolic,
          title = {Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step},
          author = {Li, Liunian Harold and Hessel, Jack and Yu, Youngjae and Ren, Xiang and Chang, Kai-Wei and Choi, Yejin},
          booktitle = {ACL},
          presentation_id = {https://underline.io/events/395/posters/15197/poster/77090-symbolic-chain-of-thought-distillation-small-models-can-also-think-step-by-step?tab=poster},
          year = {2023}
        }
        
        Details
      4. On the Paradox of Learning to Reason from Data

        Honghua Zhang, Liunian Harold Li, Tao Meng, Kai-Wei Chang, and Guy Van den Broeck, in IJCAI, 2023.
        Full Text Code Abstract BibTeX Details
        Logical reasoning is needed in a wide range of NLP tasks. Can a BERT model be trained end-to-end to solve logical reasoning problems presented in natural language? We attempt to answer this question in a confined problem space where there exists a set of parameters that perfectly simulates logical reasoning. We make observations that seem to contradict each other: BERT attains near-perfect accuracy on in-distribution test examples while failing to generalize to other data distributions over the exact same problem space. Our study provides an explanation for this paradox: instead of learning to emulate the correct reasoning function, BERT has in fact learned statistical features that inherently exist in logical reasoning problems. We also show that it is infeasible to jointly remove statistical features from data, illustrating the difficulty of learning to reason in general. Our result naturally extends to other neural models and unveils the fundamental difference between learning to reason and learning to achieve high performance on NLP benchmarks using statistical features.
        @inproceedings{zhang2023on,
          title = {On the Paradox of Learning to Reason from Data},
          author = {Zhang, Honghua and Li, Liunian Harold and Meng, Tao and Chang, Kai-Wei and den Broeck, Guy Van},
          booktitle = {IJCAI},
          year = {2023}
        }
        
        Details
      5. Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning

        Pan Lu, Liang Qiu, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Tanmay Rajpurohit, Peter Clark, and Ashwin Kalyan, in ICLR, 2023.
        Full Text Abstract BibTeX Details
        Mathematical reasoning, a core ability of human intelligence, presents unique challenges for machines in abstract thinking and logical reasoning. Recent large pre-trained language models such as GPT-3 have achieved remarkable progress on mathematical reasoning tasks written in text form, such as math word problems (MWP). However, it is unknown if the models can handle more complex problems that involve math reasoning over heterogeneous information, such as tabular data. To fill the gap, we present Tabular Math Word Problems (TabMWP), a new dataset containing 38,431 open-domain grade-level problems that require mathematical reasoning on both textual and tabular data. Each question in TabMWP is aligned with a tabular context, which is presented as an image, semi-structured text, and a structured table. There are two types of questions: free-text and multi-choice, and each problem is annotated with gold solutions to reveal the multi-step reasoning process. We evaluate different pre-trained models on TabMWP, including the GPT-3 model in a few-shot setting. As earlier studies suggest, since few-shot GPT-3 relies on the selection of in-context examples, its performance is unstable and can degrade to near chance. The unstable issue is more severe when handling complex problems like TabMWP. To mitigate this, we further propose a novel approach, PromptPG, which utilizes policy gradient to learn to select in-context examples from a small amount of training data and then constructs the corresponding prompt for the test example. Experimental results show that our method outperforms the best baseline by 5.31% on the accuracy metric and reduces the prediction variance significantly compared to random selection, which verifies its effectiveness in the selection of in-context examples.
        @inproceedings{lu2023dynamic,
          author = {Lu, Pan and Qiu, Liang and Chang, Kai-Wei and Wu, Ying Nian and Zhu, Song-Chun and Rajpurohit, Tanmay and Clark, Peter and Kalyan, Ashwin},
          title = {Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning},
          booktitle = {ICLR},
          year = {2023}
        }
        
        Details
      6. Semantic Probabilistic Layers for Neuro-Symbolic Learning

        Kareem Ahmed, Stefano Teso, Kai-Wei Chang, Guy Van den Broeck, and Antonio Vergari, in NeurIPS, 2022.
        Full Text BibTeX Details
        @inproceedings{ahmed2022semantic,
          title = {Semantic Probabilistic Layers for Neuro-Symbolic Learning},
          author = {Ahmed, Kareem and Teso, Stefano and Chang, Kai-Wei and den Broeck, Guy Van and Vergari, Antonio},
          booktitle = {NeurIPS},
          year = {2022}
        }
        
        Details
      7. Neuro-Symbolic Entropy Regularization

        Kareem Ahmed, Eric Wang, Kai-Wei Chang, and Guy Van den Broeck, in UAI, 2022.
        Full Text Abstract BibTeX Details
        In structured prediction, the goal is to jointly predict many output variables that together encode a structured object – a path in a graph, an entity-relation triple, or an ordering of objects. Such a large output space makes learning hard and requires vast amounts of labeled data. Different approaches leverage alternate sources of supervision. One approach – entropy regularization – posits that decision boundaries should lie in low-probability regions. It extracts supervision from unlabeled examples, but remains agnostic to the structure of the output space. Conversely, neuro-symbolic approaches exploit the knowledge that not every prediction corresponds to a valid structure in the output space. Yet, they do not further restrict the learned output distribution. This paper introduces a framework that unifies both approaches. We propose a loss, neuro-symbolic entropy regularization, that encourages the model to confidently predict a valid object. It is obtained by restricting entropy regularization to the distribution over only valid structures. This loss is efficiently computed when the output constraint is expressed as a tractable logic circuit. Moreover, it seamlessly integrates with other neuro-symbolic losses that eliminate invalid predictions. We demonstrate the efficacy of our approach on a series of semi-supervised and fully-supervised structured-prediction experiments, where we find that it leads to models whose predictions are more accurate and more likely to be valid.
        @inproceedings{ahmadneuro2022,
          title = {Neuro-Symbolic Entropy Regularization},
          author = {Ahmed, Kareem and Wang, Eric and Chang, Kai-Wei and den Broeck, Guy Van},
          booktitle = {UAI},
          year = {2022}
        }
        
        Details

      Details
    2. Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step

      Liunian Harold Li, Jack Hessel, Youngjae Yu, Xiang Ren, Kai-Wei Chang, and Yejin Choi, in ACL, 2023.
      QA Sessions: POSTER SESSION 1: July 10 11:00 AM - 12:30 PM. Paper link in the virtual conference
      Full Text BibTeX Details
      Chain-of-thought prompting (e.g., "Let’s think step-by-step") primes large language models to verbalize rationalization for their predictions. While chain-of-thought can lead to dramatic performance gains, benefits appear to emerge only for sufficiently large models (beyond 50B parameters). We show that orders-of-magnitude smaller models (125M – 1.3B parameters) can still benefit from chain-of-thought prompting. To achieve this, we introduce Symbolic Chain-of-Thought Distillation (SCoTD), a method to train a smaller student model on rationalizations sampled from a significantly larger teacher model. Experiments across several commonsense benchmarks show that: 1) SCoTD enhances the performance of the student model in both supervised and few-shot settings, and especially for challenge sets; 2) sampling many reasoning chains per instance from the teacher is paramount; and 3) after distillation, student chain-of-thoughts are judged by humans as comparable to the teacher, despite orders of magnitude fewer parameters. We test several hypotheses regarding what properties of chain-of-thought samples are important, e.g., diversity vs. teacher likelihood vs. open-endedness. We release our corpus of chain-of-thought samples and code.
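      (A schematic of this sample-then-distill recipe appears after this section's paper list.)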
      @inproceedings{li2023symbolic,
        title = {Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step},
        author = {Li, Liunian Harold and Hessel, Jack and Yu, Youngjae and Ren, Xiang and Chang, Kai-Wei and Choi, Yejin},
        booktitle = {ACL},
        presentation_id = {https://underline.io/events/395/posters/15197/poster/77090-symbolic-chain-of-thought-distillation-small-models-can-also-think-step-by-step?tab=poster},
        year = {2023}
      }
      

      Related Publications

      1. AVIS: Autonomous Visual Information Seeking with Large Language Models

        Ziniu Hu, Ahmet Iscen, Chen Sun, Kai-Wei Chang, Yizhou Sun, Cordelia Schmid, David A. Ross, and Alireza Fathi, in NeurIPS, 2023.
        Full Text Abstract BibTeX Details
        In this paper, we propose an autonomous information seeking visual question answering framework, AVIS. Our method leverages a Large Language Model (LLM) to dynamically strategize the utilization of external tools and to investigate their outputs, thereby acquiring the indispensable knowledge needed to provide answers to the posed questions. Responding to visual questions that necessitate external knowledge, such as "What event is commemorated by the building depicted in this image?", is a complex task. This task presents a combinatorial search space that demands a sequence of actions, including invoking APIs, analyzing their responses, and making informed decisions. We conduct a user study to collect a variety of instances of human decision-making when faced with this task. This data is then used to design a system comprised of three components: an LLM-powered planner that dynamically determines which tool to use next, an LLM-powered reasoner that analyzes and extracts key information from the tool outputs, and a working memory component that retains the acquired information throughout the process. The collected user behavior serves as a guide for our system in two key ways. First, we create a transition graph by analyzing the sequence of decisions made by users. This graph delineates distinct states and confines the set of actions available at each state. Second, we use examples of user decision-making to provide our LLM-powered planner and reasoner with relevant contextual instances, enhancing their capacity to make informed decisions. We show that AVIS achieves state-of-the-art results on knowledge-intensive visual question answering benchmarks such as Infoseek and OK-VQA.
        @inproceedings{hu2023avis,
          author = {Hu, Ziniu and Iscen, Ahmet and Sun, Chen and Chang, Kai-Wei and Sun, Yizhou and Schmid, Cordelia and Ross, David A and Fathi, Alireza},
          booktitle = {NeurIPS},
          title = {AVIS: Autonomous Visual Information Seeking with Large Language Models},
          year = {2023}
        }
        
        Details
      2. Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models

        Pan Lu, Baolin Peng, Hao Cheng, Michel Galley, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, and Jianfeng Gao, in NeurIPS, 2023.
        Full Text Code Abstract BibTeX Details
        Large language models (LLMs) have achieved remarkable progress in solving various natural language processing tasks due to emergent reasoning abilities. However, LLMs have inherent limitations as they are incapable of accessing up-to-date information (stored on the Web or in task-specific knowledge bases), using external tools, and performing precise mathematical and logical reasoning. In this paper, we present Chameleon, an AI system that mitigates these limitations by augmenting LLMs with plug-and-play modules for compositional reasoning. Chameleon synthesizes programs by composing various tools (e.g., LLMs, off-the-shelf vision models, web search engines, Python functions, and heuristic-based modules) for accomplishing complex reasoning tasks. At the heart of Chameleon is an LLM-based planner that assembles a sequence of tools to execute to generate the final response. We showcase the effectiveness of Chameleon on two multi-modal knowledge-intensive reasoning tasks: ScienceQA and TabMWP. Chameleon, powered by GPT-4, achieves an 86.54% overall accuracy on ScienceQA, improving the best published few-shot result by 11.37%. On TabMWP, GPT-4-powered Chameleon improves the accuracy by 17.0%, lifting the state of the art to 98.78%. Our analysis also shows that the GPT-4-powered planner exhibits more consistent and rational tool selection via inferring potential constraints from instructions, compared to a ChatGPT-powered planner.
        @inproceedings{lu2023chameleon,
          author = {Lu, Pan and Peng, Baolin and Cheng, Hao and Galley, Michel and Chang, Kai-Wei and Wu, Ying Nian and Zhu, Song-Chun and Gao, Jianfeng},
          title = {Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models},
          booktitle = {NeurIPS},
          year = {2023}
        }
        
        Details
      3. A Survey of Deep Learning for Mathematical Reasoning

        Pan Lu, Liang Qiu, Wenhao Yu, Sean Welleck, and Kai-Wei Chang, in ACL, 2023.
        Full Text Abstract BibTeX Details
        Mathematical reasoning is a fundamental aspect of human intelligence and is applicable in various fields, including science, engineering, finance, and everyday life. The development of artificial intelligence (AI) systems capable of solving math problems and proving theorems has garnered significant interest in the fields of machine learning and natural language processing. For example, mathematics serves as a testbed for aspects of reasoning that are challenging for powerful deep learning models, driving new algorithmic and modeling advances. On the other hand, recent advances in large-scale neural language models have opened up new benchmarks and opportunities to use deep learning for mathematical reasoning. In this survey paper, we review the key tasks, datasets, and methods at the intersection of mathematical reasoning and deep learning over the past decade. We also evaluate existing benchmarks and methods, and discuss future research directions in this domain.
        @inproceedings{lu2023survey,
          author = {Lu, Pan and Qiu, Liang and Yu, Wenhao and Welleck, Sean and Chang, Kai-Wei},
          title = {A Survey of Deep Learning for Mathematical Reasoning},
          booktitle = {ACL},
          year = {2023},
          presentation_id = {https://underline.io/events/395/posters/15337/poster/76360-a-survey-of-deep-learning-for-mathematical-reasoning}
        }
        
        Details
      4. Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step

        Liunian Harold Li, Jack Hessel, Youngjae Yu, Xiang Ren, Kai-Wei Chang, and Yejin Choi, in CL, 2023.
        Full Text Abstract BibTeX Details
        Chain-of-thought prompting (e.g., "Let’s think step-by-step") primes large language models to verbalize rationalization for their predictions. While chain-of-thought can lead to dramatic performance gains, benefits appear to emerge only for sufficiently large models (beyond 50B parameters). We show that orders-of-magnitude smaller models (125M – 1.3B parameters) can still benefit from chain-of-thought prompting. To achieve this, we introduce Symbolic Chain-of-Thought Distillation (SCoTD), a method to train a smaller student model on rationalizations sampled from a significantly larger teacher model. Experiments across several commonsense benchmarks show that: 1) SCoTD enhances the performance of the student model in both supervised and few-shot settings, and especially for challenge sets; 2) sampling many reasoning chains per instance from the teacher is paramount; and 3) after distillation, student chain-of-thoughts are judged by humans as comparable to the teacher, despite orders of magnitude fewer parameters. We test several hypotheses regarding what properties of chain-of-thought samples are important, e.g., diversity vs. teacher likelihood vs. open-endedness. We release our corpus of chain-of-thought samples and code.
        @inproceedings{li2023symbolic,
          title = {Symbolic Chain-of-Thought Distillation: Small Models Can Also "Think" Step-by-Step},
          author = {Li, Liunian Harold and Hessel, Jack and Yu, Youngjae and Ren, Xiang and Chang, Kai-Wei and Choi, Yejin},
          booktitle = {CL},
          presentation_id = {https://underline.io/events/395/posters/15197/poster/77090-symbolic-chain-of-thought-distillation-small-models-can-also-think-step-by-step?tab=poster},
          year = {2023}
        }
        
        Details
      5. On the Paradox of Learning to Reason from Data

        Honghua Zhang, Liunian Harold Li, Tao Meng, Kai-Wei Chang, and Guy Van den Broeck., in IJCAI, 2023.
        Full Text Code Abstract BibTeX Details
        Logical reasoning is needed in a wide range of NLP tasks. Can a BERT model be trained end-to-end to solve logical reasoning problems presented in natural language? We attempt to answer this question in a confined problem space where there exists a set of parameters that perfectly simulates logical reasoning. We make observations that seem to contradict each other: BERT attains near-perfect accuracy on in-distribution test examples while failing to generalize to other data distributions over the exact same problem space. Our study provides an explanation for this paradox: instead of learning to emulate the correct reasoning function, BERT has in fact learned statistical features that inherently exist in logical reasoning problems. We also show that it is infeasible to jointly remove statistical features from data, illustrating the difficulty of learning to reason in general. Our result naturally extends to other neural models and unveils the fundamental difference between learning to reason and learning to achieve high performance on NLP benchmarks using statistical features.
        @inproceedings{zhang2023on,
          title = {On the Paradox of Learning to Reason from Data},
          author = {Zhang, Honghua and Li, Liunian Harold and Meng, Tao and Chang, Kai-Wei and den Broeck., Guy Van},
          booktitle = {IJCAI},
          year = {2023}
        }
        
        Details
      6. Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning

        Pan Lu, Liang Qiu, Kai-Wei Chang, Ying Nian Wu, Song-Chun Zhu, Tanmay Rajpurohit, Peter Clark, and Ashwin Kalyan, in ICLR, 2023.
        Full Text Abstract BibTeX Details
        Mathematical reasoning, a core ability of human intelligence, presents unique challenges for machines in abstract thinking and logical reasoning. Recent large pre-trained language models such as GPT-3 have achieved remarkable progress on mathematical reasoning tasks written in text form, such as math word problems (MWP). However, it is unknown if the models can handle more complex problems that involve math reasoning over heterogeneous information, such as tabular data. To fill the gap, we present Tabular Math Word Problems (TabMWP), a new dataset containing 38,431 open-domain grade-level problems that require mathematical reasoning on both textual and tabular data. Each question in TabMWP is aligned with a tabular context, which is presented as an image, semi-structured text, and a structured table. There are two types of questions: free-text and multi-choice, and each problem is annotated with gold solutions to reveal the multi-step reasoning process. We evaluate different pre-trained models on TabMWP, including the GPT-3 model in a few-shot setting. As earlier studies suggest, since few-shot GPT-3 relies on the selection of in-context examples, its performance is unstable and can degrade to near chance. The unstable issue is more severe when handling complex problems like TabMWP. To mitigate this, we further propose a novel approach, PromptPG, which utilizes policy gradient to learn to select in-context examples from a small amount of training data and then constructs the corresponding prompt for the test example. Experimental results show that our method outperforms the best baseline by 5.31% on the accuracy metric and reduces the prediction variance significantly compared to random selection, which verifies its effectiveness in the selection of in-context examples.
        @inproceedings{lu2023dynamic,
          author = {Lu, Pan and Qiu, Liang and Chang, Kai-Wei and Wu, Ying Nian and Zhu, Song-Chun and Rajpurohit, Tanmay and Clark, Peter and Kalyan, Ashwin},
          title = {Dynamic Prompt Learning via Policy Gradient for Semi-structured Mathematical Reasoning},
          booktitle = {ICLR},
          year = {2023}
        }
        
        Details
      7. Semantic Probabilistic Layers for Neuro-Symbolic Learning

        Kareem Ahmed, Stefano Teso, Kai-Wei Chang, Guy Van den Broeck, and Antonio Vergari, in NeurIPS, 2022.
        Full Text BibTeX Details
        @inproceedings{ahmed2022semantic,
          title = {Semantic Probabilistic Layers for Neuro-Symbolic Learning},
          author = {Ahmed, Kareem and Teso, Stefano and Chang, Kai-Wei and den Broeck, Guy Van and Vergari, Antonio},
          booktitle = {NeurIPS},
          year = {2022}
        }
        
        Details
      8. Neuro-Symbolic Entropy Regularization

        Kareem Ahmed, Eric Wang, Kai-Wei Chang, and Guy Van den Broeck, in UAI, 2022.
        Full Text Abstract BibTeX Details
        In structured prediction, the goal is to jointly predict many output variables that together encode a structured object – a path in a graph, an entity-relation triple, or an ordering of objects. Such a large output space makes learning hard and requires vast amounts of labeled data. Different approaches leverage alternate sources of supervision. One approach – entropy regularization – posits that decision boundaries should lie in low-probability regions. It extracts supervision from unlabeled examples, but remains agnostic to the structure of the output space. Conversely, neuro-symbolic approaches exploit the knowledge that not every prediction corresponds to a valid structure in the output space. Yet, they does not further restrict the learned output distribution. This paper introduces a framework that unifies both approaches. We propose a loss, neuro-symbolic entropy regularization, that encourages the model to confidently predict a valid object. It is obtained by restricting entropy regularization to the distribution over only valid structures. This loss is efficiently computed when the output constraint is expressed as a tractable logic circuit. Moreover, it seamlessly integrates with other neuro-symbolic losses that eliminate invalid predictions. We demonstrate the efficacy of our approach on a series of semi-supervised and fully-supervised structured-prediction experiments, where we find that it leads to models whose predictions are more accurate and more likely to be valid.
        @inproceedings{ahmed2022neuro,
          title = {Neuro-Symbolic Entropy Regularization},
          author = {Ahmed, Kareem and Wang, Eric and Chang, Kai-Wei and Van den Broeck, Guy},
          booktitle = {UAI},
          year = {2022}
        }
        
        Details
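
        The restricted entropy is easy to state in a toy setting. The following hedged sketch brute-forces it over a tiny binary output space; the paper computes the same quantity tractably with logic circuits, so the enumeration here is purely illustrative.

        import itertools
        import math

        def neuro_symbolic_entropy(probs, is_valid):
            """Entropy of a fully factorized model restricted to valid structures.

            probs    : per-variable Bernoulli marginals
            is_valid : predicate encoding the symbolic constraint
            """
            mass, terms = 0.0, []
            for x in itertools.product([0, 1], repeat=len(probs)):
                if not is_valid(x):
                    continue
                p = math.prod(p_i if b else 1 - p_i for p_i, b in zip(probs, x))
                mass += p
                terms.append(p)
            # Renormalize onto the valid set, then take Shannon entropy.
            return -sum((p / mass) * math.log(p / mass) for p in terms if p > 0)

        # Example constraint: exactly one variable on (a one-hot structure).
        H = neuro_symbolic_entropy([0.2, 0.5, 0.9], lambda x: sum(x) == 1)

        Minimizing H pushes the model to be confident about a single valid structure, which is exactly the combination of entropy regularization and the symbolic constraint described above.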


    Semantic and Syntactic Analysis

    [1], [2], [3], [4], [5]
    1. ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset by AMR Back-Translation

      Kuan-Hao Huang, Varun Iyer, I.-Hung Hsu, Anoop Kumar, Kai-Wei Chang, and Aram Galstyan, in ACL, 2023.
      QA Sessions: POSTER SESSION 3: July 11 09:00 AM - July 11 10:30 AM Paper link in the virtual conference
      Full Text BibTeX Details Area Chair’s Award
      Paraphrase generation is a long-standing task in natural language processing (NLP). Supervised paraphrase generation models, which rely on human-annotated paraphrase pairs, are cost-inefficient and hard to scale up. On the other hand, automatically annotated paraphrase pairs (e.g., obtained by machine back-translation) usually suffer from a lack of syntactic diversity – the generated paraphrases are very similar to the source sentences in terms of syntax. In this work, we present ParaAMR, a large-scale, syntactically diverse paraphrase dataset created by abstract meaning representation (AMR) back-translation. Our quantitative analysis, qualitative examples, and human evaluation demonstrate that the paraphrases of ParaAMR are syntactically more diverse than those of existing large-scale paraphrase datasets while preserving good semantic similarity. In addition, we show that ParaAMR can be used to improve three NLP tasks: learning sentence embeddings, syntactically controlled paraphrase generation, and data augmentation for few-shot learning. Our results thus showcase the potential of ParaAMR for improving various NLP applications. (A rough back-translation sketch follows this entry.)
      @inproceedings{huang2023paraamr,
        author = {Huang, Kuan-Hao and Iyer, Varun and Hsu, I-Hung and Kumar, Anoop and Chang, Kai-Wei and Galstyan, Aram},
        title = {ParaAMR: A Large-Scale Syntactically Diverse Paraphrase Dataset by AMR Back-Translation},
        booktitle = {ACL},
        presentation_id = {https://underline.io/events/395/posters/15227/poster/76600-paraamr-a-large-scale-syntactically-diverse-paraphrase-dataset-by-amr-back-translation},
        year = {2023}
      }
      
      Details
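
      As a rough illustration of AMR back-translation, the sketch below parses a sentence into an AMR graph and regenerates text from it. It assumes the third-party amrlib package and its pretrained models, whose API details may differ; the paper's key ingredient (altering the graph, e.g. its focus, before generation to force syntactic variety) is omitted here.

      import amrlib

      stog = amrlib.load_stog_model()   # text -> AMR parser (assumed amrlib API)
      gtos = amrlib.load_gtos_model()   # AMR -> text generator

      graphs = stog.parse_sents(["The chef who ran the kitchen quit yesterday."])
      paraphrases, _ = gtos.generate(graphs)
      print(paraphrases[0])             # a different surface realization of the same meaning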
    2. TAGPRIME: A Unified Framework for Relational Structure Extraction

      I.-Hung Hsu, Kuan-Hao Huang, Shuning Zhang, Wenxin Cheng, Prem Natarajan, Kai-Wei Chang, and Nanyun Peng, in ACL, 2023.
      QA Sessions: INFORMATION EXTRACTION 1, July 11 4:15 PM Paper link in the virtual conference
      Full Text BibTeX Details
      Many tasks in natural language processing require extracting relationship information for a given condition, such as event argument extraction, relation extraction, and task-oriented semantic parsing. Recent works usually propose sophisticated models for each task independently, paying less attention to the commonality of these tasks and to the possibility of a unified framework for all of them. In this work, we take a unified view of these tasks and introduce TAGPRIME to address relational structure extraction problems. TAGPRIME is a sequence tagging model that appends priming words about the given condition (such as an event trigger) to the input text. With the self-attention mechanism in pre-trained language models, the priming words make the output contextualized representations carry more information about the given condition, and hence become more suitable for extracting condition-specific relationships. Extensive experiments and analyses on three different tasks covering ten datasets across five different languages demonstrate the generality and effectiveness of TAGPRIME. (A sketch of the primed input format follows this entry.)
      @inproceedings{hsu2023tagprime,
        author = {Hsu, I-Hung and Huang, Kuan-Hao and Zhang, Shuning and Cheng, Wenxin and Natarajan, Prem and Chang, Kai-Wei and Peng, Nanyun},
        title = {TAGPRIME: A Unified Framework for Relational Structure Extraction},
        booktitle = {ACL},
        presentation_id = {https://underline.io/events/395/sessions/15250/lecture/76330-tagprime-a-unified-framework-for-relational-structure-extraction},
        year = {2023}
      }
      
      Details
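
      The priming idea reduces to a simple change of the model input. Below is a hedged sketch using a generic Hugging Face token-classification model: the condition (an event trigger) is appended as a second segment, so self-attention can condition every token representation on it. The model name and label count are placeholders, not the paper's setup.

      import torch
      from transformers import AutoTokenizer, AutoModelForTokenClassification

      tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")
      model = AutoModelForTokenClassification.from_pretrained("bert-base-cased", num_labels=5)

      sentence = "Protesters gathered after police fired tear gas."
      trigger = "fired"                            # the given condition

      # Priming: the condition rides along as a second segment of the input.
      inputs = tokenizer(sentence, trigger, return_tensors="pt")
      with torch.no_grad():
          logits = model(**inputs).logits          # (1, seq_len, num_labels)
      tags = logits.argmax(-1)                     # per-token role predictions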
    3. GENEVA: Pushing the Limit of Generalizability for Event Argument Extraction with 100+ Event Types

      Tanmay Parekh, I.-Hung Hsu, Kuan-Hao Huang, Kai-Wei Chang, and Nanyun Peng, in ACL, 2023.
      QA Sessions: POSTER SESSION 6: July 12 9:00 AM - July 12 10:30 AM Paper link in the virtual conference
      Full Text Code BibTeX Details
      Recent works in Event Argument Extraction (EAE) have focused on improving model generalizability to cater to new events and domains. However, standard benchmarking datasets like ACE and ERE cover less than 40 event types and 25 entity-centric argument roles. Limited diversity and coverage hinder these datasets from adequately evaluating the generalizability of EAE models. In this paper, we first contribute a large and diverse EAE ontology, created by transforming FrameNet, a comprehensive semantic role labeling (SRL) dataset, for EAE by exploiting the similarity between the two tasks. Exhaustive human expert annotations are then collected to build the ontology, yielding 115 events and 220 argument roles, with a significant portion of the roles not being entities. We use this ontology to introduce GENEVA, a diverse generalizability benchmarking dataset comprising four test suites, aimed at evaluating models' ability to handle limited data and unseen event types. We benchmark six EAE models from various families. The results show that, owing to non-entity argument roles, even the best-performing model achieves only a 39% F1 score, indicating that GENEVA poses new generalization challenges for EAE. Overall, our large and diverse EAE ontology can aid in creating more comprehensive future resources, while GENEVA is a challenging benchmark that encourages further research on improving generalizability in EAE.
      @inproceedings{parekh2023geneva,
        title = {GENEVA: Pushing the Limit of Generalizability for Event Argument Extraction with 100+ Event Types},
        author = {Parekh, Tanmay and Hsu, I-Hung and Huang, Kuan-Hao and Chang, Kai-Wei and Peng, Nanyun},
        booktitle = {ACL},
        presentation_id = {https://underline.io/events/395/posters/15264/poster/77026-geneva-benchmarking-generalizability-for-event-argument-extraction-with-hundreds-of-event-types-and-argument-roles},
        year = {2023}
      }
      

      Details
    4. PIP: Parse-Instructed Prefix for Syntactically Controlled Paraphrase Generation

      Yixin Wan, Kuan-Hao Huang, and Kai-Wei Chang, in ACL-Finding (short), 2023.
      QA Sessions: VIRTUAL POSTER SESSION 3: July 12 11:00 AM - July 12 12:30 PM Paper link in the virtual conference
      Full Text BibTeX Details
      Syntactically controlled paraphrase generation requires language models to generate paraphrases for sentences according to specific syntactic structures. Existing fine-tuning methods for this task are costly, as all the parameters of the model need to be updated during training. Inspired by recent studies on parameter-efficient learning, we propose Parse-Instructed Prefix (PIP), a novel adaptation of prefix-tuning that tunes large pre-trained language models for syntactically controlled paraphrase generation in a low-data setting with significantly lower training cost. We introduce two methods to instruct a model's encoder prefix to capture syntax-related knowledge: direct initiation (PIP-Direct) and indirect optimization (PIP-Indirect). In contrast to traditional fine-tuning methods for this task, PIP is a compute-efficient alternative with 10 times fewer learnable parameters. Compared to existing prefix-tuning methods, PIP excels at capturing syntax control information, achieving significantly higher performance at the same level of learnable parameter count. (A minimal prefix sketch follows this entry.)
      @inproceedings{wan2023pip,
        author = {Wan, Yixin and Huang, Kuan-Hao and Chang, Kai-Wei},
        title = {PIP: Parse-Instructed Prefix for Syntactically Controlled Paraphrase Generation},
        booktitle = {ACL-Finding (short)},
        presentation_id = {https://underline.io/events/395/posters/15279/poster/77944-pip-parse-instructed-prefix-for-syntactically-controlled-paraphrase-generation},
        year = {2023}
      }
      
      Details
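
      A minimal sketch of the prefix component, assuming a frozen encoder whose input states we can prepend to: "direct initiation" is approximated by seeding the trainable prefix with an (assumed) embedding of the target parse. This illustrates the parameter-efficiency argument, not the paper's exact architecture.

      import torch
      import torch.nn as nn

      class SyntaxPrefix(nn.Module):
          def __init__(self, prefix_len, hidden_dim, parse_vec=None):
              super().__init__()
              init = torch.randn(prefix_len, hidden_dim) * 0.02
              if parse_vec is not None:          # PIP-Direct-style seeding (assumption)
                  init = init + parse_vec.unsqueeze(0)
              self.prefix = nn.Parameter(init)

          def forward(self, token_states):
              # Prepend the prefix to the (frozen) encoder's input states.
              batch = token_states.size(0)
              return torch.cat([self.prefix.expand(batch, -1, -1), token_states], dim=1)

      Only the prefix parameters are trained; with a prefix length that is small relative to the backbone, the learnable parameter count drops by roughly the order of magnitude the abstract describes.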
    5. Enhancing Unsupervised Semantic Parsing with Distributed Contextual Representations

      Zixuan Ling, Xiaoqing Zheng, Jianhan Xu, Jinshu Lin, Kai-Wei Chang, Cho-Jui Hsieh, and Xuanjing Huang, in ACL-Finding, 2023.
      QA Sessions: VIRTUAL POSTER SESSION 3: July 12 11:00 AM - July 12 12:30 PM Paper link in the virtual conference
      BibTeX Details
      We extend the non-parametric Bayesian model of Titov and Klementiev (2011) to deal with homonymy and polysemy by leveraging distributed contextual word and phrase representations pre-trained on a large collection of unlabelled texts. Unsupervised semantic parsing is then performed by decomposing sentences into fragments, clustering the fragments to abstract away syntactic variations of the same meaning, and predicting predicate-argument relations between the fragments. To better model the statistical dependencies between predicates and their arguments, we further employ a hierarchical Pitman-Yor process. An improved Metropolis-Hastings merge-split sampler is proposed to speed up the mixing and convergence of the Markov chains by leveraging pre-trained distributed representations. Experimental results show that the models achieve better accuracy on both question-answering and relation extraction tasks. (A toy Pitman-Yor sampler follows this entry.)
      @inproceedings{ling2023enhancing,
        author = {Ling, Zixuan and Zheng, Xiaoqing and Xu, Jianhan and Lin, Jinshu and Chang, Kai-Wei and Hsieh, Cho-Jui and Huang, Xuanjing},
        title = {Enhancing Unsupervised Semantic Parsing with Distributed Contextual Representations},
        booktitle = {ACL-Finding},
        presentation_id = {https://underline.io/events/395/posters/15279/poster/77281-enhancing-unsupervised-semantic-parsing-with-distributed-contextual-representations?tab=video},
        year = {2023}
      }
      
      Details
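
      To unpack the Pitman-Yor component, here is a toy Chinese-restaurant-style sampler for assigning one new fragment to a cluster. The discount/concentration values and cluster sizes are illustrative; the paper's model further conditions on distributed fragment representations and runs a merge-split Metropolis-Hastings sampler on top.

      import random

      def pitman_yor_assign(cluster_sizes, discount=0.5, concentration=1.0):
          """Sample a cluster index for one new fragment; len(cluster_sizes) means 'new cluster'."""
          n = sum(cluster_sizes)
          k = len(cluster_sizes)
          # Pitman-Yor predictive: existing cluster c with prob (n_c - d)/(n + a),
          # a new cluster with prob (a + d*k)/(n + a).
          weights = [(c - discount) / (n + concentration) for c in cluster_sizes]
          weights.append((concentration + discount * k) / (n + concentration))
          r, acc = random.random(), 0.0
          for idx, w in enumerate(weights):
              acc += w
              if r < acc:
                  return idx
          return k

      choice = pitman_yor_assign([3, 1])   # sizes of two existing fragment clusters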