Talks

Tutorials

Indirectly Supervised Natural Language Processing, ACL 2023
Details/Slides Abstract

This tutorial targets researchers and practitioners who are interested in ML technologies for NLP from indirect supervision. In particular, we will present a diverse thread of indirect supervision studies that try to answer the following questions: (i) when and how can we provide supervision for a target task T, if all we have is data that corresponds to a "related" task T'? (ii) humans do not use exhaustive supervision; they rely on occasional feedback, and learn from incidental signals from various sources; how can we effectively incorporate such supervision in machine learning? (iii) how can we leverage multi-modal supervision to help NLP? To this end, we will discuss several lines of research that address those challenges, including (i) indirect supervision from T' that handles T with outputs spanning from a moderate size to an open space, (ii) the use of sparsely occurring and incidental signals, such as partial labels, noisy labels, knowledge-based constraints, and cross-domain or cross-task annotations, all of which have statistical associations with the task, (iii) principled ways to measure and understand why these incidental signals can contribute to our target tasks, and (iv) indirect supervision from vision-language signals. We will conclude the tutorial by outlining directions for further investigation.

Robustness and Adversarial Examples in NLP, EMNLP 2021
Details/Slides Abstract

Recent studies show that many NLP systems are sensitive and vulnerable to small perturbations of their inputs and do not generalize well across different datasets. This lack of robustness derails the use of NLP systems in real-world applications. This tutorial aims to raise awareness of practical concerns about NLP robustness. It targets NLP researchers and practitioners who are interested in building reliable NLP systems. In particular, we will review recent studies that analyze the weaknesses of NLP systems when facing adversarial inputs and data with a distribution shift. We will provide the audience with a holistic view of (1) how to use adversarial examples to examine the weaknesses of NLP models and facilitate debugging; (2) how to enhance the robustness of existing NLP models and defend against adversarial inputs; (3) how the consideration of robustness affects the real-world NLP applications used in our daily lives. We will conclude the tutorial by outlining future research directions in this area.

Recent Advances in Transferable Representation Learning, AAAI 2020
Details/Slides Abstract

Many AI tasks require cross-domain decision making. For example, many NLP tasks involve predictions across multiple languages, in which different languages can be treated as different domains; in AI-aided biomedical study, the prediction of side effects of drugs is often in parallel to modeling the interactions of proteins and organisms. To support machine learning models in solving such cross-domain tasks, a prerequisite is to extract the characteristics and relations of data components in different domains, and capture their associations in a unified representation scheme. To meet this demand, recent advances in representation learning often involve mapping unlabeled data of different domains into shared embedding spaces. In this way, cross-domain knowledge transfer can be realized by vector collocation or transformations. Such transferable representations have seen successes in a range of AI applications involving cross-domain decision making. However, frontier research in this area faces two key challenges. One is to efficiently extract features from specific domains with very few learning resources. The other is to precisely align and transfer knowledge with minimal supervision, since the alignment information that connects different domains can often be insufficient and noisy. In this tutorial, we will comprehensively review recent developments of transferable representation learning methods, with a focus on those for text, multi-relational and multimedia data. Beyond introducing the intra-domain embedding learning approaches, we will discuss various semi-supervised, weakly supervised, multi-view and self-supervised learning techniques to connect multiple domain-specific embedding representations. We will also compare retrofitting and joint learning processes for both intra-domain embedding learning and cross-domain alignment learning. In addition, we will discuss how obtained transferable representations can be utilized to address low-resource and label-less learning tasks. Participants will learn about recent trends and emerging challenges in this topic, representative tools and learning resources to obtain ready-to-use models, and how related models and techniques benefit real-world AI applications.

Bias and Fairness in Natural Language Processing, EMNLP 2019
Details/Slides Abstract

Recent advances in data-driven machine learning techniques (e.g., deep neural networks) have revolutionized many natural language processing applications. These approaches automatically learn how to make decisions based on the statistics and diagnostic information from large amounts of training data. Despite the remarkable accuracy of machine learning in various applications, learning algorithms run the risk of relying on societal biases encoded in the training data to make predictions. This often occurs even when gender and ethnicity information is not explicitly provided to the system, because learning algorithms are able to discover implicit associations between individuals and their demographic information based on other variables such as names, titles, home addresses, etc. Therefore, machine learning algorithms risk encouraging unfair and discriminatory decision making and raise serious privacy concerns. Without properly quantifying and reducing the reliance on such correlations, broad adoption of these models might have the undesirable effect of magnifying harmful stereotypes or implicit biases that rely on sensitive demographic attributes. In this tutorial, we will review the history of bias and fairness studies in machine learning and language processing and present recent community efforts in quantifying and mitigating bias in natural language processing models for a wide spectrum of tasks, including word embeddings, co-reference resolution, machine translation, and vision-and-language tasks.

Hands-on Learning to Search for Structured Prediction, NAACL 2015
Details/Slides Abstract

Many problems in natural language processing involve building outputs that are structured. The predominant approach to structured prediction is global models (such as conditional random fields), which have the advantage of clean underlying semantics at the cost of computational burdens and extreme difficulty in implementation. An alternative strategy is the learning to search (L2S) paradigm, in which the structured prediction task is cast as a sequential decision-making process. One can then devise training-time algorithms that learn to make near-optimal collective decisions. This paradigm has been gaining increasing traction over the past five years: most notably in dependency parsing (e.g., MaltParser, ClearNLP, etc.), but also much more broadly in less sequential tasks like entity/relation classification and even graph prediction problems found in social network analysis and computer vision. This tutorial has precisely one goal: an attendee should leave the tutorial with hands-on experience writing small programs to perform structured prediction for a variety of tasks, like sequence labeling, dependency parsing and, time permitting, more.

Learning and Inference in Structured Prediction Models, AAAI 2016
Details/Slides Abstract

Many prediction problems require structured decisions. That is, the goal is to assign values to multiple interdependent variables. The relationships between the output variables could represent a sequence, a set of clusters, or in the general case, a graph. When solving these problems, it is important to make consistent decisions that take the interdependencies among output variables into account. Such problems are often referred to as structured prediction problems. In past decades, multiple structured prediction models have been proposed and studied and success has been demonstrated in a range of applications, including natural language processing, information extraction, computer vision and computational biology. However, the high computational cost often limits both models' expressive power and the size of the data that can be handled. Therefore, designing efficient inference and learning algorithms for these models is a key challenge for structured prediction. In this tutorial, we will focus on recent developments in discriminative structured prediction models such as Structured SVMs and Structured Perceptron. Beyond introducing the algorithmic approaches in this domain, we will discuss ideas that result in significant improvements both in the learning and in the inference stages of these algorithms. In particular, we will discuss the use of caching techniques to reuse computations and methods for decomposing complex structures, along with learning procedures that make use of them to simplify the learning stage. We will also present a recently proposed formulation that captures similarities between structured labels by using distributed representations. Participants will learn about existing trends in learning and inference for structured prediction models, recent tools developed in this area, and how they can be applied to AI applications.
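
As a rough illustration of the Structured Perceptron named in the abstract above, the following minimal sketch trains a sequence labeler with Viterbi inference. It is not taken from the tutorial materials: the feature templates (emission and transition counts), the tag set, and the two toy sentences are all assumptions made for this example, and it does not include the caching or decomposition techniques the abstract mentions.

```python
from collections import defaultdict

# Toy data and tag set, invented for illustration only.
TAGS = ["D", "N", "V"]
DATA = [
    (["the", "dog", "barks"], ["D", "N", "V"]),
    (["a", "cat", "sleeps"], ["D", "N", "V"]),
]


def features(words, seq):
    """Count emission and transition features for a tagged sentence."""
    feats = defaultdict(int)
    for i, (word, tag) in enumerate(zip(words, seq)):
        feats[("emit", word, tag)] += 1
        if i > 0:
            feats[("trans", seq[i - 1], tag)] += 1
    return feats


def viterbi(words, tags, w):
    """Inference: return the highest-scoring tag sequence under weights w."""
    # best[i][t] = (score of best path ending in tag t at position i, backpointer)
    best = [{t: (w[("emit", words[0], t)], None) for t in tags}]
    for i in range(1, len(words)):
        column = {}
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p][0] + w[("trans", p, t)]) for p in tags),
                key=lambda pair: pair[1],
            )
            column[t] = (score + w[("emit", words[i], t)], prev)
        best.append(column)
    # Follow backpointers from the best final tag.
    seq = [max(tags, key=lambda t: best[-1][t][0])]
    for i in range(len(words) - 1, 0, -1):
        seq.append(best[i][seq[-1]][1])
    return seq[::-1]


def train(data, tags, epochs=5):
    """Learning: on each mistake, add gold features and subtract predicted ones."""
    w = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(words, tags, w)
            if pred != gold:
                for f, c in features(words, gold).items():
                    w[f] += c
                for f, c in features(words, pred).items():
                    w[f] -= c
    return w


weights = train(DATA, TAGS)
print(viterbi(["the", "dog", "barks"], TAGS, weights))
```

The sketch makes the interplay between the two stages visible: every training step calls the inference routine, which is why inference cost dominates training time in models of this kind.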

Hands-on Tutorial: Quantifying and Reducing Gender Stereotypes in Word Embeddings, FAT 2018
Details/Slides Abstract

Ensuring fairness in algorithmically-driven decision-making is important to avoid inadvertent cases of bias and perpetuation of harmful stereotypes. However, modern natural language processing techniques, which learn model parameters based on data, might rely on implicit biases present in the data to make undesirable stereotypical associations. We face this danger with word embeddings, a popular framework for representing text data as vectors that has been used in many machine learning and natural language processing tasks. Recent results show that even word embeddings trained on Google News articles exhibit female and male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases. In this tutorial, we will provide attendees with hands-on experience writing small programs to display and quantify the gender stereotypes in word embeddings. We will also show how to reduce such gender stereotypes in the word embeddings.
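
To give a flavor of the kind of small program the abstract above describes, here is a minimal sketch of one common way to quantify and reduce a gender stereotype in word vectors: project each word onto a "he" minus "she" direction, then remove that component. This is an assumption about a reasonable approach, not the tutorial's own code, and the 3-dimensional vectors below are invented toy values rather than real embeddings.

```python
import math

# Toy 3-dimensional vectors, invented for illustration (not real embeddings).
EMBEDDINGS = {
    "he": [1.0, 0.0, 0.2],
    "she": [-1.0, 0.0, 0.2],
    "engineer": [0.5, 0.8, 0.1],
    "nurse": [-0.6, 0.7, 0.1],
    "table": [0.0, 0.3, 0.9],
}


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def unit(u):
    """Scale a vector to unit length."""
    length = math.sqrt(dot(u, u))
    return [a / length for a in u]


def gender_direction(emb):
    """Unit vector pointing from 'she' toward 'he'."""
    return unit([a - b for a, b in zip(emb["he"], emb["she"])])


def bias_score(vec, direction):
    """Cosine between a word vector and the gender direction.

    Positive values lean toward 'he', negative toward 'she'.
    """
    return dot(unit(vec), direction)


def neutralize(vec, direction):
    """Remove the component of vec that lies along the gender direction."""
    projection = dot(vec, direction)
    return [a - projection * d for a, d in zip(vec, direction)]


direction = gender_direction(EMBEDDINGS)
for word in ["engineer", "nurse", "table"]:
    before = bias_score(EMBEDDINGS[word], direction)
    after = bias_score(neutralize(EMBEDDINGS[word], direction), direction)
    print(f"{word}: before={before:+.3f} after={after:+.3f}")
```

With real embeddings the direction is usually estimated from several word pairs rather than a single one; the single pair here keeps the sketch short.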

Structured Predictions: Practical Advancements and Applications in Natural Language Processing, TAAI 2017
Details/Slides Abstract

Many machine learning problems require structured decisions. That is, the goal is to assign values to multiple interdependent variables. The relationships between the output variables could represent a sequence, a set of clusters, or in the general case, a graph. When solving these problems, it is important to make consistent decisions that take the interdependencies among output variables into account. Such problems are often referred to as structured prediction problems. In past decades, multiple structured prediction models have been proposed and studied and success has been demonstrated in a range of applications, including natural language processing, information extraction, computer vision and computational biology. However, the high computational cost often limits both models' expressive power and the size of the data that can be handled. Therefore, designing efficient inference and learning algorithms for these models is a key challenge for structured prediction. In this tutorial, we will focus on recent developments in discriminative structured prediction models such as Structured SVMs and Structured Perceptron. Beyond introducing the algorithmic approaches in this domain, we will discuss ideas that result in significant improvements both in the learning and in the inference stages of these algorithms. In particular, we will discuss the use of caching techniques to reuse computations and methods for decomposing complex structures, along with learning procedures that make use of them to simplify the learning stage. We will also present a recently proposed formulation that captures similarities between structured labels by using distributed representations. We will also discuss potential risks and challenges when using structured prediction models. Participants will learn about existing trends in learning and inference for structured prediction models, recent tools developed in this area, and how they can be applied to several natural language processing tasks.

Selected Invited Talks

Bias and Exclusivity in Large Language Models
Details/Slides Abstract

The rise of Large Language Models (LLMs) has revolutionized creative writing and personalized interactions. However, these powerful tools carry a hidden risk: amplifying societal biases embedded in their training data. Without adequate measures to quantify and mitigate these biases, the widespread use of these models might inadvertently magnify prejudice or harmful implicit biases associated with sensitive demographic attributes, including gender. This talk will explore metrics and datasets for evaluating gender bias in language generation models. We will review existing bias measurements, demonstrate the inconsistencies between intrinsic bias metrics and extrinsic ones, and propose a comprehensive evaluation framework to measure bias. Additionally, this presentation will address the challenges of gender exclusivity and the representation of non-binary genders in NLP, alongside a critical examination of gender bias in LLM-generated content such as recommendation letters.

What It Takes to Control Societal Bias in Natural Language Processing
Details/Slides Abstract

Natural language processing techniques play important roles in our daily life. Although these methods have been successful in various applications, they run the risk of exploiting and reinforcing the societal biases (e.g., gender bias) that are present in the underlying data. In this talk, I will describe a collection of results that quantify and control implicit societal biases in a wide spectrum of language processing tasks, including word embeddings, coreference resolution, and visual semantic role labeling. These results give us greater control over NLP systems, helping to make them socially responsible and accountable.

Structured Predictions: Practical Advancements and Applications
Details/Slides Abstract

Many machine learning problems involve making joint predictions over a set of mutually dependent output variables. The dependencies between output variables can be represented by a structure, such as a sequence, a tree, a clustering of nodes, or a graph. Structured prediction models have been proposed for problems of this type, and they have been shown to be successful in many application areas, such as natural language processing, computer vision, and bioinformatics. There are two families of algorithms for these problems: graphical model approaches and learning to search approaches. In this talk, I will describe a collection of results that improve several aspects of these approaches. Our results lead to efficient learning algorithms for structured prediction models and for online clustering models, which, in turn, support reduction in problem size, improvements in training and evaluation speed, and improved performance. We have used our algorithms to learn expressive models from large amounts of annotated data and achieve state-of-the-art performance on several natural language processing tasks.

Practical Learning Algorithms for Structured Prediction Models
Details/Slides