Tutorials

  • Recent Advances in Transferable Representation Learning, AAAI 2020

  • Bias and Fairness in Natural Language Processing, EMNLP 2019
    [Abstract]

    Recent advances in data-driven machine learning techniques (e.g., deep neural networks) have revolutionized many natural language processing applications. These approaches automatically learn how to make decisions based on the statistics and diagnostic information in large amounts of training data. Despite the remarkable accuracy of machine learning in various applications, learning algorithms run the risk of relying on societal biases encoded in the training data to make predictions. This often occurs even when gender and ethnicity information is not explicitly provided to the system, because learning algorithms are able to discover implicit associations between individuals and their demographic information from other variables such as names, titles, and home addresses. Therefore, machine learning algorithms risk encouraging unfair and discriminatory decision making and raise serious privacy concerns. Without properly quantifying and reducing the reliance on such correlations, broad adoption of these models might have the undesirable effect of magnifying harmful stereotypes or implicit biases that rely on sensitive demographic attributes. In this tutorial, we will review the history of bias and fairness studies in machine learning and language processing and present recent community efforts in quantifying and mitigating bias in natural language processing models for a wide spectrum of tasks, including word embeddings, coreference resolution, machine translation, and vision-and-language tasks.

  • Hands-on Learning to Search for Structured Prediction, NAACL 2015
    [Link/Slides] [Abstract]

    Many problems in natural language processing involve building outputs that are structured. The predominant approach to structured prediction is global models (such as conditional random fields), which have the advantage of clean underlying semantics at the cost of computational burdens and extreme difficulty in implementation. An alternative strategy is the learning to search (L2S) paradigm, in which the structured prediction task is cast as a sequential decision-making process. One can then devise training-time algorithms that learn to make near-optimal collective decisions. This paradigm has been gaining increasing traction over the past five years: most notably in dependency parsing (e.g., MaltParser, ClearNLP, etc.), but also much more broadly in less sequential tasks like entity/relation classification and even graph prediction problems found in social network analysis and computer vision. This tutorial has precisely one goal: an attendee should leave the tutorial with hands-on experience writing small programs to perform structured prediction for a variety of tasks, such as sequence labeling, dependency parsing, and, time permitting, more.
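
    The L2S idea can be made concrete with a minimal sketch: sequence labeling cast as left-to-right decisions, each conditioned on the previous decision, trained perceptron-style by imitating the oracle (gold) history. The toy data, label set, and feature names below are illustrative assumptions, not material from the tutorial; full L2S systems also roll in the learned policy during training.

```python
# Learning-to-search sketch: tagging as sequential decisions, where each
# decision conditions on the previous label. Toy data and features.
from collections import defaultdict

LABELS = ["DET", "NOUN", "VERB"]
train = [(["the", "dog", "barks"], ["DET", "NOUN", "VERB"])]

w = defaultdict(float)  # linear model: feature string -> weight

def feats(word, prev_label, label):
    return [f"w={word},y={label}", f"prev={prev_label},y={label}"]

def score(word, prev_label, label):
    return sum(w[f] for f in feats(word, prev_label, label))

def best(word, prev_label):
    return max(LABELS, key=lambda y: score(word, prev_label, y))

# Perceptron-style training, following the oracle history at each step.
for _ in range(5):
    for words, gold in train:
        prev = "<s>"
        for word, y_gold in zip(words, gold):
            y_hat = best(word, prev)
            if y_hat != y_gold:
                for f in feats(word, prev, y_gold):
                    w[f] += 1.0
                for f in feats(word, prev, y_hat):
                    w[f] -= 1.0
            prev = y_gold  # imitate the oracle during training

def tag(words):
    prev, out = "<s>", []
    for word in words:
        y = best(word, prev)
        out.append(y)
        prev = y  # follow own predictions at test time
    return out

print(tag(["the", "dog", "barks"]))  # -> ['DET', 'NOUN', 'VERB']
```

    Because decisions are made greedily, inference is linear in the sequence length, which is the computational appeal of L2S relative to global models.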

  • Learning and Inference in Structured Prediction Models, AAAI 2016
    [Link/Slides] [Abstract]

    Many prediction problems require structured decisions. That is, the goal is to assign values to multiple interdependent variables. The relationships between the output variables could represent a sequence, a set of clusters, or in the general case, a graph. When solving these problems, it is important to make consistent decisions that take the interdependencies among output variables into account. Such problems are often referred to as structured prediction problems. In past decades, multiple structured prediction models have been proposed and studied and success has been demonstrated in a range of applications, including natural language processing, information extraction, computer vision and computational biology. However, the high computational cost often limits both models' expressive power and the size of the data that can be handled. Therefore, designing efficient inference and learning algorithms for these models is a key challenge for structured prediction. In this tutorial, we will focus on recent developments in discriminative structured prediction models such as Structured SVMs and Structured Perceptron. Beyond introducing the algorithmic approaches in this domain, we will discuss ideas that result in significant improvements both in the learning and in the inference stages of these algorithms. In particular, we will discuss the use of caching techniques to reuse computations and methods for decomposing complex structures, along with learning procedures that make use of them to simplify the learning stage. We will also present a recently proposed formulation that captures similarities between structured labels by using distributed representations. Participants will learn about existing trends in learning and inference for structured prediction models, recent tools developed in this area, and how they can be applied to AI applications.
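
    To make the discriminative setting concrete, here is a minimal sketch of a Collins-style structured perceptron for sequence labeling with exact Viterbi inference. The toy data and feature templates (emission and transition indicators) are my own illustrative assumptions, not code from the tutorial.

```python
# Structured perceptron sketch: exact argmax inference (Viterbi) plus
# perceptron updates toward the gold structure and away from the prediction.
from collections import defaultdict

LABELS = ["DET", "NOUN", "VERB"]
train = [(["the", "dog", "barks"], ["DET", "NOUN", "VERB"])]
w = defaultdict(float)

def phi(words, labels):
    """Global feature vector: emission and transition indicator counts."""
    f = defaultdict(float)
    prev = "<s>"
    for word, y in zip(words, labels):
        f[f"emit:{word}:{y}"] += 1
        f[f"trans:{prev}:{y}"] += 1
        prev = y
    return f

def viterbi(words):
    """Exact argmax over label sequences under the current weights."""
    # chart[i][y] = (best score of a prefix ending in label y, backpointer)
    chart = [{y: (w[f"emit:{words[0]}:{y}"] + w[f"trans:<s>:{y}"], None)
              for y in LABELS}]
    for i in range(1, len(words)):
        row = {}
        for y in LABELS:
            p = max(LABELS,
                    key=lambda p: chart[i - 1][p][0] + w[f"trans:{p}:{y}"])
            row[y] = (chart[i - 1][p][0] + w[f"trans:{p}:{y}"]
                      + w[f"emit:{words[i]}:{y}"], p)
        chart.append(row)
    y = max(LABELS, key=lambda y: chart[-1][y][0])
    path = [y]
    for i in range(len(words) - 1, 0, -1):  # trace backpointers
        y = chart[i][y][1]
        path.append(y)
    return path[::-1]

# Perceptron training: promote gold features, demote predicted features.
for _ in range(5):
    for words, gold in train:
        pred = viterbi(words)
        if pred != gold:
            for f, v in phi(words, gold).items():
                w[f] += v
            for f, v in phi(words, pred).items():
                w[f] -= v

print(viterbi(["the", "dog", "barks"]))  # -> ['DET', 'NOUN', 'VERB']
```

    The Viterbi chart is exactly the kind of computation that the caching and structure-decomposition techniques discussed in the tutorial aim to speed up or reuse across training iterations.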

  • Hands-on Tutorial: Quantifying and Reducing Gender Stereotypes in Word Embeddings, FAT 2018
    [Link/Slides] [Abstract]

    Ensuring fairness in algorithmically-driven decision-making is important to avoid inadvertent cases of bias and perpetuation of harmful stereotypes. However, modern natural language processing techniques, which learn model parameters from data, might rely on implicit biases present in the data and make undesirable stereotypical associations. One such danger arises with word embeddings, a popular framework for representing text data as vectors that has been used in many machine learning and natural language processing tasks. Recent results show that even word embeddings trained on Google News articles exhibit female and male gender stereotypes to a disturbing extent. This raises concerns because their widespread use, as we describe, often tends to amplify these biases. In this tutorial, we will provide attendees hands-on experience writing small programs to display and quantify the gender stereotypes in word embeddings. We will also show how to reduce such gender stereotypes in word embeddings.
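
    The kind of small program the tutorial describes can be sketched as follows: project word vectors onto a gender direction (here, he minus she) to quantify bias, and subtract that component to reduce it. The 4-dimensional toy vectors below are made-up illustrations; real exercises would use pretrained embeddings such as word2vec trained on Google News.

```python
# Quantifying and reducing a gender stereotype in word embeddings,
# using toy vectors in place of pretrained embeddings.
import numpy as np

# Hypothetical toy embeddings; values are illustrative only.
emb = {
    "he":       np.array([ 1.0, 0.1, 0.2, 0.0]),
    "she":      np.array([-1.0, 0.1, 0.2, 0.0]),
    "engineer": np.array([ 0.4, 0.8, 0.1, 0.2]),
    "nurse":    np.array([-0.5, 0.7, 0.2, 0.1]),
}

def gender_direction():
    g = emb["he"] - emb["she"]
    return g / np.linalg.norm(g)

def gender_bias(word):
    """Projection of the normalized word vector onto the he-she direction.

    Positive values lean 'he', negative values lean 'she'."""
    v = emb[word] / np.linalg.norm(emb[word])
    return float(v @ gender_direction())

def debias(word):
    """Remove the gender component from a single word vector
    (in the spirit of hard debiasing)."""
    g = gender_direction()
    v = emb[word]
    return v - (v @ g) * g

print(gender_bias("engineer"))  # positive: leans male in this toy data
print(gender_bias("nurse"))     # negative: leans female in this toy data
```

    After `debias`, the word vector is orthogonal to the gender direction, so its projection onto that direction is zero, which is one simple way to display the effect of the mitigation.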

  • Structured Predictions: Practical Advancements and Applications in Natural Language Processing, TAAI 2017
    [Link/Slides] [Abstract]

    Many machine learning problems require structured decisions. That is, the goal is to assign values to multiple interdependent variables. The relationships between the output variables could represent a sequence, a set of clusters, or in the general case, a graph. When solving these problems, it is important to make consistent decisions that take the interdependencies among output variables into account. Such problems are often referred to as structured prediction problems. In past decades, multiple structured prediction models have been proposed and studied and success has been demonstrated in a range of applications, including natural language processing, information extraction, computer vision and computational biology. However, the high computational cost often limits both models' expressive power and the size of the data that can be handled. Therefore, designing efficient inference and learning algorithms for these models is a key challenge for structured prediction. In this tutorial, we will focus on recent developments in discriminative structured prediction models such as Structured SVMs and Structured Perceptron. Beyond introducing the algorithmic approaches in this domain, we will discuss ideas that result in significant improvements both in the learning and in the inference stages of these algorithms. In particular, we will discuss the use of caching techniques to reuse computations and methods for decomposing complex structures, along with learning procedures that make use of them to simplify the learning stage. We will also present a recently proposed formulation that captures similarities between structured labels by using distributed representations. We will also discuss potential risks and challenges when using structured prediction models. Participants will learn about existing trends in learning and inference for structured prediction models, recent tools developed in this area, and how they can be applied to several natural language processing tasks.

Selected Invited Talks

  • What It Takes to Control Societal Bias in Natural Language Processing,
    [Link/Slides] [Abstract]

    Natural language processing techniques play important roles in our daily life. Despite these methods being successful in various applications, they run the risk of exploiting and reinforcing the societal biases (e.g. gender bias) that are present in the underlying data. In this talk, I will describe a collection of results that quantify and control implicit societal biases in a wide spectrum of language processing tasks, including word embeddings, coreference resolution, and visual semantic role labeling. These results lead to greater control of NLP systems to be socially responsible and accountable.

  • Structured Predictions: Practical Advancements and Applications,
    [Link/Slides] [Abstract]

    Many machine learning problems involve making joint predictions over a set of mutually dependent output variables. The dependencies between output variables can be represented by a structure, such as a sequence, a tree, a clustering of nodes, or a graph. Structured prediction models have been proposed for problems of this type, and they have been shown to be successful in many application areas, such as natural language processing, computer vision, and bioinformatics. There are two families of algorithms for these problems: graphical model approaches and learning to search approaches. In this talk, I will describe a collection of results that improve several aspects of these approaches. Our results lead to efficient learning algorithms for structured prediction models and for online clustering models, which, in turn, support reduction in problem size, improvements in training and evaluation speed, and improved performance. We have used our algorithms to learn expressive models from large amounts of annotated data and achieve state-of-the-art performance on several natural language processing tasks.

  • Practical Learning Algorithms for Structured Prediction Models,
    [Link/Slides]