Controllable Text Generation with Neurally-Decomposed Oracle
Tao Meng, Sidi Lu, Nanyun Peng, and Kai-Wei Chang, in NeurIPS, 2022.
Oral Presentation (201 out of 10,411 submissions, top 1.9%)
Code | Download the full text
Abstract
We propose a general and efficient framework to control auto-regressive generation models with NeurAlly-Decomposed Oracle (NADO). Given a pre-trained base language model and a sequence-level Boolean oracle function, we propose to decompose the oracle function into token-level guidance that steers the base model during text generation. Specifically, the token-level guidance is approximated by a neural model trained on examples sampled from the base model, requiring no additional labeled data. We present the closed-form optimal solution for incorporating the token-level guidance into the base model for controllable generation. We further provide a theoretical analysis of how the approximation quality of NADO affects the controllable generation results. Experiments on two applications, (1) text generation with lexical constraints and (2) machine translation with formality control, demonstrate that our framework efficiently guides the base model toward the given oracle while maintaining high generation quality.
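The core of the method can be stated compactly. Writing p for the base model, C for the sequence-level oracle, and R^C for the decomposed token-level oracle, the closed-form solution referred to in the abstract takes, in paraphrased notation, the form:

\[
  R^{C}(x_{\le i}) \;=\; \Pr\nolimits_{x \sim p}\!\left[\, C(x) = 1 \mid x_{\le i} \,\right],
  \qquad
  q(x_i \mid x_{<i}) \;=\; p(x_i \mid x_{<i})\,
      \frac{R^{C}(x_{\le i})}{R^{C}(x_{<i})}.
\]

A candidate token is up-weighted exactly when it raises the estimated probability that the completed sequence satisfies the oracle. With the exact R^C, the controlled distribution q is automatically normalized, since R^{C}(x_{<i}) = \sum_{x_i} p(x_i \mid x_{<i})\, R^{C}(x_{\le i}) by the law of total probability; with the neural approximation trained on samples from p, this identity holds only approximately.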
Bib Entry
@inproceedings{meng2022controllable,
  title     = {Controllable Text Generation with Neurally-Decomposed Oracle},
  author    = {Meng, Tao and Lu, Sidi and Peng, Nanyun and Chang, Kai-Wei},
  booktitle = {NeurIPS},
  year      = {2022}
}
Related Publications
-
A Pseudo-Semantic Loss for Deep Generative Models with Logical Constraints
Kareem Ahmed, Kai-Wei Chang, and Guy Van den Broeck, in NeurIPS, 2023.
Full Text
Neuro-symbolic approaches bridge the gap between purely symbolic and neural approaches to learning. This often requires maximizing the probability of a symbolic constraint in the neural network's output. However, output distributions are typically assumed to be fully factorized, which prohibits the application of neuro-symbolic learning to more expressive output distributions, such as autoregressive deep generative models. In that setting, such probability computation is #P-hard, even for simple constraints. Instead, we propose to locally approximate the probability of the symbolic constraint under the pseudolikelihood distribution: the product of its full conditionals given a sample from the model. This allows our pseudo-semantic loss function to enforce the symbolic constraint. Our method bears a close relationship to several classical approximation schemes, including hogwild Gibbs sampling, consistent pseudolikelihood learning, and contrastive divergence. We test the proposed approach in three distinct settings: Sudoku, shortest-path prediction, and detoxifying large language models. Experiments show that the pseudo-semantic loss greatly improves upon the base model's ability to satisfy the desired logical constraint in its output distribution.
@inproceedings{ahmed2023neuro,
  title     = {A Pseudo-Semantic Loss for Deep Generative Models with Logical Constraints},
  author    = {Ahmed, Kareem and Chang, Kai-Wei and Van den Broeck, Guy},
  booktitle = {NeurIPS},
  year      = {2023}
}
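To make the pseudolikelihood idea in the entry above concrete, here is a minimal, self-contained sketch. The assumptions are loudly invented: a toy bigram language model over a tiny vocabulary, and the simple constraint "token 0 appears somewhere in the sequence," for which the constraint probability under a factorized distribution has a closed form. The paper applies the same recipe to deep autoregressive models and scores general constraints with tractable circuits.

# Illustrative sketch of the pseudo-semantic loss idea on a toy bigram
# model. All details (vocab, transitions, constraint) are invented for
# illustration; this is not the paper's implementation.
import numpy as np

rng = np.random.default_rng(0)
V, T = 4, 5                             # vocab size, sequence length
P0 = rng.dirichlet(np.ones(V))          # initial-token distribution
P = rng.dirichlet(np.ones(V), size=V)   # bigram transitions P[prev, next]

def seq_prob(x):
    """Exact probability of a sequence under the toy bigram model."""
    p = P0[x[0]]
    for a, b in zip(x, x[1:]):
        p *= P[a, b]
    return p

def full_conditional(x, i):
    """p(x_i = v | x_{-i}): renormalize the joint over position i."""
    probs = np.array([seq_prob(x[:i] + (v,) + x[i + 1:]) for v in range(V)])
    return probs / probs.sum()

# Sample a sequence x from the model.
x = [rng.choice(V, p=P0)]
for _ in range(T - 1):
    x.append(rng.choice(V, p=P[x[-1]]))
x = tuple(x)

# Constraint: token 0 appears somewhere. Under the pseudolikelihood
# distribution q(y) = prod_i p(y_i | x_{-i}) centered at the sample x,
# P(constraint) = 1 - prod_i (1 - q_i(token 0)).
q = np.array([full_conditional(x, i)[0] for i in range(T)])
p_constraint = 1.0 - np.prod(1.0 - q)
pseudo_semantic_loss = -np.log(p_constraint + 1e-12)
print(f"sample={x}  P(constraint)~{p_constraint:.4f}  loss={pseudo_semantic_loss:.4f}")

The key move mirrors the abstract: sample from the model, form the factorized distribution of full conditionals around that sample, and score the constraint under the factorized distribution, where the computation is tractable.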
-
Semantic Strengthening of Neuro-Symbolic Learning
Kareem Ahmed, Kai-Wei Chang, and Guy Van den Broeck, in AISTATS, 2023.
Full Text | Code
Numerous neuro-symbolic approaches have recently been proposed, typically with the goal of adding symbolic knowledge to the output layer of a neural network. Ideally, such losses maximize the probability that the neural network's predictions satisfy the underlying domain constraint. Unfortunately, this type of probabilistic inference is often computationally infeasible. Neuro-symbolic approaches therefore commonly resort to fuzzy approximations of this probabilistic objective, sacrificing sound probabilistic semantics, or to sampling, which is very seldom feasible. We approach the problem by first assuming the constraint decomposes conditioned on the features learned by the network. We then iteratively strengthen the approximation, restoring the dependence between the constraints most responsible for degrading its quality. This corresponds to computing the mutual information between pairs of constraints conditioned on the network's learned features, and can be construed as a measure of how well aligned the gradients of two distributions are. We show how to compute this quantity efficiently for tractable circuits. We test the approach on three tasks: predicting a minimum-cost path in Warcraft, predicting a minimum-cost perfect matching, and solving Sudoku puzzles, observing that it improves upon the baselines while sidestepping intractability.
@inproceedings{ahmed2023semantic,
  title     = {Semantic Strengthening of Neuro-Symbolic Learning},
  author    = {Ahmed, Kareem and Chang, Kai-Wei and Van den Broeck, Guy},
  booktitle = {AISTATS},
  year      = {2023}
}
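As a toy illustration of the pair-selection step described in the entry above, the following sketch estimates mutual information between Boolean sub-constraint indicators from samples and picks the most strongly coupled pair to merge. Everything here is a stand-in: the correlated samples are fabricated, and the empirical MI estimator replaces the paper's exact computation on tractable circuits conditioned on the network's learned features.

# Illustrative sketch of the pair-selection step in semantic
# strengthening. The data and estimator are toy stand-ins, not the
# paper's circuit-based computation.
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)

# Toy setting: n binary sub-constraints whose satisfaction we observe
# over m samples; a shared latent z fabricates correlation among them.
n, m = 4, 10_000
z = rng.random((m, 1))
C = (rng.random((m, n)) * 0.5 + z * 0.5) > 0.5   # correlated Booleans

def mutual_info(a, b):
    """MI (in nats) between two Boolean sample vectors, via empirical counts."""
    mi = 0.0
    for va in (0, 1):
        for vb in (0, 1):
            p_ab = np.mean((a == va) & (b == vb))
            p_a, p_b = np.mean(a == va), np.mean(b == vb)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

# Rank sub-constraint pairs; the top pair is the one whose independence
# assumption degrades the factorized objective the most.
scores = {(i, j): mutual_info(C[:, i], C[:, j]) for i, j in combinations(range(n), 2)}
best = max(scores, key=scores.get)
print("merge constraints", best, "MI =", round(scores[best], 4))

Merging the selected pair replaces two independent factors with their joint, which is the "restoring dependence" step in the abstract; repeating this greedily yields the iterative strengthening schedule.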