The steerability of large language models toward data-driven personas

Junyi Li, Ninareh Mehrabi, Charith Peris, Palash Goyal, Kai-Wei Chang, Aram Galstyan, Richard Zemel, and Rahul Gupta, in NAACL, 2024.

Download the full text

Abstract

The recent surge in Large Language Model (LLM) related applications has led to a concurrent escalation in expectations for LLMs to accommodate a myriad of personas and encompass a broad spectrum of perspectives. An important first step towards addressing this demand is to align language models with specific personas, be it groups of users or individuals. Towards this goal, we first present a new conceptualization of a ¡¥persona¡¦. Moving beyond the traditional reliance on demographics like age, gender, or political party affiliation, we introduce a data-driven persona definition methodology built on collaborative-filtering. In this methodology, users are embedded into a continuous vector space based on their opinions and clustered into cohorts that manifest coherent views across specific inquiries. This methodology allows for a more nuanced understanding of different latent social groups present in the overall population (as opposed to simply using demographic groups) and enhances the applicability of model steerability. Finally, we present an efficient method to steer LLMs towards a particular persona. We learn a soft-prompting model to map the continuous representation of users into sequences of virtual tokens which, when prepended to the LLM input, enables the LLM to produce responses aligned with a given user. Our results show that our steerability algorithm is superior in performance compared to a collection of baselines.

Bib Entry

@inproceedings{li2024steerability,
  title = {The steerability of large language models toward data-driven personas},
  author = {Li, Junyi and Mehrabi, Ninareh and Peris, Charith and Goyal, Palash and Chang, Kai-Wei and Galstyan, Aram and Zemel, Richard and Gupta, Rahul},
  booktitle = {NAACL},
  year = {2024}
}