Unlearning as Multi-task Optimization: A Normalized Gradient Difference Approach with an Adaptive Learning Rate

Xiaomeng Jin, Zhiqi Bu, Bhanukiran Vinzamuri, Anil Ramakrishna, Kai-Wei Chang, Volkan Cevher, and Mingyi Hong, in NAACL, 2025.



Abstract

Machine unlearning has been used to remove unwanted knowledge acquired by large language models (LLMs). In this paper, we examine machine unlearning from an optimization perspective, framing it as a regularized multi-task optimization problem in which one task optimizes a forgetting objective and another optimizes model performance. In particular, we introduce a normalized gradient difference (NGDiff) algorithm that gives finer control over the trade-off between the two objectives and integrates a new, automatic learning rate scheduler. We provide a theoretical analysis and empirically demonstrate that NGDiff outperforms state-of-the-art unlearning methods on the TOFU and MUSE datasets while exhibiting stable training.
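
Below is a minimal PyTorch sketch of a normalized gradient difference step, included only to illustrate the idea named in the abstract: the update direction is assumed to be the difference between the unit-normalized retain and forget gradients. The ngdiff_step helper and the fixed base_lr are hypothetical placeholders, not the authors' released code, and the paper's automatic learning rate scheduler is not reproduced here.

import torch

def ngdiff_step(model, retain_loss, forget_loss, base_lr=1e-5, eps=1e-12):
    """Illustrative update: descend on the retain loss and ascend on the
    forget loss, with each task gradient normalized to unit length so that
    neither objective dominates the step direction (assumption, not the
    paper's exact procedure)."""
    params = [p for p in model.parameters() if p.requires_grad]

    # Per-task gradients (retain the graph in case both losses share activations).
    g_retain = torch.autograd.grad(retain_loss, params, retain_graph=True)
    g_forget = torch.autograd.grad(forget_loss, params)

    flat_r = torch.cat([g.reshape(-1) for g in g_retain])
    flat_f = torch.cat([g.reshape(-1) for g in g_forget])

    # Normalized gradient difference: preserve utility while removing the target data.
    direction = flat_r / (flat_r.norm() + eps) - flat_f / (flat_f.norm() + eps)

    # Placeholder step size; the paper's adaptive scheduler is not shown here.
    with torch.no_grad():
        offset = 0
        for p in params:
            n = p.numel()
            p -= base_lr * direction[offset:offset + n].view_as(p)
            offset += n

In practice one would compute retain_loss and forget_loss on separate batches from the retain and forget sets and call this step inside the usual training loop.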


Bib Entry

@inproceedings{jin2025unlearning,
  title = {Unlearning as Multi-task Optimization: A Normalized Gradient Difference Approach with an Adaptive Learning Rate},
  author = {Jin, Xiaomeng and Bu, Zhiqi and Vinzamuri, Bhanukiran and Ramakrishna, Anil and Chang, Kai-Wei and Cevher, Volkan and Hong, Mingyi},
  booktitle = {NAACL},
  year = {2025}
}

Related Publications

  1. From Narrow Unlearning to Emergent Misalignment in LLMs, ACL, 2026
  2. BLUR: A Bi-Level Optimization Approach for LLM Unlearning, EACL, 2026
  3. Not Every Token Needs Forgetting: Selective Unlearning to Limit Change in Utility in Large Language Model Unlearning, EMNLP-Findings, 2025
  4. LUME: LLM Unlearning with Multitask Evaluations, EMNLP-Findings, 2025