
Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue

Jia-Chen Gu, Hao-Xiang Xu, Jun-Yu Ma, Pan Lu, Zhen-Hua Ling, Kai-Wei Chang, and Nanyun Peng, in EMNLP, 2024.



Abstract

Model editing is a technique that edits large language models (LLMs) with updated knowledge to alleviate hallucinations without resource-intensive retraining. While current model editing methods can effectively modify a model’s behavior within a specific area of interest, they often overlook the potential unintended side effects on the general abilities of LLMs such as reasoning, natural language inference, and question answering. In this paper, we raise concerns that model editing’s improvements in factuality may come at the cost of a significant degradation of the model’s general abilities. We systematically analyze the side effects by evaluating four popular editing methods on three LLMs across eight representative tasks. Our extensive empirical experiments show that it is challenging for current editing methods to simultaneously improve the factuality of LLMs and maintain their general abilities. Our analysis reveals that the side effects are caused by model editing altering the original model weights excessively, leading to overfitting to the edited facts. To mitigate this, we propose a method named RECT (RElative Change in weighT) that regularizes the edit update weights. Evaluation results show that RECT can significantly mitigate the side effects of editing while still maintaining over 94% editing performance.
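
The abstract describes RECT only at a high level, as regularizing the edit update by the relative change in weight. The sketch below illustrates one way such a regularizer could look. It assumes that the update is sparsified by keeping the entries with the largest relative change |ΔW / W| and zeroing the rest. The function names, the `keep_fraction` parameter, and its default value are hypothetical and are not taken from the paper or its code.

```python
def rect_regularize(weights, delta, keep_fraction=0.4, eps=1e-12):
    """Zero out all but the entries of `delta` with the largest relative change.

    Illustrative sketch only (assumed behavior, not the authors' implementation).
    `weights` and `delta` are equally shaped nested lists (rows of floats).
    """
    # Score every entry by its relative change |delta / weight|.
    scored = [
        (abs(d) / (abs(w) + eps), i, j)
        for i, (w_row, d_row) in enumerate(zip(weights, delta))
        for j, (w, d) in enumerate(zip(w_row, d_row))
    ]
    n_keep = max(1, round(keep_fraction * len(scored)))
    # Largest relative change first; ties are broken by position for determinism.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    keep = {(i, j) for _, i, j in scored[:n_keep]}
    # Entries outside the kept set are treated as minor and set to zero.
    return [
        [d if (i, j) in keep else 0.0 for j, d in enumerate(d_row)]
        for i, d_row in enumerate(delta)
    ]


def apply_edit(weights, delta):
    """Add an (optionally regularized) update to the original weights."""
    return [[w + d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weights, delta)]


# Toy example: relative changes are 0.5, 0.05, 1.0 and 0.01.
W = [[1.0, 10.0], [0.5, 2.0]]
dW = [[0.5, 0.5], [0.5, 0.02]]
dW_reg = rect_regularize(W, dW, keep_fraction=0.5)
W_edited = apply_edit(W, dW_reg)
```

With `keep_fraction=0.5`, only the two entries with the largest relative change survive, so the edited weights stay closer to the original model than they would under the full update. Ranking by relative rather than absolute change means that a small update to a small weight can outrank a larger update to a large weight.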


Bib Entry

@inproceedings{gu2024model,
  title = {Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue},
  author = {Gu, Jia-Chen and Xu, Hao-Xiang and Ma, Jun-Yu and Lu, Pan and Ling, Zhen-Hua and Chang, Kai-Wei and Peng, Nanyun},
  booktitle = {EMNLP},
  year = {2024}
}

Related Publications

  1. QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search, ICML, 2025
  2. DRS: Deep Question Reformulation With Structured Output, ACL-Findings, 2025
  3. V-ALPHASOCIAL: Benchmark and Self-Reflective Chain-of-Thought Generation for Visual Social Commonsense Reasoning, ACL-Findings, 2025
  4. Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation, ACL-Findings, 2025
  5. VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning, CVPR, 2025
  6. MQuAKE-Remastered: Multi-Hop Knowledge Editing Can Only Be Advanced with Reliable Evaluations, ICLR, 2025
  7. BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression, NAACL-Findings, 2025
  8. QUDSELECT: Selective Decoding for Questions Under Discussion Parsing, EMNLP, 2024
  9. LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning, EMNLP-Findings, 2024
  10. Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs, ACL, 2024
  11. Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data, ACL-Findings, 2024
  12. Can small language models help large language models reason better?: LM-guided chain-of-thought, LREC-COLING, 2024
  13. IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models, EMNLP-Findings, 2023