Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue

Jia-Chen Gu, Hao-Xiang Xu, Jun-Yu Ma, Pan Lu, Zhen-Hua Ling, Kai-Wei Chang, and Nanyun Peng, in EMNLP, 2024.

Abstract

Model editing is a technique for updating large language models (LLMs) with new knowledge to alleviate hallucinations without resource-intensive retraining. While current model editing methods can effectively modify a model’s behavior within a specific area of interest, they often overlook potential unintended side effects on the general abilities of LLMs, such as reasoning, natural language inference, and question answering. In this paper, we raise the concern that model editing’s improvements in factuality may come at the cost of a significant degradation of the model’s general abilities. We systematically analyze these side effects by evaluating four popular editing methods on three LLMs across eight representative tasks. Our extensive empirical experiments show that current editing methods struggle to simultaneously improve the factuality of LLMs and maintain their general abilities. Our analysis reveals that the side effects arise because model editing alters the original model weights excessively, leading to overfitting to the edited facts. To mitigate this, we propose RECT (RElative Change in weighT), a method that regularizes the edit update weights. Evaluation results show that RECT significantly mitigates the side effects of editing while still retaining over 94% editing performance.
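The core idea behind RECT, as the abstract describes it, is to regularize an edit by discouraging excessive changes to the original weights. A minimal NumPy sketch of one way to realize this is shown below: keep only the update entries with the largest change relative to the original weight, and drop the rest. The keep ratio, the epsilon guard, and the exact top-k selection rule are illustrative assumptions here, not the paper's implementation.

```python
import numpy as np

def regularize_by_relative_change(w_orig, w_edited, keep_ratio=0.02):
    """Keep only the top-`keep_ratio` fraction of update entries,
    ranked by the magnitude of their change relative to the
    original weight; zero out the remaining entries."""
    delta = w_edited - w_orig
    # Per-entry relative change; epsilon guards against division by zero.
    rel = np.abs(delta) / (np.abs(w_orig) + 1e-8)
    k = max(1, int(keep_ratio * rel.size))
    # Threshold at the k-th largest relative change.
    thresh = np.partition(rel.ravel(), -k)[-k]
    mask = rel >= thresh
    return w_orig + delta * mask

# Toy usage: with keep_ratio=0.25 on a 2x2 matrix, only the single
# largest relative update survives.
w = np.ones((2, 2))
delta = np.array([[0.5, 0.01], [0.02, 0.03]])
w_new = regularize_by_relative_change(w, w + delta, keep_ratio=0.25)
```

The intuition, per the abstract, is that large edits overfit to the edited facts; pruning small-relative-change entries keeps the update sparse, so most of the original model (and hence its general abilities) is left untouched.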


Bib Entry

@inproceedings{gu2024model,
  title = {Model Editing Harms General Abilities of Large Language Models: Regularization to the Rescue},
  author = {Gu, Jia-Chen and Xu, Hao-Xiang and Ma, Jun-Yu and Lu, Pan and Ling, Zhen-Hua and Chang, Kai-Wei and Peng, Nanyun},
  booktitle = {EMNLP},
  year = {2024}
}

Related Publications

  1. Training LLMs for Divide-and-Conquer Reasoning, ACL, 2026
  2. BRIEF-Pro: Universal Context Compression with Short-to-Long Synthesis for Fast and Accurate Multi-Hop Reasoning, ACL-Findings, 2026
  3. Beyond Facts: Benchmarking Distributional Reading Comprehension in Large Language Models, ACL-Findings, 2026
  4. QLASS: Boosting Language Agent Inference via Q-Guided Stepwise Search, ICML, 2025
  5. DRS: Deep Question Reformulation With Structured Output, ACL-Findings, 2025
  6. V-ALPHASOCIAL: Benchmark and Self-Reflective Chain-of-Thought Generation for Visual Social Commonsense Reasoning, ACL-Findings, 2025
  7. Towards Safety Reasoning in LLMs: AI-agentic Deliberation for Policy-embedded CoT Data Creation, ACL-Findings, 2025
  8. MQuAKE-Remastered: Multi-Hop Knowledge Editing Can Only Be Advanced with Reliable Evaluations, ICLR, 2025
  9. VISCO: Benchmarking Fine-Grained Critique and Correction Towards Self-Improvement in Visual Reasoning, CVPR, 2025
  10. BRIEF: Bridging Retrieval and Inference for Multi-hop Reasoning via Compression, NAACL-Findings, 2025
  11. QUDSELECT: Selective Decoding for Questions Under Discussion Parsing, EMNLP, 2024
  12. LLM-A*: Large Language Model Enhanced Incremental Heuristic Search on Path Planning, EMNLP-Findings, 2024
  13. Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs, ACL, 2024
  14. Are LLMs Capable of Data-based Statistical and Causal Reasoning? Benchmarking Advanced Quantitative Reasoning with Data, ACL-Findings, 2024
  15. Can small language models help large language models reason better?: LM-guided chain-of-thought, LREC-COLING, 2024
  16. IdealGPT: Iteratively Decomposing Vision and Language Reasoning via Large Language Models, EMNLP-Findings, 2023