Prompt-Driven LLM Safeguarding via Directed Representation Optimization
Chujie Zheng, Fan Yin, Hao Zhou, Fandong Meng, Jie Zhou, Kai-Wei Chang, Minlie Huang, and Nanyun Peng, in ICML, 2024.
Bib Entry
@inproceedings{zheng2024prompt,
title = {Prompt-Driven LLM Safeguarding via Directed Representation Optimization},
author = {Zheng, Chujie and Yin, Fan and Zhou, Hao and Meng, Fandong and Zhou, Jie and Chang, Kai-Wei and Huang, Minlie and Peng, Nanyun},
year = {2024},
booktitle = {ICML}
}
Related Publications