Open-Domain Safety Policy Construction

Di Wu, Siyue Liu, Zixiang Ji, Ya-Liang Chang, Zhe-Yu Liu, Andrew Pleffer, and Kai-Wei Chang, in EACL-Findings, 2026.

Abstract


Bib Entry

@inproceedings{wu2026opendomain,
  title = {Open-Domain Safety Policy Construction},
  author = {Wu, Di and Liu, Siyue and Ji, Zixiang and Chang, Ya-Liang and Liu, Zhe-Yu and Pleffer, Andrew and Chang, Kai-Wei},
  booktitle = {EACL-Findings},
  year = {2026}
}

Related Publications

  1. Customize Multi-modal RAI Guardrails with Precedent-based predictions, COLM, 2025
  2. X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents, COLM, 2025
  3. Vulnerability of LLMs to Vertically Aligned Text Manipulations, ACL, 2025
  4. Exploring Visual Vulnerabilities via Multi-Loss Adversarial Search for Jailbreaking Vision-Language Models, CVPR, 2025
  5. Vulnerability of Large Language Models to Output Prefix Jailbreaks: Impact of Positions on Safety, NAACL-Findings, 2025
  6. SafeWorld: Geo-Diverse Safety Alignment, NeurIPS, 2024
  7. FLIRT: Feedback Loop In-context Red Teaming, EMNLP, 2024
  8. Data Advisor: Data Curation with Foresight for Safety Alignment of Large Language Models, EMNLP, 2024
  9. Prompt-Driven LLM Safeguarding via Directed Representation Optimization, ICML, 2024