AutoSUIT Bench - Automated Security UnIt Test Benchmark for LLM Coding
Samuel Osebe, Fan Yang, Junyi Li, Yue Gu, Yongxin Wang, Satyapriya Krishna, Kai-Wei Chang, Aram Galstyan, Rahul Gupta, and Weitong Ruan, in ACL-Findings, 2026.
Bib Entry
@inproceedings{osebe2026autosuit,
title = {AutoSUIT Bench - Automated Security UnIt Test Benchmark for LLM Coding},
author = {Osebe, Samuel and Yang, Fan and Li, Junyi and Gu, Yue and Wang, Yongxin and Krishna, Satyapriya and Chang, Kai-Wei and Galstyan, Aram and Gupta, Rahul and Ruan, Weitong},
booktitle = {ACL-Findings},
year = {2026}
}
Related Publications
- METAL: A Multi-Agent Framework for Chart Generation with Test-Time Scaling, ACL, 2025
- MQT-LLaVA: Matryoshka Query Transformer for Large Vision-Language Models, NeurIPS, 2024
- DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation, NeurIPS (Datasets and Benchmarks Track), 2024
- VDebugger: Harnessing Execution Feedback for Debugging Visual Programs, EMNLP-Findings, 2024
- AVATAR: A Parallel Corpus for Java-Python Program Translation, ACL-Findings (short), 2023
- Retrieval Augmented Code Generation and Summarization, EMNLP-Findings, 2021
- Unified Pre-training for Program Understanding and Generation, NAACL, 2021