Intent Classification and Slot Filling for Privacy Policies

Wasi Ahmad, Jianfeng Chi, Tu Le, Thomas Norton, Yuan Tian, and Kai-Wei Chang, in ACL, 2021.

Code

Download the full text

Abstract

Understanding privacy policies is crucial for users as it empowers them to learn about the information that matters to them. Sentences written in a privacy policy document explain privacy practices, and the constituent text spans convey further specific information about that practice. We refer to predicting the privacy practice explained in a sentence as intent classification and identifying the text spans sharing specific information as slot filling. In this work, we propose PolicyIE, a corpus consisting of 5,250 intent and 11,788 slot annotations spanning 31 privacy policies of websites and mobile applications. PolicyIE corpus is a challenging benchmark with limited labeled examples reflecting the cost of collecting large-scale annotations. We present two alternative neural approaches as baselines: (1) formulating intent classification and slot filling as a joint sequence tagging and (2) modeling them as a sequence-to-sequence (Seq2Seq) learning task. Experiment results show that both approaches perform comparably in intent classification, while the Seq2Seq method outperforms the sequence tagging approach in slot filling by a large margin. Error analysis reveals the deficiency of the baseline approaches, suggesting room for improvement in future works. We hope the PolicyIE corpus will stimulate future research in this domain.

Bib Entry

@inproceedings{ahmad2021intent,
  title = {Intent Classification and Slot Filling for Privacy Policies},
  author = {Ahmad, Wasi and Chi, Jianfeng and Le, Tu and Norton, Thomas and Tian, Yuan and Chang, Kai-Wei},
  booktitle = {ACL},
  year = {2021}
}

Related Publications

DiCoRe: Enhancing Zero-shot Event Detection via Divergent-Convergent LLM Reasoning, EMNLP, 2025
SNaRe: Domain-aware Data Generation for Low-Resource Event Detection, EMNLP, 2025
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory, ICLR, 2025
SPEED++: A Multilingual Event Extraction Framework for Epidemic Prediction and Preparedness, EMNLP, 2024
TextEE: Benchmark, Reevaluation, Reflections, and Future Challenges in Event Extraction, ACL-Findings, 2024
Event Detection from Social Media for Epidemic Prediction, NAACL, 2024
GENEVA: Pushing the Limit of Generalizability for Event Argument Extraction with 100+ Event Types, ACL, 2023
TAGPRIME: A Unified Framework for Relational Structure Extraction, ACL, 2023
Enhancing Unsupervised Semantic Parsing with Distributed Contextual Representations, ACL-Finding, 2023
DEGREE: A Data-Efficient Generative Event Extraction Model, NAACL, 2022