Query Containment for Data Integration Systems
Journal of Computer and System Sciences (JCSS), 66(1):20-39, February 2003.
Todd Millstein, Alon Halevy, Marc Friedman
The problem of query containment is fundamental to many aspects of
database systems, including query optimization, determining
independence of queries from updates, and rewriting
queries using views.
In the data-integration framework, however, the standard notion of
query containment does not suffice. We define relative
which formalizes the notion of query containment relative to the
sources available to the data-integration system. First we provide optimal
bounds for relative containment for
several important classes of datalog queries, including the
common case of conjunctive queries. Next we provide bounds for the
case when sources enforce access restrictions in the form of binding
pattern constraints. Surprisingly, we show that relative containment
for conjunctive queries is still decidable in this case,
even though it is known
that finding all answers to such queries may require a recursive
datalog program over the sources. Finally, we provide tight bounds
for variants of relative containment when the queries and
descriptions may contain comparison predicates.