Query Containment for Data Integration Systems

19th ACM Symposium on Principles of Database Systems (PODS 2000), Dallas, Texas, May 15-17, 2000.
Todd Millstein, Alon Halevy, Marc Friedman
The problem of query containment is fundamental to many aspects of database systems, including query optimization, determining independence of queries from updates, and rewriting queries using views. In the data-integration framework, however, the standard notion of query containment does not suffice. We define relative containment, which formalizes the notion of query containment relative to the sources available to the data-integration system. First we provide optimal bounds for relative containment for several important classes of datalog queries, including the common case of conjunctive queries. Next we provide bounds for the case when sources enforce access restrictions in the form of binding pattern constraints. Surprisingly, we show that relative containment for conjunctive queries is still decidable in this case, even though it is known that finding all answers to such queries may require a recursive datalog program over the sources. Finally, we provide tight bounds for variants of relative containment when the queries and source descriptions may contain comparison predicates.


Superceded by this paper.