User-Defined Aggregates for Advanced Database Applications
PI: Carlo Zaniolo, University of California, Los Angeles, zaniolo@cs.ucla.edu
Grant: NSF-IIS 0070135
Duration: September 2000 - August 2003.
Explosive growth in the scale and complexity of information services is stretching database technology beyond its limits. In particular, while state-of-the-art database management systems (DBMSs) provide extensibility through user-defined functions (UDFs) and data types, they remain ineffective in many critical application areas, e.g., data mining. The project's goal is therefore to develop powerful extensibility mechanisms that enable DBMSs to effectively support new application domains and advanced information systems. To realize this goal, a new approach is proposed based on User-Defined Aggregates (UDAs) expressed in a new SQL-based language. Since SQL is the standard language for DBMSs, this approach ensures ease of use and compatibility, overcoming a problem that besets UDF-based approaches. The project's expected accomplishments are:
- Design and implementation of an SQL-based language for UDAs, and of a system supporting their efficient execution. To this end, compilation and optimization techniques will be developed that ensure UDAs' performance and scalability on large data sets.
- Development of advanced application testbeds to demonstrate the power of the UDA system, and to evaluate and tune its performance. In most applications, such as data mining functions, spatio-temporal queries, and time series data blades, the system will be used as a DBMS extender; in others, such as analysis of computational data or Web data, it will be used as a stand-alone query processor.
- Delivery of the software to the public, with a dedicated website providing documentation, tutorials, application testbeds, and online demos.
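
For readers unfamiliar with the term, the following minimal sketch illustrates what a user-defined aggregate is. It does not use the SQL-based language proposed by the project, whose syntax is not given in this summary; instead it assumes Python's standard sqlite3 module, where an aggregate is written in a procedural host language and then registered with the database. The aggregate name value_range, the Range class, the readings table, and the helper function are all hypothetical and chosen only for illustration.

```python
import sqlite3


class Range:
    """Hypothetical aggregate: largest value minus smallest value in a group."""

    def __init__(self):
        # State is created once per group before any row is seen.
        self.lo = None
        self.hi = None

    def step(self, value):
        # Called once per input row; SQL NULLs arrive as None and are skipped.
        if value is None:
            return
        if self.lo is None or value < self.lo:
            self.lo = value
        if self.hi is None or value > self.hi:
            self.hi = value

    def finalize(self):
        # Called once per group after the last row; returns the aggregate value.
        if self.lo is None:
            return None
        return self.hi - self.lo


def value_range_by_sensor(rows):
    """Load (sensor, value) rows into an in-memory table and aggregate them."""
    conn = sqlite3.connect(":memory:")
    # Register the class as a one-argument SQL aggregate named value_range.
    conn.create_aggregate("value_range", 1, Range)
    conn.execute("CREATE TABLE readings (sensor TEXT, value REAL)")
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT sensor, value_range(value) FROM readings "
        "GROUP BY sensor ORDER BY sensor"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    sample = [("a", 1.0), ("a", 4.5), ("b", 2.0), ("b", None)]
    print(value_range_by_sensor(sample))
```

Once registered, the aggregate is called from ordinary SQL like a built-in such as SUM or MAX. The aggregate body itself, however, lives outside SQL in a host language, which is the kind of arrangement the project summary contrasts with UDAs written directly in an SQL-based language.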