240B: Sample of topics for presentations and projects

DSMS: Overviews (Your presentations and projects cannot be on these overview papers, which can nevertheless help you in your selection and preparation)

Languages

Pattern Languages

Windows, Operators and Timestamps

Approximate Query Answering on Data Streams

Execution, Scheduling, Optimization

Load Shedding, Sampling

  1. Distributed Top-K Monitoring, by Brian Babcock, Chris Olston, in the ACM International Conference on Management of Data (SIGMOD) 2003.
  2. Maintaining Stream Statistics over Sliding Windows, by Mayur Datar, Aristides Gionis, Piotr Indyk, Rajeev Motwani, in the ACM-SIAM Symposium on Discrete Algorithms (SODA) 2002.
  3. Maintaining Variance and k-Medians over Data Stream Windows, by Brian Babcock, Mayur Datar, Rajeev Motwani, Liadan O'Callaghan, in the ACM Symposium on Principles of Database Systems (PODS) 2003.
  4. StatStream: Statistical Monitoring of Thousands of Data Streams in Real Time, by Yunyue Zhu, Dennis Shasha, in the International Conference on Very Large Data Bases (VLDB) 2002.
  5. Mining A Stream of Transactions for Customer Patterns, by Diane Lambert, Jose C. Pinheiro, in the ACM International Conference on Knowledge Discovery and Data Mining (SIGKDD) 2001.
  6. Approximate Medians and other Quantiles in One Pass and with Limited Memory, by Gurmeet Singh Manku, Sridhar Rajagopalan, Bruce G. Lindsay, in the ACM International Conference on Management of Data (SIGMOD) 1998.
  7. Random Sampling Techniques for Space Efficient Online Computation of Order Statistics of Large Datasets, by Gurmeet Singh Manku, Sridhar Rajagopalan, Bruce G. Lindsay, in the ACM International Conference on Management of Data (SIGMOD) 1999.
  8. Synopsis Data Structures for Massive Data Sets, by Phillip B. Gibbons, Yossi Matias, in the ACM-SIAM Symposium on Discrete Algorithms (SODA) 1999.

Processing of Streaming XML Documents

Complex Event Processing (CEP)

DSMS/CEP Applications

1. Cranor, Johnson, Spatscheck & Shkapenyuk. Gigascope: A Stream Database for Network Applications. SIGMOD 2003.

2. Joseph M. Hellerstein. From Database to Dataflow: New Directions in IT. Medical Records Institute Health IT Advisory Report 3(6) (2002).

3. Lerner & Shasha. The Virtues and Challenges of Ad Hoc + Streams Querying in Finance. IEEE Data Engineering Bulletin, March 2003.

4. Sistla, Wolfson, Chamberlain, Dao. Modeling and Querying Moving Objects. ICDE 1997.

5. Yao & Gehrke. Query Processing for Sensor Networks. CIDR 2003.

6. Di Wang, et al. Active Complex Event Processing over Event Streams. VLDB 2011.

Data Mining Query Languages for DBMS and DSMS