Presentation and Final for CS240B

240B: Advanced Data and Knowledge Bases:

Sample of topics for presentations and projects

Event Processing Languages and Systems

[Query languages for complex events. 1--3: early proposals. 4 language+optimization. 5+6 proposed standards+blog]

Seshadri, P., Livny, M., and Ramakrishnan, R. 1994. Sequence query processing. In Proceedings of ACM SIGMOD Conference on Management of Data. ACM, New York, 430–441.
Seshadri, P., Livny, M., and Ramakrishnan, R. 1995. SEQ: A model for sequence databases. In ICDE. 232–239.
Ramakrishnan, R., Donjerkovic, D., Ranganathan, A., Beyer, K., and Krishnaprasad, M. SRQL: Sorted relational query language. In Proceedings of the 10th Annual International Conference on Scientific and Statistical Database Management (Capri, Italy, July 1–3), 1998, 84–95.
Reza Sadri, Carlo Zaniolo, Amir Zarkesh, Jafar Adibi: Expressing and optimizing sequence queries in database systems. ACM Transactions on Database Systems (TODS), Volume 29, Issue 2 (June 2004).
Fred Zemke, Andrew Witkowski, Mitch Cherniack, Latha Colby: Pattern matching in sequences of rows. ANSI change proposal, March 27. http://www.cs.ucla.edu/classes/spring07/cs240B/notes/row-pattern-recogniton-11.pdf
Discussion Blog for above: http://tkyte.blogspot.com/2007/04/so-in-your-opinion.html

[Event Processing using WebSphere]

IBM Redbooks | WebSphere Business Integration Adapter Development, http://www.redbooks.ibm.com/abstracts/redp9119.html?Open

Ana Biazetti and Kim Gadja: Achieving complex event processing with Active Correlation Technology: Rule your domains with rules to trigger automated processes. http://www.ibm.com/developerworks/autonomic/library/ac-acact/index.html

[Event Processing using Java Message Service]

Sun's official JMS site includes documentation, FAQs and a JMS vendor list. java.sun.com/products/jms/

[Pub/Sub]
Patrick Th. Eugster et al.: The many faces of publish/subscribe. ACM Computing Surveys (CSUR), Volume 35, Issue 2 (June 2003), 114-131.

Data Streams

[Overviews]
B. Babcock, S. Babu, M. Datar, R. Motwani, J. Widom: Models and Issues in Data Stream
Systems. PODS 2002: 1-16

Lukasz Golab and M. Tamer Özsu. Issues in data stream management. ACM SIGMOD Record, 32(2):5–14, 2003.

[Language and Systems]

Arvind Arasu, Shivnath Babu, Jennifer Widom: The CQL continuous query language: semantic foundations and query execution. VLDB J. 15(2): 121-142 (2006)
Hari Balakrishnan, Magdalena Balazinska, Donald Carney, Ugur Çetintemel, Mitch Cherniack, Christian Convey, Eduardo F. Galvez, Jon Salz, Michael Stonebraker, Nesime Tatbul, Richard Tibbetts, Stanley B. Zdonik: Retrospective on Aurora. VLDB J. 13(4): 370-383 (2004).
Charles D. Cranor, Theodore Johnson, Oliver Spatscheck, Vladislav Shkapenyuk: Gigascope: A Stream Database for Network Applications. SIGMOD Conference 2003: 647-651
Arvind Arasu, Mitch Cherniack, Eduardo F. Galvez, David Maier, Anurag Maskey, Esther Ryvkina, Michael Stonebraker, Richard Tibbetts: Linear Road: A Stream Data Management Benchmark. VLDB 2004.
Yan-Nei Law, Haixun Wang, Carlo Zaniolo: Query Languages and Data Models for Database Sequences and Data Streams. VLDB 2004. 492-503.

J. Chen, D. J. DeWitt, F. Tian, and Y. Wang. NiagaraCQ: A scalable continuous query system for internet databases. In Proc. of the 2000 ACM SIGMOD Intl. Conf. on Management of Data, pages 379-390, May 2000.
Sam Madden, Mehul A. Shah, Joseph M. Hellerstein, Vijayshankar Raman: Continuously Adaptive Continuous Queries over Streams. SIGMOD 2002, 49-61.
D. Barbara. The characterization of continuous queries. Intl. Journal of Cooperative Information Systems, 8(4):295-323, 1999.
S. Chandrasekaran and M. Franklin. Streaming queries over streaming data. In VLDB, 2002.
H. Jagadish, I. Mumick, and A. Silberschatz. View maintenance issues for the chronicle data model. In PODS, pages 113-124, 1995.
L. Liu, C. Pu, and W. Tang. Continual queries for internet scale event-driven information delivery. IEEE TKDE, 11(4):583-590, 1999.
M. Sullivan. Tribeca: A stream database manager for network traffic analysis. In VLDB, 1996.
D. Terry, D. Goldberg, D. Nichols, and B. Oki. Continuous queries over append-only databases. In SIGMOD, pages 321-330, 1992.

[Windows, Operators and Timestamps]

Arvind Arasu, Jennifer Widom: Resource Sharing in Continuous Sliding-Window Aggregates. VLDB 2004.
Utkarsh Srivastava, Jennifer Widom: Memory-Limited Execution of Windowed Stream Joins. VLDB 2004: 324-335
Yijian Bai, Hetal Thakkar, Chang Luo, Haixun Wang, Carlo Zaniolo: A Data Stream Language and System Designed for Power and Extensibility. Proc. of the ACM 15th Conference on Information and Knowledge Management (CIKM'06), 2006.
Yijian Bai et al., Optimizing Timestamp Management in Data Stream Management Systems, ICDE 2007.
Theodore Johnson, S. Muthukrishnan, Vladislav Shkapenyuk, Oliver Spatscheck: A Heartbeat Mechanism and Its Application in Gigascope. VLDB 2005: 1079-1088.
Utkarsh Srivastava, Jennifer Widom: Flexible Time Management in Data Stream Systems. PODS 2004: 263-274
Jin Li, David Maier, Kristin Tufte, Vassilis Papadimos, Peter A. Tucker: Semantics and Evaluation Techniques for Window Aggregates in Data Streams. SIGMOD Conference 2005: 311-322.

[Approximate Query Answering on Data Streams]

Swarup Acharya, Phillip B. Gibbons, Viswanath Poosala, Sridhar Ramaswamy: Join Synopses for Approximate Query Answering. SIGMOD 1999, pp.275--286.

Abhinandan Das, Johannes Gehrke, Mirek Riedewald: Approximate Join Processing Over Data Streams. SIGMOD 2003, pp.40--51.
Yan-Nei Law and C. Zaniolo: Load Shedding for Window Joins on Multiple Data Streams. First International Workshop on Scalable Stream Processing Systems (SSPS'07), April 16-20, 2007, Istanbul, Turkey.
Surajit Chaudhuri, Gautam Das, Vivek Narasayya: A Robust, Optimization-Based Approach for Approximate Answering of Aggregate Queries. ACM SIGMOD/PODS 2001.
Johannes Gehrke (Cornell Univ.), Flip Korn, Divesh Srivastava: On Computing Correlated Aggregates Over Continual Data Streams. ACM SIGMOD/PODS 2001.
Michael Greenwald, Sanjeev Khanna (Univ. of Pennsylvania): Space-Efficient Online Computation of Quantile Summaries. ACM SIGMOD/PODS 2001.
Alin Dobra, Minos N. Garofalakis, Johannes Gehrke, Rajeev Rastogi: Processing complex aggregate queries over data streams. SIGMOD2002, pp.61--72.
Arvind Arasu, Gurmeet Singh Manku. Approximate Counts and Quantiles over Sliding Windows. In the ACM Symposium on Principles of Database Systems (PODS) 2004.
Brian Babcock, Chris Olston. Distributed Top-k Monitoring. In the ACM International Conference on Management of Data (SIGMOD) 2003.
Brian Babcock, Mayur Datar, Rajeev Motwani, Liadan O'Callaghan. Maintaining Variance and k-Medians over Data Stream Windows. In the ACM Symposium on Principles of Database Systems (PODS) 2003.
Brian Babcock, Mayur Datar, Rajeev Motwani: Load Shedding for Aggregation Queries over Data Streams. ICDE 2004, pp.350--361.
Jeffrey Considine, Feifei Li, George Kollios, John W. Byers: Approximate Aggregation Techniques for Sensor Databases. ICDE 2004.
Tao Li, Qi Li, Shenghuo Zhu, Mitsunori Ogihara: A Survey on Wavelet Applications in Data Mining. SIGKDD Explorations 2002 4(2), pp.49--68.
Minos N. Garofalakis, Phillip B. Gibbons: Wavelet synopses with error guarantees. SIGMOD 2002, pp.476--487.
Anna C. Gilbert, Yannis Kotidis, S. Muthukrishnan, Martin Strauss: Surfing Wavelets on Streams: One-Pass Summaries for Approximate Aggregate Queries. VLDB2001, pp.79--88.
Kaushik Chakrabarti, Minos N. Garofalakis, Rajeev Rastogi, Kyuseok Shim: Approximate Query Processing Using Wavelets. VLDB2000, pp.111--122.

[Scheduling, Load Shedding, and Distributed Processing]

B. Babcock, S. Babu, M. Datar, and R. Motwani. Chain: Operator Scheduling for Memory Minimization in Data Stream Systems. In Proc. of the ACM Intl. Conf. on Management of Data (SIGMOD 2003), June 2003.
Donald Carney, Ugur Çetintemel, Alex Rasin, Stanley B. Zdonik, Mitch Cherniack, Michael Stonebraker: Operator Scheduling in a Data Stream Manager. VLDB 2003: 838-849.
Brian Babcock, Mayur Datar, Rajeev Motwani: Load Shedding for Aggregation Queries over Data Streams. ICDE 2004.
Nesime Tatbul, Ugur Çetintemel, Stanley B. Zdonik, Mitch Cherniack, Michael Stonebraker: Load Shedding in a Data Stream Manager. VLDB 2003, pp.309--320.

Jeong-Hyon Hwang, Magdalena Balazinska, Alex Rasin, Ugur Çetintemel, Michael Stonebraker, Stanley B. Zdonik: High-Availability Algorithms for Distributed Stream Processing. ICDE 2005: 779-790.
Magdalena Balazinska, Hari Balakrishnan, Samuel Madden, Michael Stonebraker: Fault-tolerance in the Borealis distributed stream processing system. SIGMOD Conference 2005: 13-24.

[Processing of Streaming XML documents]

M. Altinel and M. J. Franklin. “Efficient Filtering of XML Documents for Selective Dissemination of Information”. In Proc. Of VLDB, 2000. [Xfilter]
C.-Y. Chan, P. Felber, M. Garofalakis, and R. Rastogi. “Efficient Filtering of XML Documents with XPath Expressions”. In Proc. of ICDE, 2002.
Z. G. Ives, A. Y. Halevy, D. S. Weld. “An XML Query Engine for Network-Bound Data”. In VLDB Journal, 2002.
J. Chen, D. J. DeWitt, F. Tian, Y. Wang. “NiagaraCQ: a scalable continuous query system for internet databases”. In Proc. Of SIGMOD, 2000.
C. Barton, P. Charles, D. Goyal, M. Raghavachari, M. Fontoura, and V. Josifovski. “Streaming XPath Processing with Forward and Backward Axes”. In Proc. of ICDE, 2003.
Y. Diao, M. Altinel, M. Franklin, et al. Path Sharing and Predicate Evaluation for High-Performance XML Filtering. In TODS, pages 467–516, 2003.
Xin Zhou, Hetal Thakkar and Carlo Zaniolo: Unifying the Processing of XML Streams and Relational Data Streams, ICDE 2006.

Data Mining Systems

Tomasz Imielinski and Heikki Mannila. A database perspective on knowledge discovery. Communications of the ACM, 39(11):58, 1996.
S. Sarawagi, S. Thomas, and R. Agrawal. Integrating association rule mining with relational database systems: Alternatives and implications. In SIGMOD, 1998.
T. Imielinski and A. Virmani. MSQL: a query language for database mining. Data Mining and Knowledge Discovery, 3:373--408, 1999.
J. Han, Y. Fu, W. Wang, K. Koperski, and O. R. Zaiane. DMQL: A data mining query language for relational databases. In Workshop on Research Issues on Data Mining and Knowledge Discovery (DMKD), pages 27--33, Montreal, Canada, June 1996.
R. Meo, G. Psaila, and S. Ceri. A new SQL-like operator for mining association rules. In VLDB, pages 122--133, Bombay, India, 1996.
Marco Botta, Jean-Francois Boulicaut, Cyrille Masson, and Rosa Meo. Query languages supporting descriptive rule mining: A comparative study. In Database Support for Data Mining Applications, pages 24--51, 2004.
Carlo Zaniolo: Mining Databases and Data Streams with Query Languages and Rules: Invited Talk, Fourth International Workshop on Knowledge Discovery in Inductive Databases, KDID 2005.
ORACLE. Oracle Data Miner Release 10gr2: http://www.oracle.com/technology/products/bi/odm.
Data Mining Group (DMG). Predictive model markup language (pmml). http://sourceforge.net/projects/pmml.
Z. Tang, J. Maclennan, and P. Kim. Building data mining solutions with OLE DB for DM and XML analysis. SIGMOD Record, 34(2):80–85, 2005.

Mining Data Bases and Data Streams

Clustering

[Book] G. J. McLachlan and K. E. Basford. Mixture Models: Inference and Applications to Clustering. John Wiley and Sons, 1988.

[Book] L. Kaufman and P. J. Rousseeuw. Finding Groups in Data: an Introduction to Cluster Analysis. John Wiley & Sons, 1990.

[CLARANS] R. Ng and J. Han. Efficient and effective clustering method for spatial data mining. VLDB'94.

[CLIQUE] R. Agrawal, J. Gehrke, D. Gunopulos, and P. Raghavan. Automatic subspace clustering of high dimensional data for data mining applications. SIGMOD'98

[OPTICS] M. Ankerst, M. Breunig, H.-P. Kriegel, and J. Sander. Optics: Ordering points to identify the clustering structure, SIGMOD’99.

[Text] Beil F., Ester M., Xu X.: "Frequent Term-Based Text Clustering", KDD'02

[Outliers] M. M. Breunig, H.-P. Kriegel, R. Ng, J. Sander. LOF: Identifying Density-Based Local Outliers. SIGMOD 2000.

[DBSCAN] M. Ester, H.-P. Kriegel, J. Sander, and X. Xu. A density-based algorithm for discovering clusters in large spatial databases. KDD'96.

[Categorical] D. Gibson, J. Kleinberg, and P. Raghavan. Clustering categorical data: An approach based on dynamic systems. VLDB’98.

[Categorical] V. Ganti, J. Gehrke, R. Ramakrishnan. CACTUS: Clustering Categorical Data Using Summaries. KDD'99.

[CURE] S. Guha, R. Rastogi, and K. Shim. Cure: An efficient clustering algorithm for large databases. SIGMOD'98.

[ROCK] S. Guha, R. Rastogi, and K. Shim. ROCK: A robust clustering algorithm for categorical attributes. In ICDE'99, pp. 512-521, Sydney, Australia, March 1999.

[Hierarchical] G. Karypis, E.-H. Han, and V. Kumar. CHAMELEON: A Hierarchical Clustering Algorithm Using Dynamic Modeling. COMPUTER, 32(8): 68-75, 1999.

[Outliers] E. Knorr and R. Ng. Algorithms for mining distance-based outliers in large datasets. VLDB’98.

[DENCLUE] A. Hinneburg, D. A. Keim: An Efficient Approach to Clustering in Large Multimedia Databases with Noise. KDD’98.

[Wavelets] G. Sheikholeslami, S. Chatterjee, and A. Zhang. WaveCluster: A multi-resolution clustering approach for very large spatial databases. VLDB’98.

[Constraints] A. K. H. Tung, J. Han, L. V. S. Lakshmanan, and R. T. Ng. Constraint-Based Clustering in Large Databases, ICDT'01.

[p-cluster] H. Wang, W. Wang, J. Yang, and P.S. Yu. Clustering by pattern similarity in large data sets, SIGMOD’ 02.

[STING] W. Wang, J. Yang, R. Muntz. STING: A Statistical Information Grid Approach to Spatial Data Mining, VLDB’97.

[BIRCH] T. Zhang, R. Ramakrishnan, and M. Livny. BIRCH : an efficient data clustering method for very large databases. SIGMOD'96.

[Data Stream Clustering]

Liadan O'Callaghan, Adam Meyerson, Rajeev Motwani, Nina Mishra, Sudipto Guha: Streaming-Data Algorithms for High-Quality Clustering. ICDE 2002: 685+
Sudipto Guha, Adam Meyerson, Nina Mishra, Rajeev Motwani, Liadan O'Callaghan: Clustering Data Streams: Theory and Practice. IEEE Trans. Knowl. Data Eng. 15(3): 515-528 (2003)
C. Aggarwal, J. Han, J. Wang, P. S. Yu. A Framework for Clustering Data Streams, VLDB'03
C. Aggarwal, J. Han, J. Wang, and P. S. Yu. A Framework for Projected Clustering of High Dimensional Data Streams, VLDB'04.

[Association Rule Mining]

R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. SIGMOD'93.

R. Agrawal and R. Srikant. Fast algorithms for mining association rules. VLDB'94

J. S. Park, M. S. Chen, and P. S. Yu. An effective hash-based algorithm for mining association rules. SIGMOD'95.

A. Savasere, E. Omiecinski, and S. Navathe. Mining for strong negative associations in a large database of customer transactions. ICDE'98.

D. Tsur, J. D. Ullman, S. Abiteboul, C. Clifton, R. Motwani, and S. Nestorov. Query flocks: A generalization of association-rule mining. SIGMOD'98.

H. Mannila, H. Toivonen, and A. I. Verkamo. Discovery of frequent episodes in event sequences. DAMI:97.

M. Zaki. SPADE: An Efficient Algorithm for Mining Frequent Sequences. Machine Learning:01.

(Max-pattern) R. J. Bayardo. Efficiently mining long patterns from databases. SIGMOD'98.

(Closed-pattern) N. Pasquier, Y. Bastide, R. Taouil, and L. Lakhal. Discovering frequent closed itemsets for association rules. ICDT'99.

(FP-Growth) J. Han, J. Pei, and Y. Yin. Mining frequent patterns without candidate generation. SIGMOD’ 00.

J. Liu, Y. Pan, K. Wang, and J. Han. Mining Frequent Item Sets by Opportunistic Projection. KDD'02

Gösta Grahne, Jianfei Zhu: Efficiently Using Prefix-trees in Mining Frequent Itemsets. FIMI 2003

Zaki and Hsiao. CHARM: An Efficient Algorithm for Closed Itemset Mining, SDM'02.

R. Srikant and R. Agrawal. Mining generalized association rules. VLDB'95.

J. Han and Y. Fu. Discovery of multiple-level association rules from large databases. VLDB'95.

B. Lent, A. Swami, and J. Widom. Clustering association rules. ICDE'97.

M. Klemettinen, H. Mannila, P. Ronkainen, H. Toivonen, and A. I. Verkamo. Finding interesting rules from large sets of discovered association rules. CIKM'94.

S. Brin, R. Motwani, and C. Silverstein. Beyond market basket: Generalizing association rules to correlations. SIGMOD'97.

C. Silverstein, S. Brin, R. Motwani, and J. Ullman. Scalable techniques for mining causal structures. VLDB'98.

P.-N. Tan, V. Kumar, and J. Srivastava. Selecting the Right Interestingness Measure for Association Patterns. KDD'02.

E. Omiecinski. Alternative Interest Measures for Mining Associations. TKDE’03.

Y. K. Lee, W.Y. Kim, Y. D. Cai, and J. Han. CoMine: Efficient Mining of Correlated Patterns. ICDM’03.

[Association on Data Streams]

G. Manku, R. Motwani. Approximate Frequency Counts over Data Streams, VLDB’02

Richard M. Karp, Scott Shenker, Christos H. Papadimitriou: A simple algorithm for finding frequent elements in streams and bags. ACM Trans. Database Syst. 28: 51-55 (2003)
C. Giannella, J. Han, J. Pei, X. Yan and P.S. Yu. Mining frequent patterns in data streams at multiple time granularities, Kargupta, et al. (eds.), Next Generation Data Mining’04
Ahmed Metwally, Divyakant Agrawal, Amr El Abbadi: Efficient Computation of Frequent and Top-k Elements in Data Streams. ICDT 2005: 398-412

[Classification]

T.-S. Lim, W.-Y. Loh, and Y.-S. Shih. A comparison of prediction accuracy, complexity, and training time of thirty-three old and new classification algorithms. Machine Learning, 2000.
J. Magidson. The Chaid approach to segmentation modeling: Chi-squared automatic interaction detection. In R. P. Bagozzi, editor, Advanced Methods of Marketing Research, Blackwell Business, 1994.
M. Mehta, R. Agrawal, and J. Rissanen. SLIQ : A fast scalable classifier for data mining. EDBT'96.
J. R. Quinlan. Bagging, boosting, and C4.5. AAAI'96.
R. Rastogi and K. Shim. Public: A decision tree classifier that integrates building and pruning. VLDB’98.
J. Shafer, R. Agrawal, and M. Mehta. SPRINT : A scalable parallel classifier for data mining. VLDB’96.
H. Yu, J. Yang, and J. Han. Classifying large data sets using SVM with hierarchical clusters. KDD'03.
J. Gehrke, R. Ramakrishnan, and V. Ganti. Rainforest: A framework for fast decision tree construction of large datasets. VLDB’98.
X. Yin and J. Han. CPAR: Classification based on predictive association rules. SDM'03.
L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Wadsworth International Group, 1984.
Haixun Wang, Carlo Zaniolo: CMP: A Fast Decision Tree Classifier Using Multivariate Predictions. ICDE 2000: 449-460.

[Classification on Data Streams]

P. Domingos and G. Hulten, “Mining high-speed data streams”, KDD'00
C. C. Aggarwal, J. Han, J. Wang and P. S. Yu. On-Demand Classification of Evolving Data Streams, KDD'04
Fang Chu, Carlo Zaniolo: Fast and Light Boosting for Adaptive Mining of Data Streams. PAKDD 2004: 282-292.
Fang Chu, Yizhou Wang, Carlo Zaniolo: An Adaptive Learning Approach for Noisy Data Streams. ICDM 2004: 351-354.
Yan-Nei Law, Carlo Zaniolo: An Adaptive Nearest Neighbor Classification Algorithm for Data Streams. PKDD 2005: 108-120.

[Time Series]

C. Chatfield. The Analysis of Time Series: An Introduction, 3rd ed. Chapman & Hall, 1984.

R.H. Shumway & D.S. Stoffer. Time Series Analysis and Its Applications: With R Examples (2nd ed.), Springer Texts in Statistics, 2006. http://www.stat.pitt.edu/stoffer/tsa2/index.html

StatSoft. Electronic Textbook. www.statsoft.com/textbook/stathome.html

R. Agrawal, C. Faloutsos, and A. Swami. Efficient similarity search in sequence databases. FODO’93 (Foundations of Data Organization and Algorithms).

R. Agrawal, K.-I. Lin, H.S. Sawhney, and K. Shim. Fast similarity search in the presence of noise, scaling, and translation in time-series databases. VLDB'95.

R. Agrawal, G. Psaila, E. L. Wimmers, and M. Zait. Querying shapes of histories. VLDB'95.

C. Faloutsos, M. Ranganathan, and Y. Manolopoulos. Fast subsequence matching in time-series databases. SIGMOD'94.

Carlo Zaniolo, Stefano Ceri, Christos Faloutsos, Richard T. Snodgrass, V. S. Subrahmanian, Roberto Zicari. Advanced Database Systems (Chapter 12), Morgan-Kaufmann, 1997.

Nasser Yazdani, Z. Meral Özsoyoglu: Sequence Matching of Images. SSDBM 1996: 53-62

Y. Moon, K. Whang, W. Loh. Duality Based Subsequence Matching in Time-Series Databases, ICDE’02

B.-K. Yi, H. V. Jagadish, and C. Faloutsos. Efficient retrieval of similar time sequences under time warping. ICDE'98.

B.-K. Yi, N. Sidiropoulos, T. Johnson, H. V. Jagadish, C. Faloutsos, and A. Biliris. Online data mining for co-evolving time sequences. ICDE'00.

Dennis Shasha and Yunyue Zhu. High Performance Discovery in Time Series: Techniques and Case Studies, SPRINGER, 2004

L. R. Rabiner. A tutorial on hidden markov models and selected applications in speech recognition. Proc. IEEE, 77:257--286, 1989.

R. Durbin, S. Eddy, A. Krogh and G. Mitchison. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, 1998.
