CS 6220: Data Mining Techniques

News

[1/13/2016] Classes start at 6:10pm

[1/13/2016] First day of classes


Class Schedule

(Future lectures and events are tentative.)

Week # | Date | Topic | Slides | Assignment | Project | Reading (Textbook or Other Materials)
1 | Jan. 13 | Introduction and Know Your Data | 01Introduction; 02Data | | | Chapter 1, 2, 3; Math overview
2 | Jan. 20 | Course Project Introduction; Matrix Data: Prediction (linear regression); Classification (decision tree, evaluation) | Course Project Overview; 03Matrix_Prediction; 04Matrix_Classification_1 | #1 out | | Notes by Andrew Ng (Sec. 1-3 in Part 1): http://cs229.stanford.edu/notes/cs229-notes1.pdf; Chapter 8.1, 8.2, 8.5
3 | Jan. 27 | Matrix Data: Classification (Naive Bayes, logistic regression) | Prob_review; 04Matrix_Classification_2 | | Team formation due (Jan. 27) | Chapter 8.3, 9.1; Notes by Tom Mitchell: http://www.cs.cmu.edu/~tom/mlbook/NBayesLogReg.pdf; Notes on derivation of P(C_j) in Naive Bayes; Review of probability: http://cs229.stanford.edu/section/cs229-prob.pdf
4 | Feb. 3 | Matrix Data: Classification (SVM, kNN, and other issues) | 04Matrix_Classification_3 | #1 due (Feb. 2) / #2 out | | Chapter 9.3, 9.5, 8.6, 9.7; Notes on SVM by Andrew Ng: http://cs229.stanford.edu/notes/cs229-notes3.pdf
5 | Feb. 10 | Matrix Data: Clustering (k-means, hierarchical clustering, DBSCAN) | 05Matrix_Clustering_1 | | | Chapter 10.1, 10.2, 10.3, 10.4, 10.6
6 | Feb. 17 | Matrix Data: Clustering (GMM); Text Data: Topic Models (PLSA) | 05Matrix_Clustering_2; 06Text_PLSA | #2 due (Feb. 16) / #3 out | | Chapter 11.1, 11.3; Notes on mixture models and EM algorithm: http://www.stat.cmu.edu/~cshalizi/350/lectures/29/lecture-29.pdf and http://www.cs.ubc.ca/~murphyk/Teaching/CS340-Fall06/reading/mixtureModels.pdf; pLSA tutorial: http://arxiv.org/pdf/1212.3900.pdf; Topic modeling tutorial: https://www.cs.princeton.edu/~blei/kdd-tutorial.pdf
7 | Feb. 24 | Set Data: Frequent Pattern Mining (Apriori, FP-growth) | 07Set | | Proposal due (Feb. 23) | Chapter 6
8 | Mar. 2 | Midterm Exam | | #3 due (Mar. 4) / #4 out | |
9 | Mar. 9 | Spring Break | | | |
10 | Mar. 16 | Graph / Network I: Ranking, Proximity, Clustering | 08Graph | #4 due (Mar. 15) | | Spectral clustering: http://www.kyb.mpg.de/fileadmin/user_upload/files/publications/attachments/Luxburg07_tutorial_4488%5B0%5D.pdf
11 | Mar. 23 | Graph / Network II: Recommendation | 09Recommendation | #5 out (Mar. 23) | Midterm Report due (Mar. 22) | http://ijcai13.org/files/tutorial_slides/td3.pdf
12 | Mar. 30 | Sequence Data and Time Series Data | 10Sequence_TS | | | Reference: Chapter 8.3 in Han's Data Mining Book, Edition 2; GSP; DTW
13 | Apr. 6 | Image Data: Neural Networks, deep learning | | #5 due (Apr. 9) | |
14 | Apr. 13 | No class | | | |
15 | Apr. 20 | No class | | | |
16 | Apr. 27 | Course Project Final Presentation | | | Final Report & Code (Apr. 27) |