CS 6220: Data Mining Techniques
- Office hours: Wednesdays 1:00-3:00pm at 320 WVH
- Yupeng Gu
- Email: email@example.com
- Office hours: Tuesdays 2:30-4:30pm at 472 WVH
- Kosha Shah
- Office hours: Thursdays 10:00am-12:00pm at 102 Main Lab WVH
Lecture times: Mondays 6 - 9 PM
Lecture location: Shillman Hall 335
About the Course
This course introduces concepts, algorithms, and techniques of data mining on
different types of datasets, including (1) matrix data, (2) set data, (3)
sequence data, (4) time series, and (5) graph and network data. The class
project involves hands-on practice in mining useful knowledge from large
datasets. This is a graduate-level computer science course that is also a good
option for senior-level computer science undergraduates interested in the
field. It may also suit students from other disciplines who need to understand,
develop, and use data mining systems to analyze large amounts of data.
- CS 5800 or CS 7800, or consent of instructor
- Students are expected to have knowledge of data structures,
algorithms, basic linear algebra, and basic statistics. You will also need to be familiar with at
least one programming language and have programming experience.
- Homework: 40%
- Midterm exam: 25%
- Course project: 30%
- Participation: 5%
*Note: all deadlines are at 11:59 PM (end of day) on the
due date; no late submissions will be accepted!
If you have doubts about your grade, please submit a
regrading form (by email to both TAs, with a CC to the instructor) clearly indicating why you think it should be regraded.
The regrading form must be submitted within
one week after you receive your score.
We will regrade the whole homework/exam.
Jiawei Han, Micheline Kamber, and Jian Pei.
Data Mining: Concepts and Techniques,
3rd edition, Morgan Kaufmann, 2011
Recommended books for further reading:
- "Data Mining" by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar (http://www-users.cs.umn.edu/~kumar/dmbook/index.php)
- "Machine Learning" by Tom Mitchell (http://www.cs.cmu.edu/~tom/mlbook.html)
- "Introduction to Machine Learning" by Ethem ALPAYDIN (http://www.cmpe.boun.edu.tr/~ethem/i2ml/)
- "Pattern Classification" by Richard O. Duda, Peter E. Hart, David G.
- "The Elements of Statistical Learning: Data Mining, Inference, and
Prediction" by Trevor Hastie, Robert Tibshirani, and Jerome Friedman (http://www-stat.stanford.edu/~tibs/ElemStatLearn/)
- "Pattern Recognition and Machine Learning" by Christopher M. Bishop (http://research.microsoft.com/en-us/um/people/cmbishop/prml/)
Q & A
You are encouraged to come to the office hours of TAs and
Peer-based Q&A via Piazza (Note that Piazza will only be
used for peer discussion):
Academic Integrity Policy
A commitment to the principles of academic integrity is essential to the
mission of Northeastern University. The promotion of independent and original
scholarship ensures that students derive the most from their educational
experience and their pursuit of knowledge. Academic dishonesty violates the most
fundamental values of an intellectual community and undermines the achievements
of the entire University.
For more information, please refer to the
Integrity Web page.