Course description
Machine Learning encompasses the study of algorithms that learn from data. It has been a key component in a number of problem domains including computer vision, natural language processing, computational biology and robotics. This class will introduce the fundamental concepts and algorithms in machine learning (both supervised and unsupervised learning) as well as best practices in applying machine learning to practical problems. The class consists of lectures, problem sets that contain mathematical and programming exercises, and two in-class exams.
Prerequisites
Undergraduate-level training or coursework in algorithms, linear algebra, calculus and multivariate calculus, and basic probability and statistics; an undergraduate-level course in Artificial Intelligence may be helpful but is not required. A background in programming will also be necessary for the problem sets; specifically, students are expected to be familiar with Python and scikit-learn (a machine learning package for Python) or to learn them during the course.
Contact Info
Instructor: Sriram Sankararaman
Office Hours: Boelter 4531D, Tuesday 11:00am-noon
Email: sriram at cs dot ucla dot edu
Teaching assistants
- Sajad Darabi
Office hours: Boelter 2432, Wednesday 8:00am-10:00am
Email: sajad10 at ucla dot edu
- Nolan Donoghue
Office hours: Boelter 3809, Tuesday, Thursday 9:30-10:30am
Email: nolan at cs dot ucla dot edu
- Ariel Wu
Office hours: Boelter 2432, Tuesday, Thursday 11:30am-12:30pm
Email: arielwu at cs dot ucla dot edu
Textbooks
While there is not one textbook that covers all the material from this course, readings will come from the following texts:
For a more advanced treatment, the following are useful:
Machine Learning requires a strong mathematical foundation. You may find the following resources useful to brush up your math background.
- Probability
- Linear algebra
- Optimization
Course format
Software
We will use Python 2.7.x extensively to implement ML algorithms and run experiments. You will need to familiarize yourself with the following packages:
- numpy: contains tools for numerical linear algebra and random number generation. For a numpy tutorial, see here.
- scipy
- scikit-learn: contains tools for machine learning and data science. For a tutorial, see here.
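To give a feel for how these packages fit together, here is a minimal sketch (not taken from the course materials). The synthetic data, the variable names, and the choice of logistic regression are assumptions made purely for illustration. It uses numpy for random number generation and for solving a small linear system, and scikit-learn to fit and score a classifier.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# numpy: random number generation (seeded so the run is reproducible).
rng = np.random.RandomState(0)
X = rng.randn(200, 2)                      # 200 samples, 2 features (synthetic)
y = (X[:, 0] + X[:, 1] > 0).astype(int)    # illustrative labels: 1 if x0 + x1 > 0

# numpy: numerical linear algebra, solving the 2x2 system A w = b.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
w = np.linalg.solve(A, b)

# scikit-learn: fit on the first 150 samples, evaluate on the remaining 50.
clf = LogisticRegression()
clf.fit(X[:150], y[:150])
accuracy = clf.score(X[150:], y[150:])

print("solution of A w = b: {}".format(w))
print("held-out accuracy: {:.2f}".format(accuracy))
```

The same fit/score pattern applies to most scikit-learn estimators, so swapping in a different model usually changes only the import and the constructor line.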
Forums
Piazza
We will use Piazza for class discussions. Please go to this
Piazza website to join the course forum (note: you must use a
ucla.edu email account to join the forum). We strongly encourage
students to post on this forum rather than emailing the course staff
directly (this will be more efficient for both students and staff).
Students should use Piazza to:
- Ask clarifying questions about the course material.
- Share useful resources with classmates (so long as they do not
contain solutions).
- Look for project partners or other students to form study groups.
- Answer questions posted by other students to solidify your own
understanding of the material.
The course Academic Integrity Policy must be followed on the message
boards at all times. Do not post or request solutions to problem sets! Also,
please be polite.
Gradescope
We will use Gradescope to manage and grade problem sets and exams.
Policies
Academic Integrity Policy
Group studying and collaborating on problem sets are encouraged, as
working together is a great way to understand new material. Students
are free to discuss the homework problems with anyone under the
following conditions:
- Students must write their own solutions and understand the
solutions that they wrote down.
- Students must list the names of their collaborators (i.e., anyone
with whom the assignment was discussed).
- Students may not use old solution sets from this class or any
other class under any circumstances, unless the instructor grants
special permission.
Students are encouraged to read the Dean of Students'
guide to Academic Integrity.
Attendance and class participation
Although not a formal component of the course grade, attendance is essential for success in this course. If you are absent without a documented excuse, the instructor and TAs will not be able to go over missed lecture material with you. We emphatically welcome questions, and your active participation in this course will enhance your learning experience and that of the other students.
Regrade requests
Regrade requests for homework and exams must be made through Gradescope within one week after the graded work has been released, regardless of your attendance on that day and regardless of any intervening holidays such as Memorial Day. We reserve the right to regrade all problems for a given regrade request.
Acknowledgments
The course website is based on material developed by Ameet Talwalkar and
Fei Sha.
Some of the administrative content on the course website is adapted
from material by Jenn Wortman Vaughan, Rich Korf, and Alexander Sherstov.
Tentative Schedule (subject to change)
| Date | Topics | Readings | Problem Sets |
| 10/02 | Introduction | | Problem Set 0 |
| 10/04 | Probability | PRML 1.2-1.2.2 | |
| 10/09 | Math review mini quiz. Statistics | | |
| 10/11 | Decision trees | CIML 1.3, 1.5-1.10 | |
| 10/16 | Nearest neighbors | CIML 2-2.3 | |
| 10/18 | Linear classification (perceptron) | CIML 3 | |
| 10/23 | Logistic regression | CIML 6.3 | |
| 10/25 | Linear regression | CIML 6-6.2, 6.4-6.6 | |
| 10/30 | Overfitting and regularization | | |
| 11/01 | Kernels | CIML 9-9.2, 9.4-9.6 | |
| 11/06 | In-class mid-term | | |
| 11/08 | Support Vector Machines | CIML 6.7 | |
| 11/13 | Ensemble methods | | |
| 11/15 | Dimensionality reduction | | |
| 11/20 | Clustering | CIML 2.4, 13-13.1 | |
| 11/22 | Mixture models | CIML 14-14.1 | |
| 11/27 | The Expectation Maximization algorithm | CIML 14.2 | |
| 11/29 | Hidden Markov Models (HMMs) | | |
| 12/04 | HMMs continued | | |
| 12/06 | Neural networks | | |