CS 260C Deep Learning
Spring 2023


Tue/Thur 2:00 pm - 3:50 pm, Engineering VI, MLC




Instructor: Prof. Cho-Jui Hsieh
Office location: EVI 482
Email: chohsieh@cs.ucla.edu
Office hour: TBD
Office hour for online students: TBD
TAs:
  Discussion 1A: Li-Cheng Lan (office hour TBD)
  Discussion 1B: Chengdi Cao (office hour TBD)
  MSOnline: Yuanhao Xiong (office hour TBD)



Course Overview

In this course, we will teach the basics of deep neural networks and their applications, including but not limited to computer vision, natural language processing, and graph mining. The course will cover the foundations of deep learning, how to train a neural network (optimization), architecture designs for various tasks, and several other advanced topics. By the end of the course, students are expected to be familiar with deep learning and able to apply deep learning algorithms to a variety of tasks.

Prerequisites
Basic knowledge of numerical linear algebra (e.g., singular value decomposition), probability, and calculus (e.g., gradients).

Textbooks

There is no required textbook. We suggest the following book if you are interested in studying more advanced topics:
  1. Deep Learning (by Ian Goodfellow, Yoshua Bengio, Aaron Courville)

Grading Policy

Grades will be based on the following components:
  1. Homework assignments (50%)
  2. Final project (50%)

Tentative Schedule

Part I: Deep Learning Foundations (Week 1 -- Week 3.5)
We will first discuss the general concept of deep neural networks and why they can extract important knowledge from data. We will then discuss how to train a neural network, including a detailed derivation of back-propagation and the foundations of stochastic optimization algorithms for training neural networks; a short training-loop sketch follows the lecture list below.
  1. Lecture 1-2: introduction and basic definitions
  2. Lecture 3-5: training algorithms and back-propagation
  3. Lecture 6-7: regularization and normalization
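
To make the training pipeline concrete, here is a minimal sketch of back-propagation and stochastic gradient descent for a one-hidden-layer network, written in plain NumPy. All sizes, synthetic data, and hyperparameters are illustrative placeholders, not part of the course material:

    import numpy as np

    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 4))              # 100 samples, 4 features
    y = (X @ rng.normal(size=(4, 1))) ** 2     # synthetic regression targets

    W1 = rng.normal(scale=0.5, size=(4, 8))    # input -> hidden weights
    W2 = rng.normal(scale=0.5, size=(8, 1))    # hidden -> output weights
    lr = 0.01                                  # learning rate

    for step in range(1000):
        i = rng.integers(0, len(X), size=10)   # sample a mini-batch of 10
        x, t = X[i], y[i]

        # Forward pass
        h = np.maximum(0, x @ W1)              # ReLU hidden layer
        out = h @ W2                           # linear output
        loss = np.mean((out - t) ** 2)         # mean squared error

        # Backward pass (chain rule, layer by layer)
        d_out = 2 * (out - t) / len(i)         # dL/d_out
        d_W2 = h.T @ d_out                     # dL/dW2
        d_h = (d_out @ W2.T) * (h > 0)         # dL/dh through the ReLU
        d_W1 = x.T @ d_h                       # dL/dW1

        # Stochastic gradient descent update
        W1 -= lr * d_W1
        W2 -= lr * d_W2

    print("final mini-batch loss:", loss)

Frameworks such as PyTorch automate the backward pass, but this chain-rule structure is exactly what automatic differentiation computes under the hood.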
Part II: Deep Learning in Various Tasks (Week 3.5 -- Week 8)
This part will cover deep learning architectures in various domains, including computer vision, natural language processing, and graph mining. For computer vision tasks, we will introduce convolutional neural networks (CNNs), residual networks (ResNets), and generative adversarial networks (GANs) for image generation. For natural language processing, we will introduce recurrent neural networks (RNNs) and Transformer architectures, and discuss how to apply them to a variety of NLP tasks. For graph mining, we will introduce unsupervised and supervised graph neural networks. Finally, we will discuss multi-modal networks that combine modalities such as vision+text or text+graph. A small architecture sketch follows the lecture list below.
  1. Lecture 8-10: Neural networks for computer vision (CNN, ResNet, GAN)
  2. Lecture 11-13: Neural networks for NLP (RNN, Transformer)
  3. Lecture 14-15: Neural networks for graphs
  4. Lecture 16: Multi-modal networks (text+vision or text+graph)
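
As a taste of these architectures, the following is a minimal PyTorch sketch of a small CNN with one residual block (the identity-shortcut idea behind ResNet). Layer sizes, input shapes, and the class count are illustrative assumptions only:

    import torch
    import torch.nn as nn

    class ResidualBlock(nn.Module):
        """Identity shortcut: output = relu(x + f(x)), the core ResNet idea."""
        def __init__(self, channels: int):
            super().__init__()
            self.f = nn.Sequential(
                nn.Conv2d(channels, channels, 3, padding=1),
                nn.ReLU(),
                nn.Conv2d(channels, channels, 3, padding=1),
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return torch.relu(x + self.f(x))

    class TinyCNN(nn.Module):
        """A small convolutional classifier for 32x32 RGB images."""
        def __init__(self, num_classes: int = 10):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 3, padding=1),   # 3x32x32 -> 16x32x32
                nn.ReLU(),
                nn.MaxPool2d(2),                  # -> 16x16x16
                ResidualBlock(16),                # shape-preserving residual stage
                nn.Conv2d(16, 32, 3, padding=1),  # -> 32x16x16
                nn.ReLU(),
                nn.MaxPool2d(2),                  # -> 32x8x8
            )
            self.classifier = nn.Linear(32 * 8 * 8, num_classes)

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.classifier(self.features(x).flatten(1))

    model = TinyCNN()
    logits = model(torch.randn(4, 3, 32, 32))  # batch of 4 fake images
    print(logits.shape)                        # torch.Size([4, 10])

The shortcut x + f(x) lets gradients flow unchanged through the identity path, which is what makes very deep networks trainable in practice.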
Part III: Advanced Topics in Deep Learning (Week 9 -- Week 10)
We will cover several advanced topics, including neural architecture search and meta-learning, and discuss the current limitations of deep learning (robustness, fairness, interpretability, scalability, reproducibility); a short robustness example follows the lecture list below.
  1. Lecture 17: Neural architecture search
  2. Lecture 18: Few-shot learning/Meta-learning
  3. Lecture 19-20: Limitations of deep learning (robustness, fairness, interpretability, scalability, reproducibility)
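
To illustrate the robustness limitation concretely, here is a minimal sketch of the fast gradient sign method (FGSM), one standard way to construct adversarial examples. The toy model, fake data, and epsilon value below are placeholders chosen for illustration:

    import torch
    import torch.nn as nn

    def fgsm_attack(model: nn.Module, x: torch.Tensor, y: torch.Tensor,
                    eps: float = 0.03) -> torch.Tensor:
        """Return an adversarially perturbed copy of x."""
        x_adv = x.clone().detach().requires_grad_(True)
        loss = nn.functional.cross_entropy(model(x_adv), y)
        loss.backward()
        # Step in the direction that increases the loss, clipped to valid pixels.
        return (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

    # Usage with any classifier, e.g. a toy linear model on flat 28x28 images:
    model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 10))
    x = torch.rand(8, 1, 28, 28)             # fake images in [0, 1]
    y = torch.randint(0, 10, (8,))           # fake labels
    x_adv = fgsm_attack(model, x, y)
    print((x_adv - x).abs().max())           # perturbation bounded by eps

Even this one-step perturbation, nearly imperceptible at small eps, is often enough to flip the predictions of an undefended classifier.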