Probabilistic Programming and Relational Learning

CS 267A - Spring 2020

Instructor: Professor Guy Van den Broeck <guyvdb@cs.ucla.edu>

TA: Tal Friedman <tal@cs.ucla.edu>; Office hours: Tuesday 1pm-3pm, Zoom (same link as lectures).

TA: Yitao Liang <yliang@cs.ucla.edu>; Office hours: Thursday 4pm-6pm, Zoom (same link as lectures).

TA: YooJung Choi <yjchoi@cs.ucla.edu>; Office hours: Wednesday 2pm-4pm, Zoom (same link as lectures).

Lectures: Spring 2020, MW 4pm-5:50pm; Zoom (see link on CCLE)

CCLE: https://ccle.ucla.edu/course/view/20S-COMSCI267A-1

Course Description

This course introduces probability distributions defined through computation (probabilistic programs) and statistical models of relational data. It studies relational representations such as probabilistic databases, relational graphical models, and Markov logic networks, as well as various probabilistic programming languages. It covers their syntax and semantics, probabilistic inference problems, parameter and structure learning algorithms, and theoretical properties of representation and inference. This course teaches expressive statistical modeling: how to formalize and reason about complex statistical assumptions, and how to encode knowledge and structure in machine learning models. It also surveys key applications of relational learning and probabilistic programming.

Prerequisites

This course requires basic computer science knowledge (logic, probability, programming, complexity).

Grading

Grading will be based on frequent Homeworks/Quizzes (50%), and a Project (50%). There is no midterm or final exam.

Homeworks and Quizzes

Regular homeworks will be announced on CCLE with a one-week deadline. There is no late policy: late submissions will not be graded. However, at the end of the quarter there will be an optional homework that can replace one bad grade on a previous homework. Homeworks must be submitted as a PDF typeset in LaTeX, occasionally accompanied by source code in a zip file. All homeworks are subject to the honor code below.

In addition to the homeworks there will be a number of short quizzes that replace the traditional midterm exam.

Projects

You will have to do a final project, preferably in small groups, or (with special permission) alone. It will be a self-selected open-ended project related to the course content. We will provide a list of suggested project topics to help make the choice. Motivated students can suggest projects that apply some of the course topics to their existing research or interests.

Detailed instructions for the project options will be provided on CCLE. All projects are subject to the honor code below.

Readings

We will refer to selected readings for more details on the material taught in class. The following books and publications are freely available.

A. David Poole and Alan Mackworth. Artificial Intelligence: Foundations of Computational Agents.
B. Guy Van den Broeck and Dan Suciu. Query Processing on Probabilistic Data: A Survey.
C. Pedro Domingos and Daniel Lowd. Markov Logic: An Interface Layer for Artificial Intelligence.
D. Luc De Raedt, Kristian Kersting, Sriraam Natarajan, David Poole. Statistical Relational Artificial Intelligence: Logic, Probability, and Computation.
E. N. D. Goodman and A. Stuhlmueller. The Design and Implementation of Probabilistic Programming Languages.
F. Jan-Willem van de Meent, Brooks Paige, Hongseok Yang, Frank Wood. An Introduction to Probabilistic Programming.
G. Various research papers referred to in the slides.

Readings C and D are only free to download from the UCLA network.

Optionally, students may also want to consult the following material that is not freely available.

Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. (3rd Edition)
Adnan Darwiche. Modeling and Reasoning with Bayesian Networks
Lise Getoor and Ben Taskar. Introduction to Statistical Relational Learning
Fabrizio Riguzzi. Foundations of Probabilistic Logic Programming: Languages, Semantics, Inference and Learning

Tentative Schedule

Honor Code

You are encouraged to work on your own or in groups in this class. If you or your group get stuck, you may discuss the problem with other students, PROVIDED THAT YOU SUBMIT THEIR NAMES ALONG WITH YOUR ASSIGNMENT. ALL SOLUTIONS MUST BE WRITTEN UP INDEPENDENTLY, HOWEVER. This means that you should never see another student's or group's solution before submitting your own. You may always discuss any problem with me or the TAs. YOU MAY NOT USE OLD SOLUTION SETS UNDER ANY CIRCUMSTANCES. Making your solutions available to other students, EVEN INADVERTENTLY (e.g., by keeping backups on GitHub), is aiding academic fraud, and will be treated as a violation of this honor code.

You are expected to subscribe to the highest standards of academic honesty. This means that every idea that is not your own must be explicitly credited to its author. Failure to do this constitutes plagiarism. Plagiarism includes using ideas, code, data, text, or analyses from any other students or individuals, or any sources other than the course notes, without crediting these sources by name. Any verbatim text that comes from another source must appear in quotes with the reference or citation immediately following. Academic dishonesty will not be tolerated in this class. Any student suspected of academic dishonesty will be reported to the Dean of Students. A typical penalty for a first plagiarism offense is suspension for one quarter. A second offense usually results in dismissal from the University of California.