CS 239: Debugging and Testing of AI/ML Based Systems, Current Topics in Programming Languages and Systems, Winter 2019

CS 239: Debugging and Testing of AI/ML Based Systems, Special Topics in Programming Systems, Winter 2019

Instructor: Dr. Miryung Kim (ENG 6, Room 474)

Lectures: Mondays and Wednesdays, 12:00 PM to 1:50 PM
Office Hours: By appointment only

General Description

Multiple research disciplines, from cognitive sciences to biology, finance, physics, and the social sciences, as well as many companies, believe that data-driven and intelligent solutions are necessary. Unfortunately, current artificial intelligence (AI) and machine learning (ML) technologies are not sufficiently democratized: building complex AI and ML systems requires deep expertise in computer science and extensive programming skills to work with various machine reasoning and learning techniques at a rather low level of abstraction. It also requires extensive trial-and-error exploration for model selection, data cleaning, feature selection, and parameter tuning. Moreover, there is a lack of theoretical understanding that could be used to abstract away these subtleties. Conventional programming languages and software engineering paradigms were also not designed to address the challenges faced by AI and ML practitioners.

The goal of this graduate seminar is to study recent research projects in the domain of debugging and testing AI/ML-based systems and to discuss open problems on how to improve the productivity of data scientists, software engineers, and AI/ML practitioners in industry.

* How do we rethink software development tools, such as debugging, testing, and verification tools, for complex AI/ML-based systems?
* How do we reason about correctness and explainability while building AI/ML pipelines?

Reading List

The class schedule and reading material are available here.

Audience and Prerequisites

This is a seminar class geared towards research-oriented students in software engineering. If you are not comfortable reading academic research papers (2-4 papers per week, each 12+ pages long), you will find it challenging to keep up with this class. A significant portion of your grade will be based on your ability to articulate your own in-depth analysis of research papers. If you are unsure of your qualifications, please contact the instructor, who will determine whether this course is right for you.

In the first week, please use CCLE to submit your CV, unofficial transcripts, your status in a graduate program, and a description of your current research project with your advisor. I will provide PTEs to interested students during the first two weeks.

Grading

50%: Team Project
30%: Paper Presentation, including a Technology Tool Tutorial (you should expect about 2 presentations). All presentations are done individually. For a tool tutorial, each presenter will create a set of toy examples, online tutorials, and a live in-class demonstration to teach the tools and environments discussed in class. A tool demonstration should consist of a 5-minute overview followed by a 15-minute live demonstration. For a paper presentation, each presenter will use the assigned paper to discuss recent advances related to the lecture's topic. Each presentation should be about 20 minutes long. After the presentation, you should lead an in-class discussion with your classmates.
20%: Pop-Quiz, In-Class Q&A, Index Card Submission, Attendance (These are the ways that we will check whether you read the papers or not and whether you are actively contributing to the class discussion.)

Reading Questions

Please consider the following points as you read the papers. During in-class Q&A, we may ask you to enter your questions about the reading assignment on Piazza.

Cool or significant ideas. What is new here? What are the main contributions of the paper? What did you find most interesting? Is this whole paper just a one-off clever trick or are there fundamental ideas here which could be reused in other contexts?
Fallacies and blind spots. Did the authors make any assumptions or disregard any issues that make their approach less appealing? Are there any theoretical problems, practical difficulties, implementation complexities, overlooked influences of evolving technology, and so on? Do you expect the technique to be more or less useful in the future? What kind of code or situation would defeat this approach, and are those programs or scenarios important in practice? Note: we are not interested in flaws in presentation, such as trivial examples, confusing notation, or spelling errors. However, if you have a great idea on how some concept could be presented or formalized better, mention it.
New ideas and connections to other work. How could the paper be extended? How could some of the flaws of the paper be corrected or avoided? Also, how does this paper relate to others we have read, or even any other research you are familiar with? Are there similarities between this approach and other work, or differences that highlight important facets of both?

Class Discussion: Think-Pair-Share

How Does It Work?
1) Think. The teacher provokes students' thinking with a question, prompt, or observation. The students should take a few moments (probably not minutes) just to THINK about the question.

2) Pair. Using designated partners (such as with Clock Buddies), nearby neighbors, or a deskmate, students PAIR up to talk about the answer each came up with. They compare their mental or written notes and identify the answers they think are best, most convincing, or most unique.

3) Share. After students talk in pairs for a few moments (again, usually not minutes), the teacher calls for pairs to SHARE their thinking with the rest of the class. She can do this by going around in round-robin fashion, calling on each pair; or she can take answers as they are called out (or as hands are raised). Often, the teacher or a designated helper will record these responses on the board or on the overhead projector.

Presentation grading scheme

5 pt: Excellent design, complete implementation, selection of a task that is highly intellectually challenging, creative, concise yet comprehensive writing, nearly perfect answers, beautifully written and verbally communicated, eloquent presentation within a time limit
4 pt: Very good, mostly correct answers, i.e., >85%, selection of an intellectually challenging project topic, well written, good verbal presentation, well-practiced presentation within the time limit
3 pt: Good understanding of the key concepts, mostly correct answers, i.e., >70%, selection of an intellectually challenging project topic, well written, good verbal presentation, well-practiced presentation within the time limit
2 pt or 1 pt: Poor, shallow, minimally sufficient, or needlessly wordy, key concepts misunderstood or missing, selection of easy project tasks, poor written and verbal communication, presentation over the time limit, not following specified formats