Fall 2020, CS 239 Schedule
Date |
Papers |
Presentation Slides |
Due |
Week 1. Introduction + Distributed Storage Systems |
|||
Oct. 5 (Mon) |
1. Introduction, overview of the class, state-of-the-art in big data systems (Harry) 2. Challenges in distributed storage systems (Harry) 3. HDFS – Yahoo (Presenter: Scribe:) |
Class overview Challenges in distributed storage systems |
|
Oct. 7 (Wed) |
1. GFS - Google (Presenters: , Scribes:) 2. Bigtable – Google (Presenters:, Scribes:) 3. Spanner - Google (Presenters:, Scribes:) |
|
Paper presentation selection due on Friday |
Week 2. Distributed Storage Systems + Engines |
|||
Oct. 12 (Mon) |
1. Azure storage – Microsoft (Presenters: , Scribes: ) 2. Introduction to data-parallel engines (Harry) 3. MapReduce – Google (Presenters: , Scribes:) |
Challenges in data parallel engines |
Group formation due |
Oct. 14 (Wed) |
1. Dryad – Microsoft (Presenters:, Scribes:) 2. Spark + RDD – Berkeley + Databricks (Presenters: Scribes: ) 3. Distributed aggregation – Microsoft (Presenters:, Scribes:) |
||
Week 3. Engines + Batch Processing |
|||
Oct. 19 (Mon) |
1. MapReduce Online (Presenters: Scribes:) 2. Map-Reduce-Merge (Presenters:, Scribes:) |
Harry is out of town. |
|
Oct. 21 (Wed) |
1. Introduction to batch-processing systems (Harry) 2. Hive – Facebook (Presenters:, Scribes:) 3. Spark SQL -- Databricks (Presenters:, Scribes: ) |
Challenges in batch-processing systems |
|
Week 4. Batch Processing |
|||
Oct. 26 (Mon) |
1. SCOPE -- Microsoft (Presenters: , Scribes:) 2. FlumeJava -- Google (Presenters: , Scribes:) 3. DryadLINQ -- Microsoft (Presenters: , Scribes:) |
|
|
Oct. 28 (Wed) |
1. Introduction to scheduling and resource management (Harry) 2. Mesos (Presenters: , Scribes:) 3. YARN (Presenters:, Scribes:) |
Challenges in scheduling and resource management |
|
Week 5. Scheduling + Resource Management |
|||
Nov. 2 (Mon) |
1. Sparrow (Presenters:, Scribes:) 2. Borg -- Google (Presenters:, Scribes:) 3. Nexus -- UW (Presenters:, Scribes:) |
|
Harry is out of town. |
Nov. 4 (Wed) |
1. Introduction to stream processing (Harry) 2. Storm (Presenters:, Scribes:) 3. Flink (Presenters: Scribes:) |
|
|
Week 6. Stream Processing |
|||
Nov. 9 (Mon) |
1. Kafka (Presenters: , Scribes:) 2. Naiad -- Microsoft (Presenters: , Scribes:) 3. Trill -- Microsoft (Presenters:, Scribes:) |
|
|
Nov. 11 (Wed) |
Veterans Day - No class |
|
|
Week 7. Project Presentations |
|||
Nov. 16 (Mon) |
Project proposal presentations I |
|
|
Nov. 18 (Wed) |
Project proposal presentations II |
||
Week 8. Stream Processing |
|||
Nov. 23 (Mon) |
1. SVE -- Facebook (Presenters:, Scribes:) 2. Drizzle (Presenters:, Scribes:) 3. Structured Streaming -- Databricks (Presenters:, Scribes:) |
|
|
Nov. 25 (Wed) |
Class canceled due to Thanksgiving |
|
|
Week 9. Graph Systems |
|||
Nov. 30 (Mon) |
1. Introduction to graph processing and ML systems (Harry) 2. Pregel -- Google (Presenters:, Scribes:) 3. Ligra (Presenters: , Scribes: ) |
Challenges in graph processing and ML systems |
|
Dec. 2 (Wed) |
1. GraphChi (Presenters: , Scribes: ) 2. X-Stream (Presenters:, Scribes:) 3. RStream (Presenters:, Scribes:) |
|
|
Week 10. Memory Management + Project Presentation Week |
|||
Dec. 7 (Mon) |
1. Parameter server (Presenters:, Scribes:) 2. TensorFlow -- Google (Presenters:, Scribes:) 3. Ray (Presenters:, Scribes:) 4. TVM (Presenters:, Scribes:) |
|
|
Dec. 9 (Wed) |
1. Introduction to big data memory management (Harry) 2. Broom (Presenters:, Scribes:) 3. Yak (Presenters:, Scribes:) 4. Niijima (Presenters: , Scribes:) |
Challenges in memory management |
Final report due on Friday |
Dec. 10-11 |
Final Presentation |
|
|