Fall 2019, CS 239 Schedule
|
Date |
Papers |
Presentation Slides |
Due |
|
Week 1. Introduction + Distributed Storage Systems |
|||
|
Sept. 30 (Mon) |
1. Introduction, overview of the class, state-of-the-art in big data systems (Harry) 2. Challenges in distributed storage systems (Harry) 3. HDFS – Yahoo (Presenter: Yifan and Haoran, Scribe: Jiyuan and Usama) |
Challenges in distributed storage systems
|
|
|
Oct. 2 (Wed) |
1. GFS – Google (Presenter: Shiqi, Scribe: Yuanqi, Xuanqing) 2. Bigtable – Google (Presenter: Jiyuan and Usama, Scribe: Haoran and Yifan) 3. Spanner – Google (Presenter: Willie Wu and Kevin Hsieh, Scribe: Junyu and Kedar) |
|
Paper presentation selection due on Friday |
|
Week 2. Distributed Storage Systems + Engines |
|||
|
Oct. 7 (Mon) |
1. Azure storage – Microsoft (Yuanqi Li, Scribe: Amir, Shivam) 2. Introduction to data-parallel engines (Harry) 3. MapReduce – Google (Gaohao Liu, Scribe: Zhaoning, Chun) |
Challenges in data parallel engines
|
Group formation due |
|
Oct. 9 (Wed) |
1. Dryad – Microsoft (Amir, Scribe: Yu-Chen, Tianyi) 2. Spark + RDD – Berkeley + Databricks (Shivam, Pratik, Scribe: David, Rach) 3. Distributed aggregation – Microsoft (Zhaoning, Scribe: Howard, Shen) |
|
|
|
Week 3. Engines + Batch Processing |
|||
|
Oct. 14 (Mon) |
1. MapReduce Online (Tianyi Ma, Scribes: Liran Xiao, Chengyao Zhang) 2. Map-Reduce-Merge (Kedar Deshpande, Victor Fu, Scribes: Enbang Zhang, Swati Sharma)
|
|
Harry is out of town. |
|
Oct. 16 (Wed) |
1. Introduction to batch-processing systems (Harry) 2. Hive – Facebook (Yujun Zhao, Scribes: Wenlong Xiong, Arjun Srinivasan) 3. Spark SQL -- Databricks (Junyu Guo, Scribes: Neil Agarwal) |
Challenges in batch-processing systems |
|
|
Week 4. Batch Processing |
|||
|
Oct. 21 (Mon) |
1. SCOPE -- Microsoft (Chun Chen, Yu-Chen Lin, Scribes: Jintao Jiang, Sahil Gandhi) 2. FlumeJava -- Google (David Shan, Arjun Srinivasan, Scribes: Matt Hickey, Qiyue Yao) 3. DryadLINQ -- Microsoft (Howard Xie, Allen Huang, Scribes: Jay Arora, Wei-ting Chen) |
|
|
|
Oct. 23 (Wed) |
1. Introduction to scheduling and resource management (Harry) 2. Mesos (Liran Xiao, Zhuyan Chen, Scribes: Srishti Majumdar, Nandan Parikh) 3. YARN (Chengyao Zhang, Enbang Zhang, Scribes: Siqi Liu and Jules Ahmar) |
|
|
|
Week 5. Scheduling + Resource Management |
|||
|
Oct. 28 (Mon) |
1. Sparrow (Swati Sharma, Scribes: Mathanky) 2. Borg -- Google (Wenlong Xiong, Calvin Pham, Scribes: Tanya Chinchore, Yujun Zhao) 3. Tachyon -- Databricks (Arghya Mukherjee, Scribes: Zhufeng Pan, Pratik Nichat) |
|
Harry is out of town. |
|
Oct. 30 (Wed) |
1. Introduction to stream processing (Harry) 2. Storm (Neil Agarwal, Scribes: Jonathan Chee, Rustem Can Aygun) 3. Flink (Rach Liu, Thomas Pan, Scribes: Keerthana Sankar and Enbang Zhang) |
|
|
|
Week 6. Stream Processing |
|||
|
Nov. 4 (Mon) |
1. Kafka (Jintao Jiang, Millan Batra, Scribes: Kaushik Mahorker, Haoran Ma) 2. Naiad -- Microsoft (Matt Hickey, Scribes: Chia-Hung Ni, Sijie Xiong) 3. Trill -- Microsoft (Qiyue Yao, Scribes: Yifan Qiao, Usama Hameed) |
|
|
|
Nov. 6 (Wed) |
Project proposal presentations I |
|
|
|
Week 7. Stream Processing |
|||
|
Nov. 11 (Mon) |
Veterans Day - No class |
|
|
|
Nov. 13 (Wed) |
Project proposal presentations II |
|
|
|
Week 8. Graph Processing |
|||
|
Nov. 18 (Mon) |
1. SVE -- Facebook (Jay Arora, Chia-Hung Ni, Scribes: Jinyuan Wang, Yuanqi Li) 2. Drizzle (Keerthana Sankar, Scribes: Yuanqi Liu, Rupa Mahadevan) 3. Structured Streaming -- Databricks (Wei-ting Chen, Scribes: Kevin Hsieh, Jay Arora) |
|
|
|
Nov. 20 (Wed) |
1. Introduction to graph processing (Harry) 2. Pregel -- Google (Christian Warloe, Sahil Gandhi, Scribes: Willie Wu, Jules Ahmar) 3. Ligra (Shen Teng, Scribes: Mathanky Sankaranarayanan, Jonathan Chee) |
|
|
|
Week 9. ML Systems |
|||
|
Nov. 25 (Mon) |
1. GraphChi (Srishti Majumdar, Nandan Parikh, Scribes: Kevin Hsieh, Shivam) 2. X-Stream (Pooja Nagaraja, Rupa Mahadevan, Scribes: Willie Wu, Millan Batra) 3. GridGraph (Jules Ahmar, Scribes: Ryan Tsang, Amir Yazdi-Nejad) 4. Parameter server (Kaushik Mahorker, Scribes: Pooja Janagal Nagaraja, Austin Guo) |
|
|
|
Nov. 27 (Wed) |
Class cancelled
|
|
|
|
Week 10. Memory Management + Project Presentation Week |
|||
|
Dec. 2 (Mon) |
1. Project Adam -- Microsoft (Mathanky Sankaranarayanan, Tanmay Chinchore, Scribes: Millan Batra, Rustem Can Aygun) 2. TensorFlow -- Google (Austin Guo, Ryan Tsang, Scribes: Sijie Xiong, Rupa Mahadevan) 3. Framework for emerging AI (Zhufeng Pan, Scribes: Pooja Janagal Nagaraja, Keerthana Sankar) 4. TVM: Compiler for Deep Learning (Xuanqing Liu, Scribes: Qiyue Yao) |
|
|
|
Dec. 4 (Wed) |
1. Introduction to big data memory management (Harry) 2. Bloat-aware design (Sijie Xiong, Scribes: Ryan Tsang, Matt Hickey) 3. Broom (Jonathan Chee, Scribes: Austin Guo, Nandan Parikh) 4. Yak (Rustem Can Aygun, Scribes: Neil Agarwal, Arjun Srinivasan) |
|
Final report due on Friday |