I am a second-year Ph.D. student in the Computer Science Department at the University of California, Los Angeles. I am broadly interested in cloud
computing: I build large-scale ML training systems in the cloud and develop system support for cloud applications. Currently, I
am working on resource disaggregation for Cloud 3.0. I am a member of the SOLAR group, co-advised by Professor Harry Xu and Professor Miryung Kim.
Prior to graduate school, I earned my B.E. in Computer Science from Tsinghua University in 2019, where I was a
research intern in the PACMAN group. I also worked with Professor Umut Acar on scheduling algorithms for multithreaded parallel computing in 2018.
Dorylus: Affordable, Scalable, and Accurate GNN Training over Billion-Edge Graphs
I built Dorylus, a distributed system for training graph neural networks (GNNs), together with John Thorpe. Uniquely, Dorylus takes
advantage of serverless computing to increase scalability at low cost. Specifically, we
- Leveraged thousands of serverless threads for graph neural network (GNN) computations;
- Constructed a deep, bounded-asynchronous pipeline (sketched below) to fully utilize the massive parallelism provided by these serverless threads.
Dorylus outperformed both existing systems (up to 3.8x faster and 10.7x cheaper) and our GPU-based variant
(2.05x more performance per dollar).
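To give a flavor of bounded asynchrony, here is a minimal C++ sketch, not Dorylus's actual code. It assumes a simple epoch model in which a worker may run at most a fixed number of epochs ahead of the slowest worker; the names (BoundedStaleness, staleness) are illustrative, not the system's.

```cpp
#include <algorithm>
#include <condition_variable>
#include <cstdio>
#include <mutex>
#include <thread>
#include <vector>

// Bounded-asynchrony sketch: a worker may run at most `staleness` epochs
// ahead of the slowest worker, keeping the pipeline full without letting
// updates grow arbitrarily stale.
class BoundedStaleness {
 public:
  BoundedStaleness(int num_workers, int staleness)
      : epochs_(num_workers, 0), staleness_(staleness) {}

  // Worker `id` finished an epoch; block while it is too far ahead.
  void finish_epoch(int id) {
    std::unique_lock<std::mutex> lock(mu_);
    ++epochs_[id];
    cv_.notify_all();
    cv_.wait(lock, [&] { return epochs_[id] - min_epoch() <= staleness_; });
  }

 private:
  int min_epoch() const {
    return *std::min_element(epochs_.begin(), epochs_.end());
  }
  std::vector<int> epochs_;
  const int staleness_;
  std::mutex mu_;
  std::condition_variable cv_;
};

int main() {
  const int kWorkers = 4, kEpochs = 8;
  BoundedStaleness sync(kWorkers, /*staleness=*/2);
  std::vector<std::thread> ts;
  for (int id = 0; id < kWorkers; ++id)
    ts.emplace_back([&, id] {
      for (int e = 0; e < kEpochs; ++e) {
        // ... compute one pipeline stage / epoch here ...
        sync.finish_epoch(id);
      }
      std::printf("worker %d done\n", id);
    });
  for (auto& t : ts) t.join();
}
```

The bound keeps workers busy (the pipeline stays full) while preventing updates from becoming arbitrarily stale, which is what makes asynchronous training still converge accurately.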
PMALLOC: An Efficient Allocator for Non-volatile Memory (NVM)
This is my undergraduate thesis project, which was one of eight finalists for the best-thesis award in the CS department. In
this project, I
- Proposed a novel lightweight version-commit mechanism for efficient crash consistency (see the sketch after this list);
- Implemented PMALLOC as a library, achieving up to 1.88x faster allocation and 5.44x faster
fault recovery compared to existing allocators.
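As a toy C++ illustration of the version-commit idea (not PMALLOC's actual implementation): metadata updates are tagged with a version number, and a single persisted store to a global version counter commits them all atomically; recovery discards anything newer than the commit point. The pflush helper is a hypothetical stand-in for a real cache-line flush plus fence.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Stand-in for a real flush-and-fence (e.g., clwb/clflushopt + sfence).
void pflush(const void* addr) { (void)addr; }

struct Slot {
  uint64_t version;    // version that last wrote this slot
  uint8_t  allocated;  // the allocation-metadata bit itself
};

struct Heap {
  static const size_t kSlots = 1024;
  Slot slots[kSlots];                          // lives in NVM (conceptually)
  std::atomic<uint64_t> committed_version{0};  // the single commit point

  // Stage an allocation under the next (uncommitted) version.
  void stage_alloc(size_t i, uint64_t v) {
    slots[i].allocated = 1;
    slots[i].version = v;
    pflush(&slots[i]);  // persist the staged update before committing
  }

  // One persisted store commits every update tagged with version v.
  void commit(uint64_t v) {
    committed_version.store(v, std::memory_order_release);
    pflush(&committed_version);
  }

  // Crash recovery: roll back any update newer than the commit point.
  void recover() {
    uint64_t c = committed_version.load();
    for (size_t i = 0; i < kSlots; ++i)
      if (slots[i].version > c) slots[i].allocated = 0;
  }
};
```

For example, an allocation stages its bitmap updates under version v = committed + 1, flushes them, then calls commit(v); a crash before the commit leaves the heap exactly as it was, since recover() rolls the staged slots back, with no per-update logging.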
Efficient Scheduling with Private Deques in Multiprogrammed Environments
This work improved the performance of the task scheduler in MPL, a compiler for Parallel ML (a variant of
Standard ML). Specifically, I
- Designed a new private-deque-based scheduling strategy (sketched below) that reduces lock contention and synchronization by
marrying work-sharing and work-stealing across processors;
- Integrated my scheduler into the MPL runtime, achieving up to 163x speedup.
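Below is a rough C++ sketch of the private-deque idea, illustrative only and not the MPL scheduler: each worker owns a deque that only it touches, so task push/pop needs no locks; an idle worker posts a steal request, and a busy worker shares a task into the requester's mailbox the next time it polls. This marries work-stealing (thief-initiated requests) with work-sharing (victim-pushed tasks).

```cpp
#include <atomic>
#include <deque>
#include <functional>
#include <optional>
#include <utility>
#include <vector>

using Task = std::function<void()>;

struct Worker {
  std::deque<Task> deque;               // private: only the owner touches it
  std::atomic<int> request{-1};         // id of a thief asking for work
  std::atomic<Task*> mailbox{nullptr};  // where victims deposit shared tasks
};

struct Scheduler {
  std::vector<Worker> workers;
  explicit Scheduler(int n) : workers(n) {}

  // Busy worker: run local tasks, answering share requests as they arrive.
  void work_loop(int id) {
    Worker& me = workers[id];
    while (!me.deque.empty()) {
      poll_requests(id);                // work-sharing side of the protocol
      Task t = std::move(me.deque.back());
      me.deque.pop_back();
      t();
    }
  }

  // Victim side: hand the oldest task to one pending requester.
  void poll_requests(int id) {
    Worker& me = workers[id];
    int thief = me.request.exchange(-1);
    if (thief >= 0 && me.deque.size() > 1) {
      Task* shared = new Task(std::move(me.deque.front()));
      me.deque.pop_front();
      workers[thief].mailbox.store(shared);
    }
  }

  // Thief side: register a request with a victim, then poll own mailbox.
  std::optional<Task> request_work(int id, int victim) {
    workers[victim].request.store(id);
    for (int spin = 0; spin < 10000; ++spin) {
      if (Task* t = workers[id].mailbox.exchange(nullptr)) {
        Task out = std::move(*t);
        delete t;
        return out;
      }
    }
    return std::nullopt;                // give up; try another victim
  }
};
```

Because the deque is never touched by other processors, the common path (local push/pop) involves no locks or atomic operations at all, which is where the contention reduction comes from.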
Crash Consistency in Non-volatile Memory (NVM) for High Performance Computing (HPC)
I worked as an undergraduate research assistant with two Ph.D. students. My contributions:
- Participated in designing and implementing lightweight, algorithm-based crash-consistency support for HPC
applications running on NVM, with no logging or checkpointing (one flavor of the idea is sketched below);
- Evaluated three representative HPC workloads; the overhead of our methods is below
3% in most cases and at most 8.3% in all cases.
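The C++ sketch below shows one common flavor of algorithm-aware crash consistency for iterative kernels, not necessarily the exact scheme from our paper: alternate between two buffers and persist only a tiny progress counter, so recovery restarts from the last completed iteration with no per-store logging and no full checkpoints. pflush is again a hypothetical stand-in for a flush-and-fence.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>

// Stand-in for a real flush-and-fence to persistent memory.
void pflush(const void* addr) { (void)addr; }

const size_t N = 1 << 20;
double buf[2][N];                           // both buffers live in NVM
std::atomic<uint64_t> completed_iter{0};    // persisted progress marker

// Toy 1-D stencil: one iteration reads `in` and writes `out`.
void step(const double* in, double* out) {
  for (size_t i = 1; i + 1 < N; ++i)
    out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
}

void run(uint64_t iters) {
  for (uint64_t it = completed_iter; it < iters; ++it) {
    const double* in = buf[it % 2];
    double* out = buf[(it + 1) % 2];
    step(in, out);
    pflush(out);                            // persist this iteration's output
                                            // (whole-buffer flush elided)
    completed_iter = it + 1;                // single-word commit of progress
    pflush(&completed_iter);
  }
}

// After a crash, calling run(iters) again resumes from completed_iter,
// reading the buffer written by the last fully persisted iteration.
```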
John Thorpe*, Yifan Qiao*, Jonathan Eyolfson, Shen Teng, Guanzhou Hu, Zhihao Jia, Jinliang Wei, Keval Vora, Ravi Netravali, Miryung Kim, and Guoqing Harry Xu. Dorylus: Affordable, Scalable, and Accurate GNN Training over Billion-Edge Graphs. OSDI 2021. (* contributed equally) [pdf][code]
Shuo Yang, Kai Wu, Yifan Qiao, Dong Li, and Jidong Zhai. Algorithm-Directed Crash Consistence in Non-Volatile Memory for HPC. CLUSTER 2017.
2019 Magna Cum Laude, Beijing (8/140)
2019 Magna Cum Laude, Department of Computer Science and Technology, Tsinghua University
2019 Cum Laude, Tsinghua University (14/140)
2018 CNPC Scholarship for Comprehensive Excellence (8/140)
2018 Qualcomm Scholarship (Top 6%)
2017 National Scholarship (6/140)
2017 35th "Challenge Cup" National College Student Extracurricular Academic Science and Technology Works Competition, Third Prize
2016, 2017 Electrical Trading Challenge, Champion among 50+ teams
2016 National University Student Physics Competition, First Prize
Last updated 01/2021