Tianyi Zhang

Postdoctoral Fellow, Harvard University

Ph.D. in Computer Science, UCLA

Office: Maxwell-Dworkin 242, Cambridge, MA

E-mail: tianyi@seas.harvard.edu

I moved to Harvard as a postdoc in Fall 2019. This website is no longer maintained. Please check my new homepage.

I obtained my PhD in Computer Science from UCLA, where I spent six wonderful years working with my PhD advisor, Prof. Miryung Kim. Now I am a Postdoctoral Fellow in Computer Science at Harvard University. I am working with Prof. Elena Glassman to design and build systems for interacting with population-level structures and patterns in large code and data corpora.


My research resides primarily in Software Engineering and Human-Computer Interaction. I enjoy building interactive program analysis techniques and development tools that lower learning barriers for coding, increase development productivity, and improve software quality. These days, I am interested in harnessing the power of Big Code and enabling developers to explore common practices and potential design & implementation alternatives in open source communities.

During my Ph.D. at UCLA, I worked on several projects that discover and represent the commonalities and variations among similar programs for systematic software development. The intuition is that by unveiling what has and has not been done in other similar contexts, we can help developers avoid unintentional inconsistencies, identify better implementation alternatives, and gain a deeper understanding of the code under investigation.

I started this research by leveraging code duplications and redundancies in local codebases. My collaborators and I built two techniques to improve the effectiveness of code reviews and differential testing via interactive template construction and code transplantation. [ICSE 2015] [ICSE 2017]

Since 2017, I have been focusing on extending this research to exploit similar programs in the large and growing body of open-source projects on GitHub. I collaborated with researchers in SE, PL, and HCI to build systems that scale the reasoning about program semantics to massive code corpora, mine common API usage patterns and code adaptation patterns, and visualize hundreds of code examples at scale. [ICSE 2018][CHI 2018][ICSE 2019]


  • [Aug. 2020] Our demo paper on debloating modern Java applications was accepted to FSE 2020 Demonstration Track! Congratulations to Konner and Mihir!
  • [Aug. 2020] Our paper about example generation was accepted to FSE 2020 Industry Track! Congratulations to Celeste!
  • [Jun. 2020] Our paper about interactive program synthesis was accepted to UIST 2020!
  • [May 2020] Our paper about debloating modern Java applications was accepted to FSE 2020!
  • [Dec. 2019] Our paper about adversarial attacks and defenses of autonomous driving models was accepted to PerCom 2020! Congratulations to Yao!
  • [Dec. 2019] Our paper about the unmet needs and desired tool support for gathering and interpreting community usage data for API design was accepted to CHI 2020!
  • [Nov. 2019] I gave a talk on "Programming at Scale by Harnessing the Power of Big Code" at Facebook.
  • [Oct. 2019] I presented "An Empirical Study of Common Challenges in Developing Deep Learning Applications" at ISSRE 2019.
  • [Jul. 2019] I graduated from UCLA and started as a postdoc at Harvard University!
  • [Jul. 2019] Our paper about common challenges in developing deep learning applications was accepted to ISSRE 2019!
  • [Mar. 2019] I successfully defended my PhD thesis!
  • [Feb. 2019] The research artifact of online code adaptation passed the ICSE artifact evaluation. GitHub link
  • [Feb. 2019] The research artifact of active inductive logic programming for code search passed the ICSE artifact evaluation. GitHub link
  • [Dec. 2018] Our paper about common adaptation patterns of online code examples was accepted to ICSE 2019!
  • [Dec. 2018] Our paper about interactive code search via active learning was accepted to ICSE 2019. Congratulations to Aish!
  • [Nov. 2018] I have released a command-line API misuse detector based on common API usage patterns mined from 380K Java projects on GitHub. The tool is now available on the ExampleCheck website. link
  • [Jul. 2018] Our demo paper on detecting API usage violations in Stack Overflow was accepted to FSE 2018 Demonstrations Track. Congratulations to Anastasia!
  • [Jul. 2018] I will serve on the Artifacts Evaluation Committee of ICSE 2019.
  • [Jun. 2018] Both the dataset and the tool of our API misuse study of Stack Overflow are publicly available. link
  • [Jun. 2018] Presented "Are Code Examples on an Online Q&A Forum Reliable? A Study of API Misuse on Stack Overflow" at ICSE 2018.
  • [Apr. 2018] Co-presented "Visualizing API Usage Examples at Scale" with Elena Glassman at CHI 2018.
  • [Apr. 2018] Examplore, an interactive system for visualizing and exploring hundreds of API usage examples, is now publicly available! link
  • [Mar. 2018] Our poster about automated transplantation and differential testing for code clones was accepted to ICSE 2018!
  • [Dec. 2017] Our paper on visualizing API usage examples at scale was accepted to CHI 2018!
  • [Dec. 2017] Our paper on the reliability of Stack Overflow examples was accepted to ICSE 2018!
  • [Dec. 2017] Critics, an interactive code review technique for searching similar program edits, is now open source! link
  • [Dec. 2017] We have completed the tech transfer of Critics to Huawei.
  • [Jul. 2017] I built a command-line tool, BibMerge, to remove duplicates in bib files and update the corresponding references in tex files. Feel free to grab it if you also have trouble merging bib files.
  • [Jul. 2017] I received the 2017-2018 UCLA Dissertation Year Fellowship.
  • [Apr. 2017] I received the 2017-2018 Google Outstanding Graduate Student Research Award.
  • [Jan. 2017] Our test reuse tool and dataset are now publicly available here.
  • [Jan. 2017] Our work about test reuse was presented at the Dagstuhl Seminar!
  • [Dec. 2016] Our paper on test reuse and differential testing was accepted to ICSE 2017!
  • [Sept. 2016] I passed the Oral Qualifying Exam (OQE) and have advanced to candidacy!


Interactive Program Synthesis by Augmented Examples
Tianyi Zhang, London Lowmanstone, Xinyu Wang, Elena Glassman [PDF][Preview]
JShrink: In-depth Investigation into Debloating Modern Java Applications
Bobby Bruce*, Tianyi Zhang*, Jaspreet Arora, Guoqing Harry Xu, Miryung Kim [PDF][Artifact]
* equal contribution
Enabling Data-driven API Design with Community Usage Data: A Need-Finding Study
Tianyi Zhang, Björn Hartmann, Miryung Kim, Elena Glassman [PDF][Presentation]
An Analysis of Adversarial Attacks and Defenses on Autonomous Driving Models
Yao Deng, James Xi Zheng, Tianyi Zhang, Chen Chen, Guannan Lou, Miryung Kim [PDF]
Exempla Gratis (E.G.): Code Examples for Free
Celeste Barnaby, Koushik Sen, Tianyi Zhang, Elena Glassman, Satish Chandra [PDF]
WebJShrink: A Web Service for Debloating Java Bytecode
Konner Macias, Mihir Mathur, Bobby R. Bruce, Tianyi Zhang, Miryung Kim [PDF][Demo]
Ph.D. Dissertation
Leveraging Program Commonalities and Variations for Systematic Software Development and Maintenance
Tianyi Zhang [PDF]
Analyzing and Supporting Adaptation of Online Code Examples
Tianyi Zhang, Di Yang, Cristina Lopes, Miryung Kim
[PDF][Datasets][Analysis Replication][Chrome Extension][User Study Material & Results]
Active Inductive Logic Programming for Code Search
Aishwarya Sivaraman, Tianyi Zhang, Guy Van den Broeck, Miryung Kim
[PDF][Code][VirtualBox Image]
An Empirical Study of Common Challenges in Developing Deep Learning Applications
Tianyi Zhang*, Cuiyun Gao*, Lei Ma, Michael R. Lyu, Miryung Kim [PDF][Dataset and Tool]
* equal contribution
Book Chapter
Software Evolution in Handbook of Software Engineering, Springer, 2019
Miryung Kim, Na Meng, Tianyi Zhang [PDF]
Are Code Examples on an Online Q&A Forum Reliable? A Study of API Misuse on Stack Overflow
Tianyi Zhang, Ganesha Upadhyaya, Anastasia Reinhardt, Hridesh Rajan, Miryung Kim [PDF][Dataset and Tool]
Visualizing API Usage Examples at Scale
Elena L. Glassman*, Tianyi Zhang*, Björn Hartmann, Miryung Kim [PDF][Code][Tool]
* equal contribution
Augmenting Stack Overflow with API Usage Patterns Mined from GitHub
Anastasia Reinhardt, Tianyi Zhang, Mihir Mathur, Miryung Kim [PDF][Demo][Tool]
Poster: Grafter: Transplantation and Differential Testing for Clones
Tianyi Zhang, Miryung Kim [Abstract][Poster]
Automated Transplantation and Differential Testing for Clones
Tianyi Zhang, Miryung Kim [PDF][Demo][Tool]
Interactive Code Review for Systematic Changes
Tianyi Zhang, Myoungkyu Song, Joseph Pinedo, Miryung Kim [PDF][Code]
Critics: An Interactive Code Review Tool for Searching and Inspecting Systematic Changes
Tianyi Zhang, Myoungkyu Song, Miryung Kim [PDF][Demo]


  • Program Committee: ICSE 2019 AEC, ICSE 2020 AEC, PerCom 2020 WiP, FSE Demo 2020
  • ACM TOSEM Board of Distinguished Reviewers
  • Journal Reviewer: TSE, TOCHI, EMSE, IST, IEEE Software
  • External Reviewer: CSCW 2020, UIST 2020
  • Student Volunteer: ICSE 2016


Useful Advice

Students to Conference by David Notkin

Seven Things I Learned on the Way to Not Achieving My Career Goal by Harry Shum

Patterns for writing good rebuttals by Andreas Zeller

7 Tips for Attending a Conference Alone (And Having a Good Time) by Yuanyuan Zhou

Things I Keep Repeating About Writing by Claire Le Goues