I obtained my PhD in Computer Science from UCLA, where I spent six wonderful years working with my PhD advisor, Prof. Miryung Kim. Now I am a Postdoctoral Fellow in Computer Science at Harvard University. I am working with Prof. Elena Glassman to design and build systems for interacting with population-level structures and patterns in large code and data corpora.
My research resides primarily in Software Engineering and Human-Computer Interaction. I enjoy building interactive program analysis techniques and development tools that lower learning barriers for coding, increase development productivity, and improve software quality. These days, I am interested in harnessing the power of Big Code and enabling developers to explore common practices and potential design & implementation alternatives in open source communities.
During my Ph.D. at UCLA, I worked on several projects that discover and represent the commonalities and variations among similar programs for systematic software development. The intuition is that by unveiling what has and has not been done in other similar contexts, we can help developers avoid unintentional inconsistencies, identify better implementation alternatives, and gain a deeper understanding of the code under investigation.
I started this research by leveraging code duplications and redundancies in local codebases. My collaborators and I built two techniques to improve the effectiveness of code reviews and differential testing via interactive template construction and code transplantation. [ICSE 2015] [ICSE 2017]
Since 2017, I have been focusing on extending this research to exploit similar programs in the large and growing body of open-source projects on GitHub. I collaborated with researchers in SE, PL, and HCI to build systems that scale the reasoning about program semantics to massive code corpora, mine common API usage patterns and code adaptation patterns, and visualize hundreds of code examples at scale. [ICSE 2018][CHI 2018][ICSE 2019]
[Dec. 2019] Our paper about adversarial attacks and defenses of autonomous driving models was accepted to PerCom 2020! Congratulations to Yao!
[Dec. 2019] Our paper about the unmet needs and desired tool support for gathering and interpreting community usage data for API design was accepted to CHI 2020!
[Nov. 2019] I gave a talk on "Programming at Scale by Harnessing the Power of Big Code" at Facebook.
[Oct. 2019] I presented "An Empirical Study of Common Challenges in Developing Deep Learning Applications" at ISSRE 2019.
[Jul. 2019] I graduated from UCLA and started as a postdoc at Harvard University!
[Jul. 2019] Our paper about common challenges in developing deep learning applications was accepted to ISSRE 2019!
[Mar. 2019] I successfully defended my PhD thesis!
[Feb. 2019] The research artifact of online code adaptation passed the ICSE artifact evaluation. GitHub link
[Feb. 2019] The research artifact of active inductive logic programming for code search passed the ICSE artifact evaluation. GitHub link
[Dec. 2018] Our paper about common adaptation patterns of online code examples was accepted to ICSE 2019!
[Dec. 2018] Our paper about interactive code search via active learning was accepted to ICSE 2019. Congratulations to Aish!
[Nov. 2018] I have released a command-line API misuse detector based on common API usage patterns mined from 380K Java projects on GitHub. The tool is now available on the ExampleCheck website. link
[Jul. 2018] Our demo paper on detecting API usage violations in Stack Overflow was accepted to FSE 2018 Demonstrations Track. Congratulations to Anastasia!
[Jul. 2018] I will serve on the Artifacts Evaluation Committee of ICSE 2019.
[Jun. 2018] Both the dataset and the tool of our API misuse study of Stack Overflow are publicly available. link
[Jun. 2018] Presented "Are Code Examples on an Online Q&A Forum Reliable? A Study of API Misuse on Stack Overflow" at ICSE 2018.
[Apr. 2018] Co-presented "Visualizing API Usage Examples at Scale" with Elena Glassman at CHI 2018.
[Apr. 2018] Examplore, an interactive system for visualizing and exploring hundreds of API usage examples, is now publicly available! link
[Mar. 2018] Our poster about automated transplantation and differential testing for code clones was accepted to ICSE 2018!
[Dec. 2017] Our paper on visualizing API usage examples at scale was accepted to CHI 2018!
[Dec. 2017] Our paper on the reliability of Stack Overflow examples was accepted to ICSE 2018!
[Dec. 2017] Critics, an interactive code review technique for searching similar program edits, is now open source! link
[Dec. 2017] We have completed the tech transfer of Critics to Huawei.
[Jul. 2017] I built a command-line tool, BibMerge, to remove duplicates in bib files and update the corresponding references in tex files. Feel free to grab it if you also have trouble merging bib files.
[Jul. 2017] I received the 2017-2018 UCLA Dissertation Year Fellowship.
[Apr. 2017] I received the 2017-2018 Google Outstanding Graduate Student Research Award.
[Jan. 2017] Our test reuse tool and dataset are now publicly available here.
Teaching Assistant: UCLA CS230 Advanced Software Engineering (Spring 2017); UCLA CS130 Software Engineering (Fall 2015, Fall 2016); UT Austin EE461L Software Engineering and Design Laboratory (Fall 2013, 2013 Best TA Award)