Who Broke the Build? Automatically Identifying Changes That Induce Test Failures In Continuous Integration at Google Scale

Proceedings of the 39th International Conference on Software Engineering: Software Engineering in Practice Track, IEEE Press, Buenos Aires, Argentina (2017), pp. 113-122 (to appear)

Abstract

Quickly identifying and fixing code changes that
introduce regressions is critical to maintaining momentum in
software development, especially in very large-scale software
repositories with rapid development cycles, such as Google's.
Identifying and fixing such regressions is one of the most
expensive, tedious, and time-consuming tasks in the software
development life cycle. Therefore, there is a high demand
for automated techniques that can help developers identify
such changes while minimizing manual human intervention.
Various techniques have recently been proposed to identify such
code changes. However, these techniques have shortcomings
that make them unsuitable for rapid development cycles such as
those at Google. In this paper, we propose a novel algorithm to
identify code changes that introduce regressions, and discuss
case studies performed at Google on 140 projects. In these
case studies, our algorithm automatically ranks the change
that introduced the regression among the top 5 of thousands of
candidates 82% of the time, and considerably reduces the
manual work developers need to perform.
