De-Flake Your Tests: Automatically Locating Root Causes of Flaky Tests in Code At Google

Diego Cavalcanti
International Conference on Software Maintenance and Evolution (ICSME) 2020, IEEE

Abstract

Regression testing is a critical part of software development and maintenance.
It ensures that modifications to existing software do not break existing
behavior and functionality.

One of the key assumptions about regression tests is that their results are
deterministic: when executed repeatedly on unmodified code with the same
configuration, they either always pass or always fail. In practice,
however, some tests are non-deterministic; these are called flaky
tests. Flaky tests make the results of test runs unreliable, and they
disrupt the software development workflow.
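
As an illustration of the determinism assumption only (not the technique presented in this paper), the following minimal sketch reruns an unchanged test and classifies its outcomes. The names `classify_test`, `stable_test`, and `make_flaky_test` are hypothetical, and the seeded random number generator merely stands in for a real source of non-determinism such as timing or concurrency.

```python
import random
from typing import Callable


def classify_test(test: Callable[[], bool], runs: int = 20) -> str:
    """Rerun an unchanged test and classify the observed outcomes."""
    if runs < 1:
        raise ValueError("runs must be at least 1")
    # Collect the distinct outcomes (True = pass, False = fail).
    outcomes = {test() for _ in range(runs)}
    if outcomes == {True}:
        return "always passes"
    if outcomes == {False}:
        return "always fails"
    return "flaky"


def stable_test() -> bool:
    """A deterministic test: same input, same result on every run."""
    return sum([1, 2, 3]) == 6


def make_flaky_test(seed: int) -> Callable[[], bool]:
    """Build a test whose result depends on hidden state (assumed example)."""
    rng = random.Random(seed)  # stand-in for timing, ordering, etc.
    return lambda: rng.random() < 0.5


print(classify_test(stable_test))
print(classify_test(make_flaky_test(seed=0)))
```

Observing only passes (or only failures) in a finite number of reruns does not prove that a test is deterministic; it only fails to expose flakiness within those runs.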

In this paper, we present a novel technique to automatically identify the
locations of the root causes of flaky tests at the code level to help
developers debug and fix them. We study the technique on flaky tests across
428 projects at Google.

Based on our case studies, the technique helps identify the location of the root
causes of flakiness with 82% accuracy. Furthermore, our studies show that
integration into the appropriate developer workflows, simplicity of debugging
aids, and fully automated fixes are crucial and preferred components for
adoption and usability of flakiness debugging and fixing tools.
