Publications at Software and Computational Systems Lab

Compact view

Publications about Flaky Tests

Articles in conference or workshop proceedings

Philipp Wendler and Stefan Winter. Regression-Test History Data for Flaky Test Research. In Proc. 1st International Workshop on Flaky Tests, pages 3–4, 2024. ACM. doi:10.1145/3643656.3643901
Keyword(s): Software Testing, Flaky Tests Publisher's Version PDF Presentation
Artifact(s)
1. doi:10.5281/zenodo.10639030
Abstract

Due to their random nature, flaky test failures are difficult to study. Without having observed a test to both pass and fail under the same setup, it is unknown whether a test is flaky and what its failure rate is. Thus, flaky-test research has greatly benefited from data records of previous studies, which provide evidence for flaky test failures and give a rough indication of the failure rates to expect. For assessing the impact of the studied flaky tests on developers' work, it is important to also know how flaky test failures manifest over a regression test history, i.e., under continuous changes to test code or code under test. While existing datasets on flaky tests are mostly based on re-runs on an invariant code base, the actual effects of flaky tests on development can only be assessed across the commits in an evolving commit history, against which (potentially flaky) regression tests are executed. In our presentation, we outline approaches to bridge this gap and report on our experiences following one of them. As a result of this work, we contribute a dataset of flaky test failures across a simulated regression test history.

BibTeX Entry

@inproceedings{RegressionTestData-FTW24, author = {Philipp Wendler and Stefan Winter}, title = {Regression-Test History Data for Flaky Test Research}, booktitle = {Proc.\ 1st International Workshop on Flaky Tests}, pages = {3–4}, year = {2024}, publisher = {ACM}, doi = {10.1145/3643656.3643901}, pdf = {https://www.sosy-lab.org/research/pub/2024-FTW24.Regression-Test_History_Data_for_Flaky_Test_Research.pdf}, presentation = {https://www.sosy-lab.org/research/prs/2024-04-14_FTW24_Regression-Test_History_Data_for_Flaky_Test_Research_Stefan.html}, abstract = {Due to their random nature, flaky test failures are difficult to study. Without having observed a test to both pass and fail under the same setup, it is unknown whether a test is flaky and what its failure rate is. Thus, flaky-test research has greatly benefited from data records of previous studies, which provide evidence for flaky test failures and give a rough indication of the failure rates to expect. For assessing the impact of the studied flaky tests on developers' work, it is important to also know how flaky test failures manifest over a regression test history, i.e., under continuous changes to test code or code under test. While existing datasets on flaky tests are mostly based on re-runs on an invariant code base, the actual effects of flaky tests on development can only be assessed across the commits in an evolving commit history, against which (potentially flaky) regression tests are executed. In our presentation, we outline approaches to bridge this gap and report on our experiences following one of them. As a result of this work, we contribute a dataset of flaky test failures across a simulated regression test history.}, keyword = {Software Testing, Flaky Tests}, artifact = {10.5281/zenodo.10639030}, keywords = {Software Testing, Dataset}, }

Disclaimer:

This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All person copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.

Last modified: Fri Jul 11 12:25:32 2025 UTC