A few weeks ago I mentioned how we fund Stefan’s work on improving HTTP(/3) in curl. Now, in similar spirit we are funding Dan Fandrich to work on further improving test infrastructure. Dan has worked fiercely on the introduction of parallel tests over the recent year or so and this is work that builds on that and continues down that road.
Test Analysis System
curl contains a regression test suite of over 1900 individual test cases that are run automatically on every commit submission and on every pull request in almost 130 different environments., meaning that every change can result in more than 140,000 tests being run. A spurious test failure rate of a mere 0.001% is likely to cause a perfectly good PR to end up showing with a red failure. A new contributor that doesn’t understand this problem can spend hours poring over his or her patch and the related code in curl, searching for a problem that isn’t there.
Analyzing 140,000 tests for each change to the curl source code to find failure trends (such as flaky tests) demands an automated solution. Dan has created a system (working name Test Clutch) that has been successfully ingesting curl CI test results for much of the past year and has been used by him to find flaky tests as well as permanently failing tests (often submitted under the mistaken impression that the failed test was merely flaky). It collects individual test results from all the CI systems used in the curl project into a database where they can be analyzed.
This system has potential to be useful to a broader base of curl developers to help see test trends, test platform coverage and to better determine which tests are flaky and could use improvement. It has been written in a fashion such that test results for other projects besides curl can also be added and analyzed separately.
Make Test Clutch available
The current test ingestion and analysis system will be productionized and the analysis summary table will be integrated into the curl web site for easy access for developers.
Assist in PR work
This task will involve writing code to trigger the test analysis system to retrieve detailed PR test results when available. It must make a reasonable determination of when all the expected tests have been completed (since not all tests will run for every PR) then commenting on the PR with a summary of the test results and believability of any test failures.
These are project that will benefit the project when implemented but they are not time sensitive and Dan is not going to work full time on them. There are no exact end dates set for them.
The result of Dan’s work will become visible in PRs and website updates as we go forward.