Write a script to find redundant tasks and run them less often (tier 2 or only on full autoland pushes)
Categories
(Testing :: General, enhancement, P3)
Tracking
(Not tracked)
People
(Reporter: marco, Assigned: marco)
References
(Blocks 1 open bug)
Details
Attachments
(1 file)
Using the data that we've been collecting for the machine learning-based scheduler, we can do nice analyses like figuring out which tasks are redundant (e.g. a couple of tasks that always pass together or always fail together).
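As a rough illustration of the "always pass together or always fail together" idea, here is a minimal sketch. It assumes a simplified, hypothetical input shape (a list of `(ran, failed)` sets of task labels per push), not the actual format of the scheduler data, and the function name `find_redundant_pairs` is made up for this example.

```python
from itertools import combinations


def find_redundant_pairs(pushes, min_joint_failures=1):
    """Return pairs of tasks whose outcomes never differed when both ran.

    `pushes` is a list of (ran, failed) tuples of sets of task labels.
    This shape is a hypothetical simplification for illustration.
    """
    tasks = sorted(set().union(*(ran for ran, _ in pushes)))
    redundant = []
    for a, b in combinations(tasks, 2):
        joint_failures = 0
        agree = True
        for ran, failed in pushes:
            if a not in ran or b not in ran:
                continue  # only compare pushes where both tasks ran
            if (a in failed) != (b in failed):
                agree = False  # one failed while the other passed
                break
            if a in failed:
                joint_failures += 1
        # Require at least one shared failure, otherwise two tasks that
        # simply always pass would be reported as redundant.
        if agree and joint_failures >= min_joint_failures:
            redundant.append((a, b))
    return redundant


example_pushes = [
    ({"a", "b", "c"}, {"a", "b"}),  # a and b fail together
    ({"a", "b", "c"}, set()),       # everything passes
    ({"a", "b", "c"}, {"c"}),       # only c fails
]
redundant_pairs = find_redundant_pairs(example_pushes)
```

Pairs found this way are only candidates: agreement in past results says nothing about whether the tasks cover the same code.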
Comment 1•5 years ago
Please be careful about this: Past results may not carry forward to the future. Correlation of past results from 2 or more tasks may be a starting point for a human code review to find and eliminate logical redundancies, but I would hope that it would not be used to automatically reduce test coverage.
For illustration of my concern, think of two simple tests, one testing a specific, usually stable media issue, the other testing a specific, usually stable layout issue. For years they have the same results, generally passing, but failing together when, for instance, page load is broken. Of course you will want both tests to be active the day someone breaks the media or layout issues covered by these tests.
Assignee
Comment 2•5 years ago
Yes, sorry, I chose a confusing title for the bug. I actually intended to make these kinds of tests run less often, and only after human review.
In the future, we can also consider actually disabling some tests, offering a dashboard for developers to go through the data (both past failure and code coverage) themselves.
Code coverage has also been partially used to disable some tests which were redundant with web-platform-tests.
Assignee
Comment 3•5 years ago
I've built a script to analyze mingw32 failures and check if they are redundant with win32/win64/mingw64: https://github.com/mozilla/bugbug/commit/f7c11091e9dfddced8bf251409193ef1aca46e9d.
The script is pretty generic, so it can be useful for other similar analyses too.
I'm going to keep this open for making the script find interesting redundancies automatically, without needing a human to input the groups to compare.
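The kind of group comparison described above could look roughly like the following sketch. It is not the linked bugbug script: the `"<platform>/<suite>"` label format, the input shape (one set of failed task labels per push) and the function name `unique_failures` are all assumptions made for illustration.

```python
def unique_failures(pushes, candidate, references):
    """List (push_index, label) failures seen only on `candidate`.

    `pushes` is a list of sets of failed task labels, one set per push.
    Labels are assumed to look like "<platform>/<suite>" (hypothetical).
    """
    only_candidate = []
    for i, failed in enumerate(pushes):
        for label in sorted(failed):
            platform, _, suite = label.partition("/")
            if platform != candidate:
                continue
            # The failure is "covered" if the same suite also failed on
            # at least one of the reference platforms in this push.
            covered = any(f"{ref}/{suite}" in failed for ref in references)
            if not covered:
                only_candidate.append((i, label))
    return only_candidate


example_pushes = [
    {"mingw32/mochitest", "win64/mochitest"},  # covered by win64
    {"mingw32/xpcshell"},                      # seen only on mingw32
    {"win32/xpcshell"},                        # no mingw32 failure
]
uncovered = unique_failures(
    example_pushes, "mingw32", ["win32", "win64", "mingw64"]
)
```

If the returned list is empty over a long history, the candidate platform never caught anything the reference platforms missed, which makes it a candidate for human review.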
Assignee
Comment 4•5 years ago
Assignee
Comment 5•5 years ago
I've built a small, hacky mach command to find redundant tasks, reusing the "mach try chooser" interface.
To test it, apply https://phabricator.services.mozilla.com/D73860, then run the following script to download the failure data gathered via mozci:
import os

import requests
import zstandard

# Stream the compressed failure data to disk.
r = requests.get(
    "https://community-tc.services.mozilla.com/api/index/v1/task/project.relman.bugbug.data_test_scheduling_history_push_data.latest/artifacts/public/push_data_label.json.zst",
    stream=True,
)
r.raise_for_status()
with open("push_data_label.json.zst", "wb") as f:
    for chunk in r.iter_content(chunk_size=4096):
        f.write(chunk)

# Decompress the zstd archive into plain JSON.
dctx = zstandard.ZstdDecompressor()
with open("push_data_label.json.zst", "rb") as input_f:
    with open("push_data_label.json", "wb") as output_f:
        dctx.copy_stream(input_f, output_f)

# Remove the compressed file now that it has been extracted.
os.remove("push_data_label.json.zst")
Then, run mach task-redundancy.