[meta] Make web-platform-tests work with Fission enabled
Categories
(Core :: DOM: Navigation, task, P3)
Tracking
()
Fission Milestone | Future |
People
(Reporter: jgraham, Unassigned)
References
(Depends on 2 open bugs, Blocks 2 open bugs)
Details
(Keywords: meta, Whiteboard: fission-tests)
Meta-bug tracking work to make wpt tests work with Fission enabled.
Comment hidden (Intermittent Failures Robot) |
Reporter | ||
Updated•5 years ago
Comment 3•5 years ago
James, does this still need [stockwell needswork:owner]?
Reporter | ||
Comment 4•5 years ago
I don't think so, but I didn't add it in the first place.
Reporter | ||
Updated•5 years ago
Comment 5•5 years ago
In bug 1622338, jgraham added an --enable-fission option to the mach-wpt test runner.
Comment 6•5 years ago
Assigning this WPT-Fis meta bug to Kashav because he will be leading the effort to green up WPT for Fission.
Comment 8•5 years ago
Steven is going to investigate WPT-Fis failures while Kashav is out.
Updated•5 years ago
Comment 9•5 years ago
Tracking for Fission Nightly M6b.
Kashav has fixed a lot of WPT bugs and he will close them soon. We still have some intermittent Windows failures.
Updated•4 years ago
Comment 10•4 years ago
Moving the meta for later but individual tests should be higher priority (M6 for nightly blocking or M7 for beta blocking).
Updated•4 years ago
Comment 11•4 years ago
:cpeterson, I don't see any other open bugs here, can we mark this done?
Comment 12•4 years ago
(In reply to Joel Maher ( :jmaher ) (UTC -0800) from comment #11)
:cpeterson, I don't see any other open bugs here, can we mark this done?
Sure. We still need to enable WPTs for some missing Fission configurations (such as linux1804-64-asan/opt and windows10-64-qr/debug), but I will file new bugs to verify and enable those tests.
Comment 13•4 years ago
We should keep this meta open and use it to track the remaining WPTs that are disabled for Fission. We'll need new bugs for those under this meta.
jgraham, can we re-enable all currently disabled WPTs for Fission, and then let the wpt-sync script disable the still failing ones, so that we don't have to individually check each intermittent to see if it can be re-enabled? Can the wpt-sync script or an adjacent script file a new bug under this meta bug for each Fission WPT that gets disabled so we have a consolidated list? Filing a new bug for each Fission disabled WPT should continue for each wpt-sync run so we don't miss tests that get disabled for Fission.
Reporter | ||
Comment 14•4 years ago
If I understand correctly, it's not just disabled tests you care about. In wpt, there are basically three categories of differences you might care about:
- Tests that are disabled in fission but not on other configurations. These tests either don't run at all (when whole test files are disabled) or are run but the results are ignored (when specific subtests are disabled; this is rare). The wpt sync never disables tests; it's only done by humans.
- Tests that have a fixed expectation that's different between fission and non-fission configurations, e.g. expected: FAIL for fission, but expected: PASS for non-fission. These are things which clearly need to be fixed or at least understood.
- Tests that have an intermittent result of some kind, especially one that differs between fission and non-fission. This is problematic because we don't have a great system for telling which of the intermittent results actually occur in practice. For example, expected: PASS in non-fission and expected: [PASS, FAIL] in fission might have been a one-time failure that the sync added that's now a perma-pass, or it might be a perma-fail. This can also affect cases where there isn't a fission-specific expectation, e.g. expected: [PASS, FAIL] in both fission and non-fission could hide a test that perma-fails in fission and perma-passes in non-fission.
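To make the three categories concrete, here is a minimal Python sketch that sorts a single test into one of them. The Expectation class, the classify function and the category names are hypothetical and exist only for illustration; they are not part of the wpt tooling, and the real metadata carries more detail than this.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Expectation:
    """Hypothetical, simplified view of one test's metadata in one configuration."""
    statuses: Tuple[str, ...] = ("PASS",)
    disabled: bool = False

    @property
    def intermittent(self) -> bool:
        # More than one allowed status, e.g. expected: [PASS, FAIL]
        return len(self.statuses) > 1


def classify(fission: Expectation, non_fission: Expectation) -> str:
    """Return which of the three categories (if any) a test falls into."""
    if fission.disabled and not non_fission.disabled:
        return "disabled-in-fission"
    if fission.intermittent or non_fission.intermittent:
        # Includes identical [PASS, FAIL] on both sides, which can still
        # hide a perma-fail in fission.
        return "intermittent"
    if fission.statuses != non_fission.statuses:
        return "fixed-difference"
    return "no-difference"


print(classify(Expectation(("FAIL",)), Expectation(("PASS",))))
```

The intermittent check deliberately comes before the fixed-difference check, because an intermittent annotation cannot be trusted to describe what actually happens on either configuration.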
Now answering the questions:
can we re-enable all currently disabled WPTs for Fission, and then let the wpt-sync script disable the still failing ones, so that we don't have to individually check each intermittent to see if it can be re-enabled?
It makes sense to go through the expectation ini files and remove any non-perma expectations that differ between fission and non-fission, and re-enable any tests that run on fission, and then run all of that through try with some rebuilds, and then use the mach wpt-update command to update the expectations to the observed results. We can't get the wpt-sync to do this directly but we can do basically the same thing it would do. To make updating the expectations as straightforward as possible, we should try to run all the configurations found on mozilla-central. For comparison [1] is a recent wpt-sync try push; note that it uses --disable-target-task-filter to enable some additional tasks.
Can the wpt-sync script or an adjacent script file a new bug under this meta bug for each Fission WPT that gets disabled so we have a consolidated list? Filing a new bug for each Fission disabled WPT should continue for each wpt-sync run so we don't miss tests that get disabled for Fission.
Getting the sync to do this would be quite non-trivial. The current way the sync files bugs is by looking at the per-PR results, both on GitHub and in Gecko CI. We don't have fission runs in either of those places yet, and we don't have a mechanism to do something special with failures that only happen in a specific configuration. We also don't have any integration between bug filing and the try pushes we do immediately before landing, so we would miss anything that failed in that case which had passed in the per-PR run; this is fairly common in general.
Instead of tying this to the sync directly I suggest writing a job that will run on central pushes and create an artifact of ini file differences representing regressions between fission and non-fission. This is pretty straightforward and will ensure that we capture all the differences that are annotated in the ini file. It might still miss cases where the expectation is intermittent across configurations but the results are actually different between fission and non-fission. We could capture these by looking at the actual recorded results, but without historical data there are likely to be false-positives from tests that are actually just intermittent.
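As a rough sketch of what such a job could emit, the Python below compares already-parsed expectations and builds a JSON report. The input shape, the fission_regressions name and the definition of "regression" (fission allows a non-passing status that non-fission does not) are all assumptions made for illustration; parsing the real ini files is out of scope here.

```python
import json
from typing import Dict, List

# Statuses treated as "good" for this sketch (an assumption).
GOOD_STATUSES = {"PASS", "OK"}


def fission_regressions(
    expectations: Dict[str, Dict[str, List[str]]]
) -> List[dict]:
    """List tests whose fission expectation is worse than non-fission.

    `expectations` maps a test id to its allowed statuses per configuration,
    e.g. {"/a.html": {"fission": ["FAIL"], "non-fission": ["PASS"]}}.
    """
    report = []
    for test_id in sorted(expectations):
        fission = expectations[test_id]["fission"]
        non_fission = expectations[test_id]["non-fission"]
        # Non-passing statuses that only the fission annotation allows.
        fission_only = [
            status for status in fission
            if status not in non_fission and status not in GOOD_STATUSES
        ]
        if fission_only:
            report.append({
                "test": test_id,
                "fission": fission,
                "non-fission": non_fission,
                "fission_only": fission_only,
            })
    return report


example = {
    "/a.html": {"fission": ["FAIL"], "non-fission": ["PASS"]},
    "/b.html": {"fission": ["PASS", "FAIL"], "non-fission": ["PASS", "FAIL"]},
    "/c.html": {"fission": ["PASS", "TIMEOUT"], "non-fission": ["PASS"]},
}
artifact = json.dumps(fission_regressions(example), indent=2)
print(artifact)
```

Note that /b.html does not appear in the report: identical intermittent annotations on both sides are exactly the case this kind of annotation diff cannot see.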
If you have a complete list of fission regressions, the remaining question is how to track those. The wpt-sync uses an external metadata repo to check if there are already bugs filed for a specific test failure. I don't think we want to reuse that here. Also the fact that the sync files bugs per-PR means that we can keep the volume of bugs filed down to a reasonable level. The problem with auto-filing bugs in general is that it's very hard to script a solution to "are these issues the same bug or a different bug". Humans don't do this perfectly but they are at least better at making informed guesses ;)
Given all of this, I'd prefer if actual bugs were filed by people. Of course we can still figure out some way to ensure that you know when there are fission regressions which are not associated with any bug (e.g. new ones). The main question in doing this is where we want the association between test result and bug to live. This can go in the wpt metadata (I think there's some precedent for that in the fission project, and certainly there is in general). That has the advantage that it's easy for the script that summarizes the regressions to tag each one with a bug number. The big disadvantage is that it means you need to make an actual m-c commit to update the annotations. The other option is that the association lives outside the source tree and we have some way (i.e. script) to update this with the latest data from mozilla-central. That could be in Bugzilla if we find some way to pack the data into the bugs, but it's not designed for it. It could probably be something like Google Sheets, assuming there's some API we could use to update a sheet.
Does that make sense?
[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=026810e69f92c9d345500a8d45e77b13b2c3edfc
Comment 15•4 years ago
I chatted with jgraham and Kashav on Matrix.
wpt-sync is run a few times a week.
jgraham will try writing a mach command to dump a report of wpt annotations for new Fission failures. I am confirming the exact requirements with jgraham in email.
When a new Fission intermittent is reported, someone on the Fission team will manually file a new bug and add the bug # as a comment on the new failure's annotation line in the wpt .ini metadata file. Keeping the comments with the annotations will allow humans to identify known vs new Fission annotations at a glance.
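A small check along these lines could flag Fission annotations that have no bug comment yet. This is only a sketch: the function name, the "# Bug NNN" comment convention, the sample metadata and the bug number in it are assumptions for illustration, and a plain line scan like this is much cruder than a real metadata parser.

```python
import re
from typing import List, Tuple

# A conditional annotation line that mentions fission, e.g. "if fission: FAIL".
# This also matches "if not fission: ...", which is fine for a rough check.
FISSION_CONDITION = re.compile(r"^\s*if\b.*\bfission\b")
# A trailing comment that names a bug, e.g. "# Bug 1234567".
BUG_COMMENT = re.compile(r"#.*\bbug\s*#?\s*\d+", re.IGNORECASE)


def untracked_fission_annotations(ini_text: str) -> List[Tuple[int, str]]:
    """Return (line number, line) for fission annotations lacking a bug comment."""
    missing = []
    for lineno, line in enumerate(ini_text.splitlines(), start=1):
        if FISSION_CONDITION.search(line) and not BUG_COMMENT.search(line):
            missing.append((lineno, line.strip()))
    return missing


# Illustrative metadata only; the bug number is hypothetical.
SAMPLE = """\
[example.html]
  expected:
    if fission: [PASS, FAIL]  # Bug 1234567 (hypothetical number)
  [a subtest]
    expected:
      if fission and (os == "win"): FAIL
"""

for lineno, line in untracked_fission_annotations(SAMPLE):
    print(f"line {lineno}: {line}")
```

Anything this reports would be a candidate "new" Fission annotation for someone to triage and file.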
In bug 1694974, Kashav will remove the existing wpt annotations for Fission intermittents and re-run those tests on Try with wpt-sync to generate the new wpt annotations. Kashav (or cpeterson) will then file new bugs for Fission failures (intermittent or perma-fail) that are still reproducible and add bug # comments to their wpt annotations.
Updated•2 years ago