[meta] Artifact builds produce too many perma-failures to be useful right now
Categories
(Testing :: General, task, P3)
Tracking
(Not tracked)
People
(Reporter: Gijs, Assigned: florian, NeedInfo)
References
(Depends on 1 open bug)
Details
(Keywords: meta)
Attachments
(5 files)
This type of thing is not atypical.
Push health says 145 failures. Only a handful of those are intermittents. 4 tests perma-fail cross-platform, accounting for about 40 failures total, and another 16 a11y tests fail on Windows (the only platform where they are run), for another 64 failures, tracked in bug 1885239. I think some of the remainder are duplicate errors from the same failures (e.g. crashes via asserts, or runs aborted because of too many failures).
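As a rough tally of the numbers above (a sketch only: the variable names are illustrative, and the counts are the approximate ones quoted in this comment, not fresh Push health data):

```python
# Approximate counts quoted in the description; names are illustrative.
total_failures = 145
cross_platform = 40   # ~40 failures from the 4 cross-platform perma-failing tests
windows_a11y = 64     # 64 failures from the 16 Windows-only a11y tests

accounted = cross_platform + windows_a11y
remainder = total_failures - accounted

print(f"accounted for: {accounted}, remainder: {remainder}")
print(f"share explained by 20 tests: {accounted / total_failures:.0%}")
```

So roughly 104 of the 145 failures trace back to just 20 tests, leaving about 41 for intermittents, duplicates, and anything else.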
Seeing the wood for the trees becomes very hard in this situation, especially if you don't push regularly or don't always use artifact builds.
The deps (I'll file a few more) will hopefully be sufficient to deal with the current set of issues. However, I'd like to avoid coming back to this place. But I'm not 100% sure how we'd do that.
Aryx, do you have ideas? I know we run artifact builds on central, but I don't think we run tests against them - is doing a tier-2 type test run and then noticing this type of thing when it lands a possibility? Or do you think we'd still miss that a given test has gone perma-fail in artifact mode?
Comment 1•11 months ago
Gijs, is this meta and needinfo still relevant after 3 months and the many dependencies landed, or could the remaining issue be moved to its own bug?
Comment 2•11 months ago (Reporter)
All the deps are closed so from that PoV we could close. I would really like to find a more durable solution that avoids the problem recurring, though...
Comment 3•10 months ago
Let's keep this open in case we find a better solution.
Comment 4•10 months ago (Reporter)
(In reply to Johannes from comment #3)
Let's keep this open in case we find a better solution.
Let's keep the needinfo for Aryx, then...
Comment 5•10 months ago
This came up in General triage, 23 days after the needinfo was set. Do we need to do anything with this bug at this point?
Comment 6•9 months ago (Reporter)
(In reply to Stephen Thompson [:sthompson] from comment #5)
This came up in General triage, 23 days after the needinfo was set. Do we need to do anything with this bug at this point?
I'd like a more permanent way of making sure that artifact builds don't descend into an orange-fest, hence the needinfo for Aryx.
I've pushed an empty try push to see how bad the situation is at the moment: https://treeherder.mozilla.org/jobs?repo=try&landoCommitID=137313
Comment 7•9 months ago
The Bugbug bot thinks this bug should belong to the 'Core::Disability Access APIs' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.
Comment 8•1 month ago
Moving to Testing::General since it sounds like we're considering adding artifact build test failures as tier 2 failures.
Comment 9•3 days ago (Assignee)
Run mochitest-browser-chrome, mochitest-chrome, mochitest-devtools-chrome,
mochitest-plain, and xpcshell tests against artifact builds to catch
perma-failures before they accumulate.
On Linux and Windows, these jobs run on mozilla-central and backstop pushes
on autoland. On macOS, they are try-only to preserve limited machine
capacity.
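The scheduling rule described above can be modelled roughly as follows. This is a hypothetical sketch for illustration only: `SUITES`, `projects_for`, and the platform and branch strings are assumptions, not the actual taskgraph configuration from the patch.

```python
# Hypothetical model of the scheduling rule in this comment; not the real
# taskgraph configuration. All names here are illustrative.
SUITES = [
    "mochitest-browser-chrome",
    "mochitest-chrome",
    "mochitest-devtools-chrome",
    "mochitest-plain",
    "xpcshell",
]


def projects_for(platform: str) -> list[str]:
    """Return where artifact-build test jobs would run for a platform."""
    if platform in ("linux", "windows"):
        # Every mozilla-central push, plus backstop pushes on autoland.
        return ["mozilla-central", "autoland (backstop pushes only)"]
    if platform == "macos":
        # Try-only, to preserve limited machine capacity.
        return ["try"]
    raise ValueError(f"unknown platform: {platform}")


for platform in ("linux", "windows", "macos"):
    print(platform, "->", projects_for(platform))
```

Whether the Linux and Windows jobs are also selectable on try is not stated above, so the sketch leaves that out.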
Comment 10•3 days ago (Assignee)
These debug artifact builds are needed so that we can run tier 2 tests
against them, matching what we already do for opt artifact builds.
Comment 11•3 days ago (Assignee)
The artifact-build kind had keep-artifacts: false, which prevented it from
uploading the build outputs that test jobs need to download.
Comment 12•3 days ago (Assignee)
The attached patch stack adds tier 2 test jobs on both opt and debug artifact builds on mozilla-central. That should be enough for sheriffs to notice new perma-fails and file bugs on them. I didn't schedule the jobs on autoland, to limit the extra cost.