Autoclassify calls 100% occurring failures on a push intermittent if it has ever seen a push where they were intermittent

NEW
Unassigned

Status

2 years ago
20 days ago

People

(Reporter: Ehsan, Unassigned)

Description

2 years ago
See this try push for example: <https://treeherder.mozilla.org/#/jobs?repo=try&revision=4dc3cf4f0b9e8c4c9ee1cd8002cd554955e6c4f6>

Here, devtools/client/debugger/test/mochitest/browser_dbg_sources-webext-contentscript.js failed on every platform and configuration it ran on, yet it was mis-starred as intermittent on OS X and Windows, but not on Linux!
Wouldn't that be neat, if we ran the same number of chunks on every platform, and had so few disabled-per-platform tests that the chunking was the same between platforms?

dt8 and dt6 on asan, dt6 and dt2 on mac, dt8 and dt2 on Win7 opt, dt2 and dt1 on Win7 debug.
Summary: TreeHerder mis-stars 100% occurring failures → Autoclassify calls 100% occurring failures on a push intermittent if it has ever seen a push where they were intermittent

Comment 3

2 years ago
Is this due to detect-intermittents? If so, it's just been disabled in bug 1301434, so this is a dupe of that.
Flags: needinfo?(james)
In principle no, because a sheriff starring a bug that is intermittent at first and later becomes a permafail leads to the same outcome. OTOH it may occur less frequently when that task is disabled.

But I don't really want to spend time special-casing 100% permafails because the actual problem seems to be "tests that fail too often should be fixed with urgency, and may lead to tree closures". "Too often" here could be 50% or lower. That seems like a sub-problem of the larger "detect when the frequency of an intermittent increased" problem, which should be solved anyway. So I agree this is a real problem, but I don't think the fix implied by the bug summary is the right one.
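To make the "fails too often" idea concrete, here is a rough sketch of a per-push frequency check. It is not Treeherder code: the `(test_name, job_name, passed)` tuple shape, the function names, and the 0.5 default threshold are all assumptions chosen for illustration.

```python
from collections import defaultdict


def failure_rates(results):
    """Return {test_name: fraction of runs that failed} for one push.

    `results` is a hypothetical iterable of (test_name, job_name, passed)
    tuples; this is not Treeherder's actual data model.
    """
    runs = defaultdict(int)
    failures = defaultdict(int)
    for test, _job, passed in results:
        runs[test] += 1
        if not passed:
            failures[test] += 1
    return {test: failures[test] / runs[test] for test in runs}


def too_frequent(results, threshold=0.5):
    """List tests whose failure rate on this push is at or above `threshold`.

    The 0.5 default is only an example; "too often" could well be lower.
    """
    rates = failure_rates(results)
    return sorted(test for test, rate in rates.items() if rate >= threshold)
```

A test that fails on every job it ran on gets a rate of 1.0 and is flagged regardless of whether it has ever been starred as intermittent, so the permafail case falls out of the general frequency check without being special-cased.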
Flags: needinfo?(james)
This feels like a UI problem to me, in that the star next to the jobs gives the user the false impression that a permafail is just a "known intermittent" (certainly this was my first impression looking at Ehsan's push).

A "test centric" view of results would be one possible solution to this problem (see bug 1059770), since it would probably highlight the fact that it's one test in particular that's failing (rather than the usual pattern of orange that we see on every push these days).
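As a rough illustration of what "test centric" could mean, the sketch below pivots a list of failed jobs into a per-test listing. The `(job_name, test_name)` input shape and the function name are hypothetical and are not taken from Treeherder or bug 1059770.

```python
from collections import defaultdict


def failures_by_test(failed_jobs):
    """Group failures by test instead of by job.

    `failed_jobs` is a hypothetical iterable of (job_name, test_name) pairs,
    one per failure. Returns a list of (test_name, [job_names]) tuples with
    the most widespread failures first; ties are broken by test name.
    """
    grouped = defaultdict(set)
    for job, test in failed_jobs:
        grouped[test].add(job)
    return sorted(
        ((test, sorted(jobs)) for test, jobs in grouped.items()),
        key=lambda item: (-len(item[1]), item[0]),
    )
```

Fed the failures from a push like the one above, a single test failing across asan, mac, and Win7 jobs would sort to the top of the list instead of showing up as scattered starred oranges.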
See Also: → bug 1059770

Updated

a year ago
Component: Treeherder → Treeherder: Log Parsing & Classification