Open Bug 1506593 Opened 7 years ago Updated 3 years ago

Allow setting different alert thresholds for subtests

Categories

(Testing :: Talos, enhancement, P3)

Version 3
enhancement

Tracking

(Not tracked)

People

(Reporter: jmaher, Unassigned)

Details

Currently we have a way to set a single threshold for all tests: https://searchfox.org/mozilla-central/source/testing/talos/talos/output.py#126 but in the case of DAMP we would like to look at setting this dynamically, as some subtests are noisier than others. For that we would need either data coming from the test itself or a metadata file that supports this.
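To make the idea concrete, here is a minimal sketch of a per-subtest lookup that falls back to the suite-wide value. This is not the existing Talos code: the function names, the override table, the subtest name, the default of 2.0 and the `alertThreshold` key on each subtest entry are all assumptions for illustration. The overrides could equally come from the test itself or from a metadata file.

```python
# Sketch only: names, values, and the output shape below are assumptions,
# not the actual Talos/Perfherder implementation.

DEFAULT_ALERT_THRESHOLD = 2.0  # percent; assumed suite-wide default

# Hypothetical per-subtest overrides, e.g. loaded from a metadata file.
SUBTEST_THRESHOLDS = {
    "noisy.subtest.example": 5.0,
}


def resolve_alert_threshold(subtest_name, overrides=None,
                            default=DEFAULT_ALERT_THRESHOLD):
    """Return the override for a subtest, or the suite-wide default."""
    if overrides is None:
        overrides = SUBTEST_THRESHOLDS
    threshold = float(overrides.get(subtest_name, default))
    if threshold <= 0:
        raise ValueError(
            "alert threshold for %s must be positive" % subtest_name
        )
    return threshold


def build_subtests(values_by_name, overrides=None,
                   default=DEFAULT_ALERT_THRESHOLD):
    """Build subtest entries, each carrying its own alert threshold."""
    return [
        {
            "name": name,
            "value": value,
            "alertThreshold": resolve_alert_threshold(
                name, overrides, default
            ),
        }
        for name, value in sorted(values_by_name.items())
    ]
```

With this shape, a subtest without an entry in the override table keeps the current behaviour, so only the noisy subtests need to be listed.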

:igoldan are we seeing a lot of invalid alerts for the damp subtests? This seems like a reasonable way to reduce that for any subtests that have a large amount of noise.

Flags: needinfo?(igoldan)
Priority: -- → P3

(In reply to Dave Hunt [:davehunt] [he/him] ⌚️UTC from comment #1)

:igoldan are we seeing a lot of invalid alerts for the damp subtests? This seems like a reasonable way to reduce that for any subtests that have a large amount of noise.

I need to do a database query to figure this out.

Finished the query. Seems like we had 30 invalid alerts on devtools framework.

Flags: needinfo?(igoldan)

(In reply to Ionuț Goldan [:igoldan], Performance Sheriff from comment #3)

Finished the query. Seems like we had 30 invalid alerts on devtools framework.

Over how much time?

Flags: needinfo?(igoldan)

(In reply to Dave Hunt [:davehunt] [he/him] ⌚️UTC from comment #4)

(In reply to Ionuț Goldan [:igoldan], Performance Sheriff from comment #3)

Finished the query. Seems like we had 30 invalid alerts on devtools framework.

Over how much time?

From 2018 Aug 21 up to 2018 Oct 13. Then no more invalid damp alerts.

Flags: needinfo?(igoldan)

(In reply to Ionuț Goldan [:igoldan], Performance Sheriff from comment #5)

From 2018 Aug 21 up to 2018 Oct 13. Then no more invalid damp alerts.

It sounds like this is not a current concern then. Let's keep this open as having thresholds for subtests sounds like a reasonable enhancement.

Summary: allow damp test to have different thresholds for subtests → Allow setting different alert thresholds for subtests

Note that since we took over the manual sheriffing of DAMP, we probably don't do it as well as it is done for the other test suites.

Personally, I tend to use our custom dashboard instead, available here:
https://firefox-dev.tools/performance-dashboard/
It lets me quickly scan all the subtests rather than opening each subtest graph one by one.

As a result, I'm not sure we ever flagged any alert as INVALID, even though I imagine that a significant part of them are.
Note also that, as we aren't able to be fully effective with the alerts dashboard:
https://treeherder.mozilla.org/perf.html#/alerts?status=0&framework=12&hideDwnToInv=1&page=1
we are not flagging "valid" alerts correctly either...

Severity: normal → S3