Open Bug 1960664 Opened 6 months ago Updated 3 months ago

[meta] Run an analysis to find high-value metrics/tests

Categories

(Testing :: Raptor, task, P2)

Tracking

(Not tracked)

People

(Reporter: sparky, Unassigned)

Details

(Keywords: meta, Whiteboard: [fxp])

This bug is for running a new analysis to find high value tests/metrics, and see if there are any tests that we could disable or metrics that we should not alert on.

In the past, we've used this code to determine "high-value" tests, not metrics specifically, but it could be modified to do either tests or metrics by changing up the query that we're using to get the data: https://github.com/mozilla/mozperftest-tools/blob/master/high-value-tests/generate_high_value_tests.py

The query used: https://github.com/mozilla/mozperftest-tools/blob/master/high-value-tests/sql_query.txt

The query was never updated to work in PostgreSQL, so it would need some fixes. It also only looks at alerts that were fixed by backout, but we should see if we could use the fixed status as well.

See this bug for a potential test to remove: bug 1964298

See Also: → 1964298

This redash query may be helpful for us to determine when a test should be removed (e.g. no FIXED resolutions): https://sql.telemetry.mozilla.org/queries/108327/source

By the way, I recently attempted a PostgreSQL version of https://github.com/mozilla/mozperftest-tools/blob/master/high-value-tests/sql_query.txt, looking for Android high-value tests:
https://sql.telemetry.mozilla.org/queries/108361/source

But I think I made a mistake somewhere: it only shows me sp3 results in the output. I haven't looked too closely, but hopefully this is a good base for the PostgreSQL version to add metrics and fixed status.

Nice! Thanks for fixing it for postgresql :kshampur :)

So I think the issue is related to AND summary.status = 8. I'm pretty sure we've had other alerts backed out in that time, but maybe the resolution isn't being added to the alert summaries properly anymore.

I have noticed that a lot of alert summaries aren't being updated in a timely manner as well. That's why I had to make the query in comment #2. We should have been able to use the performance_alert_summary table for that, but many alerts don't have the bug resolution.

Maybe we could use the fixed status instead? That's summary.status = 7. I also wonder if we should just get a field added to the alerts that signifies if it's a valid detection or an invalid detection since that would provide more value. Currently, the wontfix status also provides a signal since it may be a valid regression, but the developers decided it's not worth fixing for some reason. But we also use wontfix for things like harness changes so it introduces some noise there.
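
To make the effect of the status filter concrete, here is a small Python sketch that ranks tests by how many alerts carry a "valid detection" status. It assumes alerts are available as (test name, summary status) pairs; the function name `rank_tests`, the constant names, and the sample rows are hypothetical. Only the values 7 (fixed) and 8 (the status the current query filters on) come from the discussion above.

```python
from collections import Counter

FIXED = 7       # summary.status = 7, as mentioned above
BACKED_OUT = 8  # summary.status = 8, the filter in the current query


def rank_tests(alerts, statuses=(BACKED_OUT,)):
    """Count alerts per test whose status is in `statuses`, highest first.

    `alerts` is an iterable of (test_name, summary_status) pairs; this shape
    is an assumption for illustration.
    """
    counts = Counter(test for test, status in alerts if status in statuses)
    return counts.most_common()


# Hypothetical sample data.
alerts = [
    ("sp3", BACKED_OUT),
    ("sp3", BACKED_OUT),
    ("test-b", FIXED),
    ("test-b", FIXED),
    ("test-b", FIXED),
    ("test-c", FIXED),
]
backout_only = rank_tests(alerts)
with_fixed = rank_tests(alerts, statuses=(FIXED, BACKED_OUT))
```

With the backout-only filter, a single test can dominate the output; widening the filter to include the fixed status surfaces tests whose regressions were fixed without a backout.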

See Also: → 1779879
See Also: → 1973554
Summary: Run an analysis to find high-value metrics/tests → [meta] Run an analysis to find high-value metrics/tests