analyze performance regressions to determine if opt vs pgo provides value
Categories: Testing :: Performance, enhancement, P3
Tracking: Not tracked
People: Reporter: jmaher, Unassigned
Comment 1•6 years ago (Reporter)
My attempt at using Redash on the Treeherder database to collect related data:
select ps.suite,
       mp.platform,
       opt.name,
       pas.id,
       pas.last_updated,
       p.revision,
       p.time
from
  performance_alert_summary as pas,
  performance_alert as pa,
  performance_signature as ps,
  machine_platform as mp,
  option_collection as oc,
  `option` as opt,
  push as p
where
  pas.push_id = p.id and
  opt.id = oc.option_id and
  oc.id = ps.option_collection_id and
  mp.id = ps.platform_id and
  pas.framework_id in (1, 4, 10, 11) and
  -- status filter: not invalid, not downstream, not untriaged
  pas.status in (2, 4, 5, 6, 7) and
  ps.platform_id in (104, 15, 123, 207, 208, 249, 420) and
  pas.last_updated > '2018-01-01' and
  pa.summary_id = pas.id and
  pa.series_signature_id = ps.id
limit 10000;
Now I need to look at the resulting data and see whether it matches up for opt vs pgo. I might have to adjust the pas.status filter in this query in the future, as I limited it to statuses that are not invalid, not downstream, and not untriaged.
I will update this more when I get some clarity.
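One way to check whether the rows match up for opt vs pgo is to group them and see which build types alerted. This is a minimal sketch, assuming each result row is available as a dict keyed by the query's columns (`revision`, `suite`, `platform`, and `name` for the option name). The function name and the sample rows are hypothetical, and matching on the exact same revision is a simplifying assumption.

```python
from collections import defaultdict


def split_by_build_type(rows):
    """Group alert rows by (revision, suite, platform) and record which build types alerted."""
    seen = defaultdict(set)
    for row in rows:
        key = (row["revision"], row["suite"], row["platform"])
        seen[key].add(row["name"])  # option name, e.g. "opt" or "pgo"

    both = sorted(k for k, v in seen.items() if {"opt", "pgo"} <= v)
    opt_only = sorted(k for k, v in seen.items() if "opt" in v and "pgo" not in v)
    pgo_only = sorted(k for k, v in seen.items() if "pgo" in v and "opt" not in v)
    return both, opt_only, pgo_only


# Hypothetical sample rows for illustration, not real Treeherder data.
rows = [
    {"revision": "abc123", "suite": "tp5o", "platform": "linux64", "name": "opt"},
    {"revision": "abc123", "suite": "tp5o", "platform": "linux64", "name": "pgo"},
    {"revision": "def456", "suite": "tresize", "platform": "windows10-64", "name": "opt"},
]
both, opt_only, pgo_only = split_by_build_type(rows)
```

The `opt_only` and `pgo_only` lists are the candidates to review by hand.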
Comment 2•6 years ago (Reporter)
I found that this was confusing and created a script to generate alerts:
https://github.com/jmaher/randomtools/tree/master/doalerts
Doing this shows a lot of differences between opt and pgo over the last 6 months (looking at raptor and talos on linux64/win7/win10 for mozilla-inbound). In fact, there are 173 differences (alerts && improvements), and I looked at all of them. They fall into a few buckets:
- alerts on autoland, but missed on inbound (rare, but often enough)
- alerts for test changes or build changes, often build changes affect opt || pgo, not both
- alerts on opt || pgo where one signal is stronger than the other (e.g. opt shows 4% and pgo shows 1.8%, so there is no alert on pgo)
- alerts on noise or changing modalities
- alerts for many tests for same root cause, but missing opt or pgo for a single platform
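The "one signal is stronger than the other" bucket can be illustrated with a simple percentage cutoff. This is only a sketch: the 2% threshold, the function names, and the sample values are assumptions for illustration, and it models nothing beyond a plain percentage check.

```python
ALERT_THRESHOLD_PCT = 2.0  # assumed cutoff for illustration, not taken from this bug


def percent_change(before, after):
    """Absolute change between two measurements, as a percentage of the earlier one."""
    return abs(after - before) / before * 100.0


def would_alert(before, after, threshold_pct=ALERT_THRESHOLD_PCT):
    """True when the change is at least as large as the threshold."""
    return percent_change(before, after) >= threshold_pct


# Hypothetical measurements: the same change shows up as 4% on opt and 1.8% on pgo.
opt_alerts = would_alert(100.0, 104.0)
pgo_alerts = would_alert(100.0, 101.8)
```

Under these assumptions the opt change crosses the cutoff while the pgo change does not, so only one build type produces an alert for the same underlying change.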
What I looked to answer is:
Will we miss any regressions that we ship (i.e. PGO) that we got alerts for on opt but not on pgo?
- yes, but in every case a bug was filed for other alerts that would still have been detected (pgo on other platforms), almost always on the same or a related test.
I also looked for:
Will we get regressions on PGO where opt doesn't regress as well?
- no: all pgo-only regressions that we got alerted on were mirrored in opt (except backouts with a corresponding prior improvement), which reduces the concern about a regression caused by the pgo black box.
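The first question can be framed as a coverage check: for each opt alert, is there a pgo alert for the same push on any platform? This is a sketch under stated assumptions: alerts are modeled as (revision, test, platform, build_type) tuples, and the function name and sample data are hypothetical.

```python
def uncovered_opt_alerts(alerts):
    """Return opt alerts that have no pgo alert on the same revision (any platform, any test)."""
    pgo_revisions = {rev for rev, _test, _platform, build in alerts if build == "pgo"}
    return [a for a in alerts if a[3] == "opt" and a[0] not in pgo_revisions]


# Hypothetical sample alerts for illustration, not real data.
alerts = [
    ("abc123", "tp5o", "linux64", "opt"),         # opt alert with no pgo alert on linux64
    ("abc123", "tp5o", "windows10-64", "pgo"),    # pgo alerted on another platform
    ("def456", "tresize", "windows7-32", "opt"),  # no pgo alert anywhere
]
missed = uncovered_opt_alerts(alerts)
```

Anything left in `missed` would be a regression seen only on opt with no pgo alert pointing at the same push.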
:davehunt, do you have any concerns with my conclusion that stopping perf testing on opt will not result in missed regressions?
Comment 3•6 years ago
You're the expert here, Joel, so I trust your judgement. I do appreciate your analysis and find your case compelling. I wonder what signal(s) we might use to determine if we need to revert a decision such as this?
Comment 4•5 years ago
Raptor tests stopped running against opt builds in bug 1565644.
Updated•5 years ago (Reporter)