Closed Bug 1514829 Opened 6 years ago Closed 5 years ago

analyze performance regressions to determine if opt vs pgo provides value

Categories

(Testing :: Performance, enhancement, P3)

Version 3
enhancement

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jmaher, Unassigned)

Details

Currently we run performance tests on both opt and pgo builds of linux64, windows7 and windows10. I am looking at stopping all testing on opt builds and testing only on pgo, as our unittests don't find regressions there. What does that mean for performance tests? Currently there are 3 buckets a perf test can fall into between opt/pgo:

  1. we see the same changes on both
  2. we see alerts on PGO but not on opt (hard to translate to a root cause, since the code that landed didn't regress anything specifically)
  3. we see alerts on opt but not on PGO (easy to find the root cause here, but harder to assign priority, since we wouldn't ship with this regression)

I would like to look at the last 12 months of regressions and see how many fall into buckets 1, 2, and 3. That would help us determine if we should make any changes in what we run performance tests on.
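A minimal sketch of how the three buckets could be counted, assuming each alert can be reduced to a (revision, platform, test, build type) tuple. The function name, the tuple shape, and the sample data are hypothetical and only for illustration. Matching opt and pgo alerts on the exact same revision is also an assumption; in practice the two may alert on nearby pushes and need fuzzier matching.

```python
from collections import Counter, defaultdict


def bucket_alerts(alerts):
    """Count alerts per bucket: 1 = opt and pgo, 2 = pgo only, 3 = opt only.

    `alerts` is an iterable of (revision, platform, test, build_type)
    tuples, where build_type is "opt" or "pgo" (hypothetical shape).
    """
    build_types_seen = defaultdict(set)
    for revision, platform, test, build_type in alerts:
        build_types_seen[(revision, platform, test)].add(build_type)

    counts = Counter()
    for build_types in build_types_seen.values():
        if build_types == {"opt", "pgo"}:
            counts[1] += 1
        elif build_types == {"pgo"}:
            counts[2] += 1
        elif build_types == {"opt"}:
            counts[3] += 1
    return counts


# Made-up sample data, one alert group per bucket.
sample = [
    ("rev_a", "linux64", "tp5o", "opt"),
    ("rev_a", "linux64", "tp5o", "pgo"),
    ("rev_b", "windows10", "tp5o", "pgo"),
    ("rev_c", "windows7", "tp5o", "opt"),
]
print(dict(bucket_alerts(sample)))
```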

My attempt at using Redash on the Treeherder database to collect related data:
select ps.suite,
mp.platform,
opt.name,
pas.id,
pas.last_updated,
p.revision,
p.time

from
   performance_alert_summary as pas,
   performance_alert as pa,
   performance_signature as ps,
   machine_platform as mp,
   option_collection as oc,
   `option` as opt,
   push as p
where
   pas.push_id = p.id and
   opt.id = oc.option_id and
   oc.id = ps.option_collection_id and
   mp.id = ps.platform_id and
   pas.framework_id in (1,4,10,11) and
   pas.status in (2,4,5,6,7) and
   ps.platform_id in (104, 15, 123, 207, 208, 249, 420) and
   pas.last_updated>'2018-01-01' and
   pa.summary_id=pas.id and
   pa.series_signature_id=ps.id
limit 10000;

Now I need to look at the resulting data and see if it matches up for opt vs pgo. I might have to adjust this query in the future for pas.status, as I limited it to statuses that are not invalid, downstream, or untriaged.
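As a rough sketch of that comparison step, assuming the query results are exported as tuples in the same column order as the select list (suite, platform, option name, alert summary id, last_updated, revision, push time). The function name and the sample rows are hypothetical, and the ids and dates are made up.

```python
from collections import defaultdict


def summaries_by_build(rows):
    """Count distinct alert summaries per (suite, platform), split by opt/pgo.

    Each row follows the query's select order:
    (suite, platform, option_name, summary_id, last_updated, revision, time).
    """
    table = defaultdict(lambda: {"opt": set(), "pgo": set()})
    for suite, platform, option_name, summary_id, _updated, _rev, _time in rows:
        # Ignore other build types (e.g. debug) that may be in the data.
        if option_name in ("opt", "pgo"):
            table[(suite, platform)][option_name].add(summary_id)
    return {
        key: {build: len(ids) for build, ids in per_build.items()}
        for key, per_build in table.items()
    }


# Made-up rows; a summary with several alerts appears more than once.
rows = [
    ("tp5o", "linux64", "opt", 101, "2018-06-01", "rev_a", "2018-05-31"),
    ("tp5o", "linux64", "pgo", 102, "2018-06-01", "rev_a", "2018-05-31"),
    ("tp5o", "linux64", "pgo", 102, "2018-06-01", "rev_a", "2018-05-31"),
    ("tp5o", "linux64", "debug", 103, "2018-06-02", "rev_b", "2018-06-01"),
]
print(summaries_by_build(rows))
```

A large gap between the opt and pgo counts for the same suite and platform would be a hint that the two build types are not alerting on the same changes.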

I will update this more when I get some clarity

I found that this was confusing and created a script to generate alerts:
https://github.com/jmaher/randomtools/tree/master/doalerts

In doing this I found a lot of differences between opt and pgo in the last 6 months (looking at raptor and talos on linux64/win7/win10 for mozilla-inbound). In fact there are 173 differences (alerts && improvements), and I looked at all of them. They fall into a few buckets:

  1. alerts on autoland, but missed on inbound (rare, but often enough)
  2. alerts for test changes or build changes, often build changes affect opt || pgo, not both
  3. alerts on opt || pgo where one signal is stronger than the other (e.g. opt shows 4% and pgo shows 1.8%, so pgo gets no alert)
  4. alerts on noise or changing modalities
  5. alerts for many tests for same root cause, but missing opt or pgo for a single platform
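Bucket 3 above can be illustrated with a small sketch. It assumes a simple percent-change cutoff of 2%, which is an assumption made for illustration and not taken from this bug; real alert generation also involves statistical checks that are left out here. The function name and the numbers are hypothetical.

```python
def would_alert(before, after, threshold_pct=2.0):
    """Return True if the relative change meets the (assumed) alert threshold."""
    change_pct = abs(after - before) / before * 100.0
    return change_pct >= threshold_pct


# Same landing, different signal strength on each build type (made-up values).
opt_alerts = would_alert(before=100.0, after=104.0)  # 4% change
pgo_alerts = would_alert(before=100.0, after=101.8)  # 1.8% change
print(opt_alerts, pgo_alerts)
```

With a cutoff like this, the same underlying change produces an alert on opt only, even though pgo moved in the same direction.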

What I looked to answer is:
Will we miss any regressions that we ship (i.e. PGO) that we got alerts for on opt but not on pgo?

  • yes, but in all cases a bug was filed from other alerts that would still have been detected (pgo on other platforms), almost every time on the same or a related test.

I also looked for:
Will we get regressions on PGO where opt doesn't regress as well?

  • no: all pgo-only regressions that we got alerted on were mirrored in opt (except backouts with a corresponding prior improvement), which reduces the concern about a regression caused by the pgo black box.

:davehunt, do you have concerns with me coming to this conclusion that stopping perf testing on opt will not result in missed regressions?

Flags: needinfo?(dave.hunt)

You're the expert here Joel, so I trust your judgement. I do appreciate your analysis and find your case compelling. I wonder what signal(s) we might use to determine if we need to revert a decision such as this?

Flags: needinfo?(dave.hunt)

Raptor tests stopped running against opt builds in bug 1565644.

:jmaher, can we close this bug now?

Priority: -- → P3
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED