Closed Bug 1305481 Opened 8 years ago Closed 8 years ago

2.11 - 22.28% tp5o / tp5o % Processor Time / tp5o responsiveness (windowsxp) regression on push d5355738ce1edf58cabee849024c7bd02c641b77 (Fri Sep 23 2016)

Categories

(Release Engineering :: Applications: MozharnessCore, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: jmaher, Unassigned)

References

Details

(Keywords: perf, regression, talos-regression)

Talos has detected a Firefox performance regression from push d5355738ce1edf58cabee849024c7bd02c641b77. Since you authored one of the patches included in that push, we need your help to address this regression.

Regressions:

 22%  tp5o responsiveness windowsxp pgo       50.68 -> 61.97
 20%  tp5o responsiveness windowsxp opt       86.95 -> 104.45
 18%  tp5o summary windowsxp opt              383.93 -> 451.82
 15%  tp5o summary windowsxp pgo              273.76 -> 315.12
  3%  tp5o % Processor Time windowsxp pgo     73.64 -> 75.65
  2%  tp5o % Processor Time windowsxp opt     80.29 -> 81.98
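
For reference, the percentages above correspond to the relative change between the before and after values. A minimal sketch of that arithmetic in Python (assuming plain relative change over the rounded numbers shown here, not necessarily the exact Perfherder aggregation):

    # Recompute the reported regression percentages from the before -> after
    # values listed above (assumption: plain relative change).
    results = {
        "tp5o responsiveness windowsxp pgo": (50.68, 61.97),
        "tp5o % Processor Time windowsxp opt": (80.29, 81.98),
    }
    for name, (before, after) in results.items():
        pct = (after - before) / before * 100
        print(f"{name}: {pct:.2f}%")
    # Prints ~22.28% and ~2.10%; the "2.11 - 22.28%" range in the bug title is
    # presumably derived from the unrounded alert values.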


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=3412

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
:gps, I did a lot of retriggers and this is related to the series of patches here:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=27a183a2527cc4190691595c782b6fe53512af1b&tochange=d5355738ce1edf58cabee849024c7bd02c641b77

I know these are infrastructure changes and this only seems to affect Windows XP tp.  I am not sure there is much to do here, but I would really like to understand why this happened and know that we didn't do something silly by accident.
Flags: needinfo?(gps)
Component: Untriaged → Mozharness
Product: Firefox → Release Engineering
QA Contact: jlund
O_o

If my patches are responsible for this, that implies that the Python or Python packages in use are impacting timings somehow. That would be very intriguing.

Let me dig into the logs...
I don't see anything obviously different in the logs.

jmaher: I'll need to pick your brain about how Talos works to rule out a few possibilities. I think you're away doing trust falls this week, so not sure when you'll be around. Let's try to Vidyo sometime soon.

In the meantime, I'll try looking harder.
This regression reminds me of something similar to what we saw with my download_unpack() changes for Mozharness a while ago, where we also weren't able to determine why they affected Talos test execution times. See bug 1295226 for reference.
I'm thinking this had something to do with the incremental build and not something related to my patches. I retriggered a bunch of build jobs, which will get us Talos results from multiple binaries. Hopefully that deflects blame away from me. But since PGO is supposed to be a clobber build and PGO showed a regression, my confidence isn't as high as I'd like.
(In reply to Gregory Szorc [:gps] from comment #5)
> I'm thinking this had something to do with the incremental build and not
> something related to my patches. I retriggered a bunch of build jobs, which
> will get us Talos results from multiple binaries. Hopefully that deflects
> blame away from me. But since PGO is supposed to be a clobber build and PGO
> showed a regression, my confidence isn't as high as I'd like.

In the build system PGO is not a clobber. I don't know if we do something special at the buildbot build scheduling level to make that so.
:coop, can you verify that we will stop testing Windows XP builds on trunk when Firefox 52 merges to Aurora?  It would help make closing this as wontfix a lot easier.
Flags: needinfo?(coop)
(In reply to Joel Maher ( :jmaher) from comment #7)
> :coop, can you verify that we will stop testing Windows XP builds on trunk
> when Firefox 52 merges to Aurora?  It would help make closing this as
> wontfix a lot easier.

That is correct. We can disable XP on m-c when 52 is on Aurora.
Flags: needinfo?(coop)
Thanks :coop.  :gps, as we discussed on IRC, I am closing this as wontfix: this is harness/infra related, not product related, and WinXP is "lower priority", especially given that this really doesn't affect the product.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(gps)
Resolution: --- → WONTFIX