Closed Bug 1492096 Opened 6 years ago Closed 6 years ago

2.32 - 7.59% tp5o / tp5o_webext (linux64, linux64-qr, osx-10-10) regression on push 87d667d7ad73b681872a320f184aa00e1a7a41fd (Sun Sep 16 2018)

Categories

(Core :: Widget, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla64
Tracking Status
firefox-esr60 --- unaffected
firefox62 --- unaffected
firefox63 --- unaffected
firefox64 --- fixed

People

(Reporter: jmaher, Assigned: ehsan.akhgari)

References

Details

(Keywords: perf, regression, talos-regression)

Talos has detected a Firefox performance regression from push:

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=87d667d7ad73b681872a320f184aa00e1a7a41fd

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

  8%  tp5o osx-10-10 opt e10s stylo            220.41 -> 237.14
  6%  tp5o_webext osx-10-10 opt e10s stylo     343.66 -> 364.93
  3%  tp5o linux64-qr opt e10s stylo           144.13 -> 148.99
  2%  tp5o_webext linux64 pgo e10s stylo       185.73 -> 190.04


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=15943

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Component: General → Widget
Product: Testing → Core
gecko profiles on osx which is the worse regression-

before:
https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.net%2Fv1%2Ftask%2FZZgdj7NKRnqtOmyMZJ7BBA%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_tp5o.zip

after:
https://perf-html.io/from-url/https%3A%2F%2Fqueue.taskcluster.net%2Fv1%2Ftask%2FExDZz0tQSC2hy57hGBEcKg%2Fruns%2F0%2Fartifacts%2Fpublic%2Ftest_info%2Fprofile_tp5o.zip

:ehsan, I see you authored the code that caused the regression, could you look into this to see if there is anything we can do to reduce/fix the regression?
Flags: needinfo?(ehsan)
and a AWSY regression:
== Change summary for alert #15945 (as of Sun, 16 Sep 2018 22:00:09 GMT) ==

Regressions:

  4%  Heap Unclassified osx-10-10 opt stylo     68,811,089.68 -> 71,664,311.16

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=15945
I have no immediate ideas...  Pushed some try talos runs to see backing out which revisions makes the numbers go back down...
So, a talos run between https://treeherder.mozilla.org/#/jobs?repo=try&revision=0cb82dc5e6b4e4218a8f88204a473fce0727db69 and https://treeherder.mozilla.org/#/jobs?repo=try&revision=0990be61c96909e18e99bca530c3cd58a6ca25ab shows that bug 1491594 is the culprit here!

<https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=0cb82dc5e6b4e4218a8f88204a473fce0727db69&newProject=try&newRevision=0990be61c96909e18e99bca530c3cd58a6ca25ab&framework=1>

I can't explain this result at all.  There is really nothing in the patch for bug 1491594 that should cause this, as far as I can tell.

The profiles in comment 1 are useless unfortunately since they lack symbols.  Joel, any way we can get profiles with symbols from Talos?  Any way I can run Tp5 locally?

Markus, can you think of what might have caused this?

I'll back out the patch for now...
Flags: needinfo?(ehsan) → needinfo?(jmaher)
The only behavioral difference I could find was in the nsBox constructor, where the old code was setting gGotTheme to true even when getting the theme failed. Maybe tp5o runs with native themes disabled, so now we make this call over and over and always get null?
I think Karl is right, that was a huge mistake on my part!
Oops! Good catch, Karl.
Let's mark this bug as FIXED (since the regression seems to have been fixed as far as I can tell <https://treeherder.mozilla.org/perf.html#/alerts?id=15991>) and continue this over in bug 1491594.
Assignee: nobody → ehsan
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
thanks for the fix :ehsan.

I looked at the profile, on the same revisions, I could retrigger linux64-qr tp5o and get profiles with symbols, doing more retriggers on talos jobs on osx all failed to have symbols.

:florian, is there a known issue getting profiles with symbols on osx?
Flags: needinfo?(jmaher) → needinfo?(florian)
(In reply to Joel Maher ( :jmaher ) (UTC-4) from comment #10)

> I looked at the profile, on the same revisions, I could retrigger linux64-qr
> tp5o and get profiles with symbols, doing more retriggers on talos jobs on
> osx all failed to have symbols.

Do you have the link to the jobs on treeherder? Looking at the logs may give a hint about what's not working.

> :florian, is there a known issue getting profiles with symbols on osx?

I'm not aware of any, maybe Markus knows?
Flags: needinfo?(florian) → needinfo?(mstange)
(In reply to Florian Quèze [:florian] from comment #11)
> > :florian, is there a known issue getting profiles with symbols on osx?
> 
> I'm not aware of any, maybe Markus knows?

There was a problem in the past that I was able to workaround by replacing the dump_syms_mac binary; not sure if that's the problem here. Experimentally pushed to try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=259b953d080b472bb2bbfcdeb98b49e13e7dc306
Flags: needinfo?(mstange)
Just to clarify, my needs have been met here already, but it would be nice to get to the bottom of the issue of the missing symbols for the next person, so I appreciate Markus looking into it.  :-)
I've filed bug 1493261 about the symbolication failure.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.