6.55 - 13.56% about_preferences_basic (linux64, osx-10-10) regression on push 95eb6ee7690b0c326bfb1db0c12d37123fdc3fff (Wed Feb 28 2018)

RESOLVED FIXED in Firefox 60

Status

()

defect
RESOLVED FIXED
a year ago
a year ago

People

(Reporter: igoldan, Assigned: gandalf)

Tracking

(Blocks 1 bug, {perf, regression, talos-regression})

unspecified
mozilla60
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox-esr52 unaffected, firefox58 unaffected, firefox59 unaffected, firefox60 fixed)

Details

Talos has detected a Firefox performance regression from push:

https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=6ba30e00066492bdd9fe24d9db3e8bc5420a4e92&tochange=95eb6ee7690b0c326bfb1db0c12d37123fdc3fff

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

 14%  about_preferences_basic osx-10-10 opt e10s stylo     218.32 -> 247.92
  7%  about_preferences_basic linux64 opt e10s stylo       147.01 -> 156.63


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=11818

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Buildbot/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Buildbot/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Component: Untriaged → Internationalization
Product: Firefox → Core
:gandalf Changes from bug 1441733 showed a perf regression on all desktop platforms. What can we do to resolve this?
Flags: needinfo?(gandalf)
:igoldan - I monitored the regression range and it doesn't seem like my patch was the only one in the range.

Here's the same regression on tpaint - https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central,1537978,1,1&series=mozilla-central,1538533,1,1
Here's the same regression on ts_paint - https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central,1537977,1,1&series=mozilla-central,1538532,1,1&series=mozilla-central,1661411,1,1&series=mozilla-central,1661492,1,1&series=mozilla-central,1538077,1,1&series=mozilla-central,1538692,1,1

Initially I was worried that it was specific to my patch, but with so many regressions on the startup path (which my patch doesn't touch) I doubt it.

Can you please tell me why did you report it as preferences talos test regression only, and ts_paint/tpaint regressions not despite being in the same range?
Flags: needinfo?(gandalf) → needinfo?(igoldan)
I run tests that confirm that there indeed is a regression from the XUL patch. It doesn't fully match the characteristic of the regression from the alert, but backout of bug 1435912 fixes it.

The source of the regression is the recently landed stylo-chrome patch. It changes how we resolve XBL bindings from lazy to eagerly which impacts the DOM paths used by Fluent. For details see bug 1441037.

The ultimate solution here is for XBL to go away (bug 1397874) and stylo to stop querying for XBL entries. The stylo team confirmed that the Web Components which we are moving to do not have the same impact on performance and we should be good there.

Unfortunately, not only do we have to remove XBL but we also have to remove support for XBL and that may take a bit longer if we'll want to give time to Tb and Sm to catch up.

That means that we're going to need a temporary workaround.
I've been working on a patch in bug 1363862 which shifts the querying and applying translations to C++ and in result avoids hitting those XBL checking paths.

I intended to work on it with a timeline safely into 61 as a prerequisite for placing Fluent on the startup path, but depending on how ok we are with this regression we may want to rush it.
I confirmed that the patch from that bug removes the regression.

Here's a graph of three talos runs:

1) current mozilla-central
2) mozilla-central with bug 1435912 backed out 
3) mozilla-central with bug 1363862 applied

https://pike.github.io/talos-compare/?revision=67b4498d75f9&revision=68cc89e4139e&revision=2f59d6fc4d35


I'm going to focus on getting the patch in bug 1363862 ready for landing and we can assess if we should rush it into 60 or wait till 61.
Depends on: 1363862
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #2)
> :igoldan - I monitored the regression range and it doesn't seem like my
> patch was the only one in the range.
> 
> Here's the same regression on tpaint -
> https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central,
> 1537978,1,1&series=mozilla-central,1538533,1,1
> Here's the same regression on ts_paint -
> https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central,
> 1537977,1,1&series=mozilla-central,1538532,1,1&series=mozilla-central,
> 1661411,1,1&series=mozilla-central,1661492,1,1&series=mozilla-central,
> 1538077,1,1&series=mozilla-central,1538692,1,1
> 
> Initially I was worried that it was specific to my patch, but with so many
> regressions on the startup path (which my patch doesn't touch) I doubt it.
> 
> Can you please tell me why did you report it as preferences talos test
> regression only, and ts_paint/tpaint regressions not despite being in the
> same range?

The 2 regressions hinted are happening on _Windows platforms only_. They were caused by the backout of
bug 1431161 and were _many_ other kinds of regressions because of it -- probably all Talos tests we run on Windows [1].
The tpaint and ts_paint regressions don't reproduce on Linux and OS X.

The about_preferences_basic regression was the only one that happened on multiple desktop platforms.
I think it would have reproduced on Windows also, but I see there are some technical issues -- Treeherder
shows the perf tests finished running with success, still I don't see any results for them. I will file a
separate bug for that matter.

[1] https://treeherder.mozilla.org/perf.html#/alerts?id=11817
Flags: needinfo?(igoldan)
(In reply to Ionuț Goldan [:igoldan], Performance Sheriffing from comment #4)
> I think it would have reproduced on Windows also, but I see there are some
> technical issues -- Treeherder
> shows the perf tests finished running with success, still I don't see any
> results for them. I will file a
> separate bug for that matter.
> 
> [1] https://treeherder.mozilla.org/perf.html#/alerts?id=11817

:jmaher Do you know something about this issue?
i.e This is a perf graph [1]. If you look at the corresponding Treeherder page [2], you can see that the T-e10s(o) suite from changeset 1bc5d8dfe84a has "Performance" data, but the previous changeset 5347f06b4438 hasn't.

[1] https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-inbound,1644157,1,1&highlightedRevisions=1bc5d8dfe84a
[2] https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&group_state=expanded&filter-searchStr=windows%2010%20x64%20pgo%20t-e10s(o)&tochange=1bc5d8dfe84ad29bbe0b6248adacec7ac679f401&fromchange=ef7b22a78379bf809533d4061937d6205a8c3d33&selectedJob=164898776
Flags: needinfo?(jmaher)
this was when we had across the board hardware issues and had to revert back.  Revision 5347f06b4438 ran on the new moonshots (didn't run originally because they were all broken), and revision 1bc5d8dfe84a is on the old IX hardware.
Flags: needinfo?(jmaher)
The patch in bug 1363862 has landed and it wins us even more than a backout of the patch from bug 1435912 would.

https://pike.github.io/talos-compare/?revision=3230251ec0b5&revision=34d2d72e4afa&revision=768a021719d7

I hope it'll show up on perherder and we'll be able to close this regression bug!
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #8)
> Ionut, can you confirm that the regression is gone -
> https://treeherder.mozilla.org/perf.html#/graphs?series=mozilla-central,
> 1660577,1,1&series=mozilla-central,1660607,1,1 ?

Yep, I confirm it is gone. Even better, I see improvements when comparing against values up to Feb 28.
Flags: needinfo?(igoldan)
Status: NEW → RESOLVED
Last Resolved: a year ago
Resolution: --- → FIXED
Here are the new perf results. Thanks a lot for landing bug 1363862!

== Change summary for alert #11929 (as of Mon, 05 Mar 2018 00:09:39 GMT) ==

Improvements:

 13%  about_preferences_basic windows10-64 pgo e10s stylo     234.30 -> 203.67
 12%  about_preferences_basic windows7-32 opt e10s stylo      289.72 -> 255.08
 12%  about_preferences_basic windows7-32 pgo e10s stylo      218.66 -> 193.18
 11%  about_preferences_basic windows10-64 opt e10s stylo     283.39 -> 251.47
 11%  about_preferences_basic linux64 pgo e10s stylo          144.56 -> 129.33
 10%  about_preferences_basic osx-10-10 opt e10s stylo        237.69 -> 212.97
 10%  about_preferences_basic linux64 opt e10s stylo          157.60 -> 142.62
  4%  about_preferences_basic osx-10-10 opt e10s stylo        247.74 -> 238.35
  3%  about_preferences_basic linux64 opt e10s stylo          161.77 -> 157.67

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=11929
Assignee: nobody → gandalf
Target Milestone: --- → mozilla60
You need to log in before you can comment on or make changes to this bug.