Closed Bug 1280951 Opened 8 years ago Closed 8 years ago

2.29 - 4.46% tp5o XRes (linux64) regression on push eebf02e5a399 (Fri Jun 10 2016)

Categories

(Core :: Widget: Gtk, defect)

50 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla50
Tracking Status
firefox50 --- fixed

People

(Reporter: jmaher, Assigned: karlt)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

Attachments

(1 file)

Talos has detected a Firefox performance regression from push eebf02e5a399. As author of one of the patches included in that push, we need your help to address this regression.

This is a list of all known regressions and improvements related to the push:
https://treeherder.mozilla.org/perf.html#/alerts?id=1515

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see:
https://wiki.mozilla.org/Buildbot/Talos/Tests#tp5

Reproducing and debugging the regression:
If you would like to re-run this Talos test on a potential fix, use try with the following syntax:
try: -b o -p linux64 -u none -t tp5o,tp5o-e10s --rebuild 5  # add "mozharness: --spsProfile" to generate profile data

(we suggest --rebuild 5 to be more confident in the results)

To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.lorg/Buildbot/Talos/Running#Running_locally_-_Source_Code

Then run the following command from the directory where you set up Talos:
talos --develop -e [path]/firefox -a tp5o

(add --e10s to run tests in e10s mode)

Making a decision:
As the patch author we need your feedback to help us handle this regression.
*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations:
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
the graph for this test needs some zooming love, but here is a better view:
https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bfx-team,2440041b70bf8c7aca5f2e0acc169916f4374358,1,1%5D&series=%5Bmozilla-inbound,2440041b70bf8c7aca5f2e0acc169916f4374358,1,1%5D&series=%5Bmozilla-inbound,36195bda5c567aa358f855f4667c7ae38744d194,1,1%5D&zoom=1465479310226.228,1465581309328.3904,3244.5875335193596,4572.604756541332

I did some retriggers, this seems like the root cause- there are 4 patches though, I am not sure which is the offender:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=eebf02e5a399

Possibly this is something we need to live with, it would be good to understand this prior to wontfix or spending time fixing this.

:karlt, can you take a look at this and figure out why our XRes measurements are increasing?
Component: Untriaged → Widget: Gtk
Flags: needinfo?(karlt)
Product: Firefox → Core
Thanks, Joel.
It is somewhat surprising that there is a difference as significant as this.
I'll look into it.
Assignee: nobody → karlt
Flags: needinfo?(karlt)
This was triggered by https://hg.mozilla.org/integration/mozilla-inbound/rev/80425dc038ac
probably because it triggered instantiation of the tooltip widget.
wow, thanks for figuring this out so fast!
I don't think the new patch is helping, after some retriggers it looks like we have fairly consisten 3928 vs 3832 numbers- maybe I have the try pushes mixed up and we have a 2.5% win vs a loss!
(In reply to Joel Maher (:jmaher) from comment #8)
> I don't think the new patch is helping, after some retriggers it looks like
> we have fairly consisten 3928 vs 3832 numbers- maybe I have the try pushes
> mixed up and we have a 2.5% win vs a loss!

(In reply to Joel Maher (:jmaher) from comment #7)
> https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=dfdc725935a9&newProject=try&newRevision=874e8b04f42e&framework=1&showOnlyImportant=0

(In reply to Karl Tomlinson (ni?:karlt) from comment #5)
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=874e8b04f42e&selectedJob=22627377
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=dfdc725935a9&selectedJob=22627521
> 
> (The treeherder display is confusing because the former shows
> https://hg.mozilla.org/try/rev/e8705ee011170c87ff5c01042d441514150a91f0
> even though it is not an ancestor.)

874e8b04f42e (so-called newRevision/new) is the reference, without e8705ee01117 as an ancestor.  e8705ee01117 is displayed on treeherder even though it is another head.
The parent of 874e8b04f42e is on m-c.

dfdc725935a9 (so-called originalRevision/base) has the fix e8705ee01117 as the parent, which is parented on the same m-c revision as 874e8b04f42e.
Treeherder does not show e8705ee01117 because it was not new in this push.
The patch looks fine but I don't have pre 3.20 gtk handy right now so I'm going to test it there this afternoon.
Attachment #8763800 - Flags: review?(stransky) → review-
Comment on attachment 8763800 [details]
bug 1280951 copy tooltip style context and cache that instead of widget

https://reviewboard.mozilla.org/r/59916/#review56856

I tested the pre3.20 and looks fine. Just need to address the 3.20 cache problem.

::: widget/gtk/WidgetStyleCache.cpp:210
(Diff revision 1)
>  
>  /* GetCssNodeStyleInternal is used by Gtk >= 3.20 */
>  static GtkStyleContext*
>  GetCssNodeStyleInternal(WidgetNodeType aNodeType)
>  {
> -  GtkStyleContext* style = sStyleStorage[aNodeType];
> +  MOZ_ASSERT(!sStyleStorage[aNodeType]);

We need the style cache here or at CreateChildCSSNode().

When GetCssNodeStyleInternal() is called recusrively (for instance MOZ_GTK_SCROLLBAR_TROUGH_HORIZONTAL -> MOZ_GTK_SCROLLBAR_CONTENTS_HORIZONTAL) it asserts here and creates the style instead of the cached one.

like:
 static GtkStyleContext*
 CreateChildCSSNode(const char* aName, WidgetNodeType aParentNodeType)
 {
-  return CreateCSSNode(aName, GetCssNodeStyleInternal(aParentNodeType));
+  GtkStyleContext* styleParent = sStyleStorage[aParentNodeType];
+  if (!styleParent)
+    styleParent = GetCssNodeStyleInternal(aParentNodeType);
+
+  return CreateCSSNode(aName, styleParent); 
 }
Comment on attachment 8763800 [details]
bug 1280951 copy tooltip style context and cache that instead of widget

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/59916/diff/1-2/
Attachment #8763800 - Flags: review- → review?(stransky)
https://reviewboard.mozilla.org/r/59916/#review56856

> We need the style cache here or at CreateChildCSSNode().
> 
> When GetCssNodeStyleInternal() is called recusrively (for instance MOZ_GTK_SCROLLBAR_TROUGH_HORIZONTAL -> MOZ_GTK_SCROLLBAR_CONTENTS_HORIZONTAL) it asserts here and creates the style instead of the cached one.
> 
> like:
>  static GtkStyleContext*
>  CreateChildCSSNode(const char* aName, WidgetNodeType aParentNodeType)
>  {
> -  return CreateCSSNode(aName, GetCssNodeStyleInternal(aParentNodeType));
> +  GtkStyleContext* styleParent = sStyleStorage[aParentNodeType];
> +  if (!styleParent)
> +    styleParent = GetCssNodeStyleInternal(aParentNodeType);
> +
> +  return CreateCSSNode(aName, styleParent); 
>  }

Ah, yes.  Thanks.  I've reverted this part, and instead checking sStyleStorage just for MOZ_GTK_TOOLTIP in GetWidgetStyleInternal() for now.
Comment on attachment 8763800 [details]
bug 1280951 copy tooltip style context and cache that instead of widget

https://reviewboard.mozilla.org/r/59916/#review57028

Looks fine, Thanks!
Attachment #8763800 - Flags: review?(stransky) → review+
Pushed by ktomlinson@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d56d7039aee0
copy tooltip style context and cache that instead of widget r=stransky+263117
Excellent, this patch has triggered an improvement alert on Talos.

https://treeherder.mozilla.org/perf.html#/alerts?id=1574

So far it has shown a 4.3% improvement in tp5o XRes linux64 opt e10s. I retriggered several times and the results are consistent.
https://hg.mozilla.org/mozilla-central/rev/d56d7039aee0
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
I think this issue is more of a bi-modal test (most likely the fix to this bug caused the bi-modal distribution) and we see issues where some runs can show all data points on the high or low mode.
See Also: → 1319957
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: