Bug 1270207
Opened 9 years ago
Closed 9 years ago
3.3% tabpaint (linux64 pgo only) regression on push 89b02b6959ee (Sun May 1 2016)
Categories
(Firefox :: Untriaged, defect)
Tracking
()
RESOLVED
WONTFIX
People
(Reporter: jmaher, Unassigned)
References
Details
(Keywords: perf, regression, Whiteboard: [talos_regression])
Talos has detected a Firefox performance regression from push 89b02b6959ee:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=aad69ac9b1186e2e3709516b41f724f9695c8534&tochange=89b02b6959ee93a58c033c158005668bdb46773b
You are the author of one of the patches included in that push, and we need your help to address this regression.
This is a list of all known regressions and improvements related to the push:
https://treeherder.mozilla.org/perf.html#/alerts?id=1093
On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.
To learn more about the regressing test(s), please see:
https://wiki.mozilla.org/Buildbot/Talos/Tests#tabpaint
Reproducing and debugging the regression:
If you would like to re-run this Talos test on a potential fix, use try with the following syntax:
try: -b o -p linux64 -u none -t other --rebuild 5 # add "mozharness: --spsProfile" to generate profile data
* Note: you need a PGO build: https://wiki.mozilla.org/ReleaseEngineering/TryChooser#What_if_I_want_PGO_for_my_build
To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code
Then run the following command from the directory where you set up Talos:
talos --develop -e [path]/firefox -a tabpaint
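For scripted local runs, the command above can be assembled programmatically. The following is a minimal sketch that uses only the flags quoted in this bug (`--develop`, `-e`, `-a`); the helper name `talos_tabpaint_command` and the example path are hypothetical, and the sketch only builds and prints the command rather than running it.

```python
import shlex


def talos_tabpaint_command(firefox_path):
    """Build the local Talos invocation quoted in this bug for a Firefox binary."""
    # Only flags that appear in the bug text are used here.
    return ["talos", "--develop", "-e", firefox_path, "-a", "tabpaint"]


# Hypothetical binary location; substitute the real [path]/firefox.
cmd = talos_tabpaint_command("/hypothetical/path/firefox")
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` from the directory where Talos was set up.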
Making a decision:
As the patch author, your feedback will help us handle this regression.
*** Please let us know your plans by Monday, or the offending patch(es) will be backed out! ***
Our wiki page outlines the common responses and expectations:
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Comment 1 (Reporter) • 9 years ago
:bas, this seems to be a pgo only regression. I have backfilled and done a series of retriggers to help me find the culprit:
https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,f91449d883e157f8cb3ce5d6e6ac3144d7b30689,1%5D&series=%5Bfx-team,f91449d883e157f8cb3ce5d6e6ac3144d7b30689,1%5D&zoom=1461837029146.371,1461853500270.1216,56.311868843125744,61.93346058453341&selected=%5Bmozilla-inbound,f91449d883e157f8cb3ce5d6e6ac3144d7b30689,30785,27227820,1%5D
and it falls on your set of changes:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=aad69ac9b1186e2e3709516b41f724f9695c8534&tochange=89b02b6959ee93a58c033c158005668bdb46773b
As this is PGO only, it adds to the complications. Can you help figure out why this is happening and whether there is anything we can do to reduce or fix the regression?
Flags: needinfo?(bas)
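Retriggers matter here because a change of a few percent is close to run-to-run noise. As a rough sketch of the idea (not the method Perfherder actually uses), the snippet below compares two sets of repeated runs with Welch's t statistic. The scores are hypothetical, not data from this bug, and the assumption that tabpaint reports a time in ms where lower is better is mine.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(before, after):
    """Welch's t statistic for two independent samples with unequal variances."""
    std_err = sqrt(variance(before) / len(before) + variance(after) / len(after))
    return (mean(after) - mean(before)) / std_err


# Hypothetical tabpaint scores in ms (assumed lower is better).
before = [57.0, 57.4, 56.8, 57.2, 57.1]
after = [59.0, 58.7, 59.3, 58.9, 59.1]

pct = (mean(after) - mean(before)) / mean(before) * 100
print(f"change: {pct:.2f}%, Welch t: {welch_t(before, after):.1f}")
```

More retriggers per push shrink the standard error, which is what lets a small shift be pinned to a single push range.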
Comment 2•9 years ago
(In reply to Joel Maher (:jmaher) from comment #1)
> :bas, this seems to be a pgo only regression. I have backfilled and done a
> series of retriggers to help me find the culprit:
> https://treeherder.mozilla.org/perf.html#/graphs?series=%5Bmozilla-inbound,
> f91449d883e157f8cb3ce5d6e6ac3144d7b30689,1%5D&series=%5Bfx-team,
> f91449d883e157f8cb3ce5d6e6ac3144d7b30689,1%5D&zoom=1461837029146.371,
> 1461853500270.1216,56.311868843125744,61.93346058453341&selected=%5Bmozilla-
> inbound,f91449d883e157f8cb3ce5d6e6ac3144d7b30689,30785,27227820,1%5D
>
> and it falls on your set of changes:
> https://hg.mozilla.org/integration/mozilla-inbound/
> pushloghtml?fromchange=aad69ac9b1186e2e3709516b41f724f9695c8534&tochange=89b0
> 2b6959ee93a58c033c158005668bdb46773b
>
> as this is pgo only, it adds to the complications, can you help figure out
> why this is happening and if there is anything we can do to reduce or fix
> the regression?
Not as far as I can tell. The actual control path on Linux isn't even affected by this patch. I suppose the change in code structure somehow makes PGO make different decisions than before. With PGO it's hard to even say whether the new decisions are worse: they happen to regress one benchmark, but they may very well improve others that we happen not to have in Talos.
The only other option would be to randomly fiddle around with refactoring the function some more and hope PGO happens to make the same decisions as before. The culprit here is likely the following changeset, which doesn't practically change the amount of work done but does refactor the function a little bit: https://hg.mozilla.org/integration/mozilla-inbound/rev/89b02b6959ee
Flags: needinfo?(bas)
Comment 3 (Reporter) • 9 years ago
doing a backout of 89b02b6959ee yields a 2.85% improvement:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=1cde312d3437&newProject=try&newRevision=943fde83c286e876f2d3a018365dd62b48a822fc&framework=1
I think that is the bulk of the work. Is that something that can be optimized, or something we should leave as is and mark this bug as wontfix?
Flags: needinfo?(bas)
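A regression percentage and a backout improvement percentage are measured against different baselines, so they do not line up one to one. The sketch below works through the arithmetic under stated assumptions: the baseline score is hypothetical, lower is assumed to be better, the 3.3% figure is taken relative to the pre-regression score, and the 2.85% figure is taken relative to the regressed score. The two numbers also come from different sets of runs, so this is only an approximation.

```python
def percent_change(old, new):
    """Signed percent change from old to new (positive means the score went up)."""
    return (new - old) / old * 100.0


baseline = 57.0                         # hypothetical pre-regression score, ms
regressed = baseline * (1 + 0.033)      # 3.3% regression from the bug title
backed_out = regressed * (1 - 0.0285)   # 2.85% improvement seen with the backout

# Improvement a complete revert would show, measured from the regressed score.
full_revert = -percent_change(regressed, baseline)
# Share of the regression that the backout recovers.
recovered = (regressed - backed_out) / (regressed - baseline)
# What would remain above the original baseline after the backout.
residual = percent_change(baseline, backed_out)

print(f"full revert: {full_revert:.2f}%, recovered: {recovered:.0%}, residual: {residual:.2f}%")
```

Under these assumptions the backout accounts for most, but not all, of the reported regression, which is consistent with the "bulk" wording above.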
Comment 4•9 years ago
(In reply to Joel Maher (:jmaher) from comment #3)
> doing a backout of 89b02b6959ee yields a 2.85% improvement:
> https://treeherder.mozilla.org/perf.html#/
> compare?originalProject=try&originalRevision=1cde312d3437&newProject=try&newR
> evision=943fde83c286e876f2d3a018365dd62b48a822fc&framework=1
>
> I think that is the bulk of the work- is that something that can be
> optimized, or something we should leave as is and mark this bug as wontfix?
Instinctively I'd say wontfix. There's nothing obviously 'unoptimized' about the patch, i.e. as far as I can see the work doesn't change with the backout; only some of the function structure changed. Trying to 'optimize code to cause PGO to make decisions favoring a certain test' seems like a tricky road. Any other change might change the PGO decisions again and regress this again, etc.
Flags: needinfo?(bas)
Comment 5 (Reporter) • 9 years ago
Thanks Bas! Let me get confirmation from mconley (the contact for tabpaint) so we can close this out.
Flags: needinfo?(mconley)
Comment 6•9 years ago
Tracking since this is a recent regression (though it looks like backouts already happened and we may be done here)
status-firefox49: --- → affected
tracking-firefox49: --- → +
Comment 7•9 years ago
Yeah, I think trying to "trick" PGO into doing faster things is likely an expensive use of time, and not worth it. Let's just take this one.
Status: NEW → RESOLVED
Closed: 9 years ago
Flags: needinfo?(mconley)
Resolution: --- → WONTFIX