Closed Bug 1257638 Opened 8 years ago Closed 8 years ago

2-4% linux64 tsvgr_opacity regression on inbound (v.48) from push 2d59367c985a on March 11th, 2016

Categories

(Core :: General, defect)

48 Branch
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: jmaher, Unassigned)

References

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

Talos has detected a Firefox performance regression from push 2d59367c985a. As author of one of the patches included in that push, we need your help to address this regression.

This is a list of all known regressions and improvements related to the push:

https://treeherder.mozilla.org/perf.html#/alerts?id=374

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see:
https://wiki.mozilla.org/Buildbot/Talos/Tests#tsvg-opacity

Reproducing and debugging the regression:

If you would like to re-run this Talos test on a potential fix, use try with the following syntax:
try: -b o -p linux64 -u none -t svgr-e10s,svgr --rebuild 5  # add "mozharness: --spsProfile" to generate profile data

(we suggest --rebuild 5 to be more confident in the results)

To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.lorg/Buildbot/Talos/Running#Running_locally_-_Source_Code

Then run the following command from the directory where you set up Talos:
talos --develop -e [path]/firefox -a tsvgr_opacity

(add --e10s to run tests in e10s mode)

Making a decision:
As the patch author we need your feedback to help us handle this regression.
*** Please let us know your plans by Monday, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations:
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
I did some retriggers and the compare view shows some wins on tart and tsvgx, but the regressions on tsvgr_opacity:
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=b8e6b7d7ff08&newProject=mozilla-inbound&newRevision=2d59367c985a&framework=1&filter=linux64

:glandium, can you help figure out what is going on here and determine a resolution for this bug?
Flags: needinfo?(mh+mozilla)
the list of alerts here look as though this is a huge win, these alerts were generated from 2 revisions in the future (yay for coalescing) and pick up the next push that :glandium did:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=13e2c954630c

most likely the cumulative sum of both pushes is a huge win with 2 smaller regressions- it would be nice to confirm that in more detail.
Bug 1175546 changed the compiler used for Linux builds. So that performance changed as a result is not unexpected. Regressions are sad, though, but I'm not sure what we can do about them.
Flags: needinfo?(mh+mozilla)
I would expect you to look into the regressions or justify why they changed.  Please treat these as serious regressions like all the other Mozilla developers do.
Flags: needinfo?(mh+mozilla)
(In reply to Mike Hommey [:glandium] from comment #3)
> Bug 1175546 changed the compiler used for Linux builds. So that performance
> changed as a result is not unexpected. Regressions are sad, though, but I'm
> not sure what we can do about them.

Well, one question is is this pgo only, or does it happen to nonpgo too?  Since a posibility is we don't execute the involved code in the training data, and it doesn't get optimized well.  I guess either way we could try adding more code similar to this test to the training data and see what happens.

Otherwise if we want to fix this regression I think we basically have to figure out what code is getting optimized less well than before, and find a way to force gcc to optimize it better.  Given that I don't know much about the code this test runs, and I'm pretty sure the same can be said for Mike we're really not very good people to look into it.

Other than that I really don't see what we can do other than accept the regression.
Flags: needinfo?(jmaher)
This is an interesting point Trevor, thanks for commenting in here.  These improvements and 2 regressions listed are for opt, not pgo.  I did look at tsvgr_opacity and we have a sustained regression on opt, BUT a small win for pgo!  I had overlooked that originally.

So I looked a few other tests and pgo is either a win or neutral.

If this were not the case, I would have wanted to look at different compiler flags, and a profiling run to determine what might be different.  I am fine closing this out as worksforme and moving forward.

:glandium, feel free to close this out, unless you have other thoughts or concerns?
Flags: needinfo?(jmaher)
Considering how confident I am with talos...
Did you look at the individual subtests on the pgo builds? because interestingly, there are two subtests in tsvgr_opacity, and only one of them is regressing (tsvgr_opacity big-optimizable-group-opacity-2500.svg). The other one is improved (tsvgr_opacity small-group-opacity-2500.svg).
Flags: needinfo?(mh+mozilla) → needinfo?(jmaher)
yes, this happens most of the time- we have many subtests and only some of them regress.  the big difference between these subtests will be the size and in the past networking related changes have triggered big-optimizable-group-opacity-2500.svg.

What is good as Trevor pointed out is that we see a win on pgo- we only ship pgo, so there is less concern with this bug.  The majority of regressions show up in both opt and pgo, and sometimes we see them show up in one or the other.
Flags: needinfo?(jmaher)
Component: Untriaged → General
Product: Firefox → Core
Is this bug actionable?
Version: unspecified → 48 Branch
shall we close this bug?
Flags: needinfo?(mh+mozilla)
Considering comment 8, I'd say yes.
Status: NEW → RESOLVED
Closed: 8 years ago
Flags: needinfo?(mh+mozilla)
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.