Closed Bug 1144705 Opened 9 years ago Closed 9 years ago

12% Ubuntu 32 tsvgr_opacity regression on Fx-Team-Non-PGO (v.39) on March 16, 2015 from push 06e43f1e2a92

Categories

(Testing :: Talos, defect)

x86
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: vaibhav1994, Unassigned)

References

Details

Talos has detected a Firefox performance regression from your commit 06e43f1e2a92 in bug 1133763.  We need you to address this regression.

This is a list of all known regressions and improvements related to your bug:
http://alertmanager.allizom.org:8080/alerts.html?rev=436686833af0&showAll=1

On the page above you can see Talos alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test, please see: https://wiki.mozilla.org/Buildbot/Talos/Tests#tsvg-opacity

Reproducing and debugging the regression:
If you would like to re-run this Talos test on a potential fix, use try with the following syntax:
try: -b o -p linux -u none -t svgr  # add "mozharness: --spsProfile" to generate profile data

To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code

Then run the following command from the directory where you set up Talos:
talos --develop -e <path>/firefox -a tsvgr_opacity

Making a decision:
As the patch author we need your feedback to help us handle this regression.

Our wiki page oulines the common responses and expectations:
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
Sorry this depends on bug 1143057
Depends on: 1143057
Blocks: 1138995
:shorlander, can you look into this?  From the retriggers it looks like your patch is the culprit- although the history of the test sort of looks odd.
Flags: needinfo?(shorlander)
(In reply to Joel Maher (:jmaher) from comment #2)
> :shorlander, can you look into this?  From the retriggers it looks like your
> patch is the culprit- although the history of the test sort of looks odd.

I will check it out. It was a straight image swap though, seems odd that it would cause a regression.
Flags: needinfo?(shorlander)
thanks!  if nothing is obvious, maybe a push to try with it backed out to see if it does remove the regression would be a good step to take.  Unfortunately if that proves it is better, than we have to face the reality that it is the patch.  Either way- we can weigh the options and make a decision.
(In reply to Stephen Horlander [:shorlander] from comment #3)
> I will check it out. It was a straight image swap though, seems odd that it
> would cause a regression.

Quoting from the tsvg-opacity test description https://wiki.mozilla.org/Buildbot/Talos/Tests#tsvg-opacity :

> Note that this test also tends to reflect changes in network efficiency
> and navigation bar rendering issues:
>
>    Most of the page load tests measure from before the location is changed,
> until onload + mozafterpaint, therefore any changes in chrome performance from
> the location change, or network performance (the pages load from a local web
> server) would affect page load times.
>
> SVG opacity is rather quick by itself, so any such chrome/network/etc performance
> changes would affect this test more than other page load tests (relatively, in
> percentages).

So yes, this test is also affected by tabs/ navigation bar / location bar rendering performance.

Joel, wasn't TART affected too?
I didn't see tart regressions, we should have enough data to generate alerts/regressions for tart.  Maybe this is a case where a tart subtest regressed but the rest didn't so the summary yielded no regression.
(In reply to Vaibhav (:vaibhav1994) from comment #1)
> Sorry this depends on bug 1143057

What made you change your mind?

The previous one (bug 1133763) deals with network stuff (HTTPS), and may also affect tsvg-opacity results (see comment 5).
:avih, I was using the template in alert_manager and the bug was there by default which I forgot to remove by mistake. If you look at the graph for this regression:
http://graphs.mozilla.org/graph.html#tests=[[225,132,33],[225,131,33]]&sel=1426094392433,1426699192433&displayrange=7&datatype=geo

there is a clear regression on 06e43f1e2a92 revision which is Bug 1143057
Thanks! :)
Baseline: https://treeherder.mozilla.org/#/jobs?repo=try&revision=39c941918ebd
Backout: https://treeherder.mozilla.org/#/jobs?repo=try&revision=27cd92783c5a

I will retrigger some tsvgr runs after the first run has completed, and then use compare-talos to see if there's a measureable improvement with the backout. If so, we can assume that (somehow) bug 1143057 is to blame for this regression.
Initial builds done. I've requested retriggers.

Here's the compare talos we'll use:

http://compare-talos.mattn.ca/?oldRevs=39c941918ebd&newRev=27cd92783c5a&server=graphs.mozilla.org&submit=true

If we see improvement on tsvgr_opacity after the retriggers are done, I think we've got our culprit.
Hey - the compare talos in comment 11 seems to suggest that backing out shorlander's patch will not result in a meaningful improvement, meaning that he likely didn't cause a regression. Does that sound right to you?
Flags: needinfo?(jmaher)
Can you tell us more about this patch, shorlander?

What are the exact differences between the images that you're swapping out and the images that you're swapping in?
Flags: needinfo?(shorlander)
(In reply to Mike Conley (:mconley) - Catching up on reviews. from comment #14)
> Can you tell us more about this patch, shorlander?
> 
> What are the exact differences between the images that you're swapping out
> and the images that you're swapping in?

I removed a noise texture from the tabs. New images have a smaller file size than the old images. Otherwise they are the same.

Some more info starting here: https://bugzilla.mozilla.org/show_bug.cgi?id=1005105#c12
Flags: needinfo?(shorlander)
Ok, thanks jmaher / shorlander.

Hypothesis from shorlander: bug 1005105 actually introduced a performance improvement, and we regressed that improvement somehow with the new dimensions when bug 1143057 landed.

We can test this by doing a push from just before bug 1005105 landed, then a push for when bug 1005105 landed, and a final push with bug 1005105 and the patch for bug 1143057 grafted on top.

Before bug 1005105: https://treeherder.mozilla.org/#/jobs?repo=try&revision=3a1c2e807e6d
Bug 1005105: https://treeherder.mozilla.org/#/jobs?repo=try&revision=9d8446c05278
Bug 1005105 + Bug 1143057: https://treeherder.mozilla.org/#/jobs?repo=try&revision=6eee925d34cb
So shorlander's hypothesis has legs.

According to compare-talos, bug 1005105 introduced a nice performance win on tsvg_opacity:

http://compare-talos.mattn.ca/breakdown.html?oldTestIds=46311502,46388326,46388336,46388342,46388356,46388376,46388404,46388440,46388450,46388632&newTestIds=46388366,46388390,46388402,46388420,46388430,46388460,46388470,46388480,46388490,46388500&testName=tsvgr_opacity&osName=Ubuntu%2032&server=graphs.mozilla.org

When bug 1143057 lands on top of bug 1005105, there is a performance regression:

http://compare-talos.mattn.ca/breakdown.html?oldTestIds=46388366,46388390,46388402,46388420,46388430,46388460,46388470,46388480,46388490,46388500&newTestIds=46311420,46388510,46388516,46388530,46388542,46388544,46388560,46388570,46388580,46388590&testName=tsvgr_opacity&osName=Ubuntu%2032&server=graphs.mozilla.org

And when I compare pre-bug 1005105 to bug 1005105 + bug 1143057, there's a performance gain:

http://compare-talos.mattn.ca/breakdown.html?oldTestIds=46311502,46388326,46388336,46388342,46388356,46388376,46388404,46388440,46388450,46388632&newTestIds=46311420,46388510,46388516,46388530,46388542,46388544,46388560,46388570,46388580,46388590&testName=tsvgr_opacity&osName=Ubuntu%2032&server=graphs.mozilla.org

So here's how I interpret this:

1) Bug 1005105 lands. There's a large improvement in our tsvg_opacity performance.
2) Bug 1143057 lands. There's a regression in tsvg_opacity performance, eating away into the gains that bug 1005105 gave us, but combined, bug 1005105 and bug 1143057 are an improvement over what we started with.

How does my reasoning sound, jmaher?
Flags: needinfo?(jmaher)
Reading https://wiki.mozilla.org/Buildbot/Talos/Tests and talking with avih, it sounds like tsvg_opacity is sensitive to both network performance as well as performance in drawing things in the navbar / tabstrip area.

That's not immediately obvious from the name of the test, but with that understanding, it now makes more sense how shorlander's patches have affected this test.

The thing to note here is that we cannot ship bug 1005105 without bug 1143057. Bug 1143057 is necessary in order to make sure the tabstrip doesn't look like ass.

The other thing to underscore is that if we fold those two bugs together, we end up with an overall performance improvement over what we had pre bug 1143057.

So I'm going to WONTFIX this. If there are other opinions / arguments, feel free to re-open.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Flags: needinfo?(jmaher)
You need to log in before you can comment on or make changes to this bug.