Closed Bug 1216924 Opened 9 years ago Closed 8 years ago

9-13% Linux 64/Win7/WinXP tps regression on Mozilla-Inbound on October 20, 2015 from push 1e2fc068682e617ec5d72784cd2e34264157c88b

Categories

(Core :: Panning and Zooming, defect)

44 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox44 --- unaffected
firefox45 --- unaffected
firefox46 --- wontfix
firefox47 --- fixed
firefox48 --- fixed

People

(Reporter: jmaher, Assigned: kats)

References

(Depends on 1 open bug)

Details

(Keywords: perf, regression, Whiteboard: [talos_regression][gfx-noted])

Attachments

(1 file)

Talos has detected a Firefox performance regression from your commit 1e2fc068682e617ec5d72784cd2e34264157c88b in bug 1216072.  We need you to address this regression.

This is a list of all known regressions and improvements related to your bug:
http://alertmanager.allizom.org:8080/alerts.html?rev=1e2fc068682e617ec5d72784cd2e34264157c88b&showAll=1

On the page above you can see Talos alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test, please see: https://wiki.mozilla.org/Buildbot/Talos/Tests#tps

Reproducing and debugging the regression:
If you would like to re-run this Talos test on a potential fix, use try with the following syntax:
try: -b o -p linux64,win32 -u none -t g2-e10s  # add "mozharness: --spsProfile" to generate profile data

To run the test locally and do a more in-depth investigation, first set up a local Talos environment:
https://wiki.mozilla.org/Buildbot/Talos/Running#Running_locally_-_Source_Code

Then run the following command from the directory where you set up Talos:
talos --develop -e <path>/firefox -a tps --e10s

Making a decision:
As the patch author we need your feedback to help us handle this regression.
*** Please let us know your plans by Monday, or the offending patch will be backed out! ***

Our wiki page oulines the common responses and expectations:
https://wiki.mozilla.org/Buildbot/Talos/RegressionBugsHandling
there are 4 regressions here and they are all e10s only:
13% win7 tps
9% winxp tps
10% linux64 tps
6% linux64 tp5o xres

perfherder isn't working right now, so we will need to rely on the alerts generated automatically and validating those with the graphs.

:benwa, this falls into our category of backing changes out as we have a 10%+ regression on windows tests, can you prioritize this and help us figure out the next actions on it?
Flags: needinfo?(bgirard)
Thanks for flagging this down.

I think given that the patch also got us ~50% tp5_scroll and tscroll-asap improvements on windows and Linux we should keep this regression. This patch also align displayport behavior on Linux and Windows to behave more like mobile and Mac. The change in TPS make sense given that we round-out the display port so it will cause a small regression on pageload since we're building a displayport that doesn't benefit us on page load, but it will once we start scrolling.

We've been making good progress overall on improving the scores of the e10s+apz configuration and hopefully in the future we can target a patch at moving these scores forward.
Flags: needinfo?(bgirard)
Flags: needinfo?(jmaher)
that seems reasonable to me.  tps is a specific test where we load 50 pages at once, I suspect it is a more extreme case.  I do like the gains we get on scrolling- right now this is just making a call about what tradeoffs we want, much better scrolling, or a bit slower pageloading in bulk.  I don't see tp5o regressing, so a single pageload doesn't seem to be affected.

:vladan, thoughts?
Flags: needinfo?(jmaher) → needinfo?(vladan.bugzilla)
- comment 0 states that bug 1216072 is responsible, but I think that's the wrong bug #. Benoit landed patches for 3 bugs in push 1e2fc068682e617ec5d72784cd2e34264157c88b: bug 1216287, bug 1204136, and bug 1215596. Did it get bisected to a single bug?
- Which bug caused the improvements referenced in comment 2?
- If comment 2 is correct and the same bugs cause a 50% scroll APZ improvement and a 13% e10s tab-switch regression, the choice is not clear. Is E10S tabs-switching with APZ-disabled regressed?
Flags: needinfo?(vladan.bugzilla)
it is bug 1204136 which caused this.  I am not sure what caused the improvements, I don't see alerts for those specifically related to this time window- I am collecting more numbers to look for the improvements.
there is a 4% win7 (bug 1208465) tps regression on uplift to aurora, but no linux64 or winxp.  This didn't seem to affect pgo/uplift.  Should we leave this open or close it?
Flags: needinfo?(vladan.bugzilla)
I spoke to Benoit, this patch is APZ-specific and the regressions and improvements should be APZ-specific as well. Since APZ is currently a Nightly-only feature, I'm ok with making trade-offs that only affect Firefox in the APZ configuration. This regression should be fixed before APZ rides the trains though (likely not soon).

We shouldn't see any regressions from this patch on Aurora. I guess the 4% win7 tps regression on Aurora 44 is caused by unrelated bug 1208465.

Let's leave this bug open as a blocker of meta-bug 1178298 (APZ riding the trains)
Flags: needinfo?(vladan.bugzilla)
having it track firefox 45 then!
Blocks: 1220148
No longer blocks: 1207141
Blocks: apz-talos
No longer blocks: all-aboard-apz
No longer blocks: apz-talos
Depends on: apz-talos
this is now on beta
Whiteboard: [talos_regression] → [talos_regression][gfx-noted]
Assignee: nobody → bugmail.mozilla
Attachment #8729252 - Flags: review?(bgirard)
Comment on attachment 8729252 [details] [diff] [review]
Patch

Review of attachment 8729252 [details] [diff] [review]:
-----------------------------------------------------------------

Nice. Thank you Kats!
Attachment #8729252 - Flags: review?(bgirard) → review+
(See https://bugzilla.mozilla.org/show_bug.cgi?id=1254273#c9 for why this was backed out). I triggered the jobs that filed on my try push that I did for this patch, to see if it was this patch or the other one that caused the failure.
It appears to be this patch. I'll investigate early next week.
Most likely there's some other test in the chunk that's causing the code to enter displayport suppression and not exit it. That seems bad, so we should definitely fix it.
So displayport suppression is enabled while this test is running, and as far as I can tell there's no bug with that code. The suppression enables because we open a new tab and switch to it as part of the test, and the test runs fast enough that it finishes entirely before the tabswitcher code gets around to removing the displayport suppression. I tested a bunch of different scenarios as well as real browser usecases and the suppression seems to be working as intended. In particular, it's NOT the case that suppression gets left on accidentally, which is what I was worried about.

I filed bug 1255997 to fix an issue with the test where it wasn't accounting for the scrollbars properly. The displayport suppression exposed the issue, which was masked before by the displayport getting padded out to include the scrollbar area. With that patch in place the test seems to pass locally. I'll do a try push to confirm.
https://hg.mozilla.org/mozilla-central/rev/fbe36b282024
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Marking 44/45 as unaffected as APZ is not enabled there. 46 will have APZ for a couple of betas but this is probably not worth uplifting there. We should uplift to 47 though, it should help with tab switch and resize performance a bit.
Comment on attachment 8729252 [details] [diff] [review]
Patch

Approval Request Comment
[Feature/regressing bug #]: APZ
[User impact if declined]: Tab switch operations a tiny bit slower because we paint a larger area.
[Describe test coverage new/current, TreeHerder]: talos coverage
[Risks and why]: low risk, small patch
[String/UUID change made/needed]: none
Attachment #8729252 - Flags: approval-mozilla-aurora?
Joel, Kats: Have we tested the fix in Nightly/try to confirm that the perf regression is fixed? I just want to make sure before taking this in 47. Thanks!
Flags: needinfo?(jmaher)
Flags: needinfo?(bugmail.mozilla)
thanks kats!
Flags: needinfo?(jmaher)
Comment on attachment 8729252 [details] [diff] [review]
Patch

Thanks Kats, verified that this fix improves tab switching performance. Aurora47+
Attachment #8729252 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Depends on: 1328066
No longer depends on: 1328066
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: