Closed Bug 1387079 Opened 2 years ago Closed 2 years ago

[Linux] Noticeable performance issues when scrolling on some popular websites (twitter, linkedin, pinterest)

Categories

(Core :: Graphics, defect, P1, major)

55 Branch
x86
All
defect

Tracking

()

VERIFIED FIXED
mozilla58
Tracking Status
platform-rel --- ?
firefox-esr52 --- unaffected
firefox55 --- wontfix
firefox56 --- wontfix
firefox57 --- verified
firefox58 --- verified

People

(Reporter: bogdan_maris, Assigned: lsalzman)

References

Details

(Keywords: regression, Whiteboard: [platform-rel-LinkedIn] [gfx-noted])

Attachments

(3 files)

[Affected versions]:
- Firefox 55.0-build 2
- Firefox 54.0.1
- Firefox 57.0a1

[Affected platforms]:
- Ubuntu 16.04

[Steps to reproduce]:
1. Start Firefox
2. Visit twitter and login using an account
3. Scroll the timeline until you hit a video (gif)

Alternative steps:
1. Start Firefox
2. Visit https://www.linkedin.com/in/bobhutchinsfacebookadexpert/detail/recent-activity/ and login
3. Start the video and scroll the page

1. Start Firefox
2. Visit pinterest and login
3. Click a long image and scroll the page

[Expected result]:
- The user can scroll through the page smoothly.

[Actual result]:
- There is noticeable lag when scrolling, and CPU usage jumps to 100%.

[Regression range]:
- This is not a recent regression since it reproduces on 54.0.1 as well. I will try to find a regression range ASAP.

[Additional notes]:
- Here is the Graphics section info from about:support https://pastebin.com/x2P3tkcf
Severity: normal → major
I forgot to mention that I could not reproduce this issue on Windows 10, which is installed on the same machine as my Ubuntu.
Summary: Noticeable performance issues when scrolling on some popular websites (twitter, linkedin, pinterest) → [Linux] Noticeable performance issues when scrolling on some popular websites (twitter, linkedin, pinterest)
I was able to track down a regression range using mozregression. It skipped one build at the end, but the pushlog before it did so is this:

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=58f6e99c9a83dd85e44d435037dabdf7a0afdd83&tochange=e06077091df1f2577a052b43e86135cc12e87a4c

So I manually checked the first build with and first build without for all changesets except for the ones from bug 1340627,
(838652a84b76, 4c580a771776, 7986e8790db6) and I found that the culprit could actually be: 
7986e8790db6	Emilio Cobos Álvarez — Bug 1363666: followup - Remove outdated comment. r=me

I let mozregression finish even though it skipped one build, and I got this output, which points to bug 1340627:

Last good revision: e6daa01f4c3ca434da646e1329cb9795b71539d4
First bad revision: e06077091df1f2577a052b43e86135cc12e87a4c
Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=e6daa01f4c3ca434da646e1329cb9795b71539d4&tochange=e06077091df1f2577a052b43e86135cc12e87a4c
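
For readers wondering why a skipped build leaves more than one candidate push in the range, here is a minimal sketch of bisection with untestable builds. It is not mozregression's actual implementation; `BisectWithSkips`, `Verdict`, and the integer build indices are hypothetical names chosen for illustration.

```cpp
#include <functional>
#include <utility>

// Outcome of testing one build. Skip means the build could not be tested
// (for example, it was missing or broken).
enum class Verdict { Good, Bad, Skip };

// Narrows [lastGood, firstBad] by bisection. Assumes lastGood is known good
// and firstBad is known bad. When the midpoint is untestable, the search
// walks outward to the nearest testable build. If every build strictly
// between the two ends is untestable, the range cannot shrink any further
// and is returned as-is.
std::pair<int, int> BisectWithSkips(int lastGood, int firstBad,
                                    const std::function<Verdict(int)>& test) {
  while (firstBad - lastGood > 1) {
    const int mid = lastGood + (firstBad - lastGood) / 2;
    int probe = -1;
    Verdict verdict = Verdict::Skip;
    for (int d = 0; probe < 0; ++d) {
      const int below = mid - d;
      const int above = mid + d;
      const bool belowInRange = below > lastGood;
      const bool aboveInRange = above < firstBad;
      if (!belowInRange && !aboveInRange) break;  // nothing left to try
      if (belowInRange && (verdict = test(below)) != Verdict::Skip) {
        probe = below;
      } else if (aboveInRange && d > 0 &&
                 (verdict = test(above)) != Verdict::Skip) {
        probe = above;
      }
    }
    if (probe < 0) break;  // only skipped builds remain in the range
    if (verdict == Verdict::Good) {
      lastGood = probe;
    } else {
      firstBad = probe;
    }
  }
  return {lastGood, firstBad};
}
```

With a skipped build sitting next to the regression, the returned range spans two pushes instead of one, which is why the remaining changesets had to be checked by hand.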

Requesting needinfos from both possible culprit bugs.
Flags: needinfo?(lsalzman)
Flags: needinfo?(emilio+bugs)
Has Regression Range: --- → yes
Has STR: --- → yes
Well, my patch is just removing a comment in the build system, so there's no way it's the culprit.

The Skia update seems a better candidate :)
Flags: needinfo?(emilio+bugs)
Note that the Skia update in bug 1340627 happened in 55. You noted the symptom in version 54, which was clearly before the landing of that bug. So in the absence of other evidence, I think there's something else going on there, and I don't believe the Skia update is the cause.
Flags: needinfo?(lsalzman) → needinfo?(bogdan.maris)
Jet, is there someone who can take a look to see how severe/important this is? Thanks.
Flags: needinfo?(bugs)
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #2)
> I was able to track down a regression using mozregression

Bogdan: If it's true that you first saw this bug in v.54, you'll need to go backwards from when we branched to 55 to find the regression point:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6583496f169c

Can you help with the regression window, or confirm if the bug actually started happening in v.55+? Thx!
Yeah, I might have been misleading you guys with the 54.0.1 build (sorry for that). I retested today and I *can't* reproduce with 54.0.1, only with 55.0 and up.

I also double-checked the regression using nightly builds and ended up with the same range. Also, with 'layers.acceleration.draw-fps' set to 'true', I got readings below 10 fps when scrolling on an affected build and 50-60 fps on an unaffected build.
Flags: needinfo?(bogdan.maris)
So, in the interests of aiding resolution of this, can we either get a less onerous and simpler testcase that doesn't require signing up with various services and logging in, and also, a recorded profile from this so I can look at the stack trace of what's going on?
Flags: needinfo?(bogdan.maris)
Priority: -- → P3
platform-rel: --- → ?
Whiteboard: [platform-rel-LinkedIn]
Attached file profile.zip
(In reply to Lee Salzman [:lsalzman] from comment #8)
> So, in the interests of aiding resolution of this, can we either get a less
> onerous and simpler testcase that doesn't require signing up with various
> services and logging in, and also, a recorded profile from this so I can
> look at the stack trace of what's going on?

Here is a link to the recorded profile: http://bit.ly/2x3csay. I also attached the same thing in case people don't have access to bit.ly (myself included). I'll try to make a simpler testcase, but if anyone can provide one earlier than me, please go ahead.
Attached file Testcase.html
I managed to make a simple testcase, hope this helps. I can still see this issue on latest Nightly 58.0a1.
Flags: needinfo?(bogdan.maris)
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #10)
> Created attachment 8915590 [details]
> Testcase.html
> 
> I managed to make a simple testcase, hope this helps. I can still see this
> issue on latest Nightly 58.0a1.

I tried this testcase but it doesn't seem to reproduce the issue for me. Also for some reason, the profile does not have symbols with it. Did you generate this profile with a release or a local build? Without symbols, it's hard to see what's going on here. All we know for sure is most of the time is spent in a semaphore, but no telling where from.
(In reply to Lee Salzman [:lsalzman] from comment #11)
> (In reply to Bogdan Maris, QA [:bogdan_maris] from comment #10)
> > Created attachment 8915590 [details]
> > Testcase.html
> > 
> > I managed to make a simple testcase, hope this helps. I can still see this
> > issue on latest Nightly 58.0a1.
> 
> I tried this testcase but it doesn't seem to reproduce the issue for me.
> Also for some reason, the profile does not have symbols with it. Did you
> generate this profile with a release or a local build? Without symbols, it's
> hard to see what's going on here. All we know for sure is most of the time
> is spent in a semaphore, but no telling where from.

I don't recall what Firefox I used but I recorded another session on latest Nightly 58.0a1 https://perfht.ml/2wIdUw0. Note that I could not reproduce this only on 32bit Ubuntu version (only tried 16.04). Maybe this helps.

Also note that I did receive some no symbols messages in terminal when using profiler.

> /usr/bin/nm: /usr/lib/i386-linux-gnu/libgtk-3.so.0: no symbols
> /usr/bin/nm: /usr/lib/i386-linux-gnu/libxcb.so.1: no symbols
> /usr/bin/nm: /lib/i386-linux-gnu/libglib-2.0.so.0: no symbols
> /usr/bin/nm: /lib/i386-linux-gnu/libm.so.6: no symbols
> /usr/bin/nm: /usr/lib/i386-linux-gnu/libdbus-glib-1.so.2: no symbols
> /usr/bin/nm: /lib/i386-linux-gnu/libc.so.6: no symbols
> /usr/bin/nm: /usr/lib/i386-linux-gnu/libgio-2.0.so.0: no symbols
> /usr/bin/nm: /usr/lib/i386-linux-gnu/libgdk-3.so.0: no symbols
> /usr/bin/nm: /usr/lib/i386-linux-gnu/libX11.so.6: no symbols
> /usr/bin/nm: /usr/lib/i386-linux-gnu/libgobject-2.0.so.0: no symbols
> /usr/bin/nm: /lib/i386-linux-gnu/libdbus-1.so.3: no symbols
> /usr/bin/nm: /usr/lib/i386-linux-gnu/libffi.so.6: no symbols
> /usr/bin/nm: /lib/i386-linux-gnu/libgcc_s.so.1: no symbols
> /usr/bin/nm: /lib/ld-linux.so.2: no symbols
> /usr/bin/nm: /usr/lib/i386-linux-gnu/libatk-1.0.so.0: no symbols
> /usr/bin/nm: /usr/lib/i386-linux-gnu/libibus-1.0.so.5: no symbols
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #12)
> (In reply to Lee Salzman [:lsalzman] from comment #11)
> > (In reply to Bogdan Maris, QA [:bogdan_maris] from comment #10)
> > > Created attachment 8915590 [details]
> > > Testcase.html
> > > 
> > > I managed to make a simple testcase, hope this helps. I can still see this
> > > issue on latest Nightly 58.0a1.
> > 
> > I tried this testcase but it doesn't seem to reproduce the issue for me.
> > Also for some reason, the profile does not have symbols with it. Did you
> > generate this profile with a release or a local build? Without symbols, it's
> > hard to see what's going on here. All we know for sure is most of the time
> > is spent in a semaphore, but no telling where from.
> 
> I don't recall what Firefox I used but I recorded another session on latest
> Nightly 58.0a1 https://perfht.ml/2wIdUw0. Note that I could not reproduce
> this only on 32bit Ubuntu version (only tried 16.04). Maybe this helps.
> 
> Also note that I did receive some no symbols messages in terminal when using
> profiler.
> 
> > /usr/bin/nm: /usr/lib/i386-linux-gnu/libgtk-3.so.0: no symbols
> > /usr/bin/nm: /usr/lib/i386-linux-gnu/libxcb.so.1: no symbols
> > /usr/bin/nm: /lib/i386-linux-gnu/libglib-2.0.so.0: no symbols
> > /usr/bin/nm: /lib/i386-linux-gnu/libm.so.6: no symbols
> > /usr/bin/nm: /usr/lib/i386-linux-gnu/libdbus-glib-1.so.2: no symbols
> > /usr/bin/nm: /lib/i386-linux-gnu/libc.so.6: no symbols
> > /usr/bin/nm: /usr/lib/i386-linux-gnu/libgio-2.0.so.0: no symbols
> > /usr/bin/nm: /usr/lib/i386-linux-gnu/libgdk-3.so.0: no symbols
> > /usr/bin/nm: /usr/lib/i386-linux-gnu/libX11.so.6: no symbols
> > /usr/bin/nm: /usr/lib/i386-linux-gnu/libgobject-2.0.so.0: no symbols
> > /usr/bin/nm: /lib/i386-linux-gnu/libdbus-1.so.3: no symbols
> > /usr/bin/nm: /usr/lib/i386-linux-gnu/libffi.so.6: no symbols
> > /usr/bin/nm: /lib/i386-linux-gnu/libgcc_s.so.1: no symbols
> > /usr/bin/nm: /lib/ld-linux.so.2: no symbols
> > /usr/bin/nm: /usr/lib/i386-linux-gnu/libatk-1.0.so.0: no symbols
> > /usr/bin/nm: /usr/lib/i386-linux-gnu/libibus-1.0.so.5: no symbols

Matt, this profile shows that a really long time during rasterization is being spent waiting on a CrossProcessSemaphore. This was added by you in bug 1325227. Any ideas?
Flags: needinfo?(matt.woodrow)
Blocking on the CrossProcessSemaphore usually just means that the compositor is too slow, and it hasn't unlocked buffers from the previous frame yet.

Doesn't look like we have markers for the compositor, so it's hard to tell how long compositions are taking, but the compositor thread looks to be extremely busy here.
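
To illustrate why a slow compositor shows up as semaphore wait time on the painting side, here is a toy timing model. It is only a sketch, not Gecko's buffer management: the two-buffer rotation, the function name `TotalPainterWaitMs`, and the fixed per-frame costs are all assumptions made for illustration.

```cpp
#include <algorithm>
#include <array>

// Toy model: the painter rasterizes frame n into buffer n % 2, but may only
// start once the compositor has unlocked that buffer (finished compositing
// the frame that previously lived in it). Returns the total time in
// milliseconds the painter spends blocked waiting for an unlock.
double TotalPainterWaitMs(double rasterMs, double compositeMs, int frames) {
  std::array<double, 2> bufferUnlockedAt{0.0, 0.0};
  double painterFreeAt = 0.0;
  double compositorFreeAt = 0.0;
  double waited = 0.0;
  for (int n = 0; n < frames; ++n) {
    double& unlockedAt = bufferUnlockedAt[n % 2];
    // The painter blocks here if the compositor still holds the buffer.
    const double start = std::max(painterFreeAt, unlockedAt);
    waited += start - painterFreeAt;
    const double painted = start + rasterMs;
    painterFreeAt = painted;
    // The compositor picks the frame up once it is painted and idle.
    const double compositeStart = std::max(painted, compositorFreeAt);
    compositorFreeAt = compositeStart + compositeMs;
    unlockedAt = compositorFreeAt;  // buffer is unlocked after compositing
  }
  return waited;
}
```

In this model, a compositor that keeps up with the painter produces no waiting at all, while a compositor that takes much longer than rasterization makes the painter stall on every buffer reuse, even though the painter itself is doing little work.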
Flags: needinfo?(matt.woodrow)
Bogdan, did you mean this only happens on 32 bit? Or that it *doesn't* happen on 32 bit?
Flags: needinfo?(bogdan.maris)
I don't really know the skia code, but it looks like we might be using the non-optimized version of SkRasterPipelineBlitter here, I can't see any mention of SSE2 or similar.

It's also possible that SkRasterPipelineBlitter isn't what we want to be using for a ColorLayer.
(In reply to Matt Woodrow (:mattwoodrow) from comment #16)
> I don't really know the skia code, but it looks like we might be using the
> non-optimized version of SkRasterPipelineBlitter here, I can't see any
> mention of SSE2 or similar.
> 
> It's also possible that SkRasterPipelineBlitter isn't what we want to be
> using for a ColorLayer.

That's why I asked Bogdan above to clarify what he meant about 32 bit. In Skia m59, SkRasterPipeline disables acceleration for 32 bit, and thus only uses SSE acceleration for 64 bit. I had assumed SkRasterPipeline went unused for our use cases, so if Skia is preferentially using it over its formerly favored hard-coded SSE blitters, while still using no acceleration on 32 bit platforms, that would be disturbing.

I'm seeing if it's possible to just disable SkRasterPipeline until we update to a newer Skia version in which 32 bit platforms are accelerated. That is premised on Bogdan using 32 bit (which is not clear from his wording).
Bogdan, if indeed you are using 32 bit, can you try the following experimental build and see if it fixes the issue for you?

https://queue.taskcluster.net/v1/task/BpuiSYzVS4CxXfzX-hLebQ/runs/0/artifacts/public/build/target.tar.bz2
(In reply to Lee Salzman [:lsalzman] from comment #18)
> Bogdan, if indeed you are using 32 bit, can you try the following
> experimental build and see if it fixes the issue for you?
> 
> https://queue.taskcluster.net/v1/task/BpuiSYzVS4CxXfzX-hLebQ/runs/0/
> artifacts/public/build/target.tar.bz2

That build looks very good. I can only see a ~10 fps drop when scrolling, and not every time. This is not noticeable to the naked eye, though. And yes, I only reproduced this on 32bit Ubuntu.
Flags: needinfo?(bogdan.maris)
This patch limits the use of SkRasterPipeline to those platforms for which SkJumper is actually accelerated. The platform defines are the ones referenced from SkRasterPipeline::run_with_jumper here: https://dxr.mozilla.org/mozilla-central/source/gfx/skia/skia/src/jumper/SkJumper.cpp?q=run_with_jumper&redirect_type=direct#326

In terms of release platforms, this affects us on 32-bit x86 on all OSes, but 64-bit x86 and ARM are both accelerated properly.

Bypassing SkRasterPipeline here causes us to go back to using the normal software shaders that Skia was using before m59.
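
The shape of that check can be sketched roughly as follows. This is not the landed patch: `kJumperAccelerated`, `BlitterPath`, and `ChooseBlitterPath` are hypothetical names, and the exact list of architecture macros is an assumption that should be checked against the defines in SkJumper.cpp.

```cpp
// Assumption: treat x86-64 and ARM as the accelerated platforms, matching
// the description above. The real defines live in SkJumper.cpp.
#if defined(__x86_64__) || defined(_M_X64) || defined(__aarch64__) || \
    defined(__arm__)
constexpr bool kJumperAccelerated = true;
#else
constexpr bool kJumperAccelerated = false;  // e.g. 32-bit x86
#endif

// Hypothetical labels for the two software paths discussed in this bug.
enum class BlitterPath { RasterPipeline, LegacyShaders };

// Use the raster pipeline only where it is accelerated; otherwise fall back
// to the shaders that were used before the m59 update.
constexpr BlitterPath ChooseBlitterPath(bool jumperAccelerated) {
  return jumperAccelerated ? BlitterPath::RasterPipeline
                           : BlitterPath::LegacyShaders;
}

// Decided at compile time for the platform being built.
constexpr BlitterPath kDefaultPath = ChooseBlitterPath(kJumperAccelerated);
```

Making the decision at compile time means accelerated platforms pay no runtime cost, while a 32-bit x86 build never enters the unaccelerated pipeline.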
Assignee: nobody → lsalzman
Status: NEW → ASSIGNED
Attachment #8917094 - Flags: review?(jmuizelaar)
Attachment #8917094 - Flags: review?(jmuizelaar) → review+
Blocks: 1340627
Component: Layout: View Rendering → Graphics
OS: Linux → All
Priority: P3 → P1
Hardware: All → x86
Whiteboard: [platform-rel-LinkedIn] → [platform-rel-LinkedIn] [gfx-noted]
Version: Trunk → 55 Branch
Pushed by lsalzman@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/32227656b09d
only use SkRasterPipeline when SkJumper is accelerated. r=jrmuizel
https://hg.mozilla.org/mozilla-central/rev/32227656b09d
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Verified that this is fixed using latest Nightly 58.0a1 (2017-10-11) on Ubuntu 16.04 32bit and Windows 7 32bit.
This brought improvements on Windows 7:

== Change summary for alert #9950 (as of October 10 2017 22:02 UTC) ==

Improvements:

  6%  tresize windows7-32 pgo e10s     10.39 -> 9.77
  4%  tresize windows7-32 opt e10s     11.86 -> 11.38

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=9950
Please request Beta approval on this when you get a chance.
Flags: needinfo?(bugs) → needinfo?(lsalzman)
Comment on attachment 8917094 [details] [diff] [review]
only use SkRasterPipeline when SkJumper is accelerated

Approval Request Comment
[Feature/Bug causing the regression]: bug 1340627
[User impact if declined]: Severe composition performance regressions on 32-bit x86 on all OSes.
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: no 
[List of other uplifts needed for the feature/fix]:
[Is the change risky?]: no
[Why is the change risky/not risky?]: Disables an experimental feature in Skia m59 that was not really ready for production on 32-bit x86, so that we are using the previous behavior from before our update to m59.
[String changes made/needed]: none
Flags: needinfo?(lsalzman)
Attachment #8917094 - Flags: approval-mozilla-beta?
Comment on attachment 8917094 [details] [diff] [review]
only use SkRasterPipeline when SkJumper is accelerated

Fixes a perf regression, Beta57+
Attachment #8917094 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Verified as fixed using Firefox 57beta8 on Ubuntu 16.04 32bit and Windows 7 32bit
Status: RESOLVED → VERIFIED