Closed Bug 1439293 Opened 5 years ago Closed 4 years ago

Scroll is not smooth in webrender (bad frame uniformity)

Categories

(Core :: Graphics: WebRender, defect, P2)

x86_64
Windows 10
defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: alice0775, Assigned: kats)

References

(Blocks 1 open bug)

Details

Attachments

(6 files)

Attached file about:support
Reproducible: always

Steps To Reproduce:
1. Open https://www.ecma-international.org/ecma-262/8.0/index.html
2. Wait for loading page
3. Try scroll


Actual Results:
 It lacks smoothness compared with Webrender disabled.
   with mouse wheel -- noticeable
   with arrow key --- slightly noticeable
   with autoscroll --- noticeable
For the reference,
Results of scroll test (using bookmarklet of attachment 776049 [details])

==Webrender enabled==

Page: https://www.ecma-international.org/ecma-262/8.0/index.html#sec-ecmascript-language-expressions
Steps (after ignoring 120): 323
Step: 5 px
Duration: 7000 ms (target: 7000)

Window size: 1280 x 895
Average interval: 16.20 ms
STDDEV intervals: 13.97 ms

intervals histogram:
4.0 - 6.0 ms: 2
6.0 - 8.0 ms: 16
8.0 - 10.0 ms: 11
10.0 - 12.0 ms: 37
12.0 - 14.0 ms: 45
14.0 - 15.9 ms: 72
15.9 - 17.3 ms: 49
18.0 - 22.0 ms: 49
22.0 - 32.0 ms: 33
32.0 - 35.0 ms: 2
35.0 - 50.0 ms: 3
50.0 - 80.0 ms: 2
120.0 - 180.0 ms: 1
180.0 - 250.0 ms: 1



==Webrender disabled:==

Page: https://www.ecma-international.org/ecma-262/8.0/index.html#sec-intro
Steps (after ignoring 120): 394
Step: 5 px
Duration: 7026 ms (target: 7000)

Window size: 1280 x 895
Average interval: 13.60 ms
STDDEV intervals: 3.28 ms

intervals histogram:
2.0 - 4.0 ms: 1
8.0 - 10.0 ms: 1
10.0 - 12.0 ms: 13
12.0 - 14.0 ms: 132
14.0 - 15.9 ms: 218
15.9 - 17.3 ms: 23
18.0 - 22.0 ms: 1
32.0 - 35.0 ms: 1
35.0 - 50.0 ms: 4
Don't know if this is APZ related that we're already working on, or general performance, nor how this behaves on the reference hardware, but we should find out.
Assignee: nobody → bugmail
Priority: -- → P1
Next steps here:
1) Try to repro on reference hardware
2) If it repro's, get a profile to see why frame intervals are so high.
Flags: needinfo?(jmuizelaar)
This works fine for me on the reference hardware. Can you still reproduce this?
Flags: needinfo?(jmuizelaar) → needinfo?(alice0775)
Tested with STR comment#0 on Nightly61.0a1 x64 Build ID 20180314102135:

I can still reproduce the jerky scroll when webrender is enabled. And it is very smooth when webrender is disabled.
Flags: needinfo?(alice0775)
Can you post a profile with "RenderBackend,Renderer" added to the threads list in the Gecko profiler?
Flags: needinfo?(alice0775)
Also note that the bookmarklet you're using is Avi's original one which does main-thread scrolling (it scrolls by a small increment in a rAF loop), not APZ scrolling. The one at https://staktrace.github.io/moz-pages/scrolltest.html does APZ scrolling.
(In reply to Jeff Muizelaar [:jrmuizel] from comment #6)
> Can you post a profile with "RenderBackend,Renderer" added to the threads
> list in the Gecko profiler?

I scrolled the page with mouse wheel up, down, up and down.
I tested 2 time each.

WrbRender on  : https://perfht.ml/2pcpm0X https://perfht.ml/2FVlQ4Q
WebRender off : https://perfht.ml/2pbE6Nz https://perfht.ml/2pc9Jqm
Flags: needinfo?(alice0775)
Curious. According to the profiles, it's a solid 60FPS even with webrender enabled. Composites are slower though, 5ms vs 2ms, but that shouldn't result in a noticeable lack of smoothness. So there must be more to it.
Scroll with autoscroll(middle click and then move mouse about 80px to the bottom:
Web render on  : https://perfht.ml/2peLpnR https://perfht.ml/2p8Utuv
Web render off : https://perfht.ml/2pcSaGV https://perfht.ml/2pbUIVE
Is it only slow on this page or every page?
Flags: needinfo?(alice0775)
Attached file screencaast
It is a bit emphasized because of the load of the screen capture application, but if you enable WebRender you can see that it is jerky scroll.
Flags: needinfo?(alice0775)
If you turn on gfx.webrender.debug.compact-profiler and gfx.webrender.debug.profiler what fps do you get?
Flags: needinfo?(alice0775)
Attached image fps
around 75fps +/-0.2 (scroll with autoscroll)
Flags: needinfo?(alice0775)
What's the refresh rate of your monitor?
Flags: needinfo?(alice0775)
Attached image monitor refresh rate
Display property indicates 75Hz
Flags: needinfo?(alice0775)
I think the problem here is one of frame uniformity rather than performance. i.e. we're able to composite at 75fps fine, but the async scroll offset doesn't change linearly per composite. If you step through the WebRender_on.avi in the attachment from comment 12 frame by frame, you can see some frames have a small amount of scroll change and some frames have a larger amount. Normally this behaviour can be caused by "bunching" of async scroll offset changes but given the smoothness of the profile I'm surprised that this would happen. However if it's reproducible we should be able to make a build with logging to see where the numbers are coming from and if they're getting improperly interleaved with vsync somehow.
Summary: Scroll is not smooth in webrender → Scroll is not smooth in webrender (bad frame uniformity)
Is this still reproducible? Now that we sample the async scroll offset at render time from the render backend thread it might be better.
Flags: needinfo?(alice0775)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #19)
> Is this still reproducible? Now that we sample the async scroll offset at
> render time from the render backend thread it might be better.

Yes, still reproduce.

str
1. open https://www.ecma-international.org/ecma-262/8.0/index.html#sec-corrections-and-clarifications-in-ecmascript-2015-with-possible-compatibility-impact
2. wait for the page to finish loading
3. scroll with mouse wheel up and down

compare smoothness without webrender and with webrender.

Actual Results:
scroll is not smooth when webrender is enabled


Build ID 	20180609100028
User Agent 	Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:62.0) Gecko/20100101 Firefox/62.0
Flags: needinfo?(alice0775)
We can't release this to the field, but we can let this ride to beta.
Priority: P1 → P2
Kat is on leave for 3 more weeks.  Un-assigning so we know it needs a new owner.
Assignee: kats → nobody
Priority: P2 → P3
Kats, do you want to see if this is still an issue and if there's something we can do about it?
Flags: needinfo?(kats)
Priority: P3 → P2
Sure, I can own this.

Alice, can you still reproduce this? If so I'll make a build with extra logging for you to try.
Flags: needinfo?(kats) → needinfo?(alice0775)
Assignee: nobody → kats
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #24)
> Sure, I can own this.
> 
> Alice, can you still reproduce this? If so I'll make a build with extra
> logging for you to try.

Build ID 	20180928100051
User Agent 	Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0

I can still reproduce the issue if WebRender is enabled.
Flags: needinfo?(alice0775)
Try build is in progress at https://treeherder.mozilla.org/#/jobs?repo=try&bugfiler=&group_state=expanded&revision=f1c7b791f5ebcb428b93d060dcf19c87b49e68b3 - please use this build and get a profile similar to what you got in comment 8. The marker table for the WRRenderBackend thread should have markers related to APZ sampling and we can use that information combined with the compositing intervals to confirm that it's a frame uniformity problem.

Note that the build also will spew the APZ sampling markers to the console output. You can ignore that if you see it.
Flags: needinfo?(alice0775)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #26)
> Try build is in progress at
> https://treeherder.mozilla.org/#/
> jobs?repo=try&bugfiler=&group_state=expanded&revision=f1c7b791f5ebcb428b93d06
> 0dcf19c87b49e68b3 - please use this build and get a profile similar to what
> you got in comment 8. The marker table for the WRRenderBackend thread should
> have markers related to APZ sampling and we can use that information
> combined with the compositing intervals to confirm that it's a frame
> uniformity problem.
> 
> Note that the build also will spew the APZ sampling markers to the console
> output. You can ignore that if you see it.


Umm, If I start Gecko Profiler then I cannot reproduce the issue. The scroll smoothness is almost same between WebRender on and off.
However, If I stop Gecko Profiler, the problem appears.



Anyway, Gecko profiler log are as follows.

WebRender on http://bit.ly/2xIDwva   
WebRender off http://bit.ly/2xJSUYg
Flags: needinfo?(alice0775)
Huh, interesting. In that case can you try with WebRender on and the profiler off, but collect the console output? You'll have to start firefox from a command prompt and redirect the output to a file. If you need more detailed instructions let me know.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #28)
> If you need more detailed instructions let me know.

yes please
Instructions on getting console output log:
- Download the MozillaBuild installer from https://ftp.mozilla.org/pub/mozilla/libraries/win32/MozillaBuildSetup-Latest.exe and install it (defaults to c:\mozilla-build which is fine)
- Download the try build and unzip it, let's say it's in your C:\Users\<username>\Downloads\firefox folder
- Set up a profile to use for testing, let's call it "wr_apz". This profile should have gfx.webrender.all=true. Also preferably set the default behaviour to restore your last session's tabs on startup, and load the page that you want to test scrolling on, so that when the profile is loaded it automatically loads the test page.
- Go to c:\mozilla-build and run the start-shell.bat file there. It should open a terminal window.
- Make sure you have no other instances of Firefox running (otherwise the next step might just open a new window with your existing Firefox, instead of launching the try build)
- In the terminal window, type "~/Downloads/firefox/firefox.exe -P wr_apz 2>&1 | tee ~/log.txt" and hit enter. This will start the build from your Downloads\firefox with the wr_apz profile and copy all the stdout and stderr output into the "log.txt" file in your Users home folder. The output should also appear in the terminal window, lots of lines with "APZSampling at <number> -> sampleTime <number>".
- Wait until the page is done loading and the browser is idle. Make a note of the (approximate) last sample time that was output to the terminal window.
- Scroll up and down, and make a note of the (approximate) last sample time that was output to the terminal window.
- Close the browser, and attach the log.txt file to this bug along with the approximate sample times from the last two steps. That will just help me narrow down which part of the log to focus on for the scrolling stuff.
Attached file log.txt
-start:  approx. 
APZSampling at 94456.254053 -> sampleTime 94469.321117

-Scroll up and down

-end :   approx. 
APZSampling at 113622.273137 -> sampleTime 113635.145618
(Regarding performance win under using gecko profiler in comment#27, I think Bug 1495208 is related.)
Thanks!

So the range from (94456.254053, 113622.273137] is just over 19 seconds. I looked at the intervals at which the APZ sampling happened (the first timestamp per line) and it breaks down kind of like this:

 94456.254053 -  97937.392903 (3.4 seconds) - no APZ sampling
 97937.392903 -  98327.328871 (0.4 seconds) - sampling intervals are pretty noisy
 98327.328871 - 104056.266998 (5.7 seconds) - no APZ sampling
104056.266998 - 105102.052370 (1.0 seconds) - sampling intervals are pretty noisy
105102.052370 - 112445.868285 (7.3 seconds) - APZ sampling at fairly consistent 13.x millisecond intervals
112445.868285 - 113622.273137 (1.1 seconds) - sampling intervals are pretty noisy

I'm going to assume that the 7.3 second period is the interesting one with APZ scrolling; the noisy intervals are likely to be due to switching focus to/away from the browser or some such activity.

Within this interval, I see the occasional double-sample; for example:

12.7068
14.0587
12.6126
13.7113
13.4434
13.3164
12.8264
7.53831 <--\  here are two sampling intervals that
6.31265 <--/  add up to ~13.8ms. So we sampled twice in that vsync
12.9104
13.7436
13.4753
12.7011
13.819
13.3192
13.3078

And there are four "larger" intervals that are not in the 13.3ms range (all of which are followed by another interval to line it back up with vsync). Here's what that looks like:

13.4065
13.3766
13.1992
13.3782
13.4081
165.741 <-- this is the large interval
6.84233 <-- this brings it back in line with vsync
13.8415
13.4397
13.3029
13.2283
13.4237

I'm not really sure what these are, best guess is changing scroll direction and scrolling again after the previous scrolling animation finished. But anyway, the sampling intervals here are quite uniform.

I also looked at the second number in the output lines, which is the timestamp that APZ uses for sampling animations. This is supposed to be the "next vsync" timestamp, and sure enough, this value was consistently ~11ms ahead of the "now" timestamp. For the cases where we did double-sampling in a vsync, the numbers were also what I would expect, and as far as I can tell everything here is as it should be.

It's possible that there is non-uniformity coming from further downstream, where APZ uses the timestamp to figure out the scroll position. I feel like that's unlikely but I can make a build that logs that too to see what those numbers look like.
Here's a build with the scroll offset logging: https://treeherder.mozilla.org/#/jobs?repo=try&bugfiler=&group_state=expanded&revision=3c87bcf4683604d354c6e54092cbf162eb69da05

Alice, could you try this one please? Same steps as last time.
Flags: needinfo?(alice0775)
Attached file log.txt
-start:  approx. 
APZSampling at 17538.826313 -> sampleTime 17549.701359 2:0.000000 3:0.000000 4:225614.468750

-end :   approx. 
APZSampling at 41834.577294 -> sampleTime 41846.279364 2:0.000000 3:0.000000 4:228926.468750
Flags: needinfo?(alice0775)
So in this log the double-composites are much more frequent. Also I'm noting that the whenever a double composite happens, both the composites have the same y offset emitted by APZ. It's not exactly clear to me when these get presented to the screen but it's quite possible both of those get presented and that might result in the visual issue.

Alice, can you try setting apz.frame_delay.enabled to false and let me know if that makes things better/worse/same as before?
Flags: needinfo?(alice0775)
Almost the same between apz.frame_delay.enabled=true and false.


And, In the recent nightly64.0a1 build, even if Webrender is enabled, it is much smoother than  the previous. 
I think this bug can be closed as worksforme.

@Kartikaya,
Thanks for hard work.
Flags: needinfo?(alice0775)
Great! Most likely the recent work to remove the double-rendering was what helped here.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #38)
> Great! Most likely the recent work to remove the double-rendering was what
> helped here.

What was that work?
Flags: needinfo?(kats)
I was thinking of servo/webrender#3092 but that's been in the tree for a few days already so maybe that wasn't it.
Flags: needinfo?(kats)
You need to log in before you can comment on or make changes to this bug.