Closed Bug 1406008 Opened 2 years ago Closed 2 years ago

webrender: scrolling by mousewheel+scrollbar broken until I do autoscrolling once

Categories

(Core :: Graphics: WebRender, defect, P1)

x86_64
All
defect

Tracking

()

VERIFIED FIXED
mozilla58
Tracking Status
firefox-esr52 --- unaffected
firefox56 --- unaffected
firefox57 --- unaffected
firefox58 --- unaffected

People

(Reporter: darkspirit, Assigned: kats)

References

Details

(Keywords: nightly-community, regression, Whiteboard: [wr-mvp])

Attachments

(2 files)

Nightly 58 x64 20171004220309 de_DE @ Debian Testing (KDE / Radeon RX480)
layers force accel + webrender

This is only if webrender is enabled and may be a regression.
about:buildconfig > https://hg.mozilla.org/mozilla-central/rev/c3b7759671deae73e40ebca01d7f23a326a4b8c2

Using the mousewheel and moving the scrollbar only works
* after (re-)loading a tab,
* after doing autoscroll once.

(Side note: Often, it doesn't work as long the page is loading, too.)

mozregression currently does not go beyond 2017-10-04 18:39:00.089000, revision ee1c41cf
>  0:01.12 WARNING: Skipping build 2017-10-05: Unable to find build info for 2017-10-05
Have to retry it later.
> Using the mousewheel and moving the scrollbar only works
> * after (re-)loading a tab,
> * after doing autoscroll once.

STR: Load some tabs, switch tab, try to scroll by mousewheel/scrollbar: doesn't work.
I believe I've seen this on macOS as well
Duplicate of this bug: 1405944
OS: Linux → All
Whiteboard: [wr-mvp] [triage]
I've seen this a couple of times previously but I can't reliably reproduce it. It doesn't seem like a recent regression to me although perhaps it's an intermittent thing that got more frequent?
layers force accel + webrender

"bad" is when I am not able to scroll down on about:preferences when webrender is enabled.

> mozregression --good 2017-10-03 --bad 2017-10-05 --profile-persistence clone-first
> [...]
> 6:09.50 INFO: Last good revision: 29a20bc04c6f95f8014238511d2cce7f299fabf3about:preferences
> 6:09.50 INFO: First bad revision: 6869bf44878df743ee848048954a959f77d5d9d7
> 6:09.50 INFO: Pushlog:
> https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=29a20bc04c6f95f8014238511d2cce7f299fabf3&tochange=6869bf44878df743ee848048954a959f77d5d9d7

Bug 1404782 - Do not mutate display list in layers-free mode. r=mattwoodrow
Thanks!

It looks like the patch adds a recursive call at [1] which is not handled properly. The code assumes that only recursion that can happen is at [2] and does a whole bunch of stuff beforehand (e.g. recording stuff like layerCountBeforeRecursing and all the things inside the apzEnabled block) and that's not happening around this new recursive call site. That's going to result in the event regions being all messed up which could very well explain the bad scrolling.

[1] https://hg.mozilla.org/integration/autoland/rev/db40593b664c#l1.58 
[2] http://searchfox.org/mozilla-central/rev/7ba03cebf1dcf3b4c40c85a4efad42a5022cf1ec/gfx/layers/wr/WebRenderLayerManager.cpp#333-335
Blocks: 1404782
Priority: -- → P2
Whiteboard: [wr-mvp] [triage] → [wr-mvp]
I think with a little rearranging in the affected code we should be able to fix this.
Assignee: nobody → bugmail
Priority: P2 → P1
Status: NEW → ASSIGNED
I pushed a possible fix to try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=10e41a68a6f1005593f21cb5da70e2839006e433

Once the build is done, Jan, can you try it to see if it fixes the problem?
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #8)
> I pushed a possible fix to try:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=10e41a68a6f1005593f21cb5da70e2839006e433
> 
> Once the build is done, Jan, can you try it to see if it fixes the problem?

Linux x64 opt:
comment 5 is fixed in this try build (both with a fresh profile and with my cloned main profile)
> "bad" is when I am not able to scroll down on about:preferences when webrender is enabled.

comment 0 is still reproducible when I'm using my cloned main profile, but unreproducible in a fresh profile
> STR: Load some tabs, switch tab, try to scroll by mousewheel/scrollbar: doesn't work.

(Might not be related, but I have no idea what I'm talking about.)
I ran a clone of my main profile in Linux x64 debug and activated highlight painted areas: I activated and deactivated autoscoll to be able to scroll down on https://www.golem.de. After switching to another tab, switching back and scrolling down 1/3 of the page, scrolling stopped (against my will) and I got this in my terminal while all text on the lower half of the page had a red background:
> [GPU 15174, Compositor] ###!!! ASSERTION: Tried to delete invalid animation: 'Error', file /builds/worker/workspace/build/src/gfx/layers/wr/WebRenderBridgeParent.cpp, line 420
Besides this I didn't get any message that could explain to me why scrolling by mousewheel didn't work after switching to this tab.
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #9)

> comment 0 is still reproducible when I'm using my cloned main profile, but unreproducible in a fresh profile
> > STR: Load some tabs, switch tab, try to scroll by mousewheel/scrollbar: doesn't work.
The difference is:
Dark/Light Theme: Problem.
Default Theme (=fresh profile): No Problem.
I switched many times between them and verified that my WebRender is enabled (layers force accel + webrender).

btw. I see a lot of
> WARN:webrender::scene: Property binding PropertyBindingKey { id: PropertyBindingId { namespace: IdNamespace(18130), uid: 1 }, _phantom: PhantomData } has an invalid value.
spamming my terminal.
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #10)
> > > STR: Load some tabs, switch tab, try to scroll by mousewheel/scrollbar: doesn't work.
> The difference is:
> Dark/Light Theme: Problem.
> Default Theme (=fresh profile): No Problem.

I should run mozregression with webrender + dark theme. Maybe I find something.
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #9)
> Linux x64 opt:
> comment 5 is fixed in this try build (both with a fresh profile and with my
> cloned main profile)
> > "bad" is when I am not able to scroll down on about:preferences when webrender is enabled.

Ok, that's good news. So one problem is fixed by the patch. I'll split that out into a separate bug and put it up for review.

> comment 0 is still reproducible when I'm using my cloned main profile, but
> unreproducible in a fresh profile
> > STR: Load some tabs, switch tab, try to scroll by mousewheel/scrollbar: doesn't work.

Ok, let's use this bug to continue to track this problem.

> (Might not be related, but I have no idea what I'm talking about.)
> I ran a clone of my main profile in Linux x64 debug and activated highlight
> painted areas: I activated and deactivated autoscoll to be able to scroll
> down on https://www.golem.de. After switching to another tab, switching back
> and scrolling down 1/3 of the page, scrolling stopped (against my will) and
> I got this in my terminal while all text on the lower half of the page had a
> red background:
> > [GPU 15174, Compositor] ###!!! ASSERTION: Tried to delete invalid animation: 'Error', file /builds/worker/workspace/build/src/gfx/layers/wr/WebRenderBridgeParent.cpp, line 420
> Besides this I didn't get any message that could explain to me why scrolling
> by mousewheel didn't work after switching to this tab.

This error is likely unrelated to the scrolling problem, although it does indicate another bug somewhere. I'll split this out into a separate bug as well.

(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #10)
> The difference is:
> Dark/Light Theme: Problem.
> Default Theme (=fresh profile): No Problem.
> I switched many times between them and verified that my WebRender is enabled
> (layers force accel + webrender).

Interesting, I'll try switching themes and see if I can repro that way.

> btw. I see a lot of
> > WARN:webrender::scene: Property binding PropertyBindingKey { id: PropertyBindingId { namespace: IdNamespace(18130), uid: 1 }, _phantom: PhantomData } has an invalid value.
> spamming my terminal.

This is tracked in bug 1377894. It is probably also unrelated.
Filed bug 1406119 for the issue in comment 5/6.
No longer blocks: 1404782
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #11)
> > > > STR: Load some tabs, switch tab, try to scroll by mousewheel/scrollbar: doesn't work.
> > The difference is:
> > Dark/Light Theme: Problem.
> > Default Theme (=fresh profile): No Problem.
> 
> I should run mozregression with webrender + dark theme. Maybe I find something.

layers force accel + webrender + Dark Theme.

There are definitely good and bad builds, but some broken builds wanted to start in Safe Mode where WebRender is disabled, so I had to type "skip".

I visited https://www.heise.de and waited until everything was loaded. Then I tried to scroll down. If I could, it was a "good" build, otherwise a "bad" one.

> mozregression --good 2017-10-03 --bad 2017-10-05 --profile-persistence clone-firs
> [...]
>  8:59.05 INFO: Last good revision: 29a20bc04c6f95f8014238511d2cce7f299fabf3
>  8:59.05 INFO: First bad revision: deeb7f472519545765518f901ba018bf0b9a6fe9
>  8:59.05 INFO: Pushlog:
> https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=29a20bc04c6f95f8014238511d2cce7f299fabf3&tochange=deeb7f472519545765518f901ba018bf0b9a6fe9

Second test, but with tracking protection and on https://www.golem.de (had to skip some "Safe Mode" builds, too):
>  9:21.38 INFO: Last good revision: ccd48f2cb1eb49321295252aef94d7d482c2f101
>  9:21.38 INFO: First bad revision: deeb7f472519545765518f901ba018bf0b9a6fe9
>  9:21.38 INFO: Pushlog:
> https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=ccd48f2cb1eb49321295252aef94d7d482c2f101&tochange=deeb7f472519545765518f901ba018bf0b9a6fe9
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #14)
My hope was that bug 1406119 primary/only affects internal pages.
I can reproduce this with the light theme (probably dark also, but didn't try). It looks like the hit testing tree we're generating has one layer split into two, and one of those two is sitting on top of the content area and eating all the events. Not really sure why as the gecko display list in the two cases looks equivalent. I'll keep digging.
Just a quick update: I traced the problem down to the existence of a FixedPosition item in the light theme. It's causing a change in the layer-generation for hit-testing because it forces a new layer and then event regions end up getting put on the wrong layer. I'm not sure yet how to fix it.
I wrote a patch that should fix it, here's the try push: https://treeherder.mozilla.org/#/jobs?repo=try&revision=2dfcbfb6d8d5b3398622c0c7830e7b6c0384d731

This event regions manipulation code is such a disaster. I can't wait to throw it all away and replace it with the WR hit-testing stuff...
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #18)
> I wrote a patch that should fix it, here's the try push:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=2dfcbfb6d8d5b3398622c0c7830e7b6c0384d731
> 
> This event regions manipulation code is such a disaster. I can't wait to throw it all away and replace it with the WR hit-testing stuff...

Thank you!

Linux x64 opt: layers force accel + webrender
1. I can scroll on about:preferences again. (bug 1406119). The preferences search bar looks broken/mispositioned (also in my Nightly), might be bug 1366295. (I don't know.)
2. Dark/Light Theme: scrolling down with mousewheeel/scrollbar after switching tabs works again. (this bug)

Off topic:
Linux x64 debug:
1. with layers force accel + webrender + webrendest + tracking protection on https://www.golem.de I saw
> WARN:webrender::prim_store: invalid primitive rect TypedRect(0×0 at (1,1))
2. with my cloned main profile I saw on https://www.heise.de
> [GPU 9067, Compositor] ###!!! ASSERTION: Unexpected layers id in RecvSetTargetAPZC; dropping message...: 'Error', file /builds/worker/workspace/build/src/gfx/layers/ipc/APZCTreeManagerParent.cpp, line 213
> IPDL protocol error: Handler returned error code!
> 
> ###!!! [Parent][DispatchAsyncMessage] Error: PAPZCTreeManager::Msg_SetTargetAPZC Processing error: message was deserialized, but the handler returned false (indicating failure)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #18)
> I wrote a patch that should fix it, here's the try push:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=2dfcbfb6d8d5b3398622c0c7830e7b6c0384d731
> 
> This event regions manipulation code is such a disaster. I can't wait to
> throw it all away and replace it with the WR hit-testing stuff...

Try push shows some failures, I think because of a silly mistake in my code. Here's another try push hopefully with it fixed:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=9d102fdbd36d24e16234add3c635e400ed0c0efe
Duplicate of this bug: 1406129
Comment on attachment 8915974 [details]
Bug 1406008 - Add pref for just dumping the parent process display list.

https://reviewboard.mozilla.org/r/187260/#review192266
Attachment #8915974 - Flags: review?(jmuizelaar) → review+
Comment on attachment 8915975 [details]
Bug 1406008 - Try to better handle an edge case in APZ hit-testing with layers-free WR.

https://reviewboard.mozilla.org/r/187262/#review192268
Attachment #8915975 - Flags: review?(jmuizelaar) → review+
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/858b80bd2e55
Add pref for just dumping the parent process display list. r=jrmuizel
https://hg.mozilla.org/integration/autoland/rev/3cf1ae279875
Try to better handle an edge case in APZ hit-testing with layers-free WR. r=jrmuizel
https://hg.mozilla.org/mozilla-central/rev/858b80bd2e55
https://hg.mozilla.org/mozilla-central/rev/3cf1ae279875
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #10)
> > comment 0 is still reproducible when I'm using my cloned main profile, but unreproducible in a fresh profile
> > > STR: Load some tabs, switch tab, try to scroll by mousewheel/scrollbar: doesn't work.
> The difference is:
> Dark/Light Theme: Problem.
> Default Theme (=fresh profile): No Problem.

Verified fixed in Nightly 58 x64 20171008131700 de_DE @ Debian Testing.
Thank you!
Status: RESOLVED → VERIFIED
Has STR: --- → yes
Fixed for me as well!
You need to log in before you can comment on or make changes to this bug.