Closed Bug 1425573 Opened 5 years ago Closed 5 years ago

overscroll-behavior not obeyed for unlayerized scroll frame when main thread is busy (WebRender edition)

Categories

(Core :: Panning and Zooming, defect, P3)

defect

Tracking

()

VERIFIED FIXED
mozilla60
Tracking Status
firefox57 --- unaffected
firefox58 --- unaffected
firefox59 --- wontfix
firefox60 --- verified
firefox61 --- verified

People

(Reporter: botond, Assigned: botond)

References

Details

(Whiteboard: [gfx-noted])

Attachments

(5 files)

STR:
  1. Load the attached testcase
  2. Quickly scroll down over the <div>.

Expected results:
  It's the <div> that scrolls, and since it's marked
  "overscroll-behavior: contain", when it reaches its
  end the page does not scroll.

Actual results:
  At first, the page scrolls instead of the <div>.
  Eventually the <div> is layerized, and if you
  scroll over it then, it scrolls (rather than the
  page) and obeys overscroll-behavior.
Depends on: 951793
Priority: -- → P3
Whiteboard: [gfx-noted]
The reason this happens is that initially, the <div> is not layerized. In this state, the only representation it has in APZ is being included in the dispatch-to-content region of the layer for the enclosing scroll frame (the page's).

When we scroll over it, we ask the main thread what to do. The main thread layerizes the <div>, and sends its layer (which contains the overscroll-behavior info) to APZ. If this response arrives in time (the current timeout is 400 ms), APZ knows to scroll it and obey overscroll-behavior. But on this page, the main thread is busy, so the response doesn't arrive in time, and APZ times out and just scrolls the enclosing scroll frame.
It's not immediately clear how to fix this. The DTC mechanism and associated timeout is a deliberate design choice to keep input events responsive, but in this case, the resulting effect of disobeying the developer's explicit wish to avoid scroll handoff (which is what overscroll-behavior:contain conveys) is unfortunate.

One approach to solve this would be to eagerly layerize scroll frames with a non-default overscroll-behavior. However, this may end up being very resource-intensive, especially if overscroll-behavior becomes a property that's widely adopted and used on many scroll frames.
With the nsDisplayCompositorHitTestInfo approach we're taking for webrender, this can be solved relatively easily by adding additional bits to the hit test info to indicate overscroll-behavior. The equivalent in current gecko would be to add another region to the event regions, I think.
(In reply to Botond Ballo [:botond] from comment #0)
> STR:
>   1. Load the attached testcase

There's no test-case?
Flags: needinfo?(botond)
Attached file Testcase
Doh! Forgot to attach the testcase. Here it is.
Flags: needinfo?(botond)
Attachment #8937185 - Attachment mime type: text/plain → text/html
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #3)
> With the nsDisplayCompositorHitTestInfo approach we're taking for webrender,
> this can be solved relatively easily by adding additional bits to the hit
> test info to indicate overscroll-behavior. The equivalent in current gecko
> would be to add another region to the event regions, I think.

To expand on this a bit: the behaviour on the APZ side when a hit-test result contains the new flag would be to drop the event if the timeout is reached. (Alternatively, we could wait longer before dropping the event, but the important thing is not to process the event without a content response.)

The flag itself could be something specific to overscroll-behavior, or it could be a general "processing the event *requires* a content response" flag.

We can start by implementing this for nsDislayCompositorHitTestInfo. It would be nice to avoid implementing this for EventRegions, as it would require adding a new kind of region; one idea that's been floated by Kats would be to replace EventRegions with nsDisplayCompositorHitTestInfo for the non-WR codepath as well.
(In reply to Botond Ballo [:botond] from comment #6)
> We can start by implementing this for nsDislayCompositorHitTestInfo. It
> would be nice to avoid implementing this for EventRegions, as it would
> require adding a new kind of region; one idea that's been floated by Kats
> would be to replace EventRegions with nsDisplayCompositorHitTestInfo for the
> non-WR codepath as well.

It appears that, as of bug 1434243, there are actually three codepaths by which compositor hit test flags make it to APZ:

  1) nsDisplayCompositorHitTestInfo -> WebRender
  2) nsDisplayCompositorHitTestInfo -> EventRegions
  3) nsDisplayLayerEventRegions -> EventRegions

I'm going to start by fixing this issue for codepath #1.

Milan pointed out that it would be nice to fix this for Firefox 60, because it's an ESR release, and it would be good if the ESR shipped with an overscroll-behavior implementation that's free of known bugs. That may mean we have to fix this issue for codepaths #2 and possibly #3 as well.
Assignee: nobody → botond
Attached is a proof of concept fix for codepath #1. It's only plumbed through for wheel events, and it could use some cleanup, but it does fix the STR (when scrolling with wheel) when running with WebRender.
^ Here is a completed and cleaned-up fix for code path #1.
Comment on attachment 8951814 [details]
Bug 1425573 - Introduce a TargetConfirmationFlags struct.

https://reviewboard.mozilla.org/r/221106/#review227470

Comments need addressing, looks good otherwise.

::: gfx/layers/apz/src/APZCTreeManager.cpp:1416
(Diff revision 1)
>                    targetApzc.get());
>  
>        // Dispatch the event to the input queue.
>        result = mInputQueue->ReceiveInputEvent(
>            targetApzc,
> -          /* aTargetConfirmed = */ true,
> +          TargetConfirmationFlags{hitResult},

This should be TargetConfirmationFlags(true)

::: gfx/layers/apz/src/APZUtils.h:86
(Diff revision 1)
> +  explicit TargetConfirmationFlags(bool aTargetConfirmed)
> +    : mTargetConfirmed(aTargetConfirmed)
> +  {}
> +
> +  explicit TargetConfirmationFlags(gfx::CompositorHitTestInfo aHitTestInfo)
> +    : mTargetConfirmed(!(aHitTestInfo & gfx::CompositorHitTestInfo::eInvisibleToHitTest) &&

eInvisibleToHitTest is not a bitmask, it is the absence of bitflags, so you can't do a & comparison as it will always produce the same result.
Attachment #8951814 - Flags: review?(bugmail) → review+
Comment on attachment 8951815 [details]
Bug 1425573 - Make the underlying type of TargetConfirmationState uint8_t.

https://reviewboard.mozilla.org/r/221108/#review227474
Attachment #8951815 - Flags: review?(bugmail) → review+
Comment on attachment 8951816 [details]
Bug 1425573 - Introduce InputBlockState::ShouldDropEvents().

https://reviewboard.mozilla.org/r/221110/#review227482

::: gfx/layers/apz/src/InputQueue.cpp:733
(Diff revision 1)
>      CancelableBlockState* cancelable = curBlock->AsCancelableBlock();
>      if (cancelable && !cancelable->IsReadyForHandling()) {
>        break;
>      }
>  
>      INPQ_LOG("processing input from block %p; preventDefault %d target %p\n",

This log statement would be good to update as well so it reports "shouldDropEvents" in addition to "preventDefault"
Attachment #8951816 - Flags: review?(bugmail) → review+
Comment on attachment 8951091 [details]
Bug 1425573 - Require target confirmation for events that hit unlayerized scroll frames with non-default overscroll-behavior.

https://reviewboard.mozilla.org/r/220346/#review227490

If we can add a test that would be great. But if it's too hard don't worry about it.
Attachment #8951091 - Flags: review?(bugmail) → review+
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #15)
> ::: gfx/layers/apz/src/APZCTreeManager.cpp:1416
> (Diff revision 1)
> >                    targetApzc.get());
> >  
> >        // Dispatch the event to the input queue.
> >        result = mInputQueue->ReceiveInputEvent(
> >            targetApzc,
> > -          /* aTargetConfirmed = */ true,
> > +          TargetConfirmationFlags{hitResult},
> 
> This should be TargetConfirmationFlags(true)

Good catch, fixed.

> ::: gfx/layers/apz/src/APZUtils.h:86
> (Diff revision 1)
> > +  explicit TargetConfirmationFlags(bool aTargetConfirmed)
> > +    : mTargetConfirmed(aTargetConfirmed)
> > +  {}
> > +
> > +  explicit TargetConfirmationFlags(gfx::CompositorHitTestInfo aHitTestInfo)
> > +    : mTargetConfirmed(!(aHitTestInfo & gfx::CompositorHitTestInfo::eInvisibleToHitTest) &&
> 
> eInvisibleToHitTest is not a bitmask, it is the absence of bitflags, so you
> can't do a & comparison as it will always produce the same result.

Likewise!
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #17)
> Comment on attachment 8951816 [details]
> Bug 1425573 - Introduce InputBlockState::ShouldDropEvents().
> 
> https://reviewboard.mozilla.org/r/221110/#review227482
> 
> ::: gfx/layers/apz/src/InputQueue.cpp:733
> (Diff revision 1)
> >      CancelableBlockState* cancelable = curBlock->AsCancelableBlock();
> >      if (cancelable && !cancelable->IsReadyForHandling()) {
> >        break;
> >      }
> >  
> >      INPQ_LOG("processing input from block %p; preventDefault %d target %p\n",
> 
> This log statement would be good to update as well so it reports
> "shouldDropEvents" in addition to "preventDefault"

Updated.
Blocks: 1440112
(In reply to Botond Ballo [:botond] from comment #7)
> It appears that, as of bug 1434243, there are actually three codepaths by
> which compositor hit test flags make it to APZ:
> 
>   1) nsDisplayCompositorHitTestInfo -> WebRender
>   2) nsDisplayCompositorHitTestInfo -> EventRegions
>   3) nsDisplayLayerEventRegions -> EventRegions
> 
> I'm going to start by fixing this issue for codepath #1.

I'm going to limit the scope of this bug to codepath #1 (which is handled by the patches currently posted). Codepath #2 (and, if necessary, #3) will be handled in bug 1440112.
Summary: overscroll-behavior not obeyed for unlayerized scroll frame when main thread is busy → overscroll-behavior not obeyed for unlayerized scroll frame when main thread is busy (WebRender edition)
Blocks: 1440118
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #18)
> If we can add a test that would be great. But if it's too hard don't worry
> about it.

We can add a test, but I'm going to do it in a separate bug (bug 1440118) because so far I've only fixed the issue for WebRender, and I'd want to add the test to test_group_touchevents.html or test_group_wheelevents.html, both of which are currently disabled for WebRender.
Pushed by bballo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/bc444482c4a1
Introduce a TargetConfirmationFlags struct. r=kats
https://hg.mozilla.org/integration/autoland/rev/7be69993e8dc
Make the underlying type of TargetConfirmationState uint8_t. r=kats
https://hg.mozilla.org/integration/autoland/rev/4d760ffc0527
Introduce InputBlockState::ShouldDropEvents(). r=kats
https://hg.mozilla.org/integration/autoland/rev/a94434a6f077
Require target confirmation for events that hit unlayerized scroll frames with non-default overscroll-behavior. r=kats
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/720caf516e51
Follow-up to fix logging compilation failure. r=me and DONTBUILD
Flags: qe-verify+
I managed to reproduce the bug using an older version of Nightly (2017-12-15) on Windows 10 x64.
I retested everything using the latest Nightly 61.0a1 and beta 60.0b6 on macOS 10.11, Ubuntu 16.04 x64 and Windows 10 x64 and the bug is not reproducing anymore.

However, it takes a few seconds until you can scroll inside the <div>. Is that expected behaviour?
Flags: needinfo?(botond)
(In reply to Oana Botisan from comment #31)
> I managed to reproduce the bug using an older version of Nightly
> (2017-12-15) on Windows 10 x64.
> I retested everything using the latest Nightly 61.0a1 and beta 60.0b6 on
> macOS 10.11, Ubuntu 16.04 x64 and Windows 10 x64 and the bug is not
> reproducing anymore.
> 
> However, it takes a few seconds until you can scroll inside the <div>. Is
> that expected behaviour?

Yes, that's expected. This is due to the testcase having a JS busy loop to simulate the main thread being busy. (A real webpage shouldn't perform that badly, I just made the testcase like that to make the bug more obvious.)
Flags: needinfo?(botond)
Considering the comment 31 and comment 32, I will mark this bug as verified fix.
Thank you for your quick response.
Status: RESOLVED → VERIFIED
Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.