Closed Bug 1166871 Opened 7 years ago Closed 7 years ago

[APZ] Wheel events are sometimes sent with the wrong position; tab bar scrolls while scrolling content

Categories

(Core :: Panning and Zooming, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: mstange, Assigned: kats)

References

Details

Attachments

(2 files, 2 obsolete files)

Steps to reproduce:
1. Put enough tabs in your tab bar so that it overflows.
2. Scroll to the very right of the tab bar.
3. Load http://planet.mozilla.org/ in a tab, or any other long and complex page.
4. Place your mouse in the middle of that tab's content area and keep scrolling up and down very quickly.

Actual results:
At some point, the tab bar itself will suddenly scroll to the left.

Expected results:
The tab bar shouldn't scroll.
Assignee: nobody → bugmail.mozilla
I reproduced this on linux as well.
Blocks: apz-linux
OS: Mac OS X → All
Summary: [OS X + APZ] Wheel events are sometimes sent with the wrong position; tab bar scrolls while scrolling content → [APZ] Wheel events are sometimes sent with the wrong position; tab bar scrolls while scrolling content
So this is basically the same issue as bug 973105, but for wheel events. The fix we applied in that bug was to flush pending repaints at the start of a touch block. That works for touch events but the concept of a "input block" isn't very crisp with wheel input so I'm not sure if it will carry over. We might need to come up with a different strategy for wheel input.
Braindumping some more thoughts:

There's actually two different things that bug 973105 fixes. One is the issue that on the first input for an input block, if we hit a dtc region in the APZ hittest, we don't really know which untransform-to-gecko we need to apply to that input. The patch on bug 973105 solves this problem by ensuring the untransform-to-gecko for all the APZCs is the same (i.e. an empty untransform) in this scenario, by flushing any pending repaints before doing the untransform. This code is missing in the wheel input path, and we should add it. To do this we need to add some code at [1] that detects if the wheel input the start of a new block (we can do this by checking the aOutInputBlockId against a previously stored value) and flush the pending repaints in that case.

However, I don't think that will fix this bug, which is a separate problem.

In this bug, what's happening is that the wheel events later in the middle of the block get untransformed to a point outside the scrollframe because of the transient async transform. In theory this could happen with touch events but I wrote a test for this and it doesn't happen because gecko implements touch capture so all touch events in a block go to the same target element. This appears to not be the case for wheel events, so wheel events which are in the same APZ transaction can end up triggering wheel listeners on different elements.

[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/apz/src/APZCTreeManager.cpp?rev=c780203cf304#590
I think capturing wheel events at the element that the wheel transaction started over would be a good change, and one we should make even without APZ.
That would probably fix this bug, but I'm wondering if that would break backwards compatibility?
I don't know. Maybe we should write up a patch and test it.

Olli, Masayuki, do you have opinions on comment 4?
Flags: needinfo?(masayuki)
Flags: needinfo?(bugs)
Parent process could indeed capture wheel events in case wheel events are targeted to content process.
We do have the transaction model for wheel events so that during the transaction same thing is being
scrolled, but I'm not sure how that works with apz. In fact, why doesn't that work in parent process already?

I haven't seen this bug on linux.
Flags: needinfo?(bugs)
How to enable APZ? I'd like to confirm the behavior on Windows, though.

I guess that tabbar scroll by wheel is handled by a wheel event handler of the tabbar implementation because it can be scrolled by vertical wheel operation. So, even when the mouse cursor is moved over content, tabbar still thinks the cursor is on the tabbar??
Flags: needinfo?(masayuki)
It looks like I didn't express myself very clearly... I'll file a new bug with a more detailed description of my proposal and an example, and then we can continue this conversation there.

In this bug I think we should come up with a plan to always have an empty untransform.
(In reply to Masayuki Nakano (:masayuki) (Mozilla Japan) (working slowly due to bad health condition) from comment #8)
> How to enable APZ? I'd like to confirm the behavior on Windows, though.

layers.async-pan-zoom.enabled = true and enable also e10s
(and yes, apz mostly super wonderful)
(In reply to Markus Stange [:mstange] from comment #9)
> It looks like I didn't express myself very clearly... I'll file a new bug
> with a more detailed description of my proposal and an example, and then we
> can continue this conversation there.

Did you end up filing a bug for this?

> In this bug I think we should come up with a plan to always have an empty
> untransform.

I'm not sure this is feasible without losing much of the benefit of APZ on wheel events. I have a different idea which is to simply discard events that, after untransforming, end up in a different APZ. Not sure if that will work but I'll give it a shot.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #11)
> (In reply to Markus Stange [:mstange] from comment #9)
> > It looks like I didn't express myself very clearly... I'll file a new bug
> > with a more detailed description of my proposal and an example, and then we
> > can continue this conversation there.
> 
> Did you end up filing a bug for this?

bug 1168182
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #11)
> > In this bug I think we should come up with a plan to always have an empty
> > untransform.
> 
> I'm not sure this is feasible without losing much of the benefit of APZ on
> wheel events. I have a different idea which is to simply discard events
> that, after untransforming, end up in a different APZ. Not sure if that will
> work but I'll give it a shot.

On IRC Markus convinced me that this may not be so bad. I was concerned that having to flush a repaint request before every wheel event might be bad, but Markus said they shouldn't actually trigger repaints individually, only on refresh driver ticks. So they should still get coalesced on the content side.

I implemented this the trivial way, by sticking this:

+      {
+        MonitorAutoLock lock(mTreeLock);
+        FlushRepaintsRecursively(mRootNode);
+      }

at the top of the SCROLLWHEEL_INPUT handler in APZCTM and it seems to work on Linux but not on OS X. I'm investigating to see why it's different there (can't use rr :().
Ah, the problem is that in the OnScrollWheel handler we can actually do a scroll (if the scroll mode is INSTANT) and so in between flushing the repaint and doing the untransform we can change the async transform. This never happens with a touch-start event and so isn't an issue with touch events.

Furthermore this should only happen in cases where there is no wheel listener at that point (otherwise the wheel block would sit in the input queue waiting for content) so we should just be able to skip the untransform entirely for wheel events. Technically it means that that we're dispatching wheel events with the wrong point but the only code that should be affected by it is Gecko C++ code that uses wheel events without registering a wheel listener.
Attached patch Patch (obsolete) — Splinter Review
Comment on attachment 8610559 [details] [diff] [review]
Patch

Review of attachment 8610559 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/apz/src/APZCTreeManager.cpp
@@ +600,5 @@
>          // Update the out-parameters so they are what the caller expects.
>          apzc->GetGuid(aOutTargetGuid);
> +
> +        // We don't need to untransform the point because we flushed pending
> +        // repaints up above. See bug 1166871 comment 14.

Can you add a comment that says that the untransform can have become non-empty between the repaint flush and this line, and so respecting the untransform would actually lead to wrong results? Without such a comment the reader might be tempted to add an assertion about the untransform being empty here (which would fail).
Attached patch Patch v2 (obsolete) — Splinter Review
Updated comments. My gut feeling is that there's a better solution to this but I can't think of it :(
Attachment #8610559 - Attachment is obsolete: true
Attachment #8610788 - Flags: review?(botond)
Comment on attachment 8610788 [details] [diff] [review]
Patch v2

Review of attachment 8610788 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/apz/src/APZCTreeManager.cpp
@@ +574,5 @@
>        result = ProcessTouchInput(touchInput, aOutTargetGuid, aOutInputBlockId);
>        break;
>      } case SCROLLWHEEL_INPUT: {
> +      { // In this block we flush repaint requests for the entire APZ tree. We
> +        // need to do this at the start of an input block for a number of

This comment seems to be copied verbatim from GetTouchInputBlockAPZC(). Can we extract a function instead and use that?

Also, note that the "at the start of an input block" bit is no longer accurate, as for wheel events we're not doing this in the middle of an input block, too.

@@ +604,5 @@
> +        // We don't need to untransform the point because we flushed pending
> +        // repaints up above. See bug 1166871 comment 14. Note that in particular
> +        // the untransform can have become non-empty between the repaint flush
> +        // above and this line, and so doing the untransform would actually
> +        // produce incorrect results.

Can we do the FlushRepaints after the ReceiveInputEvent, and thereby avoid the incorrectness mentioned in comment 14, limited to Gecko C++ consumer as it may be?

We could then add an assert of the form Markus mentioned in comment 16. (I would even be supportive of adding a similar assertion to ProcessTouchInput() for the MULTITOUCH_START case).
(In reply to Botond Ballo [:botond] from comment #18)
> @@ +604,5 @@
> > +        // We don't need to untransform the point because we flushed pending
> > +        // repaints up above. See bug 1166871 comment 14. Note that in particular
> > +        // the untransform can have become non-empty between the repaint flush
> > +        // above and this line, and so doing the untransform would actually
> > +        // produce incorrect results.
> 
> Can we do the FlushRepaints after the ReceiveInputEvent, and thereby avoid
> the incorrectness mentioned in comment 14, limited to Gecko C++ consumer as
> it may be?

Can somebody convince me that it actually is incorrect? We want to send Gecko the event in its actual original position, and we want Gecko to have the pre-scroll state when it's processing the event. Isn't that what's happening?
(In reply to Markus Stange [:mstange] from comment #19)
> Can somebody convince me that it actually is incorrect? We want to send
> Gecko the event in its actual original position, and we want Gecko to have
> the pre-scroll state when it's processing the event. Isn't that what's
> happening?

If we want Gecko to have the pre-scroll state when it's processing the event, then we're doing it wrong in ProcessTouchInput(), because there too we apply a the gecko transform computed after the ReceiveInputEvent() call which may have done scrolling for a touch-move event.
Hmm. I'm actually not sure what should happen for touch events. It's certainly nice if the the offset between the touch point and the scroll position stays fixed, from content's perspective. (Which I think the current code achieves? Not sure though.)
(In reply to Botond Ballo [:botond] from comment #18)
> This comment seems to be copied verbatim from GetTouchInputBlockAPZC(). Can
> we extract a function instead and use that?

Doh. I meant to do that later when I wrote the original patch and then forgot. Will update the patch.

> Also, note that the "at the start of an input block" bit is no longer
> accurate, as for wheel events we're not doing this in the middle of an input
> block, too.

Right, I'll update that too.

Regarding the rest of the discussion, I realize now that touchmove events and wheel events differ in one fundamental way: touchmove events move around while scrolling, and wheel events do not. That is, say your finger is at y=10 and you move it to y=20. Ignoring the pan threshold, this causes a scroll of 10 pixels but the untransformed coordinate should be the same throughout. In the case of wheel events, if your mouse cursor is at y=10 and you wheel by 10 pixels, there is again a scroll of 10 pixels but the untransformed coordinate will change. Touch events move with the page but wheel events do not.

So in light of that, I think Markus is right - if we do a flush and no untransform the result is correct always. Alternatively, we could omit the flush, and do an untransform *before* ReceiveInputEvent - but that will still have the problem that the untransform will be based on a tentative APZC target which may turn out to be wrong.

Does that makes sense to everybody?
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #22)
> (In reply to Botond Ballo [:botond] from comment #18)
> So in light of that, I think Markus is right - if we do a flush and no

(Specifically, the flush has to be before ReceiveInputEvent)
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #22)
> Does that makes sense to everybody?

Sounds good to me.
Attached patch Patch v3Splinter Review
Attachment #8610788 - Attachment is obsolete: true
Attachment #8610788 - Flags: review?(botond)
Attachment #8611225 - Flags: review?(botond)
Comment on attachment 8611225 [details] [diff] [review]
Patch v3

Review of attachment 8611225 [details] [diff] [review]:
-----------------------------------------------------------------

I'm convinced. Thanks!

::: gfx/layers/apz/src/APZCTreeManager.cpp
@@ +1079,5 @@
> +  // Another reason we want to clear the untransform is that if our APZ hit-test
> +  // hits a dispatch-to-content region then that's an ambiguous result and we
> +  // need to ask Gecko what actually got hit. In order to do this we need to
> +  // untransform the input event into Gecko space - but to do that we need to
> +  // know which APZC got hit! This leads to a circular problem; the only way to

s/problem/dependency
Attachment #8611225 - Flags: review?(botond) → review+
Comment on attachment 8611225 [details] [diff] [review]
Patch v3

Review of attachment 8611225 [details] [diff] [review]:
-----------------------------------------------------------------

A gtest for this might be useful.
Attached patch GtestSplinter Review
Attachment #8612295 - Flags: review?(botond)
Comment on attachment 8612295 [details] [diff] [review]
Gtest

Review of attachment 8612295 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/tests/gtest/TestAsyncPanZoomController.cpp
@@ +2426,5 @@
>    mcc->RunThroughDelayedTasks();
>  }
>  
> +TEST_F(APZHitTestingTester, TestRepaintFlushOnWheelEvents) {
> +  SCOPED_GFX_PREF(TouchActionEnabled, bool, false);

The scoped-pref isn't needed, I removed it locally.
Comment on attachment 8612295 [details] [diff] [review]
Gtest

Review of attachment 8612295 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks!
Attachment #8612295 - Flags: review?(botond) → review+
https://hg.mozilla.org/mozilla-central/rev/ce3eba7648ce
https://hg.mozilla.org/mozilla-central/rev/cba82f19ba1f
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
See Also: → 1169844
You need to log in before you can comment on or make changes to this bug.