Closed Bug 1751510 Opened 8 months ago Closed 2 months ago

Zoom gesture can not be performed at the same time as a pan gesture

Categories

(Core :: Panning and Zooming, defect, P3)

Firefox 98
Desktop
Linux
defect

Tracking

()

RESOLVED FIXED
105 Branch
Tracking Status
firefox105 --- fixed

People

(Reporter: sawyerbergeron, Assigned: rzvncj, Mentored)

References

Details

(Whiteboard: [lang=c++])

Attachments

(1 file)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:98.0) Gecko/20100101 Firefox/98.0

Steps to reproduce:

  1. Start a zoom gesture in
  2. Stop zoom gesture but keep fingers on trackpad
  3. Perform a pan gesture

Actual results:

Page does not pan

Expected results:

Page pans after the zoom gesture has been recognized, so that if my cursor was not directly centered over the area I want to zoom in on I can make small corrections while zooming or correct without having to start a new gesture

Test was performed on a wlroots compositor (wayland) using Libinput 1.19.3

Expected behavior is visible using Gnome Web (Epiphany) on the same platform

OS: Unspecified → Linux
Hardware: Unspecified → Desktop
Component: Untriaged → Panning and Zooming
Product: Firefox → Core

First step to investigating this would be to understand what sequence of events do we get at the GTK level.

  • Do we get a GDK_TOUCHPAD_GESTURE_PHASE_END at the transition?
    • If not, are the events during the "pan" phase GDK_TOUCHPAD_GESTURE_PHASE_UPDATE?
  • Or are the events during the "pan" phase now GdkEventScroll?

I can reproduce this. It's pretty clear from the observed behaviour that we continue to get GdkEventTouchpadPinch throughout the entire gesture, since we continue to zoom in/our in response to small changes in the finger span even during the "pan" phase of the gesture.

APZ does support this use case (called "two-finger panning"), if it gets PinchGestureInput events whose focus point (on touchscreens, the midpoint of the two touch points) is changing.

In the GTK widget code, we set the focus point based on the x and y (or x_root and y_root) fields of GdkEventTouchpadPinch. It follows that these properties are not changing over the course of the gesture (they probably correspond to the cursor position on screen).

I see that the event also has dx and dy fields; I'm going to venture a guess that these do change to reflect the touchpad gesture's focus point.

So, I think this can be fixed by revising our code to use something like x + dx and y + dy (possibly with a transformation applied to dx and dy to get them into the same coordinate system as x and y, or at least magnitudes suitable for addition) as the focus point.

Marking as S3 for now. Also making this a mentored bug. Sawyer, let me know if you're interested in trying this!

Mentor: botond
Severity: -- → S3
Priority: -- → P3
Whiteboard: [lang=c++]

I'd be happy to work on this! I'm currently having some trouble with my moz build repo (mozboot/mach throw a really angry stack trace) and I have a thesis proposal to complete over the next couple weeks, but can put more time into this after that.

I've attempted the following template specialization for GdkEventTouchpadPinch:

diff --git a/widget/gtk/nsWindow.cpp b/widget/gtk/nsWindow.cpp
--- a/widget/gtk/nsWindow.cpp
+++ b/widget/gtk/nsWindow.cpp
@@ -4071,6 +4071,23 @@ static LayoutDeviceIntPoint GetRefPoint(
          aWindow->WidgetToScreenOffset();
 }
 
+template <>
+static LayoutDeviceIntPoint GetRefPoint(nsWindow* aWindow,
+                                        GdkEventTouchpadPinch* aEvent) {
+  if (aEvent->window == aWindow->GetGdkWindow()) {
+    // we are the window that the event happened on so no need for expensive
+    // WidgetToScreenOffset
+    return aWindow->GdkEventCoordsToDevicePixels(aEvent->x + aEvent->dx,
+                                                 aEvent->y + aEvent->dy);
+  }
+  // XXX we're never quite sure which GdkWindow the event came from due to our
+  // custom bubbling in scroll_event_cb(), so use ScreenToWidget to translate
+  // the screen root coordinates into coordinates relative to this widget.
+  return aWindow->GdkEventCoordsToDevicePixels(aEvent->x_root + aEvent->dx,
+                                               aEvent->y_root + aEvent->dy) -
+         aWindow->WidgetToScreenOffset();
+}
+
 void nsWindow::OnMotionNotifyEvent(GdkEventMotion* aEvent) {
   if (!mGdkWindow) {
     return;

Unfortunately, it doesn't seem to work. I thought that no transformations are needed for dx and dy, since they're added directly to x and y (and x_root and y_root), but maybe I'm wrong. I'm also not much of a gestures person, so maybe my touchpad navigation skills have something to do with the results, but I've been unable to properly switch from zooming to panning without lifting my (two) fingers from the touchpad.

Actually, it kind of works... If I just keep my fingers on the touchpad after I zoom but wait for a few seconds before bringing them together to try to pan, then panning appears to work as expected. Could somebody confirm that this is how I'm supposed to switch from zooming to panning without lifting my fingers off the touchpad? If the diff above works properly I'm happy to make it into a patch and submit it.

Actually I can't reproduce it working so I must have lifted a finger or done something else to break the gesture. Nevermind.

I've logged what happens while zooming, and here it is:

x: 485, y: 511, dx: 0.533646, dy: -10.0116
x: 485, y: 511, dx: 0.637009, dy: -9.05539
x: 485, y: 511, dx: 2.41373, dy: -8.86981
x: 485, y: 511, dx: 4.99988, dy: -9.45335
x: 485, y: 511, dx: -4.13783, dy: -7.00249
x: 485, y: 511, dx: -4.31024, dy: -2.91769
x: 485, y: 511, dx: -0.862045, dy: -8.86981
x: 485, y: 511, dx: -1.03445, dy: -4.90173
x: 485, y: 511, dx: 0, dy: -3.26782
x: 485, y: 511, dx: -0.172409, dy: -2.45087
x: 485, y: 511, dx: 0.689636, dy: -2.80099
x: 485, y: 511, dx: 1.55168, dy: -4.43491
x: 485, y: 511, dx: -0.172409, dy: -3.96808
x: 485, y: 511, dx: -1.03445, dy: -3.96808
x: 485, y: 511, dx: -3.10336, dy: -3.96808
x: 485, y: 511, dx: -3.27579, dy: -3.61795
x: 485, y: 511, dx: -5.17229, dy: -2.45087
x: 485, y: 511, dx: -4.82747, dy: -1.63391
x: 485, y: 511, dx: -3.4482, dy: 0.233414
x: 485, y: 511, dx: -2.06891, dy: -0.116699
x: 485, y: 511, dx: -0.748291, dy: -0.101303
x: 485, y: 511, dx: 0.176315, dy: 0.298386
x: 485, y: 511, dx: 0.3871, dy: 0.0436707
x: 485, y: 511, dx: 0.341812, dy: -0.0578308
x: 485, y: 511, dx: 1.02119, dy: -0.259216
x: 485, y: 511, dx: 12.183, dy: -1.83264
x: 485, y: 511, dx: 1.55168, dy: -2.21745
x: 485, y: 511, dx: 10.3446, dy: -1.28378
x: 485, y: 511, dx: 7.58603, dy: -0.700241
x: 485, y: 511, dx: 7.58603, dy: -1.05037
x: 485, y: 511, dx: 5.3447, dy: 0.466827
x: 485, y: 511, dx: 3.62061, dy: 0.350113
x: 485, y: 511, dx: 2.06891, dy: 0.116699
x: 485, y: 511, dx: 2.06891, dy: 0.233414
x: 485, y: 511, dx: 0.432541, dy: 0.390396
x: 485, y: 511, dx: 0, dy: 0.336899
x: 485, y: 511, dx: 0.345901, dy: 0.702469
x: 485, y: 511, dx: 0.166336, dy: 2.58983
x: 485, y: 511, dx: 2.24132, dy: 2.80099
x: 485, y: 511, dx: 1.55168, dy: 2.10074
x: 485, y: 511, dx: 4.13783, dy: 1.5172
x: 485, y: 511, dx: 2.06891, dy: 1.05037
x: 485, y: 511, dx: 2.24132, dy: -0.350113
x: 485, y: 511, dx: 2.75854, dy: -0.583527
x: 485, y: 511, dx: 2.58614, dy: -0.583527
x: 485, y: 511, dx: 1.35304, dy: -0.34346

x and y do not change at all. Changing the focus to CONSTANT_X + dx, CONSTANT_Y + dy does do something, but the effect is barely noticeable. It's not clear to me what conversion (and from what) dx and dy would need to undergo to make this work.

I believe I've figured it out, dx and dy need to be continuously updated. That is, it's not x + dx, but rather static_dx += dx; use x + static_dx;. The deltas are relative to the last event, not to the beginning of the zoom gesture. I'll mess around with it a bit more and submit a patch when it's ready.

Accumulating dx and dy (as described above) does work for panning, however I've compared the behaviour to the one exhibited by the Gnome browser, and that one allows for simultaneous panning and zooming, while with these changes Firefox doesn't seem to want to ever return to zooming once the panning-while-zooming starts. I've not looked at the code in any great detail, but it feels that once the focus point changes (however little) we're in pan-only mode - and it's very difficult to have the focus point not change at all.

Looks like this might require subtler changes than originally envisioned?

We do have a feature called "pinch locking", where we try to detect if a two-finger touchscreen gesture is a pan rather than a pinch, and lock into panning in that case.

To verify that this is what you're seeing, could you try setting the pref apz.pinch_lock.mode to 0, and see if the issue goes away?

If it is indeed pinch locking that's causing this, the next step would be to see why it's so sensitive. The condition for entering pinch lock is found here and it requires both that the focus change over the lifetime of the gesture be above a threshold, and that the span distance (the amount by which the distance between the fingers has changed) over the lifetime of the gesture be below a threshold.

(In reply to Botond Ballo [:botond] from comment #11)

We do have a feature called "pinch locking", where we try to detect if a two-finger touchscreen gesture is a pan rather than a pinch, and lock into panning in that case.

To verify that this is what you're seeing, could you try setting the pref apz.pinch_lock.mode to 0, and see if the issue goes away?

If it is indeed pinch locking that's causing this, the next step would be to see why it's so sensitive. The condition for entering pinch lock is found here and it requires both that the focus change over the lifetime of the gesture be above a threshold, and that the span distance (the amount by which the distance between the fingers has changed) over the lifetime of the gesture be below a threshold.

Yes, it was indeed the pinch locking. With apz.pinch_lock.mode set to 0 I'm seeing Firefox behave like Epiphany. I'll look into the threshold later today or tomorrow. Thanks!

And another thing about this patch, I'm seeing this comment in nsWindow::SynthesizeNativeTouchPadPinch():

  // We only set the fields of GdkEventTouchpadPinch which are
  // actually used in OnTouchpadPinchEvent().
  // GdkEventTouchpadPinch has additional fields (for example, `dx` and `dy`).
  // If OnTouchpadPinchEvent() is changed to use other fields, this function
  // will need to change to set them as well.

Should I set dx and dy there too? If so, where would I pull them from?

(Also, if this helps anything - clarity? - you can assign this bug to me, I intend to see it through eventually. :))

(In reply to Razvan Cojocaru from comment #13)

Should I set dx and dy there too? If so, where would I pull them from?

Very good point! Yes, we should. I would suggest the following:

  • When receiving the PHASE_BEGIN event, save the value of pointInWindow in a new field of nsWindow (e.g. mLastTouchpadPinchBegin). The dx and dy values here can be 0.
  • For PHASE_UPDATE and PHASE_END events, set their x and y to mLastTouchpadPinchBegin, and set their dx and dy to the difference (pointInWindow - mLastTouchpadPinchBegin).

(In reply to Razvan Cojocaru from comment #14)

(Also, if this helps anything - clarity? - you can assign this bug to me, I intend to see it through eventually. :))

Done -- thanks!

Assignee: nobody → rzvncj

(In reply to Botond Ballo [:botond] from comment #15)

(In reply to Razvan Cojocaru from comment #13)

Should I set dx and dy there too? If so, where would I pull them from?

Very good point! Yes, we should. I would suggest the following:

  • When receiving the PHASE_BEGIN event, save the value of pointInWindow in a new field of nsWindow (e.g. mLastTouchpadPinchBegin). The dx and dy values here can be 0.
  • For PHASE_UPDATE and PHASE_END events, set their x and y to mLastTouchpadPinchBegin, and set their dx and dy to the difference (pointInWindow - mLastTouchpadPinchBegin).

Right, but that's not how GDK sets them according to my tests. Your assumption is that on each PHASE_UPDATE event dx and dy are "full" deltas with respect to the original point (specified as x and y in the event). But what I'm seeing is that on each PHASE_UPDATE event, the delta is relative to the position of the previous PHASE_UPDATE event.

Anyway, I got the idea, I'll update the code to reflect the GDK behaviour. :)

Blocks: 1779812
No longer blocks: 1779812
Depends on: 1779812
Blocks: 1780121

We ran into a complication while working on a test for this in bug 1780121.

The native events produced by the widget and processed by APZ need to be mapped onto web platform events which are dispatched to the web content, and the web content needs to have an opportunity to override the browser's default handling of the events by calling preventDefault().

As there are no events directly representing touchpad touches or pinch gestures in the web platform, for touchpad pinch gestures we map them onto wheel events with the Ctrl modifier set, with their deltaY set based on the amount of zooming (i.e. based on the change in finger span relative to the last event).

We have some code to drop pinch gesture events where the span change is zero, because in such cases the wheel events would have deltaY=0 and not get dispatched to content and, on pages which want to handle the zooming themselves, content would not get a chance to preventDefault() those events. This has led to bugs such as bug 1700819.

But dropping events with a zero span change means that pan-only movement during the pinch gesture won't work. (During manual testing, I'm guessing it worked because at least some events actually had a small amount of span change; however, events that did have a zero span change got dropped and this probably contibuted to the two-finger pan feeling slow. In our test, we tried to synthesize events which consistently had zero span change and those did not result in panning at all.)

Timothy, do you know if we support two-finger panning as a part of a pinch gesture on Windows touchpads? If so, how do we handle this, i.e. what sort of events do we send to APZ and to content during the pan phase of the gesture?

Flags: needinfo?(tnikkel)

(In reply to Botond Ballo [:botond] from comment #18)

But dropping events with a zero span change means that pan-only movement during the pinch gesture won't work. (During manual testing, I'm guessing it worked because at least some events actually had a small amount of span change; however, events that did have a zero span change got dropped and this probably contibuted to the two-finger pan feeling slow. In our test, we tried to synthesize events which consistently had zero span change and those did not result in panning at all.)

Yes, this is indeed exactly why it worked during manual testing. Not only that, but it's why it appeared to work in the initial mochitest I wrote: I was varying not only the focus point coordinates, but also the scale slightly. I also agree (and actually was planning to write about that here or in the diff) that the dropped events contribute to the less-than-ideal panning, once it begins.

(In reply to Botond Ballo [:botond] from comment #18)

We have some code to drop pinch gesture events where the span change is zero, because in such cases the wheel events would have deltaY=0 and not get dispatched to content and, on pages which want to handle the zooming themselves, content would not get a chance to preventDefault() those events. This has led to bugs such as bug 1700819.

I would say that we accumulate the event deltas until they are big enough to generate a wheel event with deltaY != 0, rather than just dropping them.

Timothy, do you know if we support two-finger panning as a part of a pinch gesture on Windows touchpads? If so, how do we handle this, i.e. what sort of events do we send to APZ and to content during the pan phase of the gesture?

No, once we start a pinch gesture, the gesture has to end before we can get back into panning. A gesture can start as panning and transition to pinching. In that case we send the pan end before the pinch start.

The code is here, where we can only enter panning if we had no previous state, but we can enter pinching if we were previously panning:
https://searchfox.org/mozilla-central/rev/42a517a4fd9d0c43486f1b20ffca06ce8876d3d7/widget/windows/DirectManipulationOwner.cpp#301

Direct manipulation just gives us a transform, so we could implement panning in the middle of pinching if we wanted, it would make the code in that file more complicated though.

Flags: needinfo?(tnikkel)

(In reply to Timothy Nikkel (:tnikkel) from comment #20)

Direct manipulation just gives us a transform, so we could implement panning in the middle of pinching if we wanted, it would make the code in that file more complicated though.

Do you have an idea of how that could work at the level of web platform events / preventDefault()?

I don't have any specific insight into that. Could we partition the gesture into distinct pan and pinch parts so that only one is changing at a time (one stops and then the other starts) and send the content events for each of those parts as if the other part of the gesture isn't happening and the preventDefault would only cancel that part of the gesture? Or if we didn't partition into distinct parts we'd have to send two content events for both things that are going on, handling preventDefault for one but not the other would be complicated. Or we could pack both the pan and pinch into a ctrl wheel event abusing the deltax/y/z fields. But that might confuse web content, the current setup (ctrl wheel event with deltay is already pretty weird), and web content would need to change to preventDefault these new types of pans it can get.

Markus, I'd be curious to know whether/how Mac handles this as well:

  • After you start a touchpad pinch-zoom gesture, if you then transition into a pan-only movement (two fingers moving in the same direction) without lifting your fingers, can you scroll that way?
  • If so, what kind of events do we dispatch to content during the pan movement?
  • Do we handle preventDefault() correctly in such cases (e.g. avoid activating browser zoom on a site like Bing Maps during a combined gesture like this)? (Do we pan the map instead?)
Flags: needinfo?(mstange.moz)

No, macOS does not allow you to transition a touchpad pinch zoom gesture into a scroll gesture, neither in Safari nor in Apple Maps.
In Apple maps you can zoom and rotate in the same gesture, but not zoom and pan. It seems that "pan or not pan" is decided at the start of the gesture, and then either only panning or only non-panning is allowed until both fingers are lifted.

Flags: needinfo?(mstange.moz)

Thank you all for the feedback.

It looks like this combined pinch + pan gesture is not something that's currently supported on either of our other desktop platforms (Windows or Mac), and a proper implementation (meaning, no events / deltas are dropped, the view responds to touchpad movements promptly, and sites with custom handling of zoom like mapping sites work well) would need some additional thought and potentially be fairly involved.

At the same time, we have an existing browser on Linux (Gnome Web) which supports this feature.

I think that an implementation that avoids dispatching events (to APZ or content) with a zero span change is better than not supporting panning at all, even if it's a bit less responsive / some events are dropped. So, I'm going to suggest that we land our current implementation, with one change: rather than dropping the delta for an event with zero span change, accumulate it into mCurrentTouchpadFocus such that it will be reflected as part of the next event with a nonzero span change.

For the test, we can work around the issue by making small changes to the span even during the "pan" part of the gesture. Perhaps we can also move the test patch from bug 1780121 to this bug, as the test and the fix are closely related and should probably land together.

(In reply to Botond Ballo [:botond] from comment #25)

At the same time, we have an existing browser on Linux (Gnome Web) which supports this feature.

Out of curiosity how does Gnone Web handle sending events to web content to potentially preventDefault in a pinch + pan gesture?

(In reply to Timothy Nikkel (:tnikkel) from comment #26)

(In reply to Botond Ballo [:botond] from comment #25)

At the same time, we have an existing browser on Linux (Gnome Web) which supports this feature.

Out of curiosity how does Gnone Web handle sending events to web content to potentially preventDefault in a pinch + pan gesture?

Good question. I'm not sure, but if we want to pursue a more proper solution in the future, then looking into that could turn up some useful ideas.

Pushed by bballo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/89a3162d795d
Zoom gesture can not be performed at the same time as a pan gesture. r=botond
Status: UNCONFIRMED → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → 105 Branch
Duplicate of this bug: 1780121
You need to log in before you can comment on or make changes to this bug.