Closed Bug 1321069 Opened 3 years ago Closed Last year

Long tap breaks subsequent touch event handling (e.g. regular tap and momentum scrolling)

Categories

(Core :: User events and focus handling, defect, P3)

50 Branch
x86_64
Linux
defect

Tracking

()

VERIFIED FIXED
mozilla62
Tracking Status
firefox-esr52 --- wontfix
firefox-esr60 61+ verified
firefox60 --- wontfix
firefox61 --- verified
firefox62 --- verified

People

(Reporter: luciano, Assigned: kats)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0
Build ID: 20161104212021

Steps to reproduce:

Long-pressing on in the nav or search bar, until the right-click menu appears ,causes touch to stop working correctly.

This is firefox 50 on linux, gnome 3.20 , with e10s, and MOZ_USE_XPINUT_2=1.


Actual results:

After this, press or long-press has no mouse-click effect. It only behaves as a mouse-over event. e.g. links will become highlighted and their url will show in the url bar at the bottom of the screen; or tooltips will show up. But pressing will not follow the link. Only solutions I've found is to restart.


Expected results:

Touch should continue working as previously.
Component: Untriaged → Event Handling
Depends on: 1207700
OS: Unspecified → Linux
Product: Firefox → Core
Hardware: Unspecified → x86_64
I experience the same issue, although it doesn't seem to matter where the long-press happens. I can long-press on the page and it will break tap events until I restart Firefox. It also will cause scrolling with touch to be chunky and sluggish until I restart Firefox.
I have the same problem in Firefox Developer Edition 52.0. I had to set the preference dom.w3c_touch_events.enabled from 2 to 1 to enable touch scrolling. Long pressing breaks smooth scrolling and disables touch mouseclick events.
Stone, I feel like this is probably a widget issue but could it be related to touch events?
Marina, just to confirm: did you try this on Linux?
Flags: needinfo?(sshih)
Flags: needinfo?(marina.billes)
(In reply to luciano from comment #0)
> User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:50.0) Gecko/20100101
> Firefox/50.0
> Build ID: 20161104212021
> 
> Steps to reproduce:
> 
> Long-pressing on in the nav or search bar, until the right-click menu
> appears ,causes touch to stop working correctly.
> 
> This is firefox 50 on linux, gnome 3.20 , with e10s, and MOZ_USE_XPINUT_2=1.
Hi Luciano,
I'm unable to reproduce the symptom. Maybe I miss some steps. Could you kindly explain more about how to turn on e10s and set MOZ_USE_XPINUT_2=1?
And I guess you mean long-pressing on the search bar with touch, right?
Flags: needinfo?(sshih) → needinfo?(luciano)
(In reply to Randall Leeds [:tilgovi] from comment #1)
> I experience the same issue, although it doesn't seem to matter where the
> long-press happens. I can long-press on the page and it will break tap
> events until I restart Firefox. It also will cause scrolling with touch to
> be chunky and sluggish until I restart Firefox.
Hi Randall,
May I know the firefox version and platform you found this problem? Thanks.
Flags: needinfo?(randall.leeds)
(In reply to Andrew Overholt [:overholt] from comment #3)
> Stone, I feel like this is probably a widget issue but could it be related
> to touch events?
> Marina, just to confirm: did you try this on Linux?

Yes, Ubuntu 16.04 (Unity) with MOZ_USE_XPINUT_2=1 as the environment.
(In reply to Ming-Chou Shih [:stone] from comment #4)
> (In reply to luciano from comment #0)
> > User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:50.0) Gecko/20100101
> > Firefox/50.0
> > Build ID: 20161104212021
> > 
> > Steps to reproduce:
> > 
> > Long-pressing on in the nav or search bar, until the right-click menu
> > appears ,causes touch to stop working correctly.
> > 
> > This is firefox 50 on linux, gnome 3.20 , with e10s, and MOZ_USE_XPINUT_2=1.
> Hi Luciano,
> I'm unable to reproduce the symptom. Maybe I miss some steps. Could you
> kindly explain more about how to turn on e10s 

here: http://www.ghacks.net/2016/07/22/multi-process-firefox/

> and set MOZ_USE_XPINUT_2=1?

simply "export MOZ_USE_XINPUT_2=1" in a console, and then "firefox-bin" from the same console.

> And I guess you mean long-pressing on the search bar with touch, right?
Correct.

I can also confirm what Randall says; the long-press can be anywhere, not just in the search bar.
Flags: needinfo?(luciano)
Sorry, I didn't state my environment because it's more or less the same.

$ uname -a
Linux elpis 4.8.0-28-generic #30-Ubuntu SMP Fri Nov 11 14:03:52 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

$ firefox --version                                                ⏎
Mozilla Firefox 50.0.2

I have MOZ_USE_XINPUT2=1 set in my environment.

In my about:config I have:
  e10s.rollout.cohort = optedIn
  browser.tabs.remote.force-enable = true

I think these were things I had set to enable touch as quickly as I could after support was added. I don't know if all the same is still necessary.
Flags: needinfo?(randall.leeds)
(In reply to Randall Leeds [:tilgovi] from comment #8)
> Sorry, I didn't state my environment because it's more or less the same.
> 
> $ uname -a
> Linux elpis 4.8.0-28-generic #30-Ubuntu SMP Fri Nov 11 14:03:52 UTC 2016
> x86_64 x86_64 x86_64 GNU/Linux
> 
> $ firefox --version                                                ⏎
> Mozilla Firefox 50.0.2
> 
> I have MOZ_USE_XINPUT2=1 set in my environment.
> 
> In my about:config I have:
>   e10s.rollout.cohort = optedIn
>   browser.tabs.remote.force-enable = true
> 
> I think these were things I had set to enable touch as quickly as I could
> after support was added. I don't know if all the same is still necessary.

Thanks! I can reproduce it now.
When the problem happened, a GDK_TOUCH_END event is missed. It induces the Touch instance been kept in nsWindow::mTouches and never been removed until firefox is restarted. Wondering should we clear all instances in mTouches when showing the menu.
Hmm, there used to be code which cleared all the old touches when widget level dispatched touchstart with just one touch. Is that code not helping us here?
Looks like we only remove the old touch by the event sequence in [1]. When long pressing by touch to show the menu, GDK_TOUCH_END isn't dispatched so we didn't clear it. Wondering whether GDK_TOUCH_END is dispatched to other window or is missing. I'd do some tests to make sure of it.

[1] http://searchfox.org/mozilla-central/source/widget/gtk/nsWindow.cpp#3480
After long pressing with touch to show the context menu, we no longer received GDK_TOUCH_UPDATE. In this case, we received GDK_MOTION_NOTIFY (no matter the finger is on the menu or not) and GDK_BUTTON_RELEASE (when the finger is detached from the screen)
Hi Karl,
I have no idea why we won't receive touch events in this case. Could you kindly give me some suggestions to figure out of it? Thanks.

[1] http://searchfox.org/mozilla-central/rev/a6a339991dbc650bab3aefe939cd0e5ac302274a/widget/gtk/nsWindow.cpp#3778
Flags: needinfo?(karlt)
I've marked comments 10 to 12 obsolete based on comment 13.
If GDK_TOUCH_BEGIN is still being received, leading to new events in
nsWindow::mTouches, then please advise.

There are some things that can be tried to narrow down whether this is a
problem in X, GTK, or Gecko.

It may be possible to use https://xtrace.alioth.debian.org/ to log
X Events to see whether these are still received, but I haven't checked
whether that has support for details of XInput2 events.  Even if it doesn't show details, it may still be possible to observe whether there are any XInput2 events at all.

Use "xinput list" to identify the virtual core pointer.  If it says "id=2"
then use "xinput test-xi2 2" to see whether this happens with a simple X test
program.

Try a simple GTK program, such as gtk3-demo or gtk3-widget-factory, to see
whether the same issue reproduces there.  (The problem may however only be
triggered by specific interaction between Gecko and GTK.)

If X events are still received, the best way to find out what is going
wrong is to use rr to find where GDK_TOUCH_UPDATE events are generated in GDK (when they are generated) and then debug why they are not dispatched after the
long press.

Given the recent nature of this report, this seems likely to be a regression,
in which case finding a regression window, either in Firefox, GTK, or X would
provide some clues.

Note that the design of XInput2 or GDK's handling of the events is not yet
ready for production use of touch events due to bug 1182700.
Flags: needinfo?(karlt)
I am sorry for not reporting it earlier, but it may help to know that this has been the case since I enabled XInput2 with the environment variable in the first version of Firefox that I could. It does not seem to be a regression, to me, but a bug present since XInput2 support shipped.
(In reply to Karl Tomlinson (:karlt) from comment #15)
> I've marked comments 10 to 12 obsolete based on comment 13.
> If GDK_TOUCH_BEGIN is still being received, leading to new events in
> nsWindow::mTouches, then please advise.
Only the GDK_TOUCH_END of the first long press (with touch) is missed. After that, GDK_TOUCH_BEGIN / GDK_TOUCH_END is being received correctly.

I found it's because the long press gesture will fire eContextMenu event, which shows the (context menu) popup. And it will grabs pointer in [1] and induces the browser window can't receive the GDK_TOUCH_END event. In that case, the dom::Touch instance is remained in nsWindow::mTouches so the later GDK_TOUCH_BEGIN will be handled as multiple touches.

Wondering if it's ok to add GDK_TOUCH_MASK to the mask of gdk_pointer_grab and delegate touch events to the browser's nsWindow. (The problem is that we need to get the nsWindow of browser in the nsWindow of context menu). Or it's better to treat the touch gesture is finished and clear all touch instances when receiving some event. Could you kindly give me some suggestions about it? Thanks.

[1] http://searchfox.org/mozilla-central/rev/22be34bcc4d5c56b62482a537bba77a6cdce117b/widget/gtk/nsWindow.cpp#4693
Flags: needinfo?(karlt)
The X Input Extension 2.x doc [1] for version 2.3 says

"A touch sequence always generates TouchBegin and TouchEnd events"

"When the touch has physically ended, or a client will otherwise not receive
any more events for a given touchpoint, a TouchEnd event will be sent to that
client."

"A window set [for event propagation] is calculated on TouchBegin and remains
constant until the end of the sequence. Modifications to the window hierarchy,
new grabs or changed event selection do not affect the window set."

The XIGrabDevice "request does not affect the processing of XI 2.2 touch
events."

That implies to me that the same client that receives the TouchBegin should
receive the corresponding TouchEnd in the sequence, and that these should both
be received on the same window.  Is GDK changing the event processing so that
the TouchEnd is sent to the grab window and only when the event is selected by
the mask used in the grab?

(In reply to Ming-Chou Shih [:stone] from comment #17)
> Wondering if it's ok to add GDK_TOUCH_MASK to the mask of gdk_pointer_grab

If that helps, then my guess is that should be fine.

If GDK is breaking the model where TouchEnd is sent to the same window as
TouchBegin, then I guess mTouches will need to be a global variable rather
than a member variable on nsWindow.  That I guess requires that each touch will
need to be associated with a window so that only relevant touches are
dispatched in the WidgetTouchEvent.

"Touch tracking IDs" "are globally unique", which makes a global variable
easier.

> and delegate touch events to the browser's nsWindow. (The problem is that we
> need to get the nsWindow of browser in the nsWindow of context menu). Or
> it's better to treat the touch gesture is finished and clear all touch
> instances when receiving some event. Could you kindly give me some
> suggestions about it? Thanks.

I guess the nsWindow from TouchBegin can be used for dispatch when TouchEnd is
received, but the event state manager may already have a mechanism for
handling this.

What does the event state manager do to ensure that touch end events are sent
to the same DOM window that received the begin event when the the DOM window moves from one native window to another between touch begin and end?

[1] There's a older version of the protocol doc at
http://soc.if.usp.br/manual/x11proto-input-dev/XI2proto.html
I haven't found version 2.3 on a server that is responding right now.
Flags: needinfo?(karlt)
Duplicate of this bug: 1333098
Updating title to reflect that the missed GTK_TOUCH_END event is causing more general badness.
Summary: long-press in nav bar disables touchscreen mouse-click event → Long tap breaks subsequent touch event handling (e.g. regular tap and momentum scrolling)
Duplicate of this bug: 1333098
Assignee: nobody → sshih
Attachment #8834222 - Flags: review?(karlt)
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Comment on attachment 8834222 [details] [diff] [review]
Move nsWindow::mTouches to be a global variable and forward subsequent touch events to the nsWindow which received GDK_TOUCH_BEGIN

Sorry about the delay here.  I was hoping to investigate this some myself, but
it seems I'm not likely to make progress on that.

I'd like to know at least two things before making a change like this:

Why does the event state manager need the correct window for non-start events,
given it will need to assign all events in the sequence to the correct DOM
window anyway?

What is the bug that this is working around?

https://developer.gnome.org/gdk3/stable/gdk3-General.html#gdk-pointer-grab
says

"owner_events

 if FALSE then all pointer events are reported with respect to window and are
 only reported if selected by event_mask . If TRUE then pointer events for
 this application are reported as normal, but pointer events outside this
 application are reported with respect to window and only if selected by
 event_mask . In either mode, unreported events are discarded."

Given the "window set" in comment 18 and that owner_events is TRUE,
an entire sequence should be delivered to a single window.

If this is a bug in X or GTK, then that may already be fixed, and all we need
to do is check for that version (in considering which systems can have this
enabled by default).

(In reply to Ming-Chou Shih [:stone] from comment #17)
> Only the GDK_TOUCH_END of the first long press (with touch) is missed. After
> that, GDK_TOUCH_BEGIN / GDK_TOUCH_END is being received correctly.
> 
> I found it's because the long press gesture will fire eContextMenu event,
> which shows the (context menu) popup. And it will grabs pointer in [1] and
> induces the browser window can't receive the GDK_TOUCH_END event. In that
> case, the dom::Touch instance is remained in nsWindow::mTouches so the later
> GDK_TOUCH_BEGIN will be handled as multiple touches.

It is odd that the touch end would go to the wrong window, but subsequent
touch begin/end events go to the correct window.

If the source of the bug can be identified, and it is not yet fixed, then that
helps us to understand exactly what problems will need to be considered.

I fear this may be more evidence that GTK/X11 touch events are not in a usable
state.

>Bug 1321069: Move nsWindow::mTouches to be a global variable and forward subsequent touch events to the nsWindow which received GDK_TOUCH_BEGIN

>+static nsRefPtrHashtable<nsPtrHashKey<GdkEventSequence>, TouchInfo> sTouches;

static variable types must have trivial constructors.
I doubt that is true for nsRefPtrHashtable.

>-    if (!mHandleTouchEvent) {
>+    if (!sIsTouchEventsEnabled && !sIsPointerEventsEnabled) {

I assume this change is unrelated to this bug, in which case it needs to have
its own bug report.
Attachment #8834222 - Flags: review?(karlt)
(In reply to Karl Tomlinson (back Apr 26 :karlt) from comment #23)
Thanks for your feedbacks.

> I'd like to know at least two things before making a change like this:
> 
> Why does the event state manager need the correct window for non-start
> events,
> given it will need to assign all events in the sequence to the correct DOM
> window anyway?
We cached touch points in mTouches and dispatch them all together [1]. When gdk dispatch GDK_TOUCH_START to one window and GDK_TOUCH_END to another window, the window which got GDK_TOUCH_START will always keep the point in mTouches and never remove it. In that case, we always dispatch multiple points and may cause APZ to identify them as a gesture when next time user's finger down on the screen.

> What is the bug that this is working around?
Yes. This is work around. I'm not sure if there are other ways that we can force the touch end event being dispatched to the window which received touch start.
> 
> (In reply to Ming-Chou Shih [:stone] from comment #17)
> > Only the GDK_TOUCH_END of the first long press (with touch) is missed. After
> > that, GDK_TOUCH_BEGIN / GDK_TOUCH_END is being received correctly.
> > 
> > I found it's because the long press gesture will fire eContextMenu event,
> > which shows the (context menu) popup. And it will grabs pointer in [1] and
> > induces the browser window can't receive the GDK_TOUCH_END event. In that
> > case, the dom::Touch instance is remained in nsWindow::mTouches so the later
> > GDK_TOUCH_BEGIN will be handled as multiple touches.
> 
> It is odd that the touch end would go to the wrong window, but subsequent
> touch begin/end events go to the correct window.
This happens when we handling touch events and show the popup window, then the touch end is dispatched to the popup window (not the window which received the touch start event). The subsequence touch events are dispatched to the correct window which user touches. But the touch point remained in mTouches leads us to dispatch multiple points to APZ when next time user touches on the window. (after the popup window is dismissed).

[1] http://searchfox.org/mozilla-central/rev/944f87c575e8a0bcefc1ed8efff10b34cf7a5169/widget/gtk/nsWindow.cpp#3453
> (In reply to Karl Tomlinson (back Apr 26 :karlt) from comment #23)
> > Why does the event state manager need the correct window for non-start
> > events,
> > given it will need to assign all events in the sequence to the correct DOM
> > window anyway?

(In reply to Ming-Chou Shih [:stone] from comment #24)
> We cached touch points in mTouches and dispatch them all together [1]. When
> gdk dispatch GDK_TOUCH_START to one window and GDK_TOUCH_END to another
> window, the window which got GDK_TOUCH_START will always keep the point in
> mTouches and never remove it. In that case, we always dispatch multiple
> points and may cause APZ to identify them as a gesture when next time user's
> finger down on the screen.

Thanks.  I understand that, if the start and end of the same sequence are on
different nsWindows, then looking up the Touch point on the second window will
not work when it was recorded on the first.

But why does it matter whether it is the main window or popup window that
dispatches the WidgetTouchEvent with an end point to its nsIWidgetListener or
mAPZC?  What code requires a particular nsWindow to dispatch?

Also, what defines which Touch points / sequences should be associated with
which WidgetTouchEvents?  The current code aims to attach Touch points for all
sequences on the same window to the dispatched WidgetTouchEvent.  If the Touch
points in a single sequence may occur on multiple windows, then what defines
which points should be attached to the WidgetTouchEvent?

I guess the answer to this question will also indicate which Touch points
should be cleared from the global set of Touch points when the an nsWindow is
destroyed before all associated sequences have received their end point.

If a gesture from a direct touch device, such as a touchscreen, begins on one
window, and there are then additional concurrent touch sequences outside the
window, then a different native window receives the additional touch sequences
[2].  What defines whether or not they should be included in the same
WidgetTouchEvent?

[2] http://soc.if.usp.br/manual/x11proto-input-dev/XI2proto.html#multitouch-processing

> > (In reply to Ming-Chou Shih [:stone] from comment #17)
> > > Only the GDK_TOUCH_END of the first long press (with touch) is
> > > missed. After that, GDK_TOUCH_BEGIN / GDK_TOUCH_END is being received
> > > correctly.
> > > 
> > > I found it's because the long press gesture will fire eContextMenu
> > > event, which shows the (context menu) popup. And it will grabs pointer
> > > in [1] and induces the browser window can't receive the GDK_TOUCH_END
> > > event. In that case, the dom::Touch instance is remained in
> > > nsWindow::mTouches so the later GDK_TOUCH_BEGIN will be handled as
> > > multiple touches.
> > 
> > It is odd that the touch end would go to the wrong window, but subsequent
> > touch begin/end events go to the correct window.

> This happens when we handling touch events and show the popup window, then
> the touch end is dispatched to the popup window (not the window which
> received the touch start event). The subsequence touch events are dispatched
> to the correct window which user touches.

If the grab is still in place when the subsequent begin/end events are
received, then do they go to the popup window, or does it depend on where the
touch begins?

> > What is the bug that this is working around?

> Yes. This is work around. I'm not sure if there are other ways that we can
> force the touch end event being dispatched to the window which received
> touch start.

I'd like to see the code that causes the bug, so that we can know whether it
is already fixed, ensure it is reported, and know exactly what behavior we
will get and what we need to work around.
Blocks: 1207700
No longer depends on: 1207700
Flags: needinfo?(marina.billes)
Priority: -- → P3
Duplicate of this bug: 1407993
I'm affected by this problem on Arch Linux, Firefox 57.
After I use long press, touch input on tab bar stops and scrolling with swipe gestures becomes slow.
See Also: → 1422517
Affected as well here. Firefox 57, Linux Mint 18.3, Cinnamon, Kernel 4.13 on a Lenovo X1 Yoga (2nd Gen)
Trying to long press actually not just results in scrolling being slow afterwards, it also starts zooming while scrolling. It is almost as if Firefox still thinks that one finger is still pressed down somewhere and thus enables pinch to zoom as it recognizes the "scrolling finger" as the second finger for the pinch gesture.

Also this results not just in scrolling being broken. I can also no longer fire another long press event and I can also no longer move the selection handles if I wanted to use long press to select text. Any input after the first long press gets converted to scroll and zoom events.
Adding Jan as he has the requested hardware.
Still happens in Firefox 60...
Still present on 62.0a1 (nightly). The patch above, in comment 23, compiled from source (after fixing a merge conflict) removes this bug.
Note: I tested on both X11 and Wayland with gnome-shell.
Karl, what do you think about landing the patch in this bug, even though there are unanswered questions about it? It seems to fix a problem that a number of users are running into.

Or do you think the risk of the patch regressing other scenarios is too high?
Flags: needinfo?(karlt)
I don't support accepting modifications just because the seem to fix one
testcase.

I can't assess the risk because I don't know how the patch is meant to work.

Some things that are wrong with the patch AFAICT:

1. sTouches has a static constructor (comment 23).

2. It contains unrelated changes (comment 23).

3. TouchInfos are not removed from sTouches when the window is destroyed
   (comment 25).

If those issues are addressed, and someone else who knows which Touch
points / sequences should be associated with which WidgetTouchEvents
can review, then I'd be very happy for them to do so.
Flags: needinfo?(karlt)
(In reply to theotherjimmy from comment #36)
> Note: I tested on both X11 and Wayland with gnome-shell.

Same here. Still happening with xorg-server 1.20.0 and Wayland 1.15.0 on GNOME 3.28.2 (GTK 3.22.30).
(In reply to Karl Tomlinson (:karlt) from comment #38)
> I don't support accepting modifications just because the seem to fix one
> testcase.

Until you have found and accepted a proper patch, is there any way to work around the bug with a pref in the meantime, without entirely disabling MOZ_USE_XPINUT_2=1?

Can the long tap delay (the time until a tap is being detected as "long") be set to an infinite (or very high) value maybe?

Or can the long tap detection be disabled entirely?

Is anything like that possible at the moment?
(In reply to N. W. from comment #41)
> Can the long tap delay (the time until a tap is being detected as "long") be
> set to an infinite (or very high) value maybe?

If you go to about:config and look for the "ui.click_hold_context_menus.delay" pref, that controls the number of milliseconds for a tap to be detected as a long tap. It defaults to 500, so if you increase it to some really large number it should effectively disable long taps from getting detected.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #42)
> If you go to about:config and look for the
> "ui.click_hold_context_menus.delay" pref

Sorry, I just realized that pref isn't there by default on about:config. However, you can add it by right-clicking and select New -> Integer. Name the pref ui.click_hold_context_menus.delay and give it a large value like 10000 (this is the number of milliseconds as I mentioned above).
(In reply to Karl Tomlinson (:karlt) from comment #38)
> 
> 2. It contains unrelated changes (comment 23).

For the record, I don't think the changes you mentioned in that comment are unrelated. RegisterTouchWindow() only gets called on windows that will have APZ enabled, which is not the case for popup windows, unless they have remote content (see [1]). And RegisterTouchWindow() was setting the flag that allowed the touch events to get handled. So if we want the popup window to handle touch events (and redirect them to the original window), then we need to make some changes around that code, which is what I believe Stone was doing in his patch. That being said, there might be a better way to structure those changes, or at least document it so it's more obvious what's going on.

Also, it occurs to me that popup windows not handling touch events with the current code (because of the lack of RegisterTouchWindow() as mentioned above) might even be the root cause of the bug here. Certainly it seems like a bug that GTK is sending the event to the popup window, but we might be able to work around with a subset of Stone's patch - one that just keeps a map of the GdkEventSequence -> nsWindow, and then at the very beginning of the OnTouchEvent function (before or inside the !mHandleTouchEvent check), we look up the window from the event and redispatch to that window.

I don't have a touch-enabled Linux device so I can't test the changes myself but if somebody is willing to push the patch over the line I can write up a patch that I think should work.

[1] https://searchfox.org/mozilla-central/rev/dc6d85680539cb7e5fe2843d38a30a0581bfefe1/widget/nsBaseWidget.cpp#906
I'm happy to test patches with my touch-enabled linux device.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #45)
> I don't have a touch-enabled Linux device so I can't test the changes myself
> but if somebody is willing to push the patch over the line I can write up a
> patch that I think should work.

I'm happy to build a patch on a touch-enabled Linux device and try to debug it if things aren't working.
I looked at the code a little more and think there might be an even easier solution. I wrote a WIP (didn't compile it) that might work, but will also provide some logging to shed more light on what's going on. It's at https://github.com/staktrace/gecko-dev/commit/14610fe8c2b9c20c17ff2a8cc122eea0b0d43ad1 - I can work with Botond today to debug this as needed.
I fixed up the patch so it compiles [1], and made a try build [2]. Botond doesn't have access to his touch-enabled laptop today, so if somebody has time, it would be useful to run the build at [3], do a long-press action that reproduces the problem, and collect the stderr output that is produced (it should have a bunch of GTKWIDGET lines in it). If the patch actually fixes the problem, that's great. The output will be helpful regardless in understanding what's going on. Thanks!

[1] https://github.com/staktrace/gecko-dev/commit/14610fe8c2b9c20c17ff2a8cc122eea0b0d43ad1
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=b8f3f4642599d5afccb901330ffb58ef2f142a6c
[3] https://queue.taskcluster.net/v1/task/NLVh8h69QGmpbCEl_wD11g/runs/0/artifacts/public/build/target.tar.bz2
Did not fix it :(

Here is the output I get: https://pastebin.com/raw/MBUkqtak
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #49)
> I fixed up the patch so it compiles [1], 
>
> [...]
>
> [1]
> https://github.com/staktrace/gecko-dev/commit/
> 14610fe8c2b9c20c17ff2a8cc122eea0b0d43ad1

For the record, this is not the version of the patch that compiles (it tries to call a non-existent nsWindow::GetType() function).
The first problem is that the touch-end event is not being delivered to the popup window to begin with. This is fixed by adding back the hunk from Stone's patch which adds the GDK_TOUCH_MASK flag to the gdk_pointer_grab() call.

With that fixed, the problem still occurs, but I now see this in the log:

...
GTKWIDGET: got touch event of type 38 with window 0x7f9cff6c9b40 and widget 0x7f9cddb16190
GTKWIDGET: target window is 0x7f9ce48d5400 of type 0 with parent (nil) and handling touch: 1
GTKWIDGET: got touch event of type 38 with window 0x7f9cff6c9b40 and widget 0x7f9cddb16190
GTKWIDGET: target window is 0x7f9ce48d5400 of type 0 with parent (nil) and handling touch: 1
GTKWIDGET: got touch event of type 38 with window 0x7f9cff6c9b40 and widget 0x7f9cddb16190
GTKWIDGET: target window is 0x7f9ce48d5400 of type 0 with parent (nil) and handling touch: 1
GTKWIDGET: got touch event of type 39 with window 0x7f9cc58b9a80 and widget 0x7f9cbfbca660
GTKWIDGET: target window is 0x7f9cbfbc9000 of type 3 with parent (nil) and handling touch: 0

So now the touch-end event makes it to the popup window, but the popup window does not have its mParent set, so the "bounce the event to the parent" codepath is not entered.

I'm also intermittently seeing this assertion [1] firing (_without_ long tapping, interestingly), but I haven't had a chance to look into that yet.

[1] https://searchfox.org/mozilla-central/rev/5a744713370ec47969595e369fd5125f123e6d24/gfx/layers/apz/src/AsyncPanZoomController.cpp#1501
(In reply to Botond Ballo [:botond] from comment #51)
> For the record, this is not the version of the patch that compiles (it tries
> to call a non-existent nsWindow::GetType() function).

Whoops, sorry. The patch I meant to link to is https://github.com/staktrace/gecko-dev/commit/611ac15d792d00baf63edc9ea14f5c76c8da6f39

(In reply to Botond Ballo [:botond] from comment #52)
> The first problem is that the touch-end event is not being delivered to the
> popup window to begin with. This is fixed by adding back the hunk from
> Stone's patch which adds the GDK_TOUCH_MASK flag to the gdk_pointer_grab()
> call.

So this is interesting, because the log from comment 50 does show the touch-end event being delivered, and to the same window that got the rest of the events.

> So now the touch-end event makes it to the popup window, but the popup
> window does not have its mParent set, so the "bounce the event to the
> parent" codepath is not entered.

Ah, I guess this makes sense considering [1]. I tried seeing what the GetContainerWindow() is for popup windows and that's not the desired window either (it returns the popup window back). It looks like we need to go through the toplevel pointer. I'm updating the patch to do that now.

[1] https://searchfox.org/mozilla-central/rev/5a744713370ec47969595e369fd5125f123e6d24/widget/gtk/nsWindow.cpp#3589-3595
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #53)
> (In reply to Botond Ballo [:botond] from comment #52)
> > The first problem is that the touch-end event is not being delivered to the
> > popup window to begin with. This is fixed by adding back the hunk from
> > Stone's patch which adds the GDK_TOUCH_MASK flag to the gdk_pointer_grab()
> > call.
> 
> So this is interesting, because the log from comment 50 does show the
> touch-end event being delivered, and to the same window that got the rest of
> the events.

The log from comment 50 contains 4 touch-start events (type=37). All but the first are matched by a touch-end (type=39), but for the first one the touch-end is missing entirely.

I assume that the first touch sequence was the long tap, and the subsequent ones attempts to touch-scroll after the long-pan.
(In reply to Botond Ballo [:botond] from comment #55)
> The log from comment 50 contains 4 touch-start events (type=37). All but the
> first are matched by a touch-end (type=39), but for the first one the
> touch-end is missing entirely.
> 
> I assume that the first touch sequence was the long tap, and the subsequent
> ones attempts to touch-scroll after the long-pan.

Ah, that makes sense. I misread the log.
https://queue.taskcluster.net/v1/task/KwxupRxnQi22UsVxbD0DiQ/runs/0/artifacts/public/build/target.tar.bz2 is the build for the try push in comment 54. If somebody wants to try that and report the results/stderr that would be great.
Still the same behaviour. New log at https://pastebin.com/raw/9BmPjgcs.

I don't know if it's linked to the problem, but when I long-press on a link both the text selector and the contextual menu appear.
The patch in comment 54 doesn't work because it is still missing the GDK_TOUCH_MASK hunk :)

Adding that in, things seem to be working for me locally!

Here's a version of the patch with the hunk added: https://hg.mozilla.org/try/rev/2a3efa0d2a82cde806c8508ab8af925b02bcc9aa
And here's a Try push for it: https://treeherder.mozilla.org/#/jobs?repo=try&revision=fb89cb2e8a7115c9956ba912a5617d7f8176e751
Yay! I've cleaned up the patch and will post it for review.
Comment on attachment 8981445 [details]
Bug 1321069 - Redirect the end event of a long-tap sequence back to the content window.

https://reviewboard.mozilla.org/r/247556/#review253896

This is a nice and simple, thanks.

::: widget/gtk/nsWindow.h:238
(Diff revision 1)
> +    nsWindow*          GetTopLevelWindowIfPopup();
> +    bool               IsHandlingTouchSequence(GdkEventSequence* aSequence);

Please move these to among private methods.
The first method at least is purpose-specific.  It doesn't handle all the possible scenarios for destroyed windows, but should be fine when called from the event handler.  It's also helpful to know that the mTouches logic is used only internally. 

(No need to add another GTK_CHECK_VERSION(3,4,0) conditional because Gecko will not compile against earlier versions anyway.)

::: widget/gtk/nsWindow.cpp:3506
(Diff revision 1)
>                             aSelectionData, aInfo, aTime);
>  }
>  
>  #if GTK_CHECK_VERSION(3,4,0)
> +nsWindow*
> +nsWindow::GetTopLevelWindowIfPopup()

Please call this "GetTransientForWindowIfPopup".

Popup windows are toplevel and even have mIsTopLevel = true.

::: widget/gtk/nsWindow.cpp:3511
(Diff revision 1)
> +nsWindow::GetTopLevelWindowIfPopup()
> +{
> +    if (mWindowType != eWindowType_popup) {
> +        return nullptr;
> +    }
> +    if (GtkWindow* toplevel = gtk_window_get_transient_for(GTK_WINDOW(mShell))) {

Please break after = to stay within 80 columns.

::: widget/gtk/nsWindow.cpp:3512
(Diff revision 1)
> +{
> +    if (mWindowType != eWindowType_popup) {
> +        return nullptr;
> +    }
> +    if (GtkWindow* toplevel = gtk_window_get_transient_for(GTK_WINDOW(mShell))) {
> +        return get_window_for_gdk_window(gtk_widget_get_window(GTK_WIDGET(toplevel)));

Please use get_window_for_gtk_widget().  It is more direct and should give the same behavior due to
https://searchfox.org/mozilla-central/rev/5a744713370ec47969595e369fd5125f123e6d24/widget/gtk/nsWindow.cpp#3954

::: widget/gtk/nsWindow.cpp:3529
(Diff revision 1)
> +        // ensure the touch event gets sent to the original window instead. We
> +        // keep the checks here very conservative so that only redirect events
> +        // in this specific scenario.

The grammar in "so that only redirect events" doesn't seem quite right.

Perhaps "so as to only redirect events" (assuming you don't mind splitting an infinitive) or "so that events are redirected only".

::: widget/gtk/nsWindow.cpp:3533
(Diff revision 1)
> +        // and touch events got diverted to that window within a touch sequence,
> +        // ensure the touch event gets sent to the original window instead. We
> +        // keep the checks here very conservative so that only redirect events
> +        // in this specific scenario.
> +        nsWindow* targetWindow = GetTopLevelWindowIfPopup();
> +        if (targetWindow && targetWindow->IsHandlingTouchSequence(aEvent->sequence)) {

Please break after && to stay within 80 columns.

::: widget/gtk/nsWindow.cpp:4880
(Diff revision 1)
>                                (GdkEventMask)(GDK_BUTTON_PRESS_MASK |
>                                               GDK_BUTTON_RELEASE_MASK |
>                                               GDK_ENTER_NOTIFY_MASK |
>                                               GDK_LEAVE_NOTIFY_MASK |
> -                                             GDK_POINTER_MOTION_MASK),
> +                                             GDK_POINTER_MOTION_MASK |
> +                                             GDK_TOUCH_MASK),

Please add a comment to indicate that GDK_TOUCH_MASK is working around a GDK/X11 bug that causes touch events that would normally be received by this client on other windows to be discarded during the grab.
Attachment #8981445 - Flags: review?(karlt) → review+
(In reply to Karl Tomlinson (:karlt) from comment #63)
> > +    nsWindow*          GetTopLevelWindowIfPopup();
> > +    bool               IsHandlingTouchSequence(GdkEventSequence* aSequence);
> 
> Please move these to among private methods.
> The first method at least is purpose-specific.  It doesn't handle all the
> possible scenarios for destroyed windows, but should be fine when called
> from the event handler.  It's also helpful to know that the mTouches logic
> is used only internally. 
> 
> (No need to add another GTK_CHECK_VERSION(3,4,0) conditional because Gecko
> will not compile against earlier versions anyway.)

Done

> Please call this "GetTransientForWindowIfPopup".

Done

> > +    if (GtkWindow* toplevel = gtk_window_get_transient_for(GTK_WINDOW(mShell))) {
> 
> Please break after = to stay within 80 columns.

I moved the variable decl outside the if condition instead, I felt that was more readable than breaking inside the if condition. That way is also less than 80 columns.

> ::: widget/gtk/nsWindow.cpp:3512
> Please use get_window_for_gtk_widget().  It is more direct and should give
> the same behavior due to
> https://searchfox.org/mozilla-central/rev/
> 5a744713370ec47969595e369fd5125f123e6d24/widget/gtk/nsWindow.cpp#3954

Done

> ::: widget/gtk/nsWindow.cpp:3529
> (Diff revision 1)
> > +        // ensure the touch event gets sent to the original window instead. We
> > +        // keep the checks here very conservative so that only redirect events
> > +        // in this specific scenario.
> 
> The grammar in "so that only redirect events" doesn't seem quite right.

Whoops, I meant to say "so that _we_ only redirect events". Fixed

> ::: widget/gtk/nsWindow.cpp:3533
> > +        if (targetWindow && targetWindow->IsHandlingTouchSequence(aEvent->sequence)) {
> 
> Please break after && to stay within 80 columns.

Done

> ::: widget/gtk/nsWindow.cpp:4880

> Please add a comment to indicate that GDK_TOUCH_MASK is working around a
> GDK/X11 bug that causes touch events that would normally be received by this
> client on other windows to be discarded during the grab.

Done

I'll upload the new patch shortly and make another try build for somebody to test it, just to make sure I didn't screw anything up in the cleanup.
Assignee: stone123456 → bugmail
Try build is at https://queue.taskcluster.net/v1/task/DUIS0EKPSvmkGY14nWE0DQ/runs/0/artifacts/public/build/target.tar.bz2 - if somebody could test that to make sure it works that would be great. Thanks!
That try build worked for me w/ Wayland.
And it worked for me on X11. Party! \o/
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/27a160b7025f
Redirect the end event of a long-tap sequence back to the content window. r=karlt
https://hg.mozilla.org/mozilla-central/rev/27a160b7025f
Status: ASSIGNED → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Any chance this could be backported to ESR 60?
Not really, only security or "very important" fixes get backported to ESR releases. This is a fix for a feature that's not even enabled by default (i.e. you need to set an env variable in order to enable it).
On the other hand, if you're running Linux on a tablet, your main/only method of interaction with the browser is pretty broken without this fix, and many Linux distros follow the ESR channel. Therefore, I would be in favour of at least requesting approval for a backport.

(I'm happy to make the request, and if approved, prepare the backport patch.)
Comment on attachment 8981445 [details]
Bug 1321069 - Redirect the end event of a long-tap sequence back to the content window.

I guess it can't hurt to ask.

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: if a user is running on linux with MOZ_USE_XINPUT2=1 to use touch events, then doing a long-tap gesture usually breaks touch events until the browser is restarted. this can result in a pretty bad user experience for a configuration that is non-default but that a lot of people are using for touch support.
Fix Landed on Version: 62
Risk to taking this patch (and alternatives if risky): patch is not particularly high-risk, it's small and was intentionally made as conservative as possible. patch should apply cleanly to esr60, since the underlying code hasn't changed since then. we've been living with this behaviour in releases for a while now so if it's considered too risky we could just not take the patch.
String or UUID changes made by this patch: none

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.

Approval Request Comment
[Feature/Bug causing the regression]: touch event support on linux (requires MOZ_USE_XINPUT2=1 environment variable)
[User impact if declined]: see ESR comment above
[Is this code covered by automated tests?]: not this exact codepath, but touch events generally are, yes
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: sure. on a touch-enabled linux laptop, start firefox with MOZ_USE_XINPUT2=1 and long-press somewhere to bring up a context menu. After this patch applied touch events should continue working
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: not particularly
[Why is the change risky/not risky?]: see ESR comment above
[String changes made/needed]: none
Attachment #8981445 - Flags: approval-mozilla-release?
Attachment #8981445 - Flags: approval-mozilla-esr60?
Attachment #8981445 - Flags: approval-mozilla-beta?
Comment on attachment 8981445 [details]
Bug 1321069 - Redirect the end event of a long-tap sequence back to the content window.

Sounds like a no-go for release.
Attachment #8981445 - Flags: approval-mozilla-release? → approval-mozilla-release-
Duplicate of this bug: 1422517
I would love for someone to verify this on Nightly first. Especially for a late-in-cycle fix for an off-by-default feature.
(In reply to Ryan VanderMeulen [:RyanVM] from comment #78)
> I would love for someone to verify this on Nightly first.

We've had two users confirm that a Try build of the final patch fixes the bug for them (comment 66, comment 67). Is that sufficient, or should we ask them to re-test with the latest Nightly build?

(I'd test it myself, but I won't be near my touchscreen laptop until I'm back from the All Hands.)
Comment on attachment 8981445 [details]
Bug 1321069 - Redirect the end event of a long-tap sequence back to the content window.

Verified by a couple users in the field via Try builds. I discussed this with Kats on IRC and got clarification that the code in this patch is only invoked on touch events from GTK, so there's minimal risk to users outside of the MOZ_USE_XINPUT2=1 condition. Approved for 61.0b12, but punting on the ESR60 decision for Julien to decide.
Attachment #8981445 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
I can confirm that the patch is working on my touch-enabled Linux tablet (Chuwi Hi13, Fedora 28, Firefox 62 Nightly) and Firefox touch is finally usable. Thanks to everyone.
Should the GDK_TOUCH_MASK bit in GrabPointer be behind a GTK_CHECK_VERSION like the rest of the touch stuff seems to be?  Or does that not matter because gtk 3.4.0 is required through old-configure.in anyway?
Comment on attachment 8981445 [details]
Bug 1321069 - Redirect the end event of a long-tap sequence back to the content window.

fix for touchscreens on gtk, approved for 60.1esr
Attachment #8981445 - Flags: approval-mozilla-esr60? → approval-mozilla-esr60+
(In reply to Julien Cristau [:jcristau] from comment #83)
> Should the GDK_TOUCH_MASK bit in GrabPointer be behind a GTK_CHECK_VERSION
> like the rest of the touch stuff seems to be?  Or does that not matter
> because gtk 3.4.0 is required through old-configure.in anyway?

Yeah I think it doesn't matter. In comment #63 :karlt said:

> (No need to add another GTK_CHECK_VERSION(3,4,0) conditional because Gecko
> will not compile against earlier versions anyway.)
I've installed Ubuntu 16.04 on a VM on a Surface Pro, but unfortunately I can't reproduce the original issue using the Nightly 61 from 04.01.2018, which should exhibit the original issue. This leads me to the conclusion that using a VM is not fit to test and verify this fix and AFAIK, neither Engineering nor Release  QA have any touch device with any Linux installed.
Luciano, can you help us verify this fix on:
-Firefox Esr60
-Firefox61
-Firefox62
?

Thanks in advance.
Flags: needinfo?(luciano)
Hi, apologies but I'm struggling to determine what version of firefox this has been released in. None of the release notes on mozilla.org mention this bug for 60.x releases. The commit shows 'milestone 60.0a1' - what does that mean? Also, how can I determine from my installed firefox what version (ideally commit) it's been built off?

Also (sorry for the stupid question) - is 60 ESR not the same as 60.0? Is there a separate download link for this?
It'll ship in Firefox 61 & ESR 60.1 next week. You can test it now with any Beta/DevEdition/Nightly builds.
If you don't mind I will confirm when it's released through the normal channels - the nightlies are not in my package manager. Thanks for your understanding.
(In reply to luciano from comment #91)
> If you don't mind I will confirm when it's released through the normal
> channels - the nightlies are not in my package manager. Thanks for your
> understanding.

Luciano, it would be greatly appreciated if you could help verifying the fixed versions as soon as possible, preferably before they actually get to go live. There is no need to use the package manager to test the targeted versions, instructions follow:

I. Archives containing the fixed versions:

- Nightly 62 2018-06-21 : http://ftp.mozilla.org/pub/firefox/nightly/2018/06/2018-06-21-01-36-59-mozilla-central/firefox-62.0a1.en-US.linux-x86_64.tar.bz2
- Firefox 61 RC3 : http://ftp.mozilla.org/pub/firefox/candidates/61.0-candidates/build3/linux-x86_64/en-US/firefox-61.0.tar.bz2.asc
- Firefox 60.0.1ESR RC2 : http://ftp.mozilla.org/pub/firefox/candidates/60.1.0esr-candidates/build2/linux-x86_64/en-US/firefox-60.1.0esr.tar.bz2

II. 
- extract the archives in a test_location
- start the builds with a test profile (use -p argument to start the profile manager, if you need to create one)
- after testing, you can obviously remove the test versions.

Thanks again for the help verifying this.
Adrian, I have just successfully tested all three of the versions you linked on Linux Mint 18.3 using a Lenovo X1 Yoga in tablet mode. Everything works fine.(In reply to Adrian Florinescu [:adrian_sv] from comment #92)
Many thanks evilphish@phishtank.de!
Based on comment 93, updating status flags.
Flags: qe-verify+
Flags: needinfo?(luciano)
Hi Adrian/evilphish. Sorry for the delay. For what it's worth, I've also tested these on a Toshiba Radius 12 and can confirm that all three versions appear to work correctly.

Now we just need to get https://bugzilla.mozilla.org/show_bug.cgi?id=789038 (virtual keyboard not working) sorted and we'll have decent tablet support!

Thank you all for your efforts, appreciate your time and dedication!
Not to mention https://bugzilla.mozilla.org/show_bug.cgi?id=1231067
This actually bugs me quite a lot. What is your experience there luciano?
Component: Event Handling → User events and focus handling
You need to log in before you can comment on or make changes to this bug.