Closed Bug 913942 Opened 8 years ago Closed 6 years ago

Get the first touchmove event after a touchstart earlier when moving finger slowly

Categories

(Core :: DOM: UI Events & Focus Handling, defect, P2)

22 Branch
ARM
Gonk (Firefox OS)
defect

Tracking

()

RESOLVED WONTFIX
blocking-b2g -

People

(Reporter: julienw, Unassigned)

References

()

Details

(Keywords: perf, Whiteboard: [c=effect p= s= u=])

On unagi, when we're swiping the homescreen, there is consistently 140ms between the touchstart event and the first touchmove event. This makes the panel swiping not feel responsive, because we can't start swiping before the first touchmove event.

Therefore I'm opening this bug to investigate whether we could get the first "touchmove" event earlier.
Olli, would you know something about this ?
Flags: needinfo?(bugs)
I don't really know what causes that 140ms delay. Sounds like something very b2g specific.
Flags: needinfo?(bugs)
Ok, now I understand better.

I made this app on http://everlong.org/mozilla/touchmove/

You can install it to test the path used for an application, but the behaviour is the same for me whether you run in the browser or as an application.

Basically, if you slide slowly, you'll get the first touchmove event later. Moving very slowly, I could get more than 1 second between the touchstart and the first touchmove.
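For reference, the quantity being discussed can be written down as a small function. This is only a sketch of the measurement, not the source of the test page: the `(event type, timestamp in ms)` record shape and the function name are assumptions made for illustration.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical record shape: (event type, timestamp in milliseconds).
Event = Tuple[str, float]


def first_move_latency(events: Iterable[Event]) -> Optional[float]:
    """Return the ms elapsed between a touchstart and the first touchmove after it.

    Returns None if the touch ends (or the stream ends) before any touchmove.
    """
    start = None
    for kind, timestamp in events:
        if kind == "touchstart":
            start = timestamp
        elif kind == "touchmove" and start is not None:
            return timestamp - start
        elif kind in ("touchend", "touchcancel"):
            start = None
    return None
```

Feeding it a touchstart at 0 ms followed by a touchmove at 140 ms gives 140, which is the delay reported for the unagi homescreen at the top of this bug.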

This is consistent on an Unagi and on Firefox for Android (on a Galaxy Nexus with Android 4.3).

So, what are the reasons for not firing the first touchmove earlier, in a consistent time ? (firing the other ones depending on the speed of the finger is fine for me).

All the reasons I think of (like: not screwing the longpress gesture) are not convincing.
Flags: needinfo?(bugs)
I really don't know about that stuff. I think this behavior comes from http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/AsyncPanZoomController.h or b2g shell code. Both areas I know next to nothing about.
Component: DOM: Events → Event Handling
Flags: needinfo?(bugs)
Oki... Could you please suggest someone to needinfo ?
This may be related to some patches we have in place to not send touchmove events until the person's finger has moved some minimum distance. We had to hack something like that in on Android to make click detection work in some major web pages that are using touchmove as a way of detecting the difference between taps and pans. B2G may have picked up the same code when they translated our APZC?

I have a WIP to remove this (on Android) when the user is using a pen in bug 904245, but it doesn't sound like it would help here.
Flags: needinfo?(bugmail.mozilla)
Ok, I just tried on a central B2G (on Galaxy nexus and Keon), and we actually get the first touchmove very quickly. I'll check tomorrow with a Firefox Nightly on Android, to see if this is consistent.
Yeah on fennec the first touchmove is delayed by the code at [1] and is done on purpose for compatibility with some websites. I don't recall seeing anything similar in the B2G pan/zoom code.

[1] http://hg.mozilla.org/mozilla-central/file/b9029b1de410/mobile/android/base/gfx/LayerView.java#l154
Flags: needinfo?(bugmail.mozilla)
I think b2g does something similar here:

http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/AsyncPanZoomController.cpp#441

(in fact, we should bring those prefs into line with Fennec at some point).
No, that code determines when the content actually starts panning. The touch events should still be getting delivered even if the panning is restricted. In other words, the code you pointed to in comment 9 is equivalent to http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/gfx/JavaPanZoomController.java#475
Ahh. thanks :)
Re: comment 7

On a leo device with master, I get the same behavior as in my unagi 1.1.

So it's probably device specific. Maybe Michael Wu would know something in the driver section ?
Flags: needinfo?(mwu)
I don't think I have an answer for what you're asking, though I'm not entirely sure what you're asking.
Flags: needinfo?(mwu)
Julien: you should add logging at http://mxr.mozilla.org/mozilla-central/source/widget/gonk/nsAppShell.cpp#173 to see the latency between the touch down and the first touch move. I don't believe that there are delays later in the pipeline so it's likely you will see the same 140ms delay in this code. That would mean that the hardware is just generating events with large delays. I don't know if there's anything we can do about that.
Keywords: perf
blocking-b2g: koi? → koi+
Priority: -- → P1
Whiteboard: [c=effect p= s= u=1.2]
Just to report that I see the same on Firefox for Android in my Galaxy Nexus with Android 4.3.

However as I said in comment 7, this was quick in Firefox OS on another Galaxy Nexus (Andrew Sutherland's device IIRC). So not the same device though...

So it could be either a hardware, or driver issue. I don't have the skills to investigate though.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #14)
> Julien: you should add logging at
> http://mxr.mozilla.org/mozilla-central/source/widget/gonk/nsAppShell.cpp#173
> to see the latency between the touch down and the first touch move. I don't
> believe that there are delays later in the pipeline so it's likely you will
> see the same 140ms delay in this code. That would mean that the hardware is
> just generating events with large delays. I don't know if there's anything
> we can do about that.

On a keon I don't see a big latency here (I get ~20ms consistently).
Yep, that's what I said in comment 7.

So to summarize:
* keon, Galaxy Nexus running Firefox OS, Nexus S running Firefox OS -> small latency
* unagi, buri, leo -> big latency when we slide slowly

I can see a pattern here: when we don't build the kernel ourselves, there is a big latency, but when we build the kernel, there is a small latency.
I just tested on a leo, and also get numbers in the 20ms to 40ms range fwiw.
Using the page in comment 3, I see numbers in the ~15ms range on the peak and 40-60ms range on the hamachi. These don't seem excessive to me so I don't think the devices I have can reproduce this problem.

If you do have a device where the page in comment 3 reports large latencies but the code in comment 14 has small latencies then the next places to "bisect" where the latency is showing up would be in TabParent::SendRealTouchEvent and TabChild::RecvRealTouchEvent.
I see 20ms to 40ms response too on my unagi. Obviously if I hold my finger before starting to move, I get longer delays, but that's expected.
That's the point of this bug imho: when moving slowly (not necessarily holding the finger first), the first touchmove is slow.

When going quickly, everything is always fine on all devices.
When going slowly, only some are behaving in a correct way.

This is not (always) a hardware issue, because the Galaxy Nexus exposes both behaviors, depending on whether we use Firefox on Android (NOK) or Firefox OS (OK).

I'm not saying we should always have a touchmove 30ms after the first touchstart if the finger is moving (although I think it's better). Firefox OS on Nexus S reacts at about 100 or 150ms consistently, whether we go quick or slow, this is imo good enough.

But now I can get 3 seconds in this situation on the devices with the broken behaviour... This is a lot and this is too much.
I think I misunderstood this bug, then. IMO if you move your finger slowly it's *required* that the first touchmove be delayed otherwise it destroys gestures.

(In reply to Julien Wajsberg [:julienw] from comment #3)
> All the reasons I think of (like: not screwing the longpress gesture) are
> not convincing.

Why is this not convincing?
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #22)
> I think I misunderstood this bug, then. IMO if you move your finger slowly
> it's *required* that the first touchmove be delayed otherwise it destroys
> gestures.

Does it destroy gestures on keon where we get the first touchmove earlier ? I don't think so.

> 
> (In reply to Julien Wajsberg [:julienw] from comment #3)
> > All the reasons I think of (like: not screwing the longpress gesture) are
> > not convincing.
> 
> Why is this not convincing?

See [1] and [2], we have a 25px threshold around the touchstart coordinates. That means that as long as you don't move too far away, we'll still have the longpress aka hold gesture, even if we get touchmove events.

[1] https://mxr.mozilla.org/mozilla-central/source/b2g/app/b2g.js#246
[2] http://mxr.mozilla.org/mozilla-central/source/content/events/src/nsEventStateManager.cpp#2100
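The argument above can be illustrated with a short sketch. Only the 25px figure comes from the comment; the function name is hypothetical, and treating the threshold as a per-axis box (rather than a radius) is an assumption, not a transcription of the Gecko code linked in [2].

```python
from typing import Iterable, Tuple

Point = Tuple[float, float]

# 25 px is the threshold quoted in the comment; everything else is illustrative.
HOLD_THRESHOLD_PX = 25


def still_a_hold(start: Point, moves: Iterable[Point],
                 threshold: float = HOLD_THRESHOLD_PX) -> bool:
    """True if every touchmove stays within `threshold` px of the touchstart.

    Assumption: the threshold is applied to each axis independently.
    """
    sx, sy = start
    return all(abs(x - sx) <= threshold and abs(y - sy) <= threshold
               for x, y in moves)
```

Under this model, delivering early touchmove events to content does not by itself cancel the hold; only a move that leaves the threshold box does.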
Content can implement gestures too. There are quite a few websites out there that listen for touchstart/touchmove/touchend events and basically consume them all and implement their own gesture detection. What you're suggesting would probably break those use cases.
I've checked hammer.js too (a famous library for handling gestures):

see https://github.com/EightMedia/hammer.js/blob/master/src/gestures.js#L115-L153

They have a default threshold of "1" (which supports what you're saying), but a default timeout of 500ms, which means we could still send our first touchmove event earlier.

I've checked other libs too:
QuoJS: https://github.com/soyjavi/QuoJS/blob/master/src/quo.gestures.coffee
-> threshold of 30px, timeout of 650ms

mootools mobile: timeout of 750ms, no threshold
jquery touchy: I don't completely understand the code, but it looks like it's using a timeout of 100ms

Please direct me to websites that you think might break and I can test with the keon.
I don't have any specific websites off the top of my head, unfortunately. But thanks for looking at all of those frameworks!

I'd like to take a couple of steps back to properly understand what's causing this bug, since for most of the bug I was under the mis-impression that this was with panning fast. I'm able to reproduce the problem with panning slowly on your test page on the hamachi device I have, so I will debug and see where the delay is coming from.
So in a nutshell this looks like a hardware or driver level issue. When you put your finger down, we get a touchdown followed by a stream of touchmove events from the hardware. All of these events have the same coordinates because your finger isn't moving, and so the code in nsPresShell.cpp filters out the touchmoves and doesn't fire them to content (this I think is correct). If you move your finger very slowly, the hardware doesn't seem to register the change until some time later. So for example, I see stuff like this:

10-18 14:03:11.210 I/Gecko   ( 2790): staktrace: coords 144 282
10-18 14:03:11.230 I/Gecko   ( 2790): staktrace: touch move at 585645264
10-18 14:03:11.230 I/Gecko   ( 2790): staktrace: coords 144 282
10-18 14:03:11.250 I/Gecko   ( 2790): staktrace: touch move at 585645280
10-18 14:03:11.250 I/Gecko   ( 2790): staktrace: coords 144 282
10-18 14:03:11.260 I/Gecko   ( 2790): staktrace: touch move at 585645296
10-18 14:03:11.260 I/Gecko   ( 2790): staktrace: coords 144 282
10-18 14:03:11.280 I/Gecko   ( 2790): staktrace: touch move at 585645313
10-18 14:03:11.280 I/Gecko   ( 2790): staktrace: coords 143 289
10-18 14:03:11.290 I/Gecko   ( 2790): staktrace: touch move at 585645328
10-18 14:03:11.290 I/Gecko   ( 2790): staktrace: coords 143 289
10-18 14:03:11.310 I/Gecko   ( 2790): staktrace: touch move at 585645344
10-18 14:03:11.310 I/Gecko   ( 2790): staktrace: coords 143 289
10-18 14:03:11.330 I/Gecko   ( 2790): staktrace: touch move at 585645360
10-18 14:03:11.330 I/Gecko   ( 2790): staktrace: coords 143 289
10-18 14:03:11.350 I/Gecko   ( 2790): staktrace: touch move at 585645376
10-18 14:03:11.350 I/Gecko   ( 2790): staktrace: coords 143 289

Note how we're constantly getting touch move events but the coordinates jump by 7 pixels in the middle rather than getting 7 1-pixel jumps like you would expect. Without hardware cooperation I don't see what else we can do here.

(note: the above logging statements are from printf_stderr lines I added to widget/gonk/nsAppShell.cpp in sendTouchEvent and addDOMTouch.)
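The jump can also be read off the log mechanically. The following is a minimal sketch that parses the first lines of the excerpt above and reports when the coordinates first change and by how much. The regular expression and function names are assumptions written for this excerpt's layout, and since the excerpt does not include the touch down itself, the elapsed time covers only the lines shown, not the full touchstart-to-touchmove latency.

```python
import re

# First lines of the excerpt above, copied verbatim.
LOG = """\
10-18 14:03:11.210 I/Gecko   ( 2790): staktrace: coords 144 282
10-18 14:03:11.230 I/Gecko   ( 2790): staktrace: touch move at 585645264
10-18 14:03:11.230 I/Gecko   ( 2790): staktrace: coords 144 282
10-18 14:03:11.250 I/Gecko   ( 2790): staktrace: touch move at 585645280
10-18 14:03:11.250 I/Gecko   ( 2790): staktrace: coords 144 282
10-18 14:03:11.260 I/Gecko   ( 2790): staktrace: touch move at 585645296
10-18 14:03:11.260 I/Gecko   ( 2790): staktrace: coords 144 282
10-18 14:03:11.280 I/Gecko   ( 2790): staktrace: touch move at 585645313
10-18 14:03:11.280 I/Gecko   ( 2790): staktrace: coords 143 289
"""

# Matches only the "coords X Y" lines and captures the logcat wall-clock time.
COORDS = re.compile(r"(\d\d):(\d\d):(\d\d)\.(\d{3}) .*staktrace: coords (\d+) (\d+)")


def coord_samples(log):
    """Return a list of (time in ms since midnight, x, y) for each coords line."""
    samples = []
    for line in log.splitlines():
        match = COORDS.search(line)
        if match:
            h, m, s, ms, x, y = map(int, match.groups())
            samples.append((((h * 60 + m) * 60 + s) * 1000 + ms, x, y))
    return samples


def first_change(samples):
    """Return (elapsed ms, dx, dy) for the first sample whose coordinates differ
    from the first sample, or None if the coordinates never change."""
    if not samples:
        return None
    t0, x0, y0 = samples[0]
    for t, x, y in samples[1:]:
        if (x, y) != (x0, y0):
            return (t - t0, x - x0, y - y0)
    return None


print(first_change(coord_samples(LOG)))  # (70, -1, 7)
```

On these lines the coordinates stay at 144 282 for four samples and then change by 7 px in y in a single step, matching the description above.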
I quite agree this is probably a driver issue (For example, in Galaxy Nexus, the behavior is different in Firefox/Android and on Browser/Firefox OS).

I really don't know how to move forward, or even whether we want to change this (I do! ;) ).
Not a performance regression or performance issue in a new feature. We're only blocking on critical issues at this point.
blocking-b2g: koi+ → -
Whiteboard: [c=effect p= s= u=1.2] → [c=effect p= s= u=]
Assignee: nobody → dhuseby
Whiteboard: [c=effect p= s= u=] → [c=effect p=2 s= u=]
Assignee: dhuseby → nobody
Priority: P1 → P2
See Also: → 998379
Whiteboard: [c=effect p=2 s= u=] → [c=effect p= s= u=]
See Also: → input-thread
Updating title per comment 21. Some discussion with mwu on IRC today made us hypothesize that this may be due to a "dead zone" implementation in the driver, where touch moves within a certain radius of the touchstart are ignored in order to more clearly distinguish pan gestures from non-pan gestures. I think such a deadzone implementation makes sense to have, but it should live in the APZ code rather than in the driver code. (In fact we already have a deadzone implementation in the APZ, and so we should just combine the two).
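As an illustration of the "dead zone" idea hypothesized above, here is a minimal sketch. It is not the driver's code nor the existing APZ implementation: the function name, the use of a circular radius, and the choice to stop filtering once the zone has been left are all assumptions.

```python
import math
from typing import Iterable, Iterator, Tuple

Point = Tuple[float, float]


def apply_dead_zone(start: Point, moves: Iterable[Point],
                    radius: float) -> Iterator[Point]:
    """Yield touchmove points, dropping those within `radius` px of the
    touchstart until the finger first leaves that zone.

    Once the zone has been exited, every later move is passed through,
    even if the finger comes back near the start point.
    """
    sx, sy = start
    escaped = False
    for x, y in moves:
        if not escaped and math.hypot(x - sx, y - sy) <= radius:
            continue  # still inside the dead zone: suppress this move
        escaped = True
        yield (x, y)
```

With a hypothetical radius of 5 px and a touchstart at 144 282, slow moves to 144 283 and 144 285 are swallowed and the first move delivered is 143 289, which would look like the multi-pixel jump seen in the log from the earlier comment.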
Summary: Get the first touchmove event after a touchstart earlier → Get the first touchmove event after a touchstart earlier when moving finger slowly
This is basically a WONTFIX. I am planning to implement a deadzone for touchmove events (similar to what other browsers have) in bug 1141127. So now touchmove events will actually be more delayed rather than less. However if there are specific user-visible latency issues caused by this we should file bugs for those issues and find other ways to fix them.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
Component: Event Handling → User events and focus handling