Closed Bug 946164 Opened 7 years ago Closed 7 years ago

touch events disappear

Categories

(Core :: DOM: Events, defect)

28 Branch
ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla29
blocking-b2g 1.3+
Tracking Status
firefox27 --- wontfix
firefox28 --- wontfix
firefox29 --- fixed
b2g-v1.3 --- fixed
b2g-v1.4 --- fixed

People

(Reporter: viralwang, Assigned: smaug)

References

Details

Reproduce steps:

1. Launch dialer
2. make an incoming call
3. touch the incoming call attention screen with two fingers and hold fingers on screen
4. hang up at the remote calling party
5. hold fingers from step 3 until the attention screen disappears

From here, the dialer's number pad doesn't work, call history scrolling doesn't work, and subsequent incoming attention screens don't receive touches; however, all other clicks still work.

Recoverable by performing any multi-touch gesture in the dialer app, or by killing and relaunching the dialer app.
Hi Fabrice,

We found that TabChild (the dialer in this case) receives touch events, but Gaia does not.
The dialer can still get tap events, but no touch events after that.
Another two-finger touch in the dialer recovers from this state.

Do you know an expert who can provide more hints on this part?
Thanks.
Flags: needinfo?(fabrice.desre)
Fixing needinfo to @mozilla.com address.

What FxOS versions is this reproducible on?
Flags: needinfo?(fabrice.desre) → needinfo?(fabrice)
Keywords: qawanted
I don't know this part of the code base, sorry.
Flags: needinfo?(fabrice)
(In reply to Jason Smith [:jsmith] from comment #2)
> Fixing needinfo to @mozilla.com address.
> 
> What FxOS versions is this reproducible on?

I can repro this on unagi + master, and it reproduces 100% of the time.
Hi Wesley,

I found you via bug 790454.
I need some input about touch events and iframes, since we hit a problem with multi-touch.
Do you know who the proper owner is who could provide some advice?
Thank you!
Flags: needinfo?(wjohnston)
I can answer questions, or smaug.
Flags: needinfo?(wjohnston)
QA Contact: mvaughan
This issue DOES reproduce on the 12/04 1.1 & 12/05 1.2 builds on a Buri device for me. However this issue does not seem to reproduce on the Leo 1.1 build and the Buri 1.3 build, both from 12/05.

- Buri 1.1 Build -
Environmental Variables:
Device: Buri v1.1 COM RIL
BuildID: 20131204041429
Gaia: 6ff3a607f873320d00cb036fa76117f6fadd010f
Gecko: c714e1bd607f
Version: 18.0
Firmware Version: v1.2_20131115
RIL Version: 01.01.00.019.28

- Buri 1.2 Build -
Environmental Variables:
Device: Buri v1.2 COM RIL
BuildID: 20131205004003
Gaia: 0659f16b9790b1cf9eba4d80743fcc774d2ffe3a
Gecko: af2c7ebb5967
Version: 26.0
Firmware Version: v1.2_20131115
RIL Version: 01.02.00.019.102
Keywords: qawanted
See Also: → 912867
Nominating this for 1.3, as this could leave users unable to answer incoming calls; step 3 of the STR in comment 0 could be triggered by items in a pocket or on a lap.

Comment 4 suggests still reproducible on master.
blocking-b2g: --- → 1.3?
Hi Wesley,

The bug happens when there are 2 iframes in one app (dialer):
we touch one iframe (the incoming call screen) with 2 fingers, and that iframe is killed while the fingers are down.
After that, we release the 2 fingers, but we can no longer control the other iframe in the dialer with one finger!
Touch events work fine in all apps except the dialer.

I'm guessing that when the iframe is killed while fingers are pressed, the NS_TOUCH_END event is lost, since the target of the event is gone (TabParent doesn't send the touch event, and only mouse events work).
When I then touch another iframe, the touch is treated as multi-touch in this content process and does not work correctly.

I have a workaround in bug 912867, but I don't think it's a good solution since it doesn't fix the problem very well.
https://bug912867.bugzilla.mozilla.org/attachment.cgi?id=8342893

I think we should modify the NS_TOUCH_END handling for the case where the target iframe is killed, but I'm not sure that's the best approach. Could you please provide some advice on what to do next?
Thank you so much :)
Flags: needinfo?(wjohnston)
I found that IsRemoteTarget(target) is false in nsEventStateManager since the target iframe is gone.
So the touch leave event is missing on the content side.

The content process cannot figure out that the touch is gone until the next touch leave (so the touch event in between does not work).

It looks like we cannot fix this on the content process side; perhaps we need to make sure the chrome process always sends the event to the content process, even when the foreground iframe (or app?) is gone.
Sorry, I've been meaning to write a test page and look into this, but have been busy. Sounds like the problems are in the e10s code for this though. You may just need to ensure that the event with this id is removed from the touch list. [1] That should just work, but if the events aren't being sent to the child process anymore, I could see it causing issues.

[1] http://mxr.mozilla.org/mozilla-central/source/layout/base/nsPresShell.cpp#6915
Flags: needinfo?(wjohnston)
Hi Wesley,

Thank you for your response!
Yes, normally we expect the parent process to remove the id and send the event to the child process, which then also removes the id.
But in the dialer case, the parent process does not send the event to the child process.
I added some logs to describe the symptom in more detail.

Normal case:
(Press number pad in dialer)
I/GonkTouch(  108): gCaptureTouchList->Count():0
I/GonkTouch(  108): touchEvent->touches.Length(): 1
I/GonkTouch(  108): targetPtr: 0x46a65510
I/GonkTouch(  467): gCaptureTouchList->Count():0
I/GonkTouch(  467): touchEvent->touches.Length(): 1
I/GonkTouch(  467): targetPtr: 0x441511c0

(finger leave)
I/GonkTouch(  108): remove id: 0, targetPtr: 0x46a65510
I/GonkTouch(  467): remove id: 0, targetPtr: 0x441511c0

How bug happens:
(incoming call)
(press two fingers on incoming call attention screen)
I/GonkTouch(  108): gCaptureTouchList->Count():1
I/GonkTouch(  108): touchEvent->touches.Length(): 2
I/GonkTouch(  108): targetPtr: 0x46a64940
I/GonkTouch(  108): targetPtr: 0x46a64940
I/GonkTouch(  467): gCaptureTouchList->Count():1
I/GonkTouch(  467): touchEvent->touches.Length(): 2
I/GonkTouch(  467): targetPtr: 0x443f3880
I/GonkTouch(  467): targetPtr: 0x443f3880

(hang up call in remote)
(remove two fingers from screen)
I/GonkTouch(  108): remove id: 1, targetPtr: 0x46a64940 //child process didn't receive event
I/GonkTouch(  108): remove id: 0, targetPtr: 0x46a64940 //child process didn't receive event

(another press number pad)
I/GonkTouch(  108): gCaptureTouchList->Count():0 //parent process thinks it's the 1st touch
I/GonkTouch(  108): touchEvent->touches.Length(): 1
I/GonkTouch(  108): targetPtr: 0x46a65510
I/GonkTouch(  467): gCaptureTouchList->Count():1 //child process thinks it's the 2nd touch since the previous touch end was missed
I/GonkTouch(  467): touchEvent->touches.Length(): 1

http://mxr.mozilla.org/mozilla-central/source/content/events/src/nsEventStateManager.cpp#1736
Since the remote target here is gone, we cannot find a proper target to send the finger leave event to.
Not sure if we should 
1) notify child process when finger leave
2) discard last finger in next touch
Flags: needinfo?(wjohnston)
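The mismatch in the log above can be replayed with a small model of the two capture lists. This is only a sketch in Python, not Gecko code: `Process`, `touch_start`, `touch_end`, and `replay` are hypothetical names, and the dictionary merely stands in for `gCaptureTouchList`. It assumes, as described in this comment, that the parent forwards nothing to the child for the touch ends once the remote target is gone.

```python
class Process:
    """Toy stand-in for one process's captured-touch table (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.captured = {}  # touch id -> target label

    def touch_start(self, touch_id, target):
        # Touches already captured before this one is added; this plays
        # the role of gCaptureTouchList->Count() in the log.
        count_before = len(self.captured)
        self.captured[touch_id] = target
        return count_before

    def touch_end(self, touch_id):
        self.captured.pop(touch_id, None)


def replay():
    parent, child = Process("parent"), Process("child")

    # Two fingers go down on the attention screen; both processes see them.
    for touch_id in (0, 1):
        parent.touch_start(touch_id, "attention-screen")
        child.touch_start(touch_id, "attention-screen")

    # The remote target is gone, so the parent handles the touch ends
    # locally and (assumption) forwards nothing to the child.
    for touch_id in (1, 0):
        parent.touch_end(touch_id)

    # Next single-finger press on the number pad.
    parent_count = parent.touch_start(0, "number-pad")
    child_count = child.touch_start(0, "number-pad")
    return parent_count, child_count, sorted(child.captured)
```

In this model the parent starts the new press from an empty list while the child still holds stale entries, so the child sees the single-finger press as part of a multi-touch. The device log shows Count():1 in the child at this point; the model loses both ends and reports 2, but the disagreement between the two processes is the same.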
I can confirm the bug on Alcatel One Touch Fire (hamachi) on b2g 1.4 master built by myself on 01/01/14.

STR:
1. Phone is locked, screen is turned off.
2. Incoming call occurs.
3. Unlock screen, tap on top to get into incoming call bar on the bottom.
4. The incoming call bar on the bottom doesn't react to swipes in any direction, but everything except the incoming call bar works properly and responds to gestures.
(In reply to viral [:viralwang] from comment #12)
> Not sure if we should 
> 1) notify child process when finger leave

I think this is probably what you want to do. The child process should be deciding for itself that the target is gone and the ends should be removed, not the parent...
Flags: needinfo?(wjohnston)
(In reply to mac from comment #13)
> I can confirm the bug on Alcatel One Touch Fire (hamachi) on b2g 1.4 master
> built by myself on 01/01/14.

Let's make sure this gets fixed for 1.4.  We've had it for a while, and this doesn't sound like a fix we want to rush.
blocking-b2g: 1.3? → 1.4+
Now it's fixed on 1.4; I can answer incoming calls when the screen is locked.
This misbehaviour is affecting the incoming calls. This is a blocker for certification. We need it fixed in version 1.3, sorry. Renominating accordingly.
blocking-b2g: 1.4+ → 1.3?
Note that this is not a regression and that we haven't heard reports of it happening in the wild.
(In reply to Beatriz Rodríguez [:brg] from comment #17)
> This misbehaviour is affecting the incoming calls. This is a blocker for
> certification. We need it fixed in version 1.3, sorry. Renominating
> accordingly.

How would this be a certification blocker if this has been around since 1.1? We've already passed certification with this bug present in IOT cycles.
Flags: needinfo?(brg)
(In reply to Jason Smith [:jsmith] from comment #19)
> (In reply to Beatriz Rodríguez [:brg] from comment #17)
> > This misbehaviour is affecting the incoming calls. This is a blocker for
> > certification. We need it fixed in version 1.3, sorry. Renominating
> > accordingly.
> 
> How would this be a certification blocker if this has been around since 1.1?
> We've already passed certification with this bug present in IOT
> cycles.
Jason, this bug was originally reported by a partner; check the See Also field.
This malfunction is in the branch and it has a big impact on carrier business.
Before removing it from the 1.3 scope, I would like someone from the Product team to give their opinion about the importance of fixing this bug.
IMHO, we must include it in v1.3 because it is an instability with a big impact on UX and carrier business.
Flags: needinfo?(brg) → needinfo?(ffos-product)
I know we didn't block on this previously.  Given the impact on answering calls, and that the workaround may not be discoverable, I think we should block on this (perhaps we should have originally).
Flags: needinfo?(ffos-product)
1.3+ for cert blocking
blocking-b2g: 1.3? → 1.3+
The PM team triaged this and considers it a 1.3 blocker.
milan - can you please find an assignee for this bug, since it is marked as a blocker for 1.3?
Flags: needinfo?(milan)
I wasn't paying attention to this, as it was filed under FirefoxOS/General and seemingly has to do with events rather than graphics. Ignoring that: Comment 16 says "Now its fixed in 1.4". Which bug fixed it? It sounds like we just need to uplift it.

NI :smaug, in case he knows of the bugfix that happened in 1.4.
Flags: needinfo?(milan) → needinfo?(bugs)
Component: General → DOM: Events
Product: Firefox OS → Core
Version: unspecified → 28 Branch
I can't recall anything obvious which could have fixed this. I guess someone who can reproduce this
should find which build is the first one which doesn't have this bug anymore.

But perhaps Bug 952170 has affected the behavior?
Flags: needinfo?(bugs)
(In reply to Olli Pettay [:smaug] from comment #26)
> I can't recall anything obvious which could have fixed this. I guess someone
> who can reproduce this
> should find which build is the first one which doesn't have this bug anymore.
> 
> But perhaps Bug 952170 has affected the behavior?

Adding qawanted to find out when last reproduced on master & first started working on master builds-wise.
Keywords: qawanted
At this point, it appears the last time this issue reproduced was on the 01/25/14 Master (1.4) build. This issue appears to be quite finicky, however, and will reproduce for me on a couple of builds between 12/10/14 and 01/28/14. When the issue does not reproduce, touch either works fully on the first tap or, most of the time, takes at least one tap on the screen to re-enable touch events for the Dialer.

- Last repro on 1.4 -
Device: Buri Master (1.4) MOZ RIL
BuildID: 20140125040202
Gaia: f382061fe95750d584a9078175c421a36892afc9
Gecko: 9e06d42c2a6a
Version: 29.0a1
Firmware Version: V1.2-device.cfg
Keywords: qawanted
Matthew, any chance you could get the changeset of the latest version of Gecko where you can reproduce the
bug, and the changeset of the first version of Gecko where you can't reproduce the bug?
A regression range would be really useful here.
Assignee: nobody → bugs
Flags: needinfo?(mvaughan)
Well now I am unable to repro this issue on the 01/25 1.4 build. With that said, I can reproduce this issue on the 01/24 build so my window will be between these two builds for now.

- Last reproducing -
Device: Buri Master (1.4) MOZ RIL
BuildID: 20140124040404
Gaia: 290efee3de3a12c9d803f4650d50bc7c7a8e1f2d
Gecko: 9d650c07b547
Version: 29.0a1
Firmware Version: V1.2-device.cfg

- Started working -
Device: Buri Master (1.4) MOZ RIL
BuildID: 20140125040202
Gaia: f382061fe95750d584a9078175c421a36892afc9
Gecko: 9e06d42c2a6a
Version: 29.0a1
Firmware Version: V1.2-device.cfg
Flags: needinfo?(mvaughan)
Can this be reproduced on 1.3 builds?
Flags: needinfo?(mvaughan)
This issue reproduced for me on the 02/03/14 1.3 build on a Buri.

Environmental Variables:
Device: Buri v1.3 MOZ RIL
BuildID: 20140203004001
Gaia: f9a37c77efb4621a1f57e4695b497d18601fe134
Gecko: 3d9d920ca43b
Version: 28.0a2
Firmware Version: V1.2-device.cfg
Flags: needinfo?(mvaughan)
Looking at the fix range in comment 30 (http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=9d650c07b547&tochange=9e06d42c2a6a) I spotted the fix for bug 959242 which looks likely to fix this.
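The fix-range link above is built from the two Gecko changesets QA reported (last reproducing and first working). A minimal sketch of that construction follows; `pushlog_url` and `PUSHLOG` are hypothetical names, and only the URL shape quoted in this comment is assumed.

```python
from urllib.parse import urlencode

# Base of the pushlog link quoted in the comment above.
PUSHLOG = "http://hg.mozilla.org/mozilla-central/pushloghtml"


def pushlog_url(last_bad, first_good):
    """Build a pushlog link covering the pushes between two changesets."""
    query = urlencode({"fromchange": last_bad, "tochange": first_good})
    return PUSHLOG + "?" + query
```

Calling `pushlog_url("9d650c07b547", "9e06d42c2a6a")` with the changesets from comment 30 gives the link used above.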
Does the bug happen only with APZC enabled?
or disabled... in other words, does the APZC setting affect this?
(In reply to viral [:viralwang] from comment #12)
> Not sure if we should 
> 1) notify child process when finger leave
> 2) discard last finger in next touch

I guess we could notify the child process about "finger leave", if the process is still there, but
iframe content isn't.
Another option would be to cancel all the current touches for an oop iframe when it is going away.
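The second option, cancelling every current touch for an out-of-process iframe when it goes away, could look roughly like this. It is a Python sketch over a plain dictionary, not Gecko code: `cancel_touches_for_target` and the target labels are hypothetical, and dispatching the actual cancel notification is not modelled.

```python
def cancel_touches_for_target(captured, dying_target):
    """Drop every captured touch aimed at dying_target.

    captured maps touch id -> target label. The returned ids are the ones
    a caller would need to report as cancelled (not modelled here).
    """
    cancelled = [tid for tid, target in captured.items() if target == dying_target]
    for tid in cancelled:
        del captured[tid]
    return cancelled


# Two fingers rest on the attention screen when its iframe is torn down.
captured = {0: "attention-screen", 1: "attention-screen", 2: "number-pad"}
cancelled = cancel_touches_for_target(captured, "attention-screen")
```

After the call, only the touch aimed at the surviving target remains in `captured`, so a later single-finger press would start from a clean state for the removed iframe.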
(In reply to Olli Pettay [:smaug] from comment #34)
> Does the bug happen only with APZC enabled?

APZC on & off won't play a factor here - this bug reproduced in 1.1, which doesn't include the APZC feature.
Depends on: 967236
(In reply to Jason Smith [:jsmith] from comment #38)
> (In reply to Olli Pettay [:smaug] from comment #34)
> > Does the bug happen only with APZC enabled?
> 
> APZC on & off won't play a factor here - this bug reproduced in 1.1, which
> doesn't include the APZC feature.

But if this bug is fixed in the build which has the patch for bug 959242, APZC certainly plays a
role here.

I filed bug 967236 to evict touch points more often. That _might_ help even in non APZC case.
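The idea of evicting touch points can be illustrated with a short sketch. This is not the patch from bug 967236, whose details are not shown in this thread; it is one plausible reading, written in Python with hypothetical names (`evict_stale_touches`, `active_ids`), in which any captured id that an incoming event no longer reports is treated as stale and dropped.

```python
def evict_stale_touches(captured, active_ids):
    """Remove captured touch ids that the incoming event no longer reports.

    captured maps touch id -> target label; active_ids lists the ids the
    new event says are currently on screen. Returns the evicted ids.
    """
    active = set(active_ids)
    stale = sorted(tid for tid in captured if tid not in active)
    for tid in stale:
        del captured[tid]
    return stale


# The child still holds two touches whose ends it never received.
child_captured = {0: "attention-screen", 1: "attention-screen"}
# A new event arrives reporting a single touch with id 0.
evicted = evict_stale_touches(child_captured, [0])
```

With the stale id gone, the child would again agree with the parent that only one finger is down.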
Can we get an ETA for fixing this bug? Thank you.
Depends on: 959242
When bug 959242 lands on 1.3, we'll want to verify this works.
viral, want to try the patch for Bug 967236 (without apzc) ?
Flags: needinfo?(vwang)
The patch in Bug 967236 fixes the disappearing touch issue!
Flags: needinfo?(vwang)
Based on the master branch at the following commit.
Tested on Unagi.

commit fdde61f35486f5b68931262e1134ce4bff71b120
Merge: 6e943c7ce 5ebd153
Author: Ryan VanderMeulen <ryanvm@gmail.com>
Date:   Wed Feb 5 15:47:36 2014 -0500

    Merge b2g-inbound to m-c.

Without patch, it fails.
With patch, it works!
bug 959242 has landed. Can someone retest this bug on 1.3?
Keywords: qawanted
I am unable to reproduce this issue on the 02/06/14 1.3 build. I tested this with APZ enabled and with it disabled.

Device: Buri v1.3 MOZ RIL
BuildID: 20140206004002
Gaia: 467ef8c9145d9a57d35b0619db541d23b522b958
Gecko: a1fa925c40c2
Version: 28.0
Firmware Version: V1.2-device.cfg
Keywords: qawanted
Awesome, closing this bug out then!
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
I am very happy to hear that the issue is fixed in v1.3. Thank you everyone. However I am a bit confused about the source of the fix, can anyone help me to clarify if it is due to bug 959242 (according to comment 45) or due to patch referenced in comment 44?
viral, want to test up-to-date mozilla central without apzc? I had to tweak the patch for bug 967236 a bit (to make it a bit safer, less regression prone).
(In reply to Beatriz Rodríguez [:brg] from comment #48)
> I am very happy to hear that the issue is fixed in v1.3. Thank you everyone.
> However I am a bit confused about the source of the fix, can anyone help me
> to clarify if it is due to bug 959242 (according to comment 45) or due to
> patch referenced in comment 44?

When APZC is enabled, bug 959242 should apparently help here.

Without APZC, bug 967236 should help (still needs re-verification).
Target Milestone: --- → mozilla29