Closed
Bug 1256344
Opened 8 years ago
Closed 8 years ago
crash in mozilla::layers::AsyncPanZoomController::OnLongPress
Categories
(Core :: Panning and Zooming, defect)
Tracking
()
RESOLVED
FIXED
mozilla48
Tracking | Status | |
---|---|---|
firefox48 | --- | fixed |
People
(Reporter: MatsPalmgren_bugz, Assigned: kats)
References
Details
(Keywords: crash, Whiteboard: gfx-noted)
Crash Data
Attachments
(2 files, 1 obsolete file)
2.64 KB,
patch
|
botond
:
review+
|
Details | Diff | Splinter Review |
1.53 KB,
patch
|
botond
:
review+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is report bp-b9334420-832d-4111-89e8-d77882160313. ============================================================= This signature appears to have become more frequent in v48. The stack is similar to bug 1256341 so it might be the same underlying issue. The volume is low so far (for the past 28 days): Operating System Percentage Number Of Crashes Windows 10 64.29% 9 Windows 8.1 28.57% 4 Linux 7.14% 1 Product Version Percentage Number Of Crashes Firefox 48.0a1 78.57% 11 FennecAndroid 48.0a1 7.14% 1 Firefox 46.0a2 7.14% 1 Firefox 47.0a1 7.14% 1 Architecture Percentage Number Of Crashes amd64 50.00% 7 x86 42.86% 6 arm 7.14% 1 Stack: mozilla::layers::AsyncPanZoomController::OnLongPress(mozilla::TapGestureInput const&) mozilla::layers::AsyncPanZoomController::HandleGestureEvent(mozilla::InputData const&) mozilla::layers::GestureEventListener::HandleInputTimeoutLongTap() MessageLoop::DoDelayedWork(base::TimeTicks*) mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) MessageLoop::RunHandler() MessageLoop::Run() nsBaseAppShell::Run() nsAppShell::Run() nsAppStartup::Run() XREMain::XRE_mainRun() BTW, the Linux crash is bp-a166d953-9f27-414c-ad7a-460a52160313 That stack is slightly different.
Reporter | ||
Comment 1•8 years ago
|
||
Fwiw, this signature is currently at #22 in Firefox 48.0a1 Top Crash list. (Together with bug 1256341 they would be at #10.)
Comment 2•8 years ago
|
||
This comes from: http://hg.mozilla.org/mozilla-central/diff/2f392da0634f/gfx/layers/apz/src/AsyncPanZoomController.cpp#l1665 CurrentTouchBlock() is null. Kats is a null check appropriate here?
Comment 3•8 years ago
|
||
While bug 1113386 is the patch that added the call to CurrentTouchBlock(), it may not be what regressed this, as it landed over a year ago, and we're only hearing about these crashes now. I suspect it's more likely that a more recent patch broke the invariant that CurrentTouchBlock() cannot be null there.
Assignee | ||
Comment 4•8 years ago
|
||
I think the call to CurrentTouchBlock() is definitely wrong, since CurrentTouchBlock() is intended to be called while the active input block is a touch block. In both this bug and bug 1256341, the call happens from a posted runnable at which point the active input block may be something other than the touch block that posted the runnable. I suspect this problem has existed since this code was added, but is only being exposed on desktop because actually get other kinds of input blocks (in particular, wheel/mouse input) whereas on B2G that never happened. This is probably easy to reproduce if you are interleaving different kinds of input.
Comment 5•8 years ago
|
||
Yeah, I think you're right. I thought perhaps we had some logic to cancel the relevant runnables when a new kind of input block is started, but it doesn't look like we do.
Comment 6•8 years ago
|
||
Adding a null check in OnLongPress() seems fine - a touch block that's gone away cannot be in a fast fling. Adding one in GenerateSingleTap() is also safe, but we have a latent bug here: - When processing a touch-end, we send an APZStateChange::EndTouch notification, with an aWasClick argument that's set to TouchBlockState::mSingleTapOccurred. This is used by ActiveElementManager. - TouchBlockState::mSingleTapOccurred is set in GenerateSingleTap(). - The bug is that, in general, GenerateSingleTap() can happen *after* the touch-end is processed (if we were waiting for a timeout), so the value of aWasClick being sent with the EndTouch notification may not be accurate. Do we care?
Assignee | ||
Comment 7•8 years ago
|
||
(In reply to Botond Ballo [:botond] from comment #6) > Adding a null check in OnLongPress() seems fine - a touch block that's gone > away cannot be in a fast fling. > > Adding one in GenerateSingleTap() is also safe, I'm not 100% convinced yet. The scenario would be as follows: a) user kicks off a fling (touch down, bunch of touch moves, touch up with high velocity). All of this is input block #1 b) user does a quick tap (touch down, touch up). This is input block #2 and causes the tap timeout to get scheduled c) user does a wheel event, input block #3 d) tap timeout fires, with the "current input block" as #3 (i.e. CurrentTouchBlock returns nullptr) So here input block #2 will have the mDuringFastFling flag set on it, because that is set during input block construction. At step (d) I don't think we want to process it as a tap; instead we want to ignore it because at step (b) the page was flinging fast and the user stopped the fling with the tap. Or to put it another way, if you omit step (c), we wouldn't generate the single-tap event. So we shouldn't generate it with step (c) either. > but we have a latent bug > here: > > - When processing a touch-end, we send an APZStateChange::EndTouch > notification, with an > aWasClick argument that's set to TouchBlockState::mSingleTapOccurred. > This is used > by ActiveElementManager. > > - TouchBlockState::mSingleTapOccurred is set in GenerateSingleTap(). > > - The bug is that, in general, GenerateSingleTap() can happen *after* the > touch-end is > processed (if we were waiting for a timeout), so the value of aWasClick > being sent > with the EndTouch notification may not be accurate. > > Do we care? I think we do care, but maybe in a separate bug. That might actually be of the things that's impacting bug 1227241.
Comment 8•8 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #7) > Or to put it another way, if you omit step (c), we wouldn't generate the > single-tap event. So we shouldn't generate it with step (c) either. Oh, good point! I forgot about the exact semantics of mDuringFastFling. Perhaps we can save the touch block's mDuringFastFling on the GEL (or in the task), and pass it into OnLongPress?
Assignee | ||
Comment 9•8 years ago
|
||
I think we can do one better and skip scheduling the timers entirely if the touch block has mDuringFastFling, because the events they generate are gonna get ignored anyway.
Comment 11•8 years ago
|
||
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #9) > I think we can do one better and skip scheduling the timers entirely if the > touch block has mDuringFastFling, because the events they generate are gonna > get ignored anyway. I've been trying to reason about whether or not that would work; in particular, whether or not it's OK to omit the state transitions that are triggered when the timers fire. For example, consider a quick tap followed by a long tap. I think we rely on the max tap timeout expiring and setting the gesture state to NONE to prevent this from being treated as a double-tap. (An unrelated question about this scenario - it doesn't look like in this scenario the long tap is detected as a long tap. Should it be?)
Assignee | ||
Comment 12•8 years ago
|
||
(In reply to Botond Ballo [:botond] from comment #11) > I've been trying to reason about whether or not that would work; in > particular, whether or not it's OK to omit the state transitions that are > triggered when the timers fire. > > For example, consider a quick tap followed by a long tap. I think we rely on > the max tap timeout expiring and setting the gesture state to NONE to > prevent this from being treated as a double-tap. We can probably do something like keep the state transitions but ditch the events dispatched into APZC. I'll take this bug and see if I can cook up a fix and a gtest. > (An unrelated question about this scenario - it doesn't look like in this > scenario the long tap is detected as a long tap. Should it be?) I'm not sure, that might count as a double tap.
Assignee: nobody → bugmail.mozilla
Flags: needinfo?(bugmail.mozilla)
Assignee | ||
Comment 13•8 years ago
|
||
I wrote a gtest to exercise this, that was pretty easy. However the fix I tried (passing the IsDuringFastFling into the NewRunnableMethod and throwing away the TapGestureInput if it's true) didn't quite work. The problem is that when we do trigger a long-press, we call InjectNewTouchBlock on the input queue, which tries to copy things over from the current touch block, and if that's null there's still a crash. Need to figure out if we should keep that old touch block around, or if it's not really valid any more, or what.
Assignee | ||
Comment 14•8 years ago
|
||
Here's the gtest anyway.
Assignee | ||
Comment 15•8 years ago
|
||
Attachment #8730864 -
Attachment is obsolete: true
Attachment #8731310 -
Flags: review?(botond)
Assignee | ||
Comment 16•8 years ago
|
||
So I concluded here that if we get a wheel event while the finger is still down, we should abort the long-press because it doesn't really make sense to dispatch that. If the finger is lifted before the wheel event comes in we should not be generating a long-press anyway since we cancel the runnable.
Attachment #8731311 -
Flags: review?(botond)
Updated•8 years ago
|
Attachment #8731310 -
Flags: review?(botond) → review+
Updated•8 years ago
|
Attachment #8731311 -
Flags: review?(botond) → review+
Comment 17•8 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/b0079f81c85b https://hg.mozilla.org/integration/mozilla-inbound/rev/33c773637c21
Comment 18•8 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/b0079f81c85b https://hg.mozilla.org/mozilla-central/rev/33c773637c21
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Assignee | ||
Comment 19•8 years ago
|
||
For the record I'm able to run into this crash pretty easily while trying to tap/double-tap/drag the scrollbars on the March 16 Windows nightly.
You need to log in
before you can comment on or make changes to this bug.
Description
•