Closed Bug 1086931 Opened 10 years ago Closed 10 years ago

[Window management] crash in mozilla::layers::TouchBlockState::HasEvents() const

Categories

(Core :: Panning and Zooming, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

VERIFIED FIXED
Tracking Status
b2g-v2.1 --- verified

People

(Reporter: KTucker, Assigned: kats)

Details

(Keywords: crash, Whiteboard: [b2g-crash] [2.1-exploratory-3])

Crash Data

This bug was filed from the Socorro interface and is report bp-00135df1-857a-446b-b44c-393282141021.
=============================================================

Description:
A crash occurred while scrolling an image-heavy website and locking/unlocking the phone.

Repro Steps:
1)  Update Flame to Build ID: 20141021001201
2)  Open the browser and go to www.gamespot.com
3)  While the page is loading, keep scrolling up and down the page.
4)  Lock the phone and unlock the phone.
5)  Repeat steps 3 and 4 multiple times.

Actual:
A crash occurs when scrolling an image-heavy webpage and locking/unlocking the device under test (DUT).

Expected:
No crash occurs.

Environmental Variables
Device: Flame 2.1 (319mb)(Kitkat Base)(Full Flash)
Build ID: 20141021001201
Gecko: https://hg.mozilla.org/releases/mozilla-b2g34_v2_1/rev/ee86921a986f
Gaia: e458f5804c0851eb4e93c9eb143fe044988cecda
Platform Version: 34.0
Firmware Version: v188
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

Notes:
Repro frequency: 1/10 attempts
See attached: Video
Keywords: steps-wanted
Is there a video? I don't see one attached. Also, if you can recall which site you saw it on, that would give us a better chance to repro.
Flags: needinfo?(ktucker)
Whiteboard: [b2g-crash]
Also adding Botond to cc list since he touched code in this area according to the stack: http://hg.mozilla.org/releases/mozilla-b2g34_v2_1/annotate/ee86921a986f/gfx/layers/apz/src/InputBlockState.cpp#l140
Component: Gaia::System::Window Mgmt → Graphics
Product: Firefox OS → Core
OS: Android → Gonk (Firefox OS)
Hardware: All → ARM
Keywords: qawanted
I think there's a race condition in the code where the APZC might get destroy()d on the compositor thread while the ProcessPendingInputBlocks is running on the controller thread, which could result in this.
Component: Graphics → Panning and Zooming
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #3)
> I think there's a race condition in the code where the APZC might get
> destroy()d on the compositor thread while the ProcessPendingInputBlocks is
> running on the controller thread, which could result in this.

Kats, do you have any tips on how a tester can get into this race condition and try to reproduce it manually?
Flags: needinfo?(bugmail.mozilla)
Most likely the critical elements are to load a page with slow touchstart event listeners (a big page like gamespot.com or nytimes.com is a likely candidate, or you could construct one), and then to lock the device as quickly as possible after touching the screen, for example to start a pan. That should cause the APZC to get destroyed while the touch event listener is still running, and the crash should then trigger when the listener finishes running.
Flags: needinfo?(bugmail.mozilla)
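For readers unfamiliar with this kind of race, here is a minimal C++ sketch of the suspected sequence: a touch block is queued, the controller is destroyed, and only afterwards is the pending block processed. It is not Gecko code and not the fix from bug 1083395. `ToyTouchBlock`, `ToyController`, and all of their methods are hypothetical stand-ins, and the mutex plus the "destroyed" flag are just one assumed way to make the late call harmless.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <mutex>

// Hypothetical stand-in for a touch input block; not the Gecko class.
class ToyTouchBlock {
public:
  void AddEvent(int aEvent) { mEvents.push_back(aEvent); }
  bool HasEvents() const { return !mEvents.empty(); }
  void DropEvents() { mEvents.clear(); }

private:
  std::deque<int> mEvents;
};

// Hypothetical stand-in for the object that owns the queue of blocks.
class ToyController {
public:
  // Would run on the controller thread when a touch starts.
  void StartBlock(int aEvent) {
    std::lock_guard<std::mutex> lock(mMutex);
    auto block = std::make_unique<ToyTouchBlock>();
    block->AddEvent(aEvent);
    mQueue.push_back(std::move(block));
  }

  // Would run on the compositor thread during teardown.
  void Destroy() {
    std::lock_guard<std::mutex> lock(mMutex);
    mDestroyed = true;
    mQueue.clear();
  }

  // Would run on the controller thread once the content listener has
  // finished. Returns the number of blocks processed.
  int ProcessPendingInputBlocks() {
    std::lock_guard<std::mutex> lock(mMutex);
    int processed = 0;
    // Re-check the state under the lock on every iteration. Calling
    // front() on an empty deque is undefined behavior, so an unguarded
    // version of this loop could crash after Destroy() has run.
    while (!mDestroyed && !mQueue.empty()) {
      ToyTouchBlock* current = mQueue.front().get();
      if (current->HasEvents()) {
        current->DropEvents();
      }
      mQueue.pop_front();
      ++processed;
    }
    return processed;
  }

private:
  std::mutex mMutex;
  std::deque<std::unique_ptr<ToyTouchBlock>> mQueue;
  bool mDestroyed = false;
};
```

Calling `StartBlock()`, then `Destroy()`, then `ProcessPendingInputBlocks()` on one `ToyController` reproduces the ordering described above in a deterministic, single-threaded way; with the guard in place the last call simply processes nothing.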
Thank you Kartikaya for the information. I was able to reproduce this crash once more by navigating through the Gamespot website, scrolling the page and locking/unlocking the device. I have tried this several more times and I still haven't got a solid repro or video for this issue. I will keep working on it. One thing I did notice was that scrolling will completely break after a while. The user will not be able to scroll the webpage at all. This has happened to me twice in my many attempts. This could be related or a completely separate issue.
Flags: needinfo?(ktucker)
Whiteboard: [b2g-crash] → [b2g-crash] [2.1-exploratory-3]
[Blocking Requested - why for this release]: I'm not convinced this actually needs to be blocking because it seems fairly rare, but I'm requesting blocking to know what the release drivers think. The patch series I'm working on in bug 1083395 will fix this, but that will land on 2.2 and will NOT be upliftable to 2.1. So if we want to fix this in 2.1 we will need to either land a fix before I land bug 1083395 and uplift that, or fix it directly in 2.1 code at some point. So I would like to know if this is considered blocking to determine if we should spend time on it or not.
blocking-b2g: --- → 2.1?
Given that the reproduction rate is low, and it's not showing up as a topcrash in Kairo's report, I'm inclined to agree with Kats that we should let this ride the trains. It's really too late to risk landing complicated patches. I'm going to mark this 2.1- and move it to 2.2?. But if this surfaces in the top crashes or we find a high reproducibility rate, we can renominate.
blocking-b2g: 2.1? → 2.2?
Assignee: nobody → bugmail.mozilla
This should be fixed by bug 1083395 which moves around a lot of this code.
Status: NEW → RESOLVED
Closed: 10 years ago
Depends on: 1083395
Resolution: --- → FIXED
Unable to reproduce issue on Flame 2.1 build (Full Flash, nightly, 319 MB memory). 

I spent 1+ hours attempting to repro this bug, but could not get the Browser app to crash (gamespot.com). 

Environmental Variables:
Device: Flame 2.1
Build ID: 20141028001203
Gaia: a0174f7166745256aaca1cb3aa9f894033fbffa6
Gecko: 43bda3541f6b
Version: 34.0 (2.1)
Firmware Version: v188
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(jmitchell)
Keywords: qawanted
Issue is fixed, removing steps-wanted keyword
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(jmitchell)
Keywords: steps-wanted
Changing the status to VERIFIED based on Comment 9, Comment 10, and Comment 11.
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?]
Flags: needinfo?(ktucker)
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(ktucker)
blocking-b2g: 2.2? → ---