Closed Bug 833964 Opened 13 years ago Closed 12 years ago

crash [@ mozilla::layers::AsyncPanZoomController::ContentReceivedTouch]

Categories

(Core :: Graphics: Layers, defect)

ARM
Gonk (Firefox OS)
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla23
blocking-b2g leo+
Tracking Status
firefox21 --- wontfix
firefox22 --- wontfix
firefox23 --- fixed
b2g18 + fixed
b2g18-v1.0.0 --- wontfix
b2g18-v1.0.1 --- wontfix

People

(Reporter: cjones, Assigned: kats)

References

Details

(Keywords: crash, reproducible, topcrash, Whiteboard: [b2g-crash])

Crash Data

Attachments

(2 files, 1 obsolete file)

Attached file processed dump
Seen while dogfooding over the weekend. Was doing heavy browsing and opening/closing a lot of tabs. No better STR atm. Crash reason: SIGSEGV Crash address: 0x6c003a Thread 0 (crashed) 0 libxul.so!mozilla::layers::AsyncPanZoomController::ContentReceivedTouch [AsyncPanZoomController.cpp : 1346 + 0x4] r4 = 0x4aef1800 r5 = 0x00000000 r6 = 0xbe87e090 r7 = 0x00050024 r8 = 0xbe87e760 r9 = 0x4030990c r10 = 0x00000000 fp = 0x00000000 sp = 0xbe87de60 lr = 0x40b0ef0f pc = 0x411ef462 Found by: given as instruction pointer in context 1 libxul.so!mozilla::layout::RenderFrameParent::ContentReceivedTouch [RenderFrameParent.cpp : 978 + 0x3] r4 = 0xbe87e090 r5 = 0xbe87e6fc r6 = 0xbe87e090 r7 = 0x00050024 r8 = 0xbe87e760 r9 = 0x4030990c r10 = 0x00000000 fp = 0x00000000 sp = 0xbe87de70 pc = 0x40b0ef0f Found by: call frame info 2 libxul.so!mozilla::dom::TabParent::RecvContentReceivedTouch [TabParent.cpp : 1304 + 0x5] r4 = 0xbe87e090 r5 = 0xbe87e6fc r6 = 0xbe87e090 r7 = 0x00050024 r8 = 0xbe87e760 r9 = 0x4030990c r10 = 0x00000000 fp = 0x00000000 sp = 0xbe87de78 pc = 0x4105d685 Found by: call frame info 3 libxul.so!mozilla::dom::PBrowserParent::OnMessageReceived [PBrowserParent.cpp : 1477 + 0xb] r4 = 0x47047ec0 r5 = 0xbe87e6fc r6 = 0xbe87e090 r7 = 0x00050024 r8 = 0xbe87e760 r9 = 0x4030990c r10 = 0x00000000 fp = 0x00000000 sp = 0xbe87de80 pc = 0x410978a7 Found by: call frame info
Severity: normal → critical
blocking-b2g: --- → tef?
Crash Signature: [@ mozilla::layers::AsyncPanZoomController::ContentReceivedTouch]
Keywords: crash
OS: Linux → Gonk (Firefox OS)
Hardware: x86_64 → ARM
Whiteboard: [b2g-crash]
(In reply to Chris Jones [:cjones] [:warhammer] from comment #0) > Seen while dogfooding over the weekend. Was doing heavy browsing and > opening/closing a lot of tabs. No better STR atm. Without an STR we can't block but let's track this.
blocking-b2g: tef? → -
tracking-b2g18: --- → +
to clarify : I am seeing those on socorro. Not the actual device.
Flags: needinfo?(jones.chris.g)
I saw the crash on unagi, but none of this code is device specific.
Flags: needinfo?(jones.chris.g)
Just guessing: Guessing it had to do with browser, the parent tab, a frame and zooming? Since zooming doesn't change when following a link in the same tab, maybe it crashed during the opening a web page that had an iframe? Going to toy with that idea.
There have been no crashes since 18.0/20130131.
Do we still need qawanted on this? Looks like the crash isn't reproducible at this point.
Status: NEW → RESOLVED
Closed: 12 years ago
Keywords: qawanted
Resolution: --- → WORKSFORME
I am not sure it's fixed. Indeed, crashes occurred in B2G 18.0/20130222.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Build ID: 20130404070202 This crash is occured to leo device. We got a same dump file to Comment 10. This issue reproduced when the complex web page is opening accroding to click link in bookmarked twiter on homescreen.(1 time) It concerned us that this issue could be reproduce many times to be used the FFOS by many people.
blocking-b2g: - → leo?
Keywords: reproducible
Can you provide the URL where this was reproducible?
Flags: needinfo?(leo.bugzilla.gecko)
Partners are considering this a blocking issue (in triage right now). We'll start this off as leo+ assuming that QA is either able to repro, or engineering can easily figure out the issue from the stack. Otherwise we'll move this back to blocking-b2g:-.
blocking-b2g: leo? → leo+
Keywords: qawanted
Joe - Does the crash stack give any indication of what might not be working here? Any hints on what STR might cause this stack?
Flags: needinfo?(joe)
I have absolutely no clue about APZC; I understand that Anthony Jones does, but I think he's indisposed at the moment. Maybe roc has some idea who would.
Flags: needinfo?(joe) → needinfo?(roc)
If the crash report https://crash-stats.mozilla.com/report/index/8cf2155f-801d-4c88-9612-f13af2130319 https://crash-stats.mozilla.com/report/index/1c7e05a0-dd5e-4934-af89-f27cf2130131 corresponds to https://hg.mozilla.org/releases/mozilla-b2g18/annotate/051fa7fa5ded/gfx/layers/ipc/AsyncPanZoomController.cpp https://hg.mozilla.org/releases/mozilla-b2g18/annotate/d5d6ef77f2d9/gfx/layers/ipc/AsyncPanZoomController.cpp respectively, they both crash at: mTouchListenerTimeoutTask->Cancel(); Are we hitting a race-condition problem? Since both Cancel() and ~RunnableMethod() in class RunnableMethod call ReleaseCallee(), which is obviously not thread safe: void ReleaseCallee() { if (obj_) { RunnableMethodTraits<T>::ReleaseCallee(obj_); obj_ = NULL; } }
If I understand correctly, all accesses to mTouchListenerTimeoutTask happen on the controller/UI thread. So there shouldn't be a threading-related race condition here. I don't see how we'd crash here though. mTouchListenerTimeoutTask should always be null or a valid pointer...
Flags: needinfo?(roc)
So AFAICT the actual CancelableTask object pointed to mTouchListenerTimeoutTask is created in NewRunnableMethod (either at line 235 or line 267) and is deleted by the message loop code, I think around [1]. This means that once the task finished executing (i.e. the TimeoutTouchListeners method finishes), the pointer mTouchListenerTimeoutTask is pointing to garbage if it hasn't been nulled out explicitly. This could happen if, for example, the early-exit condition at the top of ContentReceivedTouch gets hit when TimeoutTouchListeners gets called. I don't know if there that early-exit condition is guaranteed to never hit when being called through TimeoutTouchListeners but I don't think it is. So, in a nutshell: - mTouchListenerTimeoutTask is assigned and TimeoutTouchListeners is queued as a delayed task - TimeoutTouchListeners runs - The call to ContentReceivedTouch from TimeoutTouchListeners does an early exit at the top - The message loop deletes the object pointed to by mTouchListenerTimeoutTask since it's done - Some time later (could be much later, since mTouchListenerTimeoutTask is never re-assigned when non-null, except in ContentReceivedTouch), content calls into ContentReceivedTouch - This time the early exit doesn't get hit - Since mTouchListenerTimeoutTask is non-null, calling ->Cancel() on it goes boom. In terms of fixing it, I see two options: one is to move the mTouchListenerTimeoutTask cancellation/nulling to the top of the method, before the early exit. I don't know if this will have other side-effects because I don't know what that check is doing exactly. The second option, much safer, is to just set mTouchListenerTimeoutTask to null in TimeoutTouchListeners(). We know the only place that function gets called is from the task, and once it's done, the task will be destroyed, so we might as well null out the pointer to it. [1] http://mxr.mozilla.org/mozilla-central/source/ipc/chromium/src/base/message_loop.cc#416
Not sure who should review this exactly. See above comment for an explanation of what I think is happening.
Attachment #739074 - Flags: review?(bugzilla)
Pulling qawanted - we're unblocked here to fix this bug.
Keywords: qawanted
Comment on attachment 739074 [details] [diff] [review] Null out potentially dangling pointer Review of attachment 739074 [details] [diff] [review]: ----------------------------------------------------------------- Makes perfect sense to me. I don't even really think this needs a comment explaining it.
Attachment #739074 - Flags: review?(bugzilla) → review+
Updated patch to remove comment. I also felt it was kind of unnecessary.
Attachment #739074 - Attachment is obsolete: true
Attachment #739209 - Flags: review+
Adding checkin-needed flag since I'm not sure what the process for landing on b2g-18 is, and it looks like this patch needs to go there.
Keywords: checkin-needed
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Hit this on 1.0.1 branch: https://crash-stats.mozilla.com/report/index/18fe8cd7-d78e-4e4c-b2d6-bd3ff2130516 Nominating for consideration. The issue was hit while aggressively panning and zooming in testing scenarious related to Bug 847592.
blocking-b2g: leo+ → tef?
Glad we have better STR, but still only leo+. Don't want to take non-blocking change at this point.
blocking-b2g: tef? → leo+
(In reply to lsblakk@mozilla.com [:lsblakk] from comment #12) > Can you provide the URL where this was reproducible? We tested this page "http://www.naver.com"
Flags: needinfo?(leo.bugzilla.gecko)
Flags: in-moztrap-
It's #5 top crasher in B2G 18.0.
Keywords: topcrash
Does that mean the patch didn't fix it? Or do the crashing builds not have the patch?
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #32) > Or do the crashing builds not have the patch? Keon and Peak that still crash are running v1.0.1 that doesn't contain the fix. Currently, B2G top crashers are almost exclusively on Geekphones, the most widespread Fx OS phones.
Assignee: nobody → bugmail.mozilla
Blocks: 838215
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: