Closed Bug 1101297 Opened 10 years ago Closed 9 years ago

Crash in mozilla::dom::TabParent::Release()

Categories

(Firefox OS Graveyard :: Stability, defect)

Hardware: ARM
OS: Gonk (Firefox OS)
Type: defect
Priority: Not set
Severity: normal

Tracking

(blocking-b2g:2.1+, b2g-v2.1 affected, b2g-v2.2 affected)

RESOLVED DUPLICATE of bug 1105468

People

(Reporter: anshulj, Assigned: mccr8)

References

Details

(Keywords: crash, Whiteboard: [b2g-crash][caf priority: p2][CR 786377][caf-crash 216])

Attachments

(2 files)

Attached file stack trace
We are consistently seeing the crash libxul.so!mozilla::dom::TabParent::Release() [TabParent.cpp : 217 + 0x4] when running voice call and SMS related tests. I do not have specific STRs that reproduce this crash.

I am also seeing this issue on 2.1 if I enable asserts by setting B2G_DEBUG and running the same tests. On 2.2 the crash happens without needing to turn on the asserts.

Please find the attachments with the logs and the stack trace.
Attached file logs
OS: All → Gonk (Firefox OS)
Hardware: All → ARM
Bhavana, could you please get someone to analyze the issue? This crash is happening very frequently for us and we have had to disable a lot of our unit tests because of it.
Flags: needinfo?(sbhavna)
Sorry, added NI for wrong Bhavna.
Flags: needinfo?(sbhavna) → needinfo?(bbajaj)
Andrew, any first impressions on this one or can you please help re-direct this to the right person who can help investigate? thanks!
Flags: needinfo?(bbajaj) → needinfo?(continuation)
Component: Stability → Video/Audio
Flags: needinfo?(continuation)
Product: Firefox OS → Core
From the stack, it looks like we're off on some media thread, then we call AsyncLatencyLogger::Release().  I guess this is okay because this looks like some kind of low-level media class that can be used wherever.

Then, somehow, we're in ~OpenFileAndSendFDRunnable().  This is a class that is very main-thready looking, and holds references to mainthread-only classes, so when we release those, I presume we're hitting a thread assertion.

I have no idea how AsyncLatencyLogger is triggering a release of some other runnable.

Though, OpenFileAndSendFDRunnable does look like it expects to be run off the main thread sometimes, so maybe the problem is that its dtor should dispatch its releases of main thread objects to the main thread?  

Ben, do you have some idea what is going wrong here?
Flags: needinfo?(bent.mozilla)
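The idea raised above, having the destructor hand its main-thread-only references to the main thread instead of releasing them in place, could be sketched roughly as follows. This is a single-threaded model, not Gecko code: `MainThreadQueue`, `MainThreadOnlyObject`, `ProxyReleasingRunnable`, and `DestroyOffMainThenDrain` are all hypothetical names invented for illustration, and a plain queue stands in for the main thread's event loop.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical stand-in for the main thread's event queue.
struct MainThreadQueue {
    std::queue<std::function<void()>> tasks;
    void Dispatch(std::function<void()> task) { tasks.push(std::move(task)); }
    // Runs every queued task; in this model, this is "the main thread running".
    void Drain() {
        while (!tasks.empty()) {
            std::function<void()> task = std::move(tasks.front());
            tasks.pop();
            task();
        }
    }
};

// Hypothetical object that must only be released on the main thread.
struct MainThreadOnlyObject {
    int refCount = 1;
    bool releasedOnMain = false;
    void Release(bool onMainThread) {
        --refCount;
        releasedOnMain = onMainThread;
    }
};

// Hypothetical runnable whose destructor may run off the main thread.
class ProxyReleasingRunnable {
public:
    ProxyReleasingRunnable(MainThreadQueue& queue, MainThreadOnlyObject* obj)
        : queue_(queue), obj_(obj) {}

    ~ProxyReleasingRunnable() {
        if (!obj_) {
            return;
        }
        // Do not release here; queue the release for the main thread instead.
        MainThreadOnlyObject* obj = obj_;
        queue_.Dispatch([obj] { obj->Release(/* onMainThread = */ true); });
    }

private:
    MainThreadQueue& queue_;
    MainThreadOnlyObject* obj_;
};

// Destroys the runnable "off main", then lets the modelled main thread run.
bool DestroyOffMainThenDrain() {
    MainThreadQueue queue;
    MainThreadOnlyObject obj;
    {
        ProxyReleasingRunnable runnable(queue, &obj);
    }  // destructor runs here and only queues the release
    bool untouchedBeforeDrain = (obj.refCount == 1);
    queue.Drain();
    return untouchedBeforeDrain && obj.refCount == 0 && obj.releasedOnMain;
}
```

In this model the object's reference count is untouched until the queue is drained, which is the property the suggestion is after. Whether the real runnable should work this way is exactly the open question in this comment.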
AsyncLatencyLogger::Release is probably just a symbol that has been coalesced with something more relevant to us.  OpenFileAndSendFDRunnable is supposed to start on the main thread, bounce to a background thread to open the file, come back to the main thread to do some IPC calls, and then go back to the background thread to close the file.  The TabParent is supposed to be released in the main thread section (in SendResponse) so this suggests that the runnable never went back to the main thread.  That could certainly happen if we take one of the NS_ENSURE_SUCCESS paths at http://mxr.mozilla.org/mozilla-central/source/dom/ipc/TabParent.cpp#177 ...
Flags: needinfo?(bent.mozilla)
Thanks, it sounds like this is an issue in OpenFileAndSendFDRunnable most likely.
Component: Video/Audio → DOM: Content Processes
And one reason we could take that path is if the relevant file doesn't exist.

So yeah, we really need to remove those early returns.
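A minimal model of the failure mode described above, assuming the runnable's last reference is dropped on the background thread when it bails out early. None of this is the real Gecko code: `FakeTabParent`, `FakeOpenFileRunnable`, `ThreadKind`, and `WhereTabParentIsReleased` are hypothetical, threads are only labels, and a direct call stands in for a dispatch to the main thread.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; these are not the real Gecko classes.
enum class ThreadKind { Main, Background };

struct FakeTabParent {
    bool released = false;
    ThreadKind releasedOn = ThreadKind::Main;
    void Release(ThreadKind current) {
        released = true;
        releasedOn = current;
    }
};

class FakeOpenFileRunnable {
public:
    FakeOpenFileRunnable(FakeTabParent* tab, bool fileExists, bool bailOutOnFailure)
        : tab_(tab), fileExists_(fileExists), bailOutOnFailure_(bailOutOnFailure) {}

    // Background-thread step: try to open the file, then hand off to the main thread.
    void OpenFile() {
        if (!fileExists_ && bailOutOnFailure_) {
            return;  // early return: SendResponse() is never reached
        }
        SendResponse();  // direct call models the dispatch back to the main thread
    }

    // Main-thread step: the only place the tab reference is meant to be dropped.
    void SendResponse() {
        tab_->Release(ThreadKind::Main);
        tab_ = nullptr;
    }

    // Model assumption: the runnable is destroyed on the background thread.
    ~FakeOpenFileRunnable() {
        if (tab_) {
            tab_->Release(ThreadKind::Background);
        }
    }

private:
    FakeTabParent* tab_;
    bool fileExists_;
    bool bailOutOnFailure_;
};

// Reports which modelled thread ends up releasing the tab.
ThreadKind WhereTabParentIsReleased(bool fileExists, bool bailOutOnFailure) {
    FakeTabParent tab;
    {
        FakeOpenFileRunnable runnable(&tab, fileExists, bailOutOnFailure);
        runnable.OpenFile();
    }
    assert(tab.released);
    return tab.releasedOn;
}
```

With a missing file and the early return in place, the model releases the tab on the background thread. With the early return removed, the failure still flows through `SendResponse()` and the release stays on the main thread.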
Summary: Crash in mozilla::dom::TabParent::Release() → OpenFileAndSendFDRunnable releases TabParent off the main thread
Assignee: nobody → continuation
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #8)
> And one reason we could take that path is if the relevant file doesn't exist.

That shouldn't happen with apps that we're pre-opening fds for, should it?
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #9)
> (In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #8)
> > And one reason we could take that path is if the relevant file doesn't exist.
> 
> That shouldn't happen with apps that we're pre-opening fds for, should it?

Probably not ...
Depends on: 1105468
I split off the one issue in case it doesn't actually fix the crash, because we don't have a test case.
Component: DOM: Content Processes → Stability
Product: Core → Firefox OS
Summary: OpenFileAndSendFDRunnable releases TabParent off the main thread → Crash in mozilla::dom::TabParent::Release()
The patch in bug 1105468 should be ready to test.  Hopefully it will solve this problem.
The patch in bug 1105468 doesn't seem to be fixing the issue. Below is the stack trace with the patch applied.

#0  OpenFileAndSendFDRunnable::SendResponse (this=this@entry=0xac686b20)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/dom/ipc/TabParent.cpp:156
#1  0xb552a87c in OpenFileAndSendFDRunnable::Run (this=0xac686b20)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/dom/ipc/TabParent.cpp:129
#2  0xb4be2408 in nsThread::ProcessNextEvent (this=0xb69d8940, aMayWait=<optimized out>, aResult=0xbe82d5ef)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/xpcom/threads/nsThread.cpp:830
#3  0xb4bf028e in NS_ProcessNextEvent (aThread=<optimized out>, aMayWait=aMayWait@entry=false)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/xpcom/glue/nsThreadUtils.cpp:265
#4  0xb4d31a94 in mozilla::ipc::MessagePump::Run (this=0xb69ff190, aDelegate=0xb69ea1a0)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/ipc/glue/MessagePump.cpp:99
#5  0xb4d2612c in MessageLoop::RunInternal (this=this@entry=0xb69ea1a0)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/ipc/chromium/src/base/message_loop.cc:233
#6  0xb4d261e0 in RunHandler (this=0xb69ea1a0)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/ipc/chromium/src/base/message_loop.cc:226
#7  MessageLoop::Run (this=0xb69ea1a0) at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/ipc/chromium/src/base/message_loop.cc:200
#8  0xb55db592 in nsBaseAppShell::Run (this=0xb1ea4400)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/widget/nsBaseAppShell.cpp:164
#9  0xb58e7efe in nsAppStartup::Run (this=0xb1eaeee0)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/toolkit/components/startup/nsAppStartup.cpp:281
#10 0xb58fc0aa in XREMain::XRE_mainRun (this=this@entry=0xbe82d750)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/toolkit/xre/nsAppRunner.cpp:4150
#11 0xb58fd2f6 in XREMain::XRE_main (this=this@entry=0xbe82d750, argc=argc@entry=1, argv=argv@entry=0xb6a03190, 
    aAppData=aAppData@entry=0x2c860 <_ZL8sAppData>)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/toolkit/xre/nsAppRunner.cpp:4226
#12 0xb58fd46e in XRE_main (argc=1, argv=0xb6a03190, aAppData=0x2c860 <_ZL8sAppData>, aFlags=<optimized out>)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/toolkit/xre/nsAppRunner.cpp:4446
#13 0x0001149c in do_main (argc=argc@entry=1, argv=argv@entry=0xb6a03190)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/b2g/app/nsBrowserApp.cpp:165
#14 0x000115aa in b2g_main (argc=argc@entry=1, argv=argv@entry=0xbe82ea04)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/b2g/app/nsBrowserApp.cpp:293
#15 0x0001132c in RunProcesses (aReservedFds=..., argv=0xbe82ea04, argc=1)
    at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/b2g/app/B2GLoader.cpp:225
#16 main (argc=1, argv=0xbe82ea04) at /local/mnt/workspace/anshulj/kk.b2g_3.8/gecko/b2g/app/B2GLoader.cpp:290
What is the crash address?
Flags: needinfo?(anshulj)
Andrew, how do I get the crash address?
Flags: needinfo?(anshulj)
It should say it at the start of the stack trace, like in the attachment "stack trace" attached to this bug it says:
Crash reason:  SIGSEGV
Crash address: 0x0
Keywords: crash
Andrew, below is the information you requested.

Crash reason:  SIGSEGV
Crash address: 0x0
blocking-b2g: 2.1? → 2.1+
Thanks.  I'm guessing that after the file fails to open we're in a state where mFd is null, and then somewhere we're crashing, though I can't tell where.  I could return from SendResponse() early, but it would be nice to know where the invalid app URL is actually coming from, if that is what it is.
Kyle, do you think I can just null-check mFd at the start of SendResponse(), or is that just going to cause something else to blow up?
Flags: needinfo?(khuey)
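The null check being asked about might look roughly like the sketch below. It is not the patch from bug 1105468, whose contents are not shown in this thread: `FakeFileDesc`, `FdResponder`, and `SendsOnlyWithValidFd` are hypothetical, and the model assumes that main-thread cleanup must still run when the file descriptor is missing.

```cpp
#include <cassert>

// Hypothetical placeholder for an open file descriptor object.
struct FakeFileDesc {};

// Hypothetical model of the main-thread response step.
struct FdResponder {
    FakeFileDesc* fd = nullptr;  // null models "the file failed to open"
    bool sentFd = false;
    bool cleanedUp = false;

    void SendResponse() {
        if (!fd) {
            // Nothing to send, but main-thread cleanup still has to happen.
            cleanedUp = true;
            return;
        }
        sentFd = true;     // stands in for the IPC call that uses the descriptor
        cleanedUp = true;  // same cleanup as on the failure path
    }
};

// Returns whether a descriptor was sent for the given (possibly null) fd.
bool SendsOnlyWithValidFd(FakeFileDesc* fd) {
    FdResponder responder;
    responder.fd = fd;
    responder.SendResponse();
    assert(responder.cleanedUp);
    return responder.sentFd;
}
```

The sketch only shows that an early return can coexist with the cleanup. It does not answer whether skipping the send causes something else to blow up, which is the question put to Kyle.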
I added a new patch in bug 1105468 that might help with this crash.

Also, it seems like this failure should happen when a particular app URL is being opened. It would be nice to know what is actually causing the crashing, if possible.
Flags: needinfo?(anshulj)
Whiteboard: [caf-crash 216]
bent pointed out that the file really should always be here ... I'd like to know more about what app is causing this.
Flags: needinfo?(khuey)
(In reply to Andrew McCreight [:mccr8] from comment #20)
> I added a new patch in bug 1105468 that might help with this crash.
> 
> Also, it seems like this failure should happen when a particular app URL is
> being opened. It would be nice to know what is actually causing the
> crashing, if possible.
Andrew, I tried with both of your patches on bug 1105468 and the issue seems to be resolved. What kind of information would help you find the root cause of the issue?
Flags: needinfo?(anshulj)
(In reply to Anshul from comment #22)
> (In reply to Andrew McCreight [:mccr8] from comment #20)
> Andrew, I tried with both of your patches on bug 1105468 and the issue seems
> to be resolved. What kind of information would help you find the root cause
> of the issue?

It would be good to know what app is causing this.  Something is trying to open an app address that is invalid, or something like that.  So the app should be fixed, too.  khuey knows the specifics better.
Whiteboard: [caf-crash 216] → [CR 786377][caf-crash 216]
Whiteboard: [CR 786377][caf-crash 216] → [caf priority: p2][CR 786377][caf-crash 216]
I've landed bug 1105468 which should hopefully fix this.
Whiteboard: [caf priority: p2][CR 786377][caf-crash 216] → [b2g-crash][caf priority: p2][CR 786377][caf-crash 216]
Haven't seen this crash in a while now, so marking this dup of bug 1105468.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → DUPLICATE
No longer blocks: CAF-v2.2-metabug