This bug was filed from the Socorro interface and is report bp-1f15da08-ee54-4d6f-99cc-962562150404.
=============================================================

Topcrash in 37.0.1, and it's a startup crash. 38 beta is affected.

bent says: "we spun the event loop while an IPC call was on the stack and one of the events destroyed the channel"

0 mozalloc.dll mozalloc_abort(char const* const) memory/mozalloc/mozalloc_abort.cpp
1 xul.dll NS_DebugBreak xpcom/base/nsDebugImpl.cpp
2 xul.dll mozilla::ipc::MessageChannel::DebugAbort(char const*, int, char const*, char const*, bool) ipc/glue/MessageChannel.cpp
3 xul.dll mozilla::ipc::MessageChannel::~MessageChannel() ipc/glue/MessageChannel.cpp
4 mozalloc.dll moz_xmalloc memory/mozalloc/mozalloc.cpp
5 xul.dll mozilla::layers::PCompositorChild::~PCompositorChild() obj-firefox/ipc/ipdl/PCompositorChild.cpp

(Frame 4 is probably spurious)
[Tracking Requested - why for this release]: Startup topcrash in 37.0.1, and 38 is affected. v39 status is unknown: it might be fixed, or it might just not be showing up on that channel. Note that this is different from bug 1122008, whose signature is extremely similar but differs in CompositorParent vs CompositorChild.
status-firefox37: --- → affected
status-firefox38: --- → affected
status-firefox39: --- → ?
tracking-firefox38: --- → ?
(In reply to David Major [:dmajor] from comment #0)
> bp-1f15da08-ee54-4d6f-99cc-962562150404

This shows us releasing the last reference to a ContentChild in a runnable that runs within a nested event loop during a sync XHR. Apparently there is an IPC call on the stack (lost in the JIT frames) that sent an intr message to the parent. Ugh.

I guess we need to figure out how to guard against this in a general sense.
Startup crash, tracking!
tracking-firefox38: ? → +
tracking-firefox39: --- → -
Ben, we are late in the beta cycle, any chance you could provide a fix for this soon? Thanks
I'm not the right assignee for this; you need someone on e10s here.
Bill, is there anything you can do for 38 here?
(In reply to Ben Turner [:bent] (use the needinfo flag!) from comment #2)
> (In reply to David Major [:dmajor] from comment #0)
> > bp-1f15da08-ee54-4d6f-99cc-962562150404
>
> This shows us releasing the last reference to a ContentChild in a runnable
> that runs within a nested event loop during a sync XHR. Apparently there is
> an IPC call on the stack (lost in the JIT frames) that sent an intr message
> to the parent. Ugh.
>
> I guess we need to figure out how to guard against this in a general sense.

Could you look at this again, Ben? I'm not seeing what you're seeing. We're releasing a CompositorChild and I don't see any intr calls.

mCxxStackFrames would be non-empty if there were any messages being sent or dispatched on the CompositorChild channel. It's mysterious how we could get from there to executing JS code though. PCompositor is a sync protocol with only normal message priorities, so sending a message shouldn't allow anything to run. So we'd have to be doing something weird while dispatching.

It also seems possible that the channel has already been released and mCxxStackFrames just looks non-empty.

It would really help to have more stack frames here. Is that possible, dmajor?
Oops, we're releasing a *CompositorChild*, not a *ContentChild*... Sorry! There's nsXMLHttpRequest::Send(JSContext*, mozilla::ErrorResult&) on the stack, and that spins the event loop, so that's how we have JS triggering this.
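For anyone following along, here is a rough, self-contained sketch of the failure mode being discussed: a call on a channel is on the stack, something spins a nested event loop, and a queued event tries to drop the last reference to the channel. This is only an illustration under stated assumptions. ToyChannel, StackFrame, EventQueue, SpinEventLoop, and SimulateNestedLoopRelease are hypothetical names and are not the real MessageChannel code; the "defer the release until the call unwinds" guard is one possible approach, not a fix that was actually proposed or landed here.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Hypothetical stand-in for an IPC channel; NOT the real MessageChannel.
class ToyChannel {
 public:
  // RAII marker: alive for as long as a call on this channel is on the stack.
  class StackFrame {
   public:
    explicit StackFrame(ToyChannel& channel) : mChannel(channel) {
      ++mChannel.mFramesOnStack;
    }
    ~StackFrame() { --mChannel.mFramesOnStack; }
    StackFrame(const StackFrame&) = delete;
    StackFrame& operator=(const StackFrame&) = delete;

   private:
    ToyChannel& mChannel;
  };

  bool HasFramesOnStack() const { return mFramesOnStack > 0; }

 private:
  int mFramesOnStack = 0;
};

// Hypothetical single-threaded event queue.
using EventQueue = std::deque<std::function<void()>>;

// Runs queued events until the queue is empty. Calling this while a
// StackFrame is alive models spinning a nested event loop.
inline void SpinEventLoop(EventQueue& queue) {
  while (!queue.empty()) {
    std::function<void()> event = std::move(queue.front());
    queue.pop_front();
    event();
  }
}

struct Outcome {
  bool releaseAttemptedDuringCall = false;  // the hazard was hit
  bool channelAliveAfterCall = false;       // the guard deferred the release
  bool channelDestroyedAtEnd = false;       // the deferred release ran later
};

inline Outcome SimulateNestedLoopRelease() {
  Outcome outcome;
  EventQueue queue;
  EventQueue deferred;
  auto channel = std::make_unique<ToyChannel>();

  // An event that wants to drop the last reference to the channel.
  queue.push_back([&]() {
    if (channel->HasFramesOnStack()) {
      // Destroying the channel now would leave the in-progress call with a
      // dangling reference, so postpone until the call has unwound.
      outcome.releaseAttemptedDuringCall = true;
      deferred.push_back([&]() { channel.reset(); });
      return;
    }
    channel.reset();
  });

  {
    ToyChannel::StackFrame frame(*channel);  // a call on the channel begins
    SpinEventLoop(queue);                    // something spins the loop
    outcome.channelAliveAfterCall = (channel != nullptr);
  }  // frame's destructor touches *channel, so it must still be alive here

  SpinEventLoop(deferred);  // now it is safe to run the release
  outcome.channelDestroyedAtEnd = (channel == nullptr);
  return outcome;
}
```

Calling SimulateNestedLoopRelease() walks through the scenario once. Without the HasFramesOnStack() check, the release event would destroy the channel while the StackFrame still refers to it, which is the kind of situation a destructor-time assertion is there to catch.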
Loading the dmp in MSVC shows this:

mozalloc.dll!mozalloc_abort(...) Line 37 C++
xul.dll!mozilla::layers::LayerTransactionChild::Release() Line 32 C++
xul.dll!mozilla::layers::CompositorChild::DeallocPLayerTransactionChild(...) Line 128 C++
xul.dll!mozilla::layers::PCompositorChild::RemoveManagee(...) Line 632 C++
xul.dll!mozilla::layers::PLayerTransactionChild::OnMessageReceived(...) Line 883 C++
xul.dll!mozilla::layers::PCompositorChild::OnMessageReceived(...) Line 969 C++
xul.dll!nsAppShell::EventWindowProc(...) Line 113 C++

And WinDbg shows this:

ntdll!ZwWaitForSingleObject+0x15
KERNELBASE!WaitForSingleObjectEx+0x98
kernel32!WaitForSingleObjectExImplementation+0x75
kernel32!WaitForSingleObject+0x12
xul!google_breakpad::ExceptionHandler::WriteMinidumpOnHandlerThread+0x59
xul!google_breakpad::ExceptionHandler::WriteMinidumpForException+0x25
xul!CrashReporter::WriteMinidumpForException+0x1a
xul!nsXULAppInfo::WriteMinidumpForException+0x9
xul!mozilla::ReportException+0x22
xul!CallWindowProcCrashProtected+0x3388b8
xul!nsWindow::WindowProc+0x37
user32!InternalCallWinProc+0x23
user32!UserCallWinProcCheckWow+0x109
user32!DispatchMessageWorker+0x3bc
user32!DispatchMessageW+0xf
xul!nsAppShell::ProcessNextNativeEvent+0x1de
0x233efd4

No idea why they're all so different (including the stack on Socorro).
(In reply to Ben Turner [:bent] (use the needinfo flag!) from comment #8)
> There's nsXMLHttpRequest::Send(JSContext*, mozilla::ErrorResult&) on the
> stack, and that spins the event loop, so that's how we have JS triggering
> this.

But the XHR is triggered from JIT code. Who's running that JS? As I said above, we would expect that whatever is below the JS is doing compositor stuff. But that's weird, because we shouldn't be able to get from compositor stuff to JS.
> It would really help to have more stack frames here. Is that possible dmajor?

The stack at bp-1f15da08-ee54-4d6f-99cc-962562150404 goes pretty deep. It has some nonsense frames like moz_xmalloc and IsWindowVisible, but if you ignore those, does the rest seem reasonable?
Too late for 38 but tracking in case it happens again with 39.
status-firefox37: affected → wontfix
status-firefox38: affected → wontfix
status-firefox39: ? → affected
There aren't currently any crashes with this signature for 39+ so I'm dropping the tracking.
status-firefox39: affected → unaffected
status-firefox40: --- → unaffected
status-firefox41: --- → unaffected
no crashes matching this sig.
Status: NEW → RESOLVED
Last Resolved: 9 months ago
Resolution: --- → WORKSFORME