1152980 - Startup crash in mozalloc_abort(char const* const) | NS_DebugBreak | mozilla::ipc::MessageChannel::DebugAbort(char const*, int, char const*, char const*, bool) | mozilla::ipc::MessageChannel::~MessageChannel() ...

Reporter

Description

9 years ago

This bug was filed from the Socorro interface and is 
report bp-1f15da08-ee54-4d6f-99cc-962562150404.
=============================================================

Topcrash in 37.0.1, and it's a startup crash. 38 beta is affected. bent says: "we spun the event loop while an IPC call was on the stack and one of the events destroyed the channel"
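The scenario bent describes can be pictured with a small toy model. This is only a sketch: `ToyChannel`, `ToyEventLoop`, and `ChannelDiesDuringNestedLoop` are hypothetical names invented for illustration and are not part of the Gecko IPC code. A `std::weak_ptr` is used to observe the destruction safely instead of touching freed memory.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <memory>
#include <utility>

// Toy stand-in for an IPC channel. Not the real mozilla::ipc::MessageChannel.
struct ToyChannel {
    int id = 0;
};

// Toy event loop: a FIFO of tasks that Spin() drains until empty.
struct ToyEventLoop {
    std::deque<std::function<void()>> queue;

    void Post(std::function<void()> task) { queue.push_back(std::move(task)); }

    void Spin() {
        while (!queue.empty()) {
            std::function<void()> task = std::move(queue.front());
            queue.pop_front();
            task();
        }
    }
};

// Returns true if the channel was destroyed while a "call" was still in progress.
bool ChannelDiesDuringNestedLoop() {
    ToyEventLoop loop;
    auto channel = std::make_shared<ToyChannel>();
    std::weak_ptr<ToyChannel> observer = channel;

    // A queued event that drops the last strong reference to the channel.
    loop.Post([&channel] { channel.reset(); });

    // Assume a call on the channel is on the stack at this point, and that it
    // spins a nested event loop before returning.
    loop.Spin();

    // When the nested loop returns, the channel the call was using is gone.
    return observer.expired();
}
```

In real code the call would hold a raw pointer rather than a `weak_ptr`, so continuing to use the channel after the nested loop would be a use-after-free instead of a clean `expired()` check.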

0 	mozalloc.dll 	mozalloc_abort(char const* const) 	memory/mozalloc/mozalloc_abort.cpp
1 	xul.dll 	NS_DebugBreak 	xpcom/base/nsDebugImpl.cpp
2 	xul.dll 	mozilla::ipc::MessageChannel::DebugAbort(char const*, int, char const*, char const*, bool) 	ipc/glue/MessageChannel.cpp
3 	xul.dll 	mozilla::ipc::MessageChannel::~MessageChannel() 	ipc/glue/MessageChannel.cpp
4 	mozalloc.dll 	moz_xmalloc 	memory/mozalloc/mozalloc.cpp
5 	xul.dll 	mozilla::layers::PCompositorChild::~PCompositorChild() 	obj-firefox/ipc/ipdl/PCompositorChild.cpp

(Frame 4 is probably spurious)

(Away)

Reporter

Comment 1

9 years ago

[Tracking Requested - why for this release]: Startup topcrash in 37.0.1, and 38 is affected

v39 status unknown. It might be fixed or it might just not be showing up on that channel.

Note that this is different from bug 1122008 whose signature is extremely similar but differs in CompositorParent vs CompositorChild.

status-firefox37: --- → affected

status-firefox38: --- → affected

status-firefox39: --- → ?

tracking-firefox38: --- → ?

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 2

9 years ago

(In reply to David Major [:dmajor] from comment #0)
> bp-1f15da08-ee54-4d6f-99cc-962562150404

This shows us releasing the last reference to a ContentChild in a runnable that runs within a nested event loop during a sync XHR. Apparently there is an IPC call on the stack (lost in the JIT frames) that sent an intr message to the parent. Ugh.

I guess we need to figure out how to guard against this in a general sense.
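One general-purpose guard, shown here purely as an illustration and not as anything proposed or landed in this bug, is to count active stack frames and defer teardown until the last one unwinds. All names below (`GuardedChannel`, `StackFrame`, `RequestClose`, `CloseIsDeferredUntilFrameExits`) are hypothetical.

```cpp
#include <cassert>

// Toy channel that refuses to close while a call on it is on the C++ stack.
class GuardedChannel {
    int mDepth = 0;              // number of active StackFrame objects
    bool mClosePending = false;  // a close was requested while mDepth > 0
    bool mClosed = false;

    void CloseNow() {
        mClosed = true;
        mClosePending = false;
    }

public:
    // RAII marker: construct one at the top of every call that uses the channel.
    class StackFrame {
        GuardedChannel& mChannel;

    public:
        explicit StackFrame(GuardedChannel& channel) : mChannel(channel) {
            ++mChannel.mDepth;
        }
        ~StackFrame() {
            // The last frame to unwind performs any close that was deferred.
            if (--mChannel.mDepth == 0 && mChannel.mClosePending) {
                mChannel.CloseNow();
            }
        }
        StackFrame(const StackFrame&) = delete;
        StackFrame& operator=(const StackFrame&) = delete;
    };

    void RequestClose() {
        if (mDepth > 0) {
            mClosePending = true;  // defer until the stack unwinds
        } else {
            CloseNow();
        }
    }

    bool IsClosed() const { return mClosed; }
};

// Returns true if a close requested mid-call only takes effect after the call.
bool CloseIsDeferredUntilFrameExits() {
    GuardedChannel channel;
    bool closedEarly = false;
    {
        GuardedChannel::StackFrame frame(channel);
        channel.RequestClose();  // e.g. from an event run by a nested loop
        closedEarly = channel.IsClosed();
    }
    return !closedEarly && channel.IsClosed();
}
```

This sketch only defers a logical close. It does not address who owns the channel object, which is the harder part when the last reference is released from a runnable.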

Sylvestre Ledru [:Sylvestre]

Comment 3

9 years ago

Startup crash, tracking!

tracking-firefox38: ? → +

tracking-firefox39: --- → -

Sylvestre Ledru [:Sylvestre]

Updated

9 years ago

tracking-firefox39: - → +

Sylvestre Ledru [:Sylvestre]

Comment 4

9 years ago

Ben, we are late in the beta cycle. Any chance you could provide a fix for this soon?
Thanks

Flags: needinfo?(bent.mozilla)

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 5

9 years ago

I'm not the right assignee for this; you need someone on e10s here.

Flags: needinfo?(bent.mozilla)

(Away)

Reporter

Comment 6

9 years ago

Bill, is there anything you can do for 38 here?

Flags: needinfo?(wmccloskey)

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 7

9 years ago

(In reply to Ben Turner [:bent] (use the needinfo flag!) from comment #2)
> (In reply to David Major [:dmajor] from comment #0)
> > bp-1f15da08-ee54-4d6f-99cc-962562150404
> 
> This shows us releasing the last reference to a ContentChild in a runnable
> that runs within a nested event loop during a sync XHR. Apparently there is
> an IPC call on the stack (lost in the JIT frames) that sent an intr message
> to the parent. Ugh.
> 
> I guess we need to figure out how to guard against this in a general sense.

Could you look at this again, Ben? I'm not seeing what you're seeing. We're releasing a CompositorChild and I don't see any intr calls. mCxxStackFrames would be non-empty if there were any messages being sent or dispatched on the CompositorChild channel. It's mysterious how we could get from there to executing JS code though. PCompositor is a sync protocol with only normal message priorities, so sending a message shouldn't allow anything to run. So we'd have to be doing something weird while dispatching.

It also seems possible that the channel has already been released and mCxxStackFrames just looks non-empty.
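The invariant described above (frames are recorded while a message is being sent or dispatched, so the record should be empty by the time the channel is destroyed) can be sketched as follows. This is a toy model under stated assumptions: only the name `mCxxStackFrames` comes from this thread, while `ToyFrame`, `ToyChannel`, `EnterFrame`, `ExitFrame`, and `SafeToDestroy` are hypothetical and do not reflect the real MessageChannel interface.

```cpp
#include <cassert>
#include <vector>

// One record per send or dispatch currently on the C++ stack.
struct ToyFrame {
    const char* what;
};

class ToyChannel {
    std::vector<ToyFrame> mFrames;  // plays the role of mCxxStackFrames

public:
    void EnterFrame(const char* what) { mFrames.push_back(ToyFrame{what}); }

    void ExitFrame() {
        if (!mFrames.empty()) {
            mFrames.pop_back();
        }
    }

    // The condition a destructor-time assertion would check. The real code
    // aborts when this fails; here it just reports the result.
    bool SafeToDestroy() const { return mFrames.empty(); }
};
```

A model like this covers only the first explanation. It cannot represent the second possibility raised above, where the object has already been freed and the frame list merely appears non-empty because the destructor is reading stale memory.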

It would really help to have more stack frames here. Is that possible, dmajor?

Flags: needinfo?(wmccloskey)

Flags: needinfo?(dmajor)

Flags: needinfo?(bent.mozilla)

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 8

9 years ago

Oops, we're releasing a *CompositorChild*, not a *ContentChild*... Sorry!

There's nsXMLHttpRequest::Send(JSContext*, mozilla::ErrorResult&) on the stack, and that spins the event loop, so that's how we have JS triggering this.

Flags: needinfo?(bent.mozilla)

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 9

9 years ago

Loading the dmp in MSVC shows this:

mozalloc.dll!mozalloc_abort(...) Line 37	C++
xul.dll!mozilla::layers::LayerTransactionChild::Release() Line 32	C++
xul.dll!mozilla::layers::CompositorChild::DeallocPLayerTransactionChild(...) Line 128	C++
xul.dll!mozilla::layers::PCompositorChild::RemoveManagee(...) Line 632	C++
xul.dll!mozilla::layers::PLayerTransactionChild::OnMessageReceived(...) Line 883	C++
xul.dll!mozilla::layers::PCompositorChild::OnMessageReceived(...) Line 969	C++
xul.dll!nsAppShell::EventWindowProc(...) Line 113	C++

And WinDbg shows this:

ntdll!ZwWaitForSingleObject+0x15
KERNELBASE!WaitForSingleObjectEx+0x98
kernel32!WaitForSingleObjectExImplementation+0x75
kernel32!WaitForSingleObject+0x12
xul!google_breakpad::ExceptionHandler::WriteMinidumpOnHandlerThread+0x59
xul!google_breakpad::ExceptionHandler::WriteMinidumpForException+0x25
xul!CrashReporter::WriteMinidumpForException+0x1a
xul!nsXULAppInfo::WriteMinidumpForException+0x9
xul!mozilla::ReportException+0x22
xul!CallWindowProcCrashProtected+0x3388b8
xul!nsWindow::WindowProc+0x37
user32!InternalCallWinProc+0x23
user32!UserCallWinProcCheckWow+0x109
user32!DispatchMessageWorker+0x3bc
user32!DispatchMessageW+0xf
xul!nsAppShell::ProcessNextNativeEvent+0x1de
0x233efd4

No idea why they're all so different (including the stack on Socorro).

Bill McCloskey [inactive unless it's an emergency] (:billm)

Comment 10

9 years ago

(In reply to Ben Turner [:bent] (use the needinfo flag!) from comment #8)
> There's nsXMLHttpRequest::Send(JSContext*, mozilla::ErrorResult&) on the
> stack, and that spins the event loop, so that's how we have JS triggering
> this.

But the XHR is triggered from JIT code. Who's running that JS? As I said above, we would expect that whatever is below the JS is doing compositor stuff. But that's weird, because we shouldn't be able to get from compositor stuff to JS.

(Away)

Reporter

Comment 11

9 years ago

> It would really help to have more stack frames here. Is that possible dmajor?
The stack at bp-1f15da08-ee54-4d6f-99cc-962562150404 goes pretty deep. It has some nonsense frames like moz_xmalloc and IsWindowVisible, but if you ignore those, does the rest seem reasonable?

Flags: needinfo?(dmajor)

Sylvestre Ledru [:Sylvestre]

Comment 12

9 years ago

Too late for 38 but tracking in case it happens again with 39.

status-firefox37: affected → wontfix

status-firefox38: affected → wontfix

status-firefox39: ? → affected

Liz Henry (:lizzard) (relman/hg->git project)

Comment 13

9 years ago

There aren't currently any crashes with this signature for 39+ so I'm dropping the tracking.

status-firefox39: affected → unaffected

status-firefox40: --- → unaffected

status-firefox41: --- → unaffected

Jim Mathies [:jimm]

Comment 14

7 years ago

No crashes matching this signature.

Status: NEW → RESOLVED

Closed: 7 years ago

Resolution: --- → WORKSFORME

Categories: Core :: IPC, defect

People: Reporter: away, Unassigned

Keywords: crash

Security: public
