Crash in [@ nsXPCWrappedJS::Release] on the Breakpad ExceptionHandler thread
Categories
(Toolkit :: Crash Reporting, defect)
Tracking
()
People
(Reporter: mccr8, Assigned: gsvelto)
References
Details
(Keywords: crash)
Crash Data
Attachments
(2 files)
This is a weird one. We're on the Breakpad ExceptionHandler thread, and calling exit. Somehow this is triggering the XPCWJS thread check release assert. There's at least a couple of these crashes
Crash report: https://crash-stats.mozilla.org/report/index/17766243-6c90-4f19-84f8-767e90210913
MOZ_CRASH Reason: MOZ_RELEASE_ASSERT(NS_IsMainThread()) (nsXPCWrappedJS::Release called off main thread)
Top 6 frames of crashing thread:
0 XUL nsXPCWrappedJS::Release js/xpconnect/src/XPCWrappedJS.cpp:259
1 libsystem_c.dylib __cxa_finalize_ranges
2 libsystem_c.dylib exit
3 XUL google_breakpad::ExceptionHandler::WaitForMessage toolkit/crashreporter/breakpad-client/mac/handler/exception_handler.cc:699
4 libsystem_pthread.dylib _pthread_start
5 libsystem_pthread.dylib thread_start
Reporter | ||
Comment 1•4 years ago
|
||
Still happening. Should this be using _exit() instead of exit() or something? Whatever skips the destructors.
Assignee | ||
Comment 2•4 years ago
|
||
Yes, we should probably call _exit()
however... I don't know what the correct behavior would be here. What we're doing is trying to populate the reply by forwarding the exception in case anybody cares, then replying to the kernel (which originally sent us the exception message). We call exit()
only if we fail to populate the reply which in the current code only happens if the msgh_id
of the inbound exception is different from 2405 which should be impossible.
This is odd, but there's something else that's odd: the crashes indicate that we're in the main process but if we were we should have called _exit()
already. This happens implicitly here, the second-last parameter is telling Breakpad to quit after having written the minidump.
So if we haven't quit already that could mean two things:
- We're in a forked version of the main process
- Launching the crash reporter client failed
1 should be impossible and 2 is very unlikely. Either way this indicate a fundamental problem, and I'd rule out a hardware issue because it would be odd that a bit flip would affect this specific code path. So instead of calling _exit()
I'm tempted to turn this into a crash where I store the contents of the exception id and the result of launching the crash reporter client. Then we'd know what's going on and we might react properly.
Assignee | ||
Comment 3•4 years ago
|
||
I've got a patch to try and detect what's wrong here. Let's see if I can get some facts.
Assignee | ||
Comment 4•4 years ago
|
||
Comment 6•4 years ago
|
||
bugherder |
Assignee | ||
Comment 7•4 years ago
|
||
Adding the new signature I introduced for this issue. It seems that MOZ_CRASH_UNSAFE_PRINTF()
doesn't understand the "pretty hexadecimal" modifier and so it's not printing out the first parameter correctly, I should adjust that. The second parameter appears to be always zero - which would indicate that we didn't write out the minidump correctly - however given that the first parameter wasn't printed out correctly I fear we might really be looking at the contents of the first one.
Assignee | ||
Comment 8•4 years ago
|
||
Comment 10•4 years ago
|
||
bugherder |
Assignee | ||
Comment 11•4 years ago
|
||
Here's a couple more signatures. The mach message ID appears to be bogus sometimes, however what all these crashes have in common is that we failed to write a minidump. I'm considering leaving this as-is for the time being because this would have been a crash anyway - we're just not getting the minidump because it's not being written out. Having this crash anyway later lets us track the volume and eventually verify the fix for the minidump writing logic when get around to that.
Comment 13•3 years ago
|
||
The leave-open keyword is there and there is no activity for 6 months.
:gsvelto, maybe it's time to close this bug?
For more information, please visit auto_nag documentation.
Assignee | ||
Comment 14•3 years ago
|
||
Looking at the crash data we're getting mach messages with ID we don't know how to handle (4917, 29434 and more). In all cases we've not been able to write a minidump, we don't really know what to do and the main process crashed anyway so I'll call this fixed since it's now a controlled crash.
Updated•3 years ago
|
Assignee | ||
Updated•3 years ago
|
Description
•