Open Bug 1408629 Opened 3 years ago Updated 3 years ago
Crash in shutdownhang | libsystem
This bug was filed from the Socorro interface and is report bp-a14e2057-5361-42fe-aea9-75d700171013.
=============================================================
This is topcrash #13 in the OSX nightly 20171012105833.
Uh. Usually I can tell when something else is hanging in shutdown crashes, but nothing obviously jumps out. Two threads look sort of suspicious. The first is the crashing main thread, which is hanging during some sort of worker cleanup. I guess it's waiting on thread 27:

0 libsystem_kernel.dylib libsystem_kernel.dylib@0x1be7e
1 libmozglue.dylib <name omitted> mozglue/misc/ConditionVariable_posix.cpp:118
2 XUL mozilla::dom::workers::WorkerPrivate::DoRunLoop(JSContext*) xpcom/threads/CondVar.h:68
3 XUL (anonymous namespace)::WorkerThreadPrimaryRunnable::Run() dom/workers/RuntimeService.cpp:2864
4 XUL nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp:1037
5 XUL NS_ProcessNextEvent(nsIThread*, bool) xpcom/threads/nsThreadUtils.cpp:524
6 XUL mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) ipc/glue/MessagePump.cpp:368
7 XUL MessageLoop::Run() ipc/chromium/src/base/message_loop.cc:326
8 XUL nsThread::ThreadFunc(void*) xpcom/threads/nsThread.cpp:425
9 libnss3.dylib _pt_root nsprpub/pr/src/pthreads/ptthread.c:216

and that thread just isn't getting the message? Did we just fail to poke that thread appropriately or something? ni baku for ideas there.
The other thread that looks sort of plausible is thread 23:

0 libsystem_kernel.dylib libsystem_kernel.dylib@0x12e76
1 XUL google_breakpad::ReceivePort::WaitForMessage(google_breakpad::MachReceiveMessage*, unsigned int) toolkit/crashreporter/google-breakpad/src/common/mac/MachIPC.mm:249
2 XUL google_breakpad::CrashGenerationServer::WaitForOneMessage() toolkit/crashreporter/breakpad-client/mac/crash_generation/crash_generation_server.cc:102
3 XUL google_breakpad::CrashGenerationServer::WaitForMessages(void*) toolkit/crashreporter/breakpad-client/mac/crash_generation/crash_generation_server.cc:96
Ø 4 libsystem_pthread.dylib libsystem_pthread.dylib@0x36c0
Ø 5 libsystem_pthread.dylib libsystem_pthread.dylib@0x356c
Ø 6 libsystem_pthread.dylib libsystem_pthread.dylib@0x2c5c
7 XUL XUL@0x359ce2f

(failure to symbolicate that XUL symbol doesn't seem good...) I have no idea what's going on there, ni to Ted if his Breakpad knowledge can generate some insight there.
For the record: the missing libsystem_kernel.dylib symbols are because this is macOS 10.13, and the process I have set up to scrape system symbols for macOS only handles Apple's update packages from their update servers, which apparently don't include full major version updates like this one. I'll try to backfill those for sanity's sake, but they're not likely to be super interesting.

> The other thread that looks sort of plausible is thread 23:

That's the thread that waits for crash messages from child processes. It gets signaled and shut down in `OOPDeinit`, which gets called from `UnsetExceptionHandler` very late in shutdown. Shouldn't be a problem.

> (failure to symbolicate that XUL symbol doesn't seem good...)

I don't think it's a problem here, I think the stackwalker just walked off the end of the stack and found junk. Not having symbols for libsystem_pthread.dylib probably doesn't help.

I think your analysis is likely correct--there's a worker thread there that didn't get the message that it's supposed to shut down, and the main thread is hanging out waiting for it.
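To illustrate the suspected failure mode (a worker parked on a condition variable while the main thread waits for it to exit), here is a minimal standard C++ sketch. It is not Gecko code: `SketchWorker`, `Shutdown`, `RunLoop`, and `Exited` are hypothetical names, and the real `WorkerPrivate::DoRunLoop` does considerably more than wait on one flag.

```cpp
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a worker thread; not Gecko's WorkerPrivate.
class SketchWorker {
 public:
  SketchWorker() : thread_([this] { RunLoop(); }) {}
  ~SketchWorker() { Shutdown(); }

  // Called by the "main thread": request exit, wake the worker, wait for it.
  void Shutdown() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      shutdown_requested_ = true;
    }
    // If this notification were skipped, a worker already parked in wait()
    // would stay asleep and the join() below would block indefinitely.
    cv_.notify_one();
    if (thread_.joinable()) {
      thread_.join();
    }
  }

  // True once the run loop has left its wait.
  bool Exited() {
    std::lock_guard<std::mutex> lock(mutex_);
    return exited_;
  }

 private:
  void RunLoop() {
    std::unique_lock<std::mutex> lock(mutex_);
    // Sleep until shutdown is requested; the predicate guards against
    // spurious wakeups.
    cv_.wait(lock, [this] { return shutdown_requested_; });
    exited_ = true;
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  bool shutdown_requested_ = false;
  bool exited_ = false;
  // Declared last so the members above exist before the thread starts.
  std::thread thread_;
};
```

In this sketch the hang in the report would correspond to `Shutdown()` blocking in `join()` because the wakeup never reached the worker; whether that is what actually happened in this bug is not established by the stacks above.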
Andrew - we need an assessment of whether this bug should be critical or not, and, if :baku is too backed up, an alternate to investigate. Thanks!
Looks like a null deref. The comments make me think this is QuotaManager-related because one mentions attempting to manually move their profile directory from one OS to another (PC->Mac so I doubt it's the .DS_STORE (?) problem Ehsan experienced) but others just mention it was a "restart firefox" situation (which obviously correlates with the shutdownhang summary here). baku told me he'd take another look.
(In reply to Andrew Overholt [:overholt] from comment #4)
> Looks like a null deref. The comments make me think this is
> QuotaManager-related because one mentions attempting to manually move their
> profile directory from one OS to another (PC->Mac so I doubt it's the
> .DS_STORE (?) problem Ehsan experienced) but others just mention it was a
> "restart firefox" situation (which obviously correlates with the
> shutdownhang summary here).

FTR: this is an intentional MOZ_CRASH triggered because shutdown took too long (the `shutdownhang` in the signature).
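For readers unfamiliar with this kind of signature, one common way to turn a stuck shutdown into a deliberate crash is a watchdog that waits for a "shutdown complete" signal with a deadline. The sketch below shows only that waiting logic in standard C++. `SketchShutdownWatchdog` and its methods are hypothetical names, and this is not a description of how Gecko implements its shutdown hang detection.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical watchdog; illustrates the deadline wait only.
class SketchShutdownWatchdog {
 public:
  // Called by the shutdown path once everything has been torn down.
  void NotifyShutdownComplete() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      done_ = true;
    }
    cv_.notify_all();
  }

  // Returns true if shutdown completed within `timeout`, false if the
  // deadline passed first (the "shutdown took too long" case).
  bool WaitFor(std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> lock(mutex_);
    return cv_.wait_for(lock, timeout, [this] { return done_; });
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool done_ = false;
};
```

A caller that gets `false` back would then crash on purpose (for example with `std::abort()`), so that the crash reporter captures the stacks of all threads at the moment of the hang, which is the kind of data the comments above are reading.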
I recently landed a set of patches for bug 1405290. We should have more data about why workers block the shutdown.
See Also: → 1405290