Closed Bug 1402595 Opened 2 years ago Closed 2 years ago
_CRASH(mozilla::detail::Mutex Impl::~Mutex Impl: pthread _mutex _destroy failed) at mozglue/misc/Mutex _posix .cpp:68 with eval In Cooperative Thread and Debugger
The following testcase crashes on mozilla-central revision 2cd3752963fc (build with --enable-posix-nspr-emulation --enable-valgrind --enable-gczeal --disable-tests --enable-stdcxx-compat --disable-profiling --enable-debug --without-intl-api --enable-optimize --target=i686-pc-linux-gnu --enable-simulator=arm, run with --fuzzing-safe --ion-offthread-compile=off): evalInCooperativeThread(` var dbg = new Debugger; `); evalInCooperativeThread(` `); Backtrace: received signal SIGSEGV, Segmentation fault. 0x080ca64d in mozilla::detail::MutexImpl::~MutexImpl (this=0xf50472e0, __in_chrg=<optimized out>) at mozglue/misc/Mutex_posix.cpp:67 #0 0x080ca64d in mozilla::detail::MutexImpl::~MutexImpl (this=0xf50472e0, __in_chrg=<optimized out>) at mozglue/misc/Mutex_posix.cpp:67 #1 0x08072177 in js::Mutex::~Mutex (this=0xf50472e0, __in_chrg=<optimized out>) at js/src/threading/Mutex.h:50 #2 CooperationState::~CooperationState (this=0xf50472e0, __in_chrg=<optimized out>) at js/src/shell/js.cpp:3283 #3 js_delete<CooperationState> (p=0xf50472e0) at dist/include/js/Utility.h:454 #4 KillWorkerThreads (cx=cx@entry=0xf7940800) at js/src/shell/js.cpp:3879 #5 0x08080c52 in main (argc=4, argv=0xffffd8b4, envp=0xffffd8c8) at js/src/shell/js.cpp:8663 eax 0x0 0 ebx 0x8d8eff4 148434932 ecx 0xf7da4864 -136689564 edx 0x0 0 esi 0x8b5313c 146092348 edi 0xf50472e0 -184257824 ebp 0xffffd4b8 4294956216 esp 0xffffd4a0 4294956192 eip 0x80ca64d <mozilla::detail::MutexImpl::~MutexImpl()+77> => 0x80ca64d <mozilla::detail::MutexImpl::~MutexImpl()+77>: movl $0x0,0x0 0x80ca657 <mozilla::detail::MutexImpl::~MutexImpl()+87>: ud2
Please evaluate this Sean. Re-prioritize or re-assign as appropriate.
Priority: -- → P2
This is a race condition based around numThreads, where one cooperative thread can get stuck waiting on the cvar in CooperativeYield(), while the main thread is convinced that numThreads == 0. The race condition happens like this: 1: Main thread runs KillWorkers(). All cooperative threads are woken up. 2: Cooperative thread 1 enters the ScopeExit in WorkerMain(), decrementing its own count from numThreads. Inside CooperativeYield(), it sees that numThreads != 0, so it enters the cvar wait. 3: Cooperative thread 2 enters its own ScopeExit in WorkerMain(), setting numThreads == 0. It destroys itself successfully. 4: The scheduler chooses to run the main thread. Cooperative thread 1 is still waiting on the cvar (but has been notified, and therefore is runnable if chosen). 5: The main thread sees that numThreads == 0, believes that this means that all references to the mutex are dropped, and attempts to pthread_mutex_destroy(), which fails because the mutex is still in use by Cooperative thread 1 in the cvar. This is easily fixed and appears to be shell-only -- but bhackett would know whether a similar race can happen in whatever cooperative thread lifetime code the browser uses. It might have the same issue there.
Flags: needinfo?(sstangl) → needinfo?(bhackett1024)
It's also worth pointing out that we had a similar race condition in the SharedArrayBuffer lifetime code. It seems to be a very common pattern where the number of live elements is decremented before that is actually the truth. Worth being alert to in the future.
Thanks for looking into this, Sean. I can't reproduce this, but from your description this should fix the problem.
Assignee: nobody → bhackett1024
Attachment #8923359 - Flags: review?(jdemooij)
Attachment #8923359 - Flags: review?(jdemooij) → review+
Pushed by email@example.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/9cff4b8eb396 Don't wait on cooperative thread cvar after marking the current thread as having finished, r=jandem.
You need to log in before you can comment on or make changes to this bug.