Bug 916504 - Deadlock in AutoPauseWorkersForGC on various sites
Status: RESOLVED FIXED
Whiteboard: [good first verify]
Product: Core
Classification: Components
Component: JavaScript Engine
Version: unspecified
Platform: All All
Importance: -- normal
Target Milestone: mozilla27
Assigned To: Brian Hackett (:bhackett)
Duplicates: 917256
Depends on: 918862
Blocks: 916531
Reported: 2013-09-14 18:02 PDT by Alon Zakai (:azakai)
Modified: 2013-10-30 09:26 PDT


Attachments
patch (6.24 KB, patch)
2013-09-16 08:58 PDT, Brian Hackett (:bhackett)
wmccloskey: review+

Description Alon Zakai (:azakai) 2013-09-14 18:02:16 PDT
This seems to have started in the last nightly: sometimes the browser completely deadlocks, with no CPU usage and no response to input at all. I found fairly reproducible STR as follows:

1. Load http://www.unrealengine.com/html5/
2. Press Play
3. If it reaches the "downloading" stage, that means it passed js parsing and the bug did not show itself - reload the page and go to 2.

Eventually, usually on the first or second attempt, the bug will occur.

When it happens, breaking in gdb always shows us in

(gdb) where
#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7fb496b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7bab2af in PR_WaitCondVar () from /home/alon/Downloads/firefox/libnspr4.so
#3  0xb5dd02b9 in js::AutoPauseWorkersForGC::AutoPauseWorkersForGC(JSRuntime*) () from /home/alon/Downloads/firefox/libxul.so
#4  0xb5d5fa07 in js::gc::AutoTraceSession::AutoTraceSession(JSRuntime*, js::HeapState) ()
   from /home/alon/Downloads/firefox/libxul.so
#5  0xb5d65e21 in GCCycle(JSRuntime*, bool, long long, js::JSGCInvocationKind, JS::gcreason::Reason) ()
   from /home/alon/Downloads/firefox/libxul.so
#6  0xb5d6625c in Collect(JSRuntime*, bool, long long, js::JSGCInvocationKind, JS::gcreason::Reason) ()
   from /home/alon/Downloads/firefox/libxul.so
#7  0xb5d562de in JS::IncrementalGC(JSRuntime*, JS::gcreason::Reason, long long) () from /home/alon/Downloads/firefox/libxul.so
#8  0xb4c944a9 in nsJSContext::GarbageCollectNow(JS::gcreason::Reason, nsJSContext::IsIncremental, nsJSContext::IsCompartment, nsJSContext::IsShrinking, long long) () from /home/alon/Downloads/firefox/libxul.so
#9  0xb4c9455d in GCTimerFired(nsITimer*, void*) () from /home/alon/Downloads/firefox/libxul.so
#10 0xb569cc65 in nsTimerImpl::Fire() () from /home/alon/Downloads/firefox/libxul.so
#11 0xb569cd9c in nsTimerEvent::Run() () from /home/alon/Downloads/firefox/libxul.so
#12 0xb5698f71 in nsThread::ProcessNextEvent(bool, bool*) () from /home/alon/Downloads/firefox/libxul.so
#13 0xb5656c25 in NS_ProcessNextEvent(nsIThread*, bool) () from /home/alon/Downloads/firefox/libxul.so
#14 0xb526d39e in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) () from /home/alon/Downloads/firefox/libxul.so
#15 0xb56ceadd in MessageLoop::Run() () from /home/alon/Downloads/firefox/libxul.so
#16 0xb51dab78 in nsBaseAppShell::Run() () from /home/alon/Downloads/firefox/libxul.so
#17 0xb50560ec in nsAppStartup::Run() () from /home/alon/Downloads/firefox/libxul.so
#18 0xb44a435b in XREMain::XRE_mainRun() () from /home/alon/Downloads/firefox/libxul.so
#19 0xb44a6cbf in XREMain::XRE_main(int, char**, nsXREAppData const*) () from /home/alon/Downloads/firefox/libxul.so
#20 0xb44a6f17 in XRE_main () from /home/alon/Downloads/firefox/libxul.so
#21 0x0804b486 in do_main(int, char**, nsIFile*) ()
#22 0x0804b7ee in main ()

Perhaps something to do with background parsing which just landed?
Comment 1 Alon Zakai (:azakai) 2013-09-14 18:03:23 PDT
Adding cc's based on that speculative guess.
Comment 2 Bill McCloskey (:billm) 2013-09-14 18:04:31 PDT
If you still have it open in gdb, it would help to have stack traces for the other JS worker threads. They should be named "Analysis Helper" (we should probably rename them).
Comment 3 Alon Zakai (:azakai) 2013-09-14 18:12:34 PDT
Ok, two Analysis Helper threads, one is

#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7fb496b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7bab2af in PR_WaitCondVar () from /home/alon/Downloads/firefox/libnspr4.so
#3  0xb5dd036e in js::WorkerThread::pause() () from /home/alon/Downloads/firefox/libxul.so
#4  0xb5db44c3 in js::SourceCompressionTask::MOZ_Z_compress() () from /home/alon/Downloads/firefox/libxul.so
#5  0xb5dd0128 in js::WorkerThread::handleCompressionWorkload(js::WorkerThreadState&) () from /home/alon/Downloads/firefox/libxul.so
#6  0xb5dd1084 in js::WorkerThread::threadLoop() () from /home/alon/Downloads/firefox/libxul.so
#7  0xb7bad43d in _pt_root () from /home/alon/Downloads/firefox/libnspr4.so
#8  0xb7fb0d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#9  0xb7db1dde in clone () from /lib/i386-linux-gnu/libc.so.6

and the other is

#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7fb496b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7bab2af in PR_WaitCondVar () from /home/alon/Downloads/firefox/libnspr4.so
#3  0xb5dcf9d2 in js::WorkerThreadState::wait(js::WorkerThreadState::CondVar, unsigned int) ()
   from /home/alon/Downloads/firefox/libxul.so
#4  0xb5f190aa in GenerateCodeForFinishedJob((anonymous namespace)::ModuleCompiler&, ParallelGroupState&, js::AsmJSParallelTask**)
    () from /home/alon/Downloads/firefox/libxul.so
#5  0xb5f1f0cf in CheckModule(js::ExclusiveContext*, js::frontend::Parser<js::frontend::FullParseHandler>&, js::frontend::ParseNode*, js::ScopedJSDeletePtr<js::AsmJSModule>*, js::ScopedJSFreePtr<char>*) () from /home/alon/Downloads/firefox/libxul.so
#6  0xb5f1f3fb in js::CompileAsmJS(js::ExclusiveContext*, js::frontend::Parser<js::frontend::FullParseHandler>&, js::frontend::ParseNode*, bool*) () from /home/alon/Downloads/firefox/libxul.so
#7  0xb5ee68ad in js::frontend::Parser<js::frontend::FullParseHandler>::asmJS(js::frontend::ParseNode*) ()
   from /home/alon/Downloads/firefox/libxul.so
#8  0xb5ef76e2 in js::frontend::Parser<js::frontend::FullParseHandler>::statements() () from /home/alon/Downloads/firefox/libxul.so
#9  0xb5eff70c in js::frontend::Parser<js::frontend::FullParseHandler>::functionBody(js::frontend::FunctionSyntaxKind, js::frontend::Parser<js::frontend::FullParseHandler>::FunctionBodyType) () from /home/alon/Downloads/firefox/libxul.so
#10 0xb5ef5981 in js::frontend::Parser<js::frontend::FullParseHandler>::functionArgsAndBodyGeneric(js::frontend::ParseNode*, JS::Handle<JSFunction*>, js::frontend::FunctionType, js::frontend::FunctionSyntaxKind, js::frontend::Directives*) ()
   from /home/alon/Downloads/firefox/libxul.so
#11 0xb5ef63f7 in js::frontend::Parser<js::frontend::FullParseHandler>::functionArgsAndBody(js::frontend::ParseNode*, JS::Handle<JSFunction*>, js::frontend::FunctionType, js::frontend::FunctionSyntaxKind, js::GeneratorKind, js::frontend::Directives, js::frontend::Directives*) () from /home/alon/Downloads/firefox/libxul.so
#12 0xb5ef66b5 in js::frontend::Parser<js::frontend::FullParseHandler>::functionDef(JS::Handle<js::PropertyName*>, js::frontend::TokenStream::Position const&, js::frontend::FunctionType, js::frontend::FunctionSyntaxKind, js::GeneratorKind) ()
   from /home/alon/Downloads/firefox/libxul.so
#13 0xb5ef6925 in js::frontend::Parser<js::frontend::FullParseHandler>::functionExpr() ()
   from /home/alon/Downloads/firefox/libxul.so
#14 0xb5ef8008 in js::frontend::Parser<js::frontend::FullParseHandler>::primaryExpr(js::frontend::TokenKind) ()
   from /home/alon/Downloads/firefox/libxul.so
#15 0xb5ef936f in js::frontend::Parser<js::frontend::FullParseHandler>::memberExpr(js::frontend::TokenKind, bool) ()
   from /home/alon/Downloads/firefox/libxul.so
#16 0xb5ef9a2a in js::frontend::Parser<js::frontend::FullParseHandler>::unaryExpr() () from /home/alon/Downloads/firefox/libxul.so
#17 0xb5efa31c in js::frontend::Parser<js::frontend::FullParseHandler>::orExpr1() () from /home/alon/Downloads/firefox/libxul.so
#18 0xb5efa76d in js::frontend::Parser<js::frontend::FullParseHandler>::assignExpr() () from /home/alon/Downloads/firefox/libxul.so
#19 0xb5efc105 in js::frontend::Parser<js::frontend::FullParseHandler>::expr() () from /home/alon/Downloads/firefox/libxul.so
#20 0xb5efdd5e in js::frontend::Parser<js::frontend::FullParseHandler>::parenExpr(bool*) ()
   from /home/alon/Downloads/firefox/libxul.so
#21 0xb5ef8147 in js::frontend::Parser<js::frontend::FullParseHandler>::primaryExpr(js::frontend::TokenKind) ()
   from /home/alon/Downloads/firefox/libxul.so
#22 0xb5ef936f in js::frontend::Parser<js::frontend::FullParseHandler>::memberExpr(js::frontend::TokenKind, bool) ()
   from /home/alon/Downloads/firefox/libxul.so
#23 0xb5ef9a2a in js::frontend::Parser<js::frontend::FullParseHandler>::unaryExpr() () from /home/alon/Downloads/firefox/libxul.so
#24 0xb5efa31c in js::frontend::Parser<js::frontend::FullParseHandler>::orExpr1() () from /home/alon/Downloads/firefox/libxul.so
#25 0xb5efa76d in js::frontend::Parser<js::frontend::FullParseHandler>::assignExpr() () from /home/alon/Downloads/firefox/libxul.so
#26 0xb5efd855 in js::frontend::Parser<js::frontend::FullParseHandler>::variables(js::frontend::ParseNodeKind, bool*, js::StaticBlockObject*, js::frontend::VarContext) () from /home/alon/Downloads/firefox/libxul.so
#27 0xb5ef6f8a in js::frontend::Parser<js::frontend::FullParseHandler>::statement(bool) ()
   from /home/alon/Downloads/firefox/libxul.so
#28 0xb5ec1f8c in js::frontend::CompileScript(js::ExclusiveContext*, js::LifoAlloc*, JS::Handle<JSObject*>, JS::Handle<JSScript*>, JS::CompileOptions const&, unsigned short const*, unsigned int, JSString*, unsigned int, js::SourceCompressionTask*) ()
   from /home/alon/Downloads/firefox/libxul.so
#29 0xb5dd0f44 in js::WorkerThread::handleParseWorkload(js::WorkerThreadState&) () from /home/alon/Downloads/firefox/libxul.so
#30 0xb5dd1113 in js::WorkerThread::threadLoop() () from /home/alon/Downloads/firefox/libxul.so
#31 0xb7bad43d in _pt_root () from /home/alon/Downloads/firefox/libnspr4.so
#32 0xb7fb0d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#33 0xb7db1dde in clone () from /lib/i386-linux-gnu/libc.so.6
Comment 4 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2013-09-15 18:46:06 PDT
I've hit this a few times today too.
Comment 5 Bill McCloskey (:billm) 2013-09-15 19:33:51 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #4)
> I've hit this a few times today too.

Are you hitting it on normal sites? The stacks Alon posted appear to be asm.js-specific. If so, can you maybe post new stacks? It might be a different problem.
Comment 6 Alon Zakai (:azakai) 2013-09-15 20:02:23 PDT
I see the problem on other sites too, but it is much harder to reproduce. I did find one semi-reproducible case, though: open dozens of deviantart pages and wait until the deadlock happens. Then I see

#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7fb496b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7bab3cf in PR_WaitCondVar () from /home/alon/Downloads/firefox/libnspr4.so
#3  0xb5dcab09 in js::AutoPauseWorkersForGC::AutoPauseWorkersForGC(JSRuntime*) () from /home/alon/Downloads/firefox/libxul.so
#4  0xb5d5a317 in js::gc::AutoTraceSession::AutoTraceSession(JSRuntime*, js::HeapState) ()
   from /home/alon/Downloads/firefox/libxul.so
#5  0xb5d5a5d4 in js::gc::AutoPrepareForTracing::AutoPrepareForTracing(JSRuntime*) () from /home/alon/Downloads/firefox/libxul.so
#6  0xb5d5a753 in js::gc::MergeCompartments(JSCompartment*, JSCompartment*) () from /home/alon/Downloads/firefox/libxul.so
#7  0xb5dca3e2 in js::WorkerThreadState::finishParseTask(JSContext*, JSRuntime*, void*) ()
   from /home/alon/Downloads/firefox/libxul.so
#8  0xb5d21096 in JS::FinishOffThreadScript(JSContext*, JSRuntime*, void*) () from /home/alon/Downloads/firefox/libxul.so
#9  0xb4c8da89 in nsJSUtils::EvaluateString(JSContext*, nsAString_internal const&, JS::Handle<JSObject*>, JS::CompileOptions&, nsJSUtils::EvaluateOptions&, JS::Value*, void**) () from /home/alon/Downloads/firefox/libxul.so
#10 0xb4c86759 in nsJSContext::EvaluateString(nsAString_internal const&, JS::Handle<JSObject*>, JS::CompileOptions&, bool, JS::Value*, void**) () from /home/alon/Downloads/firefox/libxul.so
#11 0xb4a41cbb in nsScriptLoader::EvaluateScript(nsScriptLoadRequest*, nsString const&, void**) ()
   from /home/alon/Downloads/firefox/libxul.so
#12 0xb4a41f5e in nsScriptLoader::ProcessRequest(nsScriptLoadRequest*, void**) () from /home/alon/Downloads/firefox/libxul.so
#13 0xb4a42113 in nsScriptLoader::ProcessOffThreadRequest(void**) () from /home/alon/Downloads/firefox/libxul.so
#14 0xb4a42153 in (anonymous namespace)::NotifyOffThreadScriptLoadCompletedRunnable::Run() ()
   from /home/alon/Downloads/firefox/libxul.so
#15 0xb5693121 in nsThread::ProcessNextEvent(bool, bool*) () from /home/alon/Downloads/firefox/libxul.so
#16 0xb5650e45 in NS_ProcessNextEvent(nsIThread*, bool) () from /home/alon/Downloads/firefox/libxul.so
#17 0xb5260ebe in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) () from /home/alon/Downloads/firefox/libxul.so
#18 0xb56c8c7d in MessageLoop::Run() () from /home/alon/Downloads/firefox/libxul.so
#19 0xb51ce638 in nsBaseAppShell::Run() () from /home/alon/Downloads/firefox/libxul.so
#20 0xb5049c4c in nsAppStartup::Run() () from /home/alon/Downloads/firefox/libxul.so
#21 0xb4496a4b in XREMain::XRE_mainRun() () from /home/alon/Downloads/firefox/libxul.so
#22 0xb44993af in XREMain::XRE_main(int, char**, nsXREAppData const*) () from /home/alon/Downloads/firefox/libxul.so
#23 0xb4499607 in XRE_main () from /home/alon/Downloads/firefox/libxul.so
#24 0x0804b486 in do_main(int, char**, nsIFile*) ()
#25 0x0804b7ee in main ()

on the main thread, and two workers,

#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7fb496b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7bab3cf in PR_WaitCondVar () from /home/alon/Downloads/firefox/libnspr4.so
#3  0xb5dcabbe in js::WorkerThread::pause() () from /home/alon/Downloads/firefox/libxul.so
#4  0xb5dcb85f in js::WorkerThread::threadLoop() () from /home/alon/Downloads/firefox/libxul.so
#5  0xb7bad44d in _pt_root () from /home/alon/Downloads/firefox/libnspr4.so
#6  0xb7fb0d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#7  0xb7db1dde in clone () from /lib/i386-linux-gnu/libc.so.6

and

#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7fb496b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7bab3cf in PR_WaitCondVar () from /home/alon/Downloads/firefox/libnspr4.so
#3  0xb5dcacab in js::SourceCompressionTask::complete() () from /home/alon/Downloads/firefox/libxul.so
#4  0xb5ebdc71 in js::frontend::CompileScript(js::ExclusiveContext*, js::LifoAlloc*, JS::Handle<JSObject*>, JS::Handle<JSScript*>, JS::CompileOptions const&, unsigned short const*, unsigned int, JSString*, unsigned int, js::SourceCompressionTask*) ()
   from /home/alon/Downloads/firefox/libxul.so
#5  0xb5dcb744 in js::WorkerThread::handleParseWorkload(js::WorkerThreadState&) () from /home/alon/Downloads/firefox/libxul.so
#6  0xb5dcb8f3 in js::WorkerThread::threadLoop() () from /home/alon/Downloads/firefox/libxul.so
#7  0xb7bad44d in _pt_root () from /home/alon/Downloads/firefox/libnspr4.so
#8  0xb7fb0d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#9  0xb7db1dde in clone () from /lib/i386-linux-gnu/libc.so.6
Comment 7 Emanuel Hoogeveen [:ehoogeveen] 2013-09-15 23:53:28 PDT
Could this be the same as bug 916531? (note that the regression range in that bug appears to be incorrect)
Comment 8 Brian Hackett (:bhackett) 2013-09-16 08:58:58 PDT
Created attachment 805371
patch

I think this will fix the remaining asm.js-specific deadlocks after bug 916351. The issue in that bug is that the pausing mechanism for threads is malfunctioning, while the issue here is that there are places where the asm.js parse thread waits for compilation threads to finish without marking itself as paused. This patch adds that logic and factors it into an AutoPauseCurrentWorkerThread class for use in cases where one worker thread is waiting for another.
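
For readers following along, the idea described above can be sketched roughly as follows. This is a minimal model built on standard C++ primitives, not the code from the attached patch: `WorkerPool`, `AutoMarkPaused`, `pauseAllWorkers`, and `resumeAllWorkers` are hypothetical names, and having the guard's destructor wait until the collection has finished is an assumption of this sketch.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for the shared worker state (not SpiderMonkey's types).
struct WorkerPool {
    std::mutex lock;
    std::condition_variable cv;
    int totalWorkers = 0;     // number of worker threads in the pool
    int pausedWorkers = 0;    // workers currently counted as paused
    bool gcRequested = false; // set while the GC thread wants everyone paused
};

// RAII guard: while it is alive, the current worker counts as paused, so a
// GC that starts while this worker blocks on another worker can proceed.
class AutoMarkPaused {
    WorkerPool& pool_;

  public:
    explicit AutoMarkPaused(WorkerPool& pool) : pool_(pool) {
        std::lock_guard<std::mutex> guard(pool_.lock);
        ++pool_.pausedWorkers;
        pool_.cv.notify_all(); // the GC thread may be waiting on this count
    }

    ~AutoMarkPaused() {
        std::unique_lock<std::mutex> guard(pool_.lock);
        // Assumption of this sketch: do not resume while a GC is in progress.
        pool_.cv.wait(guard, [&] { return !pool_.gcRequested; });
        assert(pool_.pausedWorkers > 0);
        --pool_.pausedWorkers;
    }
};

// GC side: request a pause, then block until every worker is counted as paused.
void pauseAllWorkers(WorkerPool& pool) {
    std::unique_lock<std::mutex> guard(pool.lock);
    pool.gcRequested = true;
    pool.cv.notify_all();
    pool.cv.wait(guard, [&] { return pool.pausedWorkers == pool.totalWorkers; });
}

// GC side: allow paused workers to continue.
void resumeAllWorkers(WorkerPool& pool) {
    std::lock_guard<std::mutex> guard(pool.lock);
    pool.gcRequested = false;
    pool.cv.notify_all();
}
```

In this model, a worker that blocks on another worker without first creating an `AutoMarkPaused` never contributes to `pausedWorkers`, so `pauseAllWorkers` waits forever. That is the shape of the hang in the stacks above: the GC thread, the paused worker, and the waiting worker are all parked in a condition-variable wait.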
Comment 9 Bill McCloskey (:billm) 2013-09-16 10:07:50 PDT
Comment on attachment 805371
patch

Review of attachment 805371:
-----------------------------------------------------------------

::: js/src/jsworkers.cpp
@@ +986,5 @@
> +    // If the current thread is a worker thread, treat it as paused while
> +    // the caller is waiting for another worker thread to complete. Otherwise
> +    // we will not wake up and mark this as paused due to the loop in
> +    // AutoPauseWorkersForGC.
> +    if (cx->workerThread()) {

Might be easier to write this as:
  if (!cx->workerThread())
    return;

@@ +998,5 @@
> +}
> +
> +AutoPauseCurrentWorkerThread::~AutoPauseCurrentWorkerThread()
> +{
> +    if (cx->workerThread()) {

Same here.
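
As an illustration of the early-return shape suggested in this review, here is a small self-contained sketch. `Worker`, `Context`, and `AutoPauseSketch` are hypothetical stand-ins rather than the real SpiderMonkey types, and the locking that a real implementation would need is left out to keep the control flow visible.

```cpp
#include <cassert>

// Hypothetical stand-ins; the real context and worker types are not shown in this bug.
struct Worker {
    bool paused = false;
};

struct Context {
    Worker* worker = nullptr; // null when running on the main thread
    Worker* workerThread() const { return worker; }
};

class AutoPauseSketch {
    Context* cx_;

  public:
    explicit AutoPauseSketch(Context* cx) : cx_(cx) {
        // Guard clause: nothing to do unless we are on a worker thread.
        if (!cx_->workerThread())
            return;
        cx_->workerThread()->paused = true;
    }

    ~AutoPauseSketch() {
        // Same guard clause on the way out.
        if (!cx_->workerThread())
            return;
        cx_->workerThread()->paused = false;
    }
};
```

With the guard clause first, the body that does the real work stays at one indentation level in both the constructor and the destructor.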
Comment 10 Benjamin Smedberg [:bsmedberg] 2013-09-17 06:55:14 PDT
I am experiencing this and have a full-memory dump of the hang on Windows if you need to see it. I have a bunch of tabs loaded, but I wasn't aware of any asm.js code in particular (I'm not writing or testing it myself, but it's possible that I loaded some asm.js demo from my news feed).
Comment 11 Brian Hackett (:bhackett) 2013-09-17 10:30:46 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/4bcf9b261b94
Comment 13 Paul Silaghi, QA [:pauly] 2013-09-19 00:06:35 PDT
*** Bug 917256 has been marked as a duplicate of this bug. ***
Comment 14 Ryan VanderMeulen [:RyanVM] 2013-09-19 11:05:13 PDT
https://hg.mozilla.org/releases/mozilla-aurora/rev/290e1e44e8b3
Comment 15 Tracy Walker [:tracy] 2013-10-17 09:27:53 PDT
Good first verify - steps to repro are in the description (comment #0). Please verify against latest Fx26 and Fx27.
