Closed Bug 916504 Opened 11 years ago Closed 11 years ago

Deadlock in AutoPauseWorkersForGC on various sites

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

RESOLVED FIXED
mozilla27
Tracking Status
firefox26 + fixed
firefox27 + fixed

People

(Reporter: azakai, Assigned: bhackett1024)

Details

(Whiteboard: [good first verify])

Attachments

(1 file)

This seems to have started in the last nightly: sometimes the browser completely deadlocks, with no CPU usage and no response to input at all. I found fairly reproducible STR as follows:

1. Load http://www.unrealengine.com/html5/
2. Press Play
3. If it reaches the "downloading" stage, that means it passed JS parsing and the bug did not show itself. Reload the page and go to step 2.

Eventually, usually the first attempt or the second, the bug will occur.

When it happens, breaking in gdb always shows the main thread in

(gdb) where
#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7fb496b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7bab2af in PR_WaitCondVar () from /home/alon/Downloads/firefox/libnspr4.so
#3  0xb5dd02b9 in js::AutoPauseWorkersForGC::AutoPauseWorkersForGC(JSRuntime*) () from /home/alon/Downloads/firefox/libxul.so
#4  0xb5d5fa07 in js::gc::AutoTraceSession::AutoTraceSession(JSRuntime*, js::HeapState) ()
   from /home/alon/Downloads/firefox/libxul.so
#5  0xb5d65e21 in GCCycle(JSRuntime*, bool, long long, js::JSGCInvocationKind, JS::gcreason::Reason) ()
   from /home/alon/Downloads/firefox/libxul.so
#6  0xb5d6625c in Collect(JSRuntime*, bool, long long, js::JSGCInvocationKind, JS::gcreason::Reason) ()
   from /home/alon/Downloads/firefox/libxul.so
#7  0xb5d562de in JS::IncrementalGC(JSRuntime*, JS::gcreason::Reason, long long) () from /home/alon/Downloads/firefox/libxul.so
#8  0xb4c944a9 in nsJSContext::GarbageCollectNow(JS::gcreason::Reason, nsJSContext::IsIncremental, nsJSContext::IsCompartment, nsJSContext::IsShrinking, long long) () from /home/alon/Downloads/firefox/libxul.so
#9  0xb4c9455d in GCTimerFired(nsITimer*, void*) () from /home/alon/Downloads/firefox/libxul.so
#10 0xb569cc65 in nsTimerImpl::Fire() () from /home/alon/Downloads/firefox/libxul.so
#11 0xb569cd9c in nsTimerEvent::Run() () from /home/alon/Downloads/firefox/libxul.so
#12 0xb5698f71 in nsThread::ProcessNextEvent(bool, bool*) () from /home/alon/Downloads/firefox/libxul.so
#13 0xb5656c25 in NS_ProcessNextEvent(nsIThread*, bool) () from /home/alon/Downloads/firefox/libxul.so
#14 0xb526d39e in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) () from /home/alon/Downloads/firefox/libxul.so
#15 0xb56ceadd in MessageLoop::Run() () from /home/alon/Downloads/firefox/libxul.so
#16 0xb51dab78 in nsBaseAppShell::Run() () from /home/alon/Downloads/firefox/libxul.so
#17 0xb50560ec in nsAppStartup::Run() () from /home/alon/Downloads/firefox/libxul.so
#18 0xb44a435b in XREMain::XRE_mainRun() () from /home/alon/Downloads/firefox/libxul.so
#19 0xb44a6cbf in XREMain::XRE_main(int, char**, nsXREAppData const*) () from /home/alon/Downloads/firefox/libxul.so
#20 0xb44a6f17 in XRE_main () from /home/alon/Downloads/firefox/libxul.so
#21 0x0804b486 in do_main(int, char**, nsIFile*) ()
#22 0x0804b7ee in main ()

Perhaps something to do with background parsing which just landed?
Adding cc's based on that speculative guess.
If you still have it open in gdb, it would help to have stack traces for the other JS worker threads. They should be named "Analysis Helper" (we should probably rename them).
Ok, two Analysis Helper threads, one is

#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7fb496b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7bab2af in PR_WaitCondVar () from /home/alon/Downloads/firefox/libnspr4.so
#3  0xb5dd036e in js::WorkerThread::pause() () from /home/alon/Downloads/firefox/libxul.so
#4  0xb5db44c3 in js::SourceCompressionTask::MOZ_Z_compress() () from /home/alon/Downloads/firefox/libxul.so
#5  0xb5dd0128 in js::WorkerThread::handleCompressionWorkload(js::WorkerThreadState&) () from /home/alon/Downloads/firefox/libxul.so
#6  0xb5dd1084 in js::WorkerThread::threadLoop() () from /home/alon/Downloads/firefox/libxul.so
#7  0xb7bad43d in _pt_root () from /home/alon/Downloads/firefox/libnspr4.so
#8  0xb7fb0d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#9  0xb7db1dde in clone () from /lib/i386-linux-gnu/libc.so.6

and the other is

#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7fb496b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7bab2af in PR_WaitCondVar () from /home/alon/Downloads/firefox/libnspr4.so
#3  0xb5dcf9d2 in js::WorkerThreadState::wait(js::WorkerThreadState::CondVar, unsigned int) ()
   from /home/alon/Downloads/firefox/libxul.so
#4  0xb5f190aa in GenerateCodeForFinishedJob((anonymous namespace)::ModuleCompiler&, ParallelGroupState&, js::AsmJSParallelTask**)
    () from /home/alon/Downloads/firefox/libxul.so
#5  0xb5f1f0cf in CheckModule(js::ExclusiveContext*, js::frontend::Parser<js::frontend::FullParseHandler>&, js::frontend::ParseNode*, js::ScopedJSDeletePtr<js::AsmJSModule>*, js::ScopedJSFreePtr<char>*) () from /home/alon/Downloads/firefox/libxul.so
#6  0xb5f1f3fb in js::CompileAsmJS(js::ExclusiveContext*, js::frontend::Parser<js::frontend::FullParseHandler>&, js::frontend::ParseNode*, bool*) () from /home/alon/Downloads/firefox/libxul.so
#7  0xb5ee68ad in js::frontend::Parser<js::frontend::FullParseHandler>::asmJS(js::frontend::ParseNode*) ()
   from /home/alon/Downloads/firefox/libxul.so
#8  0xb5ef76e2 in js::frontend::Parser<js::frontend::FullParseHandler>::statements() () from /home/alon/Downloads/firefox/libxul.so
#9  0xb5eff70c in js::frontend::Parser<js::frontend::FullParseHandler>::functionBody(js::frontend::FunctionSyntaxKind, js::frontend::Parser<js::frontend::FullParseHandler>::FunctionBodyType) () from /home/alon/Downloads/firefox/libxul.so
#10 0xb5ef5981 in js::frontend::Parser<js::frontend::FullParseHandler>::functionArgsAndBodyGeneric(js::frontend::ParseNode*, JS::Handle<JSFunction*>, js::frontend::FunctionType, js::frontend::FunctionSyntaxKind, js::frontend::Directives*) ()
   from /home/alon/Downloads/firefox/libxul.so
#11 0xb5ef63f7 in js::frontend::Parser<js::frontend::FullParseHandler>::functionArgsAndBody(js::frontend::ParseNode*, JS::Handle<JSFunction*>, js::frontend::FunctionType, js::frontend::FunctionSyntaxKind, js::GeneratorKind, js::frontend::Directives, js::frontend::Directives*) () from /home/alon/Downloads/firefox/libxul.so
#12 0xb5ef66b5 in js::frontend::Parser<js::frontend::FullParseHandler>::functionDef(JS::Handle<js::PropertyName*>, js::frontend::TokenStream::Position const&, js::frontend::FunctionType, js::frontend::FunctionSyntaxKind, js::GeneratorKind) ()
   from /home/alon/Downloads/firefox/libxul.so
#13 0xb5ef6925 in js::frontend::Parser<js::frontend::FullParseHandler>::functionExpr() ()
   from /home/alon/Downloads/firefox/libxul.so
#14 0xb5ef8008 in js::frontend::Parser<js::frontend::FullParseHandler>::primaryExpr(js::frontend::TokenKind) ()
   from /home/alon/Downloads/firefox/libxul.so
#15 0xb5ef936f in js::frontend::Parser<js::frontend::FullParseHandler>::memberExpr(js::frontend::TokenKind, bool) ()
   from /home/alon/Downloads/firefox/libxul.so
#16 0xb5ef9a2a in js::frontend::Parser<js::frontend::FullParseHandler>::unaryExpr() () from /home/alon/Downloads/firefox/libxul.so
#17 0xb5efa31c in js::frontend::Parser<js::frontend::FullParseHandler>::orExpr1() () from /home/alon/Downloads/firefox/libxul.so
#18 0xb5efa76d in js::frontend::Parser<js::frontend::FullParseHandler>::assignExpr() () from /home/alon/Downloads/firefox/libxul.so
#19 0xb5efc105 in js::frontend::Parser<js::frontend::FullParseHandler>::expr() () from /home/alon/Downloads/firefox/libxul.so
#20 0xb5efdd5e in js::frontend::Parser<js::frontend::FullParseHandler>::parenExpr(bool*) ()
   from /home/alon/Downloads/firefox/libxul.so
#21 0xb5ef8147 in js::frontend::Parser<js::frontend::FullParseHandler>::primaryExpr(js::frontend::TokenKind) ()
   from /home/alon/Downloads/firefox/libxul.so
#22 0xb5ef936f in js::frontend::Parser<js::frontend::FullParseHandler>::memberExpr(js::frontend::TokenKind, bool) ()
   from /home/alon/Downloads/firefox/libxul.so
#23 0xb5ef9a2a in js::frontend::Parser<js::frontend::FullParseHandler>::unaryExpr() () from /home/alon/Downloads/firefox/libxul.so
#24 0xb5efa31c in js::frontend::Parser<js::frontend::FullParseHandler>::orExpr1() () from /home/alon/Downloads/firefox/libxul.so
#25 0xb5efa76d in js::frontend::Parser<js::frontend::FullParseHandler>::assignExpr() () from /home/alon/Downloads/firefox/libxul.so
#26 0xb5efd855 in js::frontend::Parser<js::frontend::FullParseHandler>::variables(js::frontend::ParseNodeKind, bool*, js::StaticBlockObject*, js::frontend::VarContext) () from /home/alon/Downloads/firefox/libxul.so
#27 0xb5ef6f8a in js::frontend::Parser<js::frontend::FullParseHandler>::statement(bool) ()
   from /home/alon/Downloads/firefox/libxul.so
#28 0xb5ec1f8c in js::frontend::CompileScript(js::ExclusiveContext*, js::LifoAlloc*, JS::Handle<JSObject*>, JS::Handle<JSScript*>, JS::CompileOptions const&, unsigned short const*, unsigned int, JSString*, unsigned int, js::SourceCompressionTask*) ()
   from /home/alon/Downloads/firefox/libxul.so
#29 0xb5dd0f44 in js::WorkerThread::handleParseWorkload(js::WorkerThreadState&) () from /home/alon/Downloads/firefox/libxul.so
#30 0xb5dd1113 in js::WorkerThread::threadLoop() () from /home/alon/Downloads/firefox/libxul.so
#31 0xb7bad43d in _pt_root () from /home/alon/Downloads/firefox/libnspr4.so
#32 0xb7fb0d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#33 0xb7db1dde in clone () from /lib/i386-linux-gnu/libc.so.6
I've hit this a few times today too.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #4)
> I've hit this a few times today too.

Are you hitting it on normal sites? The stacks Alon posted appear to be asm.js-specific. If so, can you maybe post new stacks? It might be a different problem.
I see the problem on other sites too, but it is much harder to reproduce. I did find one semi-reproducible case, though: open dozens of deviantart pages and wait until the deadlock happens. Then I see

#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7fb496b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7bab3cf in PR_WaitCondVar () from /home/alon/Downloads/firefox/libnspr4.so
#3  0xb5dcab09 in js::AutoPauseWorkersForGC::AutoPauseWorkersForGC(JSRuntime*) () from /home/alon/Downloads/firefox/libxul.so
#4  0xb5d5a317 in js::gc::AutoTraceSession::AutoTraceSession(JSRuntime*, js::HeapState) ()
   from /home/alon/Downloads/firefox/libxul.so
#5  0xb5d5a5d4 in js::gc::AutoPrepareForTracing::AutoPrepareForTracing(JSRuntime*) () from /home/alon/Downloads/firefox/libxul.so
#6  0xb5d5a753 in js::gc::MergeCompartments(JSCompartment*, JSCompartment*) () from /home/alon/Downloads/firefox/libxul.so
#7  0xb5dca3e2 in js::WorkerThreadState::finishParseTask(JSContext*, JSRuntime*, void*) ()
   from /home/alon/Downloads/firefox/libxul.so
#8  0xb5d21096 in JS::FinishOffThreadScript(JSContext*, JSRuntime*, void*) () from /home/alon/Downloads/firefox/libxul.so
#9  0xb4c8da89 in nsJSUtils::EvaluateString(JSContext*, nsAString_internal const&, JS::Handle<JSObject*>, JS::CompileOptions&, nsJSUtils::EvaluateOptions&, JS::Value*, void**) () from /home/alon/Downloads/firefox/libxul.so
#10 0xb4c86759 in nsJSContext::EvaluateString(nsAString_internal const&, JS::Handle<JSObject*>, JS::CompileOptions&, bool, JS::Value*, void**) () from /home/alon/Downloads/firefox/libxul.so
#11 0xb4a41cbb in nsScriptLoader::EvaluateScript(nsScriptLoadRequest*, nsString const&, void**) ()
   from /home/alon/Downloads/firefox/libxul.so
#12 0xb4a41f5e in nsScriptLoader::ProcessRequest(nsScriptLoadRequest*, void**) () from /home/alon/Downloads/firefox/libxul.so
#13 0xb4a42113 in nsScriptLoader::ProcessOffThreadRequest(void**) () from /home/alon/Downloads/firefox/libxul.so
#14 0xb4a42153 in (anonymous namespace)::NotifyOffThreadScriptLoadCompletedRunnable::Run() ()
   from /home/alon/Downloads/firefox/libxul.so
#15 0xb5693121 in nsThread::ProcessNextEvent(bool, bool*) () from /home/alon/Downloads/firefox/libxul.so
#16 0xb5650e45 in NS_ProcessNextEvent(nsIThread*, bool) () from /home/alon/Downloads/firefox/libxul.so
#17 0xb5260ebe in mozilla::ipc::MessagePump::Run(base::MessagePump::Delegate*) () from /home/alon/Downloads/firefox/libxul.so
#18 0xb56c8c7d in MessageLoop::Run() () from /home/alon/Downloads/firefox/libxul.so
#19 0xb51ce638 in nsBaseAppShell::Run() () from /home/alon/Downloads/firefox/libxul.so
#20 0xb5049c4c in nsAppStartup::Run() () from /home/alon/Downloads/firefox/libxul.so
#21 0xb4496a4b in XREMain::XRE_mainRun() () from /home/alon/Downloads/firefox/libxul.so
#22 0xb44993af in XREMain::XRE_main(int, char**, nsXREAppData const*) () from /home/alon/Downloads/firefox/libxul.so
#23 0xb4499607 in XRE_main () from /home/alon/Downloads/firefox/libxul.so
#24 0x0804b486 in do_main(int, char**, nsIFile*) ()
#25 0x0804b7ee in main ()

on the main thread, and two workers,

#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7fb496b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7bab3cf in PR_WaitCondVar () from /home/alon/Downloads/firefox/libnspr4.so
#3  0xb5dcabbe in js::WorkerThread::pause() () from /home/alon/Downloads/firefox/libxul.so
#4  0xb5dcb85f in js::WorkerThread::threadLoop() () from /home/alon/Downloads/firefox/libxul.so
#5  0xb7bad44d in _pt_root () from /home/alon/Downloads/firefox/libnspr4.so
#6  0xb7fb0d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#7  0xb7db1dde in clone () from /lib/i386-linux-gnu/libc.so.6

and

#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7fb496b in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7bab3cf in PR_WaitCondVar () from /home/alon/Downloads/firefox/libnspr4.so
#3  0xb5dcacab in js::SourceCompressionTask::complete() () from /home/alon/Downloads/firefox/libxul.so
#4  0xb5ebdc71 in js::frontend::CompileScript(js::ExclusiveContext*, js::LifoAlloc*, JS::Handle<JSObject*>, JS::Handle<JSScript*>, JS::CompileOptions const&, unsigned short const*, unsigned int, JSString*, unsigned int, js::SourceCompressionTask*) ()
   from /home/alon/Downloads/firefox/libxul.so
#5  0xb5dcb744 in js::WorkerThread::handleParseWorkload(js::WorkerThreadState&) () from /home/alon/Downloads/firefox/libxul.so
#6  0xb5dcb8f3 in js::WorkerThread::threadLoop() () from /home/alon/Downloads/firefox/libxul.so
#7  0xb7bad44d in _pt_root () from /home/alon/Downloads/firefox/libnspr4.so
#8  0xb7fb0d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#9  0xb7db1dde in clone () from /lib/i386-linux-gnu/libc.so.6
Could this be the same as bug 916531? (note that the regression range in that bug appears to be incorrect)
Blocks: 916531
Attached patch: patch
I think this will fix the remaining asm.js-specific deadlocks after bug 916351. The issue in that bug is that the pausing mechanism for threads is malfunctioning, while the issue here is that there are places where the asm.js parse thread waits for compilation threads to finish without marking itself as paused. This patch adds that logic and factors it into an AutoPauseCurrentWorkerThread class for use in cases where one worker thread is waiting for another.
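For readers following along, here is a minimal standalone sketch of the idea described above. It is not the attached patch: it uses std::thread and std::condition_variable instead of NSPR and the real WorkerThreadState, and every name in it (WorkerState, AutoPauseCurrentWorker, parserWorker, helperWorker, runScenario) is hypothetical. It assumes the "GC" thread blocks until every worker is counted as paused, which is what the stacks in this bug suggest.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical shared state; a stand-in, not SpiderMonkey's WorkerThreadState.
struct WorkerState {
    std::mutex lock;
    std::condition_variable cv;
    int numWorkers = 2;
    int numPaused = 0;
    bool shouldPause = false;  // set by the thread that wants workers paused
    bool jobDone = false;      // set by the helper worker
};

// RAII guard: counts the calling worker as paused while it blocks on another
// worker. The caller must hold state.lock through `held` on entry and exit.
class AutoPauseCurrentWorker {
  public:
    AutoPauseCurrentWorker(WorkerState& state, std::unique_lock<std::mutex>& held)
      : state_(state), held_(held)
    {
        assert(held_.owns_lock());
        state_.numPaused++;
        state_.cv.notify_all();  // the pausing thread may now see everyone paused
    }
    ~AutoPauseCurrentWorker()
    {
        assert(held_.owns_lock());
        // Do not resume while the pausing thread still wants workers paused.
        state_.cv.wait(held_, [this] { return !state_.shouldPause; });
        state_.numPaused--;
    }
  private:
    WorkerState& state_;
    std::unique_lock<std::mutex>& held_;
};

// Plays the role of the parse thread: waits for the helper's job to finish.
// Without the guard it would never be counted as paused.
void parserWorker(WorkerState& state) {
    std::unique_lock<std::mutex> held(state.lock);
    AutoPauseCurrentWorker paused(state, held);
    state.cv.wait(held, [&] { return state.jobDone; });
}

// Plays the role of a compilation thread: honors a pause request, then finishes.
void helperWorker(WorkerState& state) {
    std::unique_lock<std::mutex> held(state.lock);
    if (state.shouldPause) {
        state.numPaused++;
        state.cv.notify_all();
        state.cv.wait(held, [&] { return !state.shouldPause; });
        state.numPaused--;
    }
    state.jobDone = true;
    state.cv.notify_all();
}

// Plays the role of the main thread pausing workers around a GC.
bool runScenario() {
    WorkerState state;
    state.shouldPause = true;  // request the pause before any worker runs
    std::thread parser(parserWorker, std::ref(state));
    std::thread helper(helperWorker, std::ref(state));
    {
        std::unique_lock<std::mutex> held(state.lock);
        state.cv.wait(held, [&] { return state.numPaused == state.numWorkers; });
        // The collection itself would run here, with all workers paused.
        state.shouldPause = false;
        state.cv.notify_all();
    }
    parser.join();
    helper.join();
    return state.jobDone && state.numPaused == 0;
}
```

In this model, removing the guard from parserWorker leaves numPaused stuck at 1, so the pausing thread waits forever while the helper waits to be unpaused and the parser waits for the helper. That is the same three-way wait the stacks above appear to show.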
Attachment #805371 - Flags: review?(wmccloskey)
Comment on attachment 805371 [details] [diff] [review]
patch

Review of attachment 805371 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsworkers.cpp
@@ +986,5 @@
> +    // If the current thread is a worker thread, treat it as paused while
> +    // the caller is waiting for another worker thread to complete. Otherwise
> +    // we will not wake up and mark this as paused due to the loop in
> +    // AutoPauseWorkersForGC.
> +    if (cx->workerThread()) {

Might be easier to write this as:
  if (!cx->workerThread())
    return;

@@ +998,5 @@
> +}
> +
> +AutoPauseCurrentWorkerThread::~AutoPauseCurrentWorkerThread()
> +{
> +    if (cx->workerThread()) {

Same here.
Attachment #805371 - Flags: review?(wmccloskey) → review+
I am experiencing this and have a full-memory dump of the hang on Windows if you need to see it. I have a bunch of tabs loaded, but I wasn't aware of any asm.js code in particular (I'm not writing or testing it myself, but it's possible that I loaded some asm.js demo from my news feed).
https://hg.mozilla.org/mozilla-central/rev/4bcf9b261b94
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Depends on: 918862
Assignee: general → bhackett1024
Good first verify - steps to repro are in the description (comment #0). Please verify against latest Fx26 and Fx27.
Keywords: verifyme
Whiteboard: [good first verify]
Keywords: verifyme