Closed
Bug 1288579
Opened 9 years ago
Closed 8 years ago
Hang in js::GCParallelTask::join()
Categories
(Core :: JavaScript: GC, defect)
Core
JavaScript: GC
Tracking
()
RESOLVED
FIXED
mozilla51
Tracking | Status | |
---|---|---|
firefox49 | --- | unaffected |
firefox50 | --- | fixed |
firefox51 | --- | fixed |
People
(Reporter: jrmuizel, Assigned: jonco)
References
Details
(Keywords: dogfood, regression)
Attachments
(2 files)
86.75 KB,
text/plain
|
Details | |
1.03 KB,
patch
|
terrence
:
review+
ritu
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
Here's the stack:
* frame #0: 0x00007fff8dce8716 libsystem_kernel.dylib`__psynch_cvwait + 10
frame #1: 0x00007fff8c256c3b libsystem_pthread.dylib`_pthread_cond_wait + 727
frame #2: 0x0000000107b8cc45 XUL`js::ConditionVariable::wait_for(js::LockGuard<js::Mutex>&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&) + 53
frame #3: 0x0000000108058969 XUL`js::GCParallelTask::join() + 121
frame #4: 0x0000000107f4b7aa XUL`js::gc::GCRuntime::incrementalCollectSlice(js::SliceBudget&, JS::gcreason::Reason, js::AutoLockForExclusiveAccess&) + 1546
frame #5: 0x0000000107f4be31 XUL`js::gc::GCRuntime::gcCycle(bool, js::SliceBudget&, JS::gcreason::Reason) + 593
frame #6: 0x0000000107f4c3cf XUL`js::gc::GCRuntime::collect(bool, js::SliceBudget, JS::gcreason::Reason) + 431
frame #7: 0x0000000107f4eef2 XUL`JS::StartIncrementalGC(JSContext*, JSGCInvocationKind, JS::gcreason::Reason, long long) + 194
frame #8: 0x000000010556c214 XUL`nsJSContext::GarbageCollectNow(JS::gcreason::Reason, nsJSContext::IsIncremental, nsJSContext::IsShrinking, long long) + 436
frame #9: 0x000000010431b3bd XUL`nsTimerImpl::Fire() + 1101
It looks like all of the other js helper threads are in:
frame #2: 0x0000000107b8cc45 XUL`js::ConditionVariable::wait_for(js::LockGuard<js::Mutex>&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&) + 53
frame #3: 0x000000010805990c XUL`js::HelperThread::threadLoop() + 876
except for one thread in:
frame #1: 0x00007fff8c256c3b libsystem_pthread.dylib`_pthread_cond_wait + 727
frame #2: 0x0000000107b8cc45 XUL`js::ConditionVariable::wait_for(js::LockGuard<js::Mutex>&, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator> const&) + 53
frame #3: 0x0000000108055a2c XUL`js::GlobalHelperThreadState::wait(js::AutoLockHelperThreadState&, js::GlobalHelperThreadState::CondVar, mozilla::BaseTimeDuration<mozilla::TimeDurationValueCalculator>) + 60
frame #4: 0x0000000108207fa3 XUL`js::gc::GCRuntime::refillFreeListOffMainThread(js::ExclusiveContext*, js::gc::AllocKind) + 115
frame #5: 0x00000001082058b3 XUL`JSObject* js::Allocate<JSObject, (js::AllowGC)1>(js::ExclusiveContext*, js::gc::AllocKind, unsigned long, js::gc::InitialHeap, js::Class const*) + 99
frame #6: 0x0000000107f60d94 XUL`JSObject::create(js::ExclusiveContext*, js::gc::AllocKind, js::gc::InitialHeap, JS::Handle<js::Shape*>, JS::Handle<js::ObjectGroup*>) + 132
frame #7: 0x0000000107f6a4ed XUL`NewObject(js::ExclusiveContext*, JS::Handle<js::ObjectGroup*>, js::gc::AllocKind, js::NewObjectKind, unsigned int) + 349
frame #8: 0x0000000107f6a92b XUL`js::NewObjectWithClassProtoCommon(js::ExclusiveContext*, js::Class const*, JS::Handle<JSObject*>, js::gc::AllocKind, js::NewObjectKind) + 747
frame #9: 0x00000001081f38bf XUL`js::frontend::BytecodeEmitter::emitObject(js::frontend::ParseNode*) + 271
frame #10: 0x00000001081d96ce XUL`js::frontend::BytecodeEmitter::emitTree(js::frontend::ParseNode*, js::frontend::BytecodeEmitter::EmitLineNumberNote) + 1022
frame #11: 0x00000001081e96a0 XUL`js::frontend::BytecodeEmitter::emitAssignment(js::frontend::ParseNode*, JSOp, js::frontend::ParseNode*) + 2880
frame #12: 0x00000001081d93ff XUL`js::frontend::BytecodeEmitter::emitTree(js::frontend::ParseNode*, js::frontend::BytecodeEmitter::EmitLineNumberNote) + 303
frame #13: 0x00000001081f0a08 XUL`js::frontend::BytecodeEmitter::emitStatement(js::frontend::ParseNode*) + 648
frame #14: 0x00000001081d94f7 XUL`js::frontend::BytecodeEmitter::emitTree(js::frontend::ParseNode*, js::frontend::BytecodeEmitter::EmitLineNumberNote) + 551
frame #15: 0x00000001081d969f XUL`js::frontend::BytecodeEmitter::emitTree(js::frontend::ParseNode*, js::frontend::BytecodeEmitter::EmitLineNumberNote) + 975
frame #16: 0x00000001081f429e XUL`js::frontend::BytecodeEmitter::emitArgsBody(js::frontend::ParseNode*) + 1198
frame #17: 0x00000001081d9cec XUL`js::frontend::BytecodeEmitter::emitTree(js::frontend::ParseNode*, js::frontend::BytecodeEmitter::EmitLineNumberNote) + 2588
frame #18: 0x00000001081dbf54 XUL`js::frontend::BytecodeEmitter::emitFunctionScript(js::frontend::ParseNode*) + 820
frame #19: 0x00000001081ee5cb XUL`js::frontend::BytecodeEmitter::emitFunction(js::frontend::ParseNode*, bool) + 2059
frame #20: 0x00000001081d9cd4 XUL`js::frontend::BytecodeEmitter::emitTree(js::frontend::ParseNode*, js::frontend::BytecodeEmitter::EmitLineNumberNote) + 2564
frame #21: 0x00000001081f1611 XUL`js::frontend::BytecodeEmitter::emitCallOrNew(js::frontend::ParseNode*) + 577
frame #22: 0x00000001081d944a XUL`js::frontend::BytecodeEmitter::emitTree(js::frontend::ParseNode*, js::frontend::BytecodeEmitter::EmitLineNumberNote) + 378
frame #23: 0x00000001081f0a08 XUL`js::frontend::BytecodeEmitter::emitStatement(js::frontend::ParseNode*) + 648
frame #24: 0x00000001081d94f7 XUL`js::frontend::BytecodeEmitter::emitTree(js::frontend::ParseNode*, js::frontend::BytecodeEmitter::EmitLineNumberNote) + 551
frame #25: 0x00000001081d969f XUL`js::frontend::BytecodeEmitter::emitTree(js::frontend::ParseNode*, js::frontend::BytecodeEmitter::EmitLineNumberNote) + 975
frame #26: 0x00000001081ea727 XUL`js::frontend::BytecodeEmitter::emitTry(js::frontend::ParseNode*) + 327
frame #27: 0x00000001081d9a3b XUL`js::frontend::BytecodeEmitter::emitTree(js::frontend::ParseNode*, js::frontend::BytecodeEmitter::EmitLineNumberNote) + 1899
frame #28: 0x00000001081d969f XUL`js::frontend::BytecodeEmitter::emitTree(js::frontend::ParseNode*, js::frontend::BytecodeEmitter::EmitLineNumberNote) + 975
frame #29: 0x00000001081d9084 XUL`BytecodeCompiler::prepareAndEmitTree(js::frontend::ParseNode**) + 228
frame #30: 0x00000001081db1e7 XUL`BytecodeCompiler::compileScript(JS::Handle<JSObject*>, JS::Handle<JSScript*>) + 935
frame #31: 0x00000001081dc3c6 XUL`js::frontend::CompileScript(js::ExclusiveContext*, js::LifoAlloc*, JS::Handle<JSObject*>, JS::Handle<js::StaticScope*>, JS::Handle<JSScript*>, JS::ReadOnlyCompileOptions const&, JS::SourceBufferHolder&, JSString*, js::SourceCompressionTask*, js::ScriptSourceObject**) + 102
frame #32: 0x0000000108056441 XUL`js::ScriptParseTask::parse() + 289
Updated•9 years ago
|
Component: JavaScript Engine → JavaScript: GC
Reporter | ||
Comment 1•9 years ago
|
||
This happened to me again.
Comment 2•9 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #1)
> This happened to me again.
What platform are you running, and, assuming it's linux, what is the kernel version?
Flags: needinfo?(jmuizelaar)
Reporter | ||
Comment 4•8 years ago
|
||
I seem to run into this every couple of days.
Comment 5•8 years ago
|
||
It also reproduces on OSX 10.10 on try every 20 or 30 runs. Unfortunately, I've so far not managed to get it in a debugger. When it was manifesting on Linux I gdb and rr both hid the problem, so I'm not confident that we'll even be able to.
Comment 6•8 years ago
|
||
Did you happen to try with rr's chaos mode to tweak the scheduler's behavior? (Just commenting in case you didn't know about it...)
Comment 7•8 years ago
|
||
Extensively.
Comment 9•8 years ago
|
||
I got this just now, and I think Bill has seen it too. I'm on OSX.
Assignee | ||
Comment 10•8 years ago
|
||
It looks like the |decommitTask.join()| in incrementalCollectSlice is hanging.
Blocks: 1119537
Comment 11•8 years ago
|
||
Hung Content process on linux64 mozilla-central
changeset 97a52326b06a
Comment 12•8 years ago
|
||
I've hit this twice today on OS X 10.11. Maybe we can add some release asserts on Nightly?
Assignee | ||
Comment 13•8 years ago
|
||
Is it because GCRuntime::refillFreeListOffMainThread can eat the wake-up signal?
GCParallelTask::startWithLockHeld does HelperThreadState().notifyOne(GlobalHelperThreadState::PRODUCER) to kick off the task.
~AutoTraceSession does notifyAll(GlobalHelperThreadState::PRODUCER) when we return to the mutator - but that won't happen in a non-incremental GC.
Starting the background sweep task does HelperThreadState().notifyAll(GlobalHelperThreadState::PRODUCER) for some reason so wouldn't suffer from this problem.
I think refillFreeListOffMainThread should probably be using a different mechanism to wait for GC to finish.
Attachment #8784007 -
Flags: review?(terrence)
Comment 14•8 years ago
|
||
Comment on attachment 8784007 [details] [diff] [review]
bug1288579-decommit-hang
Review of attachment 8784007 [details] [diff] [review]:
-----------------------------------------------------------------
You are my hero.
Attachment #8784007 -
Flags: review?(terrence) → review+
Comment 15•8 years ago
|
||
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5b07a9829bd7
Fix hang when GC is decommitting r=terrence
Comment 16•8 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 8 years ago
status-firefox51:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
Comment 17•8 years ago
|
||
Terrence, can you please request Aurora approval on this since Jon is on PTO? Thanks!
Assignee: nobody → jcoppeard
status-firefox49:
--- → unaffected
status-firefox50:
--- → affected
Flags: needinfo?(terrence)
Keywords: dogfood,
regression
Comment 18•8 years ago
|
||
Comment on attachment 8784007 [details] [diff] [review]
bug1288579-decommit-hang
Approval Request Comment
[Feature/regressing bug #]: bug 1119537
[User impact if declined]: intermittent hangs
[Describe test coverage new/current, TreeHerder]: TH for the last week. Solves many intermittent failures there.
[Risks and why]: Low. We understand what's going wrong now and this fixes it.
[String/UUID change made/needed]: None.
Flags: needinfo?(terrence)
Attachment #8784007 -
Flags: approval-mozilla-aurora?
Comment on attachment 8784007 [details] [diff] [review]
bug1288579-decommit-hang
Fixes possible hangs and makes our test automation more robust, stabilized on Nightly for a week, Aurora50+
Attachment #8784007 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 20•8 years ago
|
||
bugherder uplift |
Comment 21•8 years ago
|
||
Backed out for bustage. It's running afoul of bug 1290156 changing the signature of GlobalHelperThreadState::notifyOne. I'm guessing it's an easy rebase if you know what you're doing, though :)
https://treeherder.mozilla.org/logviewer.html#?job_id=3449573&repo=mozilla-aurora
https://hg.mozilla.org/releases/mozilla-aurora/rev/99c0c53f5ea3
Flags: needinfo?(terrence)
Comment 22•8 years ago
|
||
Indeed! The lock parameter is new.
https://hg.mozilla.org/releases/mozilla-aurora/rev/0baff3de0d57
Flags: needinfo?(terrence)
Updated•8 years ago
|
Updated•8 years ago
|
Version: unspecified → Trunk
You need to log in
before you can comment on or make changes to this bug.
Description
•