Crash in [@ IPCError-browser | ShutDownKill | js::gc::ParallelWorker<T>::run]
Categories
(Core :: JavaScript: GC, defect, P5)
Tracking
()
Tracking | Status | |
---|---|---|
firefox73 | --- | unaffected |
firefox74 | --- | unaffected |
firefox75 | --- | wontfix |
People
(Reporter: calixte, Assigned: allstars.chh)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file)
This bug is for crash report bp-272ad86e-cc39-48e2-b116-469840200229.
Top 10 frames of crashing thread:
0 xul.dll js::gc::ParallelWorker<ArenaListSegment, ArenasToUnmark>::run js/src/gc/ParallelWork.h:49
1 xul.dll js::GCParallelTask::runTask js/src/gc/GCParallelTask.cpp:143
2 xul.dll js::gc::GCRuntime::joinTask js/src/gc/GC.cpp:5035
3 xul.dll js::gc::AutoRunParallelWork<ArenaListSegment, ArenasToUnmark>::~AutoRunParallelWork js/src/gc/ParallelWork.h:103
4 xul.dll js::gc::GCRuntime::unmarkCollectedZones js/src/gc/GC.cpp:4003
5 xul.dll js::gc::GCRuntime::beginMarkPhase js/src/gc/GC.cpp:4055
6 xul.dll js::gc::GCRuntime::incrementalSlice js/src/gc/GC.cpp:6494
7 xul.dll js::gc::GCRuntime::gcCycle js/src/gc/GC.cpp:6979
8 xul.dll js::gc::GCRuntime::collect js/src/gc/GC.cpp:7161
9 xul.dll js::gc::GCRuntime::startGC js/src/gc/GC.cpp:7249
There are 6 crashes (from 6 installations) in nightly 75 starting with buildid 20200228161940. In analyzing the backtrace, the regression may have been introduced by patch [1] to fix bug 1617902.
[1] https://hg.mozilla.org/mozilla-central/rev?node=1434bd7bae35
Reporter | ||
Updated•5 years ago
|
Assignee | ||
Comment 1•5 years ago
|
||
Jonco is on PTO, taking this bug.
Comment 2•5 years ago
|
||
BTW we already had frequent signatures in the GC, see bug 1616018 comment 4. It would be interesting to know if these signatures are new (i.e. they appear in addition to the old ones) or if they've replaced the old ones. Yoshi, FYI if you're not familiar with this type of signature this is not a crash per se. What is happening is that a content process is being slow during shutdown and we're killing it after having waited for 5 seconds for it to close itself. We grab a minidump which is sent as a crash report just before we kill the process to know where it was getting stuck.
Comment 3•5 years ago
|
||
Yeah, this feels like it is probably a signature change.
Comment 4•5 years ago
|
||
Joh or Yoshi, could please you set the priority flag?
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 5•5 years ago
•
|
||
I think the problem is the AutoRunParallelWork in unmarkCollectedZones keeps holding the lock
https://searchfox.org/mozilla-central/rev/13b081a62d3f3e3e3120f95564529257b0bf451c/js/src/gc/GC.cpp#4006
so JS Helpers aren't run because they cannot acquire the lock.
And when unmarkCollectedZones exits, the destructor of AutoRunParallelWork calls joinTask
https://searchfox.org/mozilla-central/rev/13b081a62d3f3e3e3120f95564529257b0bf451c/js/src/gc/ParallelWork.h#103
and because those tasks are not run by helpers, so main thread runs them
https://searchfox.org/mozilla-central/rev/13b081a62d3f3e3e3120f95564529257b0bf451c/js/src/gc/GC.cpp#5031
The stack trace in Comment 0 also explains this.
And because it's only main thread runs the task, it takes a long time, and as Gabriele explains in comment 2, the timer kills it
(Thanks Gabriels)
So I will try to fix this with adding another joinTask() that won't run the task by main thread. (or maybe add another condition, say main thread will run the task if helper threads are all busy)
Updated•5 years ago
|
Assignee | ||
Comment 6•5 years ago
|
||
AutoRunParallelWork will try to dispatch the tasks to other helper
threads, but it keeps holding the lock so helper threads cannot run the
tasks. Later when ~AutoRunParallelWork calls joinTask, which will try to
run those tasks on main thread. And in some cases it's taking too long
hence gets killed by Shutdown timer.
So adding a overload version of joinTask to forcibly to run the tasks on
helper thread without dispatching it to main thread.
Comment 7•5 years ago
|
||
(In reply to Yoshi Cheng-Hao Huang [:allstars.chh][:allstarschh] from comment #5)
Good catch about not releasing the lock. However joinTask() will release the lock when running a task on the main thread, so in the worst case one task will run on the main thread and the others will start running on helper threads at that point.
I added some printf debugging to confirm this:
js> gc()
Starting ParallelWorker for unmarkCollectedZones on thread 140013731993792
Starting ParallelWorker for unmarkCollectedZones on thread 140013677328128
Starting ParallelWorker for unmarkCollectedZones on thread 140013679421184
Starting ParallelWorker for unmarkCollectedZones on thread 140013685700352
Starting ParallelWorker for sweepWeakCaches on thread 140013731993792
Starting ParallelWorker for sweepWeakCaches on thread 140013673142016
Starting ParallelWorker for sweepWeakCaches on thread 140013685700352
Starting ParallelWorker for sweepWeakCaches on thread 140013671048960
Starting ParallelWorker for sweepWeakCaches on thread 140013683607296
Starting ParallelWorker for sweepWeakCaches on thread 140013675235072
Starting ParallelWorker for sweepWeakCaches on thread 140013673142016
Starting ParallelWorker for sweepWeakCaches on thread 140013681514240
"before 638976, after 638976\n"
This is probably a signature change following bug 1618638.
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Comment 8•3 years ago
|
||
Closing because no crashes reported for 12 weeks.
Updated•3 years ago
|
Description
•