Crash in [@ shutdownhang | js::frontend::CompilationStencil::~CompilationStencil]
Categories
(Core :: JavaScript Engine, defect, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox-esr91 | --- | unaffected |
firefox94 | --- | wontfix |
firefox95 | --- | wontfix |
firefox96 | --- | fixed |
People
(Reporter: aryx, Assigned: arai)
References
(Blocks 1 open bug)
Details
(Keywords: crash)
Crash Data
Attachments
(1 file)
New Windows crash in Firefox 94, mostly with 32-bit builds after >1h usage. Is this a change of a crash signature? 120 crashes across betas RCs so far.
Maybe Fission related. (DOMFissionEnabled=1)
Crash report: https://crash-stats.mozilla.org/report/index/cfa3fc75-f911-4515-bf58-75fa60211028
MOZ_CRASH Reason: Shutdown hanging at step xpcom-shutdown. Something is blocking the main-thread.
Top 10 frames of crashing thread:
0 xul.dll js::frontend::CompilationStencil::~CompilationStencil js/src/frontend/CompilationStencil.h:651
1 xul.dll JS::StencilRelease js/src/frontend/Stencil.cpp:3984
2 xul.dll mozilla::ScriptPreloader::CachedStencil::~CachedStencil js/xpconnect/loader/ScriptPreloader.h:184
3 xul.dll static nsTHashtable<nsBaseHashtableET<nsCStringHashKey, mozilla::UniquePtr<mozilla::ScriptPreloader::CachedStencil, mozilla::DefaultDelete<mozilla::ScriptPreloader::CachedStencil> > > >::s_ClearEntry xpcom/ds/nsTHashtable.h:720
4 xul.dll PLDHashTable::Clear xpcom/ds/PLDHashTable.cpp:311
5 xul.dll mozilla::ScriptPreloader::Observe js/xpconnect/loader/ScriptPreloader.cpp:329
6 xul.dll nsObserverList::NotifyObservers xpcom/ds/nsObserverList.cpp:70
7 xul.dll nsObserverService::NotifyObservers xpcom/ds/nsObserverService.cpp:291
8 xul.dll static mozilla::AppShutdown::AdvanceShutdownPhase xpcom/base/AppShutdown.cpp:379
9 xul.dll mozilla::ShutdownXPCOM xpcom/build/XPCOMInit.cpp:610
Comment 1•3 years ago
|
||
Arai,is this something you could look into? If not let me know and I can find someone else.
Updated•3 years ago
|
Comment 2•3 years ago
•
|
||
Looking at crashing, it seems that we are hanging on a simple memory-load that is 4kB aligned. The stencil data is likely to be cold and swapped out after 1hr, so the "hang" is very likely just a slow disk fetching back memory so we can free it. In 94 we changed the data format of this cache so the new signature there makes sense.
On investigation with :arai and :mccr8, it seems odd we are bothering to clean up the data in release builds. We probably should stop doing that and it should solve the crash. Arai is exploring removing the table cleanup code. The cleanup code was largely added previously when they were GC pointers.
Assignee | ||
Updated•3 years ago
|
Assignee | ||
Comment 3•3 years ago
|
||
discussed some more with :tcampbell, :mccr8, and :nika.
It turns out that we're doing the free operation for Stencils in too early stage inside ShutdownXPCOM
for the current code (without GC pointers):
ShutdownXPCOM
-> "xpcom-shutdown" notification -> ScriptPreloader::Cleanup
-> ScriptPreloader::CachedStencil::~CachedStencil
-> CompilationStencil::~CompilationStencil
And delaying it to the later step solves it because we perform fast shutdown on release build:
https://searchfox.org/mozilla-central/rev/267682a8f45221bf0bfe999d4a0239706a43bc56/modules/libpref/init/StaticPrefList.yaml#11338-11347
https://searchfox.org/mozilla-central/rev/267682a8f45221bf0bfe999d4a0239706a43bc56/xpcom/base/AppShutdown.cpp#93-106
https://searchfox.org/mozilla-central/rev/267682a8f45221bf0bfe999d4a0239706a43bc56/xpcom/build/XPCOMInit.cpp#714-715
nsresult ShutdownXPCOM(nsIServiceManager* aServMgr) {
...
mozilla::AppShutdown::MaybeFastShutdown(
mozilla::ShutdownPhase::CCPostLastCycleCollection);
So, if we move the free operation after that point, it's not performed on release build (but OS will perform cleanup for us), and it's performed on other builds (debug, LSan etc) to make sure everything is freed properly.
Currently, ScriptPreloader::Cleanup
is called during "xpcom-shutdown" notification,
and ScriptPreloader
dtor is called during ClearOnShutdown
for XPCOMShutdownFinal
phase,
and mmap-ed data is freed during ClearOnShutdown
for JSPostShutDown
phase.
the relation between those 4 calls are the following:
nsresult ShutdownXPCOM(nsIServiceManager* aServMgr) {
...
mozilla::AppShutdown::AdvanceShutdownPhase(
mozilla::ShutdownPhase::XPCOMShutdown, nullptr,
do_QueryInterface(mgr));
...
mozilla::KillClearOnShutdown(ShutdownPhase::XPCOMShutdownFinal);
...
mozilla::AppShutdown::MaybeFastShutdown(
mozilla::ShutdownPhase::CCPostLastCycleCollection);
...
mozilla::KillClearOnShutdown(ShutdownPhase::JSPostShutDown);
So, if we move the free operation into JSPostShutDown
phase, or somewhere (maybe new JSPreShutDown
if necessary), we won't hit hang due to slow page-in.
Assignee | ||
Comment 4•3 years ago
|
||
Stencils should be released before JSRuntime::destroyRuntime
, that happens in (nsComponentManagerImpl::gComponentManager)->Shutdown()
.
So, ScriptPreloader
should be released between CCPostLastCycleCollection
fast shutdown and it,
and actually, we just need to put it next to StartupCache::DeleteSingleton
.
maybe we should directly call ScriptPreloader's static method, both for ScriptPreloader
singleton and AutoMemMap
singleton,
to align with StartupCache
, instead of adding shutdown phase.
nsresult ShutdownXPCOM(nsIServiceManager* aServMgr) {
...
mozilla::AppShutdown::MaybeFastShutdown(
mozilla::ShutdownPhase::CCPostLastCycleCollection);
mozilla::scache::StartupCache::DeleteSingleton();
PROFILER_MARKER_UNTYPED("Shutdown xpcom", OTHER);
// Shutdown xpcom. This will release all loaders and cause others holding
// a refcount to the component manager to release it.
if (nsComponentManagerImpl::gComponentManager) {
rv = (nsComponentManagerImpl::gComponentManager)->Shutdown();
nsresult nsComponentManagerImpl::Shutdown(void) {
...
StaticComponents::Shutdown();
/* static */ void StaticComponents::Shutdown() {
CallUnloadFuncs();
static void CallUnloadFuncs() {
...
nsLayoutModuleDtor();
void nsLayoutModuleDtor() {
...
xpcModuleDtor();
void xpcModuleDtor() {
// Release our singletons
nsXPConnect::ReleaseXPConnectSingleton();
void nsXPConnect::ReleaseXPConnectSingleton() {
nsXPConnect* xpc = gSelf;
if (xpc) {
nsrefcnt cnt;
NS_RELEASE2(xpc, cnt);
}
nsXPConnect::~nsXPConnect() {
...
delete mContext;
class XPCJSContext final : public mozilla::CycleCollectedJSContext,
CycleCollectedJSContext::~CycleCollectedJSContext() {
...
JS_DestroyContext(mJSContext);
JS_PUBLIC_API void JS_DestroyContext(JSContext* cx) { DestroyContext(cx); }
void js::DestroyContext(JSContext* cx) {
...
rt->destroyRuntime();
void JSRuntime::destroyRuntime() {
...
{
AutoLockScriptData lock(this);
MOZ_ASSERT(scriptDataTable(lock).empty());
}
Assignee | ||
Comment 5•3 years ago
|
||
This stops destructing the stencils held by ScriptPreloader on release build,
to avoid the shutdown hang caused by thrashing.
Also:
- Move ScriptPreloader::Cleanup from xpcom-shutdown notification to ScriptPreloader destructor, given there's no reason for having 2 steps
- Replace ClearOnShutdown call with explicit function call for the singleton deletion, for simplicity
- Remove now-unused PostJSShutDown phase
Assignee | ||
Updated•3 years ago
|
Comment 7•3 years ago
|
||
bugherder |
Comment 8•3 years ago
|
||
Since the status are different for nightly and release, what's the status for beta?
For more information, please visit auto_nag documentation.
Assignee | ||
Updated•3 years ago
|
Comment 9•3 years ago
|
||
Please nominate this for Beta approval when you get a chance.
Assignee | ||
Comment 10•3 years ago
|
||
There's another crash possibly caused by this patch, and now I'm investigating.
I'd like to wait for a while before asking uplift.
Updated•3 years ago
|
Assignee | ||
Updated•3 years ago
|
Description
•