Closed Bug 1763593 Opened 2 years ago Closed 2 years ago

Assertion failure: false (ClearOnShutdown for phase that already was cleared)

Categories

(Toolkit :: Telemetry, defect, P1)

RESOLVED FIXED
101 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox99 --- wontfix
firefox100 --- wontfix
firefox101 --- fixed

People

(Reporter: florian, Assigned: chutten)

References

(Regression)

Details

(Keywords: regression)

Attachments

(3 files)

Assertion failure: false (ClearOnShutdown for phase that already was cleared), at /Users/florian/buildhg/mozilla/xpcom/base/ClearOnShutdown.cpp:20
#01: mozilla::ClearOnShutdown_Internal::InsertIntoShutdownList(mozilla::ClearOnShutdown_Internal::ShutdownObserver*, mozilla::ShutdownPhase)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x21a288]
#02: mozilla::detail::RunnableFunction<mozilla::glean::GetLabeledMirrorLock()::'lambda'()>::Run()[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x57abef8]
#03: mozilla::RunnableTask::Run()[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x35c5d0]
#04: mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x336490]
#05: mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x3350e8]
#06: mozilla::TaskController::ProcessPendingMTTask(bool)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x335394]
#07: mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_1>::Run()[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x36312c]
#08: nsThread::ProcessNextEvent(bool, bool*)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x34a22c]
#09: nsThreadPool::ShutdownWithTimeout(int)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x354e3c]
#10: mozilla::MozPromise<CopyableTArray<bool>, bool, false>::ThenValue<nsThreadManager::Shutdown()::$_7>::DoResolveOrRejectInternal(mozilla::MozPromise<CopyableTArray<bool>, bool, false>::ResolveOrRejectValue&)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x36b8d0]
#11: mozilla::MozPromise<CopyableTArray<bool>, bool, false>::ThenValueBase::ResolveOrRejectRunnable::Run()[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x3692e0]
#12: mozilla::RunnableTask::Run()[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x35c5d0]
#13: mozilla::TaskController::DoExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x336490]
#14: mozilla::TaskController::ExecuteNextTaskOnlyMainThreadInternal(mozilla::detail::BaseAutoLock<mozilla::Mutex&> const&)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x3350e8]
#15: mozilla::TaskController::ProcessPendingMTTask(bool)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x335394]
#16: mozilla::detail::RunnableFunction<mozilla::TaskController::InitializeInternal()::$_1>::Run()[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x36312c]
#17: nsThread::ProcessNextEvent(bool, bool*)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x34a22c]
#18: nsThreadManager::Shutdown()[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x34e85c]
#19: mozilla::ShutdownXPCOM(nsIServiceManager*)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x393e30]
#20: ScopedXPCOMStartup::~ScopedXPCOMStartup()[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x59c553c]
#21: XREMain::XRE_main(int, char**, mozilla::BootstrapConfig const&)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x59d17c4]
#22: XRE_main(int, char**, mozilla::BootstrapConfig const&)[/Users/florian/buildhg/mozilla/obj-dbg/toolkit/library/build/XUL +0x59d1eec]
#23: main[/Users/florian/buildhg/mozilla/obj-dbg/dist/NightlyDebug.app/Contents/MacOS/firefox +0xc70]

My try run also shows "TEST-UNEXPECTED-FAIL | leakcheck | default 336 bytes leaked (GetLabeledMirrorLock, nsStringBuffer)".

Set release status flags based on info from the regressing bug 1752417

:chutten, since you are the author of the regressor, bug 1752417, could you take a look?
For more information, please visit auto_nag documentation.

Flags: needinfo?(chutten)
Has Regression Range: --- → yes

I pushed to try a simple workaround similar to what I did in bug 1753598 but there are leaks that still make test runs orange: https://treeherder.mozilla.org/jobs?repo=try&tier=1%2C2%2C3&revision=dbd96928ffe3d1590bd33059ce04f5fe6f45e6fa (not really a surprise)

Well, we could push clearing these maps until XPCOMShutdownFinal, maybe? But I'm guessing the threadstats will continue to be sent, so this might just be a bandaid.

...or I suppose since this is all internal stuff we could refrain from calling them if we're late in shutdown. I'll give it a think and try some solutions next week.

Assignee: nobody → chutten
Severity: -- → S3
Status: NEW → ASSIGNED
Flags: needinfo?(chutten)
Priority: -- → P1

We need to clear state at shutdown that we hold for GIFFT mirroring purposes.
However, more data can come in even later: nothing stops it, and Glean can
meaningfully record later because its shutdown happens in a later phase. So the
naive approach of lazy-instantiate and RunOnShutdown-clear doesn't work.

Since Telemetry is only good to AppShutdownTelemetry, we now fail to mirror to
it after AppShutdownTelemetry and clear state in the immediately-next phase.
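
To illustrate the idea, here is a minimal standalone sketch of a fallible accessor. It is a toy model, not the landed patch: `Phase` is a simplified subset of the shutdown phases, `MirrorState` and its methods are hypothetical names, and I'm assuming "after AppShutdownTelemetry" means any phase strictly later than it.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Simplified subset of the shutdown phases, in order. Toy model only; the
// real enum has more phases.
enum class Phase {
  NotInShutdown,
  AppShutdownTelemetry,
  XPCOMWillShutdown,
  XPCOMShutdownFinal,
};

// Hypothetical holder for the lazily created mirror state.
class MirrorState {
 public:
  using Map = std::map<std::string, int>;

  // Fallible accessor: returns nullptr once we are past AppShutdownTelemetry,
  // so a late caller can never re-create state that was already cleared.
  Map* Get(Phase current) {
    if (current > Phase::AppShutdownTelemetry) {
      return nullptr;
    }
    if (!mMap) {
      mMap = std::make_unique<Map>();
    }
    return mMap.get();
  }

  // Records a value if mirroring is still allowed; reports whether it did.
  bool Mirror(Phase current, const std::string& label, int value) {
    Map* map = Get(current);
    if (!map) {
      return false;
    }
    (*map)[label] += value;
    return true;
  }

  // Called as each phase is reached; drops the state in the first phase
  // after AppShutdownTelemetry (and stays empty in later ones).
  void OnPhaseReached(Phase reached) {
    if (reached > Phase::AppShutdownTelemetry) {
      mMap.reset();
    }
  }

  bool HasState() const { return mMap != nullptr; }

 private:
  std::unique_ptr<Map> mMap;
};
```

The point of the sketch is that callers have to handle a null result, which is what makes late data harmless: it is dropped instead of resurrecting the map.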

Pushed by chutten@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5f0c6679ff0b
GIFFT mirroring is now fallible r=janerik
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → 101 Branch

There are still failures (see bug 1763474 comment 7) on xpcshell tests on Windows debug, and on wpt tests on Mac (where it's still an issue with the RegisterFonts thread).

The stack for the xpcshell failure is:

Assertion failure: false (ClearOnShutdown for phase that already was cleared), at /builds/worker/checkouts/gecko/xpcom/base/ClearOnShutdown.cpp:20
#01: mozilla::ClearOnShutdown_Internal::InsertIntoShutdownList(mozilla::ClearOnShutdown_Internal::ShutdownObserver*, mozilla::ShutdownPhase) [xpcom/base/ClearOnShutdown.cpp:20]
#02: mozilla::detail::RunnableFunction<`lambda at /builds/worker/workspace/obj-build/dist/include/mozilla/glean/bindings/ScalarGIFFTMap.h:38:70'>::Run() [xpcom/threads/nsThreadUtils.h:532]
#03: mozilla::RunnableTask::Run() [xpcom/threads/TaskController.cpp:468]

The output for the wpt failure is:

[1090, RegisterFonts] WARNING: Called GetMainThread but there isn't a main thread and we're not the main thread.: file /builds/worker/checkouts/gecko/xpcom/threads/nsThreadManager.cpp:572
[1090, RegisterFonts] WARNING: 'NS_FAILED(rv)', file /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:220
[1090, RegisterFonts] ###!!! ASSERTION: Failed NS_DispatchToMainThread() in shutdown; leaking: 'false', file /builds/worker/checkouts/gecko/xpcom/threads/nsThreadUtils.cpp:222

The patch landed in nightly and beta is affected.
:chutten, is this bug important enough to require an uplift?
If not please set status_beta to wontfix.

For more information, please visit auto_nag documentation.

Flags: needinfo?(chutten)

Don't want to uplift a partial fix. But maybe we'll want an uplift of the complete one.

Looks like with the added failures we must've missed something. It seems so obvious now, but I should be checking the shutdown phase inside the main thread runnable that wants to register the ClearOnShutdown. Otherwise the phase might advance between when the main thread runnable is dispatched and when it is run.
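
A rough standalone sketch of that race and the re-check, as a toy model rather than the actual patch: `ToyRuntime`, `ScheduleCleanup`, and the `lateRegistration` flag (standing in for the assertion in this bug) are all hypothetical, and `Phase` is a simplified subset of the shutdown phases.

```cpp
#include <cassert>
#include <deque>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified subset of the shutdown phases, in order. Toy model only.
enum class Phase {
  NotInShutdown,
  AppShutdownTelemetry,
  XPCOMWillShutdown,
  XPCOMShutdownFinal,
};

// Hypothetical stand-in for the main thread queue plus the shutdown list.
struct ToyRuntime {
  Phase phase = Phase::NotInShutdown;
  std::deque<std::function<void()>> mainThreadQueue;
  std::vector<std::function<void()>> clearAtWillShutdown;
  bool lateRegistration = false;  // models the assertion from this bug

  void Dispatch(std::function<void()> task) {
    mainThreadQueue.push_back(std::move(task));
  }

  void RunPending() {
    while (!mainThreadQueue.empty()) {
      auto task = std::move(mainThreadQueue.front());
      mainThreadQueue.pop_front();
      task();
    }
  }

  // Registering for a phase that was already cleared is the failure mode.
  void RegisterClear(std::function<void()> fn) {
    if (phase >= Phase::XPCOMWillShutdown) {
      lateRegistration = true;
      return;
    }
    clearAtWillShutdown.push_back(std::move(fn));
  }

  void Advance(Phase next) {
    phase = next;
    if (next == Phase::XPCOMWillShutdown) {
      for (auto& fn : clearAtWillShutdown) fn();
      clearAtWillShutdown.clear();
    }
  }
};

using Map = std::map<std::string, int>;

// The phase is checked inside the runnable, when it runs, not when it is
// dispatched. If we are already late, clear immediately instead of
// registering. `state` must outlive the runtime's queued work.
void ScheduleCleanup(ToyRuntime& rt, std::unique_ptr<Map>& state) {
  rt.Dispatch([&rt, &state] {
    if (rt.phase >= Phase::XPCOMWillShutdown) {
      state.reset();
      return;
    }
    rt.RegisterClear([&state] { state.reset(); });
  });
}
```

If the phase advances between `Dispatch` and `RunPending`, the runnable takes the early-clear branch, so nothing is registered late and nothing is leaked.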

Status: RESOLVED → REOPENED
Flags: needinfo?(chutten)
Resolution: FIXED → ---

As for wpt, though... that assertion failure comes from here and I thought it was harmless. I guess assertions might very well fail tests (makes sense), in which case I need to inline NS_DispatchToMainThread a little more than I thought. Woo.

Crash Signature: [@ mozilla::ClearOnShutdown_Internal::InsertIntoShutdownList(mozilla::ClearOnShutdown_Internal::ShutdownObserver*, mozilla::ShutdownPhase)]
Pushed by chutten@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/926c186d95b4
Clear the map immediately if already late in shutdown r=TravisLong
https://hg.mozilla.org/integration/autoland/rev/6d204bcac61a
Bypass NS_DispatchToMainThread's assert while handling the leak. r=TravisLong
Status: REOPENED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED

Given that this failure was expressed in terms of bug 1763474, which is riding the same train, and given the lateness of the beta cycle, I don't think we want to uplift this.
