Open Bug 1710018 Opened 3 years ago Updated 2 months ago

Crash in [@ shutdownhang | mozilla::PreferencesWriter::Flush]. Preferences not saved.

Categories

(Core :: Preferences: Backend, defect, P3)

Unspecified
Windows
defect

Tracking

()

Tracking Status
firefox-esr78 --- unaffected
firefox88 --- unaffected
firefox89 --- wontfix
firefox90 --- wontfix

People

(Reporter: aryx, Unassigned)

References

(Blocks 1 open bug)

Details

(5 keywords)

Crash Data

Attachments

(3 files)

80 crashes, all 89 betas on Windows, multiple crashes per installation

Crash report: https://crash-stats.mozilla.org/report/index/6eff1f66-20c6-404f-a03f-6995f0210506

MOZ_CRASH Reason: Shutdown hanging at step xpcom-will-shutdown. Something is blocking the main-thread.

Top 10 frames of crashing thread:

0 ntdll.dll NtWaitForAlertByThreadId 
1 ntdll.dll RtlSleepConditionVariableSRW 
2 kernelbase.dll SleepConditionVariableSRW 
3 mozglue.dll mozilla::detail::ConditionVariableImpl::wait mozglue/misc/ConditionVariable_windows.cpp:50
4 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:1093
5 xul.dll NS_ProcessNextEvent xpcom/threads/nsThreadUtils.cpp:548
6 xul.dll static mozilla::PreferencesWriter::Flush modules/libpref/Preferences.cpp:2974
7 xul.dll mozilla::Preferences::SavePrefFileBlocking modules/libpref/Preferences.cpp:3778
8 xul.dll mozilla::Preferences::Observe modules/libpref/Preferences.cpp:3666
9 xul.dll nsObserverList::NotifyObservers xpcom/ds/nsObserverList.cpp:70
Severity: -- → S3

So it seems that also on bug 1636925 there are still quite some crashes.

Looking at Preferences::WritePrefFile makes me think, if we should not do exactly the opposite of what the comment suggests, that is increment and decrement the PreferencesWriter::sPendingWriteCount only from the runnable's constructor/destructor itself.

IIUC, in the current state we check if the event target exists but later we do not check, if the dispatch really succeeded. If the PreferencesWriter::sPendingWriteCount was touched only in the runnable's constructor/destructor itself, we were immune against this kind of failures.

We want to ensure that increment and decrement of sPendingWriteCount can never be out
of sync and in DEBUG that we actually ran the PWRunnable, too.

Assignee: nobody → jstutte
Status: NEW → ASSIGNED
Attachment #9260622 - Attachment description: Bug 1710018: Reduce possible edge cases for sPendingWriteCount misalignments. r?smaug → Bug 1710018: Fix possible edge case for sPendingWriteCount misalignments. r?smaug
Pushed by jstutte@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c3081279d204
Fix possible edge case for sPendingWriteCount misalignments. r=smaug
https://hg.mozilla.org/integration/autoland/rev/107ab2613b16
Refuse preference changes after XPCOMShutdownThreads started. r=smaug

If this lands cleanly and survives tier-2 tests, I will add a patch to remove the assert in pref_SetPref, the NS_ERROR should be enough for the future.

Flags: needinfo?(jstutte)
Keywords: leave-open
Regressions: 1752308
Pushed by jstutte@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/736620d8094a
Remove the shutdown assertion from pref_SetPref to just rely on the NS_ERROR_ILLEGAL_DURING_SHUTDOWN failure return. r=dom-worker-reviewers,jesup
Crash Signature: [@ shutdownhang | mozilla::PreferencesWriter::Flush] → [@ shutdownhang | mozilla::PreferencesWriter::Flush] [@ shutdownhang | mozilla::SpinEventLoopUntil<T> | mozilla::PreferencesWriter::Flush]
Flags: needinfo?(jstutte)
Crash Signature: [@ shutdownhang | mozilla::PreferencesWriter::Flush] [@ shutdownhang | mozilla::SpinEventLoopUntil<T> | mozilla::PreferencesWriter::Flush] → [@ shutdownhang | mozilla::PreferencesWriter::Flush] [@ shutdownhang | mozilla::SpinEventLoopUntil<T> | mozilla::PreferencesWriter::Flush] [@ shutdownhang | kernelbase.dll | mozilla::SpinEventLoopUntil<T> | mozilla::Preferences::SavePrefFileBlocking]
Crash Signature: [@ shutdownhang | mozilla::PreferencesWriter::Flush] [@ shutdownhang | mozilla::SpinEventLoopUntil<T> | mozilla::PreferencesWriter::Flush] [@ shutdownhang | kernelbase.dll | mozilla::SpinEventLoopUntil<T> | mozilla::Preferences::SavePrefFileBlocking] → [@ shutdownhang | mozilla::PreferencesWriter::Flush] [@ shutdownhang | mozilla::SpinEventLoopUntil<T> | mozilla::PreferencesWriter::Flush] [@ shutdownhang | kernelbase.dll | mozilla::SpinEventLoopUntil<T> | mozilla::Preferences::SavePrefFileBlocking] […

Just to better understand the dynamics here: The patches attached here did not land before 97 was released and thus are no possible reason for the spikes we see here. On the other hand they seem not to solve the problem (entirely), as we still see crashes in beta 98. There must be some other thing that came into 97.

Looking at this comment makes me think if we should also check if aFile == mCurrentFile and not use the SaveMethod::Blocking in that case?

Flags: needinfo?(bugs)

(In reply to Jens Stutte [:jstutte] from comment #12)

Looking at this comment makes me think if we should also check if aFile == mCurrentFile and not use the SaveMethod::Blocking in that case?

To be clear, I do not really expect that this would help against this SpinEventLoopUntil hang here.

It seems as if the number of crashes on bug 1636925 went down with 97 while here it went up. It might just be the same root cause?

Not sure where we'd check aFile == mCurrentFile.
Do know in the crash reports if one is using SSD or old style HD?

Flags: needinfo?(bugs)

Crash report > "Crash annotations" tab > "TelemetryEnvironment" row > search for "hdd".

(In reply to Olli Pettay [:smaug] from comment #15)

Not sure where we'd check aFile == mCurrentFile.

Instead of just if (nullptr == aFile) I assume from the comment and the code below if (nullptr == aFile || aFile == mCurrentFile) in order to not do WritePrefFile(aFile, SaveMethod::Blocking);.

The comment hints about the behavior the method has, no?
If file isn't null, ::Blocking is used, whether or not aFile is mCurrentFile or not.
But yes, the comment is wondering whether we want that behavior.

The current behavior is consistent though. Whenever file is passed, Blocking is used.

Sure, I was just wondering if the comment's wondering gives us some hint...

What strikes me here is that we seem to see the remaining instances of this hang only in early beta versions now. As if there was something active only if EARLY_BETA_OR_EARLIER.

Updating the signatures to the only remaining one.

Crash Signature: [@ shutdownhang | mozilla::PreferencesWriter::Flush] [@ shutdownhang | mozilla::SpinEventLoopUntil<T> | mozilla::PreferencesWriter::Flush] [@ shutdownhang | kernelbase.dll | mozilla::SpinEventLoopUntil<T> | mozilla::Preferences::SavePrefFileBlocking] [… → [@ shutdownhang | mozilla::SpinEventLoopUntil<T> | mozilla::PreferencesWriter::Flush]
Assignee: jstutte → nobody
Status: ASSIGNED → NEW
Priority: -- → P3

The bug is linked to a topcrash signature, which matches the following criterion:

  • Top 20 desktop browser crashes on release

:KrisWright, could you consider increasing the severity of this top-crash bug?

For more information, please visit auto_nag documentation.

Flags: needinfo?(kwright)
Keywords: topcrash

(In reply to Jens Stutte [:jstutte] from comment #20)

What strikes me here is that we seem to see the remaining instances of this hang only in early beta versions now. As if there was something active only if EARLY_BETA_OR_EARLIER.

Now I see many release crashes. Not sure if something changed in how we handle the signatures here.

Most of the crashes I looked at seem to wait on NtCreateFile, that is slow or blocked IO for some reason. I wonder if we have a chance to know if the profile directory is on a network attached storage or otherwise removable device, though I'd expect Windows to throw an error then and not just block?

The leave-open keyword is there and there is no activity for 6 months.
:KrisWright, maybe it's time to close this bug?
For more information, please visit BugBot documentation.

Flags: needinfo?(kwright)
Summary: Crash in [@ shutdownhang | mozilla::PreferencesWriter::Flush] → Crash in [@ shutdownhang | mozilla::PreferencesWriter::Flush]. Preferences not saved.

If I recall some changes were made that impact the PreferencesWriter shutdown in the past few months. Is this still a topcrash for TB?
Additionally, though the signature hasn't changed we may benefit from spinning this hang off into a new bug, as the existing patches on this bug do not address remaining crashes.

Flags: needinfo?(vseerror)
Flags: needinfo?(kwright)

(In reply to Kris Wright :KrisWright from comment #25)

If I recall some changes were made that impact the PreferencesWriter shutdown in the past few months. Is this still a topcrash for TB?
Additionally, though the signature hasn't changed we may benefit from spinning this hang off into a new bug, as the existing patches on this bug do not address remaining crashes.

Thanks Kris. It is a top 20 crash for Thunderbird 115.8.0. if the changes to

There is no change improvement for Thunderbird esr (our main release) and also no change for non-release versions

Flags: needinfo?(vseerror)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: