Open Bug 1848585 Opened 1 year ago Updated 1 year ago

delete cookies on shutdown is very slow

Categories

(Toolkit :: Data Sanitization, defect, P3)

Firefox 115
defect

Tracking

()

UNCONFIRMED
Performance Impact low

People

(Reporter: uwe.geiges, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(1 file)

328.72 KB, application/x-zip-compressed
Details

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/115.0

Steps to reproduce:

Fresh install of Firefox, no extensions installed
Activating setting "Delete cookies and site data when Firefox is closed"
Visit a website that sets a cookie, e.g. amazon.com
Close Firefox

Actual results:

The main window closes immediately, but several "firefox.exe" processes are still running for 4-5 sec.

Expected results:

The cookie and site data cleanup should not take several seconds. All the "firefox.exe" processes should be able to close in a few tenths of a second.

This problem started with Firefox 111.
On Firefox 110 the time between closing Firefox and all processes terminated was < 0.5 sec.
On Firefox 102 (ESR) with the old "cookie.lifetimePolicy" code, the shutdown of all processes was almost instant.

The Bugbug bot thinks this bug should belong to the 'Toolkit::Data Sanitization' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Data Sanitization
Product: Firefox → Toolkit
Flags: needinfo?(pbz)

It is interesting that this can be dated back to when we switched over from the cookie.lifetimePolicy. We made some efforts to speed up the shutdown process by reducing the calls to the principalsCollector, which seems to be what is taking long, but, when I recall it correctly, the cookie.lifetimePolicy code also called the principalsCollector. So maybe it is something else in the Sanitizer code that is slow.

Could we capture a perf profile?

The expected result seems unrealistic, cleaning up especially on disk can take longer than "a few tenths of a second". While we can optimize the code more, there will always be slow machines or big profiles where it takes longer. The real solution is showing a progress dialog when it takes too long.

Blocks: 1550317
Severity: -- → S3
Flags: needinfo?(pbz)
Priority: -- → P3
See Also: → 1756724

My System is running on a NVMe SSD. Windows Explorer is able to delete 10k files in under a second.

I tested with an empty profile and visited only one website. Profile/cache2 contained 7.9 MB in 137 files.
After closing the main window, the profile\cache2 folder was deleted about 0.5 seconds later, together with most of the firefox.exe processes terminating. After that two or three firefox.exe processes are still running for 4-5 sec.

I traced the shutdown with Sysinternals ProcMon. In the first few tenths of a second there is heavy file access on profile\cache2\entries and \doomed. And then there is a four second delay where absolutely nothing happens. Then some file access to "AppData\Local\Temp\MozillaBackgroundTask-...", Registry access and Thread Exit.

Slow machine or big profile clearly isn't the problem here. Please read my first comment about earlier versions of Firefox.

Thanks!

Could you capture a profile of the shutdown and share it here? Profiling on shutdown is a bit more involved, but we have instructions here: https://profiler.firefox.com/docs/#/./guide-startup-shutdown?id=shutdown
The profile will give is a much more in-depth view of the slow paths.

Flags: needinfo?(uwe.geiges)
Flags: needinfo?(uwe.geiges)
Performance Impact: --- → ?

Hmm, it looks like whatever is keeping the processes running is happening after the profiler shuts down. In the profile, shutdown looks quite fast: https://share.firefox.dev/3L6csMa
But the profiler doesn't capture everything that happens during shutdown. It tries to capture as late as it can in the parent process. But for the other processes, the profile has to be sent to the parent process when the IPC connection is still around, so anything that happens after is not captured.

A different tool may be needed to gather useful information.

Can you recommend a tool?

I'm also wondering if we could use mozregression and treat long shutdown as "bad" to find the regressing bug.

Flags: needinfo?(mstange.moz)

After reading comment 6, it sounds like Sysinternals ProcMon may be the right tool :)

The mention of MozillaBackgroundTask makes me think that this delay happens in the "background task" process which runs outside the regular Firefox process. I wonder if the four second delay is intentional.

I found one 5-second timer in a BackgroundTask_* file here: https://searchfox.org/mozilla-central/rev/132ffbfd6842e5ecd3813673c24da849d3c9acf8/toolkit/components/backgroundtasks/BackgroundTask_removeDirectory.sys.mjs#73

Kagami, could this be the cause?

Flags: needinfo?(mstange.moz) → needinfo?(krosylight)

Yup, the process can't immediately be closed as Glean never blocks shutdown.

But I would expect only one background process, not sure why several processes are there. Keeping NI for that.

Confirmed, there are two for storage and network cache respectively. Filed bug 1853885 to reduce that, and adding bug 1790702 for the cause of this delay.

Depends on: 1790702
Flags: needinfo?(krosylight)

The Performance Impact Calculator has determined this bug's performance impact to be low. If you'd like to request re-triage, you can reset the Performance Impact flag to "?" or needinfo the triage sheriff.

Platforms: Windows
Impact on browser: Causes noticeable shutdown delay
Configuration: Specific but common

Performance Impact: ? → low
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: