Investigate purging HTTP disk cache using backgroundtasks (out-of-process)
Categories
(Core :: Networking: Cache, task, P2)
Tracking
()
Tracking | Status | |
---|---|---|
firefox105 | --- | fixed |
People
(Reporter: valentin, Assigned: valentin)
References
Details
(Whiteboard: [necko-triaged])
Attachments
(5 files)
We have quite a few shutdown hang bugs caused by slow IO (on Windows) when clearing the cache folder at shutdown. This is more common for users who opted to purge the cache on shutdown.
Seeing how we now have a backgroundtasks process that can handle out-of-process work, it would be nicer if instead of waiting for each individual file to be removed we could let the backgroundtasks process do it.
Going forward we have two options:
- the main process renames the folder, passes the name to the background tasks, which then removes it. Here I need to investigate if renaming the folder has a better performance than removing each file.
- the main process changes the path it has for the cache folder, and passes the initial name to the background process. The benefit here is that we don't need to change anything about that folder.
Also something we need to consider - how do we deal with failures? Removing the cache folder is somewhat privacy sensitive, but even if it weren't, it would suck to have multiple cache folders sitting around wasting disk space.
Comment 1•4 years ago
|
||
This is an interesting idea; in spirit, it's similar to the way that we launch the pingsender
binary as the process exits: it provides performance and robustness during shutdown. Do we already use a threadpool for deletion on Windows? Former Mozilla employee :gps talks about this in his Surprisingly Slow blog post; we probably already do it, but maybe worth talking about.
There are some limitations built into the current background task implementation that impact how tasks like this are structured, especially around temporary profiles; some docs are at https://firefox-source-docs.mozilla.org/toolkit/components/backgroundtasks/index.html. It would be good for us to sketch the design and tradeoffs together if we're serious about doing this.
Fascinating idea!
Assignee | ||
Comment 2•3 years ago
|
||
This is a PoC that dispatches a background task to completely delete the entire cache2 folder instead of individually deleting each file at shutdown.
There are still a bunch of TODOs left to figure out, but I'd appreciate some early feedback for this approach.
Updated•3 years ago
|
Updated•3 years ago
|
Assignee | ||
Comment 3•3 years ago
|
||
Depends on D126339
Assignee | ||
Comment 4•3 years ago
|
||
This patch uses a CachePurgeLock for every directory it deletes.
Because we can't possibly know if a folder was already deleted before we
aquire the lock or if it never existed (without the use of a log file) we
can assume that if a folder isn't there after the timeout has expired it's
already been purged by a different cleanup task.
Depends on D132164
Assignee | ||
Comment 5•3 years ago
|
||
Depends on D126339
Comment 6•3 years ago
|
||
valentin: sorry for the delay here. I did get the prerequisite ticket landed, but not on the time frame you were pushing for. Sorry for the delay. Talk when you're back from PTO.
Assignee | ||
Comment 7•3 years ago
|
||
(In reply to Nick Alexander :nalexander [he/him] from comment #6)
valentin: sorry for the delay here. I did get the prerequisite ticket landed, but not on the time frame you were pushing for. Sorry for the delay. Talk when you're back from PTO.
No worries. Hopefully we can land this in this release and enable it on nightly.
Assignee | ||
Comment 8•3 years ago
|
||
Hi Nick, let me know if I should do anything to help with the reviews. Thanks!
Comment 9•3 years ago
|
||
(In reply to Valentin Gosu [:valentin] (he/him) from comment #8)
Hi Nick, let me know if I should do anything to help with the reviews. Thanks!
Sorry for the slow replies. This is on my list, will update by EoW.
Comment 10•3 years ago
|
||
There are some r+ patches which didn't land and no activity in this bug for 2 weeks.
:valentin, could you have a look please?
If you still have some work to do, you can add an action "Plan Changes" in Phabricator.
For more information, please visit auto_nag documentation.
Comment 11•3 years ago
|
||
The plan is that I will land this after some prerequisites land. NI to me.
Assignee | ||
Comment 12•3 years ago
|
||
Assuming there are no other blockers, I'm going to rebase the patches and see if we have a green try run.
Hopefully we can land them this week.
Comment 13•2 years ago
|
||
(In reply to Valentin Gosu [:valentin] (he/him) from comment #12)
Assuming there are no other blockers, I'm going to rebase the patches and see if we have a green try run.
Hopefully we can land them this week.
Valentin -- I'm so sorry for not getting this across the line. There is, of course, a story, but it's not particularly interesting. I found some small issues, that I addressed months ago locally. Rebased try builds are here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=196a4e01863f20230f01b1fc5c499f9cdb22606d
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d71152cecf4b64e1f05714ee9162c45d2ef8faab
Please take the last patch (marked TO FOLD
) and merge it into your changes. (I also squashed some commits that do not stand alone in my opinion.) Thanks for taking this back and sorry it's not already done!
Assignee | ||
Comment 14•2 years ago
|
||
No worries. I'm sure this wasn't the most important thing for you to work on.
Thanks for tracking down the path issue, I was just getting around to investigating it, you probably saved me a few hours 🙂.
Comment 15•2 years ago
|
||
The failures in this Windows TV job, are, I think, because this this line needs a .catch()
to ignore errors. It can happen that the process exits before we can close stdin
, IIRC.
Not sure what the other errors might be. Bump MOZ_LOG=BackgroundTasks:5
to get more information about potential issues choosing ephemeral profile directories, which might race (still!).
Assignee | ||
Comment 16•2 years ago
|
||
Depends on D132165
Comment 17•2 years ago
|
||
Hi Valentin,
This sounds like a great thing and I think this should be used for anything that does cleanup on shutdown. I'd be happy if I can reuse your work for my component, do you think there is anything on the current patch that blocking reuse?
Despite the name (purgeHTTPCache and CachePurgeLock), it looks like the implementation is already quite generic as the actual behavior doesn't seem to assert cacheDirName
is a cache directory, so I think giving a different dir name would do the work. Am I understanding correctly here?
Assignee | ||
Comment 18•2 years ago
|
||
(In reply to Kagami :saschanaz from comment #17)
This sounds like a great thing and I think this should be used for anything that does cleanup on shutdown. I'd be happy if I can reuse your work for my component, do you think there is anything on the current patch that blocking reuse?
I don't think there are major things blocking reuse. But we should be careful with having many shutdown tasks going at the same time, as they could cause unwanted interactions.
Despite the name (purgeHTTPCache and CachePurgeLock), it looks like the implementation is already quite generic as the actual behavior doesn't seem to assert
cacheDirName
is a cache directory, so I think giving a different dir name would do the work. Am I understanding correctly here?
Yes, it could potentially be used for other things as well, so I'm not against renaming it. I don't especially like the interface of CachePurgeLock, as you need to manually unlock it all the time, and if you throw or return before that you'll have a bad time 🙁. I plan to change the interface a bit in the future.
Comment 19•2 years ago
|
||
Comment 20•2 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/05f442b8dc10
https://hg.mozilla.org/mozilla-central/rev/a7467de0421b
https://hg.mozilla.org/mozilla-central/rev/a3efb692ab3f
https://hg.mozilla.org/mozilla-central/rev/6fc586712802
https://hg.mozilla.org/mozilla-central/rev/cad6e6b08c7b
Description
•