Dooming cache via asyncVisitAllStorages while resetting networkPredictor leads to inconsistent cache state
Categories
(Core :: Networking: Cache, defect, P2)
Tracking
()
People
(Reporter: pbz, Assigned: valentin)
References
(Depends on 1 open bug)
Details
(Whiteboard: [necko-triaged])
Attachments
(3 files)
For Bug 1705030 I'm adding a deleteByBaseDomain
method for the NetworkCacheCleaner. To clear storage by baseDomain I'm iterating over all cache entries via nsICacheStorageService#asyncVisitAllStorages
and doom those that match the baseDomain with nsICacheStorage#asyncDoomURI
.
When running my code isolated it clears the cache entries properly. However, when I run my code in parallel with nsINetworkPredictor#reset
I end up with leftover cache entries in the disk cache. These cache entries may not exist. While they return true
for the exists
check, they don't have a matching file on disk and have a data size of 0.
I found that all nsICacheStorage#asyncDoomURI
calls in my code return NS_OK for the disk cache, still they're not being fully cleared.
I've attached a PoC with which I can reproduce with the following STR:
- Visit https://senglehardt.com/test/dfpi/storage_access_api.html.
- Clear all cache via
Services.cache2.clear()
. - Reload the page so we get new cache entries only for this tab.
- Run PoC in browser toolbox.
- Find leftover entries in about:cache disk cache view which seem to have no last access date and no data associated.
Reporter | ||
Comment 1•4 years ago
|
||
Reporter | ||
Comment 2•4 years ago
|
||
Valentin, we recently discussed some of the issues I experienced with the cache service. Do you know if these cache entries are valid and why they don't get cleared?
This is blocking my work on Bug 1705030 so I'd like to find a solution in the near term. If you could give me some leads on where to look I can also try to debug the issue myself.
Thanks!
Assignee | ||
Comment 3•4 years ago
|
||
Interestingly the steps in comment 0 don't work if I test on my own Nightly profile, but do it I try on a debug build.
As far as I can say it's because the entries are still in the index or in mHandles
I'll give it another look tomorrow - but the cache impl is something we don't have much experience with.
Reporter | ||
Comment 4•4 years ago
|
||
(In reply to Valentin Gosu [:valentin] (he/him) from comment #3)
Interestingly the steps in comment 0 don't work if I test on my own Nightly profile, but do it I try on a debug build.
Maybe the predictor is disabled on your Nightly profile? When I set network.predictor.enabled=false
the issue doesn't reproduce anymore. I think some extension (uBlock Origin?) might disable the predictor.
Reporter | ||
Comment 5•4 years ago
|
||
I'm wondering if the issue is caused by how the network predictor clears cache. It iterates over all cache entries similar to my code. However, when it finds a cache entry without aIdEnhance
it visits the cache entry to inspect / rewrite its metadata:
AsyncOpenURI
: https://searchfox.org/mozilla-central/rev/bf8d5de8528036c09590009720bc172882845b80/netwerk/base/Predictor.cpp#1891
entry->SetMetaDataElement
: https://searchfox.org/mozilla-central/rev/bf8d5de8528036c09590009720bc172882845b80/netwerk/base/Predictor.cpp#1788
Possibly opening / writing to the cache entry or holding parts of the entry leads to this inconsistent state.
Assignee | ||
Comment 6•3 years ago
|
||
Thanks for the lead. I still don't know if this is expected behaviour or a bug, but this should help me isolate a test case.
Assignee | ||
Comment 7•3 years ago
|
||
Update:
It's actually the extra call to AsyncOpenURI that's causing the ghost cache entries to show up.
Assignee | ||
Comment 8•3 years ago
|
||
This bug is caused by two cache visitors working in parallel:
Visitor1: calls asyncDoomURI
Visitor2: calls asyncOpenURI
This causes CacheStorageService::AddStorageEntry to create a new entry and add
it to the hashtable:
https://searchfox.org/mozilla-central/rev/308ea44d0d60b391b031ccee695920bd543f7d2f/netwerk/cache2/CacheStorageService.cpp#1566,1574,1576
Visitor3: will then go through the hashtable and see the newly added entry.
Comment 10•3 years ago
|
||
bugherder |
Comment 11•3 years ago
|
||
Backed out changeset c7911efe7aad (bug 1708673) on beta for causing perf bug 1718267
Updated•3 years ago
|
Comment 12•3 years ago
|
||
Backed out from 92 as well due to bug 1718267.
https://hg.mozilla.org/releases/mozilla-beta/rev/74eeb74cabd7
Comment 13•3 years ago
|
||
Backed out from 93 for causing bug 1718267.
https://hg.mozilla.org/releases/mozilla-beta/rev/65591fce10de05bfd1d6d3d829f8cad140113138
(We should probably back it out in mozilla-central if we don't fix the perf issue in the 94 cycle)
Assignee | ||
Comment 14•3 years ago
|
||
I'll flip the pref for this change on nightly in bug 1718267. If that fixes the perf regression I'll file a different bug for reenabling it.
Updated•3 years ago
|
Comment 15•3 years ago
|
||
The patch landed in nightly and beta is affected.
:valentin, is this bug important enough to require an uplift?
If not please set status_beta
to wontfix
.
If yes, don't forget to request an uplift for the patches in the regression caused by this fix.
For more information, please visit auto_nag documentation.
Assignee | ||
Updated•3 years ago
|
Updated•3 years ago
|
Comment 16•3 years ago
•
|
||
I've reproduced this bug using the steps from comment 0, on an affected Nightly build (90.0a1 Build ID 20210430214504).
The last access date and data associated for the test page entries are now displayed in about:cache
. Tested with Beta 94.0b2, under macOS 11, Ubuntu 18.04 x64 and Win 11 x64.
Description
•