Open Bug 1800701 Opened 2 years ago Updated 21 days ago

On shared file systems browser.cache.disk.capacity reports hardware size, not user quota

Categories

(Core :: Networking: Cache, defect, P2)

Firefox 102
defect

UNCONFIRMED

People

(Reporter: ed.sternin, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-triaged])

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Firefox/102.0

Steps to reproduce:

Used Firefox 102.3.0esr (64-bit) on a system (Linux kernel 3.10.0-1160.76.1.el7.x86_64) where the user's home directory is NFS mounted from a shared disk array.

Actual results:

User quota is set at 2GB. After a few weeks, cache size increased to 1.9GB, and no work could be done. Error messages unrelated to firefox showed up, all related to inability to write to home directory, exceeding the quota.

Possibly introduced in fixing https://bugzilla.mozilla.org/show_bug.cgi?id=1735717

Expected results:

browser.cache.disk.smart_size subsystem should have reported the user's share of the disk (the disk quota of 2GB), and not the total hardware disk size (500GB).
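For illustration only, the figures a program gets when it asks the operating system about the file system holding a directory can be read with `statvfs`. The Python sketch below is not Firefox code, and the helper name `filesystem_totals` is made up for this example. On a setup like the one reported here, these numbers describe the whole mounted array (what `df` prints) and say nothing about a per-user quota.

```python
import os

def filesystem_totals(path):
    """Return (total_bytes, available_bytes) for the file system holding `path`.

    These are file-system-wide figures, the same ones `df` reports.
    On the reporter's NFS setup a per-user quota is not reflected in them.
    """
    st = os.statvfs(path)  # POSIX only
    total = st.f_blocks * st.f_frsize
    available = st.f_bavail * st.f_frsize  # space available to unprivileged users
    return total, available
```

Calling `filesystem_totals("/home/esternin")` on the system described above would be expected to return roughly the 493G/61G pair shown by `df`, not the 2GB quota.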

A temporary workaround is to use about:config to set
browser.cache.disk.smart_size.enabled=false
browser.cache.disk.capacity=something reasonable (200000?)
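For a cluster with many profiles, the same two preferences could be placed in each profile's user.js instead of being set by hand. The Python sketch below only generates the two lines; the function name `user_js_lines` is hypothetical, and the capacity value is understood to be in kilobytes, so 200000 is a little under 200 MiB.

```python
def user_js_lines(capacity_kb):
    """Build user.js lines that disable smart sizing and pin the disk cache size.

    `capacity_kb` is the value for browser.cache.disk.capacity (assumed to be KB).
    """
    if capacity_kb < 0:
        raise ValueError("capacity_kb must not be negative")
    return [
        'user_pref("browser.cache.disk.smart_size.enabled", false);',
        f'user_pref("browser.cache.disk.capacity", {int(capacity_kb)});',
    ]
```

Printing `"\n".join(user_js_lines(200000))` gives text that can be appended to a user.js file in the profile's root directory.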

The Bugbug bot thinks this bug should belong to the 'Core::Storage: Quota Manager' component, and is moving the bug to that component. Please correct in case you think the bot is wrong.

Component: Untriaged → Storage: Quota Manager
Product: Firefox → Core

The preference in question is only used in the netwerk code base, hence I am moving the bug to that component.

@ed sternin, it would be very helpful if you could please post about:cache and about:profiles screenshots to further assist the investigation.

Flags: needinfo?(ed.sternin)
Component: Storage: Quota Manager → Networking: Cache

(In reply to Harveer Singh from comment #2)

@ed sternin, it would be very helpful if you could please post about:cache and about:profiles screenshots to further assist the investigation.

Clarifying this a little more, in general there are 2 systems of disk usage tracking that can be in play in a profile:

  1. The HTTP disk cache whose stats can be found by opening the about:cache page, and which potentially exists outside of the profile directory, but under the user's home directory.
  2. The Quota Manager-managed cache that uses the storage/ directory under a profile, and where about:profiles can be used to most easily locate the profile.

The main interest is to understand if it's primarily the HTTP cache that is causing problems here, or if bug 1735717 might also have impacted Quota Manager's storage calculations (and perhaps the HTTP cache's too?). It would also not be surprising for the browser to have a systemic issue understanding the NFS use case in question if the home directory isn't exposed as a mount with its size corresponding directly to the quota limit.

Whoops, I forgot my main point, which is that there can be some private information (such as the username) in the disk section of about:cache, so a screenshot is not necessary; it would be enough to characterize the disk limit in play and the amount of disk used. about:profiles is useful for finding the profile directory, so that the size of the storage/ directory and all of its contents can be checked.
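As a rough aid for collecting those numbers without screenshots, the size of a directory tree such as storage/ or cache2 can be summed with a few lines of Python. This is a sketch; `directory_size_kb` is a made-up helper, and it adds up apparent file sizes rather than allocated blocks, so its result can differ somewhat from `du -ks`.

```python
import os

def directory_size_kb(root):
    """Sum the apparent sizes of all regular files under `root`, in KB."""
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            try:
                # lstat so that symlinks are not followed
                total_bytes += os.lstat(full_path).st_size
            except OSError:
                # file vanished or is unreadable; skip it
                continue
    return total_bytes // 1024
```

The argument would be the storage/ directory under the profile's root directory, or the cache2 directory under its local directory, as listed by about:profiles.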

I have cleaned up and disabled the smart_size determination as in the original bug report, so I need to take some time to re-create the problem condition. Right now it looks like this, which at least tells you about the structure of the environment (and I do not mind revealing the username on this system :-) :

Default Profile yes
Root Directory /home/esternin/.mozilla/firefox/xxxxxxxx.default-1559224418245
Local Directory /home/esternin/.cache/mozilla/firefox/xxxxxxxx.default-1559224418245

$ du -ks /home/esternin/.mozilla/firefox/xxxxxxxxx.default-1559224418245/storage
24080	/home/esternin/.mozilla/firefox/xxxxxxxx.default-1559224418245/storage

$ du -ks /home/esternin/.cache/mozilla/firefox/xxxxxxxx.default-1559224418245/*
144	/home/esternin/.cache/mozilla/firefox/xxxxxxxx.default-1559224418245/activity-stream.discovery_stream.json
96	/home/esternin/.cache/mozilla/firefox/xxxxxxxx.default-1559224418245/activity-stream.tippytop.json
100	/home/esternin/.cache/mozilla/firefox/xxxxxxxx.default-1559224418245/activity-stream.topstories.json
209600	/home/esternin/.cache/mozilla/firefox/xxxxxxxx.default-1559224418245/cache2
7544	/home/esternin/.cache/mozilla/firefox/xxxxxxxx.default-1559224418245/OfflineCache
16916	/home/esternin/.cache/mozilla/firefox/xxxxxxxx.default-1559224418245/safebrowsing
24	/home/esternin/.cache/mozilla/firefox/xxxxxxxx.default-1559224418245/settings
12068	/home/esternin/.cache/mozilla/firefox/xxxxxxxx.default-1559224418245/startupCache
184	/home/esternin/.cache/mozilla/firefox/xxxxxxxx.default-1559224418245/thumbnails

$ df -h
Filesystem                            Size  Used Avail Use% Mounted on
...
xx.yy.zz.ca:/esternin           493G  427G   61G  88% /home/esternin
...

So it looks like /home/esternin/.cache/mozilla/firefox/xxxxxxxx.default-1559224418245/cache2 is where things go. I had set browser.cache.disk.capacity=25600 yesterday after a purge, and it looks like it's filling up.

Please, reset needinfo if this is not enough.

Flags: needinfo?(ed.sternin)

Oops, I meant browser.cache.disk.capacity=256000

Actually, I think bug 1735717 should help here, assuming it's working.
But without it and smart_size.enabled the HTTP cache code would just check the available disk size and usage, and use a max of 2GB of that.
Talking to janv, he said he didn't test with NFS.

(In reply to Valentin Gosu [:valentin] (he/him) from comment #7)

... cache code would just check the available disk size and usage

Then this is definitely a bug. "Usage" is not enough on shared disk spaces (which need not be NFS); the code must also check for a disk quota, if one is active.

and use a max of 2GB of that.

Does this mean 2GB is the default? This seems excessive - at some point, searching a large cache for a small bit might take longer than re-getting it from the network, and in case of NFS-mounted cache this will happen more quickly, so having a large cache would slow down the page load. It seems to me the logic of this should be re-thought.
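The quota check asked for above amounts to taking the smaller of two numbers. The Python sketch below shows only that arithmetic; it is not how Firefox computes anything, the function name `effective_free_bytes` is invented, and obtaining the quota limit and usage from the system is left out entirely.

```python
def effective_free_bytes(fs_available, quota_limit=None, quota_used=0):
    """Space a user can really still write, given an optional per-user quota.

    fs_available: free bytes reported for the whole file system
    quota_limit:  the user's quota in bytes, or None if no quota is active
    quota_used:   bytes already counted against that quota
    """
    fs_available = max(fs_available, 0)
    if quota_limit is None:
        return fs_available
    quota_left = max(quota_limit - quota_used, 0)
    return min(fs_available, quota_left)
```

With the approximate figures from this report (61G available on the array, a 2GB quota, about 1.9GB used), the function returns about 0.1GB, whereas the file system figure alone suggests 61G.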

The cache folder is not usually on a shared disk, so this is quite an edge case.
We do have a feature called race-cache-with-network - if the disk operations are slow, we'll use the network instead.
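The general idea of racing a slow disk read against the network can be sketched in Python as follows. This is a simplified illustration, not Firefox's implementation, which decides when to race using its own heuristics; `race_cache_with_network` and its two callables are hypothetical names.

```python
import asyncio

async def race_cache_with_network(read_cache, fetch_network):
    """Start both sources and return (source_name, data) from whichever finishes first.

    `read_cache` and `fetch_network` are zero-argument coroutine functions.
    If the first one to finish raised an exception, it propagates to the caller.
    """
    tasks = {
        asyncio.create_task(read_cache()): "cache",
        asyncio.create_task(fetch_network()): "network",
    }
    done, pending = await asyncio.wait(list(tasks), return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # the loser is no longer needed
    winner = next(iter(done))
    return tasks[winner], winner.result()
```

When the cache lives on a slow NFS mount, the network branch of such a race would tend to win more often.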

(In reply to Valentin Gosu [:valentin] (he/him) from comment #9)

The cache folder is not usually on a shared disk

I am genuinely curious how you would suggest we arrange our cluster of about 100 workstations. Currently, one can come to any of them, log in, and see the same home directory, which is an NFS mount from a central file server.

Cache, and settings, and everything personal is in those home directories. How would you move the cache to a local (non-networked) disk space, without losing the ability to log in on another workstation and keep the cache and its benefits? The assumption of everyone always using the same piece of hardware is somewhat outdated, it seems to me.

We do have a feature called race-cache-with-network - if the disk operations are slow, we'll use the network instead.

Oh, that's excellent. Anecdotal reports of a "huge performance boost by turning off memory and disk cache" made me think it was not there/not working.

The severity field is not set for this bug.
:jesup, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(rjesup)

Mike - How common a problem would you expect this to be? And do you know how easy it is to a) know it's on a shared drive, and b) get the user's quota? (This may vary by OS and file-sharing protocol...)
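On Linux, part a) can at least be approximated by looking up the file system type of the mount that holds the cache directory in /proc/mounts. The Python sketch below parses text in that format; the function names and the short list of network file system types are illustrative assumptions, not an exhaustive or authoritative set, and other platforms would need a different approach. Part b), reading the user's quota, is not covered.

```python
import posixpath

# Illustrative, non-exhaustive set of network file system type names.
NETWORK_FS_TYPES = {"nfs", "nfs4", "cifs", "smb3"}

def fstype_for_path(path, mounts_text):
    """Return the fs type of the longest mount point containing `path`.

    `mounts_text` is text in /proc/mounts format:
    device mount_point fs_type options dump pass
    """
    path = posixpath.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        # /proc/mounts escapes spaces in mount points as \040
        mount_point = fields[1].replace("\\040", " ")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            if len(mount_point) > len(best_mount):
                best_mount, best_type = mount_point, fields[2]
    return best_type

def is_on_network_fs(path, mounts_text):
    """True if `path` lives on one of the listed network file system types."""
    return fstype_for_path(path, mounts_text) in NETWORK_FS_TYPES
```

On a live system, `mounts_text` would be the contents of /proc/mounts.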

Flags: needinfo?(rjesup) → needinfo?(mozilla)

Mike - How common a problem would you expect this to be? And do you know how easy it is to a) know it's on a shared drive, and b) get the user's quota? (This may vary by OS and file-sharing protocol...)

I don't think I'm the right person to answer this question. I think we need to grab someone from platform that knows more about file systems.

Flags: needinfo?(mozilla)
Blocks: necko-cache
Severity: -- → S3
Priority: -- → P2
Whiteboard: [necko-triaged][necko-priority-review]
Whiteboard: [necko-triaged][necko-priority-review] → [necko-triaged]