+++ This bug was initially created as a clone of Bug #1120945 +++

Do manual testing of the patch in bug 1120945.
I did some basic manual testing on Linux with the profile placed on FAT32. What really needs to be tested (and is not clear from Jason's comment in bug 1120945) is whether this patch could cause a performance hit, since with this fix we do more IO almost every time we create a new entry once we reach the maximum-files-per-directory limit. OTOH, it is similar to what we do when the cache reaches its size limit. Anyway, I didn't test this because I don't think I would get representative numbers on Linux.
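To make the concern concrete, here is a toy sketch (plain Python, not actual Gecko code; all names are made up) of the behavior in question: once the on-disk file count hits the FAT32 per-directory limit, writing a new entry first evicts an old one, and that eviction is the extra IO.

```python
# Toy model, NOT Gecko code: illustrates the extra IO per new entry
# once the file-count limit is reached. The real FAT32 limit is on the
# order of tens of thousands of entries per directory; 4 is used here
# only to keep the example small.
FILE_COUNT_LIMIT = 4

class CacheIndex:
    def __init__(self, limit=FILE_COUNT_LIMIT):
        self.limit = limit
        self.files = []  # oldest first

    def write_entry(self, name):
        """Write a new entry; returns how many extra IO ops were needed."""
        extra_io = 0
        if len(self.files) >= self.limit:
            self.files.pop(0)    # evict the oldest entry: one extra delete IO
            extra_io += 1
        self.files.append(name)  # the write itself
        return extra_io
```

Below the limit a write costs no extra IO; at the limit, every write pays one extra delete, which is the same shape of cost as ordinary size-limit eviction.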
So, is anything coming out of your tests?
I didn't do any performance measurements. I just tested that we successfully cache new entries when we hit the limit on Linux. I assume it also works on Windows.
Honza, we'd like to see what the Windows perf is when we hit the FAT32 workaround, so we'll know whether the hack is good enough or we need to do more work (switch to a tree of directories, etc.). Could you try to get some data on one of your Windows boxes? It doesn't need to be something that gets checked into the tree.
Michal: IIRC the other idea we talked about at the meeting was to add some telemetry that splits out page load times and cache hit rate (and any other relevant cache perf metrics) just for FAT32 users, and splits those FAT32 metrics into "not at file count limit" vs "at file count limit" versions. I don't know how many FAT32 nightly users we have, but even if it's just a few, it's data. Should we split out a new bug for that?
The patch in bug #1120945 causes one additional IO operation when we reach the limit. But it cannot be measured using telemetry, because the IO is performed only during a write and it happens asynchronously in the background, so the original request is not affected at all. It could in theory affect subsequent requests, because the cache IO thread could be blocked a bit longer than without this patch. OTOH, with this patch, once we hit the file count limit we never reach the cache size limit, so we never run the overlimit eviction machinery, which could be a performance win in the end.

(In reply to Jason Duell [:jduell] (needinfo? me) from comment #5)
> Michal: IIRC the other idea we talked about at the meeting was to add some
> telemetry that splits out page load times and cache hit rate (and any other
> relevant cache perf metrics) just for FAT32 users, and splits those FAT32
> metrics into "not at file count limit" vs "at file count limit" versions. I
> don't know how many FAT32 nightly users we have, but even if it's just a
> few, it's data. Should we split out a new bug for that?

I see several problems with this approach. First, the number of cache files grows continuously up to the limit and never drops, so we would end up with much more data for the "at file count limit" metrics. Second, the "not at file count limit" metrics would contain all data from an empty cache up to an almost full cache; the mean would correspond to a half-full cache, which is not comparable to full-cache statistics. Also, there would be no metrics for the case when we run overlimit eviction.

To sum up all the above, I think there is probably no performance hit caused by the patch in bug #1120945. IMO we should only investigate what the average maximum cache size on FAT32 is, and that will be covered by bug #1128339.
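The sampling-bias argument above can be checked with simple arithmetic. This is a toy model (illustrative numbers, not real telemetry data): the file count grows monotonically to the limit and then stays there, so "not at limit" samples average about half the limit, while "at limit" samples dominate over the profile's lifetime.

```python
# Toy model of the telemetry-bias argument; LIMIT and LIFETIME are
# made-up numbers, not measurements. The cache fills during the first
# LIMIT time steps and is pinned at the limit afterwards.
LIMIT = 1000
LIFETIME = 10000  # total time steps sampled

fills = [min(t, LIMIT) for t in range(LIFETIME)]
not_at_limit = [f for f in fills if f < LIMIT]
at_limit = [f for f in fills if f == LIMIT]

mean_not_at_limit = sum(not_at_limit) / len(not_at_limit)
# mean fill while "not at limit" is ~LIMIT/2 (a half-full cache), and
# "at limit" samples vastly outnumber the "not at limit" ones
```

So the "not at file count limit" bucket would describe a half-full cache on average, and the "at limit" bucket would collect roughly an order of magnitude more data, exactly as argued above.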
Yes, I too think that hitting the FAT32 limit is effectively the same as hitting the common cache limit. It only blocks a write, which has a very low priority; the only effect (the same as when the limit is reached) is that the backlog could swell a bit. All the doom/write operations are granular enough that any read/open will proceed ASAP. Hence, I will just test that this works (as a double check) on an SD card.
Test setup and results:
- profile on SSD
- cache only on a slow card formatted as FAT32
- saw 13106 files at the start
- browsed a few websites
- the cache is clearly wiping older files
- no visible performance difference

This works well for me.