Closed Bug 787576 Opened 8 years ago Closed 8 years ago

Refine telemetry data for how much cache corruption reduction plan would help


(Core :: Networking: Cache, defect)

Not set





(Reporter: bbondy, Assigned: bbondy)



(Whiteboard: [Snappy])


(1 file, 1 obsolete file)

The telemetry for disk cache reduction plan introduced in bug 777328 is reporting some unexplained results.  I think the changes below will help diagnose what's going on.

In particular what's currently found is that:
OS: PercentageCorruptviaDirtyHeaderBit -> PercentageCorruptviaNewWay
on Windows: 11.68% -> 7.74% (moderate win)
on Linux: 21.7% -> 13.18% (significant win)
on OSX: 10.29% -> 9.91% (insignificant win)

2 observations:
1) I would have thought on Windows and OSX it would be a bigger win.
2) I have seen some individual days on OSX where the new way has more cache corruption than the old way, and that should be impossible since we revalidate the cache when closing the cache map.

I'd like to cleanup the following items to get better data:
- We should only report telemetry data on the first call to Open().  This is leading to 2 submitted values.  This is probably skewing the data to make it look like it's not helping as much as it really is.
- Calls to FlushHeader should only revalidate the cache when the dirty flag is not set
- I'd like to see if writing to the cache file ever fails, and report telemetry data on it.
- I'd like to see how often the doom list is causing problems
- I'd like to lower the timeout from 5s to 3s

I'm hoping the above changes will make the wins more significant, or point to another bug that will explain why some days the cache clean file method is reporting higher.
Attached patch Patch v1. (obsolete) — Splinter Review
Attachment #657464 - Flags: review?(michal.novotny)
Comment on attachment 657464 [details] [diff] [review]
Patch v1.

Review of attachment 657464 [details] [diff] [review]:

r+ with the below fixed

::: netwerk/cache/nsDiskCacheMap.cpp
@@ +261,5 @@
>  {
> +    // If we have a clean header then revalidate the cache clean file
> +    if (!mHeader.mIsDirty) {
> +        RevalidateCache();
> +    }

We should revalidate the cache after writing the header to the disk. In case of crash between RevalidateCache() and PR_Sync(mMapFD) the cache would be wrongly considered as consistent.
Attachment #657464 - Flags: review?(michal.novotny) → review+
Attached patch Patch v2.Splinter Review
Implemented change.
Thanks for the review.
Attachment #657464 - Attachment is obsolete: true
Attachment #658172 - Flags: review+
Closed: 8 years ago
Resolution: --- → FIXED
Interestingly, DISK_CACHE_REVALIDATION_SAFE is 0 80% of the time, so this explains why we aren't seeing as good of results as we would like.  Any initial thoughts on why the doom listi s so often on empty?
*is so often non empty
I looked into this by browsing around for a while with a breakpoint when that condition is hit, I finally reproduced it.  It seems like sometimes a doom list entry doesn't get freed.

In my case the key for that entry was:
{mData=0x136fb9a8 "HTTP:" mLength=25 mFlags=5 }

And you can get it by signing into twitter (you have to be signed out first).
Depends on: 789936
You need to log in before you can comment on or make changes to this bug.