Closed
Bug 754575
Opened 11 years ago
Closed 11 years ago
Cache.Trash* files fill up disk space
Categories
(Core :: Networking: Cache, defect)
Tracking
()
People
(Reporter: decoder, Assigned: mfinkle)
References
Details
Attachments
(4 files, 3 obsolete files)
8.72 KB,
patch
|
Details | Diff | Splinter Review | |
708 bytes,
text/plain
|
Details | |
4.49 KB,
patch
|
jduell.mcbugs
:
review+
akeybl
:
approval-mozilla-aurora+
akeybl
:
approval-mozilla-beta+
|
Details | Diff | Splinter Review |
1.59 KB,
patch
|
Details | Diff | Splinter Review |
While fuzzing on mobile (mozilla-central, debug build, rev 448f554f6acb) I keep seeing folders like "Cache.Trash821552478" in the profile directory, slowly filling up all disk space until I manually delete them. I don't know exactly what conditions lead to this but I had it a second time now that all the space on /data/ was completely exhausted by these. They seem to contain cache data: $ adb shell ls -l /data/data/org.mozilla.fennec_decoder/files/mozilla/szzixceq.default/Cache.Trash821552478/ -rw------- app_35 app_35 18998 1970-01-01 12:05 _CACHE_001_ drwx------ app_35 app_35 1970-01-01 12:04 C -rw------- app_35 app_35 276 1970-01-01 12:04 _CACHE_MAP_ drwx------ app_35 app_35 1970-01-01 12:04 2 drwx------ app_35 app_35 1970-01-01 12:04 9 drwx------ app_35 app_35 1970-01-01 12:04 0 drwx------ app_35 app_35 1970-01-01 12:04 F drwx------ app_35 app_35 1970-01-01 12:04 E drwx------ app_35 app_35 1970-01-01 12:04 A drwx------ app_35 app_35 1970-01-01 12:04 7 drwx------ app_35 app_35 1970-01-01 12:04 4 drwx------ app_35 app_35 1970-01-01 12:04 8 drwx------ app_35 app_35 1970-01-01 12:04 6 -rw------- app_35 app_35 4096 1970-01-01 12:05 _CACHE_002_ drwx------ app_35 app_35 1970-01-01 12:04 D drwx------ app_35 app_35 1970-01-01 12:04 B drwx------ app_35 app_35 1970-01-01 12:04 1 drwx------ app_35 app_35 1970-01-01 12:04 3 drwx------ app_35 app_35 1970-01-01 12:04 5 -rw------- app_35 app_35 4183 1970-01-01 12:05 _CACHE_003_
Weird. The date is 01-01-1970? What's the date on your device set to? What device is this?
Reporter | ||
Comment 2•11 years ago
|
||
(In reply to Naoki Hirata :nhirata from comment #1) > Weird. The date is 01-01-1970? What's the date on your device set to? > What device is this? These are Tegra boards. I cannot reach them right now due to some IT failure in MV, but I'll check their date once I have access again.
Reporter | ||
Comment 3•11 years ago
|
||
(In reply to Naoki Hirata :nhirata from comment #1) > What's the date on your device set to? It's indeed 01-01-1970. But I guess thats normal for the Tegra boards.
I wonder if it's some sort of y2k or some similar issue? Could you try setting the device to be the current date/time and see what occurs after some time please?
Reporter | ||
Comment 5•11 years ago
|
||
(In reply to Naoki Hirata :nhirata from comment #4) > I wonder if it's some sort of y2k or some similar issue? Could you try > setting the device to be the current date/time and see what occurs after > some time please? I can try that later but I doubt it has something to do with date/time at all. I encounter this mainly during minimization of testcases, that means the browser crashes a lot. I could imagine that some cleanup is done on regular exit of Fennec, that is of course not done when it crashes. And starting up lots of times and not exiting regularly might leave something over.
On my device Firefox Beta is eating the storage memory till i get a memory warning and my phone is getting really slow. When i see in the app manager it's shown that "data" is growing, the "cache" is still empty (0).
Reporter | ||
Comment 7•11 years ago
|
||
(In reply to david from comment #6) > On my device Firefox Beta is eating the storage memory till i get a memory > warning and my phone is getting really slow. > When i see in the app manager it's shown that "data" is growing, the "cache" > is still empty (0). Is your device rooted so you could peak into the profile directory using ADB?
Yes, it is rooted, running with CM7. With ADB you mean the Android Debug Bridge? I don't have it installed. Is this quickly done on Ubuntu?
Reporter | ||
Comment 9•11 years ago
|
||
(In reply to david from comment #8) > Yes, it is rooted, running with CM7. > With ADB you mean the Android Debug Bridge? I don't have it installed. Is > this quickly done on Ubuntu? It should be, yes (I'm using Ubuntu as well and I remember it being really easy here). I think you need the Android SDK available at http://developer.android.com/sdk/index.html . Once you have that installed, just connect your device with an USB cable to your computer and using the "adb" command you should be able to run shell commands on the device (e.g. "adb shell ls").
![]() |
||
Comment 10•11 years ago
|
||
A couple of things to watch for: - the cache is typically invalidated and trashed (moved to Cache.Trashxxx) on the first run after a crash - the Cache.Trashxxx file is subsequently deleted only after a delay (60 seconds? I can't quite remember) and on some devices, the deletion may take some time (an additional 60s or so) I sometimes see an accumulation of Cache.Trash directories when testing, because I crash repeatedly, and Fennec doesn't get a chance to clean up.
Comment 12•11 years ago
|
||
There's a dump of the profile directory from another user who's experiencing this: https://bugzilla.mozilla.org/show_bug.cgi?id=757385
Reporter | ||
Comment 13•11 years ago
|
||
(In reply to Geoff Brown [:gbrown] from comment #10) > A couple of things to watch for: > - the cache is typically invalidated and trashed (moved to Cache.Trashxxx) > on the first run after a crash > - the Cache.Trashxxx file is subsequently deleted only after a delay (60 > seconds? I can't quite remember) and on some devices, the deletion may take > some time (an additional 60s or so) That might explain it. I'm encountering this bug mainly during minimization of test cases which involves many short runs which often crash. I worked around this by having our harness delete the files before each run.
Assignee | ||
Comment 14•11 years ago
|
||
Nom'd for blocking since this seems to be happening in the field
Assignee | ||
Updated•11 years ago
|
blocking-fennec1.0: --- → ?
Assignee | ||
Comment 15•11 years ago
|
||
Looks like we wait 90 seconds before starting the deletion: http://mxr.mozilla.org/mozilla-central/source/netwerk/cache/nsDeleteDir.cpp#340 I'm not really sure what the reason for not removing the trash folders. Is 90 seconds too long? We have telemetry data on the times required to delete the trash folders: https://metrics.mozilla.com/data/content/pentaho-cdf-dd/Render?solution=metrics2&path=%2Ftelemetry&file=telemetryEvolution.wcdf&bookmarkState={%22impl%22%3A%22client%22%2C%22params%22%3A{%22appNameParameter%22%3A%22[Application].[Fennec]%22%2C%22channelParameter%22%3A%22[Channel].[beta]%22%2C%22osParameter%22%3A%22[OS].[Android]%22%2C%22histogramParameter%22%3A%22[Histogram].[NETWORK_DISK_CACHE_TRASHRENAME]%22%2C%22referenceDate%22%3A%222012-05-22%22%2C%22histogramPopupTools%22%3A%22%22%2C%22duplicateHistogram%22%3A%22%23duplicateHistogram%22%2C%22medianButtonParam%22%3A0%2C%22scatterChart%22%3A%22%22}}
Assignee | ||
Comment 16•11 years ago
|
||
Whoops. That link is for the Cache -> Cache.Trash### rename. This is the folder deletion: https://metrics.mozilla.com/data/content/pentaho-cdf-dd/Render?solution=metrics2&path=%2Ftelemetry&file=telemetryEvolution.wcdf&bookmarkState={%22impl%22%3A%22client%22%2C%22params%22%3A{%22appNameParameter%22%3A%22[Application].[Fennec]%22%2C%22channelParameter%22%3A%22[Channel].[beta]%22%2C%22osParameter%22%3A%22[OS].[Android]%22%2C%22histogramParameter%22%3A%22[Histogram].[NETWORK_DISK_CACHE_DELETEDIR]%22%2C%22referenceDate%22%3A%222012-05-22%22%2C%22histogramPopupTools%22%3A%22%22%2C%22duplicateHistogram%22%3A%22%23duplicateHistogram%22%2C%22medianButtonParam%22%3A0%2C%22scatterChart%22%3A%22%22}}
Comment 17•11 years ago
|
||
Geoff, we were discussing putting these files in the android cache dir such that android can clean them up on its own if we're running out of disk space.
Assignee: nobody → gbrown
blocking-fennec1.0: ? → +
![]() |
||
Comment 18•11 years ago
|
||
(In reply to Brad Lassey [:blassey] from comment #17) > Geoff, we were discussing putting these files in the android cache dir such > that android can clean them up on its own if we're running out of disk space. That's bug 742558. Firefox likes to manage its own cache; there's currently no allowance made for an outside entity cleaning it up, so it's not just a simple matter of changing the directory.
Comment 19•11 years ago
|
||
moving the trash files should be relatively safe, no?
![]() |
||
Comment 20•11 years ago
|
||
Oh, sorry, I misinterpreted. Moving an invalid cache to the Android cache directory for future deletion seems safe and easy.
Assignee: gbrown → nobody
Component: General → Networking: Cache
Product: Fennec Native → Core
QA Contact: general → networking.cache
![]() |
||
Comment 21•11 years ago
|
||
I tested cache invalidation and trash cleanup and found everything working as designed: - max 20 MB (approx) disk cache on Android - disk cache invalidated when Fennec crashes or is killed - robocop tests always kill Fennec when test completes - Cache moved to Cache.Trash when invalid - Cache.Trash deleted in background after 90 s delay Moving the trash destination to the Android application cache directory has the advantage that the OS should clean up the .Trashes for us when free space is low. The OS cleaning also seems to happen after a delay, so we might not improve on this scenario much.
Comment 22•11 years ago
|
||
Another user reporting running out of disk space: https://support.mozilla.org/en-US/questions/928351
![]() |
||
Updated•11 years ago
|
Assignee: nobody → gbrown
![]() |
||
Comment 23•11 years ago
|
||
This patch changes behavior on Android such that the disk cache "trash dir" is now a sub-directory of the Android cache dir. Currently we move an invalid Cache from, say: /data/data/org.mozilla.fennec/files/mozilla/xxx.default/Cache to: /data/data/org.mozilla.fennec/files/mozilla/xxx.default/Cache.TrashNNNN With this patch, we move from /data/data/org.mozilla.fennec/files/mozilla/xxx.default/Cache to: /data/data/org.mozilla.fennec/cache/Cache.TrashNNNN The advantage is that, if the volume is low on space, Android may delete the trash folder itself, so that space is reclaimed even if Fennec has exited. This is a first draft/wip patch. It seems to work for me but is very rough/inelegant and needs testing. I am on PTO until June 11; feel free to take this bug!
Comment 25•11 years ago
|
||
Comment on attachment 628532 [details] [diff] [review] wip patch Review of attachment 628532 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/cache/nsICacheService.idl @@ +58,5 @@ > * Event target which is used for I/O operations > */ > readonly attribute nsIEventTarget cacheIOTarget; > + > + attribute nsIFile cacheTrashDir; Well, it's a gross thing to have an IDL be platform-specific, but we need to make sure this never, ever gets used on an NTFS-based filesystem (windows): if we try to move Cache to a different directory (rather than renaming it in the current directory) on NTFS the rename (which we block browser startup on) takes as long as actually deleting everything (due to traversing all files to update NTFS permissions: see bug 670911). That could be fixed by changing our underlying NTFS file code to have a flag that allows moving w/o the check, but you don't want to go there for this bug. I'd be fine with having SetCacheDir simply throw if you try to use it on NTFS. Alas, right now we don't have a way to determine is an nsIFile is on NTFS. bbondy tells me we could use ctypes to access the win32 API: "You could also simply add an attribute to nsILocalFileWin.idl which wraps a call to GetVolumeInformation and checks for NTFS easily. http://msdn.microsoft.com/en-us/library/windows/desktop/aa364993%28v=vs.85%29.aspx" But given that this is android-specific for now, and we're in a hurry (I am: are you? ;) I suggest we simply QI the nsIFile passed into SetCacheTrashDir and throw if it's a nsILocalFileWin. Make sure to add a comment that this is just to prevent NTFS usage, and that we can write a more precise test for that if we ever need one (i.e. we add Windows phone support, or something like that), and/or tweak the NFTS rename code to not do checks. Cite bug 670911 in the comment.
Comment 26•11 years ago
|
||
I don't expect we want to use ctypes, but in case we do, bbondy kindly gave me this C code that does the NTFS check that we could base it on.
Comment 27•11 years ago
|
||
So, given that this is ifdef'd to Android, I don't think we need to care too much about NTFS. However, this is not the first time that we've wanted to change behavior based on the file system (see: https://mxr.mozilla.org/mozilla-central/source/xpcom/io/nsLocalFileUnix.cpp#1072) so we may want to add a method to nsIFile to get that information in the future.
Assignee | ||
Comment 28•11 years ago
|
||
Could we just use the Directory Service to return a new directory for the Cache Trash folder? Instead of adding IDL for a specific folder property?
Assignee | ||
Comment 29•11 years ago
|
||
This patch simple puts the Android cache folder into the env and grabs it in C++. We do the same thing for Downloads: http://mxr.mozilla.org/mozilla-central/source/toolkit/components/downloads/nsDownloadManager.cpp#1182 The patch has some debugging crap in it. I am trying to verify the Trash folder is created and I am having a hard time.
Assignee | ||
Comment 30•11 years ago
|
||
I have verified the Android Cache folder is correctly received in DeleteDir, but that's as far as I have got.
Assignee | ||
Comment 31•11 years ago
|
||
Fallback to current behavior if no android cache folder is found, less debugging and XUL bustage fix. I sent this to Try server: https://tbpl.mozilla.org/?tree=Try&rev=89444a668270
Attachment #630726 -
Attachment is obsolete: true
Assignee | ||
Comment 32•11 years ago
|
||
Stealing from Gian-Carlo for now
Assignee: gpascutto → mark.finkle
Assignee | ||
Comment 33•11 years ago
|
||
The patch ran fine on Try, so asking for reviews.
Attachment #631033 -
Attachment is obsolete: true
Attachment #631130 -
Flags: review?(blassey.bugs)
Assignee | ||
Updated•11 years ago
|
Attachment #631130 -
Flags: review?(jduell.mcbugs)
Comment 34•11 years ago
|
||
Comment on attachment 631130 [details] [diff] [review] alt patch Review of attachment 631130 [details] [diff] [review]: ----------------------------------------------------------------- I'm surprised this works (do you know for sure it does?), for the following reason: http://mxr.mozilla.org/mozilla-central/source/netwerk/cache/nsDeleteDir.cpp#243 When we changed the code to always rename Cache -> Cache.trashRANDOM__NUMBER within the same directory (to avoid NTFS nastiness), we changed the MoveToNative() call to pass null for the "new directory" parameter. And we don't even wind up using the full 'trash' directory for the rename, but just the leafName. Poking through the MovetoNative code for Unix filesystems, I see we wind up getting the file's original directory for 'newdir' when it's passed a blank. So I strongly suspect that you're going to the trouble of getting a new trash directory name ("/foo/CACHE_DIRECTORY/Cache.trash"), but then we're grabbing just "Cache.trash" from it (in nsDeleteDir.cpp:214), appending a random number to it, then passing it to MovetoNative, where it winds up living in the same directory it started, not the one you assigned in GetTrashDir. If I'm right about this, the easiest fix is probably to keep your existing logic, but also compare 'dirIn.parent' with 'Trashdir.parent' right before the MoveToNative call: targetDir = nsnull; #if ANDROID if (dirIn.parent != TrashDir.parent) targetDir = Trashdir.parent #endif MoveToNative(targetDir, leaf); It's vital that we pass nsnull to MoveToNative in the NTFS case (we only skip the GUI-hanging ACL checks if the param is null), so it won't work to always simply pass TrashDir.parent as the first param. Sorry this code has gotten so tricky to navigate. ::: netwerk/cache/nsDeleteDir.cpp @@ +268,5 @@ > + char* cachePath = getenv("CACHE_DIRECTORY"); > + if (cachePath) { > + rv = NS_NewNativeLocalFile(nsDependentCString(cachePath), > + true, getter_AddRefs(*result)); > + Not checking rv here. Guard the rest of the logic in this branch with an if(NS_SUCCEEDED(rv)), and the if NS_FAILED after the #endif will handle the failure case.
Attachment #631130 -
Flags: review?(jduell.mcbugs) → review-
Comment 35•11 years ago
|
||
Comment on attachment 631130 [details] [diff] [review] alt patch Review of attachment 631130 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/cache/nsDeleteDir.cpp @@ +282,2 @@ > nsresult rv = target->Clone(getter_AddRefs(*result)); > +#endif I'd rather have this as: nsresult rv; #ifdef MOZ_WIDGET_ANDROID blah blah } else #endf { rv = target->Clone(getter_AddRefs(*result)); } just so that if that default behavior ever changes, we don't miss the change on android
Attachment #631130 -
Flags: review?(blassey.bugs) → review+
Assignee | ||
Comment 36•11 years ago
|
||
Thanks a ton for the explanation Jason! This patch takes parts of Geoff's DeleteDir WIP patch, the parts Jason warned me about. It also makes the change Brad asked for regarding the fallback. I tested this on my Nexus S. I loaded a page and killed Fennec quickly a few times. It created 2 Cache.Trash#### folders in the Android cache folder. I then let the app run for a few minutes and the Cache.Trash#### folders were removed by Fennec.
Attachment #631130 -
Attachment is obsolete: true
Attachment #631426 -
Flags: review?(jduell.mcbugs)
Updated•11 years ago
|
Whiteboard: [has new patch][needs review jduell]
Updated•11 years ago
|
Attachment #631426 -
Flags: review?(jduell.mcbugs) → review+
Comment 37•11 years ago
|
||
mfinkle: please make the comment change in this patch (just apply on top of your patch) before you land.
Assignee | ||
Comment 38•11 years ago
|
||
landed with the comment fix: https://hg.mozilla.org/mozilla-central/rev/44777d90bbd6
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Assignee | ||
Updated•11 years ago
|
Whiteboard: [has new patch][needs review jduell]
Assignee | ||
Comment 39•11 years ago
|
||
Comment on attachment 631426 [details] [diff] [review] alt patch v2 [Approval Request Comment] Bug caused by (feature/regressing bug #): none User impact if declined: unused cache files can fill up the device Testing completed (on m-c, etc.): just landed on m-c... needs to bake Risk to taking this patch (and alternatives if risky): low to med risk, but bake time should give confidence. String or UUID changes made by this patch: none
Attachment #631426 -
Flags: approval-mozilla-beta?
Attachment #631426 -
Flags: approval-mozilla-aurora?
Comment 40•11 years ago
|
||
Comment on attachment 631426 [details] [diff] [review] alt patch v2 [Triage Comment] Given the medium risk profile, we shouldn't land in our last beta next week. Additionally, giving this one day of bake time probably won't get us the feedback we need. If we're going to fix before release, let's fix now. Approved for branches.
Attachment #631426 -
Flags: approval-mozilla-beta?
Attachment #631426 -
Flags: approval-mozilla-beta+
Attachment #631426 -
Flags: approval-mozilla-aurora?
Attachment #631426 -
Flags: approval-mozilla-aurora+
Comment 41•11 years ago
|
||
(In reply to Alex Keybl [:akeybl] from comment #40) > Comment on attachment 631426 [details] [diff] [review] > alt patch v2 > > [Triage Comment] > Given the medium risk profile, we shouldn't land in our last beta next week. > Additionally, giving this one day of bake time probably won't get us the > feedback we need. If we're going to fix before release, let's fix now. > Approved for branches. (I'll note that this approval is based upon the fact that we think that the worst regression this could cause would be a startup time regression)
Comment 42•11 years ago
|
||
This landed on Aurora and Beta but was backed out because of a build error: https://hg.mozilla.org/releases/mozilla-aurora/rev/9d3ca1eaa001 https://hg.mozilla.org/releases/mozilla-beta/rev/86d23e17bd96 https://tbpl.mozilla.org/php/getParsedLog.php?id=12566994&tree=Mozilla-Aurora /builds/slave/m-aurora-andrd-xul/build/netwerk/cache/nsDeleteDir.cpp:279: error: cannot convert 'nsGetterAddRefs<nsIFile>' to 'nsILocalFile**' for argument '3' to 'nsresult NS_NewNativeLocalFile_P(const nsACString_internal&, bool, nsILocalFile**)'
Assignee | ||
Comment 43•11 years ago
|
||
The patch needed a tweak to handle the nsILocalFile -> nsIFile changes on aurora and beta: https://hg.mozilla.org/releases/mozilla-aurora/rev/e135fca7989a https://hg.mozilla.org/releases/mozilla-beta/rev/ed059049854e
status-firefox16:
--- → fixed
Target Milestone: mozilla16 → ---
You need to log in
before you can comment on or make changes to this bug.
Description
•