Closed
Bug 1006181
Opened 11 years ago
Closed 11 years ago
HTTP cache v2 memory only storage area not freed on limit
Categories
(Core :: Networking: Cache, defect)
Tracking
RESOLVED
FIXED
mozilla32
 | Tracking | Status
---|---|---
firefox31 | --- | unaffected
firefox32 | --- | fixed
b2g-v1.4 | --- | unaffected
People
(Reporter: mayhemer, Assigned: mayhemer)
References
Details
(Keywords: crash, memory-leak, Whiteboard: [MemShrink:P1])
Crash Data
Attachments
(4 files)
1.62 KB, patch, michal: review+ | Details | Diff | Splinter Review
132.19 KB, application/gzip | Details
6.65 KB, image/png | Details
1.01 KB, patch, michal: review+ | Details | Diff | Splinter Review
...or the limit is computed wrong.
Based on http://forums.mozillazine.org/viewtopic.php?p=13519625#p13519625 and http://imgur.com/QZ5pr1C, where the memory-only storage reports 10 GB.
Updated•11 years ago
Whiteboard: [MemShrink]
Assignee
Updated•11 years ago
Status: NEW → ASSIGNED
Assignee
Comment 1•11 years ago
- this is actually a long-standing bug!
- entries recreated as 'memory-only' (inhibit caching or ssl) did not have their frecency updated (mFrecency was left at 0)
- such entries are never purged, because they are considered to still be in the process of opening, and purging them that soon would be wrong
- fixed by calling NewWriteHandle() on the recreated entry and returning its result (see the sketch after this list)
- NewWriteHandle() sets the handle as mWriter (which is absolutely correct, see OnHandleClosed for the effect) and also updates the frecency automatically
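To make the mechanism concrete, here is a toy model written for this write-up (not the actual cache2 code; Entry and PurgeOverLimit are invented names): a purge pass that skips frecency-0 entries as "still opening" can never shrink a pool full of recreated memory-only entries, no matter how far over its limit it grows.

```cpp
// Toy model of the purge behavior described above; not Mozilla code.
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

struct Entry {
  uint32_t frecency;  // stays 0 until the write-handle bookkeeping runs
  size_t size;        // bytes held in the memory pool
};

// Evict purgeable entries until the pool fits the limit; returns bytes freed.
size_t PurgeOverLimit(std::vector<Entry>& pool, size_t limit) {
  size_t used = 0;
  for (const Entry& e : pool) used += e.size;

  size_t freed = 0;
  for (auto it = pool.begin(); it != pool.end() && used > limit;) {
    if (it->frecency == 0) {  // treated as "in progress of opening": never purged
      ++it;
      continue;
    }
    used -= it->size;
    freed += it->size;
    it = pool.erase(it);
  }
  return freed;
}

int main() {
  // 100 recreated memory-only entries of 1 MiB each, all left at frecency 0.
  std::vector<Entry> pool(100, Entry{0, 1 << 20});
  std::cout << "freed " << PurgeOverLimit(pool, 32u << 20) << " bytes\n";  // prints 0
}
```

With the fix, the recreated entry gets a write handle immediately, so its frecency is non-zero and the normal purge pass can evict it.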
Attachment #8417733 - Flags: review?(michal.novotny)
Assignee
Comment 2•11 years ago
Comment 3•11 years ago
Wow. How often does this occur in practice -- what kind of usage patterns would cause it? I guess it can't be happening too often; if lots of users were seeing behaviour that bad we'd hear more about it...?
Comment 4•11 years ago
(In reply to Nicholas Nethercote [:njn] from comment #3)
> Wow. How often does this occur in practice -- what kind of usage patterns
> would cause it? I guess it can't be happening too often; if lots of users
> were seeing behaviour that bad we'd hear more about it...?
I can only speak from my own experience (a screenshot of my memory usage is included in the top comment).
It seems to be explicitly cache2 related, so I'd imagine it's limited to Nightly users and anyone else who has flipped the pref for it, so your general Firefox users aren't affected (yet). It has been happening for ages and actually caused me to move from the 32-bit to the 64-bit build just to make it take longer to bug me. Let Netflix run on a machine unfettered for a few hours and you should see something similar.
Comment 5•11 years ago
http://forums.mozillazine.org/viewtopic.php?f=23&t=2799341#p13349873
When this problem presents in the 32-bit version, the application stutters and freezes, then goes OOM and crashes. When I first started running into it (above, though I may have mentioned it in an earlier post), cache2 was not yet recognized by the memory profiler, so it's been a bug since at least February.
Comment 6•11 years ago
> - this is actually a long standing bug!
Ah, I interpreted this as meaning that it pre-dates cache2, but judging from the patch that's not true. So IIUC it's not in any Firefox releases. Still, good to fix! (And Honza gets a gold star for implementing those memory reporters for cache2 :)
Assignee
Comment 8•11 years ago
(In reply to Nicholas Nethercote [:njn] from comment #6)
> > - this is actually a long standing bug!
>
> Ah, I interpreted this as meaning that it pre-dates cache2, but judging from
> the patch that's not true. So IIUC it's not in any Firefox releases. Still,
> good to fix! (And Honza gets a gold star for implementing those memory
> reporters for cache2 :)
:*)
Yep, just a cache2 bug, and the reporters help here. This is actually a pretty old bug in this new code, but surprisingly it never happened to me or anyone else... without the reporters we wouldn't have found out!
Comment 10•11 years ago
Comment on attachment 8417733 [details] [diff] [review]
v1
Review of attachment 8417733 [details] [diff] [review]:
-----------------------------------------------------------------
::: netwerk/cache2/CacheEntry.cpp
@@ +465,5 @@
> + // fails and also doesn't update the entry frecency. Not updating
> + // frecency causes entries to not be purged from our memory pools.
> + nsRefPtr<CacheEntryHandle> writeHandle =
> + newEntry->NewWriteHandle();
> + return writeHandle.forget();
Where do we replace the normal handle with a write handle in the case where CacheStorage::AsyncOpenURI() is called with the OPEN_TRUNCATE flag? It seems to me that it would be better to create the write handle in CacheStorageService::AddStorageEntry().
Updated•11 years ago
Flags: needinfo?(honzab.moz)
Assignee
Comment 11•11 years ago
(In reply to Michal Novotny (:michal) from comment #10)
> Comment on attachment 8417733 [details] [diff] [review]
> v1
>
> Review of attachment 8417733 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: netwerk/cache2/CacheEntry.cpp
> @@ +465,5 @@
> > + // fails and also doesn't update the entry frecency. Not updating
> > + // frecency causes entries to not be purged from our memory pools.
> > + nsRefPtr<CacheEntryHandle> writeHandle =
> > + newEntry->NewWriteHandle();
> > + return writeHandle.forget();
>
> Where we replace the normal handle with write handle in case
> CacheStorage::AsyncOpenURI() is called with OPEN_TRUNCATE flag?
It doesn't work like that. There is also no concept of "replacing a handle with a write handle" at all.
We create a write handle when we give the entry to a callback that is going to write it for the first time, or when revalidating it (where revalidation also covers a range request for a partial entry).
> It seems to
> me that it would be better to create the write handle in
> CacheStorageService::AddStorageEntry().
Definitely not. The handle that is returned must not act as a writer. It's there just to prevent purging the entry from the memory pool. The writer is created later, when there is a callback that will act as a writer, as described above. The state of the callback is examined much later (in onCacheEntryCheck). This already works pretty well, no need to change it.
The role of the write handle is to revert the internal state of the entry back to the previous one, REVAL->READY and WRITING->EMPTY, and to invoke any pending callbacks, when the consumer of the entry didn't do its job (no call to MetaDataReady(), SetValid() or OpenOutputStream()) and just threw the entry away.
Hence I don't see a point in returning a write handle from CacheStorageService::AddStorageEntry() at all.
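A rough sketch of that revert behavior, with invented names for illustration (only the state names REVAL/READY/WRITING/EMPTY and the consumer calls come from the comment above; the struct, fields and method below are not the actual CacheEntry/OnHandleClosed implementation):

```cpp
// Illustrative only: simplified model of what closing an unused write handle does.
enum class EntryState { EMPTY, WRITING, READY, REVALIDATING };

struct CacheEntrySketch {
  EntryState state = EntryState::EMPTY;
  bool consumerDidWork = false;  // set by MetaDataReady()/SetValid()/OpenOutputStream()

  // Called when the write handle goes away without the consumer finishing its job.
  void OnWriteHandleClosed() {
    if (consumerDidWork) {
      return;  // the entry moved forward normally; nothing to revert
    }
    // Revert to the previous state so pending callbacks get their turn.
    if (state == EntryState::REVALIDATING) {
      state = EntryState::READY;   // REVAL -> READY
    } else if (state == EntryState::WRITING) {
      state = EntryState::EMPTY;   // WRITING -> EMPTY
    }
    // ...and invoke any pending callbacks here.
  }
};
```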
Flags: needinfo?(honzab.moz)
Updated•11 years ago
Attachment #8417733 - Flags: review?(michal.novotny) → review+
Assignee
Comment 12•11 years ago
Comment 13•11 years ago
I'd like to understand how netflix manages to store 10GB worth of stuff when we allegedly enforce a 5MB max object size in the memory cache (so I'd assume large video loads wouldn't get stored). Honza, we already talked about this but I wanted to make sure we don't forget to figure out why, so I'm needinfo-ing you :)
Flags: needinfo?(honzab.moz)
Comment 14•11 years ago
Netflix stronk!
For real though, I couldn't tell you. If you check the dup, bug 1006263, I uploaded a memory report there that might tell you something. I can re-upload it here if it's relevant/useful at all.
Comment 15•11 years ago
(In reply to Jason Duell (:jduell) from comment #13)
> I'd like to understand how netflix manages to store 10GB worth of stuff when
> we allegedly enforce a 5MB max object size in the memory cache (so I'd
> assume large video loads wouldn't get stored). Honza, we already talked
> about this but I wanted to make sure we don't forget to figure out why, so
> I'm needinfo-ing you :)
I think you were talking about it with me. We check whether the entry is too big only in CacheEntry::SetPredictedDataSize(), which is called from nsHttpChannel::CallOnStartRequest(). When the content length is unknown (-1), we always cache the entry. I think we should check the entry's size also in CacheFileIOManager::WriteInternal().
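A minimal sketch of that gap, under the assumptions above (the constant and function name are made up for illustration; the real check lives in CacheEntry::SetPredictedDataSize(), and the 5 MB figure is the one quoted in comment 13):

```cpp
// Illustrative only: an unknown Content-Length (-1) bypasses the size check entirely.
#include <cstdint>

constexpr int64_t kMaxMemoryEntrySize = 5 * 1024 * 1024;  // ~5 MB cap per entry

bool AcceptIntoMemoryCache(int64_t predictedDataSize) {
  if (predictedDataSize < 0) {
    // Chunked transfer encoding: length unknown, so the entry is accepted
    // and can grow to any size while it is being written.
    return true;
  }
  return predictedDataSize <= kMaxMemoryEntrySize;
}
```

Re-checking the accumulated size while data is actually being written, as proposed here for CacheFileIOManager::WriteInternal(), would close that hole.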
Assignee
Comment 16•11 years ago
(In reply to Michal Novotny (:michal) from comment #15)
> I think we should check the entry's size also in
> CacheFileIOManager::WriteInternal().
Filed bug 1007071 on it.
Flags: needinfo?(honzab.moz)
Assignee
Comment 17•11 years ago
(In reply to Jason Duell (:jduell) from comment #13)
> I'd like to understand how netflix manages to store 10GB worth of stuff when
> we allegedly enforce a 5MB max object size in the memory cache (so I'd
> assume large video loads wouldn't get stored). Honza, we already talked
> about this but I wanted to make sure we don't forget to figure out why, so
> I'm needinfo-ing you :)
I don't have access to Netflix (location). As Michal has already answered in comment 15, when the transfer encoding is chunked we don't restrict the size per entry. In conjunction with this bug, such entries are left there forever.
I think it's worth another try after the patch here gets to Nightly.
Comment 18•11 years ago
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
Comment 19•11 years ago
Do patches land on the x64 build at the same time as the 32-bit one? Because I'm still seeing this behavior.
Comment 20•11 years ago
Updated•11 years ago
QA Whiteboard: [@ NS_ABORT_OOM(unsigned int) | PL_DHashTableInit(PLDHashTable*, PLDHashTableOps const*, void*, unsigned int, unsigned int) | nsCycleCollector::BeginCollection(ccType, nsICycleCollectorListener*)]
Assignee
Comment 21•11 years ago
Robert, can you please recheck with the latest Nightly and if it still leaks tell us what you get at about:buildconfig? Thanks!
Flags: needinfo?(rjward0)
Comment 22•11 years ago
Appears to still be happening. Session hasn't been open too long yet, but currently seeing
2,361.05 MB (100.0%) -- explicit
├──1,420.64 MB (60.17%) -- network
│ ├──1,419.88 MB (60.14%) ++ cache2
build config:
Build Machine: B-2008-IX-0007
Source: built from https://hg.mozilla.org/mozilla-central/rev/4b6d63b05a0a
Build platform target: x86_64-pc-mingw32
Build tools (compiler, version, flags):
cl 16.00.30319.01 -TC -nologo -W3 -Gy -wd4244 -wd4819 -we4553
cl 16.00.30319.01 -wd4099 -TP -nologo -W3 -Gy -wd4251 -wd4244 -wd4345 -wd4351 -wd4482 -wd4800 -wd4819 -we4553 -GR- -DNDEBUG -DTRIMMED -Zi -UDEBUG -DNDEBUG -O1 -Oy-
Configure arguments:
--target=x86_64-pc-mingw32 --host=x86_64-pc-mingw32 --enable-crashreporter --enable-release --enable-update-channel=nightly --enable-update-packaging --enable-jemalloc --with-google-api-keyfile=/c/builds/gapi.data --enable-signmar --enable-profiling --enable-js-diagnostics
Flags: needinfo?(rjward0)
Assignee
Comment 23•11 years ago
Thanks, the patch is definitely there. Can you please look at about:cache and see how many entries and of what size are in the memory-only cache?
It might be we also need bug 1007071 here. Thanks.
Flags: needinfo?(rjward0)
Comment 24•11 years ago
http://paste2.org/0pj9M2mU
memory
Number of entries: 3210
Maximum storage size: 32768 KiB
Storage in use: 3400511 KiB
Storage disk location: none, only stored in memory
average entry size: 1082499.17755231 bytes (1.03MB)
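For scale, a quick check of these numbers (editor's arithmetic, not part of the original report): 3,400,511 KiB in use against a 32,768 KiB maximum means the memory pool is roughly 104 times over its configured cap, and 3,400,511 KiB across 3,210 entries works out to about 1,059 KiB ≈ 1.03 MiB per entry, consistent with the reported average.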
Flags: needinfo?(rjward0)
Assignee
Comment 25•11 years ago
Reopening. Easily reproducible locally with a large PHP-generated file. The problem is that the CacheFile::IsWriteInProgress condition is wrong: mMetadata->IsDirty() is left at |true| forever on memory-only entries.
I will add a test this time.
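A rough sketch of the broken predicate, with hypothetical field names (only CacheFile::IsWriteInProgress and mMetadata->IsDirty() come from the comment above; the struct and members below are illustrative, not the real ones):

```cpp
// Illustrative only: for memory-only entries the metadata is never flushed,
// so the dirty-metadata term keeps the predicate true forever and the entry
// again looks like an in-progress write that must not be purged.
struct CacheFileSketch {
  bool mMemoryOnly = true;
  bool mDataWriteInFlight = false;
  bool mMetadataDirty = true;  // never cleared: a memory-only file has nothing to flush to

  bool IsWriteInProgress() const {
    // Buggy shape of the condition: always true for memory-only entries.
    return mDataWriteInFlight || mMetadataDirty;
  }
};
```

The follow-up patch presumably tightens this condition so that permanently dirty metadata on a memory-only entry no longer counts as a write in progress.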
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Assignee
Updated•11 years ago
Status: REOPENED → ASSIGNED
Assignee
Comment 26•11 years ago
This time tested more carefully. An automated test will be a bit more complicated; in the meantime, please check the code (we can land it only manually tested this time).
Attachment #8422065 - Flags: review?(michal.novotny)
Updated•11 years ago
Whiteboard: [MemShrink] → [MemShrink:P1]
Updated•11 years ago
Attachment #8422065 - Flags: review?(michal.novotny) → review+
Updated•11 years ago
Crash Signature: [@ NS_ABORT_OOM(unsigned int) | PL_DHashTableInit(PLDHashTable*, PLDHashTableOps const*, void*, unsigned int, unsigned int) | nsCycleCollector::BeginCollection(ccType, nsICycleCollectorListener*)]
QA Whiteboard: [@ NS_ABORT_OOM(unsigned int) | PL_DHashTableInit(PLDHashTable*, PLDHashTableOps const*, void*, unsigned int, unsigned int) | nsCycleCollector::BeginCollection(ccType, nsICycleCollectorListener*)]
Assignee
Comment 27•11 years ago
Status: ASSIGNED → RESOLVED
Closed: 11 years ago → 11 years ago
Resolution: --- → FIXED
Comment 29•11 years ago
This crash is currently #2 on the 30beta topcrash list. Is the fix one that should safely ride the trains or should it be uplifted?
status-firefox30: --- → affected
status-firefox31: --- → affected
status-firefox32: --- → fixed
tracking-firefox30: --- → ?
Comment 30•11 years ago
(In reply to [:tracy] Tracy Walker - QA Mentor from comment #29)
> This crash is currently #2 on the 30beta topcrash list. Is the fix one that
> should safely ride the trains or should it be uplifted?
Let's get a new bug on file for this. Cache2 is not active on 30 beta, so this must be something else. The signature itself is, as far as I know, a generic OOM (but it sounds bad that the CC is falling over; running out of memory while trying to free memory is rather nasty).
Comment 31•11 years ago
Sorry, typo: it's #12, not #2. But it's still topcrash-worthy on beta. I'll file a new bug against [@ NS_ABORT_OOM(unsigned int) | PL_DHashTableInit(PLDHashTable*, PLDHashTableOps const*, void*, unsigned int, unsigned int) | nsCycleCollector::BeginCollection(ccType, nsICycleCollectorListener*)] for 30/31.
Updated•11 years ago
status-b2g-v1.4: --- → unaffected
status-firefox31: --- → unaffected