1006181 - HTTP cache v2 memory only storage area not freed on limit

- this is actually a long standing bug! - entries recreated as 'memory-only' (inhibit caching or ssl) were not updated the frecency (mFrecency was left at 0) - such entries are never purged since those are considered in-progress of opening and purging them so soon is wrong - fixed with call and return result of NewWriteHandle() on the recreated entry - NewWriteHandle() sets the handle as mWriter (which is absolutely correct, see OnHandleClosed for the effect) and also updates the frecency automatically

Attachment #8417733 - Flags: review?(michal.novotny)

Honza Bambas (:mayhemer)

Assignee

Comment 2

•

11 years ago

https://tbpl.mozilla.org/?tree=Try&rev=d34ff80168f1

Nicholas Nethercote [inactive]

Comment 3

•

11 years ago

Wow. How often does this occur in practice -- what kind of usage patterns would cause it? I guess it can't be happening too often; if lots of users were seeing behaviour that bad we'd hear more about it...?

hazel ward

Comment 4

•

11 years ago

(In reply to Nicholas Nethercote [:njn] from comment #3) > Wow. How often does this occur in practice -- what kind of usage patterns > would cause it? I guess it can't be happening too often; if lots of users > were seeing behaviour that bad we'd hear more about it...? I can only speak for my experience (screenshot of my memory usage is included in the top comment) It seems to be explicitly cache2 related, so I'd imagine that it's limited only to nightly users and anyone else who has flipped the pref for it, so your general fx users aren't affected (yet). It has been happening for ages and actually caused me to move from the 32bit to the 64bit build just to make it take longer to bug me. Let netflix run on a machine unfettered for a few hours and you should see something similar.

hazel ward

Comment 5

•

11 years ago

http://forums.mozillazine.org/viewtopic.php?f=23&t=2799341#p13349873 When this problem presents in the 32bit version, the application stutters and freezes then goes oom and crashes. When I first started running into it (above - though i may have mentioned it in an earlier post) cache2 was not yet recognized by the memory profiler, so it's been a bug since at least February

Nicholas Nethercote [inactive]

Comment 6

•

11 years ago

> - this is actually a long standing bug! Ah, I interpreted this as meaning that it pre-dates cache2, but judging from the patch that's not true. So IIUC it's not in any Firefox releases. Still, good to fix! (And Honza gets a gold star for implementing those memory reporters for cache2 :)

Honza Bambas (:mayhemer)

Assignee

Comment 8

•

11 years ago

(In reply to Nicholas Nethercote [:njn] from comment #6) > > - this is actually a long standing bug! > > Ah, I interpreted this as meaning that it pre-dates cache2, but judging from > the patch that's not true. So IIUC it's not in any Firefox releases. Still, > good to fix! (And Honza gets a gold star for implementing those memory > reporters for cache2 :) :*) Yep, just a cache2 bug and the reporters help here. This is actually a pretty old bug in this new code - but surprisingly never happened to me or someone else... w/o the reporters we wouldn't find out!!

Michal Novotny [:michal]

Comment 10

•

11 years ago

Comment on attachment 8417733 [details] [diff] [review] v1 Review of attachment 8417733 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/cache2/CacheEntry.cpp @@ +465,5 @@ > + // fails and also doesn't update the entry frecency. Not updating > + // frecency causes entries to not be purged from our memory pools. > + nsRefPtr<CacheEntryHandle> writeHandle = > + newEntry->NewWriteHandle(); > + return writeHandle.forget(); Where we replace the normal handle with write handle in case CacheStorage::AsyncOpenURI() is called with OPEN_TRUNCATE flag? It seems to me that it would be better to create the write handle in CacheStorageService::AddStorageEntry().

Michal Novotny [:michal]

Updated

•

11 years ago

Flags: needinfo?(honzab.moz)

Honza Bambas (:mayhemer)

Assignee

Comment 11

•

11 years ago

(In reply to Michal Novotny (:michal) from comment #10) > Comment on attachment 8417733 [details] [diff] [review] > v1 > > Review of attachment 8417733 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: netwerk/cache2/CacheEntry.cpp > @@ +465,5 @@ > > + // fails and also doesn't update the entry frecency. Not updating > > + // frecency causes entries to not be purged from our memory pools. > > + nsRefPtr<CacheEntryHandle> writeHandle = > > + newEntry->NewWriteHandle(); > > + return writeHandle.forget(); > > Where we replace the normal handle with write handle in case > CacheStorage::AsyncOpenURI() is called with OPEN_TRUNCATE flag? It doesn't work like that. There is also no concept of "replacing a handle with write hanle" at all. We create a write handler when we give the entry to a callback that is going to write it the first time or when revalidating it (where reval is also a range request for partial entries). > It seems to > me that it would be better to create the write handle in > CacheStorageService::AddStorageEntry(). Definitely not. The handle that is returned must not act as a writer. It's there just to prevent purging the entry from the memory pool. Writer is created later when there is a callback that will act as a writer - as described above. The state of the callback is examined much later (onCacheEntryCheck). This already works pretty well, no need to change it. Role of the write handler is to revert internal state of the entry back to the previous one, REVAL->READY and WRITING->EMPTY, +call on any pending callbacks, when the consumer of the entry didn't do it's job (no call to MetaDataReady(), SetValid() or OpenOutputStream()) and just threw the entry away. Hence I don't see a point why a write handler should be returned from CacheStorageService::AddStorageEntry() at all.

Flags: needinfo?(honzab.moz)

Michal Novotny [:michal]

Updated

•

11 years ago

Attachment #8417733 - Flags: review?(michal.novotny) → review+

Honza Bambas (:mayhemer)

Assignee

Comment 12

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/9ac7ac429216

Jason Duell

Comment 13

•

11 years ago

I'd like to understand how netflix manages to store 10GB worth of stuff when we allegedly enforce a 5MB max object size in the memory cache (so I'd assume large video loads wouldn't get stored). Honza, we already talked about this but I wanted to make sure we don't forget to figure out why, so I'm needinfo-ing you :)

Flags: needinfo?(honzab.moz)

hazel ward

Comment 14

•

11 years ago

Netflix stronk! For real though, I couldn't tell you. If you check dup bug 1006263 I uploaded a memory report that might tell you something. I can re-up it here if it's relevant/useful at all.

Michal Novotny [:michal]

Comment 15

•

11 years ago

(In reply to Jason Duell (:jduell) from comment #13) > I'd like to understand how netflix manages to store 10GB worth of stuff when > we allegedly enforce a 5MB max object size in the memory cache (so I'd > assume large video loads wouldn't get stored). Honza, we already talked > about this but I wanted to make sure we don't forget to figure out why, so > I'm needinfo-ing you :) I think you were talking about it with me. We check whether the entry is too big only in CacheEntry::SetPredictedDataSize() which is called from nsHttpChannel::CallOnStartRequest(). When the content length is unknown (-1) we always cache the entry. I think we should check the entry's size also in CacheFileIOManager::WriteInternal().

Honza Bambas (:mayhemer)

Assignee

Comment 16

•

11 years ago

(In reply to Michal Novotny (:michal) from comment #15) > I think we should check the entry's size also in > CacheFileIOManager::WriteInternal(). Filed bug 1007071 on it.

Flags: needinfo?(honzab.moz)

Honza Bambas (:mayhemer)

Assignee

Comment 17

•

11 years ago

(In reply to Jason Duell (:jduell) from comment #13) > I'd like to understand how netflix manages to store 10GB worth of stuff when > we allegedly enforce a 5MB max object size in the memory cache (so I'd > assume large video loads wouldn't get stored). Honza, we already talked > about this but I wanted to make sure we don't forget to figure out why, so > I'm needinfo-ing you :) I don't have access to Netflix (location). As Michal has already answered in comment 15, when the transfer encoding is chunked we don't restrict the size per entry. In conjunction with this bug entries are left there forever. I think it's worth another try after the patch here gets to Nightly.

Carsten Book [:Tomcat]

Comment 18

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/9ac7ac429216

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla32

hazel ward

Comment 19

•

11 years ago

Attached file memory-report.json.gz — Details

Do patches land on the x64 build at the same time as the 32bit one? Because I'm still seeing this behavior.

hazel ward

Comment 20

•

11 years ago

Attached image snapshot of about:memory in x64 2014-05-08 build — Details

Tracy Walker [:tracy]

Updated

•

11 years ago

QA Whiteboard: [@ NS_ABORT_OOM(unsigned int) | PL_DHashTableInit(PLDHashTable*, PLDHashTableOps const*, void*, unsigned int, unsigned int) | nsCycleCollector::BeginCollection(ccType, nsICycleCollectorListener*)]

Honza Bambas (:mayhemer)

Assignee

Comment 21

•

11 years ago

Robert, can you please recheck with the latest Nightly and if it still leaks tell us what you get at about:buildconfig? Thanks!

Flags: needinfo?(rjward0)

hazel ward

Comment 22

•

11 years ago

Appears to still be happening. Session hasn't been open too long yet, but currently seeing 2,361.05 MB (100.0%) -- explicit ├──1,420.64 MB (60.17%) -- network │ ├──1,419.88 MB (60.14%) ++ cache2 build config: Build Machine B-2008-IX-0007 Source Built from https://hg.mozilla.org/mozilla-central/rev/4b6d63b05a0a Build platform target x86_64-pc-mingw32 Build tools Compiler Version Compiler flags cl 16.00.30319.01 -TC -nologo -W3 -Gy -wd4244 -wd4819 -we4553 cl 16.00.30319.01 -wd4099 -TP -nologo -W3 -Gy -wd4251 -wd4244 -wd4345 -wd4351 -wd4482 -wd4800 -wd4819 -we4553 -GR- -DNDEBUG -DTRIMMED -Zi -UDEBUG -DNDEBUG -O1 -Oy- Configure arguments --target=x86_64-pc-mingw32 --host=x86_64-pc-mingw32 --enable-crashreporter --enable-release --enable-update-channel=nightly --enable-update-packaging --enable-jemalloc --with-google-api-keyfile=/c/builds/gapi.data --enable-signmar --enable-profiling --enable-js-diagnostics

Flags: needinfo?(rjward0)

Honza Bambas (:mayhemer)

Assignee

Comment 23

•

11 years ago

Thanks, the patch is definitely there. Can you please look at about:cache and see how many entries and of what size are in the memory-only cache? It might be we also need bug 1007071 here. Thanks.

Flags: needinfo?(rjward0)

hazel ward

Comment 24

•

11 years ago

http://paste2.org/0pj9M2mU memory Number of entries: 3210 Maximum storage size: 32768 KiB Storage in use: 3400511 KiB Storage disk location: none, only stored in memory average entry size: 1082499.17755231 bytes (1.03MB)

Flags: needinfo?(rjward0)

Honza Bambas (:mayhemer)

Assignee

Comment 25

•

11 years ago

Reopening. Easily reproducible locally with a large PHP generated file. The problem is that the CacheFile::IsWriteInProgress condition is wrong. mMetadata->IsDirty() is left at |true| forever on memory only entries. I will add a test this time.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Honza Bambas (:mayhemer)

Assignee

Updated

•

11 years ago

Status: REOPENED → ASSIGNED

Honza Bambas (:mayhemer)

Assignee

Comment 26

•

11 years ago

Attached patch ad1 v1 — Details — Splinter Review

This time tested more carefully. An automated test will be a bit more complicated. In the mean time, please check the code (we can land it only manually tested this time).

Attachment #8422065 - Flags: review?(michal.novotny)

Nicholas Nethercote [inactive]

Updated

•

11 years ago

Whiteboard: [MemShrink] → [MemShrink:P1]

Michal Novotny [:michal]

Updated

•

11 years ago

Attachment #8422065 - Flags: review?(michal.novotny) → review+

Robert Kaiser

Updated

•

11 years ago

Crash Signature: [@ NS_ABORT_OOM(unsigned int) | PL_DHashTableInit(PLDHashTable*, PLDHashTableOps const*, void*, unsigned int, unsigned int) | nsCycleCollector::BeginCollection(ccType, nsICycleCollectorListener*)]

QA Whiteboard: [@ NS_ABORT_OOM(unsigned int) | PL_DHashTableInit(PLDHashTable*, PLDHashTableOps const*, void*, unsigned int, unsigned int) | nsCycleCollector::BeginCollection(ccType, nsICycleCollectorListener*)]

Keywords: crash, mlk

Honza Bambas (:mayhemer)

Assignee

Comment 27

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/0f5952a67494

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 28

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/0f5952a67494

Status: ASSIGNED → RESOLVED

Closed: 11 years ago → 11 years ago

Resolution: --- → FIXED

Tracy Walker [:tracy]

Comment 29

•

11 years ago

This crash is currently #2 on the 30beta topcrash list. Is the fix one that should safely ride the trains or should it be uplifted?

status-firefox30: --- → affected

status-firefox31: --- → affected

status-firefox32: --- → fixed

tracking-firefox30: --- → ?

Robert Kaiser

Comment 30

•

11 years ago

(In reply to [:tracy] Tracy Walker - QA Mentor from comment #29) > This crash is currently #2 on the 30beta topcrash list. Is the fix one that > should safely ride the trains or should it be uplifted? Let's get a new bug on file for this. Cache2 is not active on 30 beta, this must be something else. The signature itself is, from all I know, generic OOM (but it sounds bad that CC is falling over, running out of memory while trying to free memory is rather nasty).

Tracy Walker [:tracy]

Comment 31

•

11 years ago

sorry, typo, it's #12 not #2. But still topcrash worthy on beta. I'll file a new bug against [@ NS_ABORT_OOM(unsigned int) | PL_DHashTableInit(PLDHashTable*, PLDHashTableOps const*, void*, unsigned int, unsigned int) | nsCycleCollector::BeginCollection(ccType, nsICycleCollectorListener*)] for 30/31

status-firefox30: affected → ---

status-firefox31: affected → ---

tracking-firefox30: ? → ---

bhavana bajaj [:bajaj]

Updated

•

11 years ago

status-b2g-v1.4: --- → unaffected

status-firefox31: --- → unaffected

v1 11 years ago Honza Bambas (:mayhemer) 1.62 KB, patch	michal : review+	Details \| Diff \| Splinter Review
memory-report.json.gz 11 years ago hazel ward 132.19 KB, application/gzip		Details
snapshot of about:memory in x64 2014-05-08 build 11 years ago hazel ward 6.65 KB, image/png		Details
ad1 v1 11 years ago Honza Bambas (:mayhemer) 1.01 KB, patch	michal : review+	Details \| Diff \| Splinter Review