Closed
Bug 1028415
Opened 10 years ago
Closed 10 years ago
Cache thread gets stuck in CacheFileIOManager::OverLimitEvictionInternal loop
Categories
(Core :: Networking: Cache, defect)
Tracking
()
RESOLVED
FIXED
mozilla33
Tracking | Status | |
---|---|---|
firefox30 | --- | unaffected |
firefox31 | --- | unaffected |
firefox32 | + | fixed |
firefox33 | --- | fixed |
b2g-v2.0 | --- | fixed |
b2g-v2.1 | --- | fixed |
People
(Reporter: jchen, Assigned: michal)
References
Details
Attachments
(2 files)
19.78 KB,
text/plain
|
Details | |
4.88 KB,
patch
|
mayhemer
:
review+
Sylvestre
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
For some reason, Nightly on my MacBook Air was stuck in a loop in CacheFileIOManager::OverLimitEvictionInternal (CPU was at 100% for many minutes and it was enough to turn on the CPU fan at full speed). This happened on the Cache2 I/O thread, and the main thread was fully responsive.
I finally used "Clear Recent History" to clear the last hour of history, and afterwards CPU usage immediately dropped down to 0%.
Attached is the output from OS X's internal profiler (I didn't have the Gecko profiler installed).
Assignee | ||
Comment 1•10 years ago
|
||
(In reply to Jim Chen [:jchen :nchen] from comment #0)
> Created attachment 8443751 [details]
> nightly-cache-loop.txt
>
> For some reason, Nightly on my MacBook Air was stuck in a loop in
> CacheFileIOManager::OverLimitEvictionInternal (CPU was at 100% for many
> minutes and it was enough to turn on the CPU fan at full speed). This
> happened on the Cache2 I/O thread, and the main thread was fully responsive.
How big was the cache when this happened?
> I finally used "Clear Recent History" to clear the last hour of history, and
> afterwards CPU usage immediately dropped down to 0%.
"Clear Recent History" always deletes the whole cache and it is always faster than selective evicting of the least useful entries, so this is expected behavior.
It seems to me that for some reason the maximum cache size has been dramatically lowered, so we started to evict a lot of entries. But it's not clear to me how this could happen. What's your maximum cache size? Do you have smartsize enabled?
Reporter | ||
Comment 2•10 years ago
|
||
This happened to me twice yesterday while I was streaming a live BBC broadcast. about:cache from the first time,
> Information about the Network Cache Storage Service
> Private Anonymous AppID In Browser Element
> memory
> Number of entries: 1098
> Maximum storage size: 32768 KiB
> Storage in use: 6784 KiB
> Storage disk location: none, only stored in memory
> List Cache Entries
> disk
> Number of entries: 7896
> Maximum storage size: 327680 KiB
> Storage in use: 331063 KiB
> Storage disk location: /Users/nchen/Library/Caches/Firefox/Profiles/u6sv5ul4.default/cache2
> List Cache Entries
> appcache
> Number of entries: 0
> Maximum storage size: 512000 KiB
> Storage in use: 0 KiB
> Storage disk location: /Users/nchen/Library/Caches/Firefox/Profiles/u6sv5ul4.default/OfflineCache
After that, I cleared recent history (about:cache showed cache was cleared), and things returned to normal. But after some time, while I was still streaming the broadcast, this happened again. about:cache from the second time,
> Information about the Network Cache Storage Service
> Private Anonymous AppID In Browser Element
> memory
> Number of entries: 0
> Maximum storage size: 32768 KiB
> Storage in use: 0 KiB
> Storage disk location: none, only stored in memory
> disk
> Number of entries: 1788
> Maximum storage size: 358400 KiB
> Storage in use: 2785412 KiB
> Storage disk location: /Users/nchen/Library/Caches/Firefox/Profiles/u6sv5ul4.default/cache2
> List Cache Entries
> appcache
> Number of entries: 0
> Maximum storage size: 512000 KiB
> Storage in use: 0 KiB
> Storage disk location: /Users/nchen/Library/Caches/Firefox/Profiles/u6sv5ul4.default/OfflineCache
Both times CPU was at 100%, but there was *no* disk activity according to Activity Monitor. The broadcast I was watching was using Flash for streaming; not sure if it's related. Smartsize is enabled according to about:config.
Comment 3•10 years ago
|
||
For reference:
[18:19] NeilAway: mayhemer: I've had a bug where the cache doom thread was in an infinite loop because the lru entry had already been doomed
[18:21] NeilAway: mayhemer: I used the debugger to nop-out both the IsDoomed() checks and the thread eventually finished shrinking the cache, but unfortunately the browser still wouldn't shut down normally and I had to kill it anyway :s
Comment 4•10 years ago
|
||
I can verify that my 2-week-old 64-bit Windows self-build hung on a background thread in a loop in the same function.
I didn't notice the problem until I tried to shut the build down and noticed that the process was lingering with 25% (1 of 4 CPUs) utilisation.
My build was an unoptimised build with symbols so I attached a debugger and noticed that the cache entry that it was trying to doom was already doomed.
At first I just used the debugger to skip over the IsDoomed() checks in the appropriate functions and some of the cache got evicted (it turns out that forcibly dooming a doomed entry still removes it from the mru list) but the loop got stuck again and I got bored and used the debugger to remove the checks. This allowed the function to complete its loop and return to the thread but unfortunately the browser didn't complete shutdown and I had to kill it anyway.
OS: Mac OS X → All
Assignee | ||
Comment 5•10 years ago
|
||
We should never have doomed entries in frecency/expiration array in the index. The CacheFileHandle::mIsDoomed is set to true only in CacheFileIOManager::DoomFileInternal() and the entry is removed from the index just one line above that. Neil, could you please try to reproduce it with NSPR logging turned on (nsHttp:5,cache2:5)?
Assignee | ||
Comment 6•10 years ago
|
||
This patch doesn't fix the root of the cause of the problem. It just breaks the loop so we won't try to doom already doomed handle over and over again. I still have no idea how we can have such entry in the index. Neil, did you try to reproduce the problem with NSPR logging enabled?
Assignee: nobody → michal.novotny
Attachment #8456219 -
Flags: review?(honzab.moz)
Flags: needinfo?(neil)
Assignee | ||
Comment 7•10 years ago
|
||
Comment on attachment 8456219 [details] [diff] [review]
patch v1
https://tbpl.mozilla.org/?tree=Try&rev=a64fe943e3d2
Comment 8•10 years ago
|
||
(In reply to Michal Novotny from comment #6)
> This patch doesn't fix the root of the cause of the problem. It just breaks
> the loop so we won't try to doom already doomed handle over and over again.
> I still have no idea how we can have such entry in the index. Neil, did you
> try to reproduce the problem with NSPR logging enabled?
Sorry, I haven't been watching the bug, but as it happens, I've not seen the problem again anyway.
Flags: needinfo?(neil)
Comment 9•10 years ago
|
||
Comment on attachment 8456219 [details] [diff] [review]
patch v1
Review of attachment 8456219 [details] [diff] [review]:
-----------------------------------------------------------------
Hope this helps. Tho seems like the problem may actually be somewhere else according michal's claims that this simple cannot happen.
r=honzab
Attachment #8456219 -
Flags: review?(honzab.moz) → review+
Comment 10•10 years ago
|
||
Comment on attachment 8456219 [details] [diff] [review]
patch v1
Approval Request Comment
[Feature/regressing bug #]: cache2
[User impact if declined]: browser can go into 100% CPU loop on cache thread
[Risks and why]: low: fairly small change
[String/UUID change made/needed]: none
This cache2 fix also seems worth taking on aurora IMO.
Attachment #8456219 -
Flags: approval-mozilla-aurora?
Assignee | ||
Comment 11•10 years ago
|
||
Comment 12•10 years ago
|
||
Going to wait for this to land on m-c before granting Aurora approval.
status-firefox30:
--- → unaffected
status-firefox31:
--- → unaffected
status-firefox32:
--- → affected
status-firefox33:
--- → fixed
tracking-firefox32:
--- → +
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
Updated•10 years ago
|
Attachment #8456219 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 14•10 years ago
|
||
You need to log in
before you can comment on or make changes to this bug.
Description
•