Closed Bug 1025913 Opened 11 years ago Closed 11 years ago

CacheIOThread hogging cpu

Categories

(Core :: Networking: Cache, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla33

People

(Reporter: jrmuizel, Assigned: mayhemer)

References

Details

Attachments

(1 file, 1 obsolete file)

Version: 32.0a1 (2014-06-05) Spins without ceasing with all of the time spent in PurgeByFrecency
How often can you reproduce? Is it debuggable?
See Also: → 1027028
(In reply to Yuan Pengfei from comment #3)
> See bug 1028415
I don't think it's related (these are actually separate bugs). The thing is that Jeff claims this loops at CacheStorageService::MemoryPool::PurgeByFrecency and not CacheFileIOManager::OverLimitEvictionInternal; those are totally unrelated.
I'm concerned about bug 1027028, which might indicate we do something wrong when removing entries from the pool (the frecency and expiration-time memory arrays, actually). If there is something fishy, we may get into a situation where the purging code loops. I put this in the tree knowing this could happen (Murphy's law). And here it is!
There are two solutions:
- try to find the true cause (there was a bug like this once, but it's well understood and was fixed a very long time ago)
- avoid the loop, even if there is a problem in the surrounding logic, by adding some additional checks
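To illustrate the second option (avoiding the loop with additional checks), here is a minimal sketch of a purge loop that always makes progress. It is not the actual PurgeByFrecency code: `Item`, `removable`, and `PurgeWithGuard` are hypothetical names, and the single forward index is just one assumed way of guaranteeing termination when some entries cannot be removed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a pooled cache entry; not the real CacheEntry.
struct Item {
  std::size_t size;  // memory consumed by the entry
  bool removable;    // false models an entry the pool fails to purge
};

// Purges removable items until the total size drops to `limit` or below.
// The index only moves forward or the array shrinks, so the loop always
// terminates, even when some items can never be removed.
std::size_t PurgeWithGuard(std::vector<Item>& items, std::size_t limit) {
  std::size_t total = 0;
  for (const Item& item : items) {
    total += item.size;
  }

  std::size_t idx = 0;
  while (total > limit && idx < items.size()) {
    if (items[idx].removable) {
      total -= items[idx].size;
      items.erase(items.begin() + static_cast<std::ptrdiff_t>(idx));
    } else {
      ++idx;  // Skip the stuck entry instead of retrying it forever.
    }
  }
  assert(idx <= items.size());
  return total;  // May still exceed `limit` if stuck entries remain.
}
```

With this shape, a stuck entry leaves the pool over its limit but the thread does not spin.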
Oh, I think I got it. Hmm... it may happen that we open an existing (warmed) memory-only entry as a disk entry. That will switch the mUseDisk flag and hence the entry will not be able to remove itself from the correct pool... This needs some thinking first.
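The suspected mechanism can be sketched in a few lines. This is a simplified model and not the real cache2 code: `Entry`, `Pools`, and `DemonstrateStrandedEntry` are hypothetical names, and the only thing taken from the comment above is the idea that a mutable use-disk flag selects the pool an entry removes itself from.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in; not the real CacheEntry.
struct Entry {
  bool useDisk;  // mutable here, mirroring the pre-fix mUseDisk flag
};

// Two pools, chosen by the entry's current flag value.
struct Pools {
  std::vector<Entry*> memory;
  std::vector<Entry*> disk;

  std::vector<Entry*>& PoolFor(const Entry& e) {
    return e.useDisk ? disk : memory;
  }

  void Register(Entry* e) {
    assert(e != nullptr);
    PoolFor(*e).push_back(e);
  }

  // Returns true if the entry was found in the pool its flag points to.
  bool Unregister(Entry* e) {
    std::vector<Entry*>& pool = PoolFor(*e);
    auto it = std::find(pool.begin(), pool.end(), e);
    if (it == pool.end()) {
      return false;
    }
    pool.erase(it);
    return true;
  }
};

// Returns true when the entry ends up stranded in the memory pool.
bool DemonstrateStrandedEntry() {
  Pools pools;
  Entry e{false};      // created as memory-only
  pools.Register(&e);  // lands in the memory pool
  e.useDisk = true;    // later reopened as a disk entry: the flag flips
  bool removed = pools.Unregister(&e);  // looks in the disk pool, finds nothing
  return !removed && pools.memory.size() == 1;
}
```

An entry stranded like this would keep counting against the memory pool, which is consistent with a purge loop that never reaches its target.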
Attached patch v1 (obsolete)
- constify the mUseDisk flag in CacheEntry (fix for this bug)
- when there is a warmed disk entry for the context/URL and we are opening it again as memory-only, the warmed entry is doomed (replaced by a new memory-only one)
- we also check for a disk file (when there was no warmed disk entry) when opening a new memory-only entry and doom the file; this is then consistent with the case when there already has been a warmed entry
https://tbpl.mozilla.org/?tree=Try&rev=705c869dcf88
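The first two points of the patch description can be sketched as follows. This is an illustrative model under stated assumptions, not the patch itself: `Entry`, `Storage`, and `Open` are hypothetical names, and returning the existing entry unchanged when a memory-only entry is reopened as a disk entry is an assumption of this sketch (the description above does not say how that case is handled).

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Hypothetical, simplified stand-in; not the real CacheEntry.
struct Entry {
  const bool useDisk;   // fixed at construction, so it cannot flip later
  bool doomed = false;
  explicit Entry(bool disk) : useDisk(disk) {}
};

class Storage {
 public:
  // Opens (or creates) the entry for `key`. A warmed disk entry that is
  // reopened as memory-only is doomed and replaced by a new entry.
  std::shared_ptr<Entry> Open(const std::string& key, bool wantDisk) {
    auto it = entries_.find(key);
    if (it != entries_.end()) {
      const bool replace = it->second->useDisk && !wantDisk;
      if (!replace) {
        // Assumption of this sketch: all other cases reuse the entry as-is.
        return it->second;
      }
      it->second->doomed = true;  // doom the warmed disk entry
    }
    auto fresh = std::make_shared<Entry>(wantDisk);
    entries_[key] = fresh;
    return fresh;
  }

 private:
  std::map<std::string, std::shared_ptr<Entry>> entries_;
};
```

Because the flag is const, an entry always unregisters from the pool it was registered in; changing the storage kind requires a new object.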
Assignee: nobody → honzab.moz
Status: NEW → ASSIGNED
Attachment #8444137 - Flags: review?(michal.novotny)
Blocks: 986179
(In reply to Honza Bambas (:mayhemer) from comment #8)
> better try:
> https://tbpl.mozilla.org/?tree=Try&rev=6dbe2e07da2c
Bug 1005696 ?
Blocks: 1029213
(In reply to Honza Bambas (:mayhemer) from comment #9)
> (In reply to Honza Bambas (:mayhemer) from comment #8)
> > better try:
> > https://tbpl.mozilla.org/?tree=Try&rev=6dbe2e07da2c
>
> Bug 1005696 ?
Apparently: https://tbpl.mozilla.org/?tree=Try&rev=0b6907de0eed
Comment on attachment 8444137
v1
Review of attachment 8444137:
-----------------------------------------------------------------
::: netwerk/cache2/CacheEntry.cpp
@@ +334,5 @@
> + // 1. When this is a disk entry and not told to truncate, check there is a disk file.
> + // If not, set the 'truncate' flag to true so that this entry will open instantly
> + // as a new one.
> + // 2. When this is a memory-only entry, check there is a disk file.
> + // If there is or could be, doom that file.
I think this should be documented in nsICacheStorageService.idl. It isn't obvious that storing an entry to memoryCacheStorage could remove entries from diskCacheStorage.
::: netwerk/test/unit/test_cache2-07a-open-memory.js
@@ +17,5 @@
> + asyncOpenCacheEntry("http://disk-first/", "disk", Ci.nsICacheStorage.OPEN_NORMALLY, null,
> + // Must wait for write, since opening the entry as memory-only before the disk one
> + // is written would cause NS_ERROR_NOT_AVAILABLE from openOutputStream when writing
> + // this disk entry.
> + new OpenCallback(NEW|WAITFORWRITE, "m2m", "m2d", function(entryD1) {
Just a nit. The way you choose the metadata and data content is a bit chaotic. If the first letter 'm' in "m1m" and "m1d" should mean the first letter from "mem-first" then this metadata and data should be "d1m" and "d2d".
Attachment #8444137 - Flags: review?(michal.novotny) → review+
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
I'm still seeing this on 32.0.3
  PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
11162 gustavo  20  0 1626m 711m  44m R 93.1 35.3  3924:18 Cache2 I/O
Is this bug really fixed?
(In reply to Gustavo Homem from comment #14)
> I'm still seeing this on 32.0.3
>
>   PID USER     PR NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
> 11162 gustavo  20  0 1626m 711m  44m R 93.1 35.3  3924:18 Cache2 I/O
>
> Is this bug really fixed?
I think there is another one. See bug 1064091. Are you able to provide some additional info that could help diagnose?
I've looked at bug 1064091, but here I never got to the point of a hang. Firefox just gets slower and slower until it is unusable, and keeps on using CPU even though I'm not doing anything and Facebook (the main CPU killer within Firefox) isn't open. It gets to a point where playing a low-resolution (e.g. 240p) YouTube video becomes impossible. If I restart the browser, then all is well again. This is easily reproducible on a system that is tight on resources (an old 3-core AMD machine). If I run top and press Shift+H, I see a thread called "Cache2 I/O" permanently using 100% CPU.
I made a comment on bug 1064091 which could be related.
Today I was using firefox and it started using 100% CPU. I launched htop and saw that cache2 was the culprit. It did not crash and did not significantly slow down. I'm using the official firefox 32 on fedora 20. I should also say that I tried to investigate this bug further, but quickly realized that the firefox profiler wasn't showing the usage for cache2.
Currently I can browse for about one hour until the Cache thread begins hogging. @Paul Templeton: I added a related comment as well. Maybe these two bugs should be merged.
Just an update - Version 33.0 has fixed our problems - typical CPU <0.2% to 5% under load. I don't know what was changed since the last two versions but this one is stable in RDP
After I upgraded to version 33 I'm no longer seeing the problem.
I have the problem with 44.0 on Linux 64 bit ("Cache2 I/O" takes all the power of one CPU core (98.34%), and Firefox takes another 26.82%. All while being idle for minutes!). See my comment in bug 1085172.
(In reply to Ulrich Windl from comment #22)
> I have the problem with 44.0 on Linux 64 bit ("Cache2 I/O" takes all the
> power of one CPU core (98.34%), and Firefox takes another 26.82%. All while
> being idle for minutes!). See my comment in bug 1085172.
Please open a new bug and provide us with an HTTP log when you are able to reproduce the problem: https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Just please change the log modules setting to:
NSPR_LOG_MODULES=timestamp,cache2:5
Thanks.
Flags: needinfo?(Ulrich.Windl)
Flags: needinfo?(Ulrich.Windl)