Closed Bug 648232 Opened 14 years ago Closed 11 years ago

clearing the disk cache (with Clear Recent History) should be async (clearing history extremely slow and blocks UI for entire duration)

Categories

(Core :: Networking: Cache, defect)

x86
Windows XP
defect
Not set
critical

Tracking

()

RESOLVED DUPLICATE of bug 913806

People

(Reporter: fehe, Assigned: michal)

References

Details

(4 keywords, Whiteboard: [snappy:p2])

Crash Data

Attachments

(2 files)

Clearing recent history by time interval, the first time in a browsing session, is extremely slow and the UI is blocked for the entire duration. 1. Make sure your system is using a hard drive and not an SSD 2. Launch Firefox with a new profile, if you've heavily customized your browser current profile 3. Make sure you have browsing history for a few days 4. On the day you are to perform this test, make sure you have not cleared history since you launched the browser 5. Use the browser session to do your web surfing, for about an hour or more. 6. Click Tools --> Clear Recent History... --> Last Hour --> Clear Now 7. Note that clearing is slow and the UI is blocked for the duration.
Keywords: perf, ue
You should ensure that you only selected browsing history in the checkboxes before pointing at history, since history clearing is already for the most part async. I think your issue could instead be cookies, so please, double check in Details what option of Clear History is causing the slowdown.
(In reply to comment #1) > I think your issue could instead be cookies, so please, double check in Details > what option of Clear History is causing the slowdown. Default options.
By the way, the first time I noticed the major slowdown came with the landing of the patch that made Firefox store cache files in a bunch of folders. So I believe that patch is the result of the slowdown. However, if history clearing is already async, why is the UI being blocked. And this even with a new profile and default options.
I'm asking you to figure out which one of the options in Details section causes the slowdown, otherwise this report has no data to proceed. the ui could be blocking when clearing history because this panel does not only clear history, it also clears cookies, cache and a bunch of other stuff. So, please help finding which part is slow.
OK. I'll post once I have the data.
My testing reveals that the culprits are both cache and cookies, but cookies more so. After about an hour of light browsing, clearing cache and then cookies produced the following timings: Cache: approx 4 sec Cookies: approx 7 sec With more browsing history, those number increase somewhat proportionally. I've experienced the UI blocked/unresponsive for over a minute on some occasions. Later, I will try and post numbers for much greater browsing history.
Looks like comment 6 isn't always true. I just had cookies not block, but cache blocked for over 10 seconds. So definitely a problem with clearing cache. In my default profile I generally don't clear cookies, because I accept only session cookies, with the exception of cookies permitted by my exceptions. And it is with this profile that I experience this issue (because it is the profile I use all the time). Thus, if clearing cache is supposed to be async, maybe it's only so if clearing all cache, as it most definitely isn't async when clearing only part of it via Clear Recent History.
I'm moving to PB since CRH was implemented there, looks like the hang is not related to history. In the case of cookies this is probably just a dupe of bug 633427. For the cache I thought it was on a separate thread but I'm unsure.
Component: Bookmarks & History → Private Browsing
QA Contact: bookmarks → private.browsing
(In reply to comment #8) > For the cache I thought it was on a separate thread but I'm unsure. Could it be that caching is on a separate thread but clearing cache is not?
Component: Private Browsing → Bookmarks & History
Please don't move bugs across components without explaining the reasoning.
Component: Bookmarks & History → Private Browsing
(In reply to comment #10) > Please don't move bugs across components without explaining the reasoning. I didn't do it. It's a Bugzilla bug. If you simply refresh a bug page, you see the new comments, but items in the pull-downs are not updated. Thus, when you post, it resets whatever was changed.
Summary: Clear Recent History should be async [currently extremely slow and blocks UI for entire duration] → Clear Recent History should be async [clearing cache extremely slow and blocks UI for entire duration]
This is why I hate CRH bugs ending up in the Private Browsing component. I own the Private Browsing component, but the only reason that CRH was implemented there was that there was no better place for it! So to make things clear, this bug is up for grabs, I'm not going to work on it.
Keywords: helpwanted
Ehsan, we can maybe move this bug to the component managing cache. And, if you think makes sense, we could move all CRH bugs to Firefox General and add a [clear-recent-history] whiteboard entry, like we did for switch-to-tab, about-home and such...
Yep, async cache eviction should definitely be in the cache component.
Component: Private Browsing → Networking: Cache
Product: Firefox → Core
QA Contact: private.browsing → networking.cache
It should be noted that the performance penalty experienced by this bug is actually a fallout of bug 597224; thus, I'm adding blocking so people are aware of the connection.
Blocks: 597224
I just did a test where I ran Process Monitor and cleared recent history (including "Browsing & Download History", "Form & Search History", "Cache", and "Flash Cookies") for the last 1 hour. Here are the results of the clearing. Total time was about 6 minutes (some of that added due to running Process Monitor) and the following is a percentage breakdown of how long Firefox spent in each area: 1. downloads.sqlite: 97.72 % 2. Cache: 1.75 % 3. places.sqlite: 0.27 % 4. signons.sqlite: 0.028 % 5. Macromedia\Flash: 0.098 % 6. sessionstore(-1)?.js: 0.12 % Interesting results. It's getting hung up scanning through my download history, which has 1796 entries going back to May 15th. If I'm reading it correctly, Process Monitor reports that Firefox reads the sqlite databases 1024 bytes at a time, which I suppose could be at the root of the problem, given that my downloads.sqlite is 3.82 MB in size. I'm assuming places.sqlite is loaded at browser start; thus, even though mine is 30 MB, it does not impact performance the same way downloads.sqlite does. Correct me if I'm wrong.
Summary: Clear Recent History should be async [clearing cache extremely slow and blocks UI for entire duration] → Clear Recent History should be async [clearing history extremely slow and blocks UI for entire duration]
(In reply to comment #18) > assuming places.sqlite is loaded at browser start; thus, even though mine is > 30 MB, it does not impact performance the same way downloads.sqlite does. no, places.sqlite is not read into memory, but most of the clear history operation is async and doesn't block main thread, while download manager is still using synchronous APIs.
Is it possible that Bug 670911 could have caused an improvement here? I'm noticing a huge improvement, but I'll have to give it time, in case it's coincidental.
Well, this is no longer a problem for me. It appears a recent patch may have resolved this. Haven't had any issues for about a week now. I strongly suspect that Bug 670911 fixed this, though I haven't bisected (thus resolving as WFM).
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WORKSFORME
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Gotta reopen. Just had it block the UI for about 1 minute. Seems dependent on browsing behavior (i.e. what ends up in the cache). For instance, where cache contains content of browsing through many photo albums and photos from many Flickr users, over an hour or so. The browser interface was blocked for so long that I got the attached unresponsive script warning.
Now causes crash due to bug 429592. I have two crash IDs, but no crash report was submitted due to crash reporter throttling.
Severity: enhancement → critical
Keywords: crash
(In reply to IU from comment #23) > Now causes crash due to bug 429592. I have two crash IDs, but no crash > report was submitted due to crash reporter throttling. I doubt the latter claim. This can only happen on Nightly, and we don't throttle any reports for anything else than Firefox release. Could you please look into about:crashes and give us some IDs?
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #24) > I doubt the latter claim. This can only happen on Nightly... Which is in fact the build I normally use, so yes I do get throttling now-and-then. I just realized, however, that I was running an hourly at the time of the crash, so the reports will be useless. When I update to the next nightly, I'll try and get it to crash again and post the IDs.
http://crash-stats.mozilla.com/report/index/bp-d4db6dec-003c-45c1-89b8-f67782111126 Stack --- 0 mozalloc.dll mozalloc_abort memory/mozalloc/mozalloc_abort.cpp:77 1 xul.dll NS_DebugBreak_P xpcom/base/nsDebugImpl.cpp:345 2 xul.dll mozilla::HangMonitor::Crash xpcom/threads/HangMonitor.cpp:111 3 xul.dll mozilla::HangMonitor::ThreadMain 4 nspr4.dll _PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:426 5 msvcr80.dll msvcr80.dll@0x2e23 6 nspr4.dll pr_root nsprpub/pr/src/md/windows/w95thred.c:122 7 msvcr80.dll msvcr80.dll@0x29ba 8 msvcr80.dll msvcr80.dll@0x8cec 9 msvcr80.dll msvcr80.dll@0x2a46 10 kernel32.dll BaseThreadStart 11 kernel32.dll GetCodePageFileInfo 12 kernel32.dll BaseThreadStart 13 msvcr80.dll msvcr80.dll@0x29e0
Summary: Clear Recent History should be async [clearing history extremely slow and blocks UI for entire duration] → Clear Recent History should be async [clearing history extremely slow and blocks UI for entire duration][@ chromehang | ZwSetSecurityObject ]
That's the Hang detector detecting an hang in nsDiskCacheDevice::ClearDiskCache
(In reply to Marco Bonardo [:mak] from comment #27) > That's the Hang detector detecting an hang in > nsDiskCacheDevice::ClearDiskCache Actually, if I'm correct, nsDiskCacheDevice::ClearDiskCache is not itself hanging. Rather it's blocking the rest of the browser. So is nsDiskCacheDevice::ClearDiskCache running off the main thread?
(In reply to IU from comment #28) > Actually, if I'm correct, nsDiskCacheDevice::ClearDiskCache is not itself > hanging. Rather it's blocking the rest of the browser. In our language, a "hang in X" means "we didn't get a response during a timeout period and the crash report we sent when therefore killing the process indicated that the code in X ran at that time".
Crash Signature: [@ chromehang | ZwSetSecurityObject ]
Keywords: crashhang
Summary: Clear Recent History should be async [clearing history extremely slow and blocks UI for entire duration][@ chromehang | ZwSetSecurityObject ] → Clear Recent History should be async [clearing history extremely slow and blocks UI for entire duration]
It is important that we aren't blocking the rest of the browser, but it is OK if we block the UI (e.g. with a modal dialog box with a "cancel" button that aborts the process). It is important that the user be aware of when the operation is complete, and the operation must block shutdown, because the user is intending to get some assurance of privacy by deleting his history.
BTW, we also need to blocking browsing since we would be deleting the new browsing history too, putting sites stored data potentially to an inconsistent state (only some data like cookies or localStorage suddenly deleted if we would be still cleaning).
Note that the perf issue here is not history, it's private data. The disk cache and the plug-ins data is what takes more time and blocks.
If we're doing a nsDeleteDir on the cache, the disk cache part of this doesn't block for longer than it takes to rename the old cache to cache.trash. (the deletion of the cache.trash dir is already async). If we're doing something dumb like evicting all entries that would be a different story.
(In reply to Jason Duell (:jduell) from comment #33) > If we're doing something dumb like evicting all entries that would be a > different story. Actually we do :-/ We can't close, delete and reopen the cache if some entries are in use, so we evict all entries. See nsDiskCacheDevice::EvictEntries().
Assignee: nobody → michal.novotny
Attached patch WIP patchSplinter Review
This is a WIP patch, but I think that the final patch won't differ too much. Still needs to be done: - resolve few intermittent try server failures - write more tests - remove all NS_ABORT_IF_FALSE(..., "fixme!!!") after a deeper testing
Attachment #585751 - Flags: feedback?(bjarne)
Comment on attachment 585751 [details] [diff] [review] WIP patch Review of attachment 585751 [details] [diff] [review]: ----------------------------------------------------------------- General comment is that I'm not entirely certain what this patch aims at doing: From reading the bug I get the impression that we want clearing the cache to be asynchronous, but it seems like this patch makes it synchronous and blocking. It's probably me not seeing something, but a high-level explanation what the patch wants to achieve should probably be added in a comment. The changes to nsDiskCacheStreams (closing iStreams explicitly and lock cache-service in the dtor) *may* be generally useful. I would consider isolating them in a separate bug and defend them separately from the issue described in this bug. See also comments in the file. ::: netwerk/cache/nsDiskCacheDevice.cpp @@ +601,5 @@ > } > } > > mBindery.RemoveBinding(binding); // extract binding from collision detection stuff > + binding->mCacheEntry = nsnull; I don't necessarily disagree with this but it may cause severe consequences in nsDiskCacheStreams where the binding is stored and |mCacheEntry| is used several places. Please make sure we don't run into issues like that. Why is this necessary? Do you plan to use it as a flag somewhere? ::: netwerk/cache/nsDiskCacheStreams.cpp @@ +103,5 @@ > > > nsDiskCacheInputStream::~nsDiskCacheInputStream() > { > + nsCacheServiceAutoLock lock; // grab service lock Please explain why grabbing the lock must be done here and why this is not done in the original code. @@ +491,5 @@ > + if (mOutStream) > + mOutStream->CloseInternal(); > + > + while (mInStreams.Count()) { > + static_cast<nsDiskCacheInputStream *>(mInStreams[0])->Close(); Please describe why explicit closing is necessary. Is it introduced by other changes in this patch, or is this useful in any case?
Attachment #585751 - Flags: feedback?(bjarne)
I just had Firefox hang for about 8 minutes while my disk IO was pegged for the full duration -- all because I chose to clear recent history from the past hour. This is seriously pathetic. I hope there is a plan to fix this.
Whiteboard: [snappy]
This should get fixed as part of places asyncification. Non-primary UI -> p2
Whiteboard: [snappy] → [snappy:p2]
(In reply to Taras Glek (:taras) from comment #38) > This should get fixed as part of places asyncification. Non-primary UI -> p2 bug# ?
(In reply to Wayne Mery (:wsmwk) from comment #39) > (In reply to Taras Glek (:taras) from comment #38) > > This should get fixed as part of places asyncification. Non-primary UI -> p2 > > bug# ? bug 699850
Blocks: asyncHistory
No longer blocks: asyncHistory
Depends on: asyncHistory
On my 13" MacBook Pro running the latest nightly, clearing recent history (last 1 hour) hung Nightly for almost 2 minutes. Activity Monitor showed excessive disk activity, and Firefox had to be force quit. Please let me know if you need details on cache size, db sizes, etc. to debug or make progress on this issue
"Clear recent/all history" causing Firefox to hang. It should run in background. In other words, an independent child process should handle this.
Summary: Clear Recent History should be async [clearing history extremely slow and blocks UI for entire duration] → clearing the disk cache (with Clear Recent History) should be async (clearing history extremely slow and blocks UI for entire duration)
I don't understand why clearing all browsing history should be slow. You just keep the data in a file, and delete the file whenever the user chooses to delete all history. You can also create a hierarchy of files (one file per day) stored under a directory per week, which is located under a directory per month, etc'. Whenever the user chooses to delete a specific range of dates, just delete the relevant files and/or directories.
While this design is possible, it would make actually using history much slower and the code would be more error-prone. Anyway, the main issue is that we do this operation on the main thread (which blocks the UI), whereas we should do this on a background thread (which would be mostly transparent for users).
There are two different issues: 1. When a lot of history is accumulated, the browser becomes very slow. Since I could not find such a bug, I opened a new bug: https://bugzilla.mozilla.org/show_bug.cgi?id=871943 2. Deleting the history causes the browser to hang for while. The second problem is annoying but still secondary since we do not delete data all the time, only once in a while. Are you implementing a database approach when accumulating history? I guess database partitioning by date is the solution for the second problem. Please also see: https://bugzilla.mozilla.org/show_bug.cgi?id=867110
your problem is very likely not history, clearing history doesn't block, it's clearing cache files and plug-ins data that causes this bug at the moment, please read the previous comments. I will check the other bugs you filed and answer where relevant.
Blocks: 837274
As a non-developer, my uninformed opinion is that making this async isn't a real solution, because you're still going to peg disk I/O for an unreasonable amount of time. There's no reason for the disk to thrash so much for a task as simple as clearing everything from the last hour of browsing. My guess as to why that's happening is that you're opening and closing the database / cache / whatever file for every single entry you remove. The evidence I have for this is that my hard drive is making the same exact noise over and over for at least a minute every time I do this operation. It would probably be much faster to have a single giant lock, open, read, modify, write, close, unlock cycle, such that the number of physical disk accesses is minimized.
I am not a Firefox developer and also not a C++ programmer, but I DO know that deleting a file is much faster than partially emptying it. Is it a possible solution?
I have same behavior in Linux and latest version of Firefox 26.0. Clearing only cache for last hour causes UI freeze for 1 minute or more. It's definetely not normal.
This is clearly solved in the new HTTP cache v2.
Status: REOPENED → RESOLVED
Closed: 13 years ago11 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: