Closed Bug 871908 Opened 7 years ago Closed 2 years ago

[meta] Clearing "browsing and download" history does excessive disc IO (due to huge wal file)

Categories

(Toolkit :: Places, defect, P5)

23 Branch
x86_64
Windows 7
defect
Points:
5

Tracking

()

RESOLVED DUPLICATE of bug 734643

People

(Reporter: mayankleoboy1, Unassigned)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Keywords: perf)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:23.0) Gecko/20130513 Firefox/23.0
Build ID: 20130513031057

Steps to reproduce:

Use your normal profile.

Go to Nightly->History->clear recent history . Select everything, and select only "browsing and download" history.
Press OK


Actual results:

I have a Vertex2 SSD , and clearing the browsing history kept the SSD busy for ~90seconds!

Also got this "script not responding" message :
chrome://browser/content/places/browserPlacesViews.js:101


Expected results:

Even if the clearing history is made to not block the UI (bug 648232) ,  is it OK to keep a SSD pegged completely for 90 seconds ? On a HDD, this could take 6-7 minutes!

Could this operation do less IO or maybe do the deletion in memory, and then write the final DB to the disc ?

bug 648232 seems to be focussed on making UI not block. I couldnt find any bug to do less IO overall.
Component: General → Places
Product: Core → Toolkit
Since all history things are kept in a couple databases, I'm inclined to believe that this is the disk cache being a problem instead.
My mistake, I missed the specific instructions. Comment 1 does not apply.
well, I don't understand what "Select everything, and select only "browsing and download" history" means, everything or only browsing history? was the first select an unselect?

That said, it actually depends on the size of the database, removing hundred MBs of relational data may take some time, should not be that much though.
I also wonder related to that not responding dialog, looks like something in the UI was replying to each removal, maybe an add-on? Was the Library open?
(In reply to Marco Bonardo [:mak] from comment #3)
> well, I don't understand what "Select everything, and select only "browsing
> and download" history" means, everything or only browsing history? was the
> first select an unselect?
 
"Everything" is the time duration. As in, last hour, last two hour and so on.
The item to delete is "browsing and download history"   (the first check box).


> That said, it actually depends on the size of the database, removing hundred
> MBs of relational data may take some time, should not be that much though.



Here is a Log from your Places-Maintainance 1.3 addon :

> Statistics
Database size is 61440 KiB
user_version is 22
page_size is 32768
cache_size is -2048
journal_mode is wal
synchronous is 1
History can store a maximum of 103775 unique pages
Table moz_places has 84357 records
Table moz_historyvisits has 197720 records
Table moz_inputhistory has 727 records
Table moz_bookmarks has 716 records
Table moz_bookmarks_roots has 5 records
Table moz_keywords has 0 records
Table sqlite_sequence has 0 records
Table moz_favicons has 4832 records
Table moz_anno_attributes has 13 records
Table moz_annos has 807 records
Table moz_items_annos has 233 records
Table sqlite_stat1 has 15 records
Table moz_hosts has 7493 records
Index sqlite_autoindex_moz_inputhistory_1
Index sqlite_autoindex_moz_bookmarks_roots_1
Index sqlite_autoindex_moz_keywords_1
Index sqlite_autoindex_moz_favicons_1
Index sqlite_autoindex_moz_anno_attributes_1
Index sqlite_autoindex_moz_hosts_1
Index moz_places_faviconindex
Index moz_places_hostindex
Index moz_places_visitcount
Index moz_places_frecencyindex
Index moz_places_lastvisitdateindex
Index moz_historyvisits_placedateindex
Index moz_historyvisits_fromindex
Index moz_historyvisits_dateindex
Index moz_bookmarks_itemindex
Index moz_bookmarks_parentindex
Index moz_bookmarks_itemlastmodifiedindex
Index moz_places_url_uniqueindex
Index moz_places_guid_uniqueindex
Index moz_bookmarks_guid_uniqueindex
Index moz_annos_placeattributeindex
Index moz_items_annos_itemattributeindex


> I also wonder related to that not responding dialog, looks like something in
> the UI was replying to each removal, maybe an add-on? Was the Library open?

That dialog usually came almost at the end of the 90second cleanup. Library was not open at this time.
So i discovered why it does so much disk I/O . While clearing the history, nightly creates a SQL-WAL file, which is huge. To clear my 70MB places.sqlite, nightly created a 3GB - 3.5GB SQL-WAL file!
This file gets deleted when the "delete history" operation finishes. I discovered this when i tried clearing my history on a SSD, and got frustrated with the time and disk I/O it took. I killed the firefox.exe process after ~2 minurs, and noticed this huge SQL-WAL file in my profile folder.

On another note, i am afraid of clearing my firefox history now.
do you have any add-ons using history/bookmarks?

it's quite strange it is creating a so large wal, that seems indicating there's an open transaction somewhere that is killing us. so let's first try to figure if some third party add-on may be listening to each history change...
So i was incorrect about the size of the sqlite-wal file. The correct size was 6GB, for a 81MB places.sqlite file.

I created a new profile, and copied over my places.sqlite to it. Then i tried to delete my history. Same thing happened. A 6GB wal file was created, and it did not delete when the "delete history" dialogue box disappeared. Another strange thing happened : after the history was deleted, i clicked on the bookmarks button. The whoole browser froze, and the firefox.exe started doing some reads and writes to the disk, i dunno where.


I got a few extensions from Kaspersky, and 1 extension from "Internet download manager". However, they all are incompatible with Nightlies, and hence automatic disabled. (however, i dont know how to remove them from nightly)


I would be glad to help you as you wish to fix this.
oh, and forgot to add : the browser never recovered after i clicked the bookmark button. It froze completely(and doing reads writes to disk). I had to manually end the firefox.exe process.

And, while the sqlite-wal file was being created, a sqlite-shm file was also created, which was about 1.2MB in size.
the shm file is fine, it's shared memory, ideally we don't need it but that's another story.

Would you be fine sharing your places.sqlite file (compressed) with me? You can mail me a link or the file, so I can try reproducing on my system.
ah something I forgot, did you change the prefs for history limits? like places.history.expiration.max_pages
places.history.expiration.transient_current_max_pages;103775
sent you my places.sqlite
Hey Marco, you got a chance to look at this?
not yet, it's still in my todo list. I will try to go through it today.
Flags: needinfo?(mak77)
You plan to come back to this bug ? :)
Sure, sorry but lots of different priorities caused this to slip out of view. I can reproduce the creation of a 6GB WAL file when clearing history on your 81MB database. It's indeed crazy behavior and we should surely investigate it.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(mak77)
Summary: Clearing "browsing and download" history does excessive disc IO → Clearing "browsing and download" history does excessive disc IO (due to huge wal file)
From http://www.sqlite.org/wal.html looks clear we either have a concurrent write from another process (but I doubt that since we don't support that) or a long read transaction that prevents the wal to commit to the main database.

I suspect it's the latter even if offhand I have no idea of where that might be.
I think we should first of all complete the work on bug 834545, then go back and see if that refactoring solved this.  Regardless, we should not spend time on the old API, so it's a good idea to wait for that work before proceeding with analysis.
the worst thing here is that the next operation that writes to the database will have to merge the giant wal file causing a long IO jank.
Blocks: PlacesJank
Keywords: perf
Depends on: 834545
we might even just leak somewhere a prepared statement (not reset it)
Added some logging to see which statements with bound params are not reset and turns it is:

QUERY_SILENT_EXPIRE_ORPHAN_URIS

from nsPlacesExpiration.js. Now we just need to find out why it's not reset. After I commented that out shutdown is indeed really fast and the .wal file is relatively small after clearing all history.
For that exact statement we're somehow not calling sqlite3_reset() here:

http://mxr.mozilla.org/mozilla-central/source/storage/src/mozStorageAsyncStatementExecution.cpp#278

We get no further than executeAndProcessStatement() and I have no idea how that would stop execution?

http://mxr.mozilla.org/mozilla-central/source/storage/src/mozStorageAsyncStatementExecution.cpp#275
Looks like we're not returning from sqlite3_step() in Connection::stepStatement()? Could we somehow loop in there? A deadlock maybe?
Hm, that statement can be quite slow due to the moz_hosts trigger attached to it (http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/nsPlacesTriggers.h#113), bug 843357 might help there. Though keeping that table up-to-date is clearly expensive here, so we might evaluate a different strategy.

Though I suspect it just takes huge amount of time and it's not really deadlocking. did you try leaving it alone and see if it would proceed?
Nightly is at 100% CPU and the WAL file is at 570MB while that's going on. I left it running for 5 minutes now (my places DB is only 40MB) and it's not finishing. No statement reset and still at full CPU usage.
Aha. It finished after ~10 minutes and execute a whole load of other waiting statements. So I guess we should do something about that statement. With a medium sized places DB that takes just too long...
yep, that's it.
This might also become more important should we implement the proposed privacy button that would expose history removal routines via a toolbar button.
Depends on: 1076775
No longer depends on: 834545
Points: --- → 5
Flags: qe-verify-
Flags: firefox-backlog+
Flags: needinfo?(mak77)
Flags: needinfo?(mak77)
Reproduced with v36 on Ubuntu. Deleting history took about 23 minutes on an SSD and places.sqlite-wal peaked at 5GB. CPU load was at 100% (one core only) and disk IO was low. Attempting to use the Awesome bar or quit Firefox during this time would cause it to freeze and then crash. I recorded a screencast showing the problem happening in safe-mode:

https://www.youtube.com/watch?v=U4yFblwG0WM
we are aware of the problem, but we are also in the process of converting history to a new API and we need to wait for that before we can attack this issue. Thanks for reporting your experience.
Blocks: 739219
Duplicate of this bug: 1172341
Depends on: 843357
Remains true on current nightly, STR as above.

Related: the slow script warning dialog doesn't appear to actually work - the button to stop that script didn't - but I'm not sure if that's an expected behavior in this context or another bug.
Likely because the slowdown is IO bound and stopping the script won't help there...
Priority: -- → P1
Duplicate of this bug: 1211276
Blocks: 734643
possible dupe bug 1247943
Blocks: 1248489
Blocks: 1206393
Depends on: 1250363
No longer blocks: 1248489
No longer blocks: 739219
Priority: P1 → P5
Summary: Clearing "browsing and download" history does excessive disc IO (due to huge wal file) → [meta] Clearing "browsing and download" history does excessive disc IO (due to huge wal file)
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 734643
You need to log in before you can comment on or make changes to this bug.