435712 - Implement automatic places backup

Assignee

Description

•

16 years ago

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1.14) Gecko/20080404 Iceweasel/2.0.0.14 (Debian-2.0.0.14-2)
Build Identifier: Trunk

As discussed on Bug #421482, having a backup of places can allow us to stop using over-aggressive fsync() on Linux with--at worst--loss of the history accumulated for the browser session that crashed.

I have implemented the suggestion and will attach this patch in the first comment.


Reproducible: Always

Jason Clinton

Assignee

Comment 1

•

16 years ago

Fixing blocking.

Blocks: 421482

Keywords: perf, regression

Version: unspecified → Trunk

Jason Clinton

Assignee

Comment 2

•

16 years ago

Attached patch Implement automatic backup and restore of places database (obsolete) — Details — Splinter Review

This patch implements automatic backup of the places database. If the database is discovered to be corrupted, the backup is tested to see if it is also corrupted. If it is not, it replaces the corrupted database. If both are corrupted, the old behavior kicks in: new database, rerun history import.

Jason Clinton

Assignee

Comment 3

•

16 years ago

I have tested this by manually corrupting my places.sqlite by appending /dev/urandom to the file using DD. The result is a corrupted database it is restored correctly.

I have no doubt that this could be more efficient. Even with a 10MB places file, the backup and restore operations complete in less than a second.

Jason Clinton

Assignee

Updated

•

16 years ago

Attachment #322478 - Flags: review?(me)

Reed Loden [:reed]

Updated

•

16 years ago

Component: Storage → Places

Product: Toolkit → Firefox

QA Contact: storage → places

Jason Clinton

Assignee

Updated

•

16 years ago

Attachment #322478 - Flags: review?(me) → review?(dietrich)

Reed Loden [:reed]

Updated

•

16 years ago

Assignee: nobody → me

Reed Loden [:reed]

Updated

•

16 years ago

Flags: wanted1.9.0.x?

Flags: blocking1.9.0.1?

Keywords: regression

Radek 'sysKin' Czyz

Comment 4

•

16 years ago

Random thought: does it handle "clear private data" as epected?

Jason Clinton

Assignee

Comment 5

•

16 years ago

It doesn't but, AFAIK, neither does the old behavior which creates a places.sqlite.corrupt file. I'm not opposed to it and it seems like a good idea. Clearing both .corrupt and .safety should probably be a new bug and a new patch to a different area of code.

Mike Schroepfer

Comment 6

•

16 years ago

Thanks a bunch for posting this - couple of quick thoughts:

1) OpenDatabase does not do an integrity_check (http://www.sqlite.org/pragma.html) so we'd want to do that to ensure we are detecting a corrupted database - ideally when we have a unclean shutdown (e.g. session restore fires)

2) Many places.sqlite files are >50mb and even a second to the start time is a big deal.  We should probably do this in the background as we do for bookmarks.

Mike Shaver (:shaver -- probably not reading bugmail closely)

Comment 7

•

16 years ago

What ensures that the copy you make is correctly written to disk in the case of a crash, or that it was consistent when you got to it?

Jason Clinton

Assignee

Comment 8

•

16 years ago

(In reply to comment #6)

I think that integrity_check could be extremely expensive; the existing practice is to never run integrity_check. I'm all for that idea if it can be done asynchronously since we don't want to cause it to take way too long to start the browser. However, we could be looking at locking the database for writes for the entire time that integrity_check is running: 10-15 seconds. Since I'm not a Mozilla hacker, I don't know what the implications of that would be.

I think the short-term answer is implement this synchronously as the patch does and then go back and make the changes that Mike Shaver outlined in his blog for 3.1/3.5: more asynchronous use of the databases project-wide. That would include an integrity_check at startup.

Jason Clinton

Assignee

Comment 9

•

16 years ago

(In reply to comment #7)
> What ensures that the copy you make is correctly written to disk in the case of
> a crash, or that it was consistent when you got to it?

Until integrity_check is implemented asynchronously, the consistency is confirmed the same way that corruption is currently detected: the cursory checks performed when OpenDatabase is called. It's not perfect but it *does* work in my own tests.

As for the copy, the BackupDatabaseFile currently uses nsIFile->copyTo. I don't know if the destination file handle is closed after this call is made. If so, when the dest. file handle is closed, it would be scheduled for page-out during the next commit interval. On ext3, that's at most 5 seconds.

Again, we can increase integrity when we can do the check and copy asynchronously without sacrificing startup time.

Mike Shaver (:shaver -- probably not reading bugmail closely)

Comment 10

•

16 years ago

This will also lose all bookmarks created during the session, as well as any adaptive learning done by awesomebar, I think.

Relying on the OS writeout doesn't work, because you can get _partial_ writeout, and the window can easily be more than 5 seconds.  (Witness the fact that an fsync can take 30 seconds to complete if there is a lot of I/O pending!)  In laptop_mode, the interval is more like 30 seconds.  I think you need an fsync after the copy is complete, and probably need one on shutdown as well if there isn't already.

I don't think we want to take this in 3.0.1 without making it happen in the background, but then we need to make sure that we're copying the database safely in the background and get a consistent view!  sqlite likely has API to help with that.  We should make sure that we understand what happens if the O/S crashes shortly after Firefox quits (possibly due to an X shutdown crash on user logout), for one thing.  My current mental model of this says that we'll have a corrupt database and no indication that we need to check it, which we'll then copy over our safety database, and go merrily about our business.  If the intent of this patch is that with it we can go to synchronous=OFF, we need to make sure that we understand the edge cases, since that's where data loss happens. :)

Status: UNCONFIRMED → NEW

Ever confirmed: true

Jason Clinton

Assignee

Comment 11

•

16 years ago

Comment on attachment 322478 [details] [diff] [review]
Implement automatic backup and restore of places database

After discussing this patch at length with Mike Shaver, I concede that all his concerns are completely valid and there's just no way that this patch is going provide sufficient integrity without blocking on an fsync() right after the copy. Doing so would cause Firefox to potentially take ages to start up.

For those who want to use libeatmydata, this patch would be an improvement on the no fsync() behavior but there's not really any guarantee that the backup is committed soon enough to prevent you from losing everything.

The right way to do this is to use some backup API in sqlite which would provide the needed thread safety to do this in a background the and the sane POSIX fsync behavior to ensure that the background backup actually gets flushed to disk. At the moment, mozStorageService doesn't provide a wrapper for the needed behavior.

Attachment #322478 - Attachment is obsolete: true

Attachment #322478 - Flags: review?(dietrich)

Mike Shaver (:shaver -- probably not reading bugmail closely)

Updated

•

16 years ago

Flags: wanted1.9.0.x?

Flags: wanted1.9.0.x-

Flags: blocking1.9.0.1?

Flags: blocking1.9.0.1-

Sergey Yanovich

Comment 12

•

16 years ago

Live automatic database backup is aka master-slave replication. Possible implementation plan A:
1. We keep a backup database in addition to the main in the profile.
2. The main is written in OFF mode.
3. Periodically, and only if the browser (or maybe even the whole system) is idle, the main database is queried for changes since last replication. The resulting changeset is written to the backup in a single transaction. The backup is written in FULL mode.
4. If the main database is ever determined to be corrupt, it is replaced with the backup using nsIFile.copyTo().

We may also choose to use an insert buffer. Possible implementation plan B:
1. We keep a temporary buffer database somewhere.
2. The temporary database is written in OFF mode.
3. Periodically, and only if the browser (or maybe even the whole system) is idle, all data from the buffer is read from the buffer. The resulting changeset is written to the main database in a single transaction. The main is written in FULL mode. The buffer is cleared on successful main write.

Plan B takes considerably less disk space, and it can use a database that is guaranteed to reside on local file system, not NFS (in /tmp or equivalent). But it is also considerably more work than Plan A.

Which path should we choose? Should this be posted to a newsgroup?

Mike Shaver (:shaver -- probably not reading bugmail closely)

Comment 13

•

16 years ago

Splitting the database use is not the same as an automatic backup, and I believe is underway in another bug; I don't know the bug off-hand.

A newsgroup would be a decent place to propose this, or d-a-f, or even another bug.  This bug isn't the right place.  I meant to mark this WONTFIX based on the tragic outcome of my conversation with Jason; doing so now.

Status: NEW → RESOLVED

Closed: 16 years ago

Resolution: --- → WONTFIX

Gervase Markham [:gerv]

Comment 14

•

15 years ago

Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h".

In Thunderbird 3.0b, you do that as follows:
Tools | Message Filters
Make sure the correct account is selected. Click "New"
Conditions: Body   contains   places-to-b-and-h
Change the action to "Delete Message".
Select "Manually Run" from the dropdown at the top.
Click OK.

Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter.

Gerv

Component: Places → Bookmarks & History

QA Contact: places → bookmarks