Closed Bug 506905 Opened 15 years ago Closed 14 years ago

detect real database corruptions when the software is not closed cleanly, and replace the db

Categories

(Toolkit :: Places, defect)

defect
Not set
critical

Tracking

()

RESOLVED DUPLICATE of bug 609286

People

(Reporter: mak, Unassigned)

References

Details

(Keywords: dataloss)

There are cases where, due to a particular crash or freeze condition, places.sqlite gets badly corrupted.
The corruption is detectable by running PRAGMA integrity_check; or PRAGMA quick_check;
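
For illustration only, here is a minimal sketch of running those pragmas from outside the application, using Python's standard sqlite3 module. The function name check_integrity is hypothetical and this is not Places code; it only shows how the two pragmas report their result (a single row containing "ok" when no problem is found).

```python
import sqlite3

def check_integrity(db_path, quick=False):
    """Return (ok, messages) for the SQLite file at db_path.

    Hypothetical helper, not part of Places. quick=True runs the cheaper
    PRAGMA quick_check instead of PRAGMA integrity_check.
    """
    pragma = "quick_check" if quick else "integrity_check"
    try:
        conn = sqlite3.connect(db_path)
        try:
            rows = conn.execute(f"PRAGMA {pragma};").fetchall()
        finally:
            conn.close()
    except sqlite3.DatabaseError as exc:
        # Severe corruption can make the pragma itself fail.
        return False, [str(exc)]
    messages = [row[0] for row in rows]
    # A healthy database yields exactly one row containing "ok".
    return messages == ["ok"], messages
```

A caller could run the quick variant first and fall back to the full check only when it reports a problem.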

We should detect when the db connection has not been closed cleanly, and check integrity only in that case. We could do the check with preventive maintenance and set a pref so that on next start Places will replace the db before opening the connection.
We would also need a way to restore history in such a case, but that should not block us if backing up history fails due to the corruption.
When the above corruption happens anything could happen, and we cannot guarantee the user's data.
Sometimes the corruption could stay undetected by the user for days; in such a case we could be overwriting some users' bookmarks with newer data (see bug 460651), and that would be a bad dataloss.
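
One way to picture the "check only after an unclean close" idea is a marker that is set while the connection is open and cleared on clean shutdown. The sketch below uses a marker file purely as a stand-in for the pref mentioned above; the class name, the file name, and the needs_integrity_check helper are all hypothetical, and nothing here reflects how Places actually tracks shutdown state.

```python
import os

class ShutdownMarker:
    """Hypothetical stand-in for a 'db was left open' pref."""

    def __init__(self, profile_dir):
        # Assumed file name, chosen only for this example.
        self.path = os.path.join(profile_dir, "places-open.marker")

    def mark_open(self):
        # Called right after the connection is opened.
        with open(self.path, "w") as f:
            f.write("open")

    def mark_closed(self):
        # Called only after the connection was closed cleanly.
        try:
            os.remove(self.path)
        except FileNotFoundError:
            pass

    def was_unclean(self):
        # A marker left over from the previous run means no clean close.
        return os.path.exists(self.path)

def needs_integrity_check(marker):
    """Decide at startup whether the expensive check should run."""
    return marker.was_unclean()
```

At startup the caller would consult needs_integrity_check before calling mark_open, so that a crash or power loss in the previous session triggers the integrity check exactly once.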

So maybe we should also increase the number of backups we retain, or have monthly backups in addition to daily backups.
Keywords: dataloss
Talking with Marco on IRC we narrowed this bug down a bit. I have such a situation right now with my daily profile. After running integrity checks against different versions of my places.sqlite file I found that the corruption happened between July 10th and 11th.

I will try to minimize my places.sqlite so I can attach a minimized testcase.
Severity: normal → critical
I'll now attach an example places.sqlite file that demonstrates such a corrupt sqlite file. My system froze a couple of days ago, and some invalid entries were written to the database after the restart. That corrupted the database.
That file is really badly corrupted, and I can confirm we cannot do anything about that.

Shawn, do you think there is any sense in sending the file to the SQLite team for analysis (even if I doubt it will be of much use to them without steps to cause the corruption)?
It's only useful in two situations:
1) it crashes sqlite
2) we have steps to reproduce the corruption.

Given that we have neither, it's not worth sending it to them.
Here from bug 509252.  After spending way too much time trying to figure out what might have caused corruption in my formhistory.sqlite file, I think I found it.

While working one night, my UPS strangely went into an overload state (even when I'm really using my systems heavily, it's still only at 75% load, normally hangs around 58%).  This promptly shut the power down to my Mac Pro, in which I was actively using Firefox 3.5.2 beta at the time.

I got things running again, and at some point realized that the integrated Built-in Search Bar in Firefox wasn't functioning.  A few mozillazine and google searches later, I learned that corruption in a .sqlite file was likely to blame.  After dumping the DB to sql, then opening a new .sqlite file and importing, the corruption was skipped cleanly (NULL values where the schema was defined as NOT NULL), and all was well again.

I went to bed around 4am that night, and was back at things around 11am.  The next good entry in my formhistory.sqlite was around 2pm later that day, probably after I restarted Firefox a second time when I realized the Search Bar was non-functional.

The sqlite corruption was fairly minor it seems, as I was able to recover 1899 of the 1918 rows using sqlite manually.  It was clear the sqlite file was corrupted with data that shouldn't have been insertable, but sqlite was able to work around it and read most of the rest of it.

The same process I used to recover my data could be done in this proposed process as well, hopefully with little or no data loss from the corrupted .sqlite file.  I can document the process, though most of it is already documented in bug 509252, now marked as a duplicate of (or at least related to) this bug.
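
As a rough illustration of the dump-and-reimport approach described above, the sketch below replays a SQL dump into a fresh database and skips any statement the new database rejects, such as a row with NULL in a NOT NULL column. It uses Python's sqlite3 module rather than the sqlite command line used in the original recovery, and the names replay_statements and salvage are hypothetical. A badly damaged source may still fail while being dumped; in that case the error propagates after the rows copied so far have been committed.

```python
import sqlite3

def replay_statements(statements, dst_path):
    """Run SQL statements against dst_path, skipping ones that fail.

    Returns (inserted, skipped) counts for INSERT statements only.
    """
    dst = sqlite3.connect(dst_path)
    # Let the dump's own BEGIN/COMMIT statements control the transaction.
    dst.isolation_level = None
    inserted = skipped = 0
    try:
        for stmt in statements:
            is_insert = stmt.lstrip().upper().startswith("INSERT")
            try:
                dst.execute(stmt)
                inserted += is_insert
            except sqlite3.Error:
                # Rejected by the new database (for example a constraint
                # violation); skip the statement and keep going.
                skipped += is_insert
    finally:
        # Keep whatever was copied if reading the source stopped early.
        if dst.in_transaction:
            dst.execute("COMMIT")
        dst.close()
    return inserted, skipped

def salvage(src_path, dst_path):
    """Dump src_path statement by statement and replay it into dst_path."""
    src = sqlite3.connect(src_path)
    try:
        return replay_statements(src.iterdump(), dst_path)
    finally:
        src.close()
```

Comparing the two returned counts gives the same kind of figure quoted above (rows recovered versus rows lost).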
I think bug 609286 is more likely to be implemented, so I'm duping this one to it.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → DUPLICATE