Closed Bug 1394845 Opened 8 years ago Closed 8 years ago

places.sqlite gets corrupt in Firefox v55

Categories

(Toolkit :: Places, defect)

Product:

Component:

Version:

55 Branch

Type:

defect

Priority:

Not set

Severity:

normal

Tracking

()

Status:

VERIFIED INCOMPLETE

Tracking Flags:

Tracking

Status

firefox-esr52

---

unaffected

firefox55

---

wontfix

firefox56

+

fix-optional

firefox57

+

fix-optional

People

(Reporter: andrei.faber, Unassigned)

Details

(Keywords: dataloss)

Reporter

Description

•

8 years ago

User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:55.0) Gecko/20100101 Firefox/55.0 Build ID: 20170824053622 Steps to reproduce: Upgraded to v55 Actual results: My places.sqlite file was marked as corrupt and all website history disappeared. (The file itself doesn't have errors when I checked it in mysql; there is only a single warning "*** in database main *** Page 24718 is never used" Expected results: Nothing.

Kohei Yoshino [:kohei]

Updated

•

8 years ago

Component: Untriaged → Places

Keywords: dataloss

Product: Firefox → Toolkit

Kohei Yoshino [:kohei]

Comment 1

•

8 years ago

Perhaps related: https://www.reddit.com/r/firefox/comments/6w48lz/firefox_updated_to_5503_everything_appears_to/

Panos Astithas (he/him) [:past] (please ni?)

Comment 2

•

8 years ago

Did you use Firefox Refresh after you upgraded? Would you be willing to share your corrupt database with one of our engineers to figure out what is wrong with it?

Flags: needinfo?(andrei.faber)

Reporter

Comment 3

•

8 years ago

(In reply to Panos Astithas [:past] (56 Regression Engineering Owner) (please ni?) from comment #2) > Did you use Firefox Refresh after you upgraded? Would you be willing to > share your corrupt database with one of our engineers to figure out what is > wrong with it? 1. No 2. No, it's private I can confirm that the same places.sqlite works perfectly with Firefox v54

Flags: needinfo?(andrei.faber)

Sylvestre Ledru [:Sylvestre]

Comment 4

•

8 years ago

fandrei, is it 55.0 or 55.0.3 ?

Kohei Yoshino [:kohei]

Comment 5

•

8 years ago

[Tracking Requested - why for this release]: A dataloss issue in Firefox 55, still needs investigating though.

status-firefox55: --- → affected

status-firefox56: --- → ?

status-firefox57: --- → ?

status-firefox-esr52: --- → unaffected

tracking-firefox55: --- → ?

tracking-firefox56: --- → ?

Keywords: regression

Reporter

Comment 6

•

8 years ago

(In reply to Sylvestre Ledru [:sylvestre] from comment #4) > fandrei, is it 55.0 or 55.0.3 ? It's 55.0.3

Marco Bonardo [:mak]

Comment 7

•

8 years ago

Were your bookmarks restored? did you only lose history? Did you ever execute any manual SQL query on this database? Did you have add-ons that could have done changes to bookmarks and history? When we migrate the schema, there can be data incoherencies that are unmanageable. Those could even stay asymptomatic for many versions, until a migration touches exactly that data. That's what I suspect is happening here. Unfortunately without having the db I can't debug the problem. Unfortunately if your db can't be upgraded and you don't want to lose history, you'll also be stuck on v54, because it's hard to fix something we can't reproduce, and we still don't have an history backup system in place.

Marco Bonardo [:mak]

Comment 8

•

8 years ago

Off-hand I don't think this is related to the profile reset problem. It's likely that on every update we have some users whose db is found to be corrupt and replaced, as well as it's likely only a minority of users will file a bug here. We have an opt-out probe on that tells us how many corruptions we saw from 55 on, and I indeed see it being high on 55 (almost 140k pings), though we don't have a previous value to compare with, nor I can tell how many of those are repeated tries from the same profile. 56 looks much better stats-wise.

Reporter

Comment 9

•

8 years ago

> Were your bookmarks restored? did you only lose history? After the first start of Firefox after upgrade, the bookmarks were empty. However, they appeared after restarting Firefox; they only lost all favicons. > Did you ever execute any manual SQL query on this database? No. > Did you have add-ons that could have done changes to bookmarks and history? No.

Reporter

Comment 10

•

8 years ago

PS I can perform any checks on my places.sqlite file if you have ideas where to look.

Marco Bonardo [:mak]

Comment 11

•

8 years ago

Tomorrow I could probably build some SQL queries and ask you to run those and report results. But first of all, we should understand if "Page 24718 is never used" is a mistake that can be fixed, since otherwise the db will just be corrupt forever. Two things to try (after a backup) is executing a VACUUM on the db, then checking again with PRAGMA integrity_check. Another thing to try is REINDEX. Once PRAGMA integrity_check returns OK, we can proceeding into checking the data coherence.

Marco Bonardo [:mak]

Updated

•

8 years ago

Flags: needinfo?(mak77)

Reporter

Comment 12

•

8 years ago

> But first of all, we should understand if "Page 24718 is never used" is a > mistake that can be fixed, since otherwise the db will just be corrupt > forever. Two things to try (after a backup) is executing a VACUUM on the db, > then checking again with PRAGMA integrity_check. Another thing to try is > REINDEX. Executing vacuum and then PRAGMA integrity_check, getting the following error: "database disk image is malformed: PRAGMA integrity_check".

Marco Bonardo [:mak]

Comment 13

•

8 years ago

I see, I'm not surprised that 55 bails out and considers the database corrupt, because it looks really unrecoverable. You may try to build a new one with the steps outlined here: https://developer.mozilla.org/en-US/docs/Mozilla/Tech/Places/places.sqlite_Database_Troubleshooting#How_to_(try_to)_recover_from_a_corrupt_places.sqlite

Reporter

Comment 14

•

8 years ago

(In reply to Marco Bonardo [::mak] from comment #13) > I see, I'm not surprised that 55 bails out and considers the database > corrupt, because it looks really unrecoverable. > You may try to build a new one with the steps outlined here: > https://developer.mozilla.org/en-US/docs/Mozilla/Tech/Places/places. > sqlite_Database_Troubleshooting#How_to_(try_to)_recover_from_a_corrupt_places > .sqlite Trying these steps... It sill leaves me with 2 questions: 1. Why pre-v55 Firefox worked with this file and didn't have any problems? 2. Why it didn't show any warnings that the database is severely damaged?

Reporter

Comment 15

•

8 years ago

PS playing with user_version haven't helped at all. Any other ideas how to fix the file, or how to export the history from v54 and import in v55?

Marco Bonardo [:mak]

Comment 16

•

8 years ago

(In reply to fandrei from comment #14) > Trying these steps... It sill leaves me with 2 questions: > 1. Why pre-v55 Firefox worked with this file and didn't have any problems? 55 moves favicons to a new database and then runs a vacuum on the original one because a lot of space can be regained from it. Looks like you have one kind of corruption that gets uncovered exactly by a vacuum operation. Usually we also vacuum on idle, but maybe in your case maintenance never found a long enough idle to run. > 2. Why it didn't show any warnings that the database is severely damaged? Because until the vacuum was executed your corruption was asymptomatic, the times something failed could have been unnoticed or considered glitches. We currently detect db corruption at 2 times: 1. on startup, but this depends on whether sqlite3_open succeeds. There are a lot of asymptomatic corruptions where open just works, and most of the queries work. 2. on weekly maintenance, that happens on a long idle. Unfortunately for some users idle doesn't seem to ever happen, so we never hit this. We surely can improve in proactive detection. The only thing that can detect a db corruption properly is PRAGMA integrity_check, but it's an expensive operation we can't run lightly when opening the db. And with a bogus idle there's no big choice. I filed time ago bug 1125157 to allow consumers to listen to "corrupt" notifications from Sqlite, and we should also integrate (or try to) the document I pointed out as a Firefox functionality. Firefox did what it could once it was able to detect the situation, it couldn't recover the db, so it just created a new one, and imported bookmarks into it. Ideally it would be a transparent process, if we'd have an history backup (bug 431274).

Marco Bonardo [:mak]

Comment 17

•

8 years ago

(In reply to fandrei from comment #15) > PS playing with user_version haven't helped at all. Any other ideas how to > fix the file, or how to export the history from v54 and import in v55? Please pay attention to follow the steps one by one starting from the db you had BEFORE the vacuum. The user_version is just one of the steps. Where does the tutorial fail exactly?

Reporter

Comment 18

•

8 years ago

(In reply to Marco Bonardo [::mak] from comment #16) > 2. on weekly maintenance, that happens on a long idle. Unfortunately for > some users idle doesn't seem to ever happen, so we never hit this. We surely > can improve in proactive detection. A 2 months old backup version of this file exhibits the same error. Apparently, the integrity check was never executed during these 2 months (at least). > The only thing that can detect a db corruption properly is PRAGMA > integrity_check, but it's an expensive operation we can't run lightly when > opening the db. Is it? It only took 1570 msecs on my PC.

Reporter

Comment 19

•

8 years ago

(In reply to Marco Bonardo [::mak] from comment #17) > Please pay attention to follow the steps one by one starting from the db you > had BEFORE the vacuum. The user_version is just one of the steps. Where does > the tutorial fail exactly? Those instructions didn't fail. However, Firefox lost all the history and bookmarks after I used the resulting file.

Marco Bonardo [:mak]

Comment 20

•

8 years ago

I'm sorry to insist, it's possible you did something wrong, the cloning process (.clone command) copies over everything, included history, and then you just need to properly set user_version, or Firefox may again consider it corrupt because it can't find a proper version. Don't start from the -corrupt file, start from the file that Firefox 54 can open and use properly, clone that into a new db. Currently there's no other simple way to move history, any other try may be very complex and end up still creating data coherency problems. (In reply to fandrei from comment #18) > > The only thing that can detect a db corruption properly is PRAGMA > > integrity_check, but it's an expensive operation we can't run lightly when > > opening the db. > > Is it? It only took 1570 msecs on my PC. Right, 1.5s where we'd block most of the other db operations and possibly cause an I/O bottleneck. We have users on very old systems, we can't just fire a check at a random time. That's why we use idle, but idle detection has some bugs.

Flags: needinfo?(mak77)

Gerry Chang [:gchang]

Comment 21

•

8 years ago

Track 56+/57+ as data loss regression.

tracking-firefox56: ? → +

tracking-firefox57: --- → +

Marco Bonardo [:mak]

Comment 22

•

8 years ago

this is not a regression, corrupt databases happen at every version. I'm indeed not sure what we could do about it, apart from trying to improve the experience, but that's not something we can do lightly and uplift.

Mike Taylor [:miketaylr]

Updated

•

8 years ago

status-firefox55: affected → wontfix

Mike Taylor [:miketaylr]

Updated

•

8 years ago

tracking-firefox55: ? → ---

Reporter

Comment 23

•

8 years ago

(In reply to Marco Bonardo [::mak] from comment #20) > Don't start from the > -corrupt file, start from the file that Firefox 54 can open and use > properly, clone that into a new db. That's exactly what I did. > Right, 1.5s where we'd block most of the other db operations and possibly > cause an I/O bottleneck. We have users on very old systems, we can't just > fire a check at a random time. Firefox often takes way much more time to start than 1.5s. > That's why we use idle, but idle detection > has some bugs. I'd say that it's just completely defunct.

Reporter

Comment 24

•

8 years ago

(In reply to Marco Bonardo [::mak] from comment #22) > I'm > indeed not sure what we could do about it, apart from trying to improve the > experience, but that's not something we can do lightly and uplift. Make Firefox really check for database integrity. At very least, add a button in the settings to invoke it manually.

Marco Bonardo [:mak]

Comment 25

•

8 years ago

there is already a button in about:support but it's not complete, indeed I plan (as soon as I have time) to augment that functionality to do a full check and also a more complete try to fix the db.

Reporter

Comment 26

•

8 years ago

(In reply to Marco Bonardo [::mak] from comment #25) > there is already a button in about:support but it's not complete, indeed I > plan (as soon as I have time) to augment that functionality to do a full > check and also a more complete try to fix the db. By the way, did you ever consider the idea to use a plain text file for storing places? This functionality doesn't really require SQL features, and text file would be way much easier to validate or fix it manually. And faster as well.

Marco Bonardo [:mak]

Comment 27

•

8 years ago

It wouldn't be faster and we definitely need the Sqlite ACID guarantees, plus the awesome bar wouldn't be possible without it. Indeed most browsers nowadays use Sqlite for that.

Reporter

Comment 28

•

8 years ago

(In reply to Marco Bonardo [::mak] from comment #27) > It wouldn't be faster Just test it, plain file is always faster when data fits into RAM. > and we definitely need the Sqlite ACID guarantees, I've lost all my history, and quite a few users lost all bookmarks as well. It doesn't look like these guarantees really guarantee anything.

Marco Bonardo [:mak]

Comment 29

•

8 years ago

it's a few cases out of millions, and ACID cannot guarantee safety if a filesystem lies (many do), or if a system has (even small) memory, disk or overclock issues causing data breakage. Well in those cases nothing can really do anything.

Reporter

Comment 30

•

8 years ago

(In reply to Marco Bonardo [::mak] from comment #29) > it's a few cases out of millions You wrote that for v55 there were 140k corruption cases reported by telemetry :) And there are more users who turned the telemetry off. > and ACID cannot guarantee safety if a > filesystem lies (many do) Or just a power outage or Firefox crash. Those guarantees are very weak, anyway. And manual data recovery is pretty much impossible for SQLite.

Marco Bonardo [:mak]

Comment 31

•

8 years ago

no, power outage or crashes can't corrupt Sqlite in normal conditions (if the conditions in comment 29 are not satisfied). (In reply to fandrei from comment #30) > You wrote that for v55 there were 140k corruption cases reported by > telemetry :) And there are more users who turned the telemetry off. 140k pings, the same profile may send multiple ones. And you should compare that to 100 millions users.

Reporter

Comment 32

•

8 years ago

(In reply to Marco Bonardo [::mak] from comment #31) > no, power outage or crashes can't corrupt Sqlite in normal conditions How can you be completely sure about that? > And you should compare > that to 100 millions users. That's a few in a thousand, not a few in a million. Difference in 3 orders of magnitude.

Marco Bonardo [:mak]

Comment 33

•

8 years ago

I'm sorry but I feel like the discussion continues in loop, please read the sqlite3 documentation, their team has a long time expertise with these issues, and we also have some knowledge after 10 years of releases :) We could go on for hours with technical details, but none of them is useful to move on this ticket.

Panos Astithas (he/him) [:past] (please ni?)

Comment 34

•

8 years ago

Removing the regression keyword as DB corruption reports appear in a few cases, but all the time during upgrades. Marking as fix-optional for 56 and 57 as there is no actionable info yet in this bug. We will reconsider the priority if we can get steps to reproduce the problem so proper investigation can occur.

status-firefox56: ? → fix-optional

status-firefox57: ? → fix-optional

Keywords: regression

Sylvestre Ledru [:Sylvestre]

Updated

•

8 years ago

Status: UNCONFIRMED → NEW

Ever confirmed: true

Reporter

Comment 35

•

8 years ago

> I'm sorry but I feel like the discussion continues in loop, please read the sqlite3 documentation, their team has a long time expertise with these issue, and we also have some knowledge after 10 years of releases :) Database corruption problem isn't solved, it happens for hundreds thousands users, and database consistency isn't really checked. Those are the facts.

Marco Bonardo [:mak]

Comment 36

•

8 years ago

We are making work to improve reliability of the data, but there's nothing specific we can address for this report, thus resolving for now. We'll keep working on general data reliability.

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → INCOMPLETE

Marco Bonardo [:mak]

Updated

•

8 years ago

Status: RESOLVED → VERIFIED

You need to log in before you can comment on or make changes to this bug.