Closed
Bug 1240238
Opened 9 years ago
Closed 9 years ago
NS_ERROR_STORAGE_CONSTRAINT on webappsstore.sqlite breaks lots of sites
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
RESOLVED
FIXED
mozilla47
Tracking | Status | |
---|---|---|
firefox45 | --- | unaffected |
firefox46 | + | fixed |
firefox47 | + | fixed |
People
(Reporter: andy+bugzilla, Assigned: mayhemer)
References
(Depends on 1 open bug)
Details
(Keywords: regression)
Attachments
(3 files, 2 obsolete files)
337.56 KB,
image/png
|
Details | |
2.12 MB,
application/zip
|
Details | |
7.21 KB,
patch
|
mayhemer
:
review+
lizzard
:
approval-mozilla-aurora+
|
Details | Diff | Splinter Review |
Found that I was having a JS error on viewing many sites: NS_ERROR_STORAGE_CONSTRAINT. I found that I had a places.sqlite db of 80 MB in my profile. I stopped Firefox, moved that out of the way and restarted and things started working again.
This isn't reproducible in Nightly, so this might already be fixed, but since this broke a large number of sites (bill.com, datadoghq.com, addons.mozilla.org etc) thought it was worth reporting.
Comment 1•9 years ago
|
||
I'm not sure how a problem in places.sqlite may break sites, the only thing it does is storing history visits and marking visited links coloring, and those go through an async API (and also through xpidl with e10s enabled).
It would be more likely for indexedDB or DOMStorage to break a site than places.sqlite.
The side of the database doesn't matter at all, mine is even bigger and I have no problem.
If you could send me the broken places.sqlite I could have a look but I'm honestly not sure what I should look at since there's no relation between places and content that may break content.
Comment 2•9 years ago
|
||
ehr, I said xpidl but I meant ipc... btw, that's minor.
Comment 3•9 years ago
|
||
Also, having a stack of that NS_ERROR_STORAGE_CONSTRAINT from the console would have been useful.
Reporter | ||
Comment 4•9 years ago
|
||
Note that the graphs are blank.
Perhaps it wasn't places.sqlite, but rather webappsstore.sqlite. I moved the latter (at 78MB) out of the way and the graph re-rendered. If you can tell me how to get any more data for you, please let me know.
Comment 5•9 years ago
|
||
yep, it sounds more like DOM Storage.
I think having webappsstore.sqlite and an example page where the bug happens could help.
But mostly I guess the problem is webappsstore.sqlite is corrupt and we don't have a way to detect corruption on-the-fly (that is, if the database opens, we consider it good, even if it's not)
Component: General → DOM
Product: Firefox → Core
Updated•9 years ago
|
Summary: NS_ERROR_STORAGE_CONSTRAINT breaks lots of sites → NS_ERROR_STORAGE_CONSTRAINT on webappsstore.sqlite breaks lots of sites
Why are we throwing an NS_ERROR_BLAH and not some sort of DOM exception?
Comment 7•9 years ago
|
||
Good question, but without a stack it's hard to guess. If anyone can get by mail the corrupt db and try to reproduce, would be nice to see where we fail.
Assignee | ||
Comment 8•9 years ago
|
||
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #6)
> Why are we throwing an NS_ERROR_BLAH and not some sort of DOM exception?
Because we take whatever mozStoreage spits w/o converting it to a DOM exception at [1]. This and probably [2] are the places to convert mLoadResult to anything of your liking.
[1] http://hg.mozilla.org/mozilla-central/annotate/a77b73c7723e/dom/storage/DOMStorageCache.cpp#l656
[2] http://hg.mozilla.org/mozilla-central/annotate/a77b73c7723e/dom/storage/DOMStorageCache.cpp#l239
Assignee | ||
Comment 9•9 years ago
|
||
The best way is IMO to walk the 6 methods on DOMStorage class exposed to DOM and make sure aRv is converted to something DOM related.
Because there may be more places then just those at comment 8, most coming from one place tho - DOMStorageDBThread::InsertDBOp. The database can be in some failure state (a mozStorage originated nsresult) which we then directly return to DOM.
Comment 10•9 years ago
|
||
even if we fix reporter errors, if this breaks websites it's not great, we likely need to detect this corruption and throw away current webappsstore.sqlite. If it's complex to do while Firefox is running, we could just set a pref, read it on startup and replace it there.
I suppose all the pages using DOM Storage can recreate the missing data, right?
Comment 11•9 years ago
|
||
Something like what I suggested in bug 1125157 would be useful here.
We really can't expect every consumer to implement their own Storage wrapper to catch corruption, having an onCorrupt handler on the connection would be simpler...
Depends on: 1125157
Comment 12•9 years ago
|
||
(In reply to Marco Bonardo [::mak] from comment #11)
> Something like what I suggested in bug 1125157 would be useful here.
> We really can't expect every consumer to implement their own Storage wrapper
> to catch corruption, having an onCorrupt handler on the connection would be
> simpler...
Also seeing this on :rfeeley's machine here. https://arty.name/localstorage.html reports "0 items saved" into storage. Moving "webappsstore.sqlite" out of the profile directory fixes it. the SQLite size is 55mb. I'm gonna check with Ryan if he wants to share the corrupt db or not.
Comment 13•9 years ago
|
||
Okay Ryan didn't want to share the DB. If you explain the steps to trace the SQLite error we can do that on his machine when possible.
Comment 14•9 years ago
|
||
Could be related: https://bugzilla.mozilla.org/show_bug.cgi?id=1236557
Why would this be related to IndexedDB?
The first thing that would be helpful is to break at http://mxr.mozilla.org/mozilla-central/source/storage/mozStoragePrivateHelpers.cpp#68 and get a stack.
Flags: needinfo?(vlad)
Comment 17•9 years ago
|
||
Could you please run PRAGMA integrity_check; using Sqlite Manager (or sqlite3 command line) on the db? It may be interesting to check the results.
Off-hand having a corrupt db is not very useful, the only thing we could do is send it to Sqlite team to check if they can figure out something about how it got corrupted, but if it isn't shareable, we can't even do that. with this kind of corruption we could just throw at any point during any read or write, that's why I don't think fixing a specific point in code helps, unless it happens during the connection startup (but then I assume DOM Storage already has code to throw away the db if open fails).
Comment 18•9 years ago
|
||
Ryan is super busy this week with UX meetings, so just going to ni? him.
Ryan when you have time could you run (on the ~/Desktop file that we backed up):
```
sqlite3 webappsstore.sqlite
```
that will show you
```
sqlite>
```
in the terminal, type in:
```
sqlite> PRAGMA integrity_check;
```
Flags: needinfo?(vlad) → needinfo?(rfeeley)
Reporter | ||
Comment 19•9 years ago
|
||
I did this in my profile:
```
oding98t.default sqlite3 webappsstore.sqlite
SQLite version 3.8.5 2014-08-15 22:37:57
Enter ".help" for usage hints.
sqlite> PRAGMA integrity_check;
ok
sqlite>
```
Here's how the first run pages looks for me in that profile: https://www.dropbox.com/s/km2do3spxs1rjgb/Screenshot%202016-01-21%2011.23.14.png?dl=0&preview=Screenshot+2016-01-21+11.23.14.png
Comment 20•9 years ago
|
||
so, it's the data that is somehow broken, doesn't look like something related to Storage.
there are also reports in bug 1238354, did we change anything in DOM Storage recently?
Updated•9 years ago
|
Flags: needinfo?(rfeeley)
Comment 21•9 years ago
|
||
(In reply to Andy McKay [:andym] from comment #19)
> I did this in my profile:
>
> ```
> oding98t.default sqlite3 webappsstore.sqlite
>
> SQLite version 3.8.5 2014-08-15 22:37:57
> Enter ".help" for usage hints.
> sqlite> PRAGMA integrity_check;
> ok
> sqlite>
> ```
>
> Here's how the first run pages looks for me in that profile:
> https://www.dropbox.com/s/km2do3spxs1rjgb/Screenshot%202016-01-21%2011.23.14.
> png?dl=0&preview=Screenshot+2016-01-21+11.23.14.png
Do you use this profile with different versions /builds of Firefox?
Ryan used his profile with Stable and Nightly.
Assignee | ||
Comment 22•9 years ago
|
||
(In reply to Marco Bonardo [::mak] from comment #20)
> so, it's the data that is somehow broken, doesn't look like something
> related to Storage.
>
> there are also reports in bug 1238354, did we change anything in DOM Storage
> recently?
Yes, bug 1165214. It changes the database schema, however should be compatible in both ways mostly. I remember testing going back and forth the version before and after the patch with the same database (for both cases - created first with and w/o the patch). Everything was working for me, but something could well slipped between my fingers.
Assignee | ||
Comment 23•9 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #22)
> (In reply to Marco Bonardo [::mak] from comment #20)
> > so, it's the data that is somehow broken, doesn't look like something
> > related to Storage.
> >
> > there are also reports in bug 1238354, did we change anything in DOM Storage
> > recently?
>
> Yes, bug 1165214. It changes the database schema, however should be
> compatible in both ways mostly. I remember testing going back and forth the
> version before and after the patch with the same database (for both cases -
> created first with and w/o the patch). Everything was working for me, but
> something could well slipped between my fingers.
However, at this time it's just on mozilla-central.
Reporter | ||
Comment 24•9 years ago
|
||
(In reply to Vlad Filippov :vladikoff from comment #21)
> Do you use this profile with different versions /builds of Firefox?
> Ryan used his profile with Stable and Nightly.
Yes I switch between Stable and Nightly frequently on the same profile.
Comment 25•9 years ago
|
||
(In reply to Andy McKay [:andym] from comment #19)
> I did this in my profile:
>
> ```
> oding98t.default sqlite3 webappsstore.sqlite
>
> SQLite version 3.8.5 2014-08-15 22:37:57
> Enter ".help" for usage hints.
> sqlite> PRAGMA integrity_check;
> ok
> sqlite>
> ```
>
> Here's how the first run pages looks for me in that profile:
> https://www.dropbox.com/s/km2do3spxs1rjgb/Screenshot%202016-01-21%2011.23.14.
> png?dl=0&preview=Screenshot+2016-01-21+11.23.14.png
Thanks Andy, we have been trying to track down the cause of this and related errors that are reported to FxA metrics. FxA from the firstrun flow also reports:
* NS_ERROR_STORAGE_IOERR
* NS_ERROR_FILE_NO_DEVICE_SPACE
* NS_ERROR_FILE_CORRUPTED
* NS_ERROR_STORAGE_CONSTRAINT
* NS_ERROR_FAILURE
* NS_ERROR_STORAGE_BUSY
* NS_ERROR_OUT_OF_MEMORY
* SecurityError (This operation is insecure)
Some of these errors occur several thousand times an hour, the main reported browsers are Fx 41->Fx 43. What I find confusing about these errors is users should only see the firstrun flow when they create a new profile. A corrupt sqlite file with a new profile is hard to understand.
Comment 26•9 years ago
|
||
I recently experienced this error, NS_ERROR_STORAGE_CONSTRAINT, while attempting to access https://accounts.firefox.com. Additionally, I often switch between Nightly and Stable builds.
Comment 27•9 years ago
|
||
Attached o6y6ubjz.qwqeoroe.zip is a test profile with the broken storage from /Users/[username]/Library/Application Support/Firefox/Profiles
Open any site that uses local storage with that profile (such as https://accounts.firefox.com/) and it should throw the error.
Flags: needinfo?(khuey) → needinfo?(honzab.moz)
Assignee | ||
Comment 29•9 years ago
|
||
I think I screwed up something with the database update code in bug 1165214.
Going to try the test profile.
Comment 30•9 years ago
|
||
Honza, I suspect the problem is you removed a UNIQUE constraint (scope, key) in the new schema, that means now you can have multiple rows having the same (scope, key), that means your previous schema version, trying to do this:
"CREATE UNIQUE INDEX IF NOT EXISTS scope_key_index"
" ON webappsstore2(scope, key)"));
will now fail.
That means you cannot use the existence of that index to detect downgrades.
that means InitDatabase fails in the old version. What do we do in such a case? replace the db? From a quick look sounds like we just bail out?
Btw, since you are already looking into this I didn't dig too deep, this is just what I noticed from comparing that broken db with mine.
Flags: needinfo?(mak77)
Comment 31•9 years ago
|
||
SELECT count(*), scope, key FROM webappsstore2 GROUP BY scope, key HAVING count(*) > 1;
looks like accounts.firefox.com falls into the multiple scope, key case.
Assignee | ||
Comment 32•9 years ago
|
||
So, what's happening here (with the provided profile):
- the database has been apparently updated to schema 1, but the origin_key_index is missing from a reason I don't fully understand ; we don't try to create it again when already updated to schema 1
- in Fx versions w/o bug 1165214 (=schema 0) we fail to create scope_key_index index -> hence the storage failures bubbling up to DOM
- to fail the index creating is only possible when the scope+key pairs are not unique (I found 22 duplicates for "moc.xoferif.stnuocca.:https:443" and "crosstab.MESSAGE_KEY")
- that may happen only when the data were added to the database while there already were not a unique index (with bug 1165214 (=schema 1) it's origin_key_index)
I'm not sure how that happened, since we don't allow writes to the database when we fail any part of the open and update process, half-updated tables should be threw away as all is happening in a transaction and is rolled back on a failure.
I suspect we failed the update but the transaction didn't rollback correctly and left the schema broken.
I know about a certain problem that we loose uniqueness when origin attributes stored with schema 1 expands beyond what the "scope" column can store (which is only appid and inbrowser). It might be this case.
Has user context already landed on m-c? If so, Vlad, have you been using it?
The fix here is:
- preserve uniqueness of scope to allow schema 1 -> schema 0 -> schema 1 jumps be fully handled
- try to create the index and on a failure just drop the database and start over with an empty one
Assignee | ||
Comment 33•9 years ago
|
||
(In reply to Marco Bonardo [::mak] from comment #31)
> SELECT count(*), scope, key FROM webappsstore2 GROUP BY scope, key HAVING
> count(*) > 1;
>
> looks like accounts.firefox.com falls into the multiple scope, key case.
Why are you wasting time when I'm already looking into this?
Comment 34•9 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #33)
> Why are you wasting time when I'm already looking into this?
I'm sorry, I just spent a few minutes looking at the sqlite file, just to be sure it was not related to Storage and noticed that discrepancy with indices.
Assignee | ||
Comment 35•9 years ago
|
||
(In reply to Marco Bonardo [::mak] from comment #34)
> (In reply to Honza Bambas (:mayhemer) from comment #33)
> > Why are you wasting time when I'm already looking into this?
>
> I'm sorry, I just spent a few minutes looking at the sqlite file, just to be
> sure it was not related to Storage and noticed that discrepancy with indices.
Aha. Understand. It just looked to me a bit like we are doing the same thing :)
Assignee | ||
Comment 36•9 years ago
|
||
Aim is to preserve both data and functionality.
- the patch enforces creation of tables and indexes, on failure we throw the database away and start over
- the |scope| column still present for backward compat can tho only keep appid and inbrowser and not any other attributes that may be set on the principal of a storage
- hence, I store the whole origin attributes suffix as another part of the |scope| string
- the old code will just ignore it and will not find the records (which is actually pretty correct) while the uniqueness is preserved
- when updating forward after the schema 1 database has been used on schema 0 code, we fully restore scoping information from the |scope| column now
Roughly tested.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e590ecd26d07
Attachment #8711768 -
Flags: review?(bugs)
Comment 37•9 years ago
|
||
Comment on attachment 8711768 [details] [diff] [review]
v1
># HG changeset patch
># User Honza Bambas <honzab.moz@firemni.cz>
># Parent 23937b60386e7cd0c86b7f73eef098c71fc6675f
>Bug 1240238 - Full DOMStorage schema 1 scope update uniqueness, r=smaug
>
>diff --git a/dom/storage/DOMStorageDBThread.cpp b/dom/storage/DOMStorageDBThread.cpp
>--- a/dom/storage/DOMStorageDBThread.cpp
>+++ b/dom/storage/DOMStorageDBThread.cpp
>@@ -60,16 +60,39 @@ Scheme0Scope(DOMStorageCacheBridge* aCac
>
> if (oa.mAppId != nsIScriptSecurityManager::NO_APP_ID || oa.mInBrowser) {
> result.AppendInt(oa.mAppId);
> result.Append(':');
> result.Append(oa.mInBrowser ? 't' : 'f');
> result.Append(':');
> }
>
>+ // If there is more than just appid and/or inbrowser stored in origin
>+ // attributes, put it to the schema 0 scope as well. We must do that
>+ // to keep the scope column unique (same resolution as schema 1 has
>+ // with originAttributes and originKey columns) so that switch between
>+ // schema 1 and 0 always works in both ways.
>+ nsAutoCString remaining;
>+ oa.mAppId = 0;
>+ oa.mInBrowser = false;
>+ oa.CreateSuffix(remaining);
>+ if (!remaining.IsEmpty()) {
>+ MOZ_ASSERT(!suffix.IsEmpty());
>+
>+ if (result.IsEmpty()) {
>+ // Must contain the old prefix, otherwise we won't search for the whole
>+ // origin attributes suffix.
>+ result.Append(NS_LITERAL_CSTRING("0:f:"));
>+ }
>+ // Append the whole suffix despite we have already stored appid and browser.
>+ // This always starts with '^' and is then easy to grab and use.
>+ result.Append(suffix);
>+ result.Append(':');
>+ }
Please add some examples where what kinds of strings we're playing with here.
Otherwise this and the changes to ExtractOriginData are very hard to read.
(...and it took enough time to figure this out that I'd like to see a new patch with the comments fixed.)
>+ // If the profile was running schema 1 -> schema 0 -> schema 1 switching
>+ // we may have stored the full attributes when there were more than just
>+ // appid and inbrowser set on OriginAttrbutes.
OriginAttributes
>+ // To preserve full uniqueness we store such attributes to the scope key.
>+ // Schema 0 code will just ignore it while keeping the scoping unique.
Please add some example what kinds of strings we're trying to tokenize here.
Attachment #8711768 -
Flags: review?(bugs) → review-
Assignee | ||
Comment 38•9 years ago
|
||
Attachment #8712118 -
Flags: review?(bugs)
Assignee | ||
Updated•9 years ago
|
Attachment #8711768 -
Attachment is obsolete: true
Comment 39•9 years ago
|
||
Comment on attachment 8712118 [details] [diff] [review]
v1.1
>+ // If the profile went through schema 1 -> schema 0 -> schema 1 switching
>+ // we may have stored the full attributes origin suffix when there were
>+ // more than just appid and inbrowser set on storage principal's
>+ // OriginAttributes.
>+ //
>+ // To preserve full uniqueness we store the suffix to the scope key.
>+ // Schema 0 code will just ignore it while keeping the scoping unique.
>+ //
>+ // The OriginAttributes suffix is a string in a form like:
>+ // "^addonId=101&userContextId=5" and it's ensured it always starts with '^'
>+ // and never contains ':'. See OriginAttributes::CreateSuffix.
>+ Record();
>+ if (CheckChar('^')) {
>+ Token t;
>+ while (Next(t)) {
>+ if (t.Equals(Token::Char(':'))) {
>+ Claim(suffix);
>+ break;
>+ }
So the comment still doesn't explain why we're looking at : here. Could you add something about
the stuff after : being non-origin attributes part of the string, or it being OriginNoSuffix or something.
It would be still nice to have also somewhere an example showing what kinds of string can be stored here, I mean
the whole string, not just some substring.
(it is unfortunate to need to play with strings here and not with some objects.)
Attachment #8712118 -
Flags: review?(bugs) → review+
Assignee | ||
Comment 40•9 years ago
|
||
Attachment #8712118 -
Attachment is obsolete: true
Attachment #8712739 -
Flags: review+
Assignee | ||
Updated•9 years ago
|
Keywords: checkin-needed
Comment 41•9 years ago
|
||
Keywords: checkin-needed
Comment 42•9 years ago
|
||
(In reply to Honza Bambas (:mayhemer) from comment #40)
> Created attachment 8712739 [details] [diff] [review]
> v1.2
Now that it landed does this mean that if you update Nightly this error should never happen? or would it require uplifts to older versions?
Comment 43•9 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox47:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
Assignee | ||
Comment 44•9 years ago
|
||
(In reply to Vlad Filippov :vladikoff from comment #42)
> (In reply to Honza Bambas (:mayhemer) from comment #40)
> > Created attachment 8712739 [details] [diff] [review]
> > v1.2
>
> Now that it landed does this mean that if you update Nightly this error
> should never happen? or would it require uplifts to older versions?
We cannot do uplift here. It's dependent on heavy changes to the database that are currently on m-c only. I think it's not that much standard people switch on the same profile between nightly and release. The DOM error should never show up again with nightly. One solution for release is that a user runs the profile with nightly having this patch and then reverts to release (or any non-nightly branch).
Assignee | ||
Comment 45•9 years ago
|
||
Comment on attachment 8712739 [details] [diff] [review]
v1.2
Approval Request Comment
[Feature/regressing bug #]: 1165214
[User impact if declined]: lot of site breakage (dom throws errors where devs usually don't expect exceptions leading to inability to use many web services)
[Describe test coverage new/current, TreeHerder]: lot of domstorage tests
[Risks and why]: N/A, probably very low
[String/UUID change made/needed]: none
Attachment #8712739 -
Flags: approval-mozilla-aurora?
Comment 46•9 years ago
|
||
Tracking since this is a regression (landed in m-c, 2016-01-06)
status-firefox45:
--- → unaffected
status-firefox46:
--- → affected
tracking-firefox46:
--- → +
tracking-firefox47:
--- → +
Keywords: regression
Comment 47•9 years ago
|
||
In comment 44 you say "we cannot do uplift here" but then you request uplift in comment 45. Did you change your mind or are you talking about something different in comment 44?
Flags: needinfo?(honzab.moz)
Assignee | ||
Comment 48•9 years ago
|
||
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #47)
> In comment 44 you say "we cannot do uplift here"
Up to beta and further. But we can do an uplift to aurora now (since the base patch for bug 1165214 is now on aurora.
> but then you request uplift
> in comment 45. Did you change your mind or are you talking about something
> different in comment 44?
Flags: needinfo?(honzab.moz)
Assignee | ||
Comment 49•9 years ago
|
||
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #47)
> In comment 44 you say "we cannot do uplift here"
Up to beta and further and at that time even to aurora. But we can do an uplift to aurora now (since the base patch for bug 1165214 is now on aurora).
> but then you request uplift
> in comment 45. Did you change your mind or are you talking about something
> different in comment 44?
as above.
Comment 50•9 years ago
|
||
Comment on attachment 8712739 [details] [diff] [review]
v1.2
Fix for a dataloss regression, please uplift to aurora.
Attachment #8712739 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment 51•9 years ago
|
||
bugherder uplift |
Updated•6 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•