Closed
Bug 547031
Opened 15 years ago
Closed 14 years ago
Improve async error handling in cookies
Categories
(Core :: Networking: Cookies, defect)
Core
Networking: Cookies
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
blocking2.0 | --- | betaN+ |
People
(Reporter: dwitte, Assigned: dwitte)
References
Details
Attachments
(6 files, 1 obsolete file)
67.62 KB,
patch
|
sdwilsh
:
review+
|
Details | Diff | Splinter Review |
38.91 KB,
patch
|
sdwilsh
:
review+
|
Details | Diff | Splinter Review |
8.60 KB,
patch
|
sdwilsh
:
review+
|
Details | Diff | Splinter Review |
4.08 KB,
patch
|
sdwilsh
:
review+
|
Details | Diff | Splinter Review |
16.34 KB,
patch
|
sdwilsh
:
review+
|
Details | Diff | Splinter Review |
56.56 KB,
patch
|
sdwilsh
:
review+
|
Details | Diff | Splinter Review |
+++ This bug was initially created as a clone of Bug #536978 +++
Now that we're about to have spiffy new error handlers for async statement execution, we should do something sensible with them. For instance:
1) On corruption errors, drop the table and repopulate the entire database from memory.
2) On filesystem read/write errors, and other such low-level things, we should probably close our database connection and run entirely from memory. We may try rebuilding the database, but there's only so much we can do.
3) On "file already open" errors (i.e. an AV scanner messing with us), just close the connection and forget about it. Trying to delete or rename the file is likely to fail, and at that point, it's Not Our Problem.
4) On consistency failures (such as "row doesn't exist"), we should probably do 1). This really should never happen, because this indicates a logic error in the code. We certainly want to assert and log it clearly.
Assignee | ||
Updated•15 years ago
|
Assignee: nobody → dwitte
Assignee | ||
Comment 1•14 years ago
|
||
I think we want this for 2.0, since (right now) we have less error handling than we used to, and we've definitely seen issues of database corruption in previous releases.
blocking2.0: --- → ?
Assignee | ||
Comment 2•14 years ago
|
||
(I'd recommend betaN+ since we need some good testing on this.)
Assignee | ||
Comment 4•14 years ago
|
||
This makes it so DBStates are refcounted, allowing the default state to change. Firefox may not support switching profiles, but we at least need to not horribly break if one does so with Gecko. Without this, any kind of asynchronous error handling would be prone to complete failure -- we could end up operating on the database from a different profile.
This doesn't really introduce any more complexity, it's just a bunch of refactoring.
Attachment #487811 -
Flags: review?(sdwilsh)
Assignee | ||
Comment 5•14 years ago
|
||
This makes TryInitDB() much more explicit about what the various failure modes are. Basically, any error opening the database or on subsequent operations will result in throwing away the database and retrying.
It also makes it so we rename a corrupt database, rather than throwing it away. I don't have a strong opinion on this... it's certainly useful in some cases to have that database around (it's helped us diagnose migration bugs, for instance), but it's also a privacy concern to have orphaned data on disk. My rationale here is that it should happen so rarely that the advantage of having it for debugging purposes outweighs the downside of having orphaned data.
Attachment #487812 -
Flags: review?(sdwilsh)
Assignee | ||
Comment 6•14 years ago
|
||
GetSyncDBConn() can fail with SQLITE_BUSY if a write transaction (e.g. the DELETE kicked off from Read(), or a regular INSERT/DELETE/UPDATE operation) is in progress. If we do it earlier, we're fine. WAL saves us from the possibility of those writes blocking readers, so the read statements themselves shouldn't fail to execute.
Attachment #487813 -
Flags: review?(sdwilsh)
Assignee | ||
Comment 7•14 years ago
|
||
Found a silly bug where we don't include a 'WHERE baseDomain NOTNULL" clause in the SELECT statement, like we do in Read(). We kick off a DELETE of those rows in Read(), but the SELECT in EnsureReadComplete() could execute first, and we really don't want those rows showing up there.
Attachment #487815 -
Flags: review?(sdwilsh)
Assignee | ||
Comment 8•14 years ago
|
||
Those are the prelude patches to the actual meat. I've got the final part written, but I'm still writing tests to prove it works. (There are a ton of cases to test.)
Assignee | ||
Comment 9•14 years ago
|
||
Don't know why I didn't see crashes earlier, but reentrancy on observers (which we do in our tests) breaks horribly right now. This makes it so we fire notifications after all list mutations are complete, such that the observers may mutate at leisure. I dropped the "cookie-db-read" notification from EnsureReadComplete() because there's no way to make it safe -- it's called from a bunch of places itself, for which list mutation would be bad. And we don't need it for tests, because it's the synchronous case.
Attachment #488125 -
Flags: review?(sdwilsh)
Assignee | ||
Comment 10•14 years ago
|
||
This actually implements the logic to rebuild the database. I've played it conservative and made it so that, if we fail during an async read or a full sync read (i.e. EnsureReadComplete()), we throw away all the results. There's likely good data to be had, but we don't know which hosts we have complete data for and which we don't. Anything read previously via EnsureReadDomain() will be kept.
As an added bonus, this is the final patch, and the whole patchqueue passes tests on try!
Attachment #488903 -
Flags: review?(sdwilsh)
Assignee | ||
Comment 11•14 years ago
|
||
... and this one's actually the right one.
Attachment #488903 -
Attachment is obsolete: true
Attachment #488930 -
Flags: review?(sdwilsh)
Attachment #488903 -
Flags: review?(sdwilsh)
Assignee | ||
Updated•14 years ago
|
Attachment #488930 -
Attachment is patch: true
Attachment #488930 -
Attachment mime type: application/octet-stream → text/plain
Comment 12•14 years ago
|
||
Comment on attachment 487811 [details] [diff] [review]
part 1: multiple DBStates
>+++ b/extensions/cookie/test/unit/test_cookies_privatebrowsing.js
>@@ -0,0 +1,136 @@
>+/* Any copyright is dedicated to the Public Domain.
>+ * http://creativecommons.org/publicdomain/zero/1.0/
>+ */
Please copy boilerplate from http://www.mozilla.org/MPL/boilerplate-1.1/
>+ // Create URIs and channels pointing to foo.com and bar.com.
>+ let spec1 = "http://foo.com/foo.html";
>+ let spec2 = "http://bar.com/bar.html";
>+ let uri1 = NetUtil.newURI(spec1);
>+ let uri2 = NetUtil.newURI(spec2);
You don't create any channels, and there's no need to store the specs in a local. Please update the comment and get rid of the local spec variables.
>+++ b/netwerk/cookie/nsCookieService.cpp
> // cache frequently used statements (for insertion, deletion, and updating)
>- rv = mDBState->dbConn->CreateStatement(NS_LITERAL_CSTRING(
>+ rv = mDefaultDBState->dbConn->CreateStatement(NS_LITERAL_CSTRING(
> "INSERT INTO moz_cookies ("
This is only used asynchronously, right? Use CreateAsyncStatement please.
>- rv = mDBState->dbConn->CreateStatement(NS_LITERAL_CSTRING(
>+ rv = mDefaultDBState->dbConn->CreateStatement(NS_LITERAL_CSTRING(
> "DELETE FROM moz_cookies "
ditto
>- rv = mDBState->dbConn->CreateStatement(NS_LITERAL_CSTRING(
>+ rv = mDefaultDBState->dbConn->CreateStatement(NS_LITERAL_CSTRING(
> "UPDATE moz_cookies SET lastAccessed = :lastAccessed "
ditto
> // finalize our statements and close the db connection for the default state.
> // since we own these objects, nulling the pointers is sufficient here.
We don't do any finalizing anymore, so this comment is wrong.
>+ // Null out the db. If it's open, this will synchronously close it.
>+ mDefaultDBState->dbConn = NULL;
This will fail if you've executed any async statements on it, I'm pretty sure.
>+ if (mDBState && mDBState->dbConn &&
>+ !nsCRT::strcmp(aData, NS_LITERAL_STRING("shutdown-cleanse").get())) {
>+ // Clear the cookie db if we're in the default DBState.
As I recall, we don't need the nsCRT:: anymore.
> } else if (NS_LITERAL_STRING(NS_PRIVATE_BROWSING_LEAVE).Equals(aData)) {
>+ NS_ASSERTION(mDefaultDBState, "don't have a default state");
>+ NS_ASSERTION(mDBState == mPrivateDBState, "not in private state");
>+ NS_ASSERTION(!mPrivateDBState->dbConn, "private DB connection not null");
>+
>+ // Clear the private DBState, and restore the default one.
>+ mPrivateDBState = NULL;
>+ mDBState = mDefaultDBState;
Hmm, why don't we just call Observe in InitDBStates when private browsing is enabled like we used to?
> NS_IMETHODIMP
> nsCookieService::RemoveAll()
> {
> RemoveAllFromMemory();
>
> // clear the cookie file
> if (mDBState->dbConn) {
>- NS_ASSERTION(mDBState == &mDefaultDBState, "not in default DB state");
>+ NS_ASSERTION(mDBState == mDefaultDBState, "not in default DB state");
Wouldn't it be legal to call this while in private browsing mode though?
> void
> nsCookieService::AsyncReadComplete()
> {
> // We may be in the private browsing DB state, with a pending read on the
> // default DB state. (This would occur if we started up in private browsing
> // mode.) As long as we do all our operations on the default state, we're OK.
>+ NS_ASSERTION(mDefaultDBState->pendingRead, "no pending read");
>+ NS_ASSERTION(mDefaultDBState->readListener, "no read listener");
Should also assert mDefaultDBState here, correct?
> void
> nsCookieService::CancelAsyncRead(PRBool aPurgeReadSet)
> {
> // We may be in the private browsing DB state, with a pending read on the
> // default DB state. (This would occur if we started up in private browsing
> // mode.) As long as we do all our operations on the default state, we're OK.
>+ NS_ASSERTION(mDefaultDBState->pendingRead, "no pending read");
>+ NS_ASSERTION(mDefaultDBState->readListener, "no read listener");
ditto
>+ rv = mDefaultDBState->stmtReadDomain->BindUTF8StringByIndex(0, aBaseDomain);
Would prefer to see BindUTF8StringByName here
>+++ b/netwerk/cookie/nsCookieService.h
>+ void CheckPrivateBrowsing();
This isn't implemented.
r=sdwilsh
Attachment #487811 -
Flags: review?(sdwilsh) → review+
Comment 13•14 years ago
|
||
> >+++ b/extensions/cookie/test/unit/test_cookies_privatebrowsing.js
> >@@ -0,0 +1,136 @@
> >+/* Any copyright is dedicated to the Public Domain.
> >+ * http://creativecommons.org/publicdomain/zero/1.0/
> >+ */
> Please copy boilerplate from http://www.mozilla.org/MPL/boilerplate-1.1/
This applies to all your patches here. I won't comment further about it.
Comment 14•14 years ago
|
||
Comment on attachment 487812 [details] [diff] [review]
part 2: TryInitDB() cleanup
>+++ b/netwerk/cookie/nsCookieService.cpp
>+ // Attempt to open and read the database. If TryInitDB() returns false, we
>+ // retry; if it returns true, then things either failed catastrophically
>+ // (with no further action required, including cleanup) or succeeded fine.
This comment is most certainly not correct given:
>+ OpenDBResult result = TryInitDB(false);
>+// Attempt to open and read the database. If 'aRecreateDB' is true, try to
>+// move the existing database file out of the way and create a new one. Return
>+// RESULT_OK if opening or creating the database succeeded; RESULT_RETRY if
>+// the database cannot be opened, is corrupt, or some other failure occurred
>+// that might be resolved by recreating the database; or RESULT_FAILED if there
>+// was an unrecoverable error and we must run without a database. If
>+// RESULT_RETRY or RESULT_FAILED is returned, the caller should perform cleanup
>+// of the default DBState.
This belongs in the header file in a javadoc style comment.
r=sdwilsh
Attachment #487812 -
Flags: review?(sdwilsh) → review+
Comment 15•14 years ago
|
||
(In reply to comment #12)
> > // finalize our statements and close the db connection for the default state.
> > // since we own these objects, nulling the pointers is sufficient here.
> We don't do any finalizing anymore, so this comment is wrong.
This is addressed in part 3, so don't worry about it.
Comment 16•14 years ago
|
||
Comment on attachment 487813 [details] [diff] [review]
part 3: get sync db connection early
r=sdwilsh
Attachment #487813 -
Flags: review?(sdwilsh) → review+
Comment 17•14 years ago
|
||
Comment on attachment 487815 [details] [diff] [review]
part 4: fix EnsureReadComplete()
r=sdwilsh
Attachment #487815 -
Flags: review?(sdwilsh) → review+
Comment 18•14 years ago
|
||
Comment on attachment 488125 [details] [diff] [review]
part 5: fix observers
r=sdwilsh
Attachment #488125 -
Flags: review?(sdwilsh) → review+
Comment 19•14 years ago
|
||
Comment on attachment 488930 [details] [diff] [review]
part 6 v2: async error handling
r=sdwilsh
Attachment #488930 -
Flags: review?(sdwilsh) → review+
Assignee | ||
Comment 20•14 years ago
|
||
(In reply to comment #12)
> Please copy boilerplate from http://www.mozilla.org/MPL/boilerplate-1.1/
But it's the same as the PD boilerplate already!
> This is only used asynchronously, right? Use CreateAsyncStatement please.
Done!
> >+ // Null out the db. If it's open, this will synchronously close it.
> >+ mDefaultDBState->dbConn = NULL;
> This will fail if you've executed any async statements on it, I'm pretty sure.
Yeah, that's OK. We never do that, and if we did, Connection::Close would assert all over the place.
> As I recall, we don't need the nsCRT:: anymore.
Hmm? I thought we did for PRUnichar* strings.
> Hmm, why don't we just call Observe in InitDBStates when private browsing is
> enabled like we used to?
I thought it was a little obfuscated to call Observe when it's clearer just to explicitly write out those two lines, but I'm not really fussed either way.
> > NS_IMETHODIMP
> > nsCookieService::RemoveAll()
> > {
> > RemoveAllFromMemory();
> >
> > // clear the cookie file
> > if (mDBState->dbConn) {
> >- NS_ASSERTION(mDBState == &mDefaultDBState, "not in default DB state");
> >+ NS_ASSERTION(mDBState == mDefaultDBState, "not in default DB state");
> Wouldn't it be legal to call this while in private browsing mode though?
Yes, but it's guarded by mDBState->dbConn, which is always null in PB.
All other comments addressed.
Assignee | ||
Comment 21•14 years ago
|
||
(In reply to comment #20)
> > This will fail if you've executed any async statements on it, I'm pretty sure.
(The comment above that code wasn't very helpfully explanatory, I've made it clearer.)
Comment 22•14 years ago
|
||
(In reply to comment #20)
> But it's the same as the PD boilerplate already!
Yours is three lines, and the boilerplate is two. Subtle, sure, but if gerv ever needs to go through a reformat stuff, it's helpful to have it be copied and pasted verbatim from the boilerplate page.
> Hmm? I thought we did for PRUnichar* strings.
Pretty sure it just works for both. Try it please.
Assignee | ||
Comment 23•14 years ago
|
||
(In reply to comment #22)
> Yours is three lines, and the boilerplate is two.
I changed all the files in extensions/cookie/test/unit.
> Pretty sure it just works for both. Try it please.
Nope. You're thinking 'wcscmp', but we don't use that, and I don't think it's portable across platforms (though it may be by now!). 'strcmp' is part of libc and is declared 'const char*', and C doesn't allow function name overloading.
Assignee | ||
Comment 24•14 years ago
|
||
All done.
http://hg.mozilla.org/mozilla-central/rev/bbf2de2368f4
http://hg.mozilla.org/mozilla-central/rev/12336d312ad1
http://hg.mozilla.org/mozilla-central/rev/adcc4d2f4070
http://hg.mozilla.org/mozilla-central/rev/09bb5890500d
http://hg.mozilla.org/mozilla-central/rev/3e8a89a8be56
http://hg.mozilla.org/mozilla-central/rev/8d5d68434171
http://hg.mozilla.org/mozilla-central/rev/97c3302f124c
http://hg.mozilla.org/mozilla-central/rev/0525032a59a3
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•