Open Bug 1124553 Opened 10 years ago Updated 2 years ago

Failure to add sync credentials to login manager due to apparently corrupt key3.db

Categories

(NSS :: Libraries, defect, P5)

Tracking

(Not tracked)

People

(Reporter: markh, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [fxsync][nss-fx])

Attachments

(2 files)

We've seen similar reports and now have a profile where it can be reproduced reliably.  The symptoms are that FxAccounts.jsm calls |Services.logins.addLogin();| which fails with NS_ERROR_ABORT, and logged to the console we can see:

> JavaScript error: file:///o:/src/mozilla-git/gecko-dev/obj-release/dist/bin/components/crypto-SDR.js, line 116: NS_ERROR_ABORT: User canceled master password entry
> 1421906771918   FirefoxAccounts ERROR   Failed to save data to the login manager: [Exception... "User canceled master password entry"  nsresult: "0x80004004 (NS_ERROR_ABORT)"  location: "...

But there is no master-password configured, Services.logins.isLoggedIn returns false and no master-password dialog is shown.

Tracing this with the debugger I see us enter nsSecretDecoderRing::Encrypt(), which winds its way down to sftkdb_write.  This calls sftkdb_lookupObject() to see if any such (something?) exists - this returns CK_INVALID_HANDLE, so sftkdb_CreateObject is called.  This ends up calling put_dbkey() with update=false, which ends up in keydb_Put() with R_NOOVERWRITE, then we wind up in hash_access (hash.c), which seems to find an existing key and so returns ABNORMAL.  This ends up bubbling back as NS_ERROR_ABORT, which is erroneously reported as the user cancelling the master password dialog.
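
To make the suspected sequence concrete, here is a toy model in JavaScript. It is only a sketch of the control flow described above: the `ToyKeyDb` class, the `writeKey` function, the `hidden` set and the flag value are all made up for illustration and are not NSS APIs. The assumption it encodes is that the lookup step misses a record that the put step can still see.

```javascript
// Toy model only: a Map stands in for key3.db, and none of these names are NSS APIs.
const R_NOOVERWRITE = 1; // hypothetical flag value for this sketch

class ToyKeyDb {
  constructor() {
    this.records = new Map(); // what is physically stored
    this.hidden = new Set(); // records the lookup path fails to see (assumed inconsistency)
  }

  // Models the lookup step: misses any record marked as hidden.
  lookup(key) {
    return this.records.has(key) && !this.hidden.has(key) ? key : null;
  }

  // Models the put step: with R_NOOVERWRITE, an existing record is an error.
  put(key, value, flags) {
    if (flags === R_NOOVERWRITE && this.records.has(key)) {
      return "ABNORMAL";
    }
    this.records.set(key, value);
    return "SUCCESS";
  }
}

// Models the write path: look up first, create only on a miss.
function writeKey(db, key, value) {
  if (db.lookup(key) !== null) {
    return "SUCCESS"; // already present, nothing to create
  }
  return db.put(key, value, R_NOOVERWRITE);
}

const healthy = new ToyKeyDb();

const inconsistent = new ToyKeyDb();
inconsistent.records.set("sdr-key", "stale");
inconsistent.hidden.add("sdr-key"); // stored, but invisible to lookup

console.log(writeKey(healthy, "sdr-key", "fresh"));
console.log(writeKey(inconsistent, "sdr-key", "fresh"));
```

In this model the healthy database accepts the write, while the inconsistent one fails at the put step even though the lookup reported nothing there, which is the shape of the failure seen in the debugger.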

The relevant stack up to keydb_Put() is:

>	nssdbm3.dll!keydb_Put(NSSLOWKEYDBHandleStr * kdb, DBT * key, DBT * data, unsigned int flags) Line 2118	C
> 	nssdbm3.dll!put_dbkey(NSSLOWKEYDBHandleStr * handle, DBT * index, NSSLOWKEYDBKeyStr * dbkey, int update) Line 264	C
> 	nssdbm3.dll!seckey_put_private_key(NSSLOWKEYDBHandleStr * keydb, DBT * index, SDBStr * sdbpw, NSSLOWKEYPrivateKeyStr * pk, char * nickname, int update) Line 1657	C
> 	nssdbm3.dll!nsslowkey_StoreKeyByPublicKeyAlg(NSSLOWKEYDBHandleStr * handle, NSSLOWKEYPrivateKeyStr * privkey, SECItemStr * pubKeyData, char * nickname, SDBStr * sdbpw, int update) Line 1693	C
> 	nssdbm3.dll!nsslowkey_StoreKeyByPublicKey(NSSLOWKEYDBHandleStr * handle, NSSLOWKEYPrivateKeyStr * privkey, SECItemStr * pubKeyData, char * nickname, SDBStr * sdb) Line 1082	C
> 	nssdbm3.dll!lg_createSecretKeyObject(SDBStr * sdb, unsigned long key_type, unsigned long * handle, const CK_ATTRIBUTE * templ, unsigned long count) Line 892	C
> 	nssdbm3.dll!lg_createKeyObject(SDBStr * sdb, unsigned long objclass, unsigned long * handle, const CK_ATTRIBUTE * templ, unsigned long count) Line 930	C
> 	nssdbm3.dll!lg_CreateObject(SDBStr * sdb, unsigned long * handle, const CK_ATTRIBUTE * templ, unsigned long count) Line 972	C
> 	softokn3.dll!sftkdb_CreateObject(PLArenaPool * arena, SFTKDBHandleStr * handle, SDBStr * db, unsigned long * objectID, CK_ATTRIBUTE * template, unsigned long count) Line 581	C
> 	softokn3.dll!sftkdb_write(SFTKDBHandleStr * handle, SFTKObjectStr * object, unsigned long * objectID) Line 1181	C
> 	softokn3.dll!sftk_handleSecretKeyObject(SFTKSessionStr * session, SFTKObjectStr * object, unsigned long key_type, int isFIPS) Line 1315	C
> 	softokn3.dll!sftk_handleKeyObject(SFTKSessionStr * session, SFTKObjectStr * object) Line 1369	C
> 	softokn3.dll!sftk_handleObject(SFTKObjectStr * object, SFTKSessionStr * session) Line 1617	C
> 	softokn3.dll!NSC_GenerateKey(unsigned long hSession, CK_MECHANISM * pMechanism, CK_ATTRIBUTE * pTemplate, unsigned long ulCount, unsigned long * phKey) Line 3996	C
> 	nss3.dll!PK11_KeyGenWithTemplate(PK11SlotInfoStr * slot, unsigned long type, unsigned long keyGenType, SECItemStr * param, CK_ATTRIBUTE * attrs, unsigned int attrsCount, void * wincx) Line 1105	C
> 	nss3.dll!pk11_TokenKeyGenWithFlagsAndKeyType(PK11SlotInfoStr * slot, unsigned long type, SECItemStr * param, unsigned long keyType, int keySize, SECItemStr * keyid, unsigned long opFlags, unsigned int attrFlags, void * wincx) Line 948	C
> 	nss3.dll!PK11_TokenKeyGen(PK11SlotInfoStr * slot, unsigned long type, SECItemStr * param, int keySize, SECItemStr * keyid, int isToken, void * wincx) Line 1008	C
> 	nss3.dll!PK11_GenDES3TokenKey(PK11SlotInfoStr * slot, SECItemStr * keyid, void * cx) Line 1128	C
> 	nss3.dll!PK11SDR_Encrypt(SECItemStr * keyid, SECItemStr * data, SECItemStr * result, void * cx) Line 191	C
> 	xul.dll!nsSecretDecoderRing::Encrypt(unsigned char * data, int dataLen, unsigned char * * result, int * _retval) Line 82	C++

Justin, can you help us sort this out or put us in touch with someone who can?
Flags: needinfo?(dolske)
(In reply to Mark Hammond [:markh] from comment #0)
> But there is no master-password configured, Services.logins.isLoggedIn
> returns false and no master-password dialog is shown.

Oops - isLoggedIn returns *true* - i.e., everything seems to agree there is no master password.
Hmm, there have been a couple of isolated bugs like this before; the last I recall was bug 717490 comment 15-17. That was some kind of race, where enabling a master password was getting something confused in the session about the login state.

Although I wonder if this is a case of DB corruption -- it almost sounds like the sftkdb_lookupObject() call tries to retrieve the wrapped secret key, thinks it's not there, and then initializing it fails because it actually is there? It's been too long since I looked at this in detail, but the fact that the PK11_GenDES3TokenKey path is being taken implies that's what's happening.

Any idea how the profile was created / got into this state? Is it reproducible?

I think Martin might be able to find someone to look at it, although if it's DB corruption it might not be further debuggable.
Flags: needinfo?(dolske) → needinfo?(martin.thomson)
Attached file key3.db in question
(In reply to Justin Dolske [:Dolske] from comment #2)

> Any idea how the profile was created

It's the normal profile for someone on staff and is probably of a reasonable age.

> got into this state?

Nope.

> Is it reproducible?

We can't repro getting the profile into this state, but *can* reproduce it every time with the profile in question.  The profile has no saved passwords, so dolske tells me it's "safe" (from a privacy POV) to upload it here - so here it is!

> I think Martin might be able to find someone to look at it, although if it's
> DB corruption it might not be further debuggable.

I think just detecting the corruption and deleting/resetting the .db file would be a reasonable outcome.
I don't think that I can help here.  Richard might be a better source of this sort of NSS knowledge.
Flags: needinfo?(martin.thomson) → needinfo?(rlb)
Perhaps this is linked to my problem; see
https://bugzilla.mozilla.org/show_bug.cgi?id=1116119
Mark: Can you give STR with this key3.db?

David: Do you happen to know how to use certutil to inspect this thing?
Flags: needinfo?(rlb)
Flags: needinfo?(markh)
Flags: needinfo?(dkeeler)
(In reply to Richard Barnes [:rbarnes] from comment #6)
> Mark: Can you give STR with this key3.db?

In a profile with no master-password enabled: Open a "browser" scratchpad and execute the following:

---8<---
let loginInfo = new Components.Constructor(
         "@mozilla.org/login-manager/loginInfo;1", Ci.nsILoginInfo, "init");
let login = new loginInfo("http://example.com",
                           null, // aFormSubmitURL,
                           "A realm", // aHttpRealm,
                           "username",
                           "password",
                           "", // aUsernameField
                           "");// aPasswordField

Services.logins.addLogin(login);
---8<---

On running it, you will get an exception:

/*
Exception: [Exception... "User canceled master password entry"  nsresult: "0x80004004 (NS_ERROR_ABORT)"  location: "JS frame :: file:///o:/src/mozilla-git/gecko-dev/obj-release/dist/bin/components/crypto-SDR.js :: LoginManagerCrypto_SDR.prototype.encrypt :: line 115"  data: no]
*/

But there's no master-password configured and there was no master-password prompt. See comment 0 for the analysis I did on how (but not why!) that error is being raised.
Flags: needinfo?(markh)
I can't tell much from just that file. There is something there, but it's not clear what it is (I used db_dump185 to get the contents - there are some hex strings and some ASN.1, but nothing's really jumping out at me). Maybe having the cert8.db as well would be helpful?
Flags: needinfo?(dkeeler)
Attached file cert8.db
Thanks for posting that. Unfortunately, I can't get much information from it either.

`certutil -K -d <download directory> -h all` says this:

certutil: Checking token "NSS Certificate DB" in slot "NSS User Private Key and Certificate Services"
certutil: no keys found
certutil: Checking token "NSS Generic Crypto Services" in slot "NSS Internal Cryptographic Services"
certutil: no keys found

My guess is key3.db got into an inconsistent state somehow. Unless we can track down how that happened, I'm not sure there's much to do here that would be useful.
(In reply to David Keeler [:keeler] (use needinfo?) from comment #10)
> My guess is key3.db got into an inconsistent state somehow. Unless we can
> track down how that happened, I'm not sure there's much to do here that
> would be useful.

It seems like it would be useful to detect that state and recreate the file, or take some other action to help the user get out of the bad state they find themselves in (eg, I expect they'd have an inability to save any passwords and an inability to use Sync)
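
A rough sketch of the "move it aside and start fresh" idea follows. It is written against Node's `fs` module purely so it can be run standalone; Firefox would use its own file APIs, and the `quarantineKeyDb` name and the backup naming scheme are hypothetical. It renames rather than deletes, on the assumption that keeping the old file around is safer, since any logins encrypted with the old key would otherwise become unrecoverable.

```javascript
const fs = require("fs");
const os = require("os");
const path = require("path");

// Hypothetical recovery step: move a suspect key3.db aside so a fresh one can be created.
// Returns the backup path, or null if there was nothing to move.
function quarantineKeyDb(profileDir, now = Date.now()) {
  const source = path.join(profileDir, "key3.db");
  if (!fs.existsSync(source)) {
    return null;
  }
  const backup = path.join(profileDir, `key3.db.corrupt-${now}`);
  fs.renameSync(source, backup);
  return backup;
}

// Usage against a throwaway directory standing in for a profile.
const demoProfile = fs.mkdtempSync(path.join(os.tmpdir(), "profile-"));
fs.writeFileSync(path.join(demoProfile, "key3.db"), "not a real key database");
console.log("moved to:", quarantineKeyDb(demoProfile));
console.log("second call:", quarantineKeyDb(demoProfile));
```

This only covers the "reset" half; deciding when the file is actually corrupt is the harder part and is not addressed here.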
No longer blocks: 1084472
Summary: Failure to add sync credentials to login manager → Failure to add sync credentials to login manager due to apparently corrupt key3.db
There have now been multiple reports of issues related to a corrupt key3.db (bug 1295122 and bug 1313985 are recent ones).

(Quoting Mark Hammond [:markh] from comment #3)
> I think just detecting the corruption and deleting/resetting the .db file
> would be a reasonable outcome.

Would someone be willing to look into whether this is possible? e.g. if we can distinguish a corrupt file vs. an incorrect password using the attached key3.db?

Another option would be to brainstorm possible corruption ideas e.g. do we ensure in code that key3.db isn't written to from multiple processes at once? Do we do good error handling in case some of the I/O returns an error?
Flags: needinfo?(dkeeler)
(In reply to Matthew N. [:MattN] (PM me if I'm slow responding) from comment #13)
> (Quoting Mark Hammond [:markh] from comment #3)
> > I think just detecting the corruption and deleting/resetting the .db file
> > would be a reasonable outcome.
> 
> Would someone be willing to look into whether this is possible? e.g. if we
> can distinguish a corrupt file vs. an incorrect password using the attached
> key3.db?

From reading the background on this, it seems that some operation in NSS fails and PSM then interprets the failure as "the user cancelled entering their password". At that point, PSM should be able to check if the softoken even needs the user to log in, and if it doesn't, we could maybe assume the key3.db had been corrupted.
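
As a sketch of that decision (in JavaScript only for illustration; the real check would live in PSM's C++), something like the following could work. The `classifyEncryptFailure` function and its result labels are hypothetical, and `tokenNeedsLogin` stands in for whatever PSM would actually query, possibly something along the lines of NSS's `PK11_NeedLogin`, though that is an assumption on my part.

```javascript
// Hypothetical helper: neither the function nor the result labels are real PSM or NSS APIs.
const NS_ERROR_ABORT = 0x80004004; // nsresult value quoted in the error message in comment 0

function classifyEncryptFailure(errorCode, tokenNeedsLogin) {
  if (errorCode !== NS_ERROR_ABORT) {
    return "other-failure";
  }
  // If the token wants a login, the user may really have dismissed the prompt.
  if (tokenNeedsLogin) {
    return "user-cancelled";
  }
  // No login is required, so nobody could have cancelled anything.
  return "suspected-corrupt-key-db";
}

console.log(classifyEncryptFailure(NS_ERROR_ABORT, false));
console.log(classifyEncryptFailure(NS_ERROR_ABORT, true));
```

The point of keeping this as a separate classification step is that callers such as the login manager could then report something more accurate than "User canceled master password entry" when no master password exists.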

> Another option would be to brainstorm possible corruption ideas e.g. do we
> ensure in code that key3.db isn't written to from multiple processes at
> once?

Yes - key3.db should only be opened by the parent process (if child processes need NSS operations, they operate NSS in a kind-of "memory-only" mode). There could potentially be a race condition within that one process, though.

> Do we do good error handling in case some of the I/O returns an error?

I don't know. I'm not very familiar with the underlying implementation.
Flags: needinfo?(dkeeler)
Whiteboard: [fxsync]
See Also: → 1295122
Blocks: 1337275
See Also: → 1255992
Severity: normal → S4
Priority: -- → P5
Whiteboard: [fxsync] → [fxsync][nss-fx]