Open Bug 1295122 Opened 8 years ago Updated 2 years ago

Sync fails with "Services.logins is undefined"

Categories

(Firefox :: Sync, defect, P3)

48 Branch
defect

Tracking

()

People

(Reporter: u577553, Unassigned)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [sync:passwords][sync:rigor])

Attachments

(5 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:48.0) Gecko/20100101 Firefox/48.0
Build ID: 20160726073904

Steps to reproduce:

Clean install of Firefox 48 (64-bit) on a clean install of Windows 10. Entered credentials for Firefox sync.


Actual results:

Everything syncs except for my passwords/logins. This actually started happening on Firefox 47 on Windows 7, same machine. I wiped everything and started fresh, but the same problem occurred.

At the same time my sync issues started, I started having TLS issues on my corporate network. The certificates are never recognized as valid. I can still get to the page if I click through the exception dialogs, but it also will not let me add a permanent exception. If that box is checked, I cannot click the "Confirm Security Exception" button.


Expected results:

Sync should have finished successfully.
The actual error in the log I attached is here:

1471265148820	Sync.ErrorHandler	DEBUG	passwords failed: TypeError: Services.logins is undefined (resource://gre/modules/services-sync/engines/passwords.js:164:9) JS Stack trace: PasswordStore.prototype.getAllIDs@passwords.js:164:9 < SyncEngine.prototype._syncStartup@engines.js:931:22 < SyncEngine.prototype._sync@engines.js:1553:7 < WrappedNotify@util.js:146:21 < Engine.prototype.sync@engines.js:670:5 < _syncEngine@enginesync.js:213:7 < sync@enginesync.js:163:15 < onNotify@service.js:1262:7 < WrappedNotify@util.js:146:21 < WrappedLock@util.js:101:16 < _lockedSync@service.js:1252:12 < sync/<@service.js:1244:14 < WrappedCatch@util.js:75:16 < sync@service.js:1232:5
Component: Untriaged → Sync
Services.logins being undefined is going to cause a number of problems. I suspect an addon might be interfering - can you please try to reproduce in safe mode? See https://support.mozilla.org/en-US/kb/troubleshoot-firefox-issues-using-safe-mode for details.
Flags: needinfo?(mrstegeman)
Same issue in safe mode. I attached a new error log for good measure.
Flags: needinfo?(mrstegeman)
Priority: -- → P3
Summary: Password sync fails with error → Password sync fails with "Services.logins is undefined"
Matt, the logs from these syncs report "TypeError: Services.logins is undefined" and since we landed Sync telemetry we are seeing many hundreds of failures with this same issue. Are you aware of any issues which might cause this (eg, invalid logins.json or similar) or any way for these users to diagnose the issue which doesn't involve them disclosing all their passwords?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: needinfo?(MattN+bmo)
Please enable password manager debug logging and attach the output from the Browser Console after a restart followed by a manual sync (which I assume triggers this error).

https://wiki.mozilla.org/Firefox:Password_Manager_Debugging
Flags: needinfo?(MattN+bmo) → needinfo?(mrstegeman)
I've attached the browser console log with a couple failed syncs, both before and after I was asked to reconnect to the sync server with my credentials (which also happens often).
Flags: needinfo?(mrstegeman)
Thanks for the log. The relevant lines seem to be:

> : Component returned failure code: 0x805a1f9b [nsIPK11Token.initPassword]crypto-SDR.js:66
> Login crypto:Initializing key3.db with default blank password.crypto-SDR.js:65
> Error: Initialization failedstorage-json.js:90:13

To try and reproduce this I truncated key3.db at some random size and started the browser. On a recent Nightly I got the same messages as the log, but Services.logins still existed. However, trying the same thing on the release channel did make more noise, including an error regarding Services.logins being undefined.

I had a quick look at the hg logs but couldn't find an obvious candidate for us handling such an error. Matt, is there some other bug you are aware of that might have fixed this, or any other clues as to what is going on?
Flags: needinfo?(MattN+bmo)
Oops - I spoke too soon - while I didn't get that error on startup using Nightly, I did get it a little later, so I suspect it hasn't actually been fixed, just that the code has changed so the symptoms are slightly different. Also, bug 1124553 is another example of key3.db being corrupt causing problems, and trying to save my Sync credentials when in this state also causes an exception with "Services.logins in undefined".
I don't have other ideas as it sounds like an issue with key3.db. I think we should pursue your suggestion I quoted in bug 1124553 comment 13.
Flags: needinfo?(MattN+bmo)
Issue is still present in Firefox 50.
I can reproduce consistently after using Refresh Firefox on my Ubuntu VM. Unfortunately, I can't repro on macOS; haven't tried Windows yet.

The issue is `sftk_getKeyDB` (http://searchfox.org/mozilla-central/rev/8562d3859b89ac89f46690b9ed2c473e0728d6c0/security/nss/lib/softoken/sftkdb.c#2404-2416) returns NULL. This causes `NSC_InitPIN` to fail (http://searchfox.org/mozilla-central/rev/8562d3859b89ac89f46690b9ed2c473e0728d6c0/security/nss/lib/softoken/pkcs11.c#3651-3655), and the error propagates through `PK11_InitPin` (http://searchfox.org/mozilla-central/rev/8562d3859b89ac89f46690b9ed2c473e0728d6c0/security/nss/lib/pk11wrap/pk11auth.c#456-461), `nsPK11Token::InitPassword` (http://searchfox.org/mozilla-central/rev/8562d3859b89ac89f46690b9ed2c473e0728d6c0/security/manager/ssl/nsPK11TokenDB.cpp#299-300), and `LoginManagerCrypto_SDR#init` (http://searchfox.org/mozilla-central/rev/8562d3859b89ac89f46690b9ed2c473e0728d6c0/toolkit/components/passwordmgr/crypto-SDR.js#66).

That, in turn, breaks login manager for the remainder of the session. Interestingly, this only happens in the session immediately after refreshing. If I refresh Firefox, then manually restart, the issue goes away; login manager can initialize successfully.

FWIW, Firefox Refresh copies over key3.db from the old profile (http://searchfox.org/mozilla-central/rev/8562d3859b89ac89f46690b9ed2c473e0728d6c0/browser/components/migration/FirefoxProfileMigrator.js#133). Also, the VM has very slow disk I/O, while the Mac has a speedy SSD. I wonder if, in this case, there's a race where NSS initializes before we copy the DB over, and that causes havoc with some in-memory state. But I'm way out of my depth here. :-)

David, do you have insights into this?
Flags: needinfo?(dkeeler)
(In reply to Kit Cambridge [:kitcambridge] from comment #14)
...
> I wonder if, in this case, there's a race where
> NSS initializes before we copy the DB over, and that causes havoc with some
> in-memory state.

That could certainly cause this. Something that might give some insight is running with MOZ_LOG=pipnss:4 and MOZ_LOG_FILE=logfile.txt or similar. I'm betting it fails to initialize NSS in the profile directory the first run (at that point the implementation falls back to memory-only mode, which causes all password-saving machinery to not work).
Flags: needinfo?(dkeeler)
See Also: → 1124553
I've been playing a little with this, and as with everything NSS related, "it's complicated".

The level of corruption in key3.db causes 2 main symptoms that I've seen:

* As captured in bug 1124553 and bug 621846 - the DB initializes, but all attempts to decrypt anything fails in ways that look alot like the master password has been reset.

* Failure to initialize the DB:

> : Component returned failure code: 0x805a1f9b [nsIPK11Token.initPassword]crypto-SDR.js:66
> Login crypto:Initializing key3.db with default blank password.crypto-SDR.js:65
> Error: Initialization failedstorage-json.js:90:13

Let's just look at the second here, as that is more clearly "massive failure".

A quick play shows that if we catch the above exception we can delete key3.db, but I can't work out how to re-initialize the library. So while that doesn't help the current browser session, it *does* get handled next time the browser starts - obviously all existing logins are now unavailable, but the login manager (noisily) handles this OK - it looks just like a master-password reset, but new logins can be saved.

This patch is quite naive and doesn't have tests (and I'm not sure if testing this sanely would be possible), but is a patch like this worth persevering with, or should we just wait for bug 1271851 (and/or Godot)?
Attachment #8838399 - Flags: feedback?(dkeeler)
Attachment #8838399 - Flags: feedback?(MattN+bmo)
Comment on attachment 8838399 [details] [diff] [review]
delete-key3.patch

Review of attachment 8838399 [details] [diff] [review]:
-----------------------------------------------------------------

My input would be that maybe deleting key3.db isn't the most user-friendly approach - maybe it should be renamed (e.g. to "key3.db.backup.<some random hex>" or similar) instead?
Also, on android the file would be key4.db (not sure if this is happening on android).

(In reply to Mark Hammond [:markh] from comment #16)
...
> A quick play shows that if we catch the above exception we can delete
> key3.db, but I can't work out how to re-initialize the library.

Bad news/good news/bad news: you basically can't safely re-initialize NSS with the way the code is set up now. However, we're currently working on a project that could enable us to do exactly this (unload and reload the user's certificates/keys/etc.). See https://wiki.mozilla.org/Security/CryptoEngineering/Platform_Use_of_NSS . Unfortunately, there are some unanswered questions and significant code re-working that means we're still a ways out from implementing this, so it doesn't really help you right now.

Here's another thought, though - does calling token.reset() and then token.initPassword("") fix it? (again, this would clobber the user's password encryption key as well as any private keys (which they hopefully have backed up elsewhere, but still).
Attachment #8838399 - Flags: feedback?(dkeeler)
(In reply to David Keeler [:keeler] (use needinfo?) from comment #17)
> My input would be that maybe deleting key3.db isn't the most user-friendly
> approach - maybe it should be renamed (e.g. to "key3.db.backup.<some random
> hex>" or similar) instead?

Yeah, renaming would be better.

> Also, on android the file would be key4.db (not sure if this is happening on
> android).

I've not seen reports of it there, but I guess there's no reason it couldn't also happen there too, so wrapping that code in an AppConstants check also probably makes sense.

> Here's another thought, though - does calling token.reset() and then
> token.initPassword("") fix it? (again, this would clobber the user's
> password encryption key as well as any private keys (which they hopefully
> have backed up elsewhere, but still).

No, that's the first thing I tried and the second initPassword fails in the same way :(

I'm happy to update the patch to rename the file, but I'm still not clear if it is something we should go ahead with in general. I guess I'll wait and see what Matt says...
Comment on attachment 8838399 [details] [diff] [review]
delete-key3.patch

Review of attachment 8838399 [details] [diff] [review]:
-----------------------------------------------------------------

Hi Mark,

I'm fine with this approach if you can show me data that the issue this is trying to solve isn't a transient issue i.e. that deleting the file isn't worse for the user than leaving it there in cases where the issue is some kind of race or file-system issue.

For example, if you have telemetry from sync ping, can you show data that if a user gets this issue that they continue to get this error every time until it gets fixed?

::: toolkit/components/passwordmgr/crypto-SDR.js
@@ +67,5 @@
> +        this.log("Failed to initialize key3.db with a new password - clobbering it for next time!", ex);
> +        // use main-thread IO here as converting this to handle async is alot
> +        // of work and we should only end up here in the rarest of cases.
> +        let path = OS.Path.join(OS.Constants.Path.profileDir, "key3.db")
> +        let file = Cc['@mozilla.org/file/local;1'].createInstance(Ci.nsILocalFile);

Nit: double quotes

@@ +69,5 @@
> +        // of work and we should only end up here in the rarest of cases.
> +        let path = OS.Path.join(OS.Constants.Path.profileDir, "key3.db")
> +        let file = Cc['@mozilla.org/file/local;1'].createInstance(Ci.nsILocalFile);
> +        file.initWithPath(path);
> +        file.remove(false);

I agree that renaming with a unique name would be friendlier.
Attachment #8838399 - Flags: feedback?(MattN+bmo) → feedback+
One more observation here. The profile is not creating key3.db and logins.json files. So there is no point in taking a backup of those files.
After upgrading to Firefox 52, I set up a clean profile, with no data, no add-ons, didn't even put in my sync info. With this clean slate, there are already a ton of errors in the browser console. I can reliably reproduce this.
(In reply to mrstegeman from comment #22)
> After upgrading to Firefox 52, I set up a clean profile, with no data, no
> add-ons, didn't even put in my sync info. With this clean slate, there are
> already a ton of errors in the browser console. I can reliably reproduce
> this.

That's really odd - what platform are you on? This issue probably doesn't belong in this bug, but there may already be one on file - Matt, have you ever seen a clean profile fail to init the login manager?
Flags: needinfo?(MattN+bmo)
Windows 10, 64-bit. It happened on Windows 7 for me as well, before I upgraded my machine.

This bug is originally mine, so my issue has probably always been the same, but it does appear that lots of others have similar symptoms, at least.
My specific issue, as referenced above, has been resolved with an updated WebSense (Triton AP-Endpoint DLP) version, and Firefox Developer Edition 54.0a2 (2017-04-18).
(In reply to Mark Hammond [:markh] from comment #23)
> Matt, have you ever seen a clean profile fail to init the login manager?

No, I don't think so.
Flags: needinfo?(MattN+bmo)
i have this symptom as well when the plugin "GNOME keyring password integration" is enabled. and only then - without i have no error messages on startup.

when the plugin is enabled i can't store any passwords anymore (and previously stored passwords do not show up).

i use 45.9.0 on linux. i can't remember the exact version that worked (and stored PWDs in the keyring).
(In reply to github from comment #27)
> i have this symptom as well when the plugin "GNOME keyring password
> integration" is enabled. and only then - without i have no error messages on
> startup.
> 
> when the plugin is enabled i can't store any passwords anymore (and
> previously stored passwords do not show up).

Bug 1254321 tracks one issue with the gnome keyring, but this problem sounds more serious.

> i use 45.9.0 on linux. i can't remember the exact version that worked (and
> stored PWDs in the keyring).

Is there anything reported in the browser console? Regardless, you should report this to the addon author - in this bug we are trying to track this symptom in Firefox without any extensions installed or involved.
Whiteboard: [sync:passwords][sync:rigor]
See Also: → 1255992
The most recent analysis of sync telemetry tells us that this error accounts for 75% of password syncing failures.
Priority: P3 → --
Priority: -- → P3
Using Linux Mint 17, Firefox 56.0, This problem has been happening since FF 55, it's still not fixed; I have friends with using Arch Linux that has the same problem. How difficult can be to solve this guys? can I help in some way?
(In reply to mab2010 from comment #33)
> can I help in some way?

That would be great, actually! I haven't been able to reproduce this consistently, and don't have much insight into why the login manager is failing to initialize. Do you have any password management add-ons installed? These tend to interfere with Sync; can you still reproduce if you start Firefox in Safe Mode per comment 2?

If you don't have any password add-ons installed, could you also set the environment variables in comment 15, start Firefox, try syncing, and see what's logged? (https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging#Linux has more info about `MOZ_LOG`, too).

Thanks!
Flags: needinfo?(mab2010)
The easiest way to reproduce this error is just install the latest Linux Mint and fully update it; something changed from FF 55 to 56 though, before I would instantly get sync error, now it takes some time; at first I thought it was solved but nope.
I have 3 plugins installed, "Twitter App", Privacy Badger, and uBlock.

Also, setting the export variable to /tmp/log.txt (after creating the file with touch) doesn't seem to work for me for some reason; might a permission problem but I haven't looked it yet.

Finally, if I start FF in Safe Mode it happens the same after a while (15-30 minutes or so). It's really weird.

(In reply to Kit Cambridge (he/him) [:kitcambridge] (UTC-7) from comment #34)
> (In reply to mab2010 from comment #33)
> > can I help in some way?
> 
> That would be great, actually! I haven't been able to reproduce this
> consistently, and don't have much insight into why the login manager is
> failing to initialize. Do you have any password management add-ons
> installed? These tend to interfere with Sync; can you still reproduce if you
> start Firefox in Safe Mode per comment 2?
> 
> If you don't have any password add-ons installed, could you also set the
> environment variables in comment 15, start Firefox, try syncing, and see
> what's logged?
> (https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/
> HTTP_logging#Linux has more info about `MOZ_LOG`, too).
> 
> Thanks!
Flags: needinfo?(mab2010)
Bug 1415737 is interesting - it's not directly related to Sync, but I'm not aware of a non-Sync bug relating to "Services.logins is undefined". I'm also not sure that the issues reported there are actually this error, but it seems possible given the figured below.

Looking at Sync's telemetry [1], we see Firefox 54 and 55 report 70-80% of password sync failures are due to this error. Firefox 56 is much worse - around 95% of failures are this error.

IOW, our Sync telemetry is reporting a significant regression in this error for 56, which correlates with that bug.

Mat/Dolske, do you recall any changes in either 55 or 56 which might explain this?

[1] https://nbviewer.jupyter.org/gist/mhammond/649a6f6c5606cff0e0d9d8a4f0b9e9a1/Success-Failure-Error%20rates%20per%20engine%2C%20by%20version.ipynb
Flags: needinfo?(dolske)
Flags: needinfo?(MattN+bmo)
See Also: → 1415737
Barely anything has changed in password manager for a long time but there have been changes to PSM and NSS in preparation for key4.db e.g. bug 1380706. See https://bugzilla.mozilla.org/showdependencytree.cgi?id=783994&hide_resolved=0

Keeler would know better what, if any, of those would affect a 56 build. I don't believe 56 uses key4.db.
Flags: needinfo?(MattN+bmo) → needinfo?(dkeeler)
From skimming the changelog of the passwordmgr directory I would guess that https://reviewboard.mozilla.org/r/132722/diff/3#index_header could also be an issue if the LoginManagerParent code was previous initializing storage/crypto at a different time.
These are the PSM bugs fixed in 56: https://mzl.la/2iH2uCE
I don't see anything in this list that would cause the backing key db to fail.
Similarly for 55: https://mzl.la/2iKxCBh
Flags: needinfo?(dkeeler)
Flags: needinfo?(dolske)
See Also: → 1305749
Via bug 1295122:

> Login crypto:Initializing key3.db with default blank password.  				 crypto-SDR.js:55
> Login storage:Initialization failed: Exception { message: "Component returned failure code: 0x…", result: 2153185302, name: "NS_ERROR_XPC_GS_RETURNED_FAILURE", filename: "jar:file:///usr/lib/firefox/omni.ja…", lineNumber: 38, columnNumber: 0, data: null, stack: "get _crypto@jar:file:///usr/lib/fir…", location: XPCWrappedNative_NoHelper }  			 storage-json.js:88
> NS_ERROR_XPC_JAVASCRIPT_ERROR_WITH_DETAILS: [JavaScript Error: "Initialization failed" {file: "jar:file:///usr/lib/firefox/omni.ja!/components/storage-json.js" ... nsLoginManager.js:129

> Login crypto:Initializing key3.db with default blank password.  crypto-SDR.js:55
: Component returned failure code: 0x805a1f9b [nsIPK11Token.initPassword]  crypto-SDR.js:56
(In reply to Mark Hammond [:markh] from comment #40)
> Via bug 1295122:

Doh! Bug 1415051
I just commented in bug 1415737

--8<--

FWIW, I ended up running an analysis of sync telemetry from submissions over the last 200 days, to try and see if there does seem to be a trend upwards and couldn't identify anything. IOW, I can't see evidence that 56 or any other version is over represented, nor that there is a spike as new releases go out. IOW2, I guess I need to agree with Matt's comment [that there's no real pattern in this data].

https://gist.github.com/mhammond/77af60618b3ed3e9cb7abde6f0a69d26
Updating the title to reflect that the symptoms here aren't limited to the passwords engine, but also prevents Firefox Accounts saving Sync credentials and may interact poorly with the master-password.
Summary: Password sync fails with "Services.logins is undefined" → Sync fails with "Services.logins is undefined"
(Following on from Bug 1354023 report) Having upgraded to FF 61.0 today, I thought I'd have another go.   I left the three files in place Key3.db, Key4.db and logins.json), and re-enabled security.enterprise_roots.enabled .

Now FF hangs at startup, with the Master Password prompt visible, but unresponsive.   I have to force-quit it and edit prefs.js to disable security.enterprise_roots.enabled .
(In reply to kkilfedder from comment #51)
> (Following on from Bug 1354023 report) Having upgraded to FF 61.0 today, I
> thought I'd have another go.   I left the three files in place Key3.db,
> Key4.db and logins.json), and re-enabled security.enterprise_roots.enabled .
> 
> Now FF hangs at startup, with the Master Password prompt visible, but
> unresponsive.   I have to force-quit it and edit prefs.js to disable
> security.enterprise_roots.enabled .

That sounds like a regression from bug 1427248 maybe as it sounds like the hang didn't happen with 60? Can you file a bug blocking that one:
https://bugzilla.mozilla.org/enter_bug.cgi?format=__default__&product=Core&component=Security:%20PSM&blocked=1427248
Flags: needinfo?(kkilfedder)
Correct - it didn't hang on 60.   (Instead, it logged out of Firefox Sync and stopped showing my saved passwords; otherwise everything worked normally).
Flags: needinfo?(kkilfedder)
> That sounds like a regression from bug 1427248 maybe as it sounds like the
> hang didn't happen with 60? Can you file a bug blocking that one:
> https://bugzilla.mozilla.org/enter_bug.
> cgi?format=__default__&product=Core&component=Security:%20PSM&blocked=1427248

Filed here: https://bugzilla.mozilla.org/show_bug.cgi?id=1471932  Hope I've done it right!
See Also: → 1562743
Severity: normal → S3

The severity field for this bug is relatively low, S3. However, the bug has 13 duplicates and 5 See Also bugs.
:skhamis, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(skhamis)

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

Flags: needinfo?(skhamis)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: