977502 - Sync will never retry if initial authentication fails.

Assignee

Description

•

11 years ago

If FxA fails to authenticate, things don't work quite as expected. Specifically, Sync goes into a CLIENT_NOT_CONFIGURED state which prevents Sync from ever automatically retrying. Note too that even if sync *did* retry, there is no mechanism by which browserid_identity can be asked to restart the auth dance again. This is needed to fix bug 969528 - currently *every* problem causes Sync Prefs to enter a "must re-login" state, and every other part of the UI to enter a "please setup sync" state.

Mark Hammond [:markh] [:mhammond]

Assignee

Comment 1

•

11 years ago

Attached patch 0001-Bug-977502-better-management-of-login-failure-states.patch (obsolete) — Details — Splinter Review

Currently most errors authenticating will cause Status.service to be CLIENT_NOT_CONFIGURED - If failure occurs before we have a token, we return LOGIN_FAILED_NO_USERNAME for currentAuthState, which causes CLIENT_NOT_CONFIGURED. This patch arranges for currentAuthState to return either LOGIN_FAILED_LOGIN_REJECTED (if a "real" 401 auth error) or LOGIN_FAILED_NETWORK_ERROR for other problems. The former code is treated as transient by Sync (ie, it should retry as normal) while the latter is considered fatal so the retry semantics differ appropriately. While Sync will now retry, it has no way of forcing bid_identity to restart the login process, so this patch adds a new identity.login() method, which asks the identity to start a new login process - the legacy identity does nothing. Sync calls this method in Service.login(), which is only called if we aren't already logged in. A fly in this ointment is that even after .login() is called, we aren't yet .readyToAuthenticate - so that login will actually fail with Status.login = LOGIN_FAILED_NOT_READY;. So there's a new observer in policies for 'weave:service:login:complete', which tells Sync to sync now if a sync is due. browserid_identity sends this notification once things are done, so everything (should ;) work as expected.

Attachment #8382870 - Flags: feedback?(rnewman)

Attachment #8382870 - Flags: feedback?(ckarlof)

Mark Hammond [:markh] [:mhammond]

Assignee

Comment 2

•

11 years ago

Attached patch 0002-Bug-977502-add-tests-with-mock-http-responses-simula.patch (obsolete) — Details — Splinter Review

This patch adds somewhat "deep" testing for auth failures. It arranges for fake HTTP responses to be delivered to both the token server and the auth server for hawk. If arranges for a real 401 response and also a 200 non-json response - the former causes sync to treat the error as a "real" auth error that requires you to sign back in, while the latter is treated as "transient" (just like 500 and all other non-401 responses are). While the fake responses make the mock setup a little more painful, it seemed the right thing to do rather so the full response-processing part of both those modules is exercised.

Attachment #8382873 - Flags: feedback?(warner-bugzilla)

Attachment #8382873 - Flags: feedback?(ckarlof)

Mark Hammond [:markh] [:mhammond]

Assignee

Comment 3

•

11 years ago

(In reply to Mark Hammond [:markh] from comment #1) > This patch arranges for currentAuthState to return either > LOGIN_FAILED_LOGIN_REJECTED (if a "real" 401 auth error) or > LOGIN_FAILED_NETWORK_ERROR for other problems. The former code is treated > as transient by Sync (ie, it should retry as normal) while the latter is > considered fatal so the retry semantics differ appropriately. Doh - got them back-to-front - LOGIN_FAILED_LOGIN_REJECTED is fatal, LOGIN_FAILED_NETWORK_ERROR is transient.

Mark Hammond [:markh] [:mhammond]

Assignee

Comment 4

•

11 years ago

Attached patch 0001-Bug-977502-better-management-of-login-failure-states.patch (obsolete) — Details — Splinter Review

Now with moar manual testing :) This patch also removes the special handling for LOGIN_FAILED_NOT_READY in policies.js - now we have an observer that tells us when a login has completed we recover via this. Note that this does *not* cause a pending sync to happen immediately - eg, the flow is now something like: * User says "sync now" * Sync notices the identity isn't ready, so asks it to log in. * Sync still fails due to IDENTITY_NOT_READY, and backs off. * Finally we login, which policy.js sees via an observer. * This causes checkSyncStatus() to be called, which arranges for sync to happen on its normal schedule. Ideally, in the last point, Sync would start immediately. The thorn in that ointment is that just because policy.js sees the login:complete notification doesn't mean it should sync now - it does in the scenario above, but not in the "normal" case. Given this patch still allows sync to continue (where it didn't before) I think we can just fix the "but I said sync now" issue in a followup.

Attachment #8382870 - Attachment is obsolete: true

Attachment #8382870 - Flags: feedback?(rnewman)

Attachment #8382870 - Flags: feedback?(ckarlof)

Attachment #8383458 - Flags: feedback?(rnewman)

Attachment #8383458 - Flags: feedback?(ckarlof)

Richard Newman [:rnewman]

Comment 5

•

11 years ago

Comment on attachment 8382870 [details] [diff] [review] 0001-Bug-977502-better-management-of-login-failure-states.patch Review of attachment 8382870 [details] [diff] [review]: ----------------------------------------------------------------- This pains me a lot. I'm still digesting to see if there's a less painful solution than "fail all the time, add more state to record failure". ::: services/sync/modules/browserid_identity.js @@ +118,5 @@ > + if (!this.readyToAuthenticate) { > + return; // we are already logging in, so don't start a new one. > + } > + // Note we do not return a promise - this just kicks things off. > + this.initializeWithCurrentIdentity(); That also means that any errors in this promise chain won't be caught. That's not a new problem (it was previously called in an observer) but it's worth considering. ::: services/sync/modules/identity.js @@ +96,5 @@ > /** > + * Begin an explicit, async login process. > + */ > + login: function() { > + // This identity manager has no explcit login process, so nothing to do. explicit ::: services/sync/modules/service.js @@ +943,5 @@ > if (passphrase) { > this.identity.syncKey = passphrase; > } > > + this.identity.login(); I don't think this is the right term for this. What you're talking about is "kick off an asynchronous initialization process for the current credentials we have". That this is async seems fundamentally at odds with the very next line -- _checkSetup.

Attachment #8382870 - Attachment is obsolete: false

Mark Hammond [:markh] [:mhammond]

Assignee

Comment 6

•

11 years ago

Attached patch Now with moar blocking! (obsolete) — Details — Splinter Review

(In reply to Richard Newman [:rnewman] from comment #5) > Comment on attachment 8382870 [details] [diff] [review] > 0001-Bug-977502-better-management-of-login-failure-states.patch > > This pains me a lot. I'm still digesting to see if there's a less painful > solution than "fail all the time, add more state to record failure". Indeed. Stepping back a little, I realized a better solution is to just have Service.login() block waiting for the login. Service.sync() is the only thing that calls .login(), so this actually seems sane. .verifyLogin is called more often and in ways that blocking would be bad - but blocking during a .sync() call sounds reasonable given what it does, and given it is protected via the locks etc, things should work out fine. Assuming I'm not missing something there, this makes things *alot* saner. For one, it means the sync will never fail due to Status.login being LOGIN_FAILED_NOT_READY (so this patch still removes that code). It also means that sync() will automagically do the right thing WRT retries etc, which simplifies things alot and saves us from the "fail all the time, add more state to record failure" state you correctly identity as a problem. We also don't need new observers etc. > ::: services/sync/modules/browserid_identity.js > @@ +118,5 @@ > > + if (!this.readyToAuthenticate) { > > + return; // we are already logging in, so don't start a new one. > > + } > > + // Note we do not return a promise - this just kicks things off. > > + this.initializeWithCurrentIdentity(); > > That also means that any errors in this promise chain won't be caught. > That's not a new problem (it was previously called in an observer) but it's > worth considering. This new patch has the same "problem" - however, note that initializeWithCurrentIdentity() itself has trailing rejection handlers that log the failure, so it's really not necessary to have extra ones. > I don't think this is the right term for this. What you're talking about is > "kick off an asynchronous initialization process for the current credentials > we have". > > That this is async seems fundamentally at odds with the very next line -- > _checkSetup. Indeed - so now it's .ensureLogin() that returns a promise which is resolved when a login process is complete, starting a new one if there isn't already one in progress and we are in an error state. service.js does the spinningly-thang on the promise. Seems much simpler - I hope I'm not missing something.

Attachment #8382870 - Attachment is obsolete: true

Attachment #8383458 - Attachment is obsolete: true

Attachment #8383458 - Flags: feedback?(rnewman)

Attachment #8383458 - Flags: feedback?(ckarlof)

Attachment #8384409 - Flags: feedback?(rnewman)

Attachment #8384409 - Flags: feedback?(ckarlof)

Richard Newman [:rnewman]

Comment 7

•

11 years ago

Comment on attachment 8384409 [details] [diff] [review] Now with moar blocking! Review of attachment 8384409 [details] [diff] [review]: ----------------------------------------------------------------- This looks sane. I would love a test. ::: services/sync/modules/browserid_identity.js @@ +23,5 @@ > Cu.import("resource://gre/modules/FxAccounts.jsm"); > > // Lazy imports to prevent unnecessary load on startup. > +XPCOMUtils.defineLazyModuleGetter(this, "Weave", > + "resource://services-sync/main.js"); Recursive imports, ho! :P @@ +130,5 @@ > + if (!this._shouldHaveSyncKeyBundle) { > + // We are already in the process of logging in. > + return this.whenReadyToAuthenticate.promise; > + } > + // If we are already happy then there is nothing more to do. Nit: newlines before comments. (Throughout.) @@ +134,5 @@ > + // If we are already happy then there is nothing more to do. > + if (Weave.Status.login == STATUS_OK) { > + return Promise.resolve(); > + } > + // Similarly, if we have a previous failure that implies an explicit login Careful with terminology here. Sync's "login" ain't the same as "login" as you intend it -- you mean "re-entering credentials", right? @@ +150,1 @@ > initializeWithCurrentIdentity: function(isInitialSync=false) { Worth leaving a comment for this function that it out-of-band 'returns' this.whenReadyToAuthenticate. ::: services/sync/modules/policies.js @@ -165,5 @@ > - this._loginNotReadyCounter++; > - this._log.debug("Couldn't log in: identity not ready."); > - this._log.trace("Scheduling a sync at IDENTITY_NOT_READY_RETRY_INTERVAL * " + > - this._loginNotReadyCounter); > - this.scheduleAtInterval(IDENTITY_NOT_READY_RETRY_INTERVAL * Do we still need IDENTITY_NOT_READY_RETRY_INTERVAL? ::: services/sync/modules/service.js @@ +951,5 @@ > + // Ask the identity manager to explicitly login now. > + let cb = Async.makeSpinningCallback(); > + this.identity.ensureLoggedIn().then( > + () => cb(), > + err => cb(err, null) .then(cb, cb)? @@ +953,5 @@ > + this.identity.ensureLoggedIn().then( > + () => cb(), > + err => cb(err, null) > + ); > + // just let any errors bubble up - they've more context than we do! s/just/Just, newline before.

Attachment #8384409 - Flags: feedback?(rnewman) → feedback+

Chris Karlof [:ckarlof]

Comment 8

•

11 years ago

Comment on attachment 8384409 [details] [diff] [review] Now with moar blocking! Review of attachment 8384409 [details] [diff] [review]: ----------------------------------------------------------------- Overall, I like the direction here, but I need to think about this a bit more. ::: services/sync/modules/browserid_identity.js @@ +142,5 @@ > + return Promise.reject(); > + } > + // So - we've a previous auth problem and aren't currently attempting to > + // login - so fire that off. > + this.initializeWithCurrentIdentity(); I'm confused why complete re-initialization should happen here. @@ +526,5 @@ > + // Drop the sync key bundle, but still expect to have one. > + // This will arrange for us to be in the right 'currentAuthState' > + // such that UI will show the right error. > + this._shouldHaveSyncKeyBundle = true; > + this._syncKeyBundle = null; Do you think we should change this to this.resetSyncKey()? We would then have to set this._shouldHaveSyncKeyBundle = true afterwards, because that function sets it to false. @@ +528,5 @@ > + // such that UI will show the right error. > + this._shouldHaveSyncKeyBundle = true; > + this._syncKeyBundle = null; > + Weave.Status.login = this._authFailureReason; > + Services.obs.notifyObservers(null, "weave:service:login:error", null); Tracing through the code, I feel like we might need a Weave.Service.logout() here or in the handler for weave:service:login:error (need to be careful because logout() will fire "weave:service:logout:finish", which is handled in browserid_identity.js. However, we are likely abusing that notification. Please see my comment in Bug 967015.) I think we'll need a Service.logout() because that is what causes Service._shouldLogin() to return false, which will then cause Service.login() to fail and prevent the sync from happening. ::: services/sync/modules/policies.js @@ -166,5 @@ > - this._log.debug("Couldn't log in: identity not ready."); > - this._log.trace("Scheduling a sync at IDENTITY_NOT_READY_RETRY_INTERVAL * " + > - this._loginNotReadyCounter); > - this.scheduleAtInterval(IDENTITY_NOT_READY_RETRY_INTERVAL * > - this._loginNotReadyCounter); Was this stuff added by us?

Attachment #8384409 - Flags: feedback+ → feedback?(rnewman)

Chris Karlof [:ckarlof]

Comment 9

•

11 years ago

Comment on attachment 8384409 [details] [diff] [review] Now with moar blocking! Review of attachment 8384409 [details] [diff] [review]: ----------------------------------------------------------------- sorry for unplussing your feedback, rnewman

Attachment #8384409 - Flags: feedback?(rnewman) → feedback+

Chris Karlof [:ckarlof]

Comment 10

•

11 years ago

Comment on attachment 8382873 [details] [diff] [review] 0002-Bug-977502-add-tests-with-mock-http-responses-simula.patch Review of attachment 8382873 [details] [diff] [review]: ----------------------------------------------------------------- nice improvements

Attachment #8382873 - Flags: feedback?(ckarlof) → feedback+

Mark Hammond [:markh] [:mhammond]

Assignee

Comment 11

•

11 years ago

Attached patch 0001-Bug-977502-part-1-better-management-of-login-failure.patch — Details — Splinter Review

This addresses the feedback and discussion so far. There is a test for test_ensureLoggedIn() - it would be better if there was one covering a full sync, but that has proven difficult in the short term.

Attachment #8384409 - Attachment is obsolete: true

Attachment #8384409 - Flags: feedback?(ckarlof)

Attachment #8386031 - Flags: review?(rnewman)

Attachment #8386031 - Flags: review?(ckarlof)

Mark Hammond [:markh] [:mhammond]

Assignee

Comment 12

•

11 years ago

Attached patch 0002-Bug-977502-part-2-add-tests-with-mock-http-responses.patch — Details — Splinter Review

The hawk and token-server tests.

Attachment #8382873 - Attachment is obsolete: true

Attachment #8382873 - Flags: feedback?(warner-bugzilla)

Attachment #8386032 - Flags: review?(ckarlof)

James Bonacci [:jbonacci]

Updated

•

11 years ago

Whiteboard: [qa+]

Chris Karlof [:ckarlof]

Comment 13

•

11 years ago

Comment on attachment 8386031 [details] [diff] [review] 0001-Bug-977502-part-1-better-management-of-login-failure.patch Review of attachment 8386031 [details] [diff] [review]: ----------------------------------------------------------------- I'm liking it. Since the try build was a bust, I still need to do some manual testing though. ::: services/sync/modules/browserid_identity.js @@ +534,5 @@ > + // Drop the sync key bundle, but still expect to have one. > + // This will arrange for us to be in the right 'currentAuthState' > + // such that UI will show the right error. > + this._shouldHaveSyncKeyBundle = true; > + this._syncKeyBundle = null; I'm still not sure if we should necessarily clear the syncKeyBundle for all errors, or just auth errors. I'm not sure if we need to do Service.logout() here for auth errors. Also for LOGIN_FAILED_LOGIN_REJECTED, it would be nice to arrange for sync to be de-scheduled somehow, because there will be no auto-recovery here. ::: services/sync/modules/policies.js @@ -166,5 @@ > - this._log.debug("Couldn't log in: identity not ready."); > - this._log.trace("Scheduling a sync at IDENTITY_NOT_READY_RETRY_INTERVAL * " + > - this._loginNotReadyCounter); > - this.scheduleAtInterval(IDENTITY_NOT_READY_RETRY_INTERVAL * > - this._loginNotReadyCounter); I think these are part of legacy sync. Why are we removing them?

Attachment #8386031 - Flags: review?(ckarlof) → review+

Chris Karlof [:ckarlof]

Comment 14

•

11 years ago

Comment on attachment 8386032 [details] [diff] [review] 0002-Bug-977502-part-2-add-tests-with-mock-http-responses.patch Review of attachment 8386032 [details] [diff] [review]: ----------------------------------------------------------------- Nice improvements.

Attachment #8386032 - Flags: review?(ckarlof) → review+

Richard Newman [:rnewman]

Comment 15

•

11 years ago

Comment on attachment 8386031 [details] [diff] [review] 0001-Bug-977502-part-1-better-management-of-login-failure.patch Review of attachment 8386031 [details] [diff] [review]: ----------------------------------------------------------------- ::: services/sync/modules/browserid_identity.js @@ +144,5 @@ > + return Promise.reject(); > + } > + > + // So - we've a previous auth problem and aren't currently attempting to > + // login - so fire that off. Nit: "log in" ::: services/sync/tests/unit/test_browserid_identity.js @@ +235,5 @@ > + // arrange for no logged in user. > + let fxa = browseridManager._fxaService > + let signedInUser = fxa.internal.currentAccountState.signedInUser; > + fxa.internal.currentAccountState.signedInUser = null; > + Status.login = LOGIN_FAILED_NO_USERNAME; You should ensure that shouldHaveSyncKeyBundle is false here, otherwise you're not testing much…

Attachment #8386031 - Flags: review?(rnewman) → review+

Mark Hammond [:markh] [:mhammond]

Assignee

Comment 16

•

11 years ago

https://hg.mozilla.org/integration/fx-team/rev/6e4f7fc7e1f3 https://hg.mozilla.org/integration/fx-team/rev/00825a70f56f

status-firefox29: --- → affected

status-firefox30: --- → affected

tracking-firefox29: --- → ?

Mark Hammond [:markh] [:mhammond]

Assignee

Comment 17

•

11 years ago

Attached patch Part 2 - Rebased against aurora — Details — Splinter Review

Part 2 is not going to apply on aurora, but this one will. I'll request approval once they get to m-c

Attachment #8387382 - Flags: review+

Mark Hammond [:markh] [:mhammond]

Assignee

Updated

•

11 years ago

Attachment #8387382 - Attachment description: Rebased against aurora → Part 2 - Rebased against aurora

Sylvestre Ledru [:Sylvestre]

Updated

•

11 years ago

tracking-firefox29: ? → +

Ryan VanderMeulen [:RyanVM]

Comment 18

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/6e4f7fc7e1f3 https://hg.mozilla.org/mozilla-central/rev/00825a70f56f

Status: ASSIGNED → RESOLVED

Closed: 11 years ago

Flags: in-testsuite+

Resolution: --- → FIXED

Target Milestone: --- → Firefox 30

Chris Karlof [:ckarlof]

Comment 19

•

11 years ago

Comment on attachment 8386031 [details] [diff] [review] 0001-Bug-977502-part-1-better-management-of-login-failure.patch Review of attachment 8386031 [details] [diff] [review]: ----------------------------------------------------------------- ::: services/sync/modules/browserid_identity.js @@ -479,5 @@ > - // such that UI will show the right error. > - this._shouldHaveSyncKeyBundle = true; > - this._syncKeyBundle = null; > - Weave.Status.login = this.currentAuthState; > - Services.obs.notifyObservers(null, "weave:service:login:error", null); I think the removal of this line might be a regression. There are at least two significant listeners of this: 1) Sync UI, which use it as a signal to update the "logged in" state 2) The SyncScheduler, which when Status.login enters an "unrecoverable state", like LOGIN_FAILED_LOGIN_REJECTED, then it stops scheduling syncs, which makes sense.

Chris Karlof [:ckarlof]

Comment 20

•

11 years ago

> 2) The SyncScheduler, which when Status.login enters an "unrecoverable state", like LOGIN_FAILED_LOGIN_REJECTED, then it stops scheduling syncs, which makes sense. http://hg.mozilla.org/mozilla-central/diff/35a1943f351c/services/sync/modules/policies.js#l1.128

James Bonacci [:jbonacci]

Comment 21

•

11 years ago

So, maybe not ready for Resolved/Fixed status?

Mark Hammond [:markh] [:mhammond]

Assignee

Comment 22

•

11 years ago

Comment on attachment 8386031 [details] [diff] [review] 0001-Bug-977502-part-1-better-management-of-login-failure.patch [Approval Request Comment] Bug caused by (feature/regressing bug #): Fxa Sync User impact if declined: Incorrect error state handling. Testing completed (on m-c, etc.): Landed on m-c Risk to taking this patch (and alternatives if risky): Low String or IDL/UUID changes made by this patch: None Note to sheriffs or whoever lands these: These patches may conflict unless they are landed in the order of: Bug 977502, Bug 967015 and Bug 969528

Attachment #8386031 - Flags: approval-mozilla-aurora?

Mark Hammond [:markh] [:mhammond]

Assignee

Comment 23

•

11 years ago

Comment on attachment 8387382 [details] [diff] [review] Part 2 - Rebased against aurora [Approval Request Comment] Bug caused by (feature/regressing bug #): Fxa Sync User impact if declined: Incorrect error state handling. Testing completed (on m-c, etc.): Landed on m-c Risk to taking this patch (and alternatives if risky): Low String or IDL/UUID changes made by this patch: None Note to sheriffs or whoever lands these: These patches may conflict unless they are landed in the order of: Bug 977502, Bug 967015 and Bug 969528

Attachment #8387382 - Flags: approval-mozilla-aurora?

Sylvestre Ledru [:Sylvestre]

Updated

•

11 years ago

status-firefox30: affected → fixed

Sylvestre Ledru [:Sylvestre]

Updated

•

11 years ago

Attachment #8387382 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Ryan VanderMeulen [:RyanVM]

Comment 24

•

11 years ago

https://hg.mozilla.org/releases/mozilla-aurora/rev/23b00776e8f5 https://hg.mozilla.org/releases/mozilla-aurora/rev/bc4f7a9cd543

status-firefox29: affected → fixed

Sylvestre Ledru [:Sylvestre]

Comment 25

•

11 years ago

Comment on attachment 8386031 [details] [diff] [review] 0001-Bug-977502-part-1-better-management-of-login-failure.patch Sorry, I missed this patch.

Attachment #8386031 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Chris Karlof [:ckarlof]

Updated

•

11 years ago

Blocks: 968439

Tracy Walker [:tracy]

Updated

•

11 years ago

Keywords: verifyme

Tracy Walker [:tracy]

Updated

•

11 years ago

Status: RESOLVED → VERIFIED

QA Whiteboard: [qa-]

status-firefox29: fixed → verified

status-firefox30: fixed → verified

Keywords: verifyme

Whiteboard: [qa+]

0001-Bug-977502-better-management-of-login-failure-states.patch 11 years ago Mark Hammond [:markh] [:mhammond] 10.39 KB, patch		Details \| Diff \| Splinter Review
0002-Bug-977502-add-tests-with-mock-http-responses-simula.patch 11 years ago Mark Hammond [:markh] [:mhammond] 11.69 KB, patch	ckarlof : feedback+	Details \| Diff \| Splinter Review
0001-Bug-977502-better-management-of-login-failure-states.patch 11 years ago Mark Hammond [:markh] [:mhammond] 13.89 KB, patch		Details \| Diff \| Splinter Review
Now with moar blocking! 11 years ago Mark Hammond [:markh] [:mhammond] 15.05 KB, patch	ckarlof : feedback+	Details \| Diff \| Splinter Review
0001-Bug-977502-part-1-better-management-of-login-failure.patch 11 years ago Mark Hammond [:markh] [:mhammond] 19.94 KB, patch	ckarlof : review+ rnewman : review+ Sylvestre : approval-mozilla-aurora+	Details \| Diff \| Splinter Review
0002-Bug-977502-part-2-add-tests-with-mock-http-responses.patch 11 years ago Mark Hammond [:markh] [:mhammond] 8.99 KB, patch	ckarlof : review+	Details \| Diff \| Splinter Review
Part 2 - Rebased against aurora 11 years ago Mark Hammond [:markh] [:mhammond] 8.96 KB, patch	markh : review+ Sylvestre : approval-mozilla-aurora+	Details \| Diff \| Splinter Review