Closed Bug 990834 Opened 6 years ago Closed 6 years ago

hawkclient errors and retry/backoff headers not handled correctly

Categories

(Firefox :: Sync, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla31
Tracking Status
firefox29 + fixed
firefox30 + fixed
firefox31 --- fixed

People

(Reporter: markh, Assigned: markh)

References

Details

Attachments

(3 files, 2 obsolete files)

This is almost 2 bugs in one, but it seems easier to handle them both here - mainly as the fixes both rely on a "part 1" refactor of the test code.

* hawkclient doesn't handle backoff or retry headers.  This is self-explanatory.

* hawkclient errors aren't handled correctly by both FxAccounts and browserid_identity.
** FxAccounts fails to reject the whenKeysReadyPromise on failure, causing browserid_identity to "hang" waiting for a promise that's never resolved nor rejected.
** Once this is fixed, browserid_identity fails to handle hawk errors as it does tokenserver errors, so sync can get into a bad state (ie, will not be in the correct LOGIN_FAILED_LOGIN_REJECTED vs LOGIN_FAILED_NETWORK_ERROR state.)
A relatively simple refactor of the hawk utility functions, used by later patches in this bug.
Attachment #8400328 - Flags: review?(ckarlof)
This patch adds support for X-Backoff, X-Weave-Backoff and Retry-After headers to hawk.  It also renames the notifications sent by tokenserverclient from 'tokenserver:backoff:interval' to 'common:service:backoff:interval', and this same notification is sent by hawk.  As a bonus feature-creep, this also has tokenserverclient handle both 'X-Backoff' and 'X-Weave-Backoff'
Attachment #8400329 - Flags: review?(rnewman)
FxAccounts didn't handle fetchAndUnwrapKeys() failing, which caused whenKeysReadyDeferred to never be resolved nor rejected.

When fetching keys failed, browserid_identity didn't do it's normal "is this an auth error" logic - that was only done via _fetchToken.  The approach I took here was to have the key-fetch happen if necessary as part of _fetchToken, which allows for the same error handling logic to be used in all cases, so 401 errors are treated as auth and other errors treated as network errors.
Assignee: nobody → mhammond
Status: NEW → ASSIGNED
Attachment #8400334 - Flags: review?(ckarlof)
Blocks: 969228
Comment on attachment 8400329 [details] [diff] [review]
0002-Bug-990834-part-2-Add-support-for-retry-and-backoff-.patch

Review of attachment 8400329 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/common/hawkclient.js
@@ +167,5 @@
>        }
>  
>        self._updateClockOffset(restResponse.headers["date"]);
>  
> +      if (status === 401 && retryOK && !restResponse.headers["retry-after"]) {

"retry-after" in restResponse.headers

@@ +232,5 @@
> +      this._log.error("hawkclient response had invalid backoff value in '" +
> +                      headerName + "' header: " + headerVal);
> +      return;
> +    }
> +    Observers.notify("common:service:backoff:interval", backoffInterval);

common:service:backoff doesn't seem like a good idea unless we also include *which* service (and probably which hostname).

Consider two token server clients, both listening for this notification. Which one should back off?

Probably we should add some kind of additional topic here, which we can thread through and pull out of the request/response pair.

::: services/sync/modules/policies.js
@@ +181,5 @@
>          this.adjustSyncInterval();
>          this.nextSync = 0;
>          this.handleSyncError();
>          break;
> +      case "common:service:backoff:interval":

Further to earlier comment: and check that it's a service/host we care about!
Attachment #8400329 - Flags: review?(rnewman) → feedback+
I *think* this is what you had in mind:

* Both hawkclient and tokenserverclient now have an 'observerPrefix' property.  Notifications are only set if this is set, and the topic is prefixed with the value.

* FxAccountsClient arranges for hawkclient to have a prefix of "FxA:hawk"; browserid_identity arranges for tokenserverclient to have a prefix of "weave:service" (which means the observers sent by tokenserverclient are now what policies.js already expects)

* policies.js now listens for "FxA:hawk:backoff:interval" and "weave:service:backoff:interval"
Attachment #8400329 - Attachment is obsolete: true
Attachment #8400425 - Flags: feedback?(rnewman)
Comment on attachment 8400328 [details] [diff] [review]
0001-Bug-990834-part-1-minor-refactor-of-hawk-tests.-r-ck.patch

Review of attachment 8400328 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/sync/tests/unit/test_browserid_identity.js
@@ +498,3 @@
>  // A token server mock will be used that doesn't hit a server, so we move
>  // directly to a hawk request.
> +function* initializeIdentityWithHAWKResponseFactory(config, cbGetResponse) {

Interesting choice of naming here. The abstraction that browserid_identity deals with is FxAccounts, not Hawk, which is an implementation detail of FxAccountsClient, which is also behind the abstraction layer of FxAccounts. That said, the network error is currently being bubbled up to the browserid_identity layer, so all this makes sense. But this does call into question our local API (particularly errors) for FxAccounts reliers.
Attachment #8400328 - Flags: review?(ckarlof) → review+
Comment on attachment 8400425 [details] [diff] [review]
0002-Bug-990834-part-2-Add-support-tweak-retry-and-backof.patch

Review of attachment 8400425 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/common/hawkclient.js
@@ +230,5 @@
> +
> +  set observerPrefix(value) {
> +    this._observerPrefix = value;
> +  },
> +  get observerPrefix() this._observerPrefix,

Newline before getter, and it looks like the comment should live with the setter, not the member.

::: services/common/tokenserverclient.js
@@ +418,5 @@
> +   * Set the prefix used for all notifications sent by this module.  This
> +   * allows the handler of notifications to be sure they are handling
> +   * notifications for the service they expect.
> +   *
> +   * If not set, no notifications will be sent.

Same comment as above.

@@ +423,5 @@
> +   */
> +  _observerPrefix: null,
> +
> +  set observerPrefix(value) {
> +    this._observerPrefix = value;

Is there any point in having getters and setters for these? Looks like unnecessary Java to me.
Attachment #8400425 - Flags: feedback?(rnewman) → feedback+
Comment on attachment 8400334 [details] [diff] [review]
0003-Bug-990834-part-3-Fix-handling-of-hawk-errors.-r-cka.patch

Review of attachment 8400334 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/fxaccounts/FxAccounts.jsm
@@ +506,5 @@
> +              return;
> +            }
> +            currentState.whenKeysReadyDeferred.resolve(data);
> +          },
> +          err => currentState.whenKeysReadyDeferred.reject(err)

Thank you. I had this in a todo list somewhere that I don't think ever made it into a bug.
Attachment #8400334 - Flags: review?(ckarlof) → review+
Mark, could you make sure this bug lands in m-c and fill the uplift requests for aurora & beta? Thanks
Flags: needinfo?(mhammond)
Feedback comments addressed (and yeah, I dropped the getters/setters entirely)
Attachment #8400425 - Attachment is obsolete: true
Attachment #8403099 - Flags: review?(rnewman)
Comment on attachment 8403099 [details] [diff] [review]
0002-Bug-990834-part-2-Add-support-tweak-retry-and-backof.patch

Review of attachment 8403099 [details] [diff] [review]:
-----------------------------------------------------------------

Good enough for me!
Attachment #8403099 - Flags: review?(rnewman) → review+
Comment on attachment 8400328 [details] [diff] [review]
0001-Bug-990834-part-1-minor-refactor-of-hawk-tests.-r-ck.patch

This request is for all 3 patches

[Approval Request Comment]
Bug caused by (feature/regressing bug #): FxA Sync
User impact if declined: Sync might not work correctly if the tokenserver is overloaded or undergoing maintenance.
Testing completed (on m-c, etc.): Landed on fx-team and green there.
Risk to taking this patch (and alternatives if risky): Low risk, limited to sync
String or IDL/UUID changes made by this patch: None
Attachment #8400328 - Flags: approval-mozilla-beta?
Attachment #8400328 - Flags: approval-mozilla-aurora?
Attachment #8400328 - Flags: approval-mozilla-beta?
Attachment #8400328 - Flags: approval-mozilla-beta+
Attachment #8400328 - Flags: approval-mozilla-aurora?
Attachment #8400328 - Flags: approval-mozilla-aurora+
And you got that love, removing keyword.
Comment on attachment 8400334 [details] [diff] [review]
0003-Bug-990834-part-3-Fix-handling-of-hawk-errors.-r-cka.patch

Review of attachment 8400334 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/sync/modules/browserid_identity.js
@@ +490,5 @@
>          // TODO: Make it be only 80% of the duration, so refresh the token
>          // before it actually expires. This is to avoid sync storage errors
>          // otherwise, we get a nasty notification bar briefly. Bug 966568.
>          token.expiration = this._now() + (token.duration * 1000) * 0.80;
> +        if (!this._syncKeyBundle) {

This guard is one the suspects in Bug 1056523. I haven't uncovered the reasoning for its introduction yet.
Mark,

Scenario:

User resets has browser 1 connected to sync and open, and resets her password on browser 2.

When she is prompted to "reconnect to sync" in browser 1, this guard is preventing us from updating her sync key bundle with the new key.

What do you think?

We have another similar guard which is deserving of scrutiny as well: http://hg.mozilla.org/mozilla-central/annotate/0b9dd32d1e16/services/sync/modules/browserid_identity.js#l139
Flags: needinfo?(mhammond)
Related to this is the removal the line

this._syncKeyBundle = null

in bug 982848, which in conjunction with the introduction of the above mentioned guard is causing problems.
I think I've identified the problem. Discussion in bug 1056523.
Flags: needinfo?(mhammond)
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox
You need to log in before you can comment on or make changes to this bug.