Closed Bug 1227527 Opened 8 years ago Closed 8 years ago

Initial implementation of Firefox Accounts device registration

Categories

(Firefox :: Sync, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Firefox 46
Tracking Status
firefox46 --- fixed

People

(Reporter: pb, Assigned: pb)

References

Details

(Whiteboard: [fxa-waffle])

Attachments

(1 file, 13 obsolete files)

74.32 KB, patch
Details | Diff | Splinter Review
There are a lot of different bugs open for FxA device registration and tracking them individually is a pain in the bum. So I'm creating this to cover the initial feature set that I plan to implement / get reviewed. I'll shortly close out those it obsoletes as duplicates of this issue (and leave those it doesn't obsolete alone).

The scope of this issue is to:

* implement a device API client object
* via calls to the aforementioned object, make the following API requests:
  * POST /account/device
    * after a signed-in user opens the browser
    * after a user signs up for Sync
    * after a user signs in to Sync
    * after editing the device name
  * POST /account/device/destroy
    * after the user disconnects Sync
* send a WebChannel message after the user disconnects Sync
* subscribe to a `fxaccounts:deleteDevice` WebChannel message
  and delete the locally-stored device id in the message handler
* accept a device id parameter in `fxaccounts:login` messages

Things that this issue will not cover:

* #1221111, sending WebChannel events for device name changes
  (the content server UI will have a refresh button)
* #1221125, subscribing to WebChannel events for device name changes
  (the content server UI will not support renaming)
* #1221105, fetching updated device details from the auth server on startup
  (the content server UI will not support renaming)
* push notifications from the auth server
Correction to comment 0:

Where I said "subscribe to a `fxaccounts:deleteDevice` WebChannel message", the message name is wrong. It should have been `fxaccounts:logout` instead of `fxaccounts:deleteDevice`.
Correct to comment 0:

In conversation with Vlad, we agreed that the "send a WebChannel message after the user disconnects Sync" requirement was unnecessary because the user's session will have been destroyed and they'll be force out at the next request anyway.
Because it's taking me longer than anticipated to sort out the tests, at Vlad's suggestion I'm attaching the code changes here for early feedback.

I'm still working on sorting out the tests (see [1]), so a proper patch will still come in once they're finished.

[1] https://github.com/mozilla/gecko-dev/compare/master...philbooth:bug-1227527?expand=1
Attachment #8694239 - Flags: feedback?(vlad)
Attachment #8694239 - Flags: feedback?(rfkelly)
Attachment #8694239 - Flags: feedback?(markh)
Comment on attachment 8694239 [details] [diff] [review]
Basic implementation without tests

Review of attachment 8694239 [details] [diff] [review]:
-----------------------------------------------------------------

left some feedback, I think my main concern are "" (blank) device names.

::: browser/components/preferences/in-content/sync.js
@@ +175,4 @@
>      let textbox = document.getElementById("fxaSyncComputerName");
>      if (save) {
>        Weave.Service.clientsEngine.localName = textbox.value;
> +      fxAccounts.updateDevice();

in my testing the user is able to set a device name to "" (empty string), should we allow this?

::: services/fxaccounts/FxAccounts.jsm
@@ +608,4 @@
>  
>    signOut: function signOut(localOnly) {
>      let currentState = this.currentAccountState;
> +    let sessionToken, tokensToRevoke, deviceId;

nit: style change here, might need to have a separate `let` for each variable to keep `git / hg blame` clean here.

@@ +1344,5 @@
>      ).catch(err => Promise.reject(this._errorToErrorClass(err)));
>    },
> +
> +  registerDeviceIfNotRegistered() {
> +    return this.getSignedInUser().then(signedInUser => {

nit: do we need logging here that we are attempting to register a device. might be useful in the future to see what is going on. should we log during " initialize() " or here?

@@ +1359,5 @@
> +      }
> +    });
> +  },
> +
> +  _registerOrUpdateDevice(signedInUser) {

is there a way to clean the checks for "if (signedInUser)" in registerDeviceIfNotRegistered() and updateDevice() and just do them as part of "_registerOrUpdateDevice"? Thoughts?

@@ +1370,5 @@
> +        signedInUser.sessionToken, signedInUser.deviceId, deviceName)
> +        .catch(handleError);
> +    }
> +
> +    log.debug("registering new device details");

nit: IMHO the log will be a bit confusing here, we would see:
-: updating existing device details
-: registering new device details

unclear what exactly happened to me

@@ +1372,5 @@
> +    }
> +
> +    log.debug("registering new device details");
> +    return this.fxAccountsClient.deviceRegister(
> +      signedInUser.sessionToken, deviceName, "desktop")

nit: "desktop" make this a constant? Might be useful for code reuse in the future for Fennec, others...

@@ +1379,5 @@
> +  },
> +
> +  _getDeviceName() {
> +    try {
> +      return Services.prefs.getCharPref("services.sync.client.name");

make ""services.sync.client.name""a constant on top of the file?

@@ +1396,5 @@
> +      if (error.errno === ERRNO_DEVICE_SESSION_CONFLICT) {
> +        log.warn("device session conflict, attempting to untangle things");
> +        return this._recoverFromDeviceError(error, signedInUser.sessionToken);
> +      }
> +    }

nit: would it be useful to have an "else {" here and log that "_handleDeviceError" failed due to unknown / uncaught reason (and what that error was)?

::: services/fxaccounts/FxAccountsManager.jsm
@@ +301,2 @@
>          let client = this._getFxAccountsClient();
> +        return client.deviceDestroy(sessionToken, deviceId).then(

do we need to check if "deviceId" is defined here?
Attachment #8694239 - Flags: feedback?(vlad) → feedback+
Thanks for the feedback Vlad, I'll push fixes for most of your comments tomorrow. Just to respond to some of your points...

> in my testing the user is able to set a device name to "" (empty string), should we allow this?

Good catch, I'll fix this and also add test coverage.

> nit: do we need logging here that we are attempting to register a device. might be useful in the
> future to see what is going on. should we log during " initialize() " or here?

The logging is in `_registerOrUpdateDevice`.

> nit: IMHO the log will be a bit confusing here, we would see:
> -: updating existing device details
> -: registering new device details
> 
> unclear what exactly happened to me

The `if` condition returns so only one log statement is executed.

> nit: would it be useful to have an "else {" here and log that "_handleDeviceError" failed due to
> unknown / uncaught reason (and what that error was)?

This is already done in `_failDeviceRegistration`.
Oh, I missed one...

> is there a way to clean the checks for "if (signedInUser)" in registerDeviceIfNotRegistered()
> and updateDevice() and just do them as part of "_registerOrUpdateDevice"? Thoughts?

I'm happy to do this, but I'm not sure whether it makes the code clearer.

We'd need to pass a predicate to `_registerOrUpdateDevice`, letting it know whether to continue or abort. Something like this is what I'm thinking:

>  registerDeviceIfNotRegistered() {
>    return this._registerOrUpdateDevice(signedInUser => signedInUser && !signedInUser.deviceId);
>  },
> 
>  updateDevice() {
>    return this._registerOrUpdateDevice(signedInUser => !!signedInUser);
>  },
> 
>  _registerOrUpdateDevice(predicate) {
>    this.getSignedInUser().then(signedInUser => {
>      if (predicate(signedInUser)) {
>        // carry on...
>      }
>    });
>  },

What do you think?
Comment on attachment 8694239 [details] [diff] [review]
Basic implementation without tests

Review of attachment 8694239 [details] [diff] [review]:
-----------------------------------------------------------------

This is looking good to me, thanks :pb!  Some high-level thoughts:

* I really, really wish we had the explicit state management and transitions of Android here ("Married", "Separated", etc) as it would make it easier to see and to explain what's going on.  Maybe one day.  I find myself wondering if this extra device-registration step would introduce a new pre-married state ("Eloped?") but I guess that's beside the point...

* Is there a place we should be changing documentation in this code to add the details of the new flow?

::: browser/components/preferences/in-content/sync.js
@@ +175,4 @@
>      let textbox = document.getElementById("fxaSyncComputerName");
>      if (save) {
>        Weave.Service.clientsEngine.localName = textbox.value;
> +      fxAccounts.updateDevice();

Down in the implementation of this `localName` property in services/sync/modules/engines/clients.js, it interprets an empty string as "use the default device name".  We should make sure we're doing the same in order to remain consistent with existing sync behaviour.

::: services/fxaccounts/FxAccounts.jsm
@@ +496,4 @@
>        // the background? Already does for updateAccountData ;)
>        return currentAccountState.promiseInitialized.then(() => {
>          Services.telemetry.getHistogramById("FXA_CONFIGURED").add(1);
> +        this.updateDevice();

Should we chain onto the promise here rather than letting it resolve in the background?  It might help avoid getting into weird states where we're syncing but haven't finished the device registration yet.

@@ +662,3 @@
>      // For now we assume the service being logged out from is Sync - we might
>      // need to revisit this when this FxA code is used in a context that
>      // isn't Sync.

This comment probably needs to be re-worded a bit, since we're now destroying the entire device registration rather than logging out of a service.

@@ +664,5 @@
>      // isn't Sync.
> +    log.debug("destroying device");
> +    return this.fxAccountsClient.deviceDestroy(sessionToken, deviceId, {
> +      service: "sync"
> +    });

Will we ever go to disconnect from sync *without* having a deviceId, e.g. due to some race with an old client that's being updated to the new version of the code, but hasn't registered itself yet?

@@ +1351,5 @@
> +      }
> +    })
> +  },
> +
> +  updateDevice() {

tiny nit: a slightly more verbose name like "updateDeviceRegistration" may be clearer here; since this is code running on a device, it's not obvious whether updateDevice() would refer to something local or something remote.

@@ +1372,5 @@
> +    }
> +
> +    log.debug("registering new device details");
> +    return this.fxAccountsClient.deviceRegister(
> +      signedInUser.sessionToken, deviceName, "desktop")

Indeed, there may be an existing namespace of constants from the sync clients engine that we should be using here.

@@ +1383,5 @@
> +      return Services.prefs.getCharPref("services.sync.client.name");
> +    } catch (ignore) {
> +      return Utils.getDefaultDeviceName();
> +    }
> +  },

I'm not a huge fan of the way this duplicates the functionality of the sync clients engine `localName` property (and is subtly different to it in the case of an empty-string pref).  Since we want to move towards FxA holding the canonical device details, perhaps we can refactor things so that the clients engine re-uses our logic?

That can easily be a follow-up refactor though.

@@ +1399,5 @@
> +      }
> +    }
> +
> +    return this._failDeviceRegistration(error);
> +  },

Since this `_handleDeviceError` has some branching logic in it that's conceptually part of the flow of the `_registerOrUpdateDevice` logic, I wonder if it would be clearer defined as an inline helper function inside that method.  Or failing that, immediately adjacent to that method so it's easier to see how the flow continues.

@@ +1416,5 @@
> +        }
> +        return this._failDeviceRegistration(error);
> +      })
> +      .catch(ignore => this._failDeviceRegistration(error));
> +  },

The recovery logic here is subtle enough that it deserves a comment explaining what's going on and why.

::: services/fxaccounts/FxAccountsClient.jsm
@@ +457,5 @@
> +   *                              represents the current device
> +   *             name: Device name
> +   *             type: Device type (mobile|desktop)
> +   *             pushCallback: Push notifications endpoint
> +   *             pushPublicKey: Push notifications public encryption key

Since we haven't figured out what we're doing with these push things yet, I suggest not documenting them as being returned.  We may even consider not exposing them directly to other clients at all.

::: services/fxaccounts/FxAccountsWebChannel.jsm
@@ +261,5 @@
>    logout(uid) {
>      return fxAccounts.getSignedInUser().then(userData => {
>        if (userData.uid === uid) {
> +        // true argument is `localOnly`, because server-side stuff
> +        // has already been taken care of by the content server

lolz, this is why we have named default arguments; not this patch's fault tho :-)
Attachment #8694239 - Flags: feedback?(rfkelly) → feedback+
Mark, for our situational awareness, it would be great if you could also say a few words about the timelines we're facing here, as IIUC we're coming up to the end of the 45 cycle.  Should we be pushing hard to get a fully-reviewable version of this up by a certain time to make 45, or are we pretty much looking at 46 here given all the other things in flight at the moment?

(no big deal either way, it's just nice to know where things stand)
Flags: needinfo?(markh)
Comment on attachment 8694239 [details] [diff] [review]
Basic implementation without tests

Review of attachment 8694239 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry, I didn't get far through this and may not get back to it today, so here's the little I have. I'll get back to ASAP.

::: browser/components/preferences/in-content/sync.js
@@ +179,1 @@
>      }

The Sync prefs behaviour is a little wrong here too - I just opened bug 1229625 - but I think this *will* do the right thing with a small change to FxAccounts.jsm I mention later.

::: services/fxaccounts/FxAccounts.jsm
@@ +346,4 @@
>    initialize() {
>      this.currentTimer = null;
>      this.currentAccountState = this.newAccountState();
> +    return this.registerDeviceIfNotRegistered();

this is now returning a promise without updating the callers - and the caller is the constructor, so it really *can't* handle it.

I guess it does depend on what bad things might happen if device registration isn't complete (or fails), causing the first call to .getSignedInUser() to continue returning an object without a device ID. If nothing, then maybe loadAndPoll is a better place for this one-off migration - it already has the account data, so registerDeviceIfNotRegistered() could then be inlined. This still wouldn't let you block on registration completing though - if that was necessary probably need to do it as part of getSignedInUser() and possibly cause it to fail - but that would be much scarier.

@@ +496,4 @@
>        // the background? Already does for updateAccountData ;)
>        return currentAccountState.promiseInitialized.then(() => {
>          Services.telemetry.getHistogramById("FXA_CONFIGURED").add(1);
> +        this.updateDevice();

agree we should chain this promise.

@@ +1367,5 @@
> +    if (signedInUser.deviceId) {
> +      log.debug("updating existing device details");
> +      return this.fxAccountsClient.deviceUpdate(
> +        signedInUser.sessionToken, signedInUser.deviceId, deviceName)
> +        .catch(handleError);

as below, I'd probably rather see this as .catch(error => return this._handleDeviceError(error)) but I'm not too bothered.

@@ +1379,5 @@
> +  },
> +
> +  _getDeviceName() {
> +    try {
> +      return Services.prefs.getCharPref("services.sync.client.name");

these semantics aren't identical to sync's - the pref existing with an empty string will return the empty string here, but Sync will use the default name in that case. I think it's also worth a comment indicating that this must be kept in sync with |get localName()| in clients.js, and maybe a comment in clients.js pointing back here.

(Given we are already importing Sync's utils module, another alternative that might be even better is to add a new method to that module, say, Utils.getDeviceName() that has Sync's logic, then change both clients.js and this code to use it.)

@@ +1403,5 @@
> +  },
> +
> +  _forceUpdateDevice(deviceId) {
> +    return this.currentAccountState.updateUserAccountData({ deviceId })
> +      .then(this.updateDevice.bind(this));

I tend to hate bind :) I think it looks better as |.then(() => this.updateDevice())| but I'm not too bothered.
(In reply to Ryan Kelly [:rfkelly] from comment #16)
> Mark, for our situational awareness, it would be great if you could also say
> a few words about the timelines we're facing here, as IIUC we're coming up
> to the end of the 45 cycle.  Should we be pushing hard to get a
> fully-reviewable version of this up by a certain time to make 45, or are we
> pretty much looking at 46 here given all the other things in flight at the
> moment?

This patch doesn't really need other teams apart from us, so I see no reason not to still aim for 45, especially if we find time at Orlando to work on it. But we'd probably need to get it landed *at* Orlando, so I'm a little skeptical we will find the time to complete it there, but you never know :)
Flags: needinfo?(markh)
Attaching the fixes (I hope) for feedback comments. Still working through the tests (sorry).
Attachment #8694239 - Attachment is obsolete: true
Attachment #8694239 - Flags: feedback?(markh)
Comment on attachment 8694842 [details] [diff] [review]
Basic implementation without tests, Mk. II

Review of attachment 8694842 [details] [diff] [review]:
-----------------------------------------------------------------

This is looking better every time :) I'm a little concerned about the error expectations, which I think should be clarified in comments and ideally also in tests.

::: services/fxaccounts/FxAccounts.jsm
@@ +672,5 @@
> +    const options = { service: "sync" };
> +
> +    if (deviceId) {
> +      log.debug("destroying device and session");
> +      return this.fxAccountsClient.deviceDestroy(sessionToken, deviceId, options);

this looks like it will skip the signOut if there is a device? We also need to handle an error destroying the device and ensure we still attempt the signOut.

@@ +1388,5 @@
> +      .then(d => this.currentAccountState.updateUserAccountData({ deviceId: d.id }))
> +      .catch(error => this._handleDeviceError(error));
> +  },
> +
> +  _handleDeviceError(error) {

I think we need a final catch where this is called, incase this fails. I wonder if _registerOrUpdateDevice could be written something like:

 _registerOrUpdateDevice(signedInUser) {
    const deviceName = Utils.getDeviceName();
    return Promise.resolve().then(() => {
      if (signedInUser.deviceId) {
        log.debug("updating existing device details");
        return this.fxAccountsClient.deviceUpdate(...);
      }
      log.debug("registering new device details");
      return this.fxAccountsClient.deviceRegister(
    }.catch(error => {
      log.info("Device registration/update failed - attempting to recover", error);
      return this._handleDeviceError(error);
    }.catch(error => {
      log.error("Failed to recover from device registration error", error);
    });

or similar?

@@ +1430,5 @@
> +  },
> +
> +  _failDeviceRegistration(error) {
> +    log.error("device registration failed", error);
> +    return Promise.reject(this._errorToErrorClass(error));

We need to think through where this kind of rejection will bubble to - ie, so we intend to have it bubble up to setSignedInUser, thereby preventing the user logging in at all?

@@ +1442,4 @@
>  
>    // XXX Bug 947061 - We need a strategy for resuming email verification after
>    // browser restart
> +  a.loadAndPoll().then(() => a.registerDeviceIfNotRegistered());

I think we should have loadAndPoll call registerDeviceIfNotRegistered and have it pass the account data.

So it looks something like:

  loadAndPoll: function() {
    let currentState = this.currentAccountState;
    return currentState.getUserAccountData()
      .then(data => {
        if (data) {
          Services.telemetry.getHistogramById("FXA_CONFIGURED").add(1);
          if (!this.isUserEmailVerified(data)) {
            this.pollEmailStatus(currentState, data.sessionToken, "start");
          }
        }
        return data;
      })
     .then(data => {
        if (data) {
          return this.registerDeviceIfNotRegistered(data);
        }
      });
  },

but - then registerDeviceIfNotRegistered becomes overhead with no gain, so that second .then() could read:
   .then(data => {
     if (data && !data.deviceId) {
      // comment about how this is a one-off "migration"
      return this._registerOrUpdateDevice(signedInUser);
     }
   }); 

and we've killed a new public method \o/

(I think we should also add a final .catch that logs an error - IIUC, one would be expected in a disconnected state)

::: services/fxaccounts/FxAccountsClient.jsm
@@ +432,5 @@
> +  deviceDestroy(sessionTokenHex, id, options={}) {
> +    let path = "/account/device/destroy";
> +
> +    if (options.service) {
> +      path += "?service=" + options.service;

I wonder if we should URLEncode this? OTOH, I bet service names are simple alpha strings?

::: services/fxaccounts/FxAccountsManager.jsm
@@ +301,4 @@
>          let client = this._getFxAccountsClient();
> +
> +        if (deviceId) {
> +          return client.deviceDestroy(sessionToken, deviceId);

it looks like we will skip client.signOut() if there is a device id? I'm also unclear on how this code will get a device id in the first place. We are going to need :ferjm to have a look at this as he basically owns the b2g impl.
Attachment #8694842 - Flags: feedback+
Hey Fernando,
  In this bug we are adding support for an FxA device registration api which ties into Sync. The current WIP patch touches FxAccountsManager.jsm, but it's not clear to me yet that this will be enough, or that we need to touch this at all in this bug, but I thought you should be aware of this effort regardless.
>   loadAndPoll: function() {
>    let currentState = this.currentAccountState;
>    return currentState.getUserAccountData()
>      .then(data => {
>        if (data) {
>          Services.telemetry.getHistogramById("FXA_CONFIGURED").add(1);
>          if (!this.isUserEmailVerified(data)) {
>            this.pollEmailStatus(currentState, data.sessionToken, "start");
>          }
>        }
>        return data;
>      })
>     .then(data => {
>        if (data) {
>          return this.registerDeviceIfNotRegistered(data);
>        }
>      });
>  },

Something to keep in mind for the future, is that at some point we intend to include a push notification endpoint in the device registration endpoint.  We'd like to use this endpoint to eliminate the polling for account verification in favour of push notifications.  But that means the device registration would have to be done *before* the polling step.
(In reply to Ryan Kelly [:rfkelly] from comment #22)
> Something to keep in mind for the future, is that at some point we intend to
> include a push notification endpoint in the device registration endpoint. 
> We'd like to use this endpoint to eliminate the polling for account
> verification in favour of push notifications.  But that means the device
> registration would have to be done *before* the polling step.

Yeah, good point. Doing it before sounds fine now too.
> I'm a little concerned about the error expectations

Indeed, and this is one of the difficulties of doing this with a separate API call.  If we get a 503 (or even a 404 on older servers) do we still proceed with connecting to sync, and just try again later?
Comment on attachment 8694842 [details] [diff] [review]
Basic implementation without tests, Mk. II

Review of attachment 8694842 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks again for the feedback :markh, will make the suggested changes and definitely (he says) finish the tests today too.

::: services/fxaccounts/FxAccounts.jsm
@@ +672,5 @@
> +    const options = { service: "sync" };
> +
> +    if (deviceId) {
> +      log.debug("destroying device and session");
> +      return this.fxAccountsClient.deviceDestroy(sessionToken, deviceId, options);

Actually, the `deviceDestroy` endpoint also destroys the session token, which is all that `signOut` does. I will rename `deviceDestroy` to make that duality clearer.

@@ +1430,5 @@
> +  },
> +
> +  _failDeviceRegistration(error) {
> +    log.error("device registration failed", error);
> +    return Promise.reject(this._errorToErrorClass(error));

Yeah, sorry, I did think about this but wasn't sure what the correct thing to do was. Will change to make sure it's non-fatal; then are we happy that the next time the browser is opened or the device name is changed, registration will be retried?

::: services/fxaccounts/FxAccountsClient.jsm
@@ +432,5 @@
> +  deviceDestroy(sessionTokenHex, id, options={}) {
> +    let path = "/account/device/destroy";
> +
> +    if (options.service) {
> +      path += "?service=" + options.service;

It's only called with the literal "sync" at the moment but I agree it makes sense to do it now rather than forget later.

::: services/fxaccounts/FxAccountsManager.jsm
@@ +301,4 @@
>          let client = this._getFxAccountsClient();
> +
> +        if (deviceId) {
> +          return client.deviceDestroy(sessionToken, deviceId);

As per FxAccounts.jsm, `deviceDestroy` does everything that `signOut` does, I'll rename the method.
> In this bug we are adding support for an FxA device registration api which ties into Sync.
> The current WIP patch touches FxAccountsManager.jsm, but it's not clear to me yet that this
> will be enough, or that we need to touch this at all in this bug, but I thought you should
> be aware of this effort regardless.

Very happy to take that code out if it's not needed, it was another thing I was unsure about anyway.
Just updating the patch with fixes for the most recent feedback before I continue with sorting out the tests.
Attachment #8694842 - Attachment is obsolete: true
Whoops. :-/
Attachment #8695252 - Attachment is obsolete: true
Comment on attachment 8695257 [details] [diff] [review]
Basic implementation without tests, Mk. IV

Review of attachment 8695257 [details] [diff] [review]:
-----------------------------------------------------------------

Marking some lines to call them out for feedback later...

::: services/fxaccounts/FxAccounts.jsm
@@ +671,5 @@
> +    const options = { service: "sync" };
> +
> +    if (deviceId) {
> +      log.debug("destroying device and session");
> +      return this.fxAccountsClient.signOutAndDestroyDevice(sessionToken, deviceId, options)

Hopefully the new method name `signOutAndDestroyDevice` makes it clearer that this condition doesn't fail to sign out.

@@ +924,2 @@
>     */
>    loadAndPoll: function() {

`loadAndPoll` seems like the wrong name for this function now. I tried to think of a better one but everything I came up with was minging (the *least* ugly I managed was `loadAndRegisterDeviceAndPoll`). Suggestions welcome. :)

@@ +925,5 @@
>    loadAndPoll: function() {
>      let currentState = this.currentAccountState;
>      return currentState.getUserAccountData()
>        .then(data => {
> +        if (data && (data.isDeviceStale || !data.deviceId)) {

Realised I needed to introduce some extra state on the account to signal that an earlier device registration request failed. Calling it out here for feedback on the approach and the flag name.
Attachment #8695257 - Flags: feedback?(markh)
Just in case anyone is about to look at the latest patch, save yourself the effort. There is an incoming patch that will soon fix a number problems (and include tests).
This patch contains tests, but there are still a couple of gaps in the coverage.

My brain is pretty frazzled now and I couldn't figure them out so I'm going to come back to it with a clear head tomorrow. I wanted to get the patch in front of the Aussies before your weekend begins though, hence I'm posting it now.

The gaps I'm concerned about are:

* There are no tests that `loadAndPoll` invokes device registration correctly from the two different conditions which are supposed to invoke it (`isDeviceStale` and `!deviceId`). I tried to write some tests to cover this, but the other part of `loadAndPoll` throws when I try to invoke it from the tests.

* There are no tests that the browser `change device name` code triggers device registration correctly.

And I've probably missed some others too.

Fwiw, I kicked off a try build here:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=d1656e55f607
Attachment #8695257 - Attachment is obsolete: true
Attachment #8695257 - Flags: feedback?(markh)
Attachment #8695541 - Flags: feedback?(markh)
Comment on attachment 8695541 [details] [diff] [review]
Basic implementation Mk. V, now with tests!

Review of attachment 8695541 [details] [diff] [review]:
-----------------------------------------------------------------

Looking good, but I think we need to untangle some of the error handling in FxAccounts.jsm

I also only scanned the tests, but see no issues there. Note other tests, notably much of services/sync and browser/base/content/aboutaccounts/aboutaccounts.js, 
browser/base/content/test/general/browser_aboutAccounts.js, browser/base/content/test/general/browser_fxaccounts.js and browser/components/customizableui/test/browser_946320_tabs_from_other_computers.js all do funky stuff with FxA, should you should manually check they pass, and also make a try run sooner rather than later so we don't get nasty surprises in the home-straight.

::: browser/components/preferences/in-content/sync.js
@@ +175,4 @@
>      let textbox = document.getElementById("fxaSyncComputerName");
>      if (save) {
>        Weave.Service.clientsEngine.localName = textbox.value;
> +      fxAccounts.updateDeviceRegistration();

This function returns a promise, need to add a .catch here.

Another idea is to have the setter in the clients engine call this - that way if something else was added (or, eg, an addon) called the setter it would always be up-to-date - thoughts?

::: services/fxaccounts/FxAccounts.jsm
@@ +1364,5 @@
>        }
>      ).catch(err => Promise.reject(this._errorToErrorClass(err)));
>    },
> +
> +  updateDeviceRegistration() {

As this is a public method it should be (briefly) documented via a comment.

@@ +1392,5 @@
> +        signedInUser.sessionToken, deviceName, DEVICE_TYPE_DESKTOP);
> +    })
> +    .then(device => this.currentAccountState.updateUserAccountData({
> +      deviceId: device.id,
> +      isDeviceStale: false

I think we should set this to null here - that effectively removes it from storage (ie, that flag should only exist when it is true IMO)

@@ +1405,5 @@
> +      log.warn("swallowing device registration error", error);
> +      if (signedInUser.deviceId) {
> +        return this.currentAccountState.updateUserAccountData({
> +          isDeviceStale: true
> +        }).catch(() => {});

we might as well log here too.

@@ +1424,5 @@
> +      }
> +    }
> +
> +    return this._failDeviceRegistration(error);
> +  },

I think we should simplify this a little - _failDeviceRegistration is *always* going to return a rejected promise. That means _handleDeviceError() will return a rejected promise in that case - so a few lines up we seem likely to hit the "swalling device registration error - which may be the rejection we explicitly raised.

Given we are explicitly swallowing these exceptions, I wonder why we need _failDeviceRegistration to return a rejected promise - ie, maybe it could be called something like _handleDeviceRegistrationFailure or similar and have *it* be the canonical swallower of exceptions, always returning a resolved promise?

@@ +1445,5 @@
> +        const matchingDevices = devices.filter(device => device.isCurrentDevice);
> +        if (matchingDevices.length === 1) {
> +          return this._forceUpdateDeviceRegistration(matchingDevices[0].id);
> +        }
> +        return this._failDeviceRegistration(error);

eg, this looks strange - if we call this._failDeviceRegistrion() it seems we are guaranteed to then enter the following .catch

::: services/fxaccounts/FxAccountsClient.jsm
@@ +413,5 @@
> +    return this._request(path, "POST", creds, body);
> +  },
> +
> +  /**
> +   * Delete a device and its associated session token

This comment should also reflect that it signs the user out

::: services/fxaccounts/tests/xpcshell/test_loginmgr_storage.js
@@ +55,2 @@
>  add_task(function test_simple() {
> +  let fxa = new createFxAccounts();

I was expecting the "new" to be in createFxAccounts and all these callers to have it dropped?
Attachment #8695541 - Flags: feedback?(markh) → feedback+
(In reply to Phil Booth from comment #31)
> * There are no tests that `loadAndPoll` invokes device registration
> correctly from the two different conditions which are supposed to invoke it
> (`isDeviceStale` and `!deviceId`). I tried to write some tests to cover
> this, but the other part of `loadAndPoll` throws when I try to invoke it
> from the tests.

You might find this works OK if the user you use has the verified flag.

> * There are no tests that the browser `change device name` code triggers
> device registration correctly.

hmm - preferences have almost zero coverage :( I wonder if my idea of moving it to the getter would make that easier or harder :)

> https://treeherder.mozilla.org/#/jobs?repo=try&revision=d1656e55f607

oops - I missed that before - you didn't specify xpcshell tests, which I expect to fail on the sync tests. And I think if you manually run and fix the test names I mentioned above you should be much closer.
Attaching the latest changes as a patch for :stomlinson's benefit, in case he wants to look at anything while I'm flying to Orlando. I haven't finished fixing all of the test failures yet though, I plan to mop those up during the flight.
Attachment #8695541 - Attachment is obsolete: true
(In reply to Phil Booth from comment #9)
> Correct to comment 0:
> 
> In conversation with Vlad, we agreed that the "send a WebChannel message
> after the user disconnects Sync" requirement was unnecessary because the
> user's session will have been destroyed and they'll be force out at the next
> request anyway.

The one nice thing about an instant notification is that users currently sitting at the `settings` page will be notified immediately of a signout instead of waiting until the next time FxA is loaded. With Push notifications maybe a WebChannel message isn't needed and Push can be used as the universal message bus.
Comment on attachment 8696377 [details] [diff] [review]
Basic implementation Mk. VI, still some test fixes outstanding

Review of attachment 8696377 [details] [diff] [review]:
-----------------------------------------------------------------

This is in a *far* better state than when I left Phil, nice job on getting this patch into shape!

::: services/fxaccounts/FxAccounts.jsm
@@ +1386,5 @@
> +      }
> +
> +      log.debug("registering new device details");
> +      return this.fxAccountsClient.registerDevice(
> +        signedInUser.sessionToken, deviceName, DEVICE_TYPE_DESKTOP);

Hard coding the device type seems brittle and an assumption that could be invalidated by future code use. It seems like the device type should come from Weave instead of being hard coded here. At a minimum that'll allow this code to be shared with FxOS/Fennec/iOS without an update here. Weave already has a localType [1], can it be surfaced here?

[1] - https://dxr.mozilla.org/mozilla-central/rev/59bc3c7a83de7ffb611203912a7da6ad84535a5a/services/sync/modules/engines/clients.js#130

@@ +1403,5 @@
> +  _handleDeviceError(error, sessionToken) {
> +    if (error.code === 400) {
> +      if (error.errno === ERRNO_UNKNOWN_DEVICE) {
> +        log.warn("unknown device id, attempting to register new device instead");
> +        return this._forceUpdateDeviceRegistration(null);

Since these two error modes end up making additional network requests, could the error handlers themselves error and end up in an infinite loop where we inadvertently DDOS ourselves?

@@ +1432,5 @@
> +    //   4. If we don't find a match, fail with the original error.
> +    return this.fxAccountsClient.getDeviceList(sessionToken)
> +      .then(devices => {
> +        const matchingDevices = devices.filter(device => device.isCurrentDevice);
> +        if (matchingDevices.length === 1) {

If matchingDevices.length > 1, that itself seems like an error we may want to know about. At a minimum, sounds like the backend would be acting unexpectedly.

@@ +1443,5 @@
> +  _logErrorAndSetStaleDeviceFlag(error) {
> +    // Device registration should never cause other operations to fail.
> +    // If we're unable to recover to a sane state, log the error and set
> +    // a flag on the account data signalling that device registration is
> +    // still required.

Can you continue with this comment about when the next device registration attempt will occur?

::: services/sync/modules/engines/clients.js
@@ +15,4 @@
>  Cu.import("resource://services-sync/record.js");
>  Cu.import("resource://services-sync/util.js");
>  
> +XPCOMUtils.defineLazyModuleGetter(this, "fxAccounts",

Does this introduce a circular dependency where FxAccounts is dependent on Sync modules which are dependent on FxAccounts modules?

@@ +123,4 @@
>    },
>    set localName(value) {
>      Svc.Prefs.set("client.name", value);
> +    fxAccounts.updateDeviceRegistration();

How come this calls fxAccounts instead of fxAccounts listening for a pref change notification? I vaguely recall some oddness, but forget.
Comment on attachment 8696377 [details] [diff] [review]
Basic implementation Mk. VI, still some test fixes outstanding

Review of attachment 8696377 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the feedback Shane, all good points. I'll update the patch today.

::: services/fxaccounts/FxAccounts.jsm
@@ +1403,5 @@
> +  _handleDeviceError(error, sessionToken) {
> +    if (error.code === 400) {
> +      if (error.errno === ERRNO_UNKNOWN_DEVICE) {
> +        log.warn("unknown device id, attempting to register new device instead");
> +        return this._forceUpdateDeviceRegistration(null);

Only if the auth server or the db server are completely broken. I did consider coding defensively for those conditions but then I figured if one of those is broken we have bigger problems to worry about. I'm happy to put the safeguards in though, will update the patch.

::: services/sync/modules/engines/clients.js
@@ +123,4 @@
>    },
>    set localName(value) {
>      Svc.Prefs.set("client.name", value);
> +    fxAccounts.updateDeviceRegistration();

I haven't tried that approach at all, so maybe the oddness you recall is from before when I got involved in the patch. Anyway, I'll try it out.
Attachment #8697631 - Flags: review?(markh)
Attachment #8696377 - Attachment is obsolete: true
Comment on attachment 8697631 [details] [diff] [review]
Basic implementation Mk. VII, tests pass

Review of attachment 8697631 [details] [diff] [review]:
-----------------------------------------------------------------

Awesome!

::: services/fxaccounts/FxAccounts.jsm
@@ +557,5 @@
> +          }
> +
> +          return data.deviceId;
> +        }
> +      });

I think we should add an explicit "return null" (or undefined) just to make it clear and to stop eslint complaining in the future.
Attachment #8697631 - Flags: review?(markh) → review+
Comment on attachment 8697710 [details] [diff] [review]
Basic implementation Mk. VII, with return null

Thanks Phil,
  FYI, an r+ with a request to make simple changes generally means the updated patch doesn't need another review - just upload the new version and set r+ on the patch yourself!
Attachment #8697710 - Flags: review?(markh) → review+
(In reply to Mark Hammond [:markh] from comment #42)
> just upload the new version and
> set r+ on the patch yourself!

Or if you have level 3, just push the new version without uploading it :) I'm not sure if you do, so I'm setting checkin-needed
Keywords: checkin-needed
Backed out in https://hg.mozilla.org/integration/fx-team/rev/d5fd77678f2c because either it or bug 1191230 caused the build bustage in https://treeherder.mozilla.org/logviewer.html#?job_id=6186947&repo=fx-team which only seems to have affected Windows builds and B2G emulator builds, and I can't figure out which of these bugs could have done it.
This also appears to have caused mulet gij test bustage like https://treeherder.mozilla.org/logviewer.html#?job_id=6191315&repo=fx-team
Flags: needinfo?(pbooth)
Arrgh, really sorry for breaking stuff, it was indeed this patch that caused the problem. There is a pref (identity.fxaccounts.skipDeviceRegistration) that can be set to stop the tests from failing, I'll make sure I add it everywhere that needs it in the new patch.
Flags: needinfo?(pbooth)
This fixes the failures mentioned above and some more that were uncovered when I fixed those. I'm not totally confident that it's ready for check-in yet though, so I'll carry on with it tomorrow. It's tricky to be sure which try failures are significant (in terms of this patch) and some of the results appear to be inconsistent between runs, which further confuses things.
Attachment #8697710 - Attachment is obsolete: true
Attachment #8699178 - Attachment is obsolete: true
Hey :markh, needinfo-ing you as I have some questions about the correct fix for one of the B2G failures.

This particular problem was caused by the import of services-sync/util.js in FxAccounts.jsm throwing from the B2G ICS emulator xpcshell tests. There are four ways I can see to "fix" it, of which two I've been able to make work, one I haven't got working yet and one more that I don't know how to do at all (or if it's possible). The two that are working seem like sub-optimal approaches, so I wanted to run everything by you to make sure I'm focusing on the right thing.

1st sub-optimal solution, see [1]: Catch and log the import error, then skip device registration from the B2G tests so that they can complete okay. Sub-optimal because it could hide a genuine problem at run-time.

2nd sub-optimal solution, see [2]: Special-case the import behind the skip preference, because the imported methods are only called during device registration. Doesn't have the disadvantage of option 1, but still sub-optimal because having conditions in production code that exist only to make tests pass is hackish.

3rd solution, not working yet, see [3]: Extract the three methods in question (getDefaultDeviceName, getDeviceName, getDeviceType) from util.js to some other module that is available in the failing tests and import that instead. For the sake of trying it out, I did this in an FxAccountsDeviceInfo.jsm module but I guess it should really be somewhere else that is not related to FxA at all, maybe in services/common? My current effort is failing the sync tests with some error in stringbundle.js but I'm going to come back to it tomorrow.

4th solution, don't know if possible: Just make the failing import work correctly in those tests. Do you know if that's doable?

My thinking was that option 3 is the correct thing to do. What do you reckon?

[1] https://github.com/mozilla/gecko-dev/compare/master...philbooth:bug-1227527-ignore?expand=1
[2] https://github.com/mozilla/gecko-dev/compare/master...philbooth:bug-1227527-conditional?expand=1
[3] https://github.com/mozilla/gecko-dev/compare/master...philbooth:bug-1227527-refactor?expand=1
Flags: needinfo?(markh)
How about making it a lazy import, then ensuring the symbols references from that import are all guarded by that pref (which it appears is already the case).

XPCOMUtils.defineLazyModuleGetter(this, "Utils", "resource://services-sync/util.js");
Flags: needinfo?(markh)
Hey :markh, there's not a huge number of changes here but I'd still like an r? before setting this to checkin-needed, just to make sure I'm not doing anything wrong or stupid in the tests.

Specifically, I have extracted the device registration tests to a separate module, test_accounts_device_registration.js, which I am skipping on B2G in xpcshell.ini. Is that cool?

Apart from that, the only differences since your last review are the lazy import you helped me with and setting the skipDeviceRegistration pref in the B2G tests.

Try server results:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=31736cf0e4e8
Attachment #8700736 - Flags: review?(markh)
Attachment #8700736 - Flags: review?(markh) → review+
Some debug leak failures in that push - I've done a few retries and if we can't get any to go green, it must be this patch. Removing checkin-needed while we want and see how they go...
Keywords: checkin-needed
They all got 1 greeen (out of 9 in the worst case!) but I see no reason to believe this made those existing problems worse (and I hope the sheriffs agree ;)
Keywords: checkin-needed
Whiteboard: [fxa] → [fxa-waffle]
Apologies again everyone, looking into it now.
Flags: needinfo?(pbooth)
Hey :yifan and :rpappalardo.

I'm needinfoing you as the respective authors of `tv_apps/smart-system/.../fxa_screen_flow_test.js` and `apps/system/.../fxa_screen_flow_test.js` in the Gaia repo.

The FxA device registration patch attached to this issue breaks the tv_apps version of those tests [1]. I have a tentative PR [2] that contains a speculative "fix" and some questions about how to run the test. Would you be able to have a look at the PR and reply to it if you're able to answer my questions?

Thanks.

[1] https://treeherder.mozilla.org/logviewer.html#?job_id=6354136&repo=fx-team
[2] https://github.com/mozilla-b2g/gaia/pull/33715
Flags: needinfo?(yliao)
Flags: needinfo?(rpappalardo)
Thank you! Updated the comment on Github.
Flags: needinfo?(yliao)
Flags: needinfo?(rpappalardo) → needinfo?
Hey :yifan, sorry to bug you again, I just wanted to check whether you saw [1]?

I've built the B2G desktop client locally with my patch applied but I'm getting a different test failure to treeherder when I run the fxa_screen_flow_test.

It's complaining about the COPPA select; I'm not sure if you guys re-use the UI from the content server, but we changed that to an input element in [2]. Maybe there is a newer version of the test in a different branch, which I should be running instead?

Or maybe I just built B2G wrong? I followed the instructions from [3] and [4].

Any ideas?

[1] https://github.com/mozilla-b2g/gaia/pull/33715
[2] https://github.com/mozilla/fxa-content-server/issues/3137
[3] https://developer.mozilla.org/en-US/docs/Mozilla/Firefox_OS/Preparing_for_your_first_B2G_build
[4] https://developer.mozilla.org/en-US/docs/Mozilla/Firefox_OS/Building_the_Firefox_OS_simulator
Flags: needinfo? → needinfo?(yliao)
Thank you for the information! I think you're right on the cause of the test error which is related to COPPA select. You can simply skip the failure test safely because creating a new user is not a feature implemented on TV.
Flags: needinfo?(yliao)
Depends on: 1064305
Hey :markh.

I was chatting to ferjm some more in IRC earlier and it seems pretty straightforward to get device registration working on B2G (I mean really working, not just the tests passing).

It just requires setting the `client.name` and `client.type` prefs over in the Gaia repo and then refactoring the getDeviceName/Type logic over here so that it can read those prefs without needing to import services-sync/util.js.

The attached tentative patch (unbuilt, untested, definitely contains mistakes) is roughly what I had in mind. How would you feel about us doing this, or something similar?
Attachment #8706665 - Flags: feedback?(markh)
Comment on attachment 8706665 [details] [diff] [review]
0002-Bug-1227527-Enable-FxA-device-registration-on-FxOS.patch

Review of attachment 8706665 [details] [diff] [review]:
-----------------------------------------------------------------

I think we are going to need an explicit check for b2g (probably by importing AppConstants.jsm and checking AppConstants.MOZ_B2G) so we avoid any references to Utils and to use different preference names there. Also, I think we are going to need some story for the default device name there. I'd recommend we get this landed without this part of the patch (to ensure desktop has it in 46) and have a new bug to get it in for b2g (maybe 46 but more likely 47). Part of that patch may be to further refactor Sync's code (eg, default device name) so it can be shared with b2g like you are doing here.

::: services/fxaccounts/FxAccounts.jsm
@@ +570,5 @@
>        });
>    },
>  
> +  getDeviceName() {
> +    const deviceName = Services.Prefs.get("client.name", "");

I think there's a little confusion here re preferences - Services.Prefs doesn't exist - it's Services.prefs - which is the *root* of the preferences namespace - so in the line above, it really is a preference "client.name", as opposed to the "services.sync.client.name" string.

(It's confusing as Sync itself uses Preferences.jsm, which is a slightly different interface and can be instantiated to use any root - and doubly confusing because it typically uses "Svc.Prefs" - which is totally different than "Services.prefs")

But even if we fixed that, it's not clear to me that the preference name "services.sync.client.name" makes sense on b2g.

@@ +572,5 @@
>  
> +  getDeviceName() {
> +    const deviceName = Services.Prefs.get("client.name", "");
> +
> +    if (deviceName === "" && Utils) {

I think you will find this reference throws on b2g rather than just being null.

@@ +576,5 @@
> +    if (deviceName === "" && Utils) {
> +      return Utils.getDefaultDeviceName();
> +    }
> +
> +    return deviceName;

is returning an empty device name really the right thing on b2g? ie, who would set this pref there?
Attachment #8706665 - Flags: feedback?(markh)
All good points :markh, thanks for the feedback. I'll revert to plan A and set the skip pref in the B2G.
This is the same patch as got the r+, just rebased to fix a conflict with recent changes.

I've not set [checkin-needed] because the related Gaia change [1] needs approval.

Also, I don't know what the correct process is after that. Does autolander make everything magically work together because I used the same bug number over there? I can see that it's linked to that PR here.

Or do we need to wait for that to be merged into some specific branch before setting [checkin-needed] to land this part of the patch?

Or should I have used a fresh bug number for the Gaia PR so that the two could proceed independently?
Attachment #8700736 - Attachment is obsolete: true
Attachment #8706665 - Attachment is obsolete: true
Forgot the link to the PR in my last comment:

[1] https://github.com/mozilla-b2g/gaia/pull/33839
(In reply to Mark Hammond [:markh] from comment #62)
> Comment on attachment 8706665 [details] [diff] [review]
> 0002-Bug-1227527-Enable-FxA-device-registration-on-FxOS.patch
> 
> Review of attachment 8706665 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think we are going to need an explicit check for b2g (probably by
> importing AppConstants.jsm and checking AppConstants.MOZ_B2G) so we avoid
> any references to Utils and to use different preference names there. Also, I
> think we are going to need some story for the default device name there. I'd
> recommend we get this landed without this part of the patch (to ensure
> desktop has it in 46) and have a new bug to get it in for b2g (maybe 46 but
> more likely 47). Part of that patch may be to further refactor Sync's code
> (eg, default device name) so it can be shared with b2g like you are doing
> here.
>

I filed bug 1238895 for the B2G part.


(In reply to Phil Booth from comment #65)
> Created attachment 8706835 [details] [diff] [review]
> 0001-Bug-1227527-Implement-basic-FxA-device-registration.patch
> 
> Also, I don't know what the correct process is after that. Does autolander
> make everything magically work together because I used the same bug number
> over there? I can see that it's linked to that PR here.

Autolander [1] automatically links the PR to the bugzilla issue if the commit message contains "Bug XXXXXX". Once the PR is linked to the bug, you need to request review just like for manually attached patches. Once the PR is r+ you need someone with merge privileges to merge the PR in Gaia master (setting the checkin-needed flag would work).

In any case, you don't need a Gaia PR for this :). Since this pref is going to have the same value in all B2G devices, you can just add the pref to [1].

[1] https://developer.mozilla.org/en-US/Firefox_OS/Developing_Gaia/Submitting_a_Gaia_patch#Easy_patch_submission_with_Autolander
[2] https://mxr.mozilla.org/mozilla-central/source/b2g/app/b2g.js
Updated patch to include the B2G pref, as per :ferjm's comment.

Carrying forward the r+ from :markh.

Just waiting for the try build to finish before requesting checkin:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=a0e982df6181
Attachment #8706827 - Attachment is obsolete: true
Attachment #8706835 - Attachment is obsolete: true
Attachment #8707424 - Flags: review+
It's taken a while to get some try results that give me sufficient confidence but, in combination, this pair from the weekend/today do so:

* https://treeherder.mozilla.org/#/jobs?repo=try&revision=0aa829482eee
* https://treeherder.mozilla.org/#/jobs?repo=try&revision=5c57132ef3ed

So, 3rd time lucky, setting checkin-needed.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/4824324ae869
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 46
Depends on: 1241306
[bugday-20160323]

Status: RESOLVED,FIXED -> UNVERIFIED

Comments:
STR: Not clear.
Developer specific testing

Component: 
Name			Firefox
Version			46.0b9
Build ID		20160322075646
Update Channel	        beta
User Agent 		Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0
OS			Windows 7 SP1 x86_64

Expected Results: 
Developer specific testing

Actual Results: 
As expected
Depends on: 1243594
You need to log in before you can comment on or make changes to this bug.