Closed Bug 1157529 Opened 10 years ago Closed 9 years ago

Refactor FxA account storage on the client to be less lossy

Categories

(Firefox :: Firefox Accounts, defect)

defect
Not set
normal
15

Tracking

()

RESOLVED FIXED
Firefox 42
Iteration:
42.2 - Jul 27
Tracking Status
firefox40 --- wontfix
firefox41 --- affected
firefox42 --- fixed

People

(Reporter: markh, Assigned: markh)

References

Details

(Whiteboard: [fxsync])

Attachments

(1 file, 5 obsolete files)

Attached patch 0001-for-export.patch (obsolete) — Splinter Review
As highlighted by bug 1153708, our storage implementation has issues that should be addressed. I don't think this can be done incrementally - we need to take a big bite.  There are 2 main issues we need to address:

1) Better management of fields that come from the login manager, so we don't end up with:
** Account data in the login manager can't be read due to it being locked.
** User unlocks for some reason unrelated to FxA.
** We update the account data, possibly clobbering the previously unlocked data that we haven't yet managed to read.

2) Inherently racey storage API - the pattern used to update account data is (a) read all data, (b) update a single field, (c) write all data. Data will be lost if this happens multiple times concurrently. An example is that Sync login might be updating the keys at the same time the profile code is updating the profile.

I believe we should fix this by moving the storage to something that is more "field" oriented. Specifically:

* The "update account data" functions should take only the fields being updated - ie, (1) above changes to a single API that says "just update these fields" instead of the read/update/write done now, meaning we can keep sane semantics when different fields are updated concurrently.

* Have the storage API know what fields are stored where and "do the right thing" when a field for the login manager is updated while the login manager is (or was) locked.

It looks unlikely I'll find much time to work on this - hopefully some of the FxA team can help!  I did a proof-of-concept, which I'm attaching here.  It's *very much* a WIP - there's still a load of work remaining - but it's here for reference.

The main thing to look at is FxaStorageManager.jsm which implements the API I described above.  The storage manager sits between the AccountState and the storage itself, and tracks what field goes where and the caching/updating of the "secure" storage as the lock state changes.  With this patch, test_accounts and test_storage_manager both work, but almost nothing else does. I've also completely ignored oauth token caching - it probably makes sense to roll the caching into "normal" storage - the only reason we used a separate .json file for tokens was due to these limitations.
Here's a patch - it's large :( But it works :)

I suggest starting with the new module FxAccountsStorage.jsm

* A new module FxAccountsStorage.jsm manages all storage. This module abstracts the fact that some fields are written to "secure" storage (ie, the login manager) and some are stored in plain-text JSON - the callers just set fields.

* The public API doesn't allow you to update fields by passing the entire "account data" object - you just pass the fields you want updated. Thus the API isn't racey in that multiple callers trying to update different fields concurrently will not overwrite each other.

* The code that migrates the account data from before we stored anything in the login manager has been dropped - but it should actually migrate correctly as the storage manager doesn't care where it *reads* fields from - it just ensures they are split correctly on write.

* The oauth tokens are now stored as a regular object in the plain JSON. This means that any oauth tokens already in the oauthTokens.json file will be lost - but that seems OK as no one is actually using them yet.

* This patch makes the AccountState object much simpler, and the patch in bug 1156752 will make it even better, and remove |this.fxaInternal|.

All xpcshell tests under services/ pass. It's likely one or 2 browser-chrome tests will need updating, but they will be trivial.
Assignee: nobody → markh
Attachment #8596283 - Attachment is obsolete: true
Attachment #8620831 - Flags: feedback?(ckarlof)
Comment on attachment 8620831 [details] [diff] [review]
0001-Bug-1157529-refactor-FxA-storage-to-be-less-lossy-an.patch

Review of attachment 8620831 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/fxaccounts/FxAccountsStorage.jsm
@@ +53,5 @@
> +  _initialize: Task.async(function* (accountData) {
> +    log.trace("initializing new storage manager");
> +    try {
> +      if (accountData) {
> +        this._haveReadSecure = true;

Not obvious to me why this is being set here to true, i.e., the code didn't just read the secure storage. :)

@@ +56,5 @@
> +      if (accountData) {
> +        this._haveReadSecure = true;
> +        // split it into the 2 parts, write it and we are done.
> +        for (let [name, val] of Iterator(accountData)) {
> +          if (FXA_PWDMGR_PLAINTEXT_FIELDS.indexOf(name) >= 0) {

Do we need to deal with memory fields here, just in case they are passed in the accountData?

@@ +64,5 @@
> +          }
> +        }
> +        // write it out and we are done.
> +        yield this._write();
> +        return;

I'm not a fan of using return to implement an else branch without else in this case. Is there a reason for doing a return here instead of just putting the code after this block in an else branch?

@@ +69,5 @@
> +      }
> +      // So we were initialized without account data - that means we need to
> +      // read the state from storage.
> +      yield this._readPlainStorage();
> +      if (this.cachedPlain && this.secureStorage) {

It's not obvious why we're checking these two things before we call doReadAndUpdateSecure. My understanding is that we're checking secureStorage because it may not exist (B2G), but why this.cachedPlain? Shouldn't that always be non-null?

@@ +100,5 @@
> +  // a signOut attempts to clear it.
> +  // So all such operations "queue" themselves via this.
> +  _queueStorageOperation(func) {
> +//    log.trace("queueing storage operation ${}", func.toString().replace(/\n/g, " "));
> +    return this._promiseStorageComplete = this._promiseStorageComplete

Cool setup here. If this ever fails, i.e., the catch below gets invoked, then does this correct set this._promiseStorageComplete back to a resolved promise so future callers aren't deadlocked?

@@ +176,5 @@
> +    // for us to.
> +    this._write();
> +  }),
> +
> +  _resetData() {

Perhaps a rename to clearCachedData? It doesn't reset all data.

@@ +185,5 @@
> +  },
> +
> +  /* Note: _readPlainStorage is only called during initialize, so isn't
> +     protected via _queueStorageOperation() nor _promiseInitialized.
> +  */

I'd like to see a summary of what this method does, i.e., it reads the plain storage from disk and if it exists, it populates the this.cachedPlainStorage object with that data. Perhaps readAndCachePlainStorage would be a better name of the method.

@@ +200,5 @@
> +      // either way, we return null.
> +      got = null;
> +    }
> +    if (!got || !got.accountData || got.version != DATA_FORMAT_VERSION) {
> +      this._resetData();

Should there ever be anything to reset here?

@@ +206,5 @@
> +    }
> +    // We need to update our .cachedPlain, but can't just assign to it as
> +    // it may need to be the exact same object as .cachedSecure
> +    // As a sanity check, .cachedPlain must be empty (as we are called by init)
> +    if (Object.keys(this.cachedPlain).length != 0) {

Does Firefox have an "assert"? Seems like a good use of it here.
Attachment #8620831 - Flags: feedback?(zack.carter)
Argh. Early submit. I'm still working on the Storage module and I added Zack as an extra set of eyes here.
Mark, it might be a good idea to idea to add some telemetry around the error cases that are equivalent to assert failures. That'll hopefully give us more insight around any bugs that are reported in association to this patch.
Comment on attachment 8620831 [details] [diff] [review]
0001-Bug-1157529-refactor-FxA-storage-to-be-less-lossy-an.patch

Review of attachment 8620831 [details] [diff] [review]:
-----------------------------------------------------------------

finished feedback on FxAccountsStorage.jsm.

::: services/fxaccounts/FxAccounts.jsm
@@ +134,5 @@
>    },
>  
>    getUserAccountData() {
> +    return this.storageManager.getAccountData().then(result => {
> +      return this.resolve(result)

lol, that's some gross code on the left.

::: services/fxaccounts/FxAccountsStorage.jsm
@@ +37,5 @@
> +}
> +
> +FxAccountsStorageManager.prototype = {
> +  _initialized: false,
> +  _haveReadSecure: false,

It's not clear to me what the semantics of this flag are. Does this mean I've try to read the secureStorage (and maybe failed because it was locked), or I successfully read an unlocked secureStorage?

@@ +53,5 @@
> +  _initialize: Task.async(function* (accountData) {
> +    log.trace("initializing new storage manager");
> +    try {
> +      if (accountData) {
> +        this._haveReadSecure = true;

Not obvious to me why this is being set here to true, i.e., the code didn't just read the secure storage. :)

@@ +56,5 @@
> +      if (accountData) {
> +        this._haveReadSecure = true;
> +        // split it into the 2 parts, write it and we are done.
> +        for (let [name, val] of Iterator(accountData)) {
> +          if (FXA_PWDMGR_PLAINTEXT_FIELDS.indexOf(name) >= 0) {

Do we need to deal with memory fields here, just in case they are passed in the accountData?

@@ +64,5 @@
> +          }
> +        }
> +        // write it out and we are done.
> +        yield this._write();
> +        return;

I'm not a fan of using return to implement an else branch without else in this case. Is there a reason for doing a return here instead of just putting the code after this block in an else branch?

@@ +69,5 @@
> +      }
> +      // So we were initialized without account data - that means we need to
> +      // read the state from storage.
> +      yield this._readPlainStorage();
> +      if (this.cachedPlain && this.secureStorage) {

It's not obvious why we're checking these two things before we call doReadAndUpdateSecure. My understanding is that we're checking secureStorage because it may not exist (B2G), but why this.cachedPlain? Shouldn't that always be non-null?

@@ +70,5 @@
> +      // So we were initialized without account data - that means we need to
> +      // read the state from storage.
> +      yield this._readPlainStorage();
> +      if (this.cachedPlain && this.secureStorage) {
> +        yield this._doReadAndUpdateSecure();

My understanding is that this section of code is basically reading both the plain and secure persisted user data and then caching that data in this object. If that's the case, it would be more clear if readPlainStorage and doReadAndUpdateSecure were named more similarly since they are both doing similar jobs against different backend stores. E.g., readAndCachePlainStorage and readAndCacheSecureStorage?

@@ +100,5 @@
> +  // a signOut attempts to clear it.
> +  // So all such operations "queue" themselves via this.
> +  _queueStorageOperation(func) {
> +//    log.trace("queueing storage operation ${}", func.toString().replace(/\n/g, " "));
> +    return this._promiseStorageComplete = this._promiseStorageComplete

Cool setup here. If this ever fails, i.e., the catch below gets invoked, then does this correct set this._promiseStorageComplete back to a resolved promise so future callers aren't deadlocked?

@@ +150,5 @@
> +    if (!newFields || 'uid' in newFields || 'email' in newFields) {
> +      // XXX - do we really want to check email here?  Once we support a
> +      // user changing email address this may need to change, but it's not
> +      // clear how we would be told of such a change anyway...
> +      throw new Error("Can't change uid or email address");

We're working on a "change email address" flow in FxA, so we may need to revisit this soon.

@@ +161,5 @@
> +          delete this.cachedPlain[name];
> +        } else {
> +          this.cachedPlain[name] = value;
> +        }
> +      } else {

Do we need to handle in memory properties here?

@@ +169,5 @@
> +      }
> +    }
> +    // If we haven't yet read the secure data, do so now, else we may write
> +    // out partial data.
> +    yield this._maybeReadAndUpdateSecure();

Does reading this after we update the values in cachedSecure mean that we may clobber the new values we're trying to set with the old data in the secureStorage?

@@ +176,5 @@
> +    // for us to.
> +    this._write();
> +  }),
> +
> +  _resetData() {

Perhaps a rename to clearCachedData? It doesn't reset all data.

@@ +185,5 @@
> +  },
> +
> +  /* Note: _readPlainStorage is only called during initialize, so isn't
> +     protected via _queueStorageOperation() nor _promiseInitialized.
> +  */

I'd like to see a summary of what this method does, i.e., it reads the plain storage from disk and if it exists, it populates the this.cachedPlainStorage object with that data. Perhaps readAndCachePlainStorage would be a better name of the method.

@@ +200,5 @@
> +      // either way, we return null.
> +      got = null;
> +    }
> +    if (!got || !got.accountData || got.version != DATA_FORMAT_VERSION) {
> +      this._resetData();

Should there ever be anything to reset here?

@@ +206,5 @@
> +    }
> +    // We need to update our .cachedPlain, but can't just assign to it as
> +    // it may need to be the exact same object as .cachedSecure
> +    // As a sanity check, .cachedPlain must be empty (as we are called by init)
> +    if (Object.keys(this.cachedPlain).length != 0) {

Does Firefox have an "assert"? Seems like a good use of it here.

@@ +281,5 @@
> +    // we can merge our cached data with the data that's already been set.
> +    if (this.secureStorage == null) {
> +      return;
> +    }
> +    if (!this._haveReadSecure) {

It seems that haveReadSecure gets set to true even if we read null from the secureStorage due to it being locked, so I'm confused how we attempt to re-read it when it's locked.

@@ +311,5 @@
> +            this.cachedSecure[name] = value;
> +          }
> +        }
> +      }
> +      this._haveReadSecure = true;

Should this be set to true even if the read failed because it was locked? See my comment above about being unclear around the semantics of this flag?

@@ +321,5 @@
> +    }
> +  }),
> +}
> +
> +/**

I'm assuming this below is mainly moved over from FxAccounts.jsm. If not, please let me know so I look at it more closely.
Comment on attachment 8620831 [details] [diff] [review]
0001-Bug-1157529-refactor-FxA-storage-to-be-less-lossy-an.patch

Review of attachment 8620831 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/fxaccounts/FxAccountsCommon.js
@@ +216,5 @@
>  // The fields we save in the plaintext JSON.
>  // See bug 1013064 comments 23-25 for why the sessionToken is "safe"
>  exports.FXA_PWDMGR_PLAINTEXT_FIELDS = ["email", "verified", "authAt",
> +                                       "sessionToken", "uid", "oauthTokens",
> +                                       "profile"];

What is "profile"? The avatar image URL? If so, I'd prefer to be more explicit. In our server we use "profile" to refer to all the user profile data she might have stored with us (e.g., display name, email, etc.)
Hmmm. There are a couple ways of looking at this. You could view the AccountState object as the "in memory cached representation of the account data", but in many ways, that representation is inside the FxAStorageManager, which looking at it now seems like it's not it's concern. 

The weird smell around FxAStorageManager is it seems to coupled to caching data for a single user. Doesn't that seem to be the concern of AccountState. I see that the thing complicating this is the complicated coordination with the secure storage, but it would be nice if the cached data was encapsulated in instantiated AccountState objects rather than being cached directing the FxAStorageManager. Another way of putting it is that I don't see how to easily extend FxAStorageManager to support multiple users, if say, we wanted to implement fast user switching. 

Can we somehow limit the concerns of the FxAStorageManager to coordinating all the storage issues and have the cached data live directly in the AccountState objects? What do you think about that?
Comment on attachment 8620831 [details] [diff] [review]
0001-Bug-1157529-refactor-FxA-storage-to-be-less-lossy-an.patch

Review of attachment 8620831 [details] [diff] [review]:
-----------------------------------------------------------------

I guess that means an f-, but I'm open to having my mind changed.
Attachment #8620831 - Flags: feedback?(ckarlof) → feedback-
(In reply to Chris Karlof [:ckarlof] from comment #5)

> ::: services/fxaccounts/FxAccountsStorage.jsm
> @@ +37,5 @@
> > +}
> > +
> > +FxAccountsStorageManager.prototype = {
> > +  _initialized: false,
> > +  _haveReadSecure: false,
> 
> It's not clear to me what the semantics of this flag are. Does this mean
> I've try to read the secureStorage (and maybe failed because it was locked),
> or I successfully read an unlocked secureStorage?

Yeah, I should probably flip the name and state of this variable around to something like needToReadSecure. Assuming data wasn't passed to initialize (which means storage is never read, period - see below) this flag means "secure storage was locked last time we tried to read or write it".

The "or write" part is important - if storage was locked when we tried to read, but then someone calls updateAccountData with a field that goes into secure storage, we must read at that point so we can update the secure data rather than override it.

> @@ +53,5 @@
> > +  _initialize: Task.async(function* (accountData) {
> > +    log.trace("initializing new storage manager");
> > +    try {
> > +      if (accountData) {
> > +        this._haveReadSecure = true;
> 
> Not obvious to me why this is being set here to true, i.e., the code didn't
> just read the secure storage. :)

Yep - see above - it should read |needToReadSecure = false;| to reflect the fact that as the data was passed in we don't need to read *any* storage.

IOW, this flag is trying to reflect "does the in-memory cache of the secure data reflect all secure storage?"

> 
> @@ +56,5 @@
> > +      if (accountData) {
> > +        this._haveReadSecure = true;
> > +        // split it into the 2 parts, write it and we are done.
> > +        for (let [name, val] of Iterator(accountData)) {
> > +          if (FXA_PWDMGR_PLAINTEXT_FIELDS.indexOf(name) >= 0) {
> 
> Do we need to deal with memory fields here, just in case they are passed in
> the accountData?

Not sure what memory fields you refer to here. The storage in this patch doesn't have any such concept - all fields are written somewhere. The patch in bug 1156752 does introduce the concept of "memory fields" and does touch this block as you suggest.

> 
> @@ +64,5 @@
> > +          }
> > +        }
> > +        // write it out and we are done.
> > +        yield this._write();
> > +        return;
> 
> I'm not a fan of using return to implement an else branch without else in
> this case. Is there a reason for doing a return here instead of just putting
> the code after this block in an else branch?

I actually prefer this style - https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style mentions as the first general guideline - "Don't put an else right after a return. Delete the else, it's unnecessary and increases indentation level". This pattern tends to be used quite heavily in desktop code.

> @@ +69,5 @@
> > +      }
> > +      // So we were initialized without account data - that means we need to
> > +      // read the state from storage.
> > +      yield this._readPlainStorage();
> > +      if (this.cachedPlain && this.secureStorage) {
> 
> It's not obvious why we're checking these two things before we call
> doReadAndUpdateSecure. My understanding is that we're checking secureStorage
> because it may not exist (B2G), but why this.cachedPlain? Shouldn't that
> always be non-null?

Yep, that's correct. The intent is that we only read secure storage if we successfully read data from the plain storage (eg, if we previously did a signOut when secure storage was locked). So I think this block should remain with |Object.keys(this.cachedPlain).length != 0| (and comment reflecting the above :)

> it would be more clear if readPlainStorage and
> doReadAndUpdateSecure were named more similarly

Good point, I'll do that.

> @@ +100,5 @@
> > +  // a signOut attempts to clear it.
> > +  // So all such operations "queue" themselves via this.
> > +  _queueStorageOperation(func) {
> > +//    log.trace("queueing storage operation ${}", func.toString().replace(/\n/g, " "));
> > +    return this._promiseStorageComplete = this._promiseStorageComplete
> 
> Cool setup here. If this ever fails, i.e., the catch below gets invoked,
> then does this correct set this._promiseStorageComplete back to a resolved
> promise so future callers aren't deadlocked?

IIUC, your question can be demonstrated by the following:

--8<--
let p = Promise.resolve().then(() => {
  throw new Error("an error");
}).catch(err => {alert("oh no")});

p.then(() => {
 alert("part 2");
})
--8<--

This code shows the first alert with "oh no", then the second alert with "part 2".  IOW, a rejection handler that completes successfully results in a resolved promise.

> Do we need to handle in memory properties here?

See above - this patch doesn't really have that concept.

> 
> @@ +169,5 @@
> > +      }
> > +    }
> > +    // If we haven't yet read the secure data, do so now, else we may write
> > +    // out partial data.
> > +    yield this._maybeReadAndUpdateSecure();
> 
> Does reading this after we update the values in cachedSecure mean that we
> may clobber the new values we're trying to set with the old data in the
> secureStorage?

Nope - I'm pretty sure _doReadAndUpdateSecure() handles this case - anything already in this.cachedSecure wins, and checkLockedUpdates() in test_storage_manager.js is for exactly this scenario.

> @@ +176,5 @@
> > +    // for us to.
> > +    this._write();
> > +  }),
> > +
> > +  _resetData() {
> 
> Perhaps a rename to clearCachedData? It doesn't reset all data.

I've no objection to the name change, but to make sure we are on the same page, what data doesn't it reset?

> @@ +200,5 @@
> > +      // either way, we return null.
> > +      got = null;
> > +    }
> > +    if (!got || !got.accountData || got.version != DATA_FORMAT_VERSION) {
> > +      this._resetData();
> 
> Should there ever be anything to reset here?

Nicely spotted :) Nope, there should never be anything to reset, but I thought "better safe than sorry" - I'm happy to either remove it or add a comment.

> Does Firefox have an "assert"? Seems like a good use of it here.

Not yet :(  Bug 1080457 is adding one - I should add a comment reflecting this.

> @@ +281,5 @@
> > +    // we can merge our cached data with the data that's already been set.
> > +    if (this.secureStorage == null) {
> > +      return;
> > +    }
> > +    if (!this._haveReadSecure) {
> 
> It seems that haveReadSecure gets set to true even if we read null from the
> secureStorage due to it being locked, so I'm confused how we attempt to
> re-read it when it's locked.
> 
> @@ +311,5 @@
> > +            this.cachedSecure[name] = value;
> > +          }
> > +        }
> > +      }
> > +      this._haveReadSecure = true;
> 
> Should this be set to true even if the read failed because it was locked?

Storage being locked means we enter the exception handler, so _haveReadSecure should never be set to true while storage is locked.

> See my comment above about being unclear around the semantics of this flag?

Hopefully that makes it clearer?
> 
> @@ +321,5 @@
> > +    }
> > +  }),
> > +}
> > +
> > +/**
> 
> I'm assuming this below is mainly moved over from FxAccounts.jsm. If not,
> please let me know so I look at it more closely.

Yep, that's correct.
(In reply to Chris Karlof [:ckarlof] from comment #6)
> What is "profile"? The avatar image URL? If so, I'd prefer to be more
> explicit. In our server we use "profile" to refer to all the user profile
> data she might have stored with us (e.g., display name, email, etc.)

It's the entire profile as received by the profile server. "avatar" is a field inside this object.

(In reply to Chris Karlof [:ckarlof] from comment #7)
> Hmmm. There are a couple ways of looking at this. You could view the
> AccountState object as the "in memory cached representation of the account
> data", but in many ways, that representation is inside the
> FxAStorageManager, which looking at it now seems like it's not it's concern. 

I view the "AccountState" as "an object that ensures the same user is logged in at the end of an operation is the same as at the start". It's primarily concerned with ensuring that if the "current user" changes while promises are in-flight the sensible things happen - ie, it should be impossible to "accidentally" update data for the wrong user, and all promises against the "old" user will be rejected.

It delegates the storage to the storage manager. Each accountState is created with a brand-new StorageManager - the accountState fully owns its storage and a single storage instance is never reused or shared between multiple AccountState objects.

> The weird smell around FxAStorageManager is it seems to coupled to caching
> data for a single user. Doesn't that seem to be the concern of AccountState.
> I see that the thing complicating this is the complicated coordination with
> the secure storage, but it would be nice if the cached data was encapsulated
> in instantiated AccountState objects rather than being cached directing the
> FxAStorageManager. Another way of putting it is that I don't see how to
> easily extend FxAStorageManager to support multiple users, if say, we wanted
> to implement fast user switching.

In that scenario there would be multiple StorageManager objects alive.

It sounds like you are under the misconception that the StorageManager is a singleton or otherwise shared? There is a 1:1 relationship between accountState and storageManager objects.

> Can we somehow limit the concerns of the FxAStorageManager to coordinating
> all the storage issues and have the cached data live directly in the
> AccountState objects? What do you think about that?

I think that's roughly what we have - the AccountState *owns* the StorageManager, and delegates access to the underlying fields to it. Thus, the AccountState doesn't need to know what "bucket" a field lives in, it just treats all the data as a single logical object with fields. The StorageManager is responsible for splitting those fields into buckets and getting them to storage.

So I think it would be a loss of encapsulation to have the various "buckets" on the AccountState object and have it responsible for creating the unified view of the data - but I might be misunderstanding your concern.

Setting needinfo - I'll put a new patch up after we've discussed this further.
Flags: needinfo?(ckarlof)
Blocks: 1174812
Comment on attachment 8620831 [details] [diff] [review]
0001-Bug-1157529-refactor-FxA-storage-to-be-less-lossy-an.patch

Review of attachment 8620831 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/fxaccounts/FxAccountsStorage.jsm
@@ +13,5 @@
> +Cu.import("resource://gre/modules/Task.jsm");
> +Cu.import("resource://gre/modules/FxAccountsCommon.js");
> +Cu.import("resource://gre/modules/osfile.jsm");
> +Cu.import("resource://services-common/utils.js");
> +

Might be useful to add a high-level explanation of what this module does/what problems it solves.

@@ +82,5 @@
> +    // We can't throw this instance away while it is still writing or we may
> +    // end up racing with the newly created one.
> +    log.trace("StorageManager finalizing");
> +    return this._promiseInitialized.then(() => {
> +      this._promiseStorageComplete

Should this have a return statement?
Attachment #8620831 - Flags: feedback?(zack.carter)
Comment on attachment 8620831 [details] [diff] [review]
0001-Bug-1157529-refactor-FxA-storage-to-be-less-lossy-an.patch

Review of attachment 8620831 [details] [diff] [review]:
-----------------------------------------------------------------

I guess that means an f-, but I'm open to having my mind changed.
Attachment #8620831 - Flags: feedback?(kcambridge)
Attachment #8620831 - Attachment is obsolete: true
Attachment #8620831 - Flags: feedback?(kcambridge)
Comment on attachment 8625936 [details] [diff] [review]
0001-Bug-1157529-refactor-FxA-storage-to-be-less-lossy-an.patch

Review of attachment 8625936 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/fxaccounts/FxAccounts.jsm
@@ +108,3 @@
>      this.fxaInternal = null;
> +    if (!this.storageManager) {
> +      return Promise.resolve();

A comment around why and when this might be called would be useful for future us.

@@ +127,4 @@
>    },
>  
>    getUserAccountData() {
> +    return this.storageManager.getAccountData().then(result => {

How come we lost the isCurrent check here? If we're not current then the storageManager will be null, which will cause this call to error out with a strange error.

@@ +260,3 @@
>        // later we might want to check an expiry date - but we currently
>        // have no such concept, so just return it.
> +      log.trace("getCachedToken returning cached token", result);

Remove the result from the log message.

@@ +938,3 @@
>              })
> +            .then(() => {
> +              return currentState.getUserAccountData();

We can probably get rid of the first getUserAccountData() because we don't need all the data to update properties anymore.

@@ +1128,5 @@
>      // Early exit for a cached token.
>      let currentState = this.currentAccountState;
>      let cached = currentState.getCachedToken(scope);
>      if (cached) {
> +      log.debug("getOAuthToken returning a cached token", cached);

Let's remove the logging of the actual token value here.

::: services/fxaccounts/FxAccountsStorage.jsm
@@ +14,5 @@
> +Cu.import("resource://gre/modules/FxAccountsCommon.js");
> +Cu.import("resource://gre/modules/osfile.jsm");
> +Cu.import("resource://services-common/utils.js");
> +
> +function FxAccountsStorageManager(options, useSecure = true) {

This is a weird smell that useSecure needs to passed here. Can the storage manager figure this out itself, i.e., move the B2G flag detection in here?

@@ +84,5 @@
> +    // We can't throw this instance away while it is still writing or we may
> +    // end up racing with the newly created one.
> +    log.trace("StorageManager finalizing");
> +    return this._promiseInitialized.then(() => {
> +      this._promiseStorageComplete

Add an explicit return.

@@ +339,5 @@
> +  deleteAccountData() {
> +    return this._queueStorageOperation(() => this._deleteAccountData());
> +  },
> +
> +  deleteAccountData: Task.async(function() {

There's a typo here and a sign that we need some tests around the queued storage operations.

::: services/fxaccounts/tests/xpcshell/test_loginmgr_storage.js
@@ +170,5 @@
>    yield fxa.signOut(/* localOnly = */ true)
>  });
>  
> +// XXXX - need more tests for the mini-migration of the login using the email
> +// vs it now using the uid.

yes please do!

::: services/fxaccounts/tests/xpcshell/test_storage_manager.js
@@ +1,1 @@
> +/* Any copyright is dedicated to the Public Domain.

Needs some tests for the queued stored writes.
Flags: needinfo?(ckarlof)
Comment on attachment 8625936 [details] [diff] [review]
0001-Bug-1157529-refactor-FxA-storage-to-be-less-lossy-an.patch

Review of attachment 8625936 [details] [diff] [review]:
-----------------------------------------------------------------

Made an initial feedback pass while waiting for Gecko to build. :-) I really like the unified API, and the fact that other FxA components no longer need to fetch and mutate the entire storage record.

::: services/fxaccounts/FxAccounts.jsm
@@ +127,4 @@
>    },
>  
>    getUserAccountData() {
> +    return this.storageManager.getAccountData().then(result => {

What if we applied the same promise queue pattern from `FxAccountsStorageManager`? Chain off `promiseInitialAccountData`, return resolved promises as long as the account is current, then reject once `signOut()` or `abort()` is called.

@@ +285,5 @@
>    // (notable exceptions are tests and when we explicitly are saving the entire
>    // set of user data.)
>    _persistCachedTokens() {
>      this._cachePreamble();
> +    return this.updateUserAccountData({ oauthTokens: this.oauthTokens }).catch(err => {

Ah, we're now persisting the scope-token map directly, instead of an array of tokens?

@@ +534,5 @@
>    setSignedInUser: function setSignedInUser(credentials) {
>      log.debug("setSignedInUser - aborting any existing flows");
> +    return this.abortExistingFlow().then(() => {
> +      let currentAccountState = this.currentAccountState = this.newAccountState(
> +        JSON.parse(JSON.stringify(credentials)) // Pass a clone of the credentials object.

I've seen `Cu.cloneInto(credentials, {})` used in some toolkit modules, too...does that have the same effect?

@@ +539,5 @@
> +      );
> +      // This promise waits for storage, but not for verification.
> +      // We're telling the caller that this is durable now (although is that
> +      // really something we should commit to? Why not let the write happen in
> +      // the background? Already does for updateAccountData ;)

Makes sense. If we do this in the background, should `FxAccountsStorageManager` use `AsyncShutdown.jsm` to wait for `finalize()`?

Though I guess there's no guarantee the credentials will be persisted if the login manager remains locked for the entire session, so maybe there's no need?

::: services/fxaccounts/FxAccountsStorage.jsm
@@ +32,5 @@
> +  this._clearCachedData();
> +  // See .initialize() below - this protects against it not being called.
> +  this._promiseInitialized = Promise.reject("initialize not called");
> +  // A promise to avoid storage races - see _queueStorageOperation
> +  this._promiseStorageComplete = Promise.resolve();

If we chained off `_promiseInitialized`, I think we could propagate initialization errors as rejections, and make sure all queued writes wait for the initial read.
(In reply to Kit Cambridge [:kitcambridge] from comment #15)

> What if we applied the same promise queue pattern from
> `FxAccountsStorageManager`? Chain off `promiseInitialAccountData`, return
> resolved promises as long as the account is current, then reject once
> `signOut()` or `abort()` is called.

That's actually a great idea! But I didn't change this patch in the interests of getting it landed :) We should certainly do that when we move to https://id.etherpad.mozilla.org/fxa-refactor in the future.

> Ah, we're now persisting the scope-token map directly, instead of an array
> of tokens?

Yeah, seemed simpler.

> 
> @@ +534,5 @@
> >    setSignedInUser: function setSignedInUser(credentials) {
> >      log.debug("setSignedInUser - aborting any existing flows");
> > +    return this.abortExistingFlow().then(() => {
> > +      let currentAccountState = this.currentAccountState = this.newAccountState(
> > +        JSON.parse(JSON.stringify(credentials)) // Pass a clone of the credentials object.
> 
> I've seen `Cu.cloneInto(credentials, {})` used in some toolkit modules,
> too...does that have the same effect?

Good point, and yeah :) I'm not even sure that was available when the code was written, but it's certainly a better option - I changed it to use that.

> Makes sense. If we do this in the background, should
> `FxAccountsStorageManager` use `AsyncShutdown.jsm` to wait for `finalize()`?

Well, it's *all* in the background already as FxAccounts itself has no such blocker, and the callers probably don't either. It's something we should consider, but I don't think this patch adds additional problems in that regard.

> Though I guess there's no guarantee the credentials will be persisted if the
> login manager remains locked for the entire session, so maybe there's no
> need?

So long as writing the JSON is atomic, I don't see a "didn't manage to write on shutdown" scenario that would cause actual problems - just some "not exactly ideal" situations :)

> If we chained off `_promiseInitialized`, I think we could propagate
> initialization errors as rejections, and make sure all queued writes wait
> for the initial read.

Fancy! :) I didn't bother with this as it adds a little complexity and we already consider "must have initialized" as part of the usage-pattern. IOW, if we did that, then we might as well get rid of the initialization promise completely, and I don't think we should block to explore that :)
I think this covers everything we discussed without too many surprises :)
Attachment #8625936 - Attachment is obsolete: true
Attachment #8628722 - Flags: review?(ckarlof)
So actually testing this brought up a few additional issues.

* The _promiseInitialized magic in FxAccountsStorage would cause an error to be printed to the console as the rejected promise was GCd. I fixed this by adding an empty .catch handler to the rejected promise just before throwing it away and replacing it with a real promise.

* I got a bit too smart by trying to use this.resolve() in getUserAccountData to check if we were still current, but this wasn't always effective as it was possible to pass that check, but still end up with .storageManager being null as we aborted the state after the check.

* Due to an error unrelated to this patch I discovered that if the storageManager's initialize() method failed things would hang. So currentAccountState.promiseInitialized() now had a catch handler.

* I found we were logging all the data stored in the json when logging at "Trace" level - we now just log the keys and not the values.
Attachment #8628722 - Attachment is obsolete: true
Attachment #8628722 - Flags: review?(ckarlof)
Attachment #8629220 - Flags: review?(ckarlof)
And to be clear, this patch doesn't actually enable the profile being cached - it just allows us to enable it later, which I'll do ASAP.
Comment on attachment 8629220 [details] [diff] [review]
0002-Bug-1157529-refactor-FxA-storage-to-be-less-lossy-an.patch

Review of attachment 8629220 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/fxaccounts/FxAccountsStorage.jsm
@@ +467,5 @@
> +         "@mozilla.org/login-manager/loginInfo;1", Ci.nsILoginInfo, "init");
> +      let login = new loginInfo(FXA_PWDMGR_HOST,
> +                                null, // aFormSubmitURL,
> +                                FXA_PWDMGR_REALM, // aHttpRealm,
> +                                uid, // aUsername

actually, using the uid may not be a smart idea at the moment - it means people who bounce between channels (ie, move back to, say, the release channel) will lose their credentials. It might make sense to keep storing the email, but also still allow either the uid or email to match in .get(). Then, once this patch has made it to release we make a change to start storing the uid instead of the email, but older versions should continue to work - they'll recognize the uid as matching even if they continue to write the email.
Scoundrel good to me.
Who you calling a scoundrel? ;)

With the change described in comment 22, plus a log tweak.
Attachment #8629220 - Attachment is obsolete: true
Attachment #8629220 - Flags: review?(ckarlof)
Attachment #8631453 - Flags: review?(ckarlof)
upon r+, we would need this uplifted to fx41
Blocks: 1182288
Rank: 15
markh, do you agree this should be uplifted to 41? I believe that's what our plan was.
Comment on attachment 8631453 [details] [diff] [review]
0001-Bug-1157529-refactor-FxA-storage-to-be-less-lossy-an.patch

Review of attachment 8631453 [details] [diff] [review]:
-----------------------------------------------------------------

r+, assuming, my comments are considered and either addressed or ignored for a good reason. :)

::: services/fxaccounts/FxAccounts.jsm
@@ +132,4 @@
>    },
>  
>    getUserAccountData() {
> +    if (!this.storageManager) {

Are you using this as a signal that the user has signed out and isn't current anymore? I believe this was added in response to a previous patch review where I pointed out that if the user wasn't current anymore, then it would error out because this.storageManager would be null. That's fine, but I don't like the lack of consistency with the check that updateUserAccountData does (i.e., checking this.isCurrent). Is there a reason for that?

@@ +144,1 @@
>      if (!this.isCurrent) {

Why does updateUserAccountData() still use if (!this.isCurrent) instead of checking if the storageManager is null?

@@ +801,5 @@
>        // We are now ready for business. This should only be invoked once
>        // per setSignedInUser(), regardless of whether we've rebooted since
>        // setSignedInUser() was called.
>        this.notifyObservers(ONVERIFIED_NOTIFICATION);
> +      return currentState.getUserAccountData();

Is this retrieving the account data again to allow that something might have changed after yield returns? (I'm comparing it against what was in a previous patch of just returning data.)

@@ +1276,5 @@
>    getSignedInUserProfile: function () {
>      let currentState = this.currentAccountState;
>      return this.profile.getProfile().then(
>        profileData => {
> +        let profile = Cu.cloneInto(profileData, {});

Nice.

::: services/fxaccounts/FxAccountsStorage.jsm
@@ +510,5 @@
> +      }
> +      let login = logins[0];
> +      // Support either the uid or the email as the username - we plan to move
> +      // to storing the uid once Fx41 hits the release channel as the code below
> +      // that handles either first landed in 41.

Can we file a bug for this and get it on the Sync backlog?
Attachment #8631453 - Flags: review?(ckarlof) → review+
Blocks: 1156752
(In reply to Chris Karlof [:ckarlof] from comment #27)
> >    getUserAccountData() {
> > +    if (!this.storageManager) {
> 
...
> I don't like the lack of consistency with the check that
> updateUserAccountData does (i.e., checking this.isCurrent). Is there a
> reason for that?

Nope :) Change to isCurrent.

> @@ +144,1 @@
> >      if (!this.isCurrent) {
> 
> Why does updateUserAccountData() still use if (!this.isCurrent) instead of
> checking if the storageManager is null?

That remains the (now) consistent isCurrent.

> @@ +801,5 @@
> >        // We are now ready for business. This should only be invoked once
> >        // per setSignedInUser(), regardless of whether we've rebooted since
> >        // setSignedInUser() was called.
> >        this.notifyObservers(ONVERIFIED_NOTIFICATION);
> > +      return currentState.getUserAccountData();
> 
> Is this retrieving the account data again to allow that something might have
> changed after yield returns? (I'm comparing it against what was in a
> previous patch of just returning data.)

That path just updated the storage with kA, kB, etc - without that extra getUserAccountData() the caller would not see these new fields.

> ::: services/fxaccounts/FxAccountsStorage.jsm
> @@ +510,5 @@
> > +      }
> > +      let login = logins[0];
> > +      // Support either the uid or the email as the username - we plan to move
> > +      // to storing the uid once Fx41 hits the release channel as the code below
> > +      // that handles either first landed in 41.
> 
> Can we file a bug for this and get it on the Sync backlog?

Opened bug 1183951 and updated the comment to reflect that. I haven't added it to the backlog yet as that bug isn't actionable for a few months.

Note that I also needed to change browser_946320_tabs_from_other_computers.js, but as that was a fairly obvious test-only change I didn't bother rerequesting review.

Thanks!
https://hg.mozilla.org/integration/fx-team/rev/be39e54d4dff
Status: NEW → ASSIGNED
Backout:
https://hg.mozilla.org/integration/fx-team/rev/de25ce87442b
sorry had to back this out since this might have caused regressions like https://treeherder.mozilla.org/logviewer.html#?job_id=3767205&repo=fx-team
Flags: needinfo?(markh)
Whiteboard: [fxsync]
new try is looking green: https://treeherder.mozilla.org/#/jobs?repo=try&revision=de2001b64429
Flags: needinfo?(markh)
https://hg.mozilla.org/integration/fx-team/rev/b29ec4137962
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #34)
> https://treeherder.mozilla.org/logviewer.html#?job_id=3824673&repo=fx-team

This one isn't yours. Sorry, opened one too many tabs there.
Also B2G xpcshell failures.
https://treeherder.mozilla.org/logviewer.html#?job_id=3829800&repo=fx-team
https://hg.mozilla.org/integration/fx-team/rev/89ebfc11c1a2
https://hg.mozilla.org/mozilla-central/rev/89ebfc11c1a2
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla42
Iteration: --- → 42.2 - Jul 27
Product: Core → Firefox
Target Milestone: mozilla42 → Firefox 42
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: