Closed Bug 971194 Opened 10 years ago Closed 10 years ago

Signed out of Sync against my will

Categories

(Firefox :: Sync, defect, P1)

defect

Tracking

()

RESOLVED WORKSFORME
Tracking Status
firefox29 --- affected
firefox30 --- affected

People

(Reporter: rfeeley, Unassigned)

References

Details

(Whiteboard: [qa+])

I was signed out of the new FxA Sync against my own will.

Here are the logs.

https://dl.dropboxusercontent.com/u/5265484/ryans-sync-logs.zip
Blocks: 969593
Component: Sync → Firefox Sync: Backend
Product: Firefox → Mozilla Services
Interesting logs. You're seeing errors for missing fields on startup, until things get set up:


1391817904213   Sync.Service    INFO    Loading Weave 1.32.0
1391817904221   Sync.Engine.Clients     DEBUG   Engine initialized
...
1391817904243   Sync.Status     DEBUG   Status.login: success.login => error.login.reason.no_username
1391817904243   Sync.Status     DEBUG   Status.service: success.status_ok => service.client_not_configured

...

1392137118699   Sync.Status     DEBUG   Status.login: error.login.reason.no_recoverykey => error.login.reason.no_recoverykey
1392137118699   Sync.Status     DEBUG   Status.service: service.client_not_configured => service.client_not_configured


You're having network problems:

1392137118710   Sync.Service    DEBUG   Exception: NS_ERROR_UNKNOWN_HOST JS Stack trace: Res_get@resource.js:413 < Sync11Service.prototype._fetchInfo@service.js:521 < sync@enginesync.js:75 < onNotify@service.js:1207 < WrappedNotify@util.js:143 < WrappedLock@util.js:98 < _lockedSync@service.js:1201 < sync/<@service.js:1192 < WrappedCatch@util.js:72 < sync@service.js:1180


You're seeing other bugs:

1392137118689   Sync.BrowserIDManager   ERROR   Authentication error in _fetchTokenForUser: Client error.


You're even hitting the Firefox 4 upgrade path, which you'll hit if the remote syncID differs from the local syncID:

1392146433843   Sync.Service    WARN    Failed to upgrade sync key. Failing remote setup.


and apparently being in an error state while still happily syncing:

1392141596666   Sync.Collection DEBUG   mesg: POST success 200 https://sync-2-us-east-1.sync.services.mozilla.com/1.5/11476/storage/tabs
1392141596667   Sync.Collection DEBUG   POST success 200 https://sync-2-us-east-1.sync.services.mozilla.com/1.5/11476/storage/tabs
1392141724232   Sync.Status     DEBUG   Status.login: error.login.reason.no_recoverykey => error.login.reason.no_recoverykey
1392141724232   Sync.Status     DEBUG   Status.service: service.client_not_configured => service.client_not_configured
1392141724232   Sync.Status     DEBUG   Status.login: error.login.reason.no_recoverykey => error.login.reason.no_recoverykey
1392141724232   Sync.Status     DEBUG   Status.service: service.client_not_configured => service.client_not_configured


The overall concern I have with these logs: the FxA auth system, because of its additional up-front auth work, is *routinely* placing the client into an error state. That's riskier than I'd like, because it runs the risk of tickling bugs in less-well-tested code, including code that hasn't run since old passphrase-based Sync accounts were updated to use 26-char keys.

I think the core error here is that the new FxA identity module doesn't completely implement the interface that Service expects. In particular, `upgradeSyncKey` is returning `false` during remote setup, presumably because the identity module doesn't have a sync key.

  /**
   * Obtain the Sync Key.
   *
   * This returns a 26 character "friendly" Base32 encoded string on success or
   * null if no Sync Key could be found.
   *
   * If the Sync Key hasn't been set in this session, this will look in the
   * password manager for the sync key.
   */
  get syncKey() {
    if (this.syncKeyBundle) {
      // TODO: This is probably fine because the code shouldn't be
      // using the sync key directly (it should use the sync key
      // bundle), but I don't like it. We should probably refactor
      // code that is inspecting this to not do validation on this
      // field directly and instead call a isSyncKeyValid() function
      // that we can override.
      return "99999999999999999999999999";
    }
    else {
      return null;
    }
  },

browserid_identity.js has quite a few TODOs and incorrect comments copypasta'd from elsewhere, so I wouldn't be surprised if there were more behaviors like this hiding in the error handling states.

Flagging for tracking, because I see no reason why this is likely to uniquely affect Ryan, and these error states will only be more frequent under production load.
OS: Mac OS X → All
Hardware: x86 → All
Another user:

10:01:03 < cade> tracy: soooooo, definitely not successful. about:sync-log shows a whole bunch of error logs. latest is: https://www.pastebin.mozilla.org/4279196
Ryan and I were unable to repro the getting signed out with a new profile/user.  I do often see errors in the sync logs but I'm still successfully auth and syncing.
(In reply to Edwin Wong [:edwong] from comment #3)
> Ryan and I were unable to repro the getting signed out with a new
> profile/user.  I do often see errors in the sync logs but I'm still
> successfully auth and syncing.

There will need to be a certain sequence of server-related errors to cause this scenario. Regular black-box QA won't find it; it probably needs white-box testing with provoked server failures and timing issues.
Priority: -- → P1
Found one way to make this occur, by accident.

A 503 fetching info/collections => Tools shows "Set up Sync…", Prefs shows “Please sign in to reconnect”.

The 503 was the first and only request this session.

That's not right.

Repro:

* Set up Sync.
* Switch your token server URL to a token server that will 503.
* Relaunch browser.
* Tools > Sync Now.
(In reply to Richard Newman [:rnewman] from comment #6)
> Found one way to make this occur, by accident.
> 
> A 503 fetching info/collections => Tools shows "Set up Sync…", Prefs shows
> “Please sign in to reconnect”.
> 
> The 503 was the first and only request this session.
> 
> That's not right.
> 
> Repro:
> 
> * Set up Sync.
> * Switch your token server URL to a token server that will 503.
> * Relaunch browser.
> * Tools > Sync Now.

All token server errors were being treated as auth errors. Fix in Bug 977502.
Whiteboard: [qa+]
Looking through the logs, I suspect Ryan hit a magical combination of issues: FxAccounts.jsm not clearing state sufficiently on user logout, FxA Sync not clearing prefs on logout, network and other errors being treated as auth errors, and crappy luck with clock skew. These have been hopefully fixed in Bug 977502, Bug 967015, Bug 972070, and turning off timestamp checking in Hawk. browserid_identity.js could still use a fair amount of cleanup, particular in the sync key getter hack you pointed out, Richard. :) It could just be wishful thinking, but I suspect Ryan will have a much better experience on Nightly now.
Per discussion with Ryan, he is still experiencing other Sync issues, but not the ones reported in this bug. Closing as fixed.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Resolution: FIXED → WORKSFORME
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox
You need to log in before you can comment on or make changes to this bug.