Closed Bug 967580 (clockskew) Opened 10 years ago Closed 10 years ago

Firefox Accounts (Sync) - 401 Unauthorized; credentials work fine on other device

Categories

(Firefox for Android Graveyard :: Android Sync, defect)

Firefox 29
ARM
Android
defect
Not set
major

Tracking

(firefox29 affected, firefox30 affected, fennec29+)

RESOLVED FIXED
Tracking Status
firefox29 --- affected
firefox30 --- affected
fennec 29+ ---

People

(Reporter: aaronmt, Assigned: nalexander)

References

Details

(Whiteboard: [qa+])

On my Samsung Galaxy Tab 2 (Android 4.1), I can sign-in; but glancing at my console, I get a 401 Unauthorized dump and notice no Sync ever taking place.

These credentials work fine on my LG Nexus 4, and I had verified the account through email a week prior.
Both devices are running Aurora 29.0a2 (02/04)
I/FxAccounts(25213): fennec :: FxAccountSyncAdapter :: handleTransition: LogMessage('keys succeeded') to Cohabiting
I/FxAccounts(25213): fennec :: FxAccountSyncAdapter :: handleTransition: LogMessage('sign succeeded') to Married
I/FxAccounts(25213): fennec :: FxAccountSyncAdapter :: handleFinal: in Married
I/FxAccounts(25213): fennec :: GlobalSession :: Running next stage checkPreconditions (org.mozilla.gecko.sync.stage.CheckPreconditionsStage@419ef550)...
I/FxAccounts(25213): fennec :: GlobalSession :: Running next stage ensureClusterURL (org.mozilla.gecko.sync.stage.CheckPreconditionsStage@419b5a50)...
I/FxAccounts(25213): fennec :: GlobalSession :: Running next stage fetchInfoCollections (org.mozilla.gecko.sync.stage.FetchInfoCollectionsStage@419f01e8)...
W/FxAccounts(25213): fennec :: SSResourceDelegate :: HTTP request failed.
W/FxAccounts(25213): fennec :: SSResourceDelegate :: HTTP response body: <html>
W/FxAccounts(25213):  <head>
W/FxAccounts(25213):   <title>401 Unauthorized</title>
W/FxAccounts(25213):  </head>
W/FxAccounts(25213):  <body>
W/FxAccounts(25213):   <h1>401 Unauthorized</h1>
W/FxAccounts(25213):   This server could not verify that you are authorized to access the document you requested.  Either you supplied the wrong credentials (e.g., bad password), or your browser does not understand how to supply the credentials required.<br/><br/>
W/FxAccounts(25213): invalid Hawk signature
W/FxAccounts(25213): 
W/FxAccounts(25213): 
W/FxAccounts(25213):  </body>
W/FxAccounts(25213): </html>
W/FxAccounts(25213): fennec :: GlobalSession :: Aborting sync due to HTTP 401
W/FxAccounts(25213): fennec :: GlobalSession :: Aborting sync: Failure fetching info/collections.
W/FxAccounts(25213): <HTTPFailureException 401 :: (<html>
Any idea, Ryan?
Flags: needinfo?(rfkelly)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.GlobalSession.handleHTTPError(GlobalSession.java:478)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.stage.FetchInfoCollectionsStage$StageInfoCollectionsDelegate.handleFailure(FetchInfoCollectionsStage.java:25)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.JSONRecordFetcher$JSONFetchHandler.handleRequestFailure(JSONRecordFetcher.java:67)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.net.SyncStorageRequest$SyncStorageResourceDelegate.handleHttpResponse(SyncStorageRequest.java:128)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.net.BaseResource.execute(BaseResource.java:276)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.net.BaseResource.go(BaseResource.java:311)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.net.BaseResource.get(BaseResource.java:317)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.net.SyncStorageRequest.get(SyncStorageRequest.java:177)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.JSONRecordFetcher.fetch(JSONRecordFetcher.java:81)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.stage.FetchInfoCollectionsStage.execute(FetchInfoCollectionsStage.java:38)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.stage.AbstractSessionManagingSyncStage.execute(AbstractSessionManagingSyncStage.java:41)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.GlobalSession.advance(GlobalSession.java:273)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.stage.CheckPreconditionsStage.execute(CheckPreconditionsStage.java:11)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.stage.AbstractSessionManagingSyncStage.execute(AbstractSessionManagingSyncStage.java:41)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.GlobalSession.advance(GlobalSession.java:273)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.stage.CheckPreconditionsStage.execute(CheckPreconditionsStage.java:11)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.stage.AbstractSessionManagingSyncStage.execute(AbstractSessionManagingSyncStage.java:41)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.GlobalSession.advance(GlobalSession.java:273)
E/FxAccounts(25213): 	at org.mozilla.gecko.sync.GlobalSession.start(GlobalSession.java:310)
E/FxAccounts(25213): 	at org.mozilla.gecko.fxa.sync.FxAccountSyncAdapter$1.handleSuccess(FxAccountSyncAdapter.java:250)
E/FxAccounts(25213): 	at org.mozilla.gecko.tokenserver.TokenServerClient$1.run(TokenServerClient.java:82)
E/FxAccounts(25213): 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1080)
E/FxAccounts(25213): 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:573)
E/FxAccounts(25213): 	at java.lang.Thread.run(Thread.java:856)
tracking-fennec: --- → ?
Seeing this on more than one deivce, e.g, a Alcatel One Touch 8008X I randomly plucked from my stash
> invalid Hawk signature

This means something is going wrong with the request-signing: maybe timestamp issue, maybe some problem with the shared secrets on the server.  I'll see what I can find out...
Flags: needinfo?(rfkelly)
Aaron can you please check if there are any time-sync problems with the non-working versus working devices?  E.g. if the non-working devices all have their clocks a little bit of, it may indicate a problem with our handling/reporting of this on the server.
Flags: needinfo?(aaron.train)
Looks like that was it — on my One Touch 8008X the clock was off by ~15 minutes (with the correct date). I set the time to match my laptop and did a successful Sync afterwards. I don't have access to the tablet today but I suspect that it's the same issue.

What is the time differential threshold between device and server where things might go awry? Is there anything to do in this bug than?
Flags: needinfo?(aaron.train)
(+CC James just so he's aware of this)
Ugh. This is the first storage request, so it won't have 'learned' any skew from the auth server. (As indeed it should not.)

We should consider retrying info/collections, just as we do auth. And we should make sure that we update SkewHandler from i/c requests.

But more frustratingly: how can we programmatically determine that we're in this state, and it's the result of clock skew not bad credentials?
Flags: needinfo?(rfkelly)
Whiteboard: [qa+]
> But more frustratingly: how can we programmatically determine that we're in this state,
> and it's the result of clock skew not bad credentials?

I can give you a new error number for this purpose.  Shame we don't have the rich JSON error objects of fxa-auth-server or tokenserver, but a plain old number will do the same job.
Flags: needinfo?(rfkelly)
Nick - Reassign as needed
Assignee: nobody → nalexander
tracking-fennec: ? → 29+
Nick, let me know if you want protocol-level changes on the server to make this easier.
Alias: clockskew
Severity: normal → major
The bandaid for this is currently in stage and should hit prod soon; Bug 975214
Is this fix live now on production?
I guess not, I can still reproduce this issue with my devices off by ~3 to ~5 minutes or so.
> Is this fix live now on production?

Nope, sorry, it's been delayed by some other ops backend changes.  I'll work with ops to prioritize getting it out early this week.
Depends on: 979075
Is there an update here?
(In reply to Aaron Train [:aaronmt] from comment #21)
> Is there an update here?

Not a very good one.  We've separated out things that presented like this -- the 2.2 signature issue (Bug 975625) and a large class of issues like this by addressing the server.  I have a pending patch to address the first sync failing due to the client not knowing the server skew that I intend to land today (Bug 979075).

Can we get a QA status update here?  Did this rear its ugly head during the Sync test day?
Flags: needinfo?(aaron.train)
(In reply to Nick Alexander :nalexander from comment #22)
> (In reply to Aaron Train [:aaronmt] from comment #21)
> > Is there an update here?
> 
> Not a very good one.  We've separated out things that presented like this --
> the 2.2 signature issue (Bug 975625) and a large class of issues like this
> by addressing the server.  I have a pending patch to address the first sync
> failing due to the client not knowing the server skew that I intend to land
> today (Bug 979075).
> 
> Can we get a QA status update here?  Did this rear its ugly head during the
> Sync test day?

It did not, but we didn't have good Android 2.2 coverage during the test day, alas -- it was rather sparsely attended.  I'll poke Edwin (he's got a 2.2 device he's been using) and see if he's got anything new.
Not that I'm aware of; our attendee pool was too small. Through personal testing I continue to hit this and bug 980478 on some but not all devices on both mozilla-29 and mozilla-30.
(In reply to Karl Thiessen [:kthiessen] from comment #23)
> (In reply to Nick Alexander :nalexander from comment #22)
> > (In reply to Aaron Train [:aaronmt] from comment #21)
> > > Is there an update here?
> > 
> > Not a very good one.  We've separated out things that presented like this --
> > the 2.2 signature issue (Bug 975625) and a large class of issues like this
> > by addressing the server.  I have a pending patch to address the first sync
> > failing due to the client not knowing the server skew that I intend to land
> > today (Bug 979075).
> > 
> > Can we get a QA status update here?  Did this rear its ugly head during the
> > Sync test day?
> 
> It did not, but we didn't have good Android 2.2 coverage during the test
> day, alas -- it was rather sparsely attended.  I'll poke Edwin (he's got a
> 2.2 device he's been using) and see if he's got anything new.

Sorry, communication breakdown -- we believe that Sync will fail hard with "invalid-signature" for 2.2 devices (all? only certain? devices) due to a busted crypto implementation from Android.  I meant, did we see lots of clockskew issue at the test day.
Flags: needinfo?(aaron.train)
Nick, can we close this, given the RSA/DSA changeover and clock skew elimination?
Flags: needinfo?(nalexander)
(In reply to Richard Newman [:rnewman] from comment #26)
> Nick, can we close this, given the RSA/DSA changeover and clock skew
> elimination?

The DSA changeover hasn't landed, but it is my understanding that this ticket tracks 'invalid-timestamp', not 'invalid-signature'.
Status: NEW → RESOLVED
Closed: 10 years ago
Flags: needinfo?(nalexander)
Resolution: --- → FIXED
(In reply to Nick Alexander :nalexander from comment #27)
> (In reply to Richard Newman [:rnewman] from comment #26)
> > Nick, can we close this, given the RSA/DSA changeover and clock skew
> > elimination?
> 
> The DSA changeover hasn't landed, but it is my understanding that this
> ticket tracks 'invalid-timestamp', not 'invalid-signature'.

... and the client-side timestamp changes in Bug 980478, and the server side timestamp changes (no bug number handy), have addressed this.
Product: Android Background Services → Firefox for Android
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.