Closed Bug 1059787 Opened 7 years ago Closed 7 years ago

Sometimes when trying to sign un with Firefox Accounts we are getting a 401 error

Categories

(Firefox :: Firefox Accounts, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: borjasalguero, Unassigned)

References

Details

(Whiteboard: [qa+][workaround])

Attachments

(3 files, 1 obsolete file)

Sometimes, when trying to sign up with Firefox Accounts in Loop, we are retrieving the following error '{"code":401,"errno":110,"error":"bad signature in chain"}'.
Whiteboard: [qa+]
I traced down the message to:

    ./node_modules/jwcrypto/lib/cert.js:138

But we don't use jwcrypto directly in loop.

This message could be returned by the FxA verifier here in loop/fxa.js:77

Chris — Do you know what can cause this kind of problem?

Borja — Can you confirm this come with Firefox Accounts rather than with MSISDN Gateway?
Flags: needinfo?(ckarlof)
Flags: needinfo?(borja.bugzilla)
I see this error message only generated by a local Persona verifier library for Python:

https://github.com/mozilla/PyBrowserID/blob/master/browserid/verifiers/local.py#L131

Is any part of Loop or the MSISDN verification service using Python that would generate this?
Flags: needinfo?(ckarlof) → needinfo?(rhubscher)
Possibly related to bug 1060490.
(In reply to Rémy Hubscher (:natim) from comment #1)

> Borja — Can you confirm this come with Firefox Accounts rather than with
> MSISDN Gateway?

Yep. This is happening with FxA.
Flags: needinfo?(borja.bugzilla)
Borja,

Can you provide some steps to reproduce? I'm assuming this is from the Loop app on FxOS, using the native FxA login/signup on FxOS, talking to the Loop production server, but please clarify.
Flags: needinfo?(borja.bugzilla)
Hi Chris,
This is not easy to reproduce, but I did when logging in with FxA with a Firefox OS device. It happens to me after playing several times logging in FxA launching the flow with Loop App, and then logging out/logging in from the Settings App. After several times, suddenly Im getting the error:
'{"code":401,"errno":110,"error":"bad signature in chain"}'
From the Loop server when trying to log in with Loop & FxA.
I hope it helps! First thing would be to identify the reasons in the server for getting this error back, and then investigate if this could be a platform issue or not.
Flags: needinfo?(borja.bugzilla)
My hypothesis here is that this is Loop forwarding an error returned by the FxA verifier. I believe the FxA Verifier was last deployed with a 0.4.x version of jwcrypto, and here is what is likely triggering the error: https://github.com/mozilla/jwcrypto/blob/0.4.x/lib/cert.js#L172

Not particularly helpful:

>  // verify the chain
>  verifyChain(certs, now, getRoot, function(err, certParamsArray) {
>    // simplify error message
>    if (err) {
>      // allow through the malformed signature
>      if (err == 'malformed signature' ||
>          err == "assertion issued later than verification date" ||
>          err == "assertion has expired")
>        return cb(err);
>      else
>        return cb("bad signature in chain");
>    }

but we have may some insight into what it's *not*. It is probably some malformed assertion, but it's hard to tell where it's getting borked without some logs.
Sam, you have ever seen this on FxA for FxOS?
Flags: needinfo?(spenrose)
I haven't seen a weird occurrence of errno 110, no.

(In reply to Chris Karlof [:ckarlof] from comment #9)
> Sam, you have ever seen this on FxA for FxOS?
Flags: needinfo?(spenrose)
It looks like we've seen something similar in other contexts as well: https://bugzilla.mozilla.org/show_bug.cgi?id=1045502

It may be a bug with the remote FxA verifier.
Alternatively, it could be a bug on the shared Desktop/FxOS code that generates these assertions. We're working on getting more verbose logging deployed to the verifier to help debug this further:

https://github.com/mozilla/browserid-verifier/issues/59
Ok thank you Chris.
Flags: needinfo?(rhubscher)
Remy,

It would also be helpful if the Loop server could verbosely log the client request (assertion, headers, arams, url, etc.) when this happens so we can try to debug why it is occurring.
Added Bug 1059787 for this
Depends on: 1064971
We are now logging 401 errors request headers to be able to debug this in the future.

I will close for now. As soon as 0.12.0 has been released to production, feel free to open a new bug about this and we will be able to debug it more deeply.

https://github.com/mozilla-services/loop-server/commit/919aec50ec694f0f8f2cc2c52204b412366cd544
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME
ok. I am working on the 0.12.0 release now, so we can check this again next week and possibly file new bugs...
Status: RESOLVED → VERIFIED
Blocks: 1068064
Duplicate of this bug: 1068064
We are investigation the assertion verification error here: https://bugzilla.mozilla.org/show_bug.cgi?id=1059787

Could you please provide us the keypair that you are using with a failing assertion?
Status: VERIFIED → REOPENED
Flags: needinfo?(ferjmoreno)
Resolution: WORKSFORME → ---
:rfkelly and I did some debugging of this last week: https://github.com/mozilla/jwcrypto/issues/95

Notes:

* Alexis provided us with an assertion that verified with a "invalid signature" error. We debugged the verification and it failed the "same way" wrt to the math in two different implementations. For now, we're ruling out a verification bug.
* FxOS assertion generation shares code with desktop sync and we haven't seen these errors in sync.
* Most of the crypto for assertion generation relies on NSS code.

We are operating under the assumption that there is something different between desktop and FxOS assertion generation. Most of the heavy lifting is done in NSS, which is a little opaque to us. As :natim mentioned, getting a keypair that generated an invalid assertion would help us move forward.
(In reply to Rémy Hubscher (:natim) from comment #19)
> We are investigation the assertion verification error here:
> https://bugzilla.mozilla.org/show_bug.cgi?id=1059787
> 
> Could you please provide us the keypair that you are using with a failing
> assertion?

I replied via email. We will provide the keypair along with the failing assertion as soon as we manage to reproduce the issue again.
Flags: needinfo?(ferjmoreno)
Ok thanks a lot Fernando.
Hi Fernando, did you get a chance to send the key pair to Remy via email?  Daniel mentioned that seeing this more frequently.
Flags: needinfo?(ferjmoreno)
Not yet, we are having some issues obtaining these keys. I emailed them with the details.
Flags: needinfo?(ferjmoreno)
Depends on: 1088995
I spent some time reading the client code and didn't see anything wrong, but I have an idea. One key fact appears to be that other FxA mozID consumers work. Does this happen when the loop user has an email address? If someone can reproduce this on a device with the developer apps installed, they should be able to check and see if the assertion is being altered in some way via the following steps:

1) With developer apps installed, open UITests app.
2) Tab "API", menu item "Firefox Accounts mozId".
3) Press request and use the same account info that the Loop client is using.
4) The assertion will be printed; compare to what Loop client sends to Loop server.

I can also look at debugging jwcrypto per bug 1088995, but I do not put my odds of success very high.
Assignee: nobody → spenrose
Assigning to Sam to reproduce on our side.
I tried to reproduce what Borja saw in comment #7 and after ~10 FxA login/logout cycles (plus a few Loop login/logout cycles while still logged into FxA, just for fun) I cannot yet. I can try some more.

However, I would like to know if anyone has seen this in "normal" usage. It's certainly a bug if you have to login/logout 4 times in 2 minutes to trigger it, but maybe it is not a blocker.

Also, do you have to change users to trigger it?
The following is probably not correct, but we could test it. If the "expired" certificate described in the github issue [1] comes from the previous session, then maybe the promise chain for getting an assertion in the new session is resolving before the promise chain for signing out in the old session is destroying the old certificate cache. Here are the steps to test this theory:

1) Set the following prefs to "debug":
  - identity.fxaccounts.loglevel
  - identity.fxaccounts.log.appender.dump

2) Check the logcat for this line from signout [2]:
  - "Signed out"

3) If it comes after this line from getAssertion [3] then we PROBABLY have this problem:
  - "getAssertionFromCert returning signed: <boolean or data>"

To be more sure, we would need to reproduce on a build with more fine-grained logging of signOut, at [4] and maybe one or two other places.

[1] https://github.com/mozilla/browserid-crypto/issues/95#issue-44752033
[2] https://github.com/mozilla/gecko-dev/blob/master/services/fxaccounts/FxAccountsManager.jsm#L298
[3] https://github.com/mozilla/gecko-dev/blob/master/services/fxaccounts/FxAccounts.jsm#L705
[4] https://github.com/mozilla/gecko-dev/blob/master/services/fxaccounts/FxAccounts.jsm#L577
I'm getting this 401 currently - 31 Oct 2014, 09:17 GMT.

I've reproduced this on Nightly & Beta with Fresh profiles for approx the last 5 minutes.

It correctly detects if I enter the wrong password.

Response from the server is:

{"code":401,"errno":110,"error":"Unauthorized","message":"Invalid authentication token in request signature","info":"https://github.com/mozilla/fxa-auth-server/blob/master/docs/api.md#response-format"}
Moved comment 29 out to bug 1092090 as its an issue with signing into the FxA not Loop.
(In reply to Mark Banner (:standard8) from comment #30)
> Moved comment 29 out to bug 1092090 as its an issue with signing into the
> FxA not Loop.

I think you meant Bug 1092080.
Hi Borja,  are you still seeing this after the fix that landed on 10/31 for FxA?  I haven't seen us reproduce this and couldn't reproduce with a new account creation or sign-in/sign-out.

If you can't repro anymore - are we good to close?
Flags: needinfo?(borja.bugzilla)
(In reply to sescalante from comment #32)
> Hi Borja,  are you still seeing this after the fix that landed on 10/31 for
> FxA?  I haven't seen us reproduce this and couldn't reproduce with a new
> account creation or sign-in/sign-out.
> 
> If you can't repro anymore - are we good to close?

I am not reproducing it lately but I would like QA team can confirm it.
Massimo, Javier, please can you check it? it seems to be the same issue reported in bug 1083817. 
(https://bugzilla.mozilla.org/show_bug.cgi?id=1083817#c10)
Thanks a lot!
Flags: needinfo?(mbarone976)
Flags: needinfo?(javier.deprado)
(In reply to sescalante from comment #32)
> Hi Borja,  are you still seeing this after the fix that landed on 10/31 for
> FxA?  I haven't seen us reproduce this and couldn't reproduce with a new
> account creation or sign-in/sign-out.
> 
> If you can't repro anymore - are we good to close?

sescalante (or someone else) can you tell me what "fix landed on 10/31"? We are seeing this in a specific situation on a 2.0 build, and we'd like to know what you think fixed it. Thanks!
Flags: needinfo?(sescalante)
we thought it was a server side fix that resolved the 401 error.  Bug 1092080 - Can't sign into Firefox Accounts - Invalid Token
Flags: needinfo?(sescalante)
Bug 1092080 was a service disruption related to a bad deploy. I suspect it was unrelated to this ongoing and unresolved issue.

I think best leads we have are currently represented in https://bugzilla.mozilla.org/show_bug.cgi?id=1059787#c27 and https://bugzilla.mozilla.org/show_bug.cgi?id=1059787#c28.
Flags: needinfo?(javier.deprado)
Duplicate of this bug: 1097898
All,
I'm sending an HTTP POST to https://verifier.login.persona.org/verify with the body:

"{"assertion": "eyJhbGciOiJSUzI1NiJ9.eyJmeGEtZ2VuZXJhdGlvbiI6MTQxNTIzMDA2OTAyMCwiZnhhLWxhc3RBdXRoQXQiOjE0MTU4MjEzNjYsImZ4YS12ZXJpZmllZEVtYWlsIjoibmRlc2F1bG5pZXJzQG1vemlsbGEuY29tIiwicHVibGljLWtleSI6eyJhbGdvcml0aG0iOiJEUyIsInkiOiJlN2JkOTU1NTEyNjgxMjZiZDgyNGRjZjU0NWE1MGFkODM0NGU5ZDU1YWRhMjM3NWM4OWU0YTlkY2EyMjRkYzJhNDY2YTBhM2M1MjgzZDEyNGY3OTcyOThlNWY4ZmVmYjE4OGE2NjAwMmExNTZlNDFjMTI0NjY1N2FlOGU4Y2I0NWIxNWEyZjI1OWY4MTIwNTU4OGUwMDZmY2E0ZjU4OWJiMzQzMmZiOThjZWI0ODRkMWY2Yjk3MjU4NDg5Yjc4MDA3YzBkZDI5NTBhZTVlNGU1Y2ZmMzgyNTI3ZDFhYzUzMzBjM2U3OTM4ZTgxYmRmMDU2OWU0ZWVjZGJkNDY1NTAwIiwicCI6ImZmNjAwNDgzZGI2YWJmYzViNDVlYWI3ODU5NGIzNTMzZDU1MGQ5ZjFiZjJhOTkyYTdhOGRhYTZkYzM0ZjgwNDVhZDRlNmUwYzQyOWQzMzRlZWVhYWVmZDdlMjNkNDgxMGJlMDBlNGNjMTQ5MmNiYTMyNWJhODFmZjJkNWE1YjMwNWE4ZDE3ZWIzYmY0YTA2YTM0OWQzOTJlMDBkMzI5NzQ0YTUxNzkzODAzNDRlODJhMThjNDc5MzM0MzhmODkxZTIyYWVlZjgxMmQ2OWM4Zjc1ZTMyNmNiNzBlYTAwMGMzZjc3NmRmZGJkNjA0NjM4YzJlZjcxN2ZjMjZkMDJlMTciLCJxIjoiZTIxZTA0ZjkxMWQxZWQ3OTkxMDA4ZWNhYWIzYmY3NzU5ODQzMDljMyIsImciOiJjNTJhNGEwZmYzYjdlNjFmZGYxODY3Y2U4NDEzODM2OWE2MTU0ZjRhZmE5Mjk2NmUzYzgyN2UyNWNmYTZjZjUwOGI5MGU1ZGU0MTllMTMzN2UwN2EyZTllMmEzY2Q1ZGVhNzA0ZDE3NWY4ZWJmNmFmMzk3ZDY5ZTExMGI5NmFmYjE3YzdhMDMyNTkzMjllNDgyOWIwZDAzYmJjNzg5NmIxNWI0YWRlNTNlMTMwODU4Y2MzNGQ5NjI2OWFhODkwNDFmNDA5MTM2YzcyNDJhMzg4OTVjOWQ1YmNjYWQ0ZjM4OWFmMWQ3YTRiZDEzOThiZDA3MmRmZmE4OTYyMzMzOTdhIn0sInByaW5jaXBhbCI6eyJlbWFpbCI6IjI3YjE5OTE2NjkxZjRlZjU4YmVlMzU0YzA4Yjk1NTljQGFwaS5hY2NvdW50cy5maXJlZm94LmNvbSJ9LCJpYXQiOjE0MTU4MjEzNTc0NjIsImV4cCI6MTQxNTg0Mjk2NzQ2MiwiaXNzIjoiYXBpLmFjY291bnRzLmZpcmVmb3guY29tIn0.SOV7s4n81AuvVh3Ehdar8zPG7emeWVf_pzCy5vckR_Qm2ns523yoTJ-o0pnwTUkSPxN10sWbREqgABGAjtfnCWQv4NMTjMNaLcB472m-NPQs40zSqeUeJWGIB3XylImOukg-W4D42-ACDOaVe_-UgsiG1OS_y672A-iIOyialhKJP-4AD_HC25sPJo-D8BYA6TWm38g0A6STLgg907h-oFf9qtE67DNGTNE5dnRjWFiG_7pWDmVq3uHGrqt9n6T5ElFtMfBOHKvgp4OLkHrF3FOX9C7DPZX1MWdi4PgaDq6v5cq1DtMPT029NTBX5gh-U1GaxE0LNM34FwnwO7pFWw~eyJhbGciOiJEUzEyOCJ9.eyJleHAiOjIyMDQyMjEzNjc3NjQsImF1ZCI6ImFwcDovL3doZXJlcy1teS13YXRlci5meG9zLmdhbWVzLmRpc25leS5jb20ifQ==.cq60pg7jpO1Qad2PuzODCPmXTYV2daXrZiuh36V1YbPyY-dE3VL_zg==", "audience": "app://wheres-my-water.fxos.games.disney.com/index.html"}" __init.js:24

Getting a response status code and body:
"Response Status: 200" __init.js:24

"{"status":"failure","reason":"invalid signature"}"
Nick, can you give us build information? In particular, did you see this on 2.0 latest? Chris if this is still live I'm thinking it might be time to ask for some NSS expertise.
Flags: needinfo?(nick)
Sure, and this is intermittent by the way.
fxos 2.0 b2g 32 eng build v188 base image with newer full flame-kk image flashed on top.
Build identifier: 20141112000204
Flags: needinfo?(nick)
I discussed with Sam and here's where are:

We're reasonably confident that verification is happening correctly (https://github.com/mozilla/browserid-crypto/issues/95), so we're focusing on the client. We've never seen this issue in Sync, which shares code with FxOS for generating assertions and generates a large number of them in production. This could be a FxOS only issue.

Next steps are for Sam to help people encountering this error to provide more information:

1) Log the public key in the signed cert and the public key used to sign the leaf assertion (they should match). 
2) Log the private key, which should help us recompute the signatures off device to debug any issues with the signature generation.

We may need help from SecEng on the latter one.
We are running into this same issue in the Where's My Water when trying to do In App Purchases. We get an Invalid Signature error after what appeared to be a successful login. 


{"status":"failure","reason":"invalid signature"}" __init.js:24

"Verification service did not return a valid email. Full response: {"status":"failure","reason":"invalid signature"}" __init.js:24

and the app is unable to continue with a purchase. I had to quit the app and relaunch before it would prompt me for a new sign in and complete the purchase.
Seeing lots of these.  This is probably going to prevent Disney QA from signing off on IAPs if 1 in 4 or 1 in 8 authentication attempts fail.
Nick, last time Sam attempted to reproduce, he couldn't. If you're able to make it happen 1 in 4 times, can you please work with Sam so he can reliably reproduce locally himself?
Flags: needinfo?(spenrose)
Flags: needinfo?(nick)
I am seeing this specifically when navigator.mozId.watch triggers a password prompt (see #1028398).  navigator.mozId.request was never called in this instance.  FxA calls:

1. navigator.mozId.watch
1a. user logs in
2. navigator.mozId onlogin fires
2a. my app waits until onready to try to contact the verifier service
3. navigator.mozId onready fires
4. My app sends a POST to https://verifier.login.persona.org/verify
5. receive response "{"status":"failure","reason":"invalid signature"}".

This doesn't mean that this never happens when navigator.mozId.request is called, or at least I can't rule it out just yet.

I'm not sure what else I can do to help you guys reproduce this.  Since it's intermittent, I have only caught it in my game, but not yet a simplified test case: https://bugzilla.mozilla.org/attachment.cgi?id=8524827
Flags: needinfo?(nick)
"{"assertion": "eyJhbGciOiJSUzI1NiJ9.eyJmeGEtZ2VuZXJhdGlvbiI6MTQxNTIzMDA2OTAyMCwiZnhhLWxhc3RBdXRoQXQiOjE0MTYzNDQwMDAsImZ4YS12ZXJpZmllZEVtYWlsIjoibmRlc2F1bG5pZXJzQG1vemlsbGEuY29tIiwicHVibGljLWtleSI6eyJhbGdvcml0aG0iOiJEUyIsInkiOiIxODJiNDM2MjA4ZmYxZDBkYTgyMzJhZjFiZWU0NDM1NDZlODBkMTEyZDZlMWUwODA4YWY4ODU2MWU5MDUyYTJiNDc5YjAyYmRlMGZiM2RmOWU0NzkxODk5NjM5ODE0MGUxMWZmZDAxMjQ3YWM0ZmJhZDUwMTE4NWUzZDFlMmFmNWQ3OTk3ZDY3MWQ1MzM4MWYxNDE0OGJhM2Q1NjZiZjlhZWJkMmI1YjY1YWRiYTk1ODllOGVhOGJlNTkwNjBmMjU5M2ExYWM2ODA4MTliM2IwNDAyNTc5YjMxYWNkNmMxMTA4ZGFiYjAyNDkzZGQwYzE1YmJlNGFkMDhlMDk4MGE4IiwicCI6ImZmNjAwNDgzZGI2YWJmYzViNDVlYWI3ODU5NGIzNTMzZDU1MGQ5ZjFiZjJhOTkyYTdhOGRhYTZkYzM0ZjgwNDVhZDRlNmUwYzQyOWQzMzRlZWVhYWVmZDdlMjNkNDgxMGJlMDBlNGNjMTQ5MmNiYTMyNWJhODFmZjJkNWE1YjMwNWE4ZDE3ZWIzYmY0YTA2YTM0OWQzOTJlMDBkMzI5NzQ0YTUxNzkzODAzNDRlODJhMThjNDc5MzM0MzhmODkxZTIyYWVlZjgxMmQ2OWM4Zjc1ZTMyNmNiNzBlYTAwMGMzZjc3NmRmZGJkNjA0NjM4YzJlZjcxN2ZjMjZkMDJlMTciLCJxIjoiZTIxZTA0ZjkxMWQxZWQ3OTkxMDA4ZWNhYWIzYmY3NzU5ODQzMDljMyIsImciOiJjNTJhNGEwZmYzYjdlNjFmZGYxODY3Y2U4NDEzODM2OWE2MTU0ZjRhZmE5Mjk2NmUzYzgyN2UyNWNmYTZjZjUwOGI5MGU1ZGU0MTllMTMzN2UwN2EyZTllMmEzY2Q1ZGVhNzA0ZDE3NWY4ZWJmNmFmMzk3ZDY5ZTExMGI5NmFmYjE3YzdhMDMyNTkzMjllNDgyOWIwZDAzYmJjNzg5NmIxNWI0YWRlNTNlMTMwODU4Y2MzNGQ5NjI2OWFhODkwNDFmNDA5MTM2YzcyNDJhMzg4OTVjOWQ1YmNjYWQ0ZjM4OWFmMWQ3YTRiZDEzOThiZDA3MmRmZmE4OTYyMzMzOTdhIn0sInByaW5jaXBhbCI6eyJlbWFpbCI6IjI3YjE5OTE2NjkxZjRlZjU4YmVlMzU0YzA4Yjk1NTljQGFwaS5hY2NvdW50cy5maXJlZm94LmNvbSJ9LCJpYXQiOjE0MTYzNDM5OTE5NDcsImV4cCI6MTQxNjM2NTYwMTk0NywiaXNzIjoiYXBpLmFjY291bnRzLmZpcmVmb3guY29tIn0.IS1oWkZST03_AWYjS3aCmsj_ZuG_XqkDIHWYR1-icq86E9qJsHPghcT3TE35GGOrtYHlEAw-JG6xaM2nFR8d0OMl35AvInjMnebGBRPJmMJiXY4HVpRKiCr8MuNFY_hpS0_cLx4RSsnwkCf5qPle6RYeoACvoX5S1gLSk50iMhBc_iYiKTOd7U67f-WRCDIEMitUnuczzoVVSRiU1eo-X00nNIWaq4opAhW1ehLQOFgOqntGyE-WdkBFOOttBU2poRf3LS3hkc3MIN02p5bM50f86nY9li5mSzttBiO6_LraiwMwzwG6jYInQgb_JI1oBIRljpA1GJosKK8nCUaeqg~eyJhbGciOiJEUzEyOCJ9.eyJleHAiOjIyMDQ3NDQzMzAwMjIsImF1ZCI6ImFwcDovL3doZXJlcy1teS13YXRlci5meG9zLmdhbWVzLmRpc25leS5jb20ifQ==.WusNJV80OsL7UWOuJ63AgJ2JubnMh6j8gcpn0DWQ2r0TzOHluW6sZA==", "audience": "app://wheres-my-water.fxos.games.disney.com/index.html"}" __init.js:24

my assertion, if it helps (maybe finding the audience in the server logs?)
./bin/check-assertion "eyJhbGciOiJSUzI1NiJ9.eyJmeGEtZ2VuZXJhdGlvbiI6MTQxNTIzMDA2OTAyMCwiZnhhLWxhc3RBdXRoQXQiOjE0MTYzNDQwMDAsImZ4YS12ZXJpZmllZEVtYWlsIjoibmRlc2F1bG5pZXJzQG1vemlsbGEuY29tIiwicHVibGljLWtleSI6eyJhbGdvcml0aG0iOiJEUyIsInkiOiIxODJiNDM2MjA4ZmYxZDBkYTgyMzJhZjFiZWU0NDM1NDZlODBkMTEyZDZlMWUwODA4YWY4ODU2MWU5MDUyYTJiNDc5YjAyYmRlMGZiM2RmOWU0NzkxODk5NjM5ODE0MGUxMWZmZDAxMjQ3YWM0ZmJhZDUwMTE4NWUzZDFlMmFmNWQ3OTk3ZDY3MWQ1MzM4MWYxNDE0OGJhM2Q1NjZiZjlhZWJkMmI1YjY1YWRiYTk1ODllOGVhOGJlNTkwNjBmMjU5M2ExYWM2ODA4MTliM2IwNDAyNTc5YjMxYWNkNmMxMTA4ZGFiYjAyNDkzZGQwYzE1YmJlNGFkMDhlMDk4MGE4IiwicCI6ImZmNjAwNDgzZGI2YWJmYzViNDVlYWI3ODU5NGIzNTMzZDU1MGQ5ZjFiZjJhOTkyYTdhOGRhYTZkYzM0ZjgwNDVhZDRlNmUwYzQyOWQzMzRlZWVhYWVmZDdlMjNkNDgxMGJlMDBlNGNjMTQ5MmNiYTMyNWJhODFmZjJkNWE1YjMwNWE4ZDE3ZWIzYmY0YTA2YTM0OWQzOTJlMDBkMzI5NzQ0YTUxNzkzODAzNDRlODJhMThjNDc5MzM0MzhmODkxZTIyYWVlZjgxMmQ2OWM4Zjc1ZTMyNmNiNzBlYTAwMGMzZjc3NmRmZGJkNjA0NjM4YzJlZjcxN2ZjMjZkMDJlMTciLCJxIjoiZTIxZTA0ZjkxMWQxZWQ3OTkxMDA4ZWNhYWIzYmY3NzU5ODQzMDljMyIsImciOiJjNTJhNGEwZmYzYjdlNjFmZGYxODY3Y2U4NDEzODM2OWE2MTU0ZjRhZmE5Mjk2NmUzYzgyN2UyNWNmYTZjZjUwOGI5MGU1ZGU0MTllMTMzN2UwN2EyZTllMmEzY2Q1ZGVhNzA0ZDE3NWY4ZWJmNmFmMzk3ZDY5ZTExMGI5NmFmYjE3YzdhMDMyNTkzMjllNDgyOWIwZDAzYmJjNzg5NmIxNWI0YWRlNTNlMTMwODU4Y2MzNGQ5NjI2OWFhODkwNDFmNDA5MTM2YzcyNDJhMzg4OTVjOWQ1YmNjYWQ0ZjM4OWFmMWQ3YTRiZDEzOThiZDA3MmRmZmE4OTYyMzMzOTdhIn0sInByaW5jaXBhbCI6eyJlbWFpbCI6IjI3YjE5OTE2NjkxZjRlZjU4YmVlMzU0YzA4Yjk1NTljQGFwaS5hY2NvdW50cy5maXJlZm94LmNvbSJ9LCJpYXQiOjE0MTYzNDM5OTE5NDcsImV4cCI6MTQxNjM2NTYwMTk0NywiaXNzIjoiYXBpLmFjY291bnRzLmZpcmVmb3guY29tIn0.IS1oWkZST03_AWYjS3aCmsj_ZuG_XqkDIHWYR1-icq86E9qJsHPghcT3TE35GGOrtYHlEAw-JG6xaM2nFR8d0OMl35AvInjMnebGBRPJmMJiXY4HVpRKiCr8MuNFY_hpS0_cLx4RSsnwkCf5qPle6RYeoACvoX5S1gLSk50iMhBc_iYiKTOd7U67f-WRCDIEMitUnuczzoVVSRiU1eo-X00nNIWaq4opAhW1ehLQOFgOqntGyE-WdkBFOOttBU2poRf3LS3hkc3MIN02p5bM50f86nY9li5mSzttBiO6_LraiwMwzwG6jYInQgb_JI1oBIRljpA1GJosKK8nCUaeqg~eyJhbGciOiJEUzEyOCJ9.eyJleHAiOjIyMDQ3NDQzMzAwMjIsImF1ZCI6ImFwcDovL3doZXJlcy1teS13YXRlci5meG9zLmdhbWVzLmRpc25leS5jb20ifQ==.WusNJV80OsL7UWOuJ63AgJ2JubnMh6j8gcpn0DWQ2r0TzOHluW6sZA=="
==== cert ====
issuer: api.accounts.firefox.com
principal: {"email":"27b19916691f4ef58bee354c08b9559c@api.accounts.firefox.com"}
iat: Tue Nov 18 2014 12:53:11 GMT-0800 (PST) (1416343991947)
exp: Tue Nov 18 2014 18:53:21 GMT-0800 (PST) (1416365601947)
(seeking public key! trying to fetch .well-known from api.accounts.firefox.com)
(using a median timestamp to ensure no timestamp failure)
cert is properly signed

==== assertion ====
{ header: { alg: 'DS128' },
  payload:
   { exp: 2204744330022,
     aud: 'app://wheres-my-water.fxos.games.disney.com' },
  signature: '5aeb0d255f343ac2fb5163ae27adc0809d89b9b9cc87a8fc81ca67d03590dabd13cce1e5b96eac64',
  headerSegment: 'eyJhbGciOiJEUzEyOCJ9',
  payloadSegment: 'eyJleHAiOjIyMDQ3NDQzMzAwMjIsImF1ZCI6ImFwcDovL3doZXJlcy1teS13YXRlci5meG9zLmdhbWVzLmRpc25leS5jb20ifQ==',
  cryptoSegment: 'WusNJV80OsL7UWOuJ63AgJ2JubnMh6j8gcpn0DWQ2r0TzOHluW6sZA==' }
audience: app://wheres-my-water.fxos.games.disney.com
expires: Sat Nov 12 2039 12:58:50 GMT-0800 (PST)
FATAL: assertion was likely issued after cert expired
(verifying with an expiration date that should be valid for this assertion.)
FATAL: assertion is NOT properly signed: VerificationError: invalid signature
The date on my device is correct.  The time is incorrectly ahead by 3 hours, does that make a difference?
I would guess no, since the difference between issued at (iat) and expiration (exp) for the cert is 6 hours.
This assertion is consistent with other bad assertions I've seen: the signature on the leaf assertion is not verifying.

Some reasons this may be happening:

1) There is a bug in the verification code. This has been ruled out because we have never seen this error in Sync, which also uses BiD assertions generated from this shared code, and we've reproduced the verification code with two different verifier implementations: https://bugzilla.mozilla.org/show_bug.cgi?id=1059787#c41 (I notice that you are using the persona verifier instead of https://verifier.accounts.firefox.com/v2, but they should be roughly the same.)

2) We're using a different public key to sign the leaf assertion than what's included in the signed cert. Sam is currently working on some debugging code to investigate this theory. A related reason is that data passed to the hashing algorithm differs in some way from the data included in the assertion.

3) There is a bug the signature generating code. The heavy lifting of signatures generation is done in NSS and the private key used to generate the signatures is not exposed to the calling code. We may need help from the Crypto Eng team to figure out what code we'd need to include in a build to test this theory. We may wait to see the result of 2) before investigating this, because since we've never seen this in Sync, this would mean a FxOS specific NSS. Possible, but I'm skeptical. 

The crucial next step is to get a build with the extra debugging into the hands of someone who can reproduce this, if Sam can't reproduce himself.
> FATAL: assertion was likely issued after cert expired

This is spurious, and certainly not fatal.  The code in check-assertion that prints this is a pretty strange check:

    if (components.payload.exp > (exp + 5000))
      console.log("FATAL: assertion was likely issued after cert expired");

It's basically "the assertion expires significantly later than the certificate expires", but AFAICT this was done on purpose and won't cause an actual verification failure.

> FATAL: assertion is NOT properly signed: VerificationError: invalid signature

This is the one that's worrying.
Chris, the bug appears to be associated with logging out of FxA and then quickly logging back in. Nick was calling mozId.logout() (which shouldn't do anything in an FxA rather than Persona context -- see bug 1101131 ) inside his onerror() and oncancel() handlers; I suspect removing that call will reduce the impact on WMW. Meantime I'm going to work on the public key logging.
Flags: needinfo?(spenrose)
I added Edwin to the cc list to help with coordinating this effort.
:\n can you confirm comment 52 resolves this issue?
Flags: needinfo?(nick)
Out of band I received a patch that prints private keys, and I've dumped the corresponding public keys. Now we need to reproduce when those keys are printing. I've spent the day attempting to retrigger the issue and failing. We have a couple of attacks left:

1) Attack the intermittent STR success issue by scripting repeated logins, basing them on https://github.com/mozilla-b2g/gaia/tree/master/apps/system/fxa/test/marionette. We'd need to rewrite the test to use the live server. Also, it's possible that by moving to marionette from the pair of privileged apps in which we've seen the problem, we will make it impossible to reproduce.

2) Get a build with the patches installed on a device / situation that can reproduce them.
I have now remove all calls to navigator.mozId.logout from both games, yet I still run into this error.
Flags: needinfo?(nick)
Status update: I'm stuck getting a working build with the debugging patches applied. I have a few more cycles to spend on it today, and will continue tomorrow.
Attached file bad_signatures.txt
Got a device that's getting bad signatures pretty consistently (looks like 100% of the time).  Here are the assertions.
I have a 2.0 build which prints keys, but I can't reproduce the problem yet.
Flags: needinfo?(borja.bugzilla)
:\n can you print your phone version info.  I'd assume you're on b2g2.0 head revision on flame-kk with base image v188 or greater.  You can get the base image this way:
https://gist.github.com/edmoz/0db36eaa5d5cea0a036e#file-gistfile1-txt

can you repro this with the settings > fxa sign in flow.
Attached file test_case.zip (obsolete) —
This test app has been know to reproduce the invalid signature when side loaded.
We spent much of Friday attempting and failing to load our production app from a Marketplace instance. Towards the end of the day, Krupa put test_case on Marketplace and I attempted, and failed, to reproduce the problem. Unfortunately I have just now side-loaded test_case onto my Flame, and also failed to reproduce. On Friday the 14th I was able to reproduce with test_case side-loaded onto a 2.0 shallow flash. So at this point we know:

1) 2.0 latest + side-load does NOT reproduce, at least not easily.
2) 2.0 Nov 14 + side-load did reproduce.

And viable theories include:

1) Some subtle recent change in 2.0 fixed the issue.
2) Some other factor that varies from run to run but that we haven't yet identified is the underlying cause.
3) Reproduction is subject to some confounding factor that makes frequency very erratic, making manual reproduction an ineffective Q/A method (a subset of 2).

I will do more debugging on the crypto routines, but that is slow work.
tl;dr: Low reproducible, causes bad UX due to error message, might be related to side-loading.

In-app workaround:
Alternative #1: Let user retry when 401 is encountered.
1) Remove error message and ask user to sign in again due to a network issue.

Alternative #2: Switch to Firefox Accounts web flow (Risky, needs user state management in-app with Logout)
Flags: needinfo?(nick)
Whiteboard: [qa+] → [qa+][workaround]
For #1: Verify by testing: Sign the user out when a 401 encounters to flush the bad state.
Attached file test_case.zip
The old test case was for a separate bug and did not even reach out to the verifier.  This one does, and I've used it to repro the issue, though it takes a few tries.  The idea is:

1. sideload app via app manager
2. sign out of FxA using settings
3. run app
4. if signed in, close app, repeat 2-3 until invalid signature occurs.
Flags: needinfo?(nick)
Attachment #8527098 - Attachment is obsolete: true
> Alternative #2: Switch to Firefox Accounts web flow (Risky, needs user state management in-app with Logout)

I agree this would be risky at this stage in the game.
I think Ryan fixed that, Ryan, can you confirm and point to the right bug if any?
Flags: needinfo?(rfkelly)
I don't believe I've fixed anything related to this bug...
Flags: needinfo?(rfkelly)
Flags: needinfo?(mbarone976)
It's possible markh ran in to an instance of this bug on Desktop Sync:

10:00 PM <markh> hrm - something strange is happening - I just *created* an FxA account, verified it, and sync *immediately* went into the "needs reauth" state!
10:01 PM <markh> heh - and re-auth isn't helping!
10:01 PM — markh scratches head
10:01 PM <rfkelly> markh we've seen a few reports of weirdness around that lately, e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=1112962
10:01 PM <firebot> Bug 1112962 — UNCONFIRMED, nobody@mozilla.org — Sync has issues authenticating account name and password on FF Start
10:01 PM <markh> yeah
10:02 PM <markh> my logs are showing I can get keys etc, but the sync token is failing
10:03 PM <markh> fetching the token that is
10:03 PM <rfkelly> invalid-client-state maybe?
10:04 PM <markh> I don't think so - just a 401 - http://pastebin.mozilla.org/8102545
10:06 PM <markh> rfkelly: the complete log is at http://pastebin.mozilla.org/8102546 - note that as soon as we see we are verified we fetch all the keys and cert - but then get that 401 fetching the token
10:06 PM <rfkelly> oh, plain old invalid-credentials
10:06 PM <rfkelly> IIRC that means the assertion was not valid
10:07 PM <markh> then at line 69 I reentered the password, which was accepted by accounts.firefox.com - but then remained in that exact same state
10:08 PM <rfkelly> any chance you can capture the outgoing assertion in the tokenserver request?
10:10 PM <markh> rfkelly: is that the post to https://oauth-latest.dev.lcip.org/v1/authorization?
10:11 PM <rfkelly> you shouldn't be posting to oauth as part of sync
10:12 PM <rfkelly> I mean the req to token.services.mozilla.com
10:12 PM <markh> heh - I *am* posting to that URL
:markh and I ended up debugging this in private IRC channel, sorry I should have circled back to #sync with the resolution - he was using assertions issued by latest.dev.lcip.org but submitting them to the prod tokenserver at token.services.mozilla.com.  The issuer mismatch was causing the 401s in Mark's case.
(In reply to Ryan Kelly [:rfkelly] from comment #71)
> :markh and I ended up debugging this in private IRC channel, sorry I should
> have circled back to #sync with the resolution 

open always wins
Ryan / Chris — 
 - Do you have any update on this bug?
 - Were you able to reproduce and track it down?
 - Do you need anymore information from us or the FxOS team?
 - Can we close the bug?
Flags: needinfo?(rfkelly)
Flags: needinfo?(ckarlof)
It never got any clear resolution. As far as I know it has only appeared in a semi-reproducible:

* in WMW testing by :\n
* in Loop FxOS testing

There was some suggestion from :spenrose's investigation that it may be related to how these apps were loaded in a dev environment, but we're not sure. It's been difficult to reproduce, and I haven't heard about it being a significant problem at this time, so it's currently not a priority for us.
Flags: needinfo?(ckarlof)
This is not directly related to the Loop-Server so I am closing it.
Status: REOPENED → RESOLVED
Closed: 7 years ago7 years ago
Flags: needinfo?(rfkelly)
Resolution: --- → WONTFIX
Component: Server → FxAccounts
Product: Loop → Core
Summary: [Loop][Server] Sometimes when trying to sign un with Firefox Accounts we are getting a 401 error → Sometimes when trying to sign un with Firefox Accounts we are getting a 401 error
Assignee: spenrose → nobody
WONTFIX is appropriate for now, but I moved it over to Core::FxAccounts.
Product: Core → Firefox
You need to log in before you can comment on or make changes to this bug.