Closed Bug 1045706 Opened 10 years ago Closed 10 years ago

Production and stage are not configured to accept msisdn and Fxa assertions

Categories

(Hello (Loop) :: Server, defect, P1)

x86_64
Linux
defect

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: alexis+bugs, Assigned: mostlygeek)

Details

(Whiteboard: [qa+])

Production/Stage Environment: If we configure the FxOS Loop client to use the Production [3] or Stage [4] server, we get a “Bad Gateway” error when trying to log in to the Loop server. That is, it seems we can validate our MSISDN or our FxAccounts, but there is no way to log in to Loop.

We have changed the app origin in that case to be app://loop.stage.mozaws.net or app://loop.services.mozilla.com.

I guess that's because the server is not configured to accept these audiences, so I'll debug that with ops.
Correct values are:

> "fxaAudiences": ["http://loop.services.mozilla.com", "app://loop.services.mozilla.com"],

For production, and

> "fxaAudiences": ["http://loop.stage.mozaws.net", "app://loop.stage.mozaws.net"],

for stage. And for MSISDN assertions:

>    fxaTrustedIssuers: ["msisdn.services.mozilla.com", "api.accounts.firefox.com"]
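To illustrate why a missing audience or issuer blocks login, here is a simplified model of the acceptance check. It is a sketch only, assuming the server compares exact strings against its configured lists; the function name is hypothetical and this is not the loop-server implementation.

```python
# Simplified model (assumption): exact string matching against the configured lists.
# The values are the production ones quoted in the comment above.
PROD_CONFIG = {
    "fxaAudiences": [
        "http://loop.services.mozilla.com",
        "app://loop.services.mozilla.com",
    ],
    "fxaTrustedIssuers": [
        "msisdn.services.mozilla.com",
        "api.accounts.firefox.com",
    ],
}


def is_assertion_accepted(audience, issuer, config):
    """Return True only if both the audience and the issuer are configured.

    Hypothetical helper for illustration; the real check may differ.
    """
    return (
        audience in config["fxaAudiences"]
        and issuer in config["fxaTrustedIssuers"]
    )
```

Under this model, an FxOS client with origin app://loop.services.mozilla.com is accepted only once that exact origin appears in fxaAudiences, and an MSISDN assertion only once its issuer appears in fxaTrustedIssuers.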
How did we get this far without detecting this problem?
Was this introduced with a recent release? Or did we have this issue all along?

REF: https://github.com/mozilla-services/puppet-config/pull/746

Seems like QA should run a test to check configs...
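A config check of this kind could look roughly like the sketch below. It assumes the deployed settings have already been loaded into a dict; the helper name and the EXPECTED_PROD structure are hypothetical, and the expected values are simply the production ones listed earlier in this bug.

```python
# Expected production values, taken from the earlier comment in this bug.
EXPECTED_PROD = {
    "fxaAudiences": [
        "http://loop.services.mozilla.com",
        "app://loop.services.mozilla.com",
    ],
    "fxaTrustedIssuers": [
        "msisdn.services.mozilla.com",
        "api.accounts.firefox.com",
    ],
}


def find_missing(config, expected):
    """Return {key: [missing values]} for expected entries absent from config.

    Hypothetical helper: an empty dict means the config contains everything expected.
    """
    missing = {}
    for key, wanted in expected.items():
        present = set(config.get(key, []))
        absent = [value for value in wanted if value not in present]
        if absent:
            missing[key] = absent
    return missing
```

A deployment check could fail whenever find_missing returns a non-empty dict, which would flag a config that lacks the app:// audience or the MSISDN issuer before a client ever hits it.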
Severity: normal → major
Priority: -- → P1
Whiteboard: [qa+]
Also, :alexis and :natim:
Do we have everything we need listed in the comment above?

Second - do we need to make any changes to any other services, or is this only affecting loop-server?
I have opened appropriate bugs.
:alexis thanks for validating that we need config verifications ;-)

See bug 1045964 for Loop-Server.
See bug 1045966 for MSISDN-Gateway.
Can someone please convince me this is correct?
loop/yaml/app/loop_server.prod.yaml
fxaVerifier: "https://verifier.stage.mozaws.net/v2"

Why not
fxaVerifier: "https://verifier.accounts.firefox.com/v2"

Stage and Prod Verifier are not *always* the same versions.
Especially if we are testing pre-release fixes to Verifier which may or may not actually make it to Prod...
Status: NEW → ASSIGNED
Verifier can be prod for both (our loop stage and loop prod)
Yep you're right.

The smoke tests aren't actually testing the authentication because we don't have any simple way to create FxA assertions on the client…

In order to test this, we should test with a real client I believe. Maybe the upcoming functional tests will solve this problem.

This is normally something that only needs the loop-server to be updated.

(In reply to James Bonacci [:jbonacci] from comment #6)
> Why not
> fxaVerifier: "https://verifier.accounts.firefox.com/v2"

That's actually correct. I pinged mostlygeek yesterday about that and we now have the right values there.


I don't know if that's being deployed yet? Mostlygeek, can you update us?
Flags: needinfo?(bwong)
- stage updated
- prod updated
Flags: needinfo?(bwong)
Verified a new update to Loop-Server Stage:
Versions:
loop-server-svcops 0.9.2-1 x86_64 20712660
puppet-config-loop 20140729184904-1 x86_64 13881

Checking the yamls:
loop_server.prod.yaml

I see two sections:
    # domain of website for fxa verification
    fxaAudiences: 
        - "https://loop.services.mozilla.com"
    fxaVerifier: "https://verifier.accounts.firefox.com/v2"
    fxaTrustedIssuers: 
        - "api.accounts.firefox.com"

and

    fxaAudiences: 
        - "http://loop.services.mozilla.com"
        - "app://loop.services.mozilla.com"
    fxaVerifier: "https://verifier.stage.mozaws.net/v2"
    fxaTrustedIssuers: 
        - "api.accounts.firefox.com"
        - "msisdn.services.mozilla.com"

The "fix" to this file is only for the second set of entries. 
Also, noting that fxaVerifier has different values in the two sections.
:alexis and :natim - please verify that these are as expected.


loop_server.stage.yaml
Again I see two sections:
    # domain of website for fxa verification
    fxaAudiences: 
        - "https://loop.stage.mozaws.net"
    fxaVerifier: "https://verifier.stage.mozaws.net/v2"
    fxaTrustedIssuers: 
        - "auth.stage.mozaws.net"

and

    fxaAudiences: 
        - "http://loop.stage.mozaws.net"
        - "app://loop.stage.mozaws.net"
    fxaVerifier: "https://verifier.stage.mozaws.net/v2"
    fxaTrustedIssuers: 
        - "api-accounts.stage.mozaws.net"
        - "msisdn-dev.stage.mozaws.net"

This last one is unexpected. I would have assumed this:
    fxaTrustedIssuers: 
        - "api-accounts.stage.mozaws.net"
        - "msisdn.stage.mozaws.net"

or even this:
    fxaTrustedIssuers: 
        - "api-accounts.stage.mozaws.net"
        - "msisdn-dev.stage.mozaws.net"
        - "msisdn.stage.mozaws.net"

:alexis and :natim please verify...


loop/yaml/app/loop_server.yaml
This looks as expected (assuming we default to Prod for both Stage and Prod use of this yaml):
    # fxaAudience should be the server's url
    fxaAudiences: 
        - "http://loop.services.mozilla.com"
        - "app://loop.services.mozilla.com"
    fxaVerifier: "https://verifier.accounts.firefox.com/v2"
    fxaTrustedIssuers: 
        - "api.accounts.firefox.com"

But, this does not look right (not connected to this bug):
webAppUrl: "https://call.%{env}.mozaws.net/#call/{token}"

That works for Stage: https://call.stage.mozaws.net
but not for Prod: https://call.mozilla.com
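The mismatch can be seen by expanding the template by hand. This is a minimal sketch that assumes the %{env} placeholder is replaced by a plain string and that the production environment is named "prod"; both the helper and that environment name are assumptions for illustration.

```python
TEMPLATE = "https://call.%{env}.mozaws.net/#call/{token}"


def expand_env(template, env):
    """Substitute the %{env} placeholder; {token} is left for the server to fill in."""
    return template.replace("%{env}", env)


stage_url = expand_env(TEMPLATE, "stage")
# Assumed environment name "prod"; the real value may differ.
prod_url = expand_env(TEMPLATE, "prod")
```

Whatever value env takes, the host stays under mozaws.net, so the template on its own cannot produce the https://call.mozilla.com host used in Prod.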

:mostlygeek please verify...


Moving on to load testing and client to client testing.
Actually, this may not be right either: /data/loop-server/config/settings.json
...etc...
    "fxaTrustedIssuers": [
        "api-accounts.stage.mozaws.net",
        "msisdn-dev.stage.mozaws.net"
    ],

vs.

    "fxaTrustedIssuers": [
        "api-accounts.stage.mozaws.net",
        "msisdn.stage.mozaws.net"
    ],

or even
    "fxaTrustedIssuers": [
        "api-accounts.stage.mozaws.net",
        "msisdn-dev.stage.mozaws.net",
        "msisdn.stage.mozaws.net"
    ],
Yes 

    "fxaTrustedIssuers": [
        "api-accounts.stage.mozaws.net",
        "msisdn-dev.stage.mozaws.net"
    ],

Are the right values.
:natim thanks.
Still need :mostlygeek to walk through the rest of the configs for me.


Did an end-to-end test in the form of a call across two desktops running Aurora (Fx33).
Nightly crashed consistently once the call was accepted.


Holding off on load testing until I can get ahold of our third-party.
My test window is 3pm - 5pm PDT.
So I will wait till then for load testing.

Once the configs are re-verified, though, I can mark this Resolved/Fixed.
Addendum to Comment 14: We cannot verify mobile-to-mobile or mobile-to-desktop calls on Prod due to bug 1042861.

Also:
10-minute and 30-minute low-level load tests against Production were successful.
This should not be a blocker for bug 1042861 (although I might want it to be).
This release was for a specific config change and it is already in Production.

I will open a new bug to unblock bug 1042861.
No longer blocks: 1042861
Load testing against the live server has started.
Testing will be complete around 5pm PDT.
Load testing has completed on Stage. 
Everything looks good.


So, I can Verify this bug once :mostlygeek and :alexis and :natim have a final look at my questions/comments above...
I managed to register a Loop user with the production server, so I am closing this one. Thanks!
Assignee: nobody → bwong
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
James, thanks for the detective work.

I don't understand why you have such values for prod on the webapp url, though. The config says it shouldn't be like that, see https://github.com/mozilla-services/puppet-config/blob/master/loop/yaml/app/loop_server.prod.yaml
:ferjm thanks for the verification.
:alexis not sure, but I will file a separate bug.
Status: RESOLVED → VERIFIED
(In reply to James Bonacci [:jbonacci] from comment #23)
> Filed https://github.com/mozilla-services/puppet-config/issues/750 as a
> follow-up.

Page not found, it says.
:mwargers indeed!
That's because it is locked down.
Title is this: "loop_server.yaml appears to have incorrect generic webAppUrl value for Dev, Stage, Prod use"