China FxA stack points to dev instance of channelserver in GCP
Categories: Mozilla China :: General, defect
Tracking: Not tracked
People: Reporter: hectorz; Assignee: Unassigned
Details
Originally a slack thread, filed as a bug per :rfkelly's suggestion:
We (Mozilla Online) didn't set up the channelserver because the default dev/nonprod one works for our internal testing.
Maybe we should start self-hosting it now that iOS will use it in prod, and switch to the prod one in GCP before our own is ready.
Second message:
... I just realized I won't be able to switch the channelserver uri as easily as I thought.
Updating the fxa-content config alone won't refresh the pref cached in desktop Fx, and a mismatch between its value in desktop Fx and in fxa-content might mean the pairing flow stops working.
I also doubt it's a good idea to send China iOS users to the dev channelserver instance in GCP. ...
If the said switch to our own instance is indeed necessary, (I'd like to know) if it's possible to hold the release of the pairing flow (in Fx iOS, and at least for those using China FxA) for a while, so that I have the chance to release an extension update to force update the desktop pref?
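To make the mismatch concern concrete, here is a minimal sketch of the comparison involved. The function names and the normalization rules are assumptions for illustration only; neither Firefox nor fxa-content-server is known to perform this check.

```typescript
// Hypothetical helpers, not taken from Firefox or fxa-content-server.

/**
 * Normalize a channelserver URI so that trivial differences (case, surrounding
 * whitespace, trailing slashes) do not count as a mismatch.
 * Assumption: the URI has no case-sensitive path component.
 */
function normalizeChannelServerUri(uri: string): string {
  return uri.trim().toLowerCase().replace(/\/+$/, "");
}

/** True when desktop and fxa-content would meet on the same channelserver. */
function channelServersMatch(desktopUri: string, contentUri: string): boolean {
  return normalizeChannelServerUri(desktopUri) === normalizeChannelServerUri(contentUri);
}
```

If the cached desktop value and the fxa-content value differ under this comparison, the two sides would open their channels on different servers and never find each other.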
Third message:
Or maybe there could be some combination of fxa-content configurations which could make the pairing flow fail gracefully?
Again, thank you for your opinions on this and sorry for the trouble caused by this misconfiguration on our side.
Comment 1 • 5 years ago
Maybe we should start self-hosting it now that iOS will use it in prod, and switch to the prod one in GCP before our own is ready.
I'm sure you know how to find this, but noting for completeness: the default production channelserver is at:
- wss://channelserver.services.mozilla.com
IIRC it's hosted in GCP, unlike the rest of the default prod FxA stack.
a mismatch between its value in desktop Fx and in fxa-content might mean the pairing flow stops working.
Agreed, I don't think the pairing flow will work if Desktop doesn't agree on the same channelserver URL as fxa-content.
I tested this out a bit locally, by starting a pairing flow and then editing the channel_id parameter in the resulting URL. The result was roughly:
- The desktop side showed the QR code, then sat waiting for the mobile client to connect, which it never did. It eventually timed out and showed an error message.
- The mobile side loaded fxa-content, which very quickly showed an error message saying "Error while creating the pairing channel". You can see the resulting UI by visiting this link.
So it does fail, but at least it fails quickly I guess?
Or maybe there could be some combination of fxa-content configurations which could make the pairing flow fail gracefully?
I had a bit of a think about this, but I wasn't able to come up with any suggestions.
We could teach the fxa-content side of the pairing flow to try multiple different channelserver instances in turn, but that sounds like just as much work as shipping the config change via the desktop addon.
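For reference, the "try multiple channelserver instances in turn" idea could look roughly like the sketch below. It is only an illustration: `connectToFirstAvailable` and the `connect` callback are hypothetical names, and `connect` is assumed to resolve only once the pairing channel is actually established on that server, and to reject otherwise.

```typescript
// Sketch only. `connect` is a caller-supplied, hypothetical function; it is not
// part of fxa-content-server or fxa-pairing-channel.
async function connectToFirstAvailable<T>(
  serverUris: readonly string[],
  connect: (uri: string) => Promise<T>
): Promise<{ uri: string; channel: T }> {
  const failures: string[] = [];
  for (const uri of serverUris) {
    try {
      // Attempts are sequential: the next server is tried only after this one fails.
      const channel = await connect(uri);
      return { uri, channel };
    } catch (err) {
      failures.push(`${uri}: ${String(err)}`);
    }
  }
  // Every candidate rejected the channel, so surface one combined error.
  throw new Error(`No channelserver accepted the channel. ${failures.join("; ")}`);
}
```

The list would hold the new self-hosted instance and the previously configured one, so a desktop client that still has the old cached pref can be reached on the second attempt.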
Comment 2 • 5 years ago (Reporter)
(In reply to Ryan Kelly [:rfkelly] from comment #1)
Maybe we should start self-hosting it now that iOS will use it in prod, and switch to the prod one in GCP before our own is ready.
I'm sure you know how to find this, but noting for completeness: the default production channelserver is at:
- wss://channelserver.services.mozilla.com
IIRC it's hosted in GCP, unlike the rest of the default prod FxA stack.
Yes, I'm aware.
Jiechen has set up our own channelserver instance for testing, and to my knowledge bug 1530520 is the only place non-default config values are mentioned.
a mismatch between its value in desktop Fx and in fxa-content might mean the pairing flow stops working.
Agreed, I don't think the pairing flow will work if Desktop doesn't agree on the same channelserver URL as fxa-content.
I tested this out a bit locally, by starting a pairing flow and then editing the channel_id parameter in the resulting URL. The result was roughly:
- The desktop side showed the QR code, then sat waiting for the mobile client to connect, which it never did. It eventually timed out and showed an error message.
- The mobile side loaded fxa-content, which very quickly showed an error message saying "Error while creating the pairing channel". You can see the resulting UI by visiting this link.
So it does fail, but at least it fails quickly I guess?
Or maybe there could be some combination of fxa-content configurations which could make the pairing flow fail gracefully?
I had a bit of a think about this, but I wasn't able to come up with any suggestions.
We could teach the fxa-content side of the pairing flow to try multiple different channelserver instances in turn, but that sounds like just as much work as shipping the config change via the desktop addon.
Thanks for the analysis above. I'm still working on the extension patch, and our desktop QA should be able to test it later today or tomorrow morning.
After shipping both the fxa-content config change and the extension update, I'll see what additional updates to fxa-content I can make (either try multiple channelserver instances as you described, or advise the user to check whether their extension is up to date on this particular error).
Comment 3 • 5 years ago (Reporter)
Friday afternoon status updates:
(In reply to Hector Zhao [:hectorz] from comment #2)
......
Jiechen has set up our own channelserver instance for testing, and to my knowledge bug 1530520 is the only place non-default config values are mentioned.
Our mobile/service QA Bingqing has finished testing it. We'll update the fxa-content config to expose it next Monday.
...... I'm still working on the extension patch, and our desktop QA should be able to test it later today or tomorrow morning.
Our desktop QA Yanfang Liu has finished testing the above-mentioned fix. It relies on the server change, so we'll release it next Tuesday at the earliest.
Comment 4 • 5 years ago (Reporter)
(In reply to Hector Zhao [:hectorz] from comment #3)
(In reply to Hector Zhao [:hectorz] from comment #2)
Jiechen has set up our own channelserver instance for testing, and to my knowledge bug 1530520 is the only place non-default config values are mentioned.
Our mobile/service QA Bingqing has finished testing it. We'll update the fxa-content config to expose it next Monday.
It's live: fxa-content-server & channelserver
Comment 5 • 5 years ago (Reporter)
(In reply to Hector Zhao [:hectorz] from comment #3)
(In reply to Hector Zhao [:hectorz] from comment #2)
...... I'm still working on the extension patch, and our desktop QA should be able to test it later today or tomorrow morning.
Our desktop QA Yanfang Liu has finished testing the above-mentioned fix. It relies on the server change, so we'll release it next Tuesday at the earliest.
This is also live: updates.json. It will take about two weeks to reach the majority of our users.
Comment 6 • 5 years ago (Reporter)
(In reply to Hector Zhao [:hectorz] from comment #2)
......
- The mobile side loaded fxa-content, which very quickly showed an error message saying "Error while creating the pairing channel". You can see the resulting UI by visiting this link.
The error message is not shown when visiting /pair/failure directly, but /pair/supp works.
......
After shipping both the fxa-content config change and the extension update, I'll see what additional updates to fxa-content I can make (either try multiple channelserver instances as you described, or advise the user to check whether their extension is up to date on this particular error).
The above-mentioned error message is not localized; we're testing a quick and dirty patch that translates it and also advises users to sign in with email and password instead.
Comment 7 • 5 years ago (Reporter)
(In reply to Hector Zhao [:hectorz] from comment #6)
(In reply to Hector Zhao [:hectorz] from comment #2)
......
- The mobile side loaded fxa-content, which very quickly showed an error message saying "Error while creating the pairing channel". You can see the resulting UI by visiting this link.
The error message is not shown when visiting /pair/failure directly, but /pair/supp works.
It looks like the channel id here, "cYN4AVH9lehM3pc5kla1pB", happens to trigger an "Invalid ChannelID specified" warning and a 404 error in channelserver, and the "Error while creating the pairing channel" message is shown after an "error" event is received.
But in our testing with a real device, we more often see a successful 101 upgrade to WebSocket, which is then closed immediately after a server-side "Attempt to connect to unknown channel" message. This leaves the fxa-content page spinning forever, seemingly because fxa-pairing-channel lacks any "close" event handler.
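The two failure modes can be modelled as a small state machine. This is a rough sketch with hypothetical names, not the fxa-pairing-channel code; it assumes the first server message is what proves the channel exists, so a "close" that arrives before it counts as a failure rather than being ignored.

```typescript
// Illustrative only: these types and functions are hypothetical and are not the
// fxa-pairing-channel API.
type ChannelState = "connecting" | "open" | "established" | "failed" | "closed";
type SocketEvent = "open" | "message" | "error" | "close";

function nextChannelState(state: ChannelState, event: SocketEvent): ChannelState {
  // Terminal states never change.
  if (state === "failed" || state === "closed") return state;
  switch (event) {
    case "open":
      // A 101 upgrade alone is not success; the server may still reject the channel.
      return state === "connecting" ? "open" : state;
    case "message":
      // Assumption: the first server message confirms the channel exists.
      return state === "open" ? "established" : state;
    case "error":
      // Covers the 404 case, where the handshake itself fails.
      return "failed";
    case "close":
      // Covers the 101-then-close case: a close before establishment is a failure.
      return state === "established" ? "closed" : "failed";
  }
}

/** Replay a sequence of socket events from the initial "connecting" state. */
function replay(events: readonly SocketEvent[]): ChannelState {
  let state: ChannelState = "connecting";
  for (const event of events) {
    state = nextChannelState(state, event);
  }
  return state;
}
```

Here `replay(["open", "close"])` ends in "failed", which is the point where the page could show the "Error while creating the pairing channel" message instead of spinning. In a browser, each transition would be driven from the matching WebSocket event listener.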
Comment 8 • 5 years ago (Reporter)
(In reply to Hector Zhao [:hectorz] from comment #6)
......
After shipping both the fxa-content config change and the extension update, I'll see what additional updates to fxa-content I can make (either try multiple channelserver instances as you described, or advise the user to check whether their extension is up to date on this particular error).
The above-mentioned error message is not localized; we're testing a quick and dirty patch that translates it and also advises users to sign in with email and password instead.
This went live ~20 hrs ago. I'm resolving this bug as FIXED.
(In reply to Hector Zhao [:hectorz] from comment #7)
......
But in our testing with a real device, we more often see a successful 101 upgrade to WebSocket, which is then closed immediately after a server-side "Attempt to connect to unknown channel" message. This leaves the fxa-content page spinning forever, seemingly because fxa-pairing-channel lacks any "close" event handler.
I sent our patch as a PR in case you'd like to take it.