Many FxA device registrations from Desktop are failing with an "invalid parameter" error
Categories
(Firefox :: Sync, defect, P3)
Tracking
 | Tracking | Status
---|---|---
firefox82 | --- | fixed
People
(Reporter: rfkelly, Assigned: markh)
Attachments
(1 file)
According to these numbers that I pulled from BigQuery, almost 75% of POST /v1/account/device requests from Firefox Desktop fail with an "Invalid Parameter" error. That's a lot! It also seems to be specific to desktop, because if I look at the equivalent numbers for e.g. Fenix then less than 1% of such requests fail in this way.
From looking at the individual request logs, it appears that some users are stuck in a loop of trying to update their device registration, failing, then trying again later.
What "invalid parameter" could desktop be submitting that causes so many failures? Unfortunately the available logs don't have many clues, but some suggestions:
- The submitted device name must be 255 chars or less, and there are a handful of unicode characters that aren't allowed therein. Maybe some users have really long default device names?
- The push callback URL is validated to match /^https:\/\/updates\.push\.services\.mozilla\.com(\/.*)?$/, maybe we're sometimes generating URLs that do not match this format?
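The two checks suggested above can be approximated on the client side to see which parameter would be rejected. This is only a rough sketch: the regex is the one quoted above, but `invalid_device_params` is a hypothetical helper, and the disallowed-character set is a placeholder (it contains only U+FFFD as an illustration), since the server's real list is not given here.

```python
import re

# Pattern quoted in the report. fullmatch() plays the role of the ^...$ anchors.
PUSH_CALLBACK_RE = re.compile(r"https://updates\.push\.services\.mozilla\.com(/.*)?")

MAX_NAME_LENGTH = 255
# Assumption: placeholder for the server's real disallowed set, which is not listed here.
DISALLOWED_NAME_CHARS = {"\ufffd"}


def invalid_device_params(name, push_callback):
    """Return the names of the parameters that would fail the checks above."""
    problems = []
    too_long = len(name) > MAX_NAME_LENGTH
    bad_char = any(ch in DISALLOWED_NAME_CHARS for ch in name)
    if too_long or bad_char:
        problems.append("name")
    if PUSH_CALLBACK_RE.fullmatch(push_callback) is None:
        problems.append("pushCallback")
    return problems
```

Reporting only the parameter name (not its value) would keep anything sent to telemetry free of user data.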
Unfortunately the validation errors happen within the machinery of the FxA server framework, making them a little fiddly to log on the server side. I wonder if there's anything we can do to log extra data about this on the client (particularly in client telemetry).
Comment 1•5 years ago (Reporter)
> it appears that some users are stuck in a loop of trying to update their device registration, failing, then trying again later.
Regardless of the underlying reason, we should also add some sort of circuit-breaker to avoid infinite retry loops here.
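For illustration, a circuit-breaker of this kind could look roughly like the following. Everything here is hypothetical (the class, the `max_failures` threshold, and `ValueError` standing in for a 400 "invalid parameter" response); it is a sketch of the idea, not the Desktop client's code.

```python
class RegistrationCircuitBreaker:
    """Stops further attempts after too many consecutive failures (hypothetical helper)."""

    def __init__(self, max_failures=3):
        self.max_failures = max_failures
        self.consecutive_failures = 0

    def should_attempt(self):
        return self.consecutive_failures < self.max_failures

    def record_success(self):
        self.consecutive_failures = 0

    def record_failure(self):
        self.consecutive_failures += 1


def register_with_breaker(register, breaker):
    """Call register() unless the breaker is open; return True on success."""
    if not breaker.should_attempt():
        return False
    try:
        register()
    except ValueError:  # stand-in for a 400 "invalid parameter" response
        breaker.record_failure()
        return False
    breaker.record_success()
    return True
```

With `max_failures=3`, a client whose registration is always rejected makes three requests and then stops, instead of retrying indefinitely. A real version would probably also reset the breaker when the device name changes.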
Comment 2•5 years ago (Reporter)
From a bit of experimenting, we think this might be related to handling of unicode in the default device name, as described in Bug 1396498.
I followed the steps therein to put a "ß" character in my computer's hostname, then signed in to sync in a fresh profile. Sync now thinks my device name should be "ryan’s Nightly on pomelo" (that's the unicode replacement character) and is getting 400 errors when trying to register its device record.
Mark suggested we should check the resulting string for validity and fall back to a sensible default if it's invalid utf8.
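A minimal sketch of that check, assuming the hostname is available as raw bytes: decode strictly as UTF-8 and fall back if decoding fails or the result already contains U+FFFD. The function name and the `"computer"` fallback are made up for the example.

```python
REPLACEMENT_CHAR = "\ufffd"


def safe_hostname(raw_hostname, fallback="computer"):
    """Return the hostname as text, or the fallback if it is not clean UTF-8."""
    try:
        host = raw_hostname.decode("utf-8")  # strict: raises on invalid byte sequences
    except UnicodeDecodeError:
        return fallback
    if REPLACEMENT_CHAR in host:
        # An earlier lossy conversion already damaged the name.
        return fallback
    return host
```

For example, a hostname containing "ß" encoded as Latin-1 (`b"pomelo\xdf"`) is not valid UTF-8 and would get the fallback, while the same name encoded as UTF-8 passes through unchanged.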
Comment 3•5 years ago (Reporter)
(I'll also add that if this is the source of the errors, fixing it is unlikely to help us with our send-tab metrics, because devices that fail to upload their device record will not be eligible for send-tab at all. It will be preventing these users from using send-tab, but preventing it at a stage before we even start measuring whether a tab sent to the device arrives successfully.)
Comment 4•5 years ago
The severity field is not set for this bug.
:teshaq, could you have a look please?
For more information, please visit auto_nag documentation.
Comment 5•5 years ago (Assignee)
Talking to the l10n team and Ryan, we think the best approach is to have the server allow the replacement chars, and for this bug to turn into "replace all other invalid chars with the replacement char, and limit the length to 255 chars". So I opened an fxa issue for that.
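As a rough sketch of that client-side behaviour: the thread does not say which characters count as invalid, so the `is_invalid` predicate below (Unicode "Other" categories such as control characters) is purely an assumption for illustration, as is the function name.

```python
import unicodedata

REPLACEMENT_CHAR = "\ufffd"
MAX_NAME_LENGTH = 255


def is_invalid(ch):
    # Assumption: treat Unicode "Other" categories (control, format, surrogate,
    # private use, unassigned) as invalid. The server's real rule may differ.
    return unicodedata.category(ch).startswith("C")


def sanitize_device_name(name):
    """Replace invalid characters with U+FFFD, then cap the length at 255 chars."""
    cleaned = "".join(REPLACEMENT_CHAR if is_invalid(ch) else ch for ch in name)
    return cleaned[:MAX_NAME_LENGTH]
```

U+FFFD itself falls in category "So", so names that already contain it pass through unchanged, which matches the plan of having the server accept that character.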
Comment 6•5 years ago (Assignee)
but probably close to what we want once https://github.com/mozilla/fxa/issues/6269 is fixed?
Comment 8•5 years ago
bugherder
Comment 9•5 years ago (Reporter)
ni? myself to come back to this in a week or two and see if it helped, once the FxA server-side piece ships and the client-side piece has been in Nightly for a while.
Comment 10•5 years ago (Reporter)
I re-ran the numbers and the situation here is substantially improved.
There was a huge change in success rate (from ~25% to ~75%) on the 1st of September, which corresponds to the server starting to accept the unicode replacement character in device names. So I guess we had many devices that were already trying to use the replacement character in their device name.
The client-side fix is not in release yet, but if I restrict the query to just Firefox 82 then we're seeing basically zero "invalid parameter" errors, a substantial improvement over Firefox 81. Overall I think this has worked out very nicely!