Closed Bug 785045 Opened 12 years ago Closed 12 years ago

"Upgrade Required" response for Sync 2.0

Categories

(Cloud Services :: General, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rfkelly, Assigned: rfkelly)

References

Details

(Whiteboard: [qa?])

Attachments

(1 obsolete file)

Bug 714304 introduced a kind of "upgrade required" response for Sync1.1.  We should spec  something similar for Sync2.0.  No code required in storage-syncserver because these rules would be done by the load balancer, but we need something in the spec so clients know what to expect.
Whiteboard: [qa?]
Blocks: 720964
In the spirit of Bug 784592, I propose something like:

    403 Forbidden

    { 'status': 'error',
      'errors': [{'location': 'header',
                  'name': 'User-Agent',
                  'reason': 'invalid',
                  'description': 'You user-agent is not supported, please upgrade'}]}
Assignee: nobody → rfkelly
OS: Linux → All
Hardware: x86 → All
Summary: "Upgrade Required" response for sync 2.0 → "Upgrade Required" response for Sync 2.0
Maybe should we provide a list of User Agents that are supported? Just like this, the response isn't that useful to the app developer.
(In reply to Alexis Metaireau (:alexis) from comment #2)
> Maybe should we provide a list of User Agents that are supported? Just like
> this, the response isn't that useful to the app developer.

Alas, the whole point is to allow blocklisting of misbehaving clients. We don't know what they are until they misbehave.

This is basically a neat and tidy alternative to a Zeus rule that sends a 503 with a really long Retry-After for certain UAs.
Then, why asking them to … upgrade? I would directly say to them: "This user-agent is not supported and had been blocked until further investigation". 

And if they are throttled, for instance, tell them when they will be authorized to query again. I think throttling is a good solution to avoid having them to send too much queries. The Retry-After thing sounds a thing that would help, yes.

Also, I'm wondering if a Zeus rule would not be faster than putting this directly in the codebase, as you said?
(In reply to Alexis Metaireau (:alexis) from comment #4)
> Then, why asking them to … upgrade? I would directly say to them: "This
> user-agent is not supported and had been blocked until further
> investigation". 

Stats suggest that lots of users run really old versions. We still have Firefox 8 clients, and active versions of Android Sync from Q1.

If one of those is acting badly for some reason -- as, for example, desktop does when it has a huge list of records to delete; Bug 736390 -- we want to ask them to upgrade as soon as a new version is available.

Simply telling them "your client is blocked" isn't actionable. It's not the user's fault that their client is misbehaving.

> And if they are throttled, for instance, tell them when they will be
> authorized to query again.

You're thinking like a developer :D

The typical recipient of this message is a user who's wondering why their client isn't syncing. The Android Sync version of this in 1.1 ties directly into the same logic that we use for storage version checking, which on desktop throws up a yellow bar saying that you need to upgrade.

> I think throttling is a good solution to avoid
> having them to send too much queries. The Retry-After thing sounds a thing
> that would help, yes.

Throttling on the server doesn't stop us getting huge payloads hundreds of times a second (an ops concern, I hear). Throttling on the client requires working client backoff logic, which we *know* can fail -- Firefox 7 taught us that lesson.

This is a simple and permanent Band-Aid: go away, client, and come back when you don't suck. Easy to get right. If we get it wrong, we have a backup.

Retry-After would persist across an upgrade, so we can't just say "go away for a year".

> Also, I'm wondering if a Zeus rule would not be faster than putting this
> directly in the codebase, as you said?

This is a protocol design; as Ryan pointed out in Comment 0:

"No code required in storage-syncserver because these rules would be done by the load balancer".
Assignee: rfkelly → nobody
Component: Server: Sync → General
Assignee: nobody → rfkelly
Status: NEW → ASSIGNED
If a lot of people need to upgrade, then that's probably our best option, I agree. The messages I was thinking about were messages for the User-Agent to use and (in my mind) translate to end-users (that's why I was talking about technical things in this response). 

I'm not sure this is a common practice, or even something we would like, though; and if it's not, then that's probably better to directly send the right message from the server.

The best option, in term of feedback seems to have a list of these bad UA on the server and to do some work at creating the right messages for the different UA, but it sounds like a lot of work. The solution you're proposing sounds a good catch-all :)
Alexis, I think you're right that a more detailed description is better, and it would ideally depend on which user-agent was blocked in the first place.

So someone with a too-old version of Desktop Firefox might see "Firefox 17 is no longer support, please upgrade to Firefox 20 or newer" but a mobile user would get a different message.

From a protocol point of view, intent to identify this particular error by the three data pieces {'location': 'header', 'name': 'User-Agent', 'reason': 'invalid'} and leave the precise description message unspecified.  We then have the flexibility to make it as descriptive as we can under whatever circumstances happen to need it.
Blocks: 784592
No longer blocks: 784592
Depends on: 784592
Attaching concrete docs proposal, which depends on the docs patch from Bug 784592.
Attachment #660310 - Flags: review?(rnewman)
(In reply to Alexis Metaireau (:alexis) from comment #6)
> If a lot of people need to upgrade, then that's probably our best option, I
> agree. The messages I was thinking about were messages for the User-Agent to
> use and (in my mind) translate to end-users (that's why I was talking about
> technical things in this response). 

I don't think the server response text would be directly presented to users; we don't have enough information on the server to determine the client language, and there's no translation facility available on the client.

The message we print here is more an indication of the semantics of the response, and for users to see in the logs.

Making it detailed is great, but the main aim is to make it actionable and clear, rather than scary.

(I learned over the past two years about the category of users who know enough to find logs, and maybe even report bugs, but also aren't sophisticated enough, so that some kinds of error messages scare them. "WARNING: DATA LOSS" is one example. It's those people I'd be targeting.)

> The best option, in term of feedback seems to have a list of these bad UA on
> the server and to do some work at creating the right messages for the
> different UA, but it sounds like a lot of work.

We can't really hope to localize these, unless we force clients to send Accept-Language. And even then it's probably unnecessary; the client just needs to see some kind of simple indicator of the problem, at which point it can present the correct localized UI.

Like Ryan says: yeah, make the message clear, but the important part is the machine-readability and simplicity.

So on that topic:

Having to have logic like

    if (response.jsonBody.reason == "invalid" &&
        response.jsonBody.location == "header" &&
        response.jsonBody.name.equalsIgnoreCase("User-Agent")) {
      // I'm an expired user agent!
    }

seems like a really complicated way to do this.

Crazy idea:

    if (response.jsonBody.reason == "upgrade-required") {
      ...
    }

? That allows us to use this response regardless of how we determine that your client is outdated, and it's way simpler for clients to get right. Yes, this adds another entry to the enumeration, but I think it counts as a top-level wrongness…
(In reply to Richard Newman [:rnewman] from comment #9)
>
> Crazy idea:
> 
>     if (response.jsonBody.reason == "upgrade-required") {
>       ...
>     }
> 

I've been trying to avoid having an unbounded list of specific error conditions, be they numeric codes or pre-defined strings.  But perhaps that instinct is ultimately counter-productive.

The conditions-required response in tokenserver is a similar example, where "reason == conditions-required" would be much simpler than inspecting multiple fields on the error object.

Ideally I would just squat a 4XX status code for each of these, but that's not going to happen.

I think I'll just wait for :gps to weigh in here before I go too far down this train of thought :-)
Comment on attachment 660310 [details] [diff] [review]
docs patch defining upgrade-required response

Let's make that happen!

Yeah, the issue is a tension between what one might term "precision" (that is, using similar mechanisms to report similar kinds of things) and more prosaic engineering issues, like "make critical error handling code super simple".
Attachment #660310 - Flags: feedback?(gps)
Comment on attachment 660310 [details] [diff] [review]
docs patch defining upgrade-required response

Review of attachment 660310 [details] [diff] [review]:
-----------------------------------------------------------------

Still waiting for input from gps, but here's a typo!

::: source/storage/apis-2.0.rst
@@ +575,5 @@
> +    The server refused to fulfill the request, for reasons other than invalid
> +    user credentials.
> +
> +    This response may be used to refuse service to clients with known problems
> +    or incompatabilities.  The JSON error response in this case will include

incompatibilities
Comment on attachment 660310 [details] [diff] [review]
docs patch defining upgrade-required response

Review of attachment 660310 [details] [diff] [review]:
-----------------------------------------------------------------

A machine readable 403 is definitely the right response from the server. The question is whether we're doing it right.

I generally consider user agent sniffing evil. It's obviously not a security feature since user agent strings can be easily spoofed. So, what we are trying to build is a mechanism to lock out old or misbehaving clients at a finer granularity than account level. And, one of these scenarios is one we want the client to react to so the user may take action to rectify it.

For what we are using it for, I think user agent sniffing is a necessary evil. The alternative is additional metadata on every request (e.g. the client sends a features set or encodes something similar in an Accept* header).

I do have beef with the machine readability of the response as documented. I know rfkelly said he wants to avoid enumeration of error classes. But from a client perspective, this is what's needed. "invalid" doesn't quite cut it. Now, if you say the 3-tuple of ("header", "User-Agent", "invalid") is unique to "upgrade required" barring a protocol version bump, then I'm sold.
Attachment #660310 - Flags: feedback?(gps) → feedback+
(In reply to Gregory Szorc [:gps] from comment #13)

> I generally consider user agent sniffing evil. It's obviously not a security
> feature since user agent strings can be easily spoofed. So, what we are
> trying to build is a mechanism to lock out old or misbehaving clients at a
> finer granularity than account level.

See my Comment 9:

"That allows us to use this response regardless of how we determine that your client is outdated, and it's way simpler for clients to get right."

> For what we are using it for, I think user agent sniffing is a necessary
> evil. The alternative is additional metadata on every request (e.g. the
> client sends a features set or encodes something similar in an Accept*
> header).

Even so, what if version 0.4 and version 0.5 both send the same features, but version 0.4 has an awful bug? If we rev our versions appropriately, we stand a chance at fine-grained blocking.

I don't think the question is about whether we can use User-Agents for this kind of thing. It's whether we want to use a complex JSON blob to indicate this kind of error situation, and whether we think it's bad to 'lie' in that blob if we want to trigger this logic for problems *not* detected by User-Agent parsing.
(In reply to Gregory Szorc [:gps] from comment #13)
> I do have beef with the machine readability of the response as documented. I
> know rfkelly said he wants to avoid enumeration of error classes.

Yes, but I'm slowly being convinced otherwise by this discussion and the similar one with conditions-required.  These are not nitty-gritty little errors with the same character as "oh you sent a bad value for X-If-Unmodified-Since".  They're part of the flow of the protocol.

> But from a client perspective, this is what's needed. "invalid" doesn't quite cut it.
> Now, if you say the 3-tuple of ("header", "User-Agent", "invalid") is unique
> to "upgrade required" barring a protocol version bump, then I'm sold.

What if we appropriate the "status" field and use that for a (very very short!) list of enumerated protocol-flow-level responses?  Example:

    403 Forbidden

    { 'status': 'upgrade-required',
      'errors': [{'location': 'header',
                  'name': 'User-Agent',
                  'reason': 'invalid',
                  'description': 'You user-agent is not supported, please upgrade'}]}

The client can decide what to do by switching on the single value in "status".  The further details are available to help debug or diagnose the problem.
(In reply to Ryan Kelly [:rfkelly] from comment #15)

> What if we appropriate the "status" field and use that for a (very very
> short!) list of enumerated protocol-flow-level responses?

That would work for me, I think.
(In reply to Richard Newman [:rnewman] from comment #16)
> (In reply to Ryan Kelly [:rfkelly] from comment #15)
> 
> > What if we appropriate the "status" field and use that for a (very very
> > short!) list of enumerated protocol-flow-level responses?
> 
> That would work for me, I think.

Ditto.
Comment on attachment 660310 [details] [diff] [review]
docs patch defining upgrade-required response

Marking that patch as obsolete…
Attachment #660310 - Attachment is obsolete: true
Attachment #660310 - Flags: review?(rnewman)
I have folded the changes from this bug into the docs patch in Bug 784592, since it is needed in order to explain the new way things are being done in that bug.
Subsumed by Bug 784592, which is now resolved.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: