Closed Bug 714304 Opened 14 years ago Closed 13 years ago

Implement handling of "upgrade required" response

Categories

(Firefox for Android Graveyard :: Android Sync, defect, P3)

Product:

Component:

Platform:

ARM

Android

Type:

defect

Priority:

P3

Severity:

normal

Tracking

(firefox14 fixed, blocking-fennec1.0 +)

Status:

RESOLVED FIXED

Milestone:

mozilla15

Tracking Flags:

Tracking

Status

firefox14

---

fixed

blocking-fennec1.0

---

+

People

(Reporter: rnewman, Assigned: liuche)

References

Details

(Whiteboard: [qa+])

Attachments

(2 files)

Doc change. 14 years ago Richard Newman [:rnewman] 1.74 KB, patch	telliott : review+	Details \| Diff \| Splinter Review
Patch. 13 years ago Richard Newman [:rnewman] 29.45 KB, patch	rnewman : review+ blassey : approval-mozilla-aurora+	Details \| Diff \| Splinter Review

Richard Newman [:rnewman]

Reporter

Description

•

14 years ago

I want to make sure that "orphaned" Sync accounts on Android devices don't end up being a punishing load on our infrastructure. We intend to have a get-out clause for this by conditionally returning HTTP 426 Upgrade Required, and have all Android Sync clients watch for this and self-disable. Alongside auto-update checks, this should keep our poor infra safe.

Gregory Szorc [:gps]

Comment 1

•

14 years ago

Do we use HTTP 426 today? If not, I have hesitations about introducing it. My main concern is I don't feel it is appropriate for our use. AFAICT 426 is intended to be utilized for switching the protocol on the channel (e.g. to TLS or to some other non-HTTP protocol). I don't believe we would be doing this, so the use of 426 isn't appropriate.

Comment 2

•

14 years ago

Please suggest something better.

Richard Newman [:rnewman]

Reporter

Comment 3

•

14 years ago

(In reply to Gregory Szorc [:gps] from comment #1) > Do we use HTTP 426 today? If not, I have hesitations about introducing it. I don't believe we do. > My main concern is I don't feel it is appropriate for our use. AFAICT 426 is > intended to be utilized for switching the protocol on the channel (e.g. to > TLS or to some other non-HTTP protocol). I don't believe we would be doing > this, so the use of 426 isn't appropriate. You are correct that existing uses of 426 are typically associated with an Upgrade: header for protocol switching. That said, it's semantically closest to what we are asserting to the client: "you need to make some change in how you make requests before we will proceed". The alternative would be to send some other 4xx or 5xx response with something like an X-Weave-Piss-Off header, which I consider to be less meaningful. I did my research and discussed with atoll before settling on this as "good enough". I would be ecstatic to find something better. Suggestions welcome. Requirements: be understood by clients to mean "client isn't good enough, don't try again"; doesn't require parsing another header to be understood; doesn't collide with some other Sync code.

Comment 4

•

14 years ago

I think gps's concern is valid, since 426 will be consumed by Necko someday before it ever reaches us, if we do implement it.

Gregory Szorc [:gps]

Comment 5

•

14 years ago

I'm not sure exactly what you are trying to do here. But, I'll throw out the following suggestions: 1) For REST services, you can expose newer instances/versions of the service under a new URI space. Old/incompatible clients that hit an unsupported API get a 3XX or 410 response. 2) Bake some kind of version exchange in the headers. Some will argue URIs should be used for this in a REST service. 3) Return HTTP 403 Forbidden. The response body could contain JSON that tells clients to disable. Either way, it sounds like this will require a new Sync HTTP storage API version, no?

Richard Newman [:rnewman]

Reporter

Comment 6

•

14 years ago

(In reply to Gregory Szorc [:gps] from comment #5) > I'm not sure exactly what you are trying to do here. I'm attempting to identify a response which it's acceptable for a client to interpret as "never sync again". Background, which I was too lazy to put in Comment 0: because Sync now runs independently of Fennec, we will gradually accrue users who install Firefox on their Android device, set up Sync, and then never use or upgrade Fennec again. Unlike XUL Sync, their Sync client will be happily running in the background for so long as their phone is on, until they uninstall Fennec. Given the number of Weave 0.1ish users we still have, this is something I want to avoid. Old clients will have bugs, security issues, or might otherwise benefit from being told to go away. We could do that with ordinary 4xx or 5xx responses, but then we end up with the situation we have now -- old Sync 1.7 clients that make Request URI Too Long requests over and over, getting a 400 back and retrying a little later. The solution is to have a way of simply and unambiguously saying "go away and don't come back". Backoff isn't suited for this (easily mis-handled or lost by the client, can't be distinguished as "upgrade required"). > Either way, it sounds like this will require a new Sync HTTP storage API > version, no? This is exactly what I want to avoid, because I want to do this *this week*, and I want this to be implementable in Zeus. That is, it should be as simple as if (userAgent == "AndroidSync 0.1") { response.status = 498; response.addHeader("X-Weave-Alert", "Piss off and upgrade"); // Yadda yadda. return; } That means a decent hack now, and a proper solution later. Toby and I have already discussed this for Sync 2.0, I think. "Decent hack" seems to imply a status code or header that's not already in use, and won't be, and minimal header parsing. We *could* squat a code -- HTTP 468 Piss Off -- but using Upgrade Required without an Upgrade header seemed quite straightforward.

Gregory Szorc [:gps]

Comment 7

•

14 years ago

Since you are looking for a decent hack now and that hack only needs to work on the Android client you are currently writing, I would recommend a custom HTTP header over 426 or custom HTTP status code. Just because it is a hack doesn't give you leeway to disobey HTTP specs. I would also combine this HTTP header with a 403 response, since that has the semantics most closely related to what you are trying to accomplish. Of course, the client would look for the header first and the 403 should never come into play. I would recommend making the header name as specific as possible. e.g. X-Weave-Client-Upgrade-Needed instead of X-Weave-Stop-Requesting: Your client is too old. A proper solution can be devised when the storage API version is bumped.

Richard Newman [:rnewman]

Reporter

Comment 8

•

14 years ago

atoll, what are you thoughts? You suggested a status code over a header, so I want to make sure your base concerns are addressed.

Comment 9

•

14 years ago

HTTP/1.1 400 Bad Request X-Weave-Alert: ERR_UPGRADE_REQUIRED X-Weave-Backoff: 2419200 8 Where 8 is the numeric error code labeled ERR_UPGRADE_REQUIRED in the wiki, and triggers localized upgrade text.

Richard Newman [:rnewman]

Reporter

Comment 10

•

14 years ago

wfm.

Summary: Implement handling of HTTP 426 → Implement handling of "upgrade required" response

Richard Newman [:rnewman]

Reporter

Comment 11

•

14 years ago

Attached patch Doc change. — Details — Splinter Review

Camping this response code.

Assignee: nobody → rnewman

Status: NEW → ASSIGNED

Attachment #585021 - Flags: review?(telliott)

Gregory Szorc [:gps]

Comment 12

•

14 years ago

While I don't like using HTTP 400, we use it today and in a very similar manner. So, I think this is a sufficient stop-gap until the next sync protocol version is cut.

Toby Elliott [:telliott]

Comment 13

•

14 years ago

Comment on attachment 585021 [details] [diff] [review] Doc change. Approach seems least disruptive for now. I'm assuming that we're just going to handle this at the zeus layer. We didn't discuss anything formal for this in the 2.0 spec, though we should. We have the concept of "account needs purging", but nothing akin to "gtfo"

Attachment #585021 - Flags: review?(telliott) → review+

Richard Newman [:rnewman]

Reporter

Comment 14

•

14 years ago

Doc change landed: https://hg.mozilla.org/services/docs/rev/79125b1cedf9 This bug will track the Android Sync implementation. We can address the Zeus bit should that ever be necessary.

Tony Chung [:tchung]

Updated

•

13 years ago

tracking-fennec: --- → ?

Richard Newman [:rnewman]

Reporter

Updated

•

13 years ago

Priority: -- → P3

Brad Lassey [:blassey] (use needinfo?)

Comment 15

•

13 years ago

rnewman, my understanding is we need this for 12, not necessarily 11. Please confirm.

tracking-fennec: ? → 12+

Richard Newman [:rnewman]

Reporter

Comment 16

•

13 years ago

Yup, not something I'm tracking for a particular release; we'll just get it in as soon as we can, so we don't have to block User-Agents for bad versions.

Richard Newman [:rnewman]

Reporter

Updated

•

13 years ago

Blocks: 745431

Allison Naaktgeboren :ally

Updated

•

13 years ago

tracking-fennec: 12+ → ---

blocking-fennec1.0: --- → +

Damon Sicore (:damons)

Comment 17

•

13 years ago

I'm terrified of all of these sync bugs where we are still implementing features. Is this critical to our release? Why must this bug block if it were the last bug?

Richard Newman [:rnewman]

Reporter

Comment 18

•

13 years ago

(In reply to Damon Sicore (:damons) from comment #17) > I'm terrified of all of these sync bugs where we are still implementing > features. Is this critical to our release? Why must this bug block if it > were the last bug? We discussed this at Monday's triage meeting. Speaking generally: All of these items are in the warpath now because we spent the last two or three months implementing super-duper high-priority product desires like form history sync, rather than finishing the protocol implementation. I do not want to ship a Sync client that has known gaps and undocumented/untested/unaddressed behavior in sizable portions of its state machine. That leads to situations where we have to make a hard decision like "permanently block *all* Fennec clients from syncing until they upgrade", because the alternative is possible data loss or account damage. Speaking specifically: This particular bug is to hard-stop a client if it's outdated and known-bad (which, coincidentally, is the approach we'd take for the above). The alternative is for us to have a more expensive 503 barrier permanently configured into our LBs, which is not something I want to impose on services ops. This is low-effort, low-risk, high-reward.

Allison Naaktgeboren :ally

Updated

•

13 years ago

Whiteboard: [not started]

Allison Naaktgeboren :ally

Updated

•

13 years ago

Assignee: rnewman → liuche

Whiteboard: [not started] → [work started]

Richard Newman [:rnewman]

Reporter

Comment 19

•

13 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/9882d0013809 Gonna flag this for Aurora approval momentarily, because we wont address it in triage until Monday.

Whiteboard: [work started]

Target Milestone: --- → mozilla15

Richard Newman [:rnewman]

Reporter

Comment 20

•

13 years ago

Attached patch Patch. — Details — Splinter Review

Attachment #621228 - Flags: review+

Attachment #621228 - Flags: approval-mozilla-aurora?

Richard Newman [:rnewman]

Reporter

Comment 21

•

13 years ago

QA: to verify this you'll need ops help. First: adb shell setprop log.tag.SyncAccounts VERBOSE Set up a Sync account against stage. You'll see in Settings that Fennec is checked in the Sync account pane, which means it's syncing. Verify that syncing has occurred normally. Now ask ops to configure the stage setup to return Upgrade Required (code 16 on a 400). Start capturing a log on the device, and initiate a sync. You should see the sync fail immediately, with the Fennec checkbox being *unchecked*. You'll see in the log "Setting authority ... to not sync automatically." Automatic syncs will now *not* occur after any duration. Forced syncs can be used if you first check the checkbox. Don't test that right now. Now disable the stage 400 configuration, and install another build of Fennec. (It must be the same channel -- if you're testing a Nightly, install another Nightly. The same build is fine, I think.) After installation, the Fennec checkbox should be checked again, and syncs should proceed normally.

Whiteboard: [qa+]

Ed Morley [:emorley]

Comment 22

•

13 years ago

https://hg.mozilla.org/mozilla-central/rev/9882d0013809

Status: ASSIGNED → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Brad Lassey [:blassey] (use needinfo?)

Updated

•

13 years ago

Attachment #621228 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Richard Newman [:rnewman]

Reporter

Comment 23

•

13 years ago

This should be safe, and non-harmful even if it doesn't work, so not waiting on explicit QA verification. https://hg.mozilla.org/releases/mozilla-aurora/rev/8c8e96d70ab5

status-firefox14: --- → fixed

Nobody; OK to take it and work on it

Updated

•

12 years ago

Product: Mozilla Services → Android Background Services

Richard Newman [:rnewman]

Reporter

Updated

•

12 years ago

Blocks: 895526

Updated

•

8 years ago

Product: Android Background Services → Firefox for Android

Updated

•

5 years ago

Product: Firefox for Android → Firefox for Android Graveyard

You need to log in before you can comment on or make changes to this bug.