Closed Bug 818038 Opened 12 years ago Closed 12 years ago

Bouncer returns status code "206 Partial Content" instead of "302 Found"

Categories

(Webtools :: Bouncer, defect)

Type: defect
Priority: Not set
Severity: major

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: takeshi2, Unassigned)

Details

A "Range:" header in the HTTP request incorrectly changes the status code of the response from Bouncer.

Example request (for download.mozilla.org):
http request [
  GET /?product=firefox-17.0.1-partial-17.0&os=win&lang=ja HTTP/1.1
  Host: download.mozilla.org
  User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:17.0) Gecko/17.0 Firefox/17.0
  Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
  Accept-Language: ja,en-us;q=0.7,en;q=0.3
  Connection: keep-alive
  Range: bytes=0-299999
  Cookie: dmo=10.8.81.217.1546031755309482
]

Expected result:
http response [
  HTTP/1.1 302 Found
  Server: Apache
  X-Backend-Server: bouncer2.webapp.phx1.mozilla.com
  Cache-Control: max-age=15
  Content-Type: text/html; charset=UTF-8
  Date: Tue, 04 Dec 2012 14:54:04 GMT
  Location: http://download.cdn.mozilla.net/pub/mozilla.org/firefox/releases/17.0.1/update/win32/ja/firefox-17.0-17.0.1.partial.mar
  Keep-Alive: timeout=3, max=368
  Connection: Keep-Alive
  X-Cache-Info: caching
  Content-Length: 0
]

Actual result:
http response [
  HTTP/1.1 206 Partial Content
  Content-Length: 300000
  Content-Range: bytes 0-299999/0
  Content-Type: text/html; charset=UTF-8
  Server: Apache
  X-Backend-Server: bouncer5.webapp.phx1.mozilla.com
  Cache-Control: max-age=15
  Date: Tue, 04 Dec 2012 14:17:08 GMT
  Location: http://download.cdn.mozilla.net/pub/mozilla.org/firefox/releases/17.0.1/update/win32/ja/firefox-17.0-17.0.1.partial.mar
  Keep-Alive: timeout=3, max=500
  Connection: Keep-Alive
  X-Cache-Info: cached
]

This prevents Firefox's silent update.
You can work around this problem by setting app.update.download.backgroundInterval to 0 in about:config in Firefox.
You can also reproduce this problem with the following command:

$ curl --no-keepalive -r 0-299999 -i 'http://download.mozilla.org/?product=firefox-17.0.1-partial-17.0&os=win&lang=en-US'
HTTP/1.1 206 Partial Content
Content-Length: 300000
Content-Range: bytes 0-299999/0
Content-Type: text/html; charset=UTF-8
Server: Apache
X-Backend-Server: bouncer9.webapp.phx1.mozilla.com
Cache-Control: max-age=15
Date: Tue, 04 Dec 2012 14:58:37 GMT
Location: http://download.cdn.mozilla.net/pub/mozilla.org/firefox/releases/17.0.1/update/win32/en-US/firefox-17.0-17.0.1.partial.mar
Keep-Alive: timeout=3, max=280
Connection: Keep-Alive
X-Cache-Info: cached

curl: (18) transfer closed with 300000 bytes remaining to read
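
The 206 response above is internally inconsistent: its Content-Range claims bytes 0-299999 of a resource whose total length is 0, and no body is ever sent, which is why curl reports error 18. A minimal sketch of how a test script could flag such a response is shown here. The function name `content_range_is_consistent` and the plain-dict header representation are assumptions made for illustration; they are not part of Bouncer or of any tooling mentioned in this bug.

```python
import re

# Matches the byte-range form "bytes first-last/total", where total may be "*".
CONTENT_RANGE_RE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def content_range_is_consistent(status, headers):
    """Return False when a 206 response carries contradictory range headers.

    Hypothetical helper: `headers` is a plain dict of header name to value.
    Responses with any other status are not checked and yield True.
    """
    if status != 206:
        return True
    match = CONTENT_RANGE_RE.match(headers.get("Content-Range", ""))
    if match is None:
        return False
    first, last = int(match.group(1)), int(match.group(2))
    total = match.group(3)
    if first > last:
        return False
    # The last byte position must lie inside the declared total length.
    if total != "*" and last >= int(total):
        return False
    # Content-Length, when present, must equal the size of the range.
    declared = headers.get("Content-Length")
    if declared is not None and int(declared) != last - first + 1:
        return False
    return True


# Headers copied from the "Actual result" in this report.
actual = {"Content-Length": "300000", "Content-Range": "bytes 0-299999/0"}
print(content_range_is_consistent(206, actual))
```

With the headers from this report the check fails because the range end (299999) lies beyond the declared total length (0).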
This appears not to be a bouncer problem, but a CDN problem. The location of the download is the CDN, which appears to be returning partial content.
Nope, this is definitely bouncer-side... that query is being sent to download.mozilla.org. It happens to return a link to the CDN, but the CDN is not as yet involved in serving the request when this happens.

I believe this is a new issue caused by our enabling of the Zeus cache a few days ago. It appears to me that it generally works properly if and only if we have a cache miss. That is, Apache does the right thing, but Zeus sees the incoming Range header and knee-jerk-responds with a 206.

I've emailed the vendor about this, but in the meantime I'm going to look into simply stripping out the Range header from these queries before the load balancer consults the cache... I think that will fix this. Since Bouncer only ever responds with a 302 or 404 (never any content), this should have zero effect (the mirror, or the CDN in this case, would still obey Range headers properly).
This should be fixed now. I have configured our load balancers to ignore incoming Range headers for queries to download.mozilla.org. Since it only returns with a 302 (with a Content-Length of zero), this should have no negative effect. Future queries to the actual mirror being redirected to would be totally separate, and should behave as expected.
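
For illustration only, the logic of this workaround can be sketched in Python. This is not the actual load balancer configuration, which is not shown in this bug; the function name `strip_range_header`, the `REDIRECT_ONLY_HOSTS` set, and the dict-based header representation are all assumptions.

```python
# Hosts that only ever answer with a redirect (or an error) and never serve content.
# Hypothetical constant; only download.mozilla.org is named in this bug.
REDIRECT_ONLY_HOSTS = {"download.mozilla.org"}


def strip_range_header(host, headers):
    """Return a copy of `headers` without Range for redirect-only hosts.

    Requests for any other host are returned unchanged (as a copy), so
    real content servers still see and honor Range headers.
    """
    if host.lower() not in REDIRECT_ONLY_HOSTS:
        return dict(headers)
    return {
        name: value
        for name, value in headers.items()
        if name.lower() != "range"  # header names are case-insensitive
    }


request_headers = {
    "Host": "download.mozilla.org",
    "Range": "bytes=0-299999",
    "Connection": "keep-alive",
}
print(strip_range_header("download.mozilla.org", request_headers))
```

Because the Range header is dropped before the cache lookup, the cache sees an ordinary GET and can return the stored 302. The client's follow-up request to the mirror or CDN is a separate request and keeps its own Range header.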
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Thanks. Now Firefox's silent update is working as expected.
Status: RESOLVED → VERIFIED
(In reply to Jake Maul [:jakem] from comment #4)
> This should be fixed now. I have configured our load balancers to ignore
> incoming Range headers for queries to download.mozilla.org. Since it only
> returns with a 302 (with a Content-Length of zero), this should have no
> negative effect. Future queries to the actual mirror being redirected to
> would be totally separate, and should behave as expected.

Do we know what regressed this? If this did cause the 17.0.1 uptake issues, it'd be good to know why the same did not happen to 16.
Not a regression, per se. We had previously not been using the LB cache at all. In the past it was infeasible to use it: the returned mirror really would vary on every hit (because we had many community mirrors, plus geoip-based mirror selection), so caching was impractical... the request *had* to pass through to the backend nodes for examination.

It was enabled as a scalability fix on release day. We were maxed out on Apache workers on the backend nodes. We allocated more workers, but were quickly swamped again. Enabling this cache was a huge infrastructure win, and bought us a *drastic* increase in capacity.

Our testing at the time showed no ill effects, but I don't believe we tried sending requests with Range headers... it didn't occur to us that the client would send those (though it's obvious in retrospect), or that it would matter anyway... it shouldn't have affected the result like it did. The vendor is investigating, and at first glance indicated it did seem like a potential bug.
I *just* got confirmation from the vendor that they are classifying this as a bug, and will be fixing it in a future release (no ETA yet known). The workaround I put in place in comment 4 is the official recommended solution in the meantime.