Closed
Bug 848074
Opened 13 years ago
Closed 12 years ago
Bouncer tests are saying firefox-latest-euballot builds are 404'ing from download.mozilla.org
Categories
(Infrastructure & Operations Graveyard :: WebOps: Other, task)
Tracking
(Not tracked)
RESOLVED
INCOMPLETE
People
(Reporter: stephend, Assigned: nmaul)
Details
Our bouncer.prod tests [1] are failing:
msg = "Failed on http://download.mozilla.org \nUsing {'lang': 'en-GB', 'product': 'firefox-latest-euballot', 'os': 'win'}"
Looks like the alias for firefox-latest-euballot isn't working.
[1] http://qa-selenium.mv.mozilla.com:8080/job/bouncer.prod/46691/testReport/junit/tests.test_redirects/TestRedirects/test_redirect_for_firefox_aliases__1_/
Comment 1•13 years ago
Interestingly, I can't reproduce this with curl:
curl -IL "http://download.mozilla.org/?product=firefox-latest-euballot&os=win&lang=en-GB"
HTTP/1.1 302 Found
Server: Apache
X-Backend-Server: bouncer8.webapp.phx1.mozilla.com
Content-Type: text/html; charset=iso-8859-1
Date: Tue, 05 Mar 2013 20:00:08 GMT
Location: http://download-eu.mozilla.org/?product=firefox-19.0-euballot&os=win&lang=en-GB
Transfer-Encoding: chunked
Connection: Keep-Alive
X-Cache-Info: not cacheable; response is 302 without expiry time
HTTP/1.1 302 Found
Server: Apache
X-Backend-Server: bouncer7.webapp.phx1.mozilla.com
Cache-Control: max-age=15
Content-Type: text/html; charset=UTF-8
Date: Tue, 05 Mar 2013 20:00:01 GMT
Location: http://wpc.1237.edgecastcdn.net/801237/download.cdn.mozilla.net/pub/mozilla.org/firefox/releases/19.0/win32-EUballot/en-GB/Firefox%20Setup%2019.0.exe
Transfer-Encoding: chunked
Connection: Keep-Alive
X-Cache-Info: cached
HTTP/1.1 200 OK
Accept-Ranges: bytes
Access-Control-Allow-Origin: *
Cache-Control: max-age=345600
Content-Type: application/octet-stream
Date: Tue, 05 Mar 2013 19:59:43 GMT
ETag: "384360-1370e78-4d5d3a9e4c300"
Expires: Sat, 09 Mar 2013 19:59:43 GMT
Last-Modified: Sat, 16 Feb 2013 08:56:12 GMT
Server: Apache
X-Backend-Server: ftp3.dmz.scl3.mozilla.com
X-Cache-Info: cached
But when I try with Firefox I eventually get sent to:
https://ne1.wpc.edgecastcdn.net/801237/download.cdn.mozilla.net/pub/mozilla.org/firefox/releases/19.0/win32-EUballot/en-GB/Firefox%20Setup%2019.0.exe
Comment 2•13 years ago
From the web console, my path seems to be:
[15:02:53.067] GET http://download.mozilla.org/?product=firefox-latest-euballot&os=win&lang=en-GB [HTTP/1.1 302 Found 156ms]
[15:02:53.214] GET http://download-eu.mozilla.org/?product=firefox-19.0-euballot&os=win&lang=en-GB [HTTP/1.1 302 Found 180ms]
[15:02:53.463] GET https://ne1.wpc.edgecastcdn.net/801237/download.cdn.mozilla.net/pub/mozilla.org/firefox/releases/19.0/win32-EUballot/en-GB/Firefox%20Setup%2019.0.exe [HTTP/1.1 404 Not Found 13ms]
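The hop-by-hop path above can be reproduced outside the browser. The following is a minimal sketch, not part of the bouncer test suite: `trace_redirects` and `head_fetch` are hypothetical helper names, and it assumes a HEAD request per hop is representative of what the browser sees (which this bug suggests may not always hold).

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def head_fetch(url, timeout=10):
    """Send one HEAD request, without following redirects.

    Returns (status_code, Location header or None).
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc, timeout=timeout)
    else:
        conn = http.client.HTTPConnection(parts.netloc, timeout=timeout)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=head_fetch, max_hops=10):
    """Follow redirects one hop at a time and record (url, status) per hop.

    `fetch` is injectable so the chain logic can be exercised offline.
    """
    chain = []
    for _ in range(max_hops):
        status, location = fetch(url)
        chain.append((url, status))
        if status in REDIRECT_CODES and location:
            # Location may be relative; resolve it against the current URL.
            url = urljoin(url, location)
        else:
            break
    return chain
```

Calling `trace_redirects("http://download.mozilla.org/?product=firefox-latest-euballot&os=win&lang=en-GB")` would list each hop with its status, which makes it easy to see whether the final host is wpc.1237.edgecastcdn.net or ne1.wpc.edgecastcdn.net.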
Updated•13 years ago
Assignee: server-ops → server-ops-webops
Component: Server Operations → Server Operations: Web Operations
QA Contact: shyam → nmaul
Updated•13 years ago
Assignee: server-ops-webops → nmaul
Comment 3•13 years ago
This is resolved, though not because of any action we (Mozilla) took.
It appears there are 2 errors here:
1) Edgecast sometimes redirects from wpc.1237.edgecastcdn.net to ne1.wpc.edgecastcdn.net, which 404's. Other times it doesn't have this redirect, and works fine. I am at a loss as to what causes this, and will have to talk to them about it.
2) Sentry believes Akamai was too slow to respond between 19:35 and 20:25, so it disabled Akamai, leaving only Edgecast. This is a logic problem that's very hard to work around, but it should normally not be an issue: the other CDN should pick up the slack. This time it was problematic, due to #1. We'll have to talk to Akamai about what may have happened here to cause Sentry to believe they were underperforming.
Through rose-colored glasses on a silver cloud, it's kinda nice that problem #2 happened, or we might never have uncovered problem #1. :)
The other thing to investigate is how we can do Sentry better. This is a tough problem, and I don't have a good answer. The core of the issue is: how do you effectively monitor the health of a CDN? The best answer I know of is to use something like Cedexis, which we do. But then you still need redundancy in case Cedexis is down (or sending you to nodes that are slow, as was the case today), which we have... and that's what saved us here today.
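To make the failure mode concrete, here is a minimal sketch of a threshold-based health check of the kind described above. This is not Sentry's actual logic, which is not shown in this bug: the function name `select_cdns`, the 500 ms threshold, and the fallback rule are all assumptions for illustration.

```python
def select_cdns(latencies_ms, threshold_ms=500.0):
    """Return the names of CDNs that should stay in rotation.

    latencies_ms maps a CDN name to its last measured response time in
    milliseconds, or None if the probe failed outright.
    """
    healthy = [
        name
        for name, ms in latencies_ms.items()
        if ms is not None and ms <= threshold_ms
    ]
    # Assumed safety rule: never disable every CDN at once. If nothing
    # passes the check, keep them all rather than serve nothing.
    return healthy or sorted(latencies_ms)
```

With a slow probe for one CDN, for example `select_cdns({"akamai": 900.0, "edgecast": 120.0})`, only `edgecast` remains, so any problem on that single remaining CDN is exposed to every user. That is the interaction between problems #1 and #2 described in this comment.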
One thing that will help our response time is alerting based on Sentry status. Bug 848101 has been opened to do precisely that.
Comment 4•12 years ago
I'm going to close this out, because there's nothing more I can do here.
Edgecast flatly denies that problem #1 happened or even can happen (on their end). I have no idea how :bhearsum landed at ne1.wpc.edgecastcdn.net... it is not (and AFAIK never has been) in bouncer. That web console snippet certainly seems to indicate that bouncer sent him directly to ne1.wpc, so I'm at a loss. It *looks* like bouncer is at fault, but I can't see how.
I can't say for sure if this is the same 404 that WebQA was seeing, or a different one.
Akamai was... less than helpful. The case is now closed, although no definitive cause was ever found. I still believe they had a temporary / intermittent network or server problem, but they didn't find (or won't admit) one.
I do believe both issues were likely transient in nature, so I'm not too concerned about a repeat. The only distressing thing here is not getting the answers I wanted out of the CDN vendors.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → INCOMPLETE
Updated•12 years ago
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Updated•7 years ago
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard