Closed Bug 861906 Opened 11 years ago Closed 11 years ago

CDN broken over https?

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

All
Other
task
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rrosario, Assigned: cturra)

Details

This is pretty bad. Right now sumo is showing up unstyled and without scripts for users that don't have it cached.

$ curl https://support.cdn.mozilla.net/media/css/home-min.css
curl: (7) couldn't connect to host

http works.
I just realized this *could* just be my network. I'm going to ask others :)
It worked for me (NYC RoadRunner) but DFEOJM is saying it's down... Are there any known issues between our CDN's network and other networks? How do we even check that?
earlier this morning i saw a couple alerts from akamai like the below. i logged into our akamai control panel and the errors had cleared themselves (i assume this was connectivity related). 


NAME:        Origin Connection Failure
TYPE:        Origin Connection Failure
SERVICE:     HTTP Content Delivery
START TIME:  Mon, Apr 15, 11:52 GMT 2013   
EMAIL DELAY: This email was sent after the condition was sustained for 6 minutes. 

DATA: 
      CP Code:                     141292
      CP Code Description:         download.cdn.mozilla.net
      Hits:                        13
      Errors:                      2
      Alert Condition (% Errors):  11
      Alert Threshold:             2
      Edge IP:                     23.63.101.187

DESCRIPTION:
The Origin Connection Failure Alert indicates errors that are connection-related,
i.e., Akamai edge servers successfully performed the DNS lookup but are unable to
make contact with your origin server. 

This alert triggers when the error percentage for the selected CP code exceeds
the percentage threshold you specified.
The alert clears when the alert condition is no longer met (exceeded).
The percent error value is the last recorded value that met the alert condition.
just a quick follow up - according to our alert activity in akamai, this alert cleared at Apr 15, 2013 12:09:32 PM Greenwich Mean Time.
DFEOJM is still saying it can't hit that URL, but it doesn't give us much detail about *why*. Ricky's error makes it look like it can't even talk to the CDN, not that the CDN can't talk to the origin server.
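One rough way to tell those two failure modes apart from the client side is curl's exit status. A failure between the client and the edge shows up as a curl error (like Ricky's exit code 7), whereas an origin problem would normally still produce an HTTP response from the edge, usually a 5xx. A minimal sketch follows; `classify_curl_result` and `check_url` are made-up helper names, and the mapping covers only a handful of curl's documented exit codes:

```shell
#!/bin/sh
# Hypothetical helper: map a curl exit code plus HTTP status to the hop that
# most likely failed. Usage: classify_curl_result <curl_exit_code> <http_status>
classify_curl_result() {
    exit_code=$1
    http_status=$2
    case "$exit_code" in
        0)
            # curl reached the edge and got an HTTP response back
            case "$http_status" in
                5??) echo "edge reachable; edge or origin returned $http_status" ;;
                *)   echo "edge reachable; HTTP $http_status" ;;
            esac
            ;;
        6)  echo "client could not resolve the CDN hostname (DNS)" ;;
        7)  echo "client could not connect to the CDN edge" ;;
        28) echo "timed out talking to the CDN edge" ;;
        35) echo "TLS handshake with the CDN edge failed" ;;
        *)  echo "curl failed with exit code $exit_code" ;;
    esac
}

# Hypothetical wrapper: fetch a URL, discard the body, classify the outcome.
check_url() {
    http_status=$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1")
    classify_curl_result "$?" "$http_status"
}
```

For example, `check_url https://support.cdn.mozilla.net/media/css/home-min.css` run from an affected machine would report a client-to-edge failure for the error above, rather than anything about the origin.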
(In reply to James Socol [:jsocol, :james] from comment #6)
> Ricky's error makes it look like it can't even talk to
> the CDN, not that the CDN can't talk to the origin server.

Correct. Firefox tells me:

Unable to connect
      
Firefox can't establish a connection to the server at support.cdn.mozilla.net.
curious. i just did some testing from a couple locations around north america and found the following (all connect fine):

vancouver: 
---
$ curl -I https://support.cdn.mozilla.net/media/css/home-min.css
HTTP/1.1 200 OK
Server: Apache
X-Backend-Server: support3.webapp.phx1.mozilla.com
Content-Type: text/css; charset=utf-8
ETag: "4d5b7a68f8e40"
Last-Modified: Thu, 14 Feb 2013 23:30:57 GMT
X-Cache-Info: caching
Cache-Control: max-age=31536000
Expires: Tue, 15 Apr 2014 16:57:19 GMT
Date: Mon, 15 Apr 2013 16:57:19 GMT
Connection: keep-alive

portland:
---
$ curl -I https://support.cdn.mozilla.net/media/css/home-min.css
HTTP/1.1 200 OK
Server: Apache
X-Backend-Server: support3.webapp.phx1.mozilla.com
Content-Type: text/css; charset=utf-8
ETag: "4d5b7a68f8e40"
Last-Modified: Thu, 14 Feb 2013 23:30:57 GMT
X-Cache-Info: caching
Cache-Control: max-age=31535978
Expires: Tue, 15 Apr 2014 17:17:58 GMT
Date: Mon, 15 Apr 2013 17:18:20 GMT
Connection: keep-alive

san jose:
---
$ curl -I https://support.cdn.mozilla.net/media/css/home-min.css
HTTP/1.1 200 OK
Server: Apache
X-Backend-Server: support3.webapp.phx1.mozilla.com
Content-Type: text/css; charset=utf-8
ETag: "4d5b7a68f8e40"
Last-Modified: Thu, 14 Feb 2013 23:30:57 GMT
X-Cache-Info: cached
Cache-Control: max-age=31536000
Expires: Tue, 15 Apr 2014 17:19:01 GMT
Date: Mon, 15 Apr 2013 17:19:01 GMT
Connection: keep-alive

phoenix:
---
$ curl -I https://support.cdn.mozilla.net/media/css/home-min.css
HTTP/1.1 200 OK
Server: Apache
X-Backend-Server: support1.webapp.phx1.mozilla.com
Content-Type: text/css; charset=utf-8
ETag: "4d5b7a68f8e40"
Last-Modified: Thu, 14 Feb 2013 23:30:57 GMT
X-Cache-Info: cached
Cache-Control: max-age=31530410
Expires: Tue, 15 Apr 2014 15:46:22 GMT
Date: Mon, 15 Apr 2013 17:19:32 GMT
Connection: keep-alive
I just tried again, and it's working for me now. Yay! I haven't found anybody else with the issue other than DFEOJM, so I think we can just close this?
i am going to close, but will continue to monitor for the next little bit just to be extra careful :)
Assignee: server-ops-webops → cturra
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
For future reference, when this happens, we need some specific pieces of data to have much chance of tracking this down:

1) From the affected machine, do a DNS lookup of the affected domain (support.cdn.mozilla.net). This tells us the IP of the CDN node that is affected, which is *vital* to helping Akamai (or any CDN vendor, really) isolate a problem.

2) The physical location (and ISP, preferably) of the affected machines.

3) (optional, but sometimes helpful) a traceroute to the IP discovered in #1, from the affected machine/device.

4) The (external) IP of the affected machine/device. Something like whatismyip.org or ifconfig.me is usually good.

5) Request/response headers are fantastic, but frequently don't include sufficient detail as to source and destination IPs.


If we go to the vendor with a problem and don't have at least this info, we basically won't get anywhere... they're going to run us in circles until we can help them to help us. It's not that they don't want to help so much as they *can't* help effectively, unless it's part of a known network-wide issue (which in Akamai's case means it'll probably be on twitter, reddit, blogs, news sites, etc).
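The data points listed above could be gathered in one pass with something like the following. This is only a sketch: the function names are made up, it assumes `dig`, `traceroute` and `curl` are installed on the affected machine, and it assumes ifconfig.me answers curl with a bare IP. The physical location and ISP (#2) still have to be reported by hand.

```shell
#!/bin/sh
# Sketch only: gathers items 1, 3, 4 and 5 from the list for one CDN hostname.
# Function names are hypothetical; dig, traceroute and curl must be installed.

# Pick the first IPv4 address out of `dig +short` output, skipping CNAME lines.
first_ipv4() {
    grep -E '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$' | head -n 1
}

# Usage: collect_cdn_diag <hostname> [path]
collect_cdn_diag() {
    host=$1
    path=${2:-/}

    echo "== 1) DNS lookup for $host =="
    dig +short "$host"
    edge_ip=$(dig +short "$host" | first_ipv4)

    echo "== 3) traceroute to edge node $edge_ip =="
    [ -n "$edge_ip" ] && traceroute "$edge_ip"

    echo "== 4) external IP of this machine =="
    # Assumption: ifconfig.me returns just the caller's IP to curl
    curl -s --max-time 10 https://ifconfig.me
    echo

    echo "== 5) request/response headers, pinned to $edge_ip =="
    # --resolve forces curl to use the edge node found in step 1, and -v
    # prints the IP it connected to, which plain headers do not show
    [ -n "$edge_ip" ] && curl -sv -o /dev/null --max-time 10 \
        --resolve "$host:443:$edge_ip" "https://$host$path"
}
```

Something like `collect_cdn_diag support.cdn.mozilla.net /media/css/home-min.css > cdn-diag.txt 2>&1` would then produce a single file to attach to the bug or the vendor ticket.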
I left a note in mana with another cool trick you can do on Akamai to get it to spit out more data when troubleshooting. This isn't recommended for general use, but it's really handy for testing with curl.

https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=1081607#ContentDeliveryNetworks%28CDN%29-Akamai

TL;DR: there's a set of Pragma headers you can send in the request, which will yield a set of Response headers with extra logistical information.
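As an illustration of the idea (not a copy of the mana page, which isn't reproduced here), the sketch below sends the commonly documented Akamai debug Pragma values with a HEAD request. Whether the edge honours them depends on how the property is configured, so treat the exact list as an assumption. The helper names are hypothetical.

```shell
#!/bin/sh
# Assumption: these are the commonly documented Akamai debug pragma values,
# not the list from the internal mana page.
akamai_debug_pragma() {
    echo "Pragma: akamai-x-cache-on, akamai-x-cache-remote-on, akamai-x-check-cacheable, akamai-x-get-cache-key, akamai-x-get-true-cache-key"
}

# Hypothetical helper: fetch response headers only, with the debug pragmas set.
akamai_debug_head() {
    curl -sI --max-time 10 -H "$(akamai_debug_pragma)" "$1"
}
```

For example, `akamai_debug_head https://support.cdn.mozilla.net/media/css/home-min.css | grep -i '^x-'` would show any extra `X-` response headers the edge chooses to return, such as cache hit/miss status and the cache key.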
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard