Status

Infrastructure & Operations
WebOps: Other
--
critical
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: rrosario, Assigned: cturra)

(Reporter)

Description

5 years ago
This is pretty bad. Right now sumo is showing up unstyled and without scripts for users that don't have it cached.

$ curl https://support.cdn.mozilla.net/media/css/home-min.css
curl: (7) couldn't connect to host

http works.
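
For anyone reproducing this: curl exit status 7 means the TCP connection itself could not be opened, which is a different failure from DNS resolution (6) or a TLS handshake (35). A rough sketch for comparing the http and https URLs side by side; the helper names `explain_curl_exit` and `probe_url` are made up for illustration, and the 10 second timeout is an arbitrary choice:

```shell
# Hypothetical helper: translate a curl exit status into a short label.
explain_curl_exit() {
    case "$1" in
        0)  echo "ok" ;;
        6)  echo "could not resolve host (DNS)" ;;
        7)  echo "could not connect to host (TCP)" ;;
        28) echo "timed out" ;;
        35) echo "TLS handshake failed" ;;
        60) echo "TLS certificate not trusted" ;;
        *)  echo "other curl error ($1)" ;;
    esac
}

# Hypothetical helper: request headers only and report how the attempt ended.
probe_url() {
    curl -s -I -o /dev/null --connect-timeout 10 "$1"
    rc=$?
    echo "$1: $(explain_curl_exit "$rc")"
}
```

Running `probe_url` once with the http URL and once with the https URL from the affected machine should show whether only port 443 is unreachable.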
(Reporter)

Comment 1

5 years ago
I just realized this *could* just be my network. I'm going to ask others :)
It worked for me (NYC RoadRunner) but DFEOJM is saying it's down... Are there any known issues between our CDN's network and other networks? How do we even check that?
(Assignee)

Comment 4

5 years ago
earlier this morning i saw a couple alerts from akamai like the below. i logged into our akamai control panel and the errors had cleared themselves (i assume this was connectivity related). 


NAME:        Origin Connection Failure
TYPE:        Origin Connection Failure
SERVICE:     HTTP Content Delivery
START TIME:  Mon, Apr 15, 11:52 GMT 2013   
EMAIL DELAY: This email was sent after the condition was sustained for 6 minutes. 

DATA: 
      CP Code:                     141292
      CP Code Description:         download.cdn.mozilla.net
      Hits:                        13
      Errors:                      2
      Alert Condition (% Errors):  11
      Alert Threshold:             2
      Edge IP:                     23.63.101.187

DESCRIPTION:
The Origin Connection Failure Alert indicates errors that are connection-related,
i.e., Akamai edge servers successfully performed the DNS lookup but are unable to
make contact with your origin server. 

This alert triggers when the error percentage for the selected CP code exceeds
the percentage threshold you specified.
The alert clears when the alert condition is no longer met (exceeded).
The percent error value is the last recorded value that met the alert condition.
(Assignee)

Comment 5

5 years ago
just a quick follow up - according to our alert activity in akamai, this alert cleared at Apr 15, 2013 12:09:32 PM Greenwich Mean Time.
DFEOJM is still saying it can't hit that URL, but it doesn't give us much detail about *why*. Ricky's error makes it look like it can't even talk to the CDN, not that the CDN can't talk to the origin server.
(Reporter)

Comment 7

5 years ago
(In reply to James Socol [:jsocol, :james] from comment #6)
> Ricky's error makes it look like it can't even talk to
> the CDN, not that the CDN can't talk to the origin server.

Correct. Firefox tells me:

Unable to connect
      
Firefox can't establish a connection to the server at support.cdn.mozilla.net.
(Assignee)

Comment 8

5 years ago
curious. i just did some testing from a couple locations around north america and found the following (all connect fine):

vancouver: 
---
$ curl -I https://support.cdn.mozilla.net/media/css/home-min.css
HTTP/1.1 200 OK
Server: Apache
X-Backend-Server: support3.webapp.phx1.mozilla.com
Content-Type: text/css; charset=utf-8
ETag: "4d5b7a68f8e40"
Last-Modified: Thu, 14 Feb 2013 23:30:57 GMT
X-Cache-Info: caching
Cache-Control: max-age=31536000
Expires: Tue, 15 Apr 2014 16:57:19 GMT
Date: Mon, 15 Apr 2013 16:57:19 GMT
Connection: keep-alive

portland:
---
$ curl -I https://support.cdn.mozilla.net/media/css/home-min.css
HTTP/1.1 200 OK
Server: Apache
X-Backend-Server: support3.webapp.phx1.mozilla.com
Content-Type: text/css; charset=utf-8
ETag: "4d5b7a68f8e40"
Last-Modified: Thu, 14 Feb 2013 23:30:57 GMT
X-Cache-Info: caching
Cache-Control: max-age=31535978
Expires: Tue, 15 Apr 2014 17:17:58 GMT
Date: Mon, 15 Apr 2013 17:18:20 GMT
Connection: keep-alive

san jose:
---
$ curl -I https://support.cdn.mozilla.net/media/css/home-min.css
HTTP/1.1 200 OK
Server: Apache
X-Backend-Server: support3.webapp.phx1.mozilla.com
Content-Type: text/css; charset=utf-8
ETag: "4d5b7a68f8e40"
Last-Modified: Thu, 14 Feb 2013 23:30:57 GMT
X-Cache-Info: cached
Cache-Control: max-age=31536000
Expires: Tue, 15 Apr 2014 17:19:01 GMT
Date: Mon, 15 Apr 2013 17:19:01 GMT
Connection: keep-alive

phoenix:
---
$ curl -I https://support.cdn.mozilla.net/media/css/home-min.css
HTTP/1.1 200 OK
Server: Apache
X-Backend-Server: support1.webapp.phx1.mozilla.com
Content-Type: text/css; charset=utf-8
ETag: "4d5b7a68f8e40"
Last-Modified: Thu, 14 Feb 2013 23:30:57 GMT
X-Cache-Info: cached
Cache-Control: max-age=31530410
Expires: Tue, 15 Apr 2014 15:46:22 GMT
Date: Mon, 15 Apr 2013 17:19:32 GMT
Connection: keep-alive
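
When comparing several locations like this, it can help to boil each response down to the headers that differ. A minimal sketch, assuming a hypothetical helper named `summarize_headers` that reads `curl -sI` output on stdin:

```shell
# Hypothetical helper: print the backend, cache state, and Cache-Control
# value from a block of HTTP response headers read on stdin.
summarize_headers() {
    awk -F': ' '
        { sub(/\r$/, "") }                        # curl output ends lines with CRLF
        tolower($1) == "x-backend-server" { backend = $2 }
        tolower($1) == "x-cache-info"     { cache = $2 }
        tolower($1) == "cache-control"    { cc = $2 }
        END { printf "backend=%s cache=%s cache-control=%s\n", backend, cache, cc }
    '
}
```

Usage would be something like `curl -sI https://support.cdn.mozilla.net/media/css/home-min.css | summarize_headers` from each location. In the output above, the portland max-age (31535978) is 22 seconds below 31536000, which matches its Expires being 22 seconds earlier than Date plus one year; that probably means the response had been sitting in a cache for about 22 seconds.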
(Reporter)

Comment 9

5 years ago
I just tried again. It's working for me now. Yay! I haven't found anybody else with the issue other than DFEOJM, so I think we can just close this?
(Assignee)

Comment 10

5 years ago
i am going to close, but will continue to monitor for the next little bit just to be extra careful :)
Assignee: server-ops-webops → cturra
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED

Comment 11

5 years ago
For future reference, when this happens, we need some specific pieces of data to have much chance of tracking this down:

1) From the affected machine, do a DNS lookup of the affected domain (support.cdn.mozilla.net). This tells us the IP of the CDN node that is affected, which is *vital* to helping Akamai (or any CDN vendor, really) isolate a problem.

2) The physical location (and ISP, preferably) of the affected machines.

3) (optional, but sometimes helpful) a traceroute to the IP discovered in #1, from the affected machine/device.

4) The (external) IP of the affected machine/device. Something like whatismyip.org or ifconfig.me is usually good.

5) Request/response headers are fantastic, but frequently don't include sufficient detail as to source and destination IPs.
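
Most of the items above can be gathered in one pass from the affected machine. A rough sketch, assuming `dig`, `traceroute`, and `curl` are installed there; the function names `first_ipv4` and `collect_cdn_report` are made up for illustration, and the physical location and ISP (item 2) still have to be written down by hand:

```shell
# Hypothetical helper: pick the first IPv4 address from `dig +short`
# output, skipping any CNAME lines that precede it.
first_ipv4() {
    grep -E '^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$' | head -n 1
}

# Hypothetical helper: print items 1, 3, 4, and 5 for the given hostname.
collect_cdn_report() {
    host="$1"

    echo "== DNS lookup for $host =="
    answers=$(dig +short "$host")
    echo "$answers"
    edge_ip=$(echo "$answers" | first_ipv4)

    echo "== External IP of this machine =="
    curl -s --max-time 10 ifconfig.me
    echo

    if [ -n "$edge_ip" ]; then
        echo "== traceroute to $edge_ip =="
        traceroute "$edge_ip"
    fi

    echo "== Response headers =="
    curl -sI --max-time 10 "https://$host/"
}
```

For this bug that would be `collect_cdn_report support.cdn.mozilla.net > report.txt`, with the resulting file attached to the bug.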


If we go to the vendor with a problem and don't have at least this info, we basically won't get anywhere... they're going to run us in circles until we can help them to help us. It's not that they don't want to help so much as they *can't* help effectively, unless it's part of a known network-wide issue (which in Akamai's case means it'll probably be on Twitter, Reddit, blogs, news sites, etc).

Comment 12

5 years ago
I left a note in mana with another cool trick you can do on Akamai to get it to spit out more data when troubleshooting. This isn't recommended for general use, but it's really handy for testing with curl.

https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=1081607#ContentDeliveryNetworks%28CDN%29-Akamai

TL;DR: there's a set of Pragma headers you can send in the request, which will yield a set of Response headers with extra logistical information.
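
A sketch of what such a request can look like with curl. The Pragma values below are ones commonly documented for Akamai debugging; they are an assumption here and may not match the set listed on the mana page, and the helper names are made up for illustration:

```shell
# Hypothetical helper: build the debug Pragma request header.
# The value list is an assumption, not copied from the mana page.
akamai_pragma_header() {
    echo "Pragma: akamai-x-cache-on, akamai-x-cache-remote-on, akamai-x-check-cacheable, akamai-x-get-cache-key"
}

# Hypothetical helper: send the header and show only the X-* response headers.
akamai_debug_headers() {
    curl -sI -H "$(akamai_pragma_header)" "$1" | grep -i '^x-'
}
```

Something like `akamai_debug_headers https://support.cdn.mozilla.net/media/css/home-min.css` should then return extra headers such as X-Cache and X-Check-Cacheable, provided the edge configuration honors the debug pragmas.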
Component: Server Operations: Web Operations → WebOps: Other
Product: mozilla.org → Infrastructure & Operations