Closed Bug 1243303 Opened 8 years ago Closed 8 years ago

about:healthreport broken because CloudFront can't load the origin for fhr.cdn.mozilla.net/

Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

task
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: philipp, Assigned: rwatson)

Details

(Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/2516])

Currently, when a user tries to open the Firefox Health Report through about:healthreport, an error message is displayed instead. This issue is being reported in multiple support channels:

ERROR
The request could not be satisfied.
CloudFront wasn't able to connect to the origin.

Generated by cloudfront (CloudFront)
Request ID: ***
Heads-up flags.
I'm not sure who knows about this or owns the "origin".
Flags: needinfo?(mreid)
Flags: needinfo?(kparlante)
We load the about:healthreport from:
https://fhr.cdn.mozilla.net/%LOCALE%/v4/
... so e.g.:
https://fhr.cdn.mozilla.net/en-US/v4/

That comes up with:
> ERROR
> The request could not be satisfied.
> CloudFront wasn't able to connect to the origin. 

We have minimal documentation for fhr.cdn.mozilla.net here:
https://mana.mozilla.org/wiki/pages/viewpage.action?pageId=33101035

The origin is actually reachable:
http://fhr-origin.cdn.mozilla.net/en-US/v4/
Summary: FHR - error message on about:healthreport → about:healthreport broken because CloudFront can't load the origin for fhr.cdn.mozilla.net/
Flags: needinfo?(mlankford)
Assignee: nobody → webops-labs
Severity: critical → blocker
Component: Web: Health Report → WebOps: Labs
Product: Firefox Health Report → Infrastructure & Operations
QA Contact: smani
Version: 44 Branch → unspecified
Assignee: webops-labs → server-ops-webops
Component: WebOps: Labs → WebOps: Other
Severity: blocker → major
Flags: needinfo?(mlankford)
Whiteboard: [measurement:client:tracking]
CloudFront was showing some weird issues and throwing 502 errors. We switched the DNS and CDN to Edgecast to resolve the issue while we investigate why CloudFront isn't working.
Assignee: server-ops-webops → rwatson
Flags: needinfo?(mreid)
Flags: needinfo?(kparlante)
Whiteboard: [measurement:client:tracking] → [kanban:https://webops.kanbanize.com/ctrl_board/2/2516] [measurement:client:tracking]
This issue has been resolved and Edgecast is being kept as CDN until we can figure out why this happened.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/2516] [measurement:client:tracking] → [kanban:https://webops.kanbanize.com/ctrl_board/2/2516]
(In reply to Ryan Watson [:w0ts0n] from comment #4)
> This issue has been resolved and Edgecast is being kept as CDN until we can
> figure out why this happened.

Do we have root cause on this yet?
Flags: needinfo?(rwatson)
I can speak to this.

This was moved to CF as part of our evacuation of Edgecast. It was one of the earlier (though not the first) properties moved in Q1.

Edgecast (the initial CDN) does not require that the origin host have a valid certificate. The expectation is that, since Edgecast buys the certificate for you, you may not have a valid one, and so they intentionally accept invalid certificates. Edgecast was configured to connect to the origin using "fhr.cdn.mozilla.net" (this is relevant later).

In contrast, AWS requires that the certificate presented by the origin host be valid (correct name, not expired, etc). Additionally, by default AWS CF contacts the origin using the origin's hostname, *not* the hostname used by the client that sent the request. (The client connects using "fhr.cdn.mozilla.net", the origin is "fhr-origin.cdn.mozilla.net").


The origin was configured to return a proper certificate for fhr.cdn.mozilla.net, but *not* for fhr-origin.cdn.mozilla.net. The net result is that Edgecast worked but CloudFront did not.
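
For anyone following along, the name mismatch can be illustrated with a rough sketch. This is not how either CDN implements its check; it is a simplified hostname match (exact match or a single leftmost wildcard label), and the helper names and the list of certificate names are assumptions made for illustration.

```python
def hostname_matches(pattern: str, hostname: str) -> bool:
    """Simplified name match: exact, or a single leftmost '*' label."""
    p_labels = pattern.lower().split(".")
    h_labels = hostname.lower().split(".")
    if len(p_labels) != len(h_labels):
        return False
    if p_labels[0] == "*":
        # The wildcard stands in for exactly one label.
        return p_labels[1:] == h_labels[1:]
    return p_labels == h_labels


def cert_covers(cert_names, hostname):
    """True if any name on the certificate matches the hostname."""
    return any(hostname_matches(name, hostname) for name in cert_names)


# Assumption: before the fix, the origin's certificate only listed
# the public name, not the origin name.
origin_cert_names = ["fhr.cdn.mozilla.net"]

print(cert_covers(origin_cert_names, "fhr.cdn.mozilla.net"))         # True
print(cert_covers(origin_cert_names, "fhr-origin.cdn.mozilla.net"))  # False
```

A CDN that validates the certificate against the origin's own hostname hits the second case and refuses the connection.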


The fix was two-fold. Either change alone was sufficient to resolve the issue, but both were performed:

1) Change the Zeus config to send a valid certificate for both names.

2) Change AWS to pass through the original Host header from the client when contacting the origin server.
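
As a rough model of why either change is enough on its own, the sketch below treats a CDN as two switches: whether it validates the origin certificate, and whether it forwards the client's Host header. The class, function, and field names are hypothetical, names are compared exactly (no wildcards), and the rule that a forwarded Host header is also accepted as a certificate name is a simplification of CloudFront's documented behaviour.

```python
from dataclasses import dataclass


@dataclass
class CdnConfig:
    validates_origin_cert: bool  # does the CDN reject invalid origin certs?
    forwards_host_header: bool   # is the client's Host header passed through?


def origin_connect_ok(cdn, cert_names, client_host, origin_host):
    """Simplified model of whether the CDN accepts the origin's certificate."""
    if not cdn.validates_origin_cert:
        return True  # any certificate is accepted
    accepted = {origin_host}
    if cdn.forwards_host_header:
        accepted.add(client_host)
    return any(name in accepted for name in cert_names)


CLIENT = "fhr.cdn.mozilla.net"
ORIGIN = "fhr-origin.cdn.mozilla.net"

edgecast = CdnConfig(validates_origin_cert=False, forwards_host_header=False)
cf_default = CdnConfig(validates_origin_cert=True, forwards_host_header=False)
cf_host_fwd = CdnConfig(validates_origin_cert=True, forwards_host_header=True)

print(origin_connect_ok(edgecast, [CLIENT], CLIENT, ORIGIN))            # True
print(origin_connect_ok(cf_default, [CLIENT], CLIENT, ORIGIN))          # False (the outage)
print(origin_connect_ok(cf_default, [CLIENT, ORIGIN], CLIENT, ORIGIN))  # True  (fix 1)
print(origin_connect_ok(cf_host_fwd, [CLIENT], CLIENT, ORIGIN))         # True  (fix 2)
```

Doing both fixes means the setup keeps working even if one of the two changes is later reverted.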
Flags: needinfo?(rwatson)
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard