Closed Bug 1234464 Opened 9 years ago Closed 8 years ago

[MDN] Static assets hosted by the CDN are missing some headers

Categories

(Infrastructure & Operations Graveyard :: WebOps: Community Platform, task)

Type: task
Priority: Not set
Severity: blocker

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: openjck, Assigned: nmaul)

References

Details

(Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/2360] )

MDN serves static assets with Vary, Cache-Control, and Access-Control-Allow-Origin (ACAO) headers:

    > curl -I https://developer.mozilla.org/static/build/styles/mdn.e21f8f108f4d.css
    HTTP/1.1 200 OK
    Server: Apache
    X-Backend-Server: developer2.webapp.scl3.mozilla.com
    Vary: Accept-Encoding
    Cache-Control: public, max-age=315360000
    Content-Type: text/css; charset="utf-8"
    Date: Tue, 22 Dec 2015 05:50:24 GMT
    Access-Control-Allow-Origin: *
    Accept-Ranges: bytes
    Connection: Keep-Alive
    Last-Modified: Thu, 17 Dec 2015 22:44:22 GMT
    X-Cache-Info: cached
    Content-Length: 80336

But those headers are missing when the same files are accessed over the CDN:

    > curl -I https://developer.cdn.mozilla.net/static/build/styles/mdn.e21f8f108f4d.css
    HTTP/1.1 200 OK
    Content-Encoding: gzip
    Accept-Ranges: bytes
    Content-Type: text/css
    Date: Tue, 22 Dec 2015 05:59:16 GMT
    Etag: "139d0"
    Last-Modified: Thu, 17 Dec 2015 22:44:22 GMT
    Server: ECAcc (ewr/14DC)
    X-Backend-Server: developer2.webapp.scl3.mozilla.com
    X-Cache: HIT
    X-Cache-Info: caching
    Content-Length: 22166

Can we make sure the CDN serves those headers as well? The headers are set automatically by WhiteNoise[1] to optimize caching of static assets.

Thanks!

[1] https://github.com/evansd/whitenoise
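
The gap between the two responses above can be checked mechanically. Below is a minimal sketch in Python that works on the pasted `curl -I` output rather than making live requests. The helper names (`parse_headers`, `missing_headers`) are hypothetical, and the hop-by-hop filter is an assumption added so that headers such as Connection, which proxies are not expected to forward, do not show up as false positives.

```python
# Hop-by-hop headers are consumed by each proxy and not forwarded,
# so their absence at the CDN is expected (assumption: HTTP/1.1 list).
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailers", "transfer-encoding", "upgrade",
}


def parse_headers(raw):
    """Parse `curl -I` style output into a {lowercased-name: value} dict."""
    headers = {}
    for line in raw.strip().splitlines()[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep:
            headers[name.strip().lower()] = value.strip()
    return headers


def missing_headers(origin, cdn):
    """Return end-to-end header names sent by the origin but not by the CDN."""
    return sorted(set(origin) - set(cdn) - HOP_BY_HOP)


# Responses copied from the two curl runs in this report.
ORIGIN_RESPONSE = """\
HTTP/1.1 200 OK
Server: Apache
X-Backend-Server: developer2.webapp.scl3.mozilla.com
Vary: Accept-Encoding
Cache-Control: public, max-age=315360000
Content-Type: text/css; charset="utf-8"
Date: Tue, 22 Dec 2015 05:50:24 GMT
Access-Control-Allow-Origin: *
Accept-Ranges: bytes
Connection: Keep-Alive
Last-Modified: Thu, 17 Dec 2015 22:44:22 GMT
X-Cache-Info: cached
Content-Length: 80336
"""

CDN_RESPONSE = """\
HTTP/1.1 200 OK
Content-Encoding: gzip
Accept-Ranges: bytes
Content-Type: text/css
Date: Tue, 22 Dec 2015 05:59:16 GMT
Etag: "139d0"
Last-Modified: Thu, 17 Dec 2015 22:44:22 GMT
Server: ECAcc (ewr/14DC)
X-Backend-Server: developer2.webapp.scl3.mozilla.com
X-Cache: HIT
X-Cache-Info: caching
Content-Length: 22166
"""

missing = missing_headers(parse_headers(ORIGIN_RESPONSE), parse_headers(CDN_RESPONSE))
print(missing)  # ['access-control-allow-origin', 'cache-control', 'vary']
```

Header names are lowercased before comparison because HTTP header names are case-insensitive.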
Small nit: use curl -v instead of -I. The -I flag sends a HEAD request, which can return different headers than a GET. It doesn't matter in this case, but it's worth keeping in mind for the future. With -v you will have to throw away the response body somehow; "-o /dev/null" will do the trick.


The problem here is that developer.cdn.mozilla.net does not use developer.mozilla.org as its origin; it uses developer-origin.cdn.mozilla.net. That origin lives on the MDN cluster, but does *not* run a copy of the app. It's a simplified setup: things like /static and /media work because they're just Aliases to static files on disk, but actual app URLs don't work.

This is a conscious choice from a long time ago, across most of our web properties that use a CDN. The concern, as I remember it, was that we didn't want the app itself to be available at the "wrong" URL, as that 1) would be confusing, and 2) could hurt search engine rankings.


There are 2 obvious solutions, plus a bigger third option:

1) We use mod_proxy to send select bits of traffic destined for developer-origin.cdn.mozilla.net over to developer.mozilla.org, where it will then be served by the actual app. This may ring a bell: it's exactly what mdn.mozillademos.org does for user-created content (demos) on MDN. This effectively *adds* complexity to the system, but it's the most direct solution.

2) We change the CDN to use developer.mozilla.org as the origin. This eliminates a good bit of complexity overall, at the risk of making stuff available on the CDN domain name that we didn't intend to. This could potentially be mitigated by clever rules in the Apache config, but right now I can't say by how much. Safest to assume no mitigation and that the site would be fully available at developer.cdn.mozilla.net.

3) A bigger change: move the whole domain to a CDN (so that developer.mozilla.org resolves to a CDN, not to the servers in SCL3) and eliminate developer.cdn.mozilla.net entirely. This has rather large ramifications (especially with respect to caching and auth), so I wouldn't recommend it as a quick fix. But long term, it's probably the cleanest setup.


For the record, Bedrock (www.mozilla.org) has just recently moved to a system much like #3. www.mozilla.org goes to Cloudflare, which is a CDN... it in turn uses hosts in AWS and our SCL3 datacenter as its origin.


Personally, I would recommend #2, but I'm biased: from my position I see the complexity but not the risk. It could be that #1 is more palatable to you.
Flags: needinfo?(jkarahalis)
Blocks: 1234712
Thanks for the detailed explanation.

(In reply to Jake Maul [:jakem] from comment #1)
> Things like /static and /media work as they're just Aliases to
> static files on disk, but actual app URLs don't work.

Is that still the case? I thought bug 1233156 changed that, such that the CDN now hits the application for files under /static.

If not, #1 works for me. It may be the simplest solution in the short term. We only need the CDN to hit the application for files under /static because the application serves them up directly with special headers.
Flags: needinfo?(jkarahalis)
(In reply to John Karahalis [:openjck] from comment #2)
> If not, #1 works for me. It may be the simplest solution in the short term.
> We only need the CDN to hit the application for files under /static because
> the application serves them up directly with special headers.

That phrasing is confusing. Trying again...

The application is able to serve files under /static directly without an alias. When the application serves those files, it serves them up with special values for some headers to optimize caching. We want the CDN to serve those files up with the same headers. So #1 makes sense to me. Can we continue doing that for user-contributed demos and also start using that approach for files under /static?
Bumping the priority since this is affecting page load time.
Severity: normal → critical
I hate to be a nag, but performance is taking a pretty big hit. Bumping the severity up one.
Severity: critical → blocker
Assignee: server-ops-webops → nmaul
Should be good now. May take <max-age> time for all CDN responses to update and be correct.

The diff looks like this:

-    Alias /static /data/www/developer.mozilla.org/kuma/static
+    # Alias /static /data/www/developer.mozilla.org/kuma/static
+    ProxyPassMatch ^/(static/.+)$ http://developer-local:81/$1
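
To illustrate which request paths the new ProxyPassMatch rule picks up, here is a rough Python sketch that applies the same regular expression and `$1` substitution. Apache uses PCRE rather than Python's `re`, but for a pattern this simple the two should behave the same; the function name `proxied_url` is hypothetical and only the pattern and backend URL come from the diff.

```python
import re

# Pattern and backend copied from the ProxyPassMatch line in the diff.
PATTERN = re.compile(r"^/(static/.+)$")
BACKEND = "http://developer-local:81/"


def proxied_url(path):
    """Return the backend URL a path would be proxied to, or None if unmatched."""
    match = PATTERN.match(path)
    if match is None:
        return None
    # $1 in the Apache rule is the first capture group: "static/..."
    return BACKEND + match.group(1)


print(proxied_url("/static/build/styles/mdn.e21f8f108f4d.css"))
print(proxied_url("/media/img/logo.png"))  # hypothetical path; not proxied
print(proxied_url("/static/"))             # ".+" needs at least one character
```

Paths outside /static, and the bare /static/ directory itself, do not match the pattern and so are not sent to the application by this rule.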
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
See Also: → 1238030
(In reply to Jake Maul [:jakem] from comment #6)
> Should be good now. May take <max-age> time for all CDN responses to update
> and be correct.

Is it possible to speed that up? Clients are still getting no max-age from the CDN at the moment.
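
For anyone re-checking this later, a small Python sketch of the test being described: pull max-age out of a Cache-Control value and treat a missing header or missing directive as "no max-age". The function name `max_age_seconds` is hypothetical; the sample value is the one the origin sends in the first curl output of this bug.

```python
def max_age_seconds(cache_control):
    """Return max-age from a Cache-Control header value, or None if absent."""
    if not cache_control:
        return None  # header not sent at all
    for directive in cache_control.split(","):
        name, sep, value = directive.strip().partition("=")
        if sep and name.strip().lower() == "max-age":
            try:
                return int(value.strip().strip('"'))
            except ValueError:
                return None  # malformed value; treat as absent
    return None


origin_value = "public, max-age=315360000"
print(max_age_seconds(origin_value))          # 315360000
print(max_age_seconds(origin_value) / 86400)  # 3650.0 days
print(max_age_seconds(None))                  # None: header missing, as on the CDN
```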
(In reply to John Karahalis [:openjck] from comment #7)
> Is it possible to speed that up? Clients are still getting no max-age from
> the CDN at the moment.

Opened up bug 1239718 for that
Product: Infrastructure & Operations → Infrastructure & Operations Graveyard