Closed Bug 1183347 Opened 9 years ago Closed 9 years ago

Allow non-redirect response for fetch

Categories

(Content Services Graveyard :: Tiles: Content Front-End, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: Mardak, Assigned: oyiptong)

Details

(Whiteboard: .?)

Turns out XHR always follows redirects, and CORS spec says to set Origin: null on redirect (http://www.w3.org/TR/cors/#redirect-steps).

This prevents a page on the cdn from making an XHR to onyx that is then redirected back to the cdn, as various browsers reject the request:

x = new XMLHttpRequest(); x.open("GET", "https://tiles.services.mozilla.com/v3/links/fetch/en-US/release"); x.send()

Firefox:
> Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at https://tiles.cdn.mozilla.net/desktop/US/en-US.18ebe23068d541bb00cebc5b4a02e08919376b91.ag.json. (Reason: CORS header 'Access-Control-Allow-Origin' missing).

Chrome:
> XMLHttpRequest cannot load https://tiles.cdn.mozilla.net/desktop/US/en-US.18ebe23068d541bb00cebc5b4a02e08919376b91.ag.json. No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin 'null' is therefore not allowed access.

Safari:
> XMLHttpRequest cannot load https://tiles.cdn.mozilla.net/desktop/US/en-US.18ebe23068d541bb00cebc5b4a02e08919376b91.ag.json. Cannot make any requests from null.
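
To make the redirect behavior above concrete, here is a minimal sketch of the rule as I understand it: once a request that is already cross-origin gets redirected to yet another origin, the browser serializes the Origin header as "null". The function name is made up for illustration, and this is a simplification of the spec steps, not a full CORS implementation.

```javascript
// Simplified model of the redirect rule discussed above (hypothetical helper name).
// Returns the value a browser would be expected to send in the Origin header
// on the request that follows the redirect.
function originHeaderAfterRedirect(pageOrigin, requestUrl, locationUrl) {
  const current = new URL(requestUrl).origin;
  // Location may be relative, so resolve it against the URL that was requested.
  const next = new URL(locationUrl, requestUrl).origin;
  const alreadyCrossOrigin = pageOrigin !== current;
  const redirectCrossesOrigin = current !== next;
  return alreadyCrossOrigin && redirectCrossesOrigin ? "null" : pageOrigin;
}

// The case from this bug: page on the cdn, request to onyx, redirect back to the cdn.
console.log(
  originHeaderAfterRedirect(
    "https://tiles.cdn.mozilla.net",
    "https://tiles.services.mozilla.com/v3/links/fetch/en-US/release",
    "https://tiles.cdn.mozilla.net/desktop/US/en-US.18ebe23068d541bb00cebc5b4a02e08919376b91.ag.json"
  )
);
```

With Origin: null on the second request, the cdn response has to carry Access-Control-Allow-Origin: * (or "null") for the browser to accept it.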


Instead of responding 303, we could respond 200 to a special request, e.g., ?noredirect, and provide the json location in the response body.
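
A rough sketch of what the client side of the ?noredirect idea could look like. Everything about the response is an assumption here: that onyx answers 200 with a JSON body shaped like {"location": "..."}, and that the parameter is spelled noredirect=1. It also uses fetch rather than XHR only to keep the example short; the injected httpGet argument is there so the flow can be exercised without a network.

```javascript
// Hypothetical client for the proposed `?noredirect` mode.
// Assumes the endpoint answers 200 with a body like {"location": "<json url>"}.
async function fetchTiles(fetchUrl, httpGet = fetch) {
  const url = new URL(fetchUrl);
  url.searchParams.set("noredirect", "1"); // exact parameter form is an assumption

  const pointer = await httpGet(url.toString());
  if (!pointer.ok) {
    throw new Error(`fetch endpoint returned ${pointer.status}`);
  }
  const { location } = await pointer.json();

  // This second request is issued by the page itself rather than by a redirect,
  // so the browser sends the page's real Origin instead of "null".
  const tiles = await httpGet(location);
  if (!tiles.ok) {
    throw new Error(`tiles request returned ${tiles.status}`);
  }
  return tiles.json();
}

// Usage (from a page):
//   fetchTiles("https://tiles.services.mozilla.com/v3/links/fetch/en-US/release")
//     .then((tiles) => console.log(tiles));
```

The cost is an extra explicit round trip in page code, but neither request depends on how the browser rewrites Origin across redirects.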
Assignee: nobody → oyiptong
oyiptong, it seems like the CORS failures are from cdn caching.

I just uploaded to stage a dummy US/zu tile:

x = new XMLHttpRequest(); x.open("GET", "https://onyx_tiles.stage.mozaws.net/v3/links/fetch/zu/release"); x.send()

That is successful with the first part of the request:

> Origin:"https://tiles-resources-stage-tiless3-8ugtjiv04rwz.s3.amazonaws.com"

Response:

> Access-Control-Allow-Origin:"https://tiles-resources-stage-tiless3-8ugtjiv04rwz.s3.amazonaws.com"
> Location:"https://tiles-resources-stage-tiless3-8ugtjiv04rwz.s3.amazonaws.com/desktop/US/zu.5fbd3cff53c5768abfdceb963e4d300f5a86da4c.ag.json"

Automatic second request:

> Origin:"null"

Second response:

> Access-Control-Allow-Origin:"*"



However, if I try this for prod, I get the errors from comment 0: the second response headers don't have ACAO:*, probably because the cached response didn't have those headers:

> Last-Modified:"Wed, 08 Jul 2015 04:30:01 GMT"
> X-Cache:"HIT"

I'll try to avoid the cache on prod to see if we can WONTFIX this bug.
Indeed, it seems to be the cdn caching that was causing issues.

x = new XMLHttpRequest(); x.open("GET", "https://tiles.services.mozilla.com/v3/links/fetch/zu/nightly"); x.send()

That seems to work fine across various browsers.

mostlygeek, is there a good way to flush the cdn cache so that their response headers include Access-Control-Allow-Origin:"*"?

oyiptong, or is it that there's an existing file, so splice does not set those headers? Do we need to somehow set the ACAO:* header for each of the json files?
Flags: needinfo?(oyiptong)
Flags: needinfo?(bwong)
We set the CORS headers at the bucket level in S3. This should mean that a request for any file under the bucket gets a response with ACAO set. These headers are mirrored on the CDN.

This seems to be working correctly on S3 and on Edgecast, our CDN.

On S3:

> $ curl -v -H 'Origin: mozilla.org' https://tiles-resources-prod-tiless3-qbv71djahz3b.s3.amazonaws.com/distributions/desktop/336518d62fe0cdff52ec0dff97ffb95608fd8c8c.2015-07-14T04-54-47.916077.json > /dev/null
> * Hostname was NOT found in DNS cache
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 54.231.160.58...
> * Connected to tiles-resources-prod-tiless3-qbv71djahz3b.s3.amazonaws.com (54.231.160.58) port 443 (#0)
> * TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA
> * Server certificate: *.s3.amazonaws.com
> * Server certificate: VeriSign Class 3 Secure Server CA - G3
> * Server certificate: VeriSign Class 3 Public Primary Certification Authority - G5
> > GET /distributions/desktop/336518d62fe0cdff52ec0dff97ffb95608fd8c8c.2015-07-14T04-54-47.916077.json HTTP/1.1
> > User-Agent: curl/7.37.1
> > Host: tiles-resources-prod-tiless3-qbv71djahz3b.s3.amazonaws.com
> > Accept: */*
> > Origin: mozilla.org
> >
> < HTTP/1.1 200 OK
> < x-amz-id-2: 6z1Kv4JSra/0DcnSDMnEaRrHz7f6/BnS0i4ut7PL61nTCswBM8mJ2zPxWXkpA2b0T5UD04U4g9o=
> < x-amz-request-id: 8C4A982D71211C76
> < Date: Tue, 14 Jul 2015 14:34:04 GMT
> < Access-Control-Allow-Origin: *
> < Access-Control-Allow-Methods: GET
> < Vary: Origin, Access-Control-Request-Headers, Access-Control-Request-Method
> < Content-Disposition: inline
> < Cache-Control: public, max-age=31536000
> < Last-Modified: Tue, 14 Jul 2015 04:54:59 GMT
> < ETag: "19aeb4dcbab4c7999c735a4b135366d6"
> < Accept-Ranges: bytes
> < Content-Type: application/json
> < Content-Length: 2797594
> * Server AmazonS3 is not blacklisted
> < Server: AmazonS3

Loading the same resource on Edgecast for the first time yields the expected results:

> $ curl -v -H 'Origin: mozilla.org' https://tiles.cdn.mozilla.net/distributions/desktop/336518d62fe0cdff52ec0dff97ffb95608fd8c8c.2015-07-14T04-54-47.916077.json > /dev/null
> * Hostname was NOT found in DNS cache
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 93.184.215.191...
> * Connected to tiles.cdn.mozilla.net (93.184.215.191) port 443 (#0)
> * TLS 1.2 connection using TLS_ECDHE_RSA_WITH_RC4_128_SHA
> * Server certificate: *.cdn.mozilla.net
> * Server certificate: DigiCert High Assurance CA-3
> * Server certificate: DigiCert High Assurance EV Root CA
> > GET /distributions/desktop/336518d62fe0cdff52ec0dff97ffb95608fd8c8c.2015-07-14T04-54-47.916077.json HTTP/1.1
> > User-Agent: curl/7.37.1
> > Host: tiles.cdn.mozilla.net
> > Accept: */*
> > Origin: mozilla.org
> >
> < HTTP/1.1 200 OK
> < Accept-Ranges: bytes
> < Access-Control-Allow-Methods: GET
> < Access-Control-Allow-Origin: *
> < Cache-Control: public, max-age=31536000
> < Content-Disposition: inline
> < Content-Type: application/json
> < Date: Tue, 14 Jul 2015 15:07:46 GMT
> < Etag: "19aeb4dcbab4c7999c735a4b135366d6"
> < Last-Modified: Tue, 14 Jul 2015 04:54:59 GMT
> * Server AmazonS3 is not blacklisted
> < Server: AmazonS3
> < Vary: Origin, Access-Control-Request-Headers, Access-Control-Request-Method
> < x-amz-id-2: GleAv8bk6Wvl+uUBUusA7aisFFDEaWDKvkOCg5otcgnDW6e8roiEEh6rHFVc6+FbQpV9v0WOUGA=
> < x-amz-request-id: 1CD67B67413F9D88
> < Content-Length: 2797594

Loading it from Edgecast the second time also yields good results:

> $ curl -v -H 'Origin: mozilla.org' https://tiles.cdn.mozilla.net/distributions/desktop/336518d62fe0cdff52ec0dff97ffb95608fd8c8c.2015-07-14T04-54-47.916077.json > /dev/null
> * Hostname was NOT found in DNS cache
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 93.184.215.191...
> * Connected to tiles.cdn.mozilla.net (93.184.215.191) port 443 (#0)
> * TLS 1.2 connection using TLS_ECDHE_RSA_WITH_RC4_128_SHA
> * Server certificate: *.cdn.mozilla.net
> * Server certificate: DigiCert High Assurance CA-3
> * Server certificate: DigiCert High Assurance EV Root CA
> > GET /distributions/desktop/336518d62fe0cdff52ec0dff97ffb95608fd8c8c.2015-07-14T04-54-47.916077.json HTTP/1.1
> > User-Agent: curl/7.37.1
> > Host: tiles.cdn.mozilla.net
> > Accept: */*
> > Origin: mozilla.org
> >
> < HTTP/1.1 200 OK
> < Accept-Ranges: bytes
> < Access-Control-Allow-Methods: GET
> < Access-Control-Allow-Origin: *
> < Cache-Control: public, max-age=31536000
> < Content-Disposition: inline
> < Content-Type: application/json
> < Date: Tue, 14 Jul 2015 15:08:08 GMT
> < Etag: "19aeb4dcbab4c7999c735a4b135366d6"
> < Last-Modified: Tue, 14 Jul 2015 04:54:59 GMT
> * Server ECAcc (mdw/1270) is not blacklisted
> < Server: ECAcc (mdw/1270)
> < x-amz-id-2: GleAv8bk6Wvl+uUBUusA7aisFFDEaWDKvkOCg5otcgnDW6e8roiEEh6rHFVc6+FbQpV9v0WOUGA=
> < x-amz-request-id: 1CD67B67413F9D88
> < X-Cache: HIT
> < Content-Length: 2797594
Flags: needinfo?(oyiptong)
It seems ACAO is being included on every subsequent request, both to the CDN and to S3, at least for me.
Here's the curl for the fetch that is redirected to an .ag.json file that appears to be cached; onyx replies with ACAO but the cdn does not:

$ curl -vLH 'Origin: https://tiles.cdn.mozilla.net' https://tiles.services.mozilla.com/v3/links/fetch/en-US/nightly > /dev/null 
* Hostname was NOT found in DNS cache
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 54.191.162.200...
* Connected to tiles.services.mozilla.com (54.191.162.200) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
* Server certificate: *.services.mozilla.com
* Server certificate: DigiCert SHA2 Secure Server CA
* Server certificate: DigiCert Global Root CA
> GET /v3/links/fetch/en-US/nightly HTTP/1.1
> User-Agent: curl/7.37.1
> Host: tiles.services.mozilla.com
> Accept: */*
> Origin: https://tiles.cdn.mozilla.net
> 
< HTTP/1.1 303 SEE OTHER
< Access-Control-Allow-Origin: https://tiles.cdn.mozilla.net
< Content-Type: text/html; charset=utf-8
< Date: Tue, 14 Jul 2015 15:50:08 GMT
< Location: https://tiles.cdn.mozilla.net/desktop-prerelease/US/en-US.4a8406dd9f1f297c32243ec976b0ea691e60ca7e.ag.json
< Content-Length: 0
< Connection: keep-alive
< 
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connection #0 to host tiles.services.mozilla.com left intact
* Issue another request to this URL: 'https://tiles.cdn.mozilla.net/desktop-prerelease/US/en-US.4a8406dd9f1f297c32243ec976b0ea691e60ca7e.ag.json'
* Hostname was NOT found in DNS cache
*   Trying 93.184.215.191...
* Connected to tiles.cdn.mozilla.net (93.184.215.191) port 443 (#1)
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_RC4_128_SHA
* Server certificate: *.cdn.mozilla.net
* Server certificate: DigiCert High Assurance CA-3
* Server certificate: DigiCert High Assurance EV Root CA
> GET /desktop-prerelease/US/en-US.4a8406dd9f1f297c32243ec976b0ea691e60ca7e.ag.json HTTP/1.1
> User-Agent: curl/7.37.1
> Host: tiles.cdn.mozilla.net
> Accept: */*
> Origin: https://tiles.cdn.mozilla.net
> 
< HTTP/1.1 200 OK
< Accept-Ranges: bytes
< Cache-Control: public, max-age=31536000
< Content-Disposition: inline
< Content-Type: application/json
< Date: Tue, 14 Jul 2015 15:50:09 GMT
< Etag: "5ef360920494652b26a4ea312a2a4871"
< Last-Modified: Wed, 08 Jul 2015 04:52:08 GMT
* Server ECAcc (rhv/8125) is not blacklisted
< Server: ECAcc (rhv/8125)
< Vary: Accept-Encoding
< x-amz-id-2: SeczOjaHUzOh6vv9NAhjf/nNNCPlaWxrPyq1sKeShiVYmqWwMfGGnOZnudfAqohKgVvb90wPGsE=
< x-amz-request-id: 8CC82EBE992583ED
< X-Cache: HIT
< Content-Length: 9705
Here's the curl for the test "zu" locale that has the correct headers from cdn:

$ curl -vLH 'Origin: https://tiles.cdn.mozilla.net' https://tiles.services.mozilla.com/v3/links/fetch/zu/nightly > /dev/null 
* Hostname was NOT found in DNS cache
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 52.11.253.3...
* Connected to tiles.services.mozilla.com (52.11.253.3) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
* Server certificate: *.services.mozilla.com
* Server certificate: DigiCert SHA2 Secure Server CA
* Server certificate: DigiCert Global Root CA
> GET /v3/links/fetch/zu/nightly HTTP/1.1
> User-Agent: curl/7.37.1
> Host: tiles.services.mozilla.com
> Accept: */*
> Origin: https://tiles.cdn.mozilla.net
> 
< HTTP/1.1 303 SEE OTHER
< Access-Control-Allow-Origin: https://tiles.cdn.mozilla.net
< Content-Type: text/html; charset=utf-8
< Date: Tue, 14 Jul 2015 15:54:22 GMT
< Location: https://tiles.cdn.mozilla.net/desktop-prerelease/US/zu.828f009e314447404e3b32e838bb54e53462b61b.ag.json
< Content-Length: 0
< Connection: keep-alive
< 
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
* Connection #0 to host tiles.services.mozilla.com left intact
* Issue another request to this URL: 'https://tiles.cdn.mozilla.net/desktop-prerelease/US/zu.828f009e314447404e3b32e838bb54e53462b61b.ag.json'
* Hostname was NOT found in DNS cache
*   Trying 93.184.215.191...
* Connected to tiles.cdn.mozilla.net (93.184.215.191) port 443 (#1)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_RC4_128_SHA
* Server certificate: *.cdn.mozilla.net
* Server certificate: DigiCert High Assurance CA-3
* Server certificate: DigiCert High Assurance EV Root CA
> GET /desktop-prerelease/US/zu.828f009e314447404e3b32e838bb54e53462b61b.ag.json HTTP/1.1
> User-Agent: curl/7.37.1
> Host: tiles.cdn.mozilla.net
> Accept: */*
> Origin: https://tiles.cdn.mozilla.net
> 
< HTTP/1.1 200 OK
< Accept-Ranges: bytes
< Access-Control-Allow-Methods: GET
< Access-Control-Allow-Origin: *
< Cache-Control: public, max-age=31536000
< Content-Disposition: inline
< Content-Type: application/json
< Date: Tue, 14 Jul 2015 15:54:22 GMT
< Etag: "cba3380aedc165c68eb63ab08068bd4f"
< Last-Modified: Tue, 14 Jul 2015 06:59:25 GMT
* Server ECAcc (rhv/8156) is not blacklisted
< Server: ECAcc (rhv/8156)
< Vary: Accept-Encoding
< x-amz-id-2: V1LKhgiDG0LKIK4TWCFMA8OqVvzjWZKGibQIIXktopvpI1OO6hNlOAYkzMFvBgSvtFa3m7lY5e4=
< x-amz-request-id: 6AED1D9F76CEEFB7
< X-Cache: HIT
< Content-Length: 375
Following our conversation on IRC, I think I know what the problem is:

The CDN will cache the first response from S3. If the original request didn't specify the Origin header, the response from S3 will be without the ACAO header.

That means that the cached responses to subsequent requests will not have this header, even if those requests specify an Origin header.
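
The failure mode described above can be reproduced with a toy model. This is only a sketch: s3Respond and EdgeCache are made-up names, the cache is keyed on the URL alone, and the real Edgecast behavior may differ in details.

```javascript
// Toy origin: like the bucket-level CORS setup described above, it adds ACAO
// only when the incoming request carries an Origin header.
function s3Respond(requestHeaders) {
  const headers = { "Content-Type": "application/json" };
  if (requestHeaders.Origin) {
    headers["Access-Control-Allow-Origin"] = "*";
  }
  return { status: 200, headers };
}

// Toy edge cache keyed on URL only; request headers are ignored on lookup.
class EdgeCache {
  constructor(origin) {
    this.origin = origin;
    this.store = new Map();
  }

  get(url, requestHeaders = {}) {
    if (this.store.has(url)) {
      const hit = this.store.get(url);
      return { ...hit, headers: { ...hit.headers, "X-Cache": "HIT" } };
    }
    const fresh = this.origin(requestHeaders);
    this.store.set(url, fresh);
    return fresh;
  }
}

const edge = new EdgeCache(s3Respond);
edge.get("/desktop/US/en-US.ag.json"); // first request, no Origin header
const second = edge.get("/desktop/US/en-US.ag.json", {
  Origin: "https://tiles.cdn.mozilla.net",
});
// The cached copy is replayed, so ACAO is still missing here.
console.log(second.headers);
```

Whichever request populates the cache first decides whether every later response for that URL carries ACAO.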

A potential way around the caching issue is a technique called "cache-busting": adding a query parameter to the URL so that the cached copy is not used. The value could be a timestamp, but that would nullify the cache. We could possibly specify a day or a number representing the week or month?

> $ curl -v -H "Origin: mozilla.org" "https://tiles.cdn.mozilla.net/distributions/desktop/336518d62fe0cdff52ec0dff97ffb95608fd8c8c.2015-07-14T04-54-47.916077.json?cachebusting=true" > /dev/null
> * Hostname was NOT found in DNS cache
>   % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
>                                  Dload  Upload   Total   Spent    Left  Speed
>   0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0*   Trying 93.184.215.191...
> * Connected to tiles.cdn.mozilla.net (93.184.215.191) port 443 (#0)
> * TLS 1.2 connection using TLS_ECDHE_RSA_WITH_RC4_128_SHA
> * Server certificate: *.cdn.mozilla.net
> * Server certificate: DigiCert High Assurance CA-3
> * Server certificate: DigiCert High Assurance EV Root CA
> > GET /distributions/desktop/336518d62fe0cdff52ec0dff97ffb95608fd8c8c.2015-07-14T04-54-47.916077.json?cachebusting=true HTTP/1.1
> > User-Agent: curl/7.37.1
> > Host: tiles.cdn.mozilla.net
> > Accept: */*
> > Origin: mozilla.org
> >
> < HTTP/1.1 200 OK
> < Accept-Ranges: bytes
> < Access-Control-Allow-Methods: GET
> < Access-Control-Allow-Origin: *
> < Cache-Control: public, max-age=31536000
> < Content-Disposition: inline
> < Content-Type: application/json
> < Date: Tue, 14 Jul 2015 15:55:16 GMT
> < Etag: "19aeb4dcbab4c7999c735a4b135366d6"
> < Last-Modified: Tue, 14 Jul 2015 04:54:59 GMT
> * Server AmazonS3 is not blacklisted
> < Server: AmazonS3
> < Vary: Origin, Access-Control-Request-Headers, Access-Control-Request-Method
> < x-amz-id-2: 7mgcy0VrqQyCYSc+AViE+6fVm/LE+6ajwsrq0+OBD8QwsEAqVsJZllS+2c4GjcVnRY0wdkZQ1OQ=
> < x-amz-request-id: 516FE953F1FD8834
> < Content-Length: 2797594
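
A small sketch of the coarse-grained cache-busting idea from this comment, using a week number so the cache is only refreshed weekly. The helper name is made up, the parameter name is borrowed from the curl example above, and it assumes the CDN includes the query string in its cache key (which the Server: AmazonS3 response above suggests, but I have not confirmed it).

```javascript
const WEEK_MS = 7 * 24 * 60 * 60 * 1000;

// Hypothetical helper: appends a cache-busting parameter that only changes
// once per week, counted in whole weeks since the Unix epoch.
function withWeeklyCacheBuster(resourceUrl, now = new Date()) {
  const url = new URL(resourceUrl);
  const week = Math.floor(now.getTime() / WEEK_MS);
  url.searchParams.set("cachebusting", String(week));
  return url.toString();
}

console.log(
  withWeeklyCacheBuster(
    "https://tiles.cdn.mozilla.net/distributions/desktop/336518d62fe0cdff52ec0dff97ffb95608fd8c8c.2015-07-14T04-54-47.916077.json"
  )
);
```

Requests made in the same week share one cache entry, so the CDN still absorbs most of the traffic, while a bad cached response ages out when the week number changes.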
12:08 <oyiptong> hmm
12:08 <oyiptong> the simpler solution
12:08 <oyiptong> would be if the CDN always made the request to S3 with the Origin header
12:08 <oyiptong> so it always returns with ACAO
12:08 <oyiptong> mostlygeek: can we do that?
12:09 <oyiptong> can we make it so that edgecast always sends an additional header, even if the original request doesn't specify it?
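
The proposal in the IRC excerpt above, sketched as a toy model. Whether Edgecast can actually inject a request header like this is exactly the open question, so withForcedOrigin is purely hypothetical, as is the placeholder origin value.

```javascript
// Toy stand-in for the S3 behavior described earlier in this bug:
// ACAO is only returned when the request carries an Origin header.
function s3Respond(requestHeaders) {
  return requestHeaders.Origin ? { "Access-Control-Allow-Origin": "*" } : {};
}

// Hypothetical edge hook: add a placeholder Origin when the client sent none.
// Header names are treated as case-sensitive here for brevity; real HTTP
// header names are not.
function withForcedOrigin(clientHeaders, fallbackOrigin = "https://tiles.cdn.mozilla.net") {
  return { Origin: fallbackOrigin, ...clientHeaders };
}

// Without the hook, a request lacking Origin gets no ACAO from S3.
console.log(s3Respond({}));
// With the hook, the same request yields a response that is safe to cache.
console.log(s3Respond(withForcedOrigin({})));
```

If the edge always forwards some Origin, the first response it caches includes ACAO no matter which client populated the cache.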
Won't need this with bug 1183778. Also, with a separate hello channel from bug 1181368, most likely the cache will be populated with the appropriate ACAO:* response.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Flags: needinfo?(bwong)