Closed Bug 1255031 Opened 8 years ago Closed 8 years ago

Input not accepting HB submissions. CORS?

Categories

(Input Graveyard :: Backend, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: glind, Assigned: osmose)

Details

## Observed:

- From localhost and from Travis CI testing, the 'phonehome' tests aren't working. The browser console blames CORS.

```
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at https://input.mozilla.org/api/v2/hb/. (Reason: CORS header 'Access-Control-Allow-Origin' missing).
Cross-Origin Request Blocked: The Same Origin Policy disallows reading the remote resource at https://input.mozilla.org/api/v2/hb/. (Reason: CORS header 'Access-Control-Allow-Origin' missing).
```

## Speculation:

 [bug 1252522] Upgrade to Django 1.8.10.
(I am also looking through my stuff, to see if `request` from jetpack got dumber somehow)
I have identified the problem:  

in `request` (from jetpack)

```
    xhr.setRequestHeader("Content-Type", 'application/json');
```
If I remove this line, all goes as planned.  I am not sure why :)  If any of you have insights, share them.

For now, I am going to patch around it.
(Ah, poop, I get 415 errors without this.  Maybe something changed here after all?)
We definitely had CORS headers on the HB api endpoints at one point. That was implemented in bug #1126030. The code looks ok at first glance:

https://github.com/mozilla/fjord/blob/master/fjord/heartbeat/api_views.py#L23

I have a couple of thoughts off the top of my head:

1. We updated DRF in Input (or something similar) and the flow changed so that as_view() isn't getting called? This doesn't feel likely.

2. Something changed between Django emitting the HTTP response and the response reaching the client, and the headers are getting stripped along the way. Maybe this is related to the recent HTTP/2 changes?


I think the first thing I'd do is see if it spits out the header when running in a local dev environment. Maybe worth checking on -stage, too. I think that'll shed more light on this.
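For checking whether the header comes back from a given environment, here is a minimal sketch in Python using only the standard library. The helper names (`cors_origin_header`, `fetch_headers`) and the localhost URL in the usage comment are assumptions made for illustration; they are not part of Input's code.

```python
import urllib.error
import urllib.request


def cors_origin_header(headers):
    """Return the Access-Control-Allow-Origin value, or None if absent.

    `headers` is any iterable of (name, value) pairs. Names are compared
    case-insensitively, since HTTP/2 responses use lowercase names.
    """
    for name, value in headers:
        if name.lower() == "access-control-allow-origin":
            return value
    return None


def fetch_headers(url, body=b"{}"):
    """POST `body` as JSON to `url` and return the response header pairs.

    A 4xx/5xx response is still informative here, so HTTPError is caught
    and its headers are returned instead of raising.
    """
    req = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return list(resp.headers.items())
    except urllib.error.HTTPError as err:
        return list(err.headers.items())


# Usage (hypothetical local dev URL; swap in -stage or prod as needed):
#   print(cors_origin_header(fetch_headers("http://localhost:8000/api/v2/hb/")))
```

Note that this behaves like curl, not like a browser: it sends the POST directly and never issues a pre-flight OPTIONS request.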
On a whim, I tried one of the curl request examples in the HB API docs and I get back an Access-Control-Allow-Origin header just like we want:

```
curl -v -XPOST 'https://input.mozilla.org/api/v2/hb/' \
    -H 'Accept: application/json; indent=4' \
    -H 'Content-type: application/json' \
    -d '
{
    "person_id": "c1dd81f2-6ece-11e4-8a01-843a4bc832e4",
    "survey_id": "lunch",
    "flow_id": "20141117_attempt1",
    "experiment_version": "1",
    "response_version": 1,
    "question_id": "howwaslunch",
    "question_text": "how was lunch?",
    "variation_id": "1",
    "updated_ts": 1416011156000,
    "is_test": true
}'
```

```
*   Trying 63.245.213.23...
* Connected to input.mozilla.org (63.245.213.23) port 443 (#0)
* found 173 certificates in /etc/ssl/certs/ca-certificates.crt
* found 704 certificates in /etc/ssl/certs
* ALPN, offering http/1.1
* SSL connection using TLS1.2 / ECDHE_RSA_AES_128_GCM_SHA256
* 	 server certificate verification OK
* 	 server certificate status verification SKIPPED
* 	 common name: input.mozilla.org (matched)
* 	 server certificate expiration date OK
* 	 server certificate activation date OK
* 	 certificate public key: RSA
* 	 certificate version: #3
* 	 subject: 
* 	 start date: Fri, 17 Apr 2015 00:00:00 GMT
* 	 expire date: Fri, 29 Apr 2016 12:00:00 GMT
* 	 issuer: C=US,O=DigiCert Inc,OU=www.digicert.com,CN=DigiCert SHA2 Extended Validation Server CA
* 	 compression: NULL
* ALPN, server accepted to use http/1.1
> POST /api/v2/hb/ HTTP/1.1
> Host: input.mozilla.org
> User-Agent: curl/7.43.0
> Accept: application/json; indent=4
> Content-type: application/json
> Content-Length: 332
> 
* upload completely sent off: 332 out of 332 bytes
< HTTP/1.1 400 BAD REQUEST
< Server: Apache
< Vary: Cookie
< X-Backend-Server: input3.webapp.phx1.mozilla.com
< Content-Type: application/json; indent=4
< Public-Key-Pins: max-age=1296000; pin-sha256="r/mIkG3eEpVdm+u/ko/cwxzOMo1bk4TyHIlByibiA5E="; pin-sha256="WoiWRyIOVNa9ihaBciRSC7XHjliYS9VwUGOIud4PB18=";
< Date: Wed, 09 Mar 2016 17:33:21 GMT
< Transfer-Encoding: chunked
< Access-Control-Allow-Origin: *
< Connection: close
< Set-Cookie: anoncsrf=ilsmNSGehxg3R0tCgBYbxagjyOYo0m5M; expires=Wed, 09-Mar-2016 19:33:21 GMT; httponly; Max-Age=7200; Path=/; secure
< Set-Cookie: multidb_pin_writes=y; expires=Wed, 09-Mar-2016 17:33:36 GMT; httponly; Max-Age=15; Path=/
< X-Frame-Options: DENY
< Allow: POST, OPTIONS
< X-Cache-Info: not cacheable; request wasn't a GET or HEAD
< 
{
    "msg": "bad request; see errors",
    "errors": {
        "updated_ts": "updated timestamp is same or older than existing data"
    }
* Closing connection 0
```

That's using HTTP 1.1. curl has an --http2 flag to force it to use http2, but I'm getting "Unsupported protocol" back, so I'm not sure how to make that work without a lot of hoops.
Disabling HTTP2 in prod fixed this. We've left it on in staging to aid diagnosing the issue some more. mythmon suspects a Firefox bug; he was unable to replicate this using curl over HTTP2 and the "Copy as curl" version of a failed request I generated in Firefox.
Severity: blocker → normal
Details on the failures:

- On Firefox, with HTTP2 enabled on the server, I get a "411 Length Required" error on the pre-flight OPTIONS request for a POST to Input's heartbeat endpoint.

- On Firefox, with HTTP2 disabled on the server, I get a "403 Forbidden" due to a missing payload. This is expected behavior consistent with the request successfully hitting the Python service.

- On curl, on both HTTP1.1 and HTTP2, mythmon gets the "403 Forbidden" as expected.

Hence, our suspicion that maybe Firefox is doing HTTP2 wrong somehow, or the mix of Firefox and our load balancer is not working.
Sorry for the need-info, Daniel, but would you be up for helping take a look at this?  Thanks!
Flags: needinfo?(daniel)
stephend: Thinking it's a Firefox bug is still just a suspicion at this point, and I don't think we're ready to have a Firefox dev look at it. We need to research the issue a bit more, and now that production isn't broken, I'm putting this on the backburner while I work on other stuff. Sorry if I made it sound like I was more confident in that guess than I am.
Flags: needinfo?(daniel)
(In reply to Michael Kelly [:mkelly,:Osmose] from comment #7)
> Details on the failures:
> 
> - On Firefox, with HTTP2 enabled on the server, I get a "411 Length
> Required" error on the pre-flight OPTIONS request for a POST to Input's
> heartbeat endpoint.

> A server MAY reject a request that contains a message body but not a
> Content-Length by responding with 411 (Length Required).

Does the pre-flight OPTIONS request have a message body?
For example, I totally just realized that curl probably doesn't pre-flight since that's a browser thing, so maybe our lack of replication in curl isn't quite valid. Whoops.
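To make the curl-versus-browser difference concrete, here is a rough Python sketch of how a browser decides whether a cross-origin request needs a pre-flight. It is a simplified model of the CORS rules, not the full algorithm (real browsers apply further checks on header values), and the function name `needs_preflight` is made up for illustration.

```python
# Methods and Content-Type values that do not, by themselves, trigger a
# pre-flight OPTIONS request.
SAFELISTED_METHODS = {"GET", "HEAD", "POST"}
SAFELISTED_CONTENT_TYPES = {
    "application/x-www-form-urlencoded",
    "multipart/form-data",
    "text/plain",
}
# Simplified set of request headers a page may set without a pre-flight.
SAFELISTED_HEADERS = {
    "accept",
    "accept-language",
    "content-language",
    "content-type",
}


def needs_preflight(method, headers):
    """Return True if a cross-origin request would be pre-flighted.

    `headers` maps the header names set by the page to their values.
    """
    if method.upper() not in SAFELISTED_METHODS:
        return True
    for name, value in headers.items():
        lname = name.lower()
        if lname not in SAFELISTED_HEADERS:
            return True
        if lname == "content-type":
            # Ignore parameters such as "; charset=utf-8".
            mime = value.split(";", 1)[0].strip().lower()
            if mime not in SAFELISTED_CONTENT_TYPES:
                return True
    return False
```

Under this model, `needs_preflight("POST", {"Content-Type": "application/json"})` is true, while the same POST with no explicit Content-Type is not pre-flighted. That would be consistent with the CORS error going away when the `setRequestHeader` line is removed, and with curl never hitting the failing OPTIONS request on its own.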
Check your server. A number of them had a problem with our h2 code, which sends the FIN (the END_STREAM flag) for the OPTIONS stream on a 0-byte DATA frame rather than on the HEADERS frame. If so, that would be a (common) server bug.
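For anyone unfamiliar with the framing, here is a small Python sketch of the two ways a body-less request can end its stream in HTTP/2. The frame layout and flag values follow the HTTP/2 spec (9-byte frame header: 24-bit length, type, flags, 31-bit stream id); the function names are invented for this example and `header_block` is treated as opaque, already HPACK-encoded bytes.

```python
import struct

FRAME_DATA = 0x0
FRAME_HEADERS = 0x1
FLAG_END_STREAM = 0x1
FLAG_END_HEADERS = 0x4


def frame(frame_type, flags, stream_id, payload=b""):
    """Serialize one HTTP/2 frame: 9-byte header followed by the payload."""
    length = struct.pack(">I", len(payload))[1:]  # keep the low 24 bits
    rest = struct.pack(">BBI", frame_type, flags, stream_id & 0x7FFFFFFF)
    return length + rest + payload


def end_on_headers(stream_id, header_block):
    """End the stream on the HEADERS frame itself (single frame)."""
    flags = FLAG_END_HEADERS | FLAG_END_STREAM
    return [frame(FRAME_HEADERS, flags, stream_id, header_block)]


def end_on_empty_data(stream_id, header_block):
    """End the stream on a separate zero-length DATA frame."""
    return [
        frame(FRAME_HEADERS, FLAG_END_HEADERS, stream_id, header_block),
        frame(FRAME_DATA, FLAG_END_STREAM, stream_id),
    ]
```

Both sequences describe a request with no body. A server that treats the presence of any DATA frame as "this request has a body" could plausibly go on to demand a Content-Length for the second form, though that is a guess about the server's logic.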
(In reply to Richard Soderberg [:atoll] from comment #11)
> Does the pre-flight OPTIONS request have a message body?

Nope, it doesn't. The full server response is:

411 Length Required
A request of the requested method OPTIONS requires a valid Content-Length.

I'll research more.
(In reply to Patrick McManus [:mcmanus] from comment #13)
> check your server.. a number of them had a problem with our h2 code that
> sends the fin for the options stream on a 0 byte data frame (rather than on
> the HEADERS frame). If so, that would be a (common) server bug.

Where can I learn more about this issue and our history of communicating it to other server vendors? I'll probably have to explain it to our vendor, so any help would be appreciated :)
Okay. I can't reproduce the issue with curl, so I assume that this is indeed Firefox doing fancy things, as per bug 1247237.

```
> OPTIONS /api/v2/hb/ HTTP/1.1
> Host: input.allizom.org
> User-Agent: curl/7.47.1
> accept: text/html,application/xhtml+xml,application/xml,q=0.9,*/*;q=0.8
> accept-language: en-US,en;q=0.5
> access-control-request-method: POST
> access-control-request-headers: content-type
> origin: null
> 
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
< HTTP/2.0 200
< server:Apache
< vary:Cookie
< x-backend-server:input1.stage.webapp.phx1.mozilla.com
< content-type:text/html; charset=utf-8
< public-key-pins:max-age=1296000; pin-sha256="r/mIkG3eEpVdm+u/ko/cwxzOMo1bk4TyHIlByibiA5E="; pin-sha256="WoiWRyIOVNa9ihaBciRSC7XHjliYS9VwUGOIud4PB18=";
< date:Thu, 10 Mar 2016 17:31:23 GMT
< access-control-allow-origin:*
< set-cookie:anoncsrf=Xqq3OjMmjB54iBGWPo5xA8MNVmwjeJdM; expires=Thu, 10-Mar-2016 19:31:23 GMT; httponly; Max-Age=7200; Path=/; secure
< x-frame-options:DENY
< access-control-allow-headers:content-type
< access-control-allow-methods:POST
< content-length:0
```

Has anyone a la bug 1247237 written a command-line unit test for this sort of issue, that we could provide to the vendor?
I ended up using nghttp with the --no-content-length parameter to reproduce the issue with simple GET and OPTIONS requests. Bug 1255489 tracks our communications with the vendor, but there's nothing else the app needs to do to make this work in the future, so you may either close this bug or keep it open as a tracker (your preference).
Let's close it out, then, since we fixed it on prod. Once the Zeus issue is fixed feel free to bug me about enabling HTTP2 again. :D
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Product: Input → Input Graveyard