Last Comment Bug 798430 - HTTP request headers lost on channel redirect (regression in Firefox 15)
: HTTP request headers lost on channel redirect (regression in Firefox 15)
Status: RESOLVED FIXED
[fixed in services][qa-]
:
Product: Cloud Services
Classification: Client Software
Component: Firefox Sync: Backend (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla19
Assigned To: Richard Newman [:rnewman]
:
:
Mentors:
: 781017 794338 807074 (view as bug list)
Depends on:
Blocks: 794338 794652 799111 802601
  Show dependency treegraph
 
Reported: 2012-10-05 09:53 PDT by Gregory Szorc [:gps]
Modified: 2012-12-19 10:41 PST (History)
21 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
affected
+
affected
+
fixed
+
fixed


Attachments
Modify Sync's RESTRequest test to verify UA (4.69 KB, patch)
2012-10-05 09:53 PDT, Gregory Szorc [:gps]
no flags Details | Diff | Splinter Review
Illustrative patch — Sync's channel notification listener doesn't copy headers. v1 (2.95 KB, patch)
2012-10-07 01:03 PDT, Richard Newman [:rnewman]
gps: review+
Details | Diff | Splinter Review
Applied header copying to RESTRequest (3.50 KB, patch)
2012-10-09 12:13 PDT, Gregory Szorc [:gps]
no flags Details | Diff | Splinter Review
Proposed patch. v1 (9.33 KB, patch)
2012-10-09 21:51 PDT, Richard Newman [:rnewman]
gps: review+
Details | Diff | Splinter Review
Patch for Aurora. (9.46 KB, patch)
2012-10-18 14:39 PDT, Richard Newman [:rnewman]
rnewman: review+
lukasblakk+bugs: approval‑mozilla‑aurora+
Details | Diff | Splinter Review
Patch for Beta. (9.05 KB, patch)
2012-10-18 14:43 PDT, Richard Newman [:rnewman]
rnewman: review+
akeybl: approval‑mozilla‑beta+
Details | Diff | Splinter Review

Description Gregory Szorc [:gps] 2012-10-05 09:53:45 PDT
Created attachment 668486 [details] [diff] [review]
Modify Sync's RESTRequest test to verify UA

+++ This bug was initially created as a clone of Bug #794652 +++

Cloning bug into the public sphere.

As part of investigating bug 794652, it appears the underlying issue is that Necko is losing HTTP request headers when processing redirects, either HTTP or internal channel redirects. The issue from bug 794652 manifests in different ways:

A) We set a custom User-Agent request header and after redirect the application's default User-Agent header is sent over the wire, not the custom value.
B) We set an Authorization header and we believe (however we haven't yet confirmed) that this request header is lost as part of handling the redirect.

The specific issue plaguing Sync saw an uptick when Firefox 15 hit the stable channel on August 28. See bug 794652 for some raw data. Essentially, there was a regression somewhere in the 14-15 window.

I have implemented a unit test that shows HTTP request headers are not preserved across channel redirects. However, my unit test fails on Firefox 14. bsmith tells me there are multiple classes of channel redirects and its possible that I'm not testing the right one. So, there appear to be multiple pathways affected.

The test in question tests code at https://mxr.mozilla.org/mozilla-central/source/services/common/rest.js.

See related: bug 553888, bug 401564.
Comment 1 Honza Bambas (:mayhemer) 2012-10-05 12:51:34 PDT
(In reply to Gregory Szorc [:gps] from comment #0)
> However, my unit test fails on Firefox 14. 

So, is this really an Fx 15 regression then?
Comment 2 Richard Newman [:rnewman] 2012-10-05 13:05:30 PDT
(In reply to Honza Bambas (:mayhemer) from comment #1)
> (In reply to Gregory Szorc [:gps] from comment #0)
> > However, my unit test fails on Firefox 14. 
> 
> So, is this really an Fx 15 regression then?

I think Greg's point is summarized as:

* There seem to be multiple cases in which Necko doesn't copy headers (correctly) across redirects
* Here's a test that illustrates one of these
* Other evidence seems to suggest that the particular regression that we want to fix happened in 15, not 14
* There are lots of different redirect scenarios
* Ergo, please treat this test as an example of the kind of thing that *should* work but does not, and is likely to be the cause of Bug 794652.
Comment 3 Josh Aas 2012-10-06 07:20:22 PDT
Honza, Patrick - either of you have any idea what might be wrong here?
Comment 4 Christian :Biesinger (don't email me, ping me on IRC) 2012-10-07 00:29:03 PDT
Er, Necko doesn't copy headers across redirects, period. Unless that changed lately...?
Comment 5 Richard Newman [:rnewman] 2012-10-07 00:48:46 PDT
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #4)
> Er, Necko doesn't copy headers across redirects, period. Unless that changed
> lately...?

Then perhaps what happened in fx15 is that something that previously _didn't_ appear to be a redirect now _does_; some random distribution across Sync requests (not just at the start of a sync!) now arrive without our custom headers, and this appears to affect only 15+.

Our `asyncOnChannelRedirect` method just says "OK" to the redirect. Should it be initializing `newChannel` as we do when we first create the initial channel?
Comment 6 Richard Newman [:rnewman] 2012-10-07 01:03:31 PDT
Created attachment 668886 [details] [diff] [review]
Illustrative patch — Sync's channel notification listener doesn't copy headers. v1
Comment 7 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-10-07 01:52:46 PDT
The Sync server doesn't ever (according to the Sync guys) send any 3xx responses. So, any redirects that happen would be "internal" redirects. The hypothesis is that something has changed where we are now doing internal redirects now where we didn't before.

However, we also filed this because originally GPS thought that his test case passed on 14 and broke on 15. But, if it fails on 14 too, then there's less (no?) evidence that this is a problem with (internal) redirects.
Comment 8 Richard Newman [:rnewman] 2012-10-07 12:08:48 PDT
Comment on attachment 668886 [details] [diff] [review]
Illustrative patch — Sync's channel notification listener doesn't copy headers. v1

I want Greg to see if this makes a difference. (His test patch doesn't apply on my tree…)
Comment 9 Gregory Szorc [:gps] 2012-10-07 12:13:23 PDT
Comment on attachment 668886 [details] [diff] [review]
Illustrative patch — Sync's channel notification listener doesn't copy headers. v1

Review of attachment 668886 [details] [diff] [review]:
-----------------------------------------------------------------

This seems like a straightforward patch. But, bsmith said that necko should copy headers transparently on internal redirects. biesi might be in disagreement. Given the confusion, I'd like to see a necko peer r+ this.

::: services/sync/modules/resource.js
@@ +588,5 @@
>    asyncOnChannelRedirect:
>      function asyncOnChannelRedirect(oldChannel, newChannel, flags, callback) {
>  
> +    // Copy across the redirect the headers that our caller set.
> +    for each (let header in this._headersToCopy) {

for (let x of y)
Comment 10 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-10-07 13:48:05 PDT
(In reply to Gregory Szorc [:gps] from comment #9)
> This seems like a straightforward patch. But, bsmith said that necko should
> copy headers transparently on internal redirects. biesi might be in
> disagreement. Given the confusion, I'd like to see a necko peer r+ this.

I guess I am very wrong. Really, only the application (Sync in this case) can really know what to do on redirect.

First, I would write a separate patch that uses telemetry to count how many channels you make vs. how many times your redirect handler is called, and then push this to all three channels. Perhaps instead of logging a simple count, you should log the value of the redirect flags; then the count of redirects would be equal to the sum of the counts of the samples. This would allow you to distinguish between internal vs. server-initiated redirects, for example. (This should be done in the Sync bug.)

In your redirect handler, you should be checking that the new URI is equal to the old URI (or, at least, that they have the same origin), and probably also check that the that the redirect is an INTERNAL_REDIRECT.

It is a common pattern to have a function called SetupChannel() that does everything that needs to be done in both the initial and redirect cases. (Setting headers and various flags on the connection.) Then your redirect handler would just have to sanity-check the new URI and then call SetupChannel().

Anyway, if you're not trying to fix a bug in Necko then you should put the patch for Sync in the other bug (bug 794652).

Back to this (potential) bug: Recently we've been doing work on proxy-related things, including especially making synchronous proxy stuff asynchronous. Could this be causing us to do some internal redirect more frequently?
Comment 11 Gregory Szorc [:gps] 2012-10-07 16:51:46 PDT
Are there any about:config prefs, etc that people can switch to enable logging, etc so we can try to track this down?
Comment 12 Richard Newman [:rnewman] 2012-10-07 17:32:41 PDT
(In reply to Gregory Szorc [:gps] from comment #11)
> Are there any about:config prefs, etc that people can switch to enable
> logging, etc so we can try to track this down?

https://developer.mozilla.org/docs/HTTP_Logging
Comment 13 Richard Newman [:rnewman] 2012-10-07 17:41:02 PDT
(In reply to Brian Smith (:bsmith) from comment #10)

> I guess I am very wrong. Really, only the application (Sync in this case)
> can really know what to do on redirect.

Is there a document somewhere that defines exactly what Necko will copy in which circumstances?

For example, I assume that it copies the request URI (at least, it will for nsIHttpChannel). Will it copy Content-Type? The request body? Channel flags?

IMO it would be really broken to — by default — transfer request bodies across an internal redirect without transferring the Content-Type header, so is Biesi oversimplifying when he says "Necko doesn't copy headers across redirects, period", or does it not transfer request bodies?

Also, this reminds me of an old issue that some users reported. One example:

http://forums.mozillazine.org/viewtopic.php?f=27&t=2025227
Comment 14 Gregory Szorc [:gps] 2012-10-09 12:13:17 PDT
Created attachment 669685 [details] [diff] [review]
Applied header copying to RESTRequest

Per rnewman's request.
Comment 15 Brian Smith (:briansmith, :bsmith, use NEEDINFO?) 2012-10-09 12:40:07 PDT
If you don't add the checks I suggest in comment 10, then you are creating significant security risk, by potentially forwarding the auth headers to an arbitrary domain. Please make sure you understand my suggestions in comment 10 before you check in that patch.
Comment 16 Richard Newman [:rnewman] 2012-10-09 13:14:37 PDT
(In reply to Brian Smith (:bsmith) from comment #15)
> If you don't add the checks I suggest in comment 10, then you are creating
> significant security risk, by potentially forwarding the auth headers to an
> arbitrary domain. Please make sure you understand my suggestions in comment
> 10 before you check in that patch.

Understood; this sketch patch was just verifying that this direction changes the behavior in the right way. I'll put a finished patch in Bug 794652.
Comment 17 Gregory Szorc [:gps] 2012-10-09 13:34:24 PDT
An awesome user attached an HTTP log to bug 799111. I'm sure a Necko peer will be able to make more sense of it than me.
Comment 18 Honza Bambas (:mayhemer) 2012-10-09 15:55:01 PDT
(In reply to Gregory Szorc [:gps] from comment #17)
> An awesome user attached an HTTP log to bug 799111. I'm sure a Necko peer
> will be able to make more sense of it than me.

I commented on that bug.
Comment 19 Richard Newman [:rnewman] 2012-10-09 20:11:29 PDT
Since Bug 769764, there doesn't seem to be a way to trigger an internal redirect -- I don't see any examples from the netwerk tests, and I can only find bugs about it.

Can any of you necko folks suggest how I might write a test for this scenario?
Comment 20 Richard Newman [:rnewman] 2012-10-09 21:51:48 PDT
Created attachment 669860 [details] [diff] [review]
Proposed patch. v1

This version combines test and change.

Note that I factored out the logic for deciding redirect behavior, because we can't provoke a matching redirect in our test!

Only RESTRequest is tested.

xpcshell-tests pass for me. Have not yet run TPS.
Comment 21 Gregory Szorc [:gps] 2012-10-10 08:48:34 PDT
Comment on attachment 669860 [details] [diff] [review]
Proposed patch. v1

Review of attachment 669860 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM.
Comment 22 Honza Bambas (:mayhemer) 2012-10-10 08:52:38 PDT
Please wait with landing these patches until I finish investigation in bug 799111.  I agree with bsmith, this is potentially dangerous thing to do.
Comment 23 Richard Newman [:rnewman] 2012-10-10 10:13:14 PDT
(In reply to Honza Bambas (:mayhemer) from comment #22)
> Please wait with landing these patches until I finish investigation in bug
> 799111.  I agree with bsmith, this is potentially dangerous thing to do.

Note that the reviewed patch encapsulates bsmith's logic:

+    let isInternal = !!(flags & Ci.nsIChannelEventSink.REDIRECT_INTERNAL);
+    let isSameURI  = newChannel.URI.equals(oldChannel.URI);
+    this._log.debug("Channel redirect: " + oldChannel.URI.spec + ", " +
+                    newChannel.URI.spec + ", internal = " + isInternal);
+    return isInternal && isSameURI;


I don't know if it'll fix the problem (because I'm unable to trigger a redirect that would match those conditions), but I would be surprised if this patch were dangerous. (Please enlighten me if it is, though!)
Comment 24 Honza Bambas (:mayhemer) 2012-10-10 10:53:08 PDT
(In reply to Richard Newman [:rnewman] from comment #23)
> Note that the reviewed patch encapsulates bsmith's logic:

With this logic it is then safe.
Comment 25 Richard Newman [:rnewman] 2012-10-10 11:51:09 PDT
https://hg.mozilla.org/services/services-central/rev/c11c4092c4bd
https://hg.mozilla.org/integration/mozilla-inbound/rev/f75d369a701f

Will ask bobm to see if this has an impact when it gets into Nightly. Then we can see about uplifts.
Comment 26 Jason Duell [:jduell] (needinfo me) 2012-10-10 12:02:57 PDT
Comment on attachment 669860 [details] [diff] [review]
Proposed patch. v1

Review of attachment 669860 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/common/rest.js
@@ +561,5 @@
> +    try {
> +      if (this.shouldCopyOnRedirect(oldChannel, newChannel, flags)) {
> +        this._log.trace("Copying headers for safe internal redirect.");
> +        for (let key in this._headers) {
> +          newChannel.setRequestHeader(key, this._headers[key], false);

Good thing we split nsHttpHeaderArray::SetHeaderFromNet out of SetHeader, or we'd have barfed on singleton headers when you do this.  Yay for conservative design choices...
Comment 27 Richard Newman [:rnewman] 2012-10-10 12:04:14 PDT
(In reply to Jason Duell (:jduell) from comment #26)

> Good thing we split nsHttpHeaderArray::SetHeaderFromNet out of SetHeader, or
> we'd have barfed on singleton headers when you do this.  Yay for
> conservative design choices...

To be fair, the code that sets the header in the first place has been doing this since ~2009…
Comment 28 Ed Morley [:emorley] 2012-10-11 06:59:22 PDT
https://hg.mozilla.org/mozilla-central/rev/f75d369a701f
Comment 29 Richard Newman [:rnewman] 2012-10-11 21:03:58 PDT
Set all the flags!
Comment 30 Lukas Blakk [:lsblakk] use ?needinfo 2012-10-16 15:52:38 PDT
(In reply to Richard Newman [:rnewman] from comment #29)
> Set all the flags!

Nominate for uplift all the patches!
Comment 31 Richard Newman [:rnewman] 2012-10-16 17:11:02 PDT
bobm: would you be so kind as to see if Nightly 19 UA 401 rates have decreased in the past 5 days?
Comment 32 Gregory Szorc [:gps] 2012-10-17 10:04:08 PDT
A user in bug 794338 said that Nightly caused their Sync to start working (presumably meaning their 401's stopped).
Comment 33 Richard Newman [:rnewman] 2012-10-17 13:54:04 PDT
Nightly numbers in the error logs are too small to make a determination, but there are two 20121012 and up 19.0a1 UAs with 401s. This is in contrast to user reports that this fixes the problem.

My instinct is to uplift; worst-case (probably) it just doesn't have an effect. We'll be able to see more clearly on Aurora and Beta.
Comment 34 Richard Newman [:rnewman] 2012-10-18 14:39:07 PDT
Created attachment 672962 [details] [diff] [review]
Patch for Aurora.

I aim to land this in Aurora, watch for screams, then Beta.

[Approval Request Comment]
Bug caused by (feature/regressing bug #):
  Unknown.

User impact if declined:
  Some users, sometimes, will see messaging about incorrect Sync credentials.

Testing completed (on m-c, etc.): 
  Been in Nightly for a week; no screams. Some users have verified that this fixes their problem, but we don't have enough statistics to know for sure.

Risk to taking this patch (and alternatives if risky): 
  Slim; it's additive.

String or UUID changes made by this patch: 
  None.
Comment 35 Richard Newman [:rnewman] 2012-10-18 14:43:49 PDT
Created attachment 672967 [details] [diff] [review]
Patch for Beta.

Beta doesn't have Bug 644734, so the Aurora patch failed to apply cleanly.
Comment 36 Lukas Blakk [:lsblakk] use ?needinfo 2012-10-19 11:38:17 PDT
Comment on attachment 672962 [details] [diff] [review]
Patch for Aurora.

Report back on the volume of any Aurora screams so we can decide if it's time to go ahead with Beta as well.
Comment 37 Richard Newman [:rnewman] 2012-10-19 23:17:44 PDT
https://hg.mozilla.org/releases/mozilla-aurora/rev/db0f6c3046b2

*prepares earplugs*
Comment 38 Lukas Blakk [:lsblakk] use ?needinfo 2012-10-22 12:33:12 PDT
Triage drive-by, waiting a little longer for this one - we'll want it in before beta 4 (next Monday).
Comment 39 Richard Newman [:rnewman] 2012-10-23 18:52:58 PDT
bobm reports ~0 Aurora 401s for Sync now. I've heard no screams. Flagging for beta approval. Ready when you are!
Comment 40 Alex Keybl [:akeybl] 2012-10-24 09:58:46 PDT
Comment on attachment 672967 [details] [diff] [review]
Patch for Beta.

This is a good time in the beta cycle to take changes of this nature - thanks Richard.
Comment 42 Gregory Szorc [:gps] 2012-11-01 14:41:03 PDT
*** Bug 807074 has been marked as a duplicate of this bug. ***
Comment 43 Richard Newman [:rnewman] 2012-12-14 10:20:22 PST
*** Bug 794338 has been marked as a duplicate of this bug. ***
Comment 44 Richard Newman [:rnewman] 2012-12-16 14:48:07 PST
Bug 806494 includes this log:

1355415075728	Sync.Service	INFO	In sync().
1355415075728	Sync.Status	INFO	Resetting Status.
1355415075728	Sync.Status	DEBUG	Status.service: success.status_ok => success.status_ok
1355415075861	Sync.Resource	DEBUG	Channel redirect: https://scl2-sync4.services.mozilla.com/1.1/vi4usibtxmx2cgoutp5piyzw2uqtlbod/info/collections, https://scl2-sync4.services.mozilla.com/1.1/vi4usibtxmx2cgoutp5piyzw2uqtlbod/info/collections, 4
1355415076572	Sync.Resource	DEBUG	mesg: GET fail 401 https://scl2-sync4.services.mozilla.com/1.1/vi4usibtxmx2cgoutp5piyzw2uqtlbod/info/collections
1355415076572	Sync.Resource	DEBUG	GET fail 401 https://scl2-sync4.services.mozilla.com/1.1/vi4usibtxmx2cgoutp5piyzw2uqtlbod/info/collections

That '4' denotes an internal redirect. The URIs are equal, so:

    // For internal redirects, copy the headers that our caller set.
    try {
      if ((flags & Ci.nsIChannelEventSink.REDIRECT_INTERNAL) &&
          newChannel.URI.equals(oldChannel.URI)) {
        this._log.trace("Copying headers for safe internal redirect.");
        for (let header of this._headersToCopy) {
          let value = oldChannel.getRequestHeader(header);
          if (value) {
            newChannel.setRequestHeader(header, value);
          }
        }
      }
    } catch (ex) {
      this._log.error("Error copying headers: " + CommonUtils.exceptionStr(ex));
    }

will run. There's no error, yet the request still gets a 401 response.

Either there's a bug in this code, or there's some omission after the redirect that we aren't handling.
Comment 45 Richard Newman [:rnewman] 2012-12-19 10:41:17 PST
*** Bug 781017 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.