The default bug view has changed. See this FAQ.

HTTP request headers lost on channel redirect (regression in Firefox 15)

RESOLVED FIXED in Firefox 17

Status

Cloud Services
Firefox Sync: Backend
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: gps, Assigned: rnewman)

Tracking

unspecified
mozilla19
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox15 affected, firefox16+ affected, firefox17+ fixed, firefox18+ fixed)

Details

(Whiteboard: [fixed in services][qa-])

Attachments

(5 attachments, 1 obsolete attachment)

(Reporter)

Description

5 years ago
Created attachment 668486 [details] [diff] [review]
Modify Sync's RESTRequest test to verify UA

+++ This bug was initially created as a clone of Bug #794652 +++

Cloning bug into the public sphere.

As part of investigating bug 794652, it appears the underlying issue is that Necko is losing HTTP request headers when processing redirects, either HTTP or internal channel redirects. The issue from bug 794652 manifests in different ways:

A) We set a custom User-Agent request header and after redirect the application's default User-Agent header is sent over the wire, not the custom value.
B) We set an Authorization header and we believe (however we haven't yet confirmed) that this request header is lost as part of handling the redirect.

The specific issue plaguing Sync saw an uptick when Firefox 15 hit the stable channel on August 28. See bug 794652 for some raw data. Essentially, there was a regression somewhere in the 14-15 window.

I have implemented a unit test that shows HTTP request headers are not preserved across channel redirects. However, my unit test fails on Firefox 14. bsmith tells me there are multiple classes of channel redirects and its possible that I'm not testing the right one. So, there appear to be multiple pathways affected.

The test in question tests code at https://mxr.mozilla.org/mozilla-central/source/services/common/rest.js.

See related: bug 553888, bug 401564.
Blocks: 794652
No longer depends on: 794652
(In reply to Gregory Szorc [:gps] from comment #0)
> However, my unit test fails on Firefox 14. 

So, is this really an Fx 15 regression then?
(Assignee)

Comment 2

5 years ago
(In reply to Honza Bambas (:mayhemer) from comment #1)
> (In reply to Gregory Szorc [:gps] from comment #0)
> > However, my unit test fails on Firefox 14. 
> 
> So, is this really an Fx 15 regression then?

I think Greg's point is summarized as:

* There seem to be multiple cases in which Necko doesn't copy headers (correctly) across redirects
* Here's a test that illustrates one of these
* Other evidence seems to suggest that the particular regression that we want to fix happened in 15, not 14
* There are lots of different redirect scenarios
* Ergo, please treat this test as an example of the kind of thing that *should* work but does not, and is likely to be the cause of Bug 794652.

Comment 3

5 years ago
Honza, Patrick - either of you have any idea what might be wrong here?
Er, Necko doesn't copy headers across redirects, period. Unless that changed lately...?
(Assignee)

Comment 5

5 years ago
(In reply to Christian :Biesinger (don't email me, ping me on IRC) from comment #4)
> Er, Necko doesn't copy headers across redirects, period. Unless that changed
> lately...?

Then perhaps what happened in fx15 is that something that previously _didn't_ appear to be a redirect now _does_; some random distribution across Sync requests (not just at the start of a sync!) now arrive without our custom headers, and this appears to affect only 15+.

Our `asyncOnChannelRedirect` method just says "OK" to the redirect. Should it be initializing `newChannel` as we do when we first create the initial channel?
Flags: needinfo?(cbiesinger)
(Assignee)

Comment 6

5 years ago
Created attachment 668886 [details] [diff] [review]
Illustrative patch — Sync's channel notification listener doesn't copy headers. v1
The Sync server doesn't ever (according to the Sync guys) send any 3xx responses. So, any redirects that happen would be "internal" redirects. The hypothesis is that something has changed where we are now doing internal redirects now where we didn't before.

However, we also filed this because originally GPS thought that his test case passed on 14 and broke on 15. But, if it fails on 14 too, then there's less (no?) evidence that this is a problem with (internal) redirects.
(Assignee)

Comment 8

5 years ago
Comment on attachment 668886 [details] [diff] [review]
Illustrative patch — Sync's channel notification listener doesn't copy headers. v1

I want Greg to see if this makes a difference. (His test patch doesn't apply on my tree…)
Attachment #668886 - Flags: review?(gps)
(Reporter)

Comment 9

5 years ago
Comment on attachment 668886 [details] [diff] [review]
Illustrative patch — Sync's channel notification listener doesn't copy headers. v1

Review of attachment 668886 [details] [diff] [review]:
-----------------------------------------------------------------

This seems like a straightforward patch. But, bsmith said that necko should copy headers transparently on internal redirects. biesi might be in disagreement. Given the confusion, I'd like to see a necko peer r+ this.

::: services/sync/modules/resource.js
@@ +588,5 @@
>    asyncOnChannelRedirect:
>      function asyncOnChannelRedirect(oldChannel, newChannel, flags, callback) {
>  
> +    // Copy across the redirect the headers that our caller set.
> +    for each (let header in this._headersToCopy) {

for (let x of y)
Attachment #668886 - Flags: review?(gps) → review+
(In reply to Gregory Szorc [:gps] from comment #9)
> This seems like a straightforward patch. But, bsmith said that necko should
> copy headers transparently on internal redirects. biesi might be in
> disagreement. Given the confusion, I'd like to see a necko peer r+ this.

I guess I am very wrong. Really, only the application (Sync in this case) can really know what to do on redirect.

First, I would write a separate patch that uses telemetry to count how many channels you make vs. how many times your redirect handler is called, and then push this to all three channels. Perhaps instead of logging a simple count, you should log the value of the redirect flags; then the count of redirects would be equal to the sum of the counts of the samples. This would allow you to distinguish between internal vs. server-initiated redirects, for example. (This should be done in the Sync bug.)

In your redirect handler, you should be checking that the new URI is equal to the old URI (or, at least, that they have the same origin), and probably also check that the that the redirect is an INTERNAL_REDIRECT.

It is a common pattern to have a function called SetupChannel() that does everything that needs to be done in both the initial and redirect cases. (Setting headers and various flags on the connection.) Then your redirect handler would just have to sanity-check the new URI and then call SetupChannel().

Anyway, if you're not trying to fix a bug in Necko then you should put the patch for Sync in the other bug (bug 794652).

Back to this (potential) bug: Recently we've been doing work on proxy-related things, including especially making synchronous proxy stuff asynchronous. Could this be causing us to do some internal redirect more frequently?
(Reporter)

Updated

5 years ago
Blocks: 794338
(Reporter)

Comment 11

5 years ago
Are there any about:config prefs, etc that people can switch to enable logging, etc so we can try to track this down?
(Assignee)

Comment 12

5 years ago
(In reply to Gregory Szorc [:gps] from comment #11)
> Are there any about:config prefs, etc that people can switch to enable
> logging, etc so we can try to track this down?

https://developer.mozilla.org/docs/HTTP_Logging
(Assignee)

Comment 13

5 years ago
(In reply to Brian Smith (:bsmith) from comment #10)

> I guess I am very wrong. Really, only the application (Sync in this case)
> can really know what to do on redirect.

Is there a document somewhere that defines exactly what Necko will copy in which circumstances?

For example, I assume that it copies the request URI (at least, it will for nsIHttpChannel). Will it copy Content-Type? The request body? Channel flags?

IMO it would be really broken to — by default — transfer request bodies across an internal redirect without transferring the Content-Type header, so is Biesi oversimplifying when he says "Necko doesn't copy headers across redirects, period", or does it not transfer request bodies?

Also, this reminds me of an old issue that some users reported. One example:

http://forums.mozillazine.org/viewtopic.php?f=27&t=2025227
(Reporter)

Updated

5 years ago
Blocks: 799111
(Reporter)

Comment 14

5 years ago
Created attachment 669685 [details] [diff] [review]
Applied header copying to RESTRequest

Per rnewman's request.
If you don't add the checks I suggest in comment 10, then you are creating significant security risk, by potentially forwarding the auth headers to an arbitrary domain. Please make sure you understand my suggestions in comment 10 before you check in that patch.
(Assignee)

Comment 16

5 years ago
(In reply to Brian Smith (:bsmith) from comment #15)
> If you don't add the checks I suggest in comment 10, then you are creating
> significant security risk, by potentially forwarding the auth headers to an
> arbitrary domain. Please make sure you understand my suggestions in comment
> 10 before you check in that patch.

Understood; this sketch patch was just verifying that this direction changes the behavior in the right way. I'll put a finished patch in Bug 794652.
(Reporter)

Comment 17

5 years ago
An awesome user attached an HTTP log to bug 799111. I'm sure a Necko peer will be able to make more sense of it than me.
(In reply to Gregory Szorc [:gps] from comment #17)
> An awesome user attached an HTTP log to bug 799111. I'm sure a Necko peer
> will be able to make more sense of it than me.

I commented on that bug.
(Assignee)

Comment 19

5 years ago
Since Bug 769764, there doesn't seem to be a way to trigger an internal redirect -- I don't see any examples from the netwerk tests, and I can only find bugs about it.

Can any of you necko folks suggest how I might write a test for this scenario?
(Assignee)

Updated

5 years ago
Assignee: nobody → rnewman
Component: Networking: HTTP → Firefox Sync: Backend
Product: Core → Mozilla Services
Version: Trunk → unspecified
(Assignee)

Comment 20

5 years ago
Created attachment 669860 [details] [diff] [review]
Proposed patch. v1

This version combines test and change.

Note that I factored out the logic for deciding redirect behavior, because we can't provoke a matching redirect in our test!

Only RESTRequest is tested.

xpcshell-tests pass for me. Have not yet run TPS.
Attachment #669860 - Flags: review?(gps)
(Assignee)

Updated

5 years ago
Attachment #669685 - Attachment is obsolete: true
(Reporter)

Comment 21

5 years ago
Comment on attachment 669860 [details] [diff] [review]
Proposed patch. v1

Review of attachment 669860 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM.
Attachment #669860 - Flags: review?(gps) → review+
Please wait with landing these patches until I finish investigation in bug 799111.  I agree with bsmith, this is potentially dangerous thing to do.
(Assignee)

Comment 23

5 years ago
(In reply to Honza Bambas (:mayhemer) from comment #22)
> Please wait with landing these patches until I finish investigation in bug
> 799111.  I agree with bsmith, this is potentially dangerous thing to do.

Note that the reviewed patch encapsulates bsmith's logic:

+    let isInternal = !!(flags & Ci.nsIChannelEventSink.REDIRECT_INTERNAL);
+    let isSameURI  = newChannel.URI.equals(oldChannel.URI);
+    this._log.debug("Channel redirect: " + oldChannel.URI.spec + ", " +
+                    newChannel.URI.spec + ", internal = " + isInternal);
+    return isInternal && isSameURI;


I don't know if it'll fix the problem (because I'm unable to trigger a redirect that would match those conditions), but I would be surprised if this patch were dangerous. (Please enlighten me if it is, though!)
Status: NEW → ASSIGNED
(In reply to Richard Newman [:rnewman] from comment #23)
> Note that the reviewed patch encapsulates bsmith's logic:

With this logic it is then safe.
(Assignee)

Comment 25

5 years ago
https://hg.mozilla.org/services/services-central/rev/c11c4092c4bd
https://hg.mozilla.org/integration/mozilla-inbound/rev/f75d369a701f

Will ask bobm to see if this has an impact when it gets into Nightly. Then we can see about uplifts.
Flags: needinfo?(cbiesinger)
Whiteboard: [fixed in services][qa-]
Target Milestone: --- → mozilla19
Comment on attachment 669860 [details] [diff] [review]
Proposed patch. v1

Review of attachment 669860 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/common/rest.js
@@ +561,5 @@
> +    try {
> +      if (this.shouldCopyOnRedirect(oldChannel, newChannel, flags)) {
> +        this._log.trace("Copying headers for safe internal redirect.");
> +        for (let key in this._headers) {
> +          newChannel.setRequestHeader(key, this._headers[key], false);

Good thing we split nsHttpHeaderArray::SetHeaderFromNet out of SetHeader, or we'd have barfed on singleton headers when you do this.  Yay for conservative design choices...
(Assignee)

Comment 27

5 years ago
(In reply to Jason Duell (:jduell) from comment #26)

> Good thing we split nsHttpHeaderArray::SetHeaderFromNet out of SetHeader, or
> we'd have barfed on singleton headers when you do this.  Yay for
> conservative design choices...

To be fair, the code that sets the header in the first place has been doing this since ~2009…
https://hg.mozilla.org/mozilla-central/rev/f75d369a701f
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
(Assignee)

Comment 29

5 years ago
Set all the flags!
status-firefox16: --- → affected
status-firefox17: --- → affected
status-firefox18: --- → affected
tracking-firefox18: --- → ?
(In reply to Richard Newman [:rnewman] from comment #29)
> Set all the flags!

Nominate for uplift all the patches!
(Assignee)

Comment 31

5 years ago
bobm: would you be so kind as to see if Nightly 19 UA 401 rates have decreased in the past 5 days?
(Reporter)

Updated

5 years ago
Blocks: 802601
(Reporter)

Comment 32

5 years ago
A user in bug 794338 said that Nightly caused their Sync to start working (presumably meaning their 401's stopped).

Updated

5 years ago
tracking-firefox18: ? → +
(Assignee)

Comment 33

5 years ago
Nightly numbers in the error logs are too small to make a determination, but there are two 20121012 and up 19.0a1 UAs with 401s. This is in contrast to user reports that this fixes the problem.

My instinct is to uplift; worst-case (probably) it just doesn't have an effect. We'll be able to see more clearly on Aurora and Beta.
(Assignee)

Comment 34

5 years ago
Created attachment 672962 [details] [diff] [review]
Patch for Aurora.

I aim to land this in Aurora, watch for screams, then Beta.

[Approval Request Comment]
Bug caused by (feature/regressing bug #):
  Unknown.

User impact if declined:
  Some users, sometimes, will see messaging about incorrect Sync credentials.

Testing completed (on m-c, etc.): 
  Been in Nightly for a week; no screams. Some users have verified that this fixes their problem, but we don't have enough statistics to know for sure.

Risk to taking this patch (and alternatives if risky): 
  Slim; it's additive.

String or UUID changes made by this patch: 
  None.
Attachment #672962 - Flags: review+
Attachment #672962 - Flags: approval-mozilla-beta?
Attachment #672962 - Flags: approval-mozilla-aurora?
(Assignee)

Updated

5 years ago
Attachment #672962 - Attachment description: Patch for Aurora and Beta. → Patch for Aurora.
Attachment #672962 - Flags: approval-mozilla-beta?
(Assignee)

Comment 35

5 years ago
Created attachment 672967 [details] [diff] [review]
Patch for Beta.

Beta doesn't have Bug 644734, so the Aurora patch failed to apply cleanly.
Attachment #672967 - Flags: review+
Attachment #672967 - Flags: approval-mozilla-beta?
Comment on attachment 672962 [details] [diff] [review]
Patch for Aurora.

Report back on the volume of any Aurora screams so we can decide if it's time to go ahead with Beta as well.
Attachment #672962 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
(Assignee)

Comment 37

5 years ago
https://hg.mozilla.org/releases/mozilla-aurora/rev/db0f6c3046b2

*prepares earplugs*
status-firefox18: affected → fixed
Triage drive-by, waiting a little longer for this one - we'll want it in before beta 4 (next Monday).
(Assignee)

Comment 39

5 years ago
bobm reports ~0 Aurora 401s for Sync now. I've heard no screams. Flagging for beta approval. Ready when you are!
Comment on attachment 672967 [details] [diff] [review]
Patch for Beta.

This is a good time in the beta cycle to take changes of this nature - thanks Richard.
Attachment #672967 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(Reporter)

Comment 41

5 years ago
https://hg.mozilla.org/releases/mozilla-beta/rev/f52112829d72
status-firefox17: affected → fixed
(Reporter)

Updated

4 years ago
Blocks: 807074
(Reporter)

Updated

4 years ago
No longer blocks: 807074
Duplicate of this bug: 807074
(Assignee)

Updated

4 years ago
Duplicate of this bug: 794338
(Assignee)

Comment 44

4 years ago
Bug 806494 includes this log:

1355415075728	Sync.Service	INFO	In sync().
1355415075728	Sync.Status	INFO	Resetting Status.
1355415075728	Sync.Status	DEBUG	Status.service: success.status_ok => success.status_ok
1355415075861	Sync.Resource	DEBUG	Channel redirect: https://scl2-sync4.services.mozilla.com/1.1/vi4usibtxmx2cgoutp5piyzw2uqtlbod/info/collections, https://scl2-sync4.services.mozilla.com/1.1/vi4usibtxmx2cgoutp5piyzw2uqtlbod/info/collections, 4
1355415076572	Sync.Resource	DEBUG	mesg: GET fail 401 https://scl2-sync4.services.mozilla.com/1.1/vi4usibtxmx2cgoutp5piyzw2uqtlbod/info/collections
1355415076572	Sync.Resource	DEBUG	GET fail 401 https://scl2-sync4.services.mozilla.com/1.1/vi4usibtxmx2cgoutp5piyzw2uqtlbod/info/collections

That '4' denotes an internal redirect. The URIs are equal, so:

    // For internal redirects, copy the headers that our caller set.
    try {
      if ((flags & Ci.nsIChannelEventSink.REDIRECT_INTERNAL) &&
          newChannel.URI.equals(oldChannel.URI)) {
        this._log.trace("Copying headers for safe internal redirect.");
        for (let header of this._headersToCopy) {
          let value = oldChannel.getRequestHeader(header);
          if (value) {
            newChannel.setRequestHeader(header, value);
          }
        }
      }
    } catch (ex) {
      this._log.error("Error copying headers: " + CommonUtils.exceptionStr(ex));
    }

will run. There's no error, yet the request still gets a 401 response.

Either there's a bug in this code, or there's some omission after the redirect that we aren't handling.
(Assignee)

Updated

4 years ago
Duplicate of this bug: 781017
You need to log in before you can comment on or make changes to this bug.