Closed Bug 1345893 Opened 7 years ago Closed 7 years ago

suspending channel during http-on-modify-request breaks redirect

Categories

(Core :: Networking: HTTP, defect, P1)

52 Branch
defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox52 --- wontfix
firefox-esr52 --- fixed
firefox53 --- wontfix
firefox54 --- verified
firefox55 --- fixed

People

(Reporter: fengyc.work, Assigned: mixedpuppy)

References

Details

(Whiteboard: [necko-active])

Attachments

(7 files)

Attached image redirect.PNG
User Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Firefox/52.0
Build ID: 20170302120751

Steps to reproduce:

In firefox 52 and later, when I try to redirect an old url to a new url, the browser still get the old url first and then redirect to the new url. 

E.g. redirect https://www.baidu.com to http://www.qq.com , the browser will get https://www.baidu.com and then redirect to http://www.qq.com . And if the old url is not accessible, the browser will say could not find the host and not redirect to the new url.

This behavior is different from firefox 51 and earlier version of firefox.

The code:

function redirect(details) {
    // Do something and return a blockingResponse
}

function handleRedirect(details) {
    if (_redirect_promise_supported) {
        var promise = new Promise(function (resolve, reject) {
            var blockingResponse = redirect(details);
            return resolve(blockingResponse);
        });
        return promise;
    } else {
        return redirect(details);
    }
}

browser.webRequest.onBeforeRequest.addListener(
    handleRedirect,
    {urls: ["<all_urls>"]},
    ["blocking"]
);



Actual results:

Redirect an old url to a new url, the browser will get the old url and then redirect to the new url. If the old url is not accessible, the browser could not redirect to the new url.


Expected results:

The browser does not need to get the old url, just redirect to the new url.
Component: Untriaged → WebExtensions: Request Handling
Product: Firefox → Toolkit
Found https://dxr.mozilla.org/mozilla-central/source/toolkit/modules/addons/WebRequest.jsm#924
Not sure where should I look at next... Any help would be great.
I can't let bug 1346077's STR work in Fx51.0.1, Nightly 55.0a1 (2017-03-10) (32-bit), etc.


This bug's STR code seems to be incomplete.
Thanks for replying! As the description said, this bug was first occurred in Fx52. I have reproduced it on Fx52.0 (Win-64) and the build from Mozilla build VM (trunk?). I'm uploading the request archives now.
Attached file without-undirect.saz
Attached file fx51-with-undirect.saz
Attached file fx52-with-undirect.saz
(In reply to 61.1p57 from comment #4)
> Thanks for replying! As the description said, this bug was first occurred in
> Fx52. I have reproduced it on Fx52.0 (Win-64) and the build from Mozilla
> build VM (trunk?). I'm uploading the request archives now.

But, I don't see the disqus-undirect 0.1.0 work in Fx51.0.1, with links on https://www.reddit.com/domain/disq.us/ (like http://disq.us/t/2hcdjbr or https://disq.us/t/2hcdjbr), the GET /t/2hcdjbr HTTP/1.1 is send.
> But, I don't see the disqus-undirect 0.1.0 work in Fx51.0.1, with links on
> https://www.reddit.com/domain/disq.us/ (like http://disq.us/t/2hcdjbr or
> https://disq.us/t/2hcdjbr), the GET /t/2hcdjbr HTTP/1.1 is send.

Well, it only redirects links that contain the original URL such as https://disq.us/url?url=http%3A%2F%2Fimququ.com%3A{rand}&cuid={uid}. I couldn't find a way to get the original URL for a short link without sending requests to disq.us, so I left it as is.
Regression window:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=60266cddf76a3c6d34266b64ad63d835d48ceb76&tochange=3e357271c0f733762d9cc5f5c70112b655e07a1e


STR:
bug 1346077's STR and this bug's comment 9.
Blocks: 1273138
Status: UNCONFIRMED → NEW
Has Regression Range: --- → yes
Has STR: --- → yes
Ever confirmed: true
(In reply to YF (Yang) from comment #10)
> Regression window:
> https://hg.mozilla.org/integration/autoland/
> pushloghtml?fromchange=60266cddf76a3c6d34266b64ad63d835d48ceb76&tochange=3e35
> 7271c0f733762d9cc5f5c70112b655e07a1e
> 
> 
> STR:
> bug 1346077's STR and this bug's comment 9.

I don't believe this is the regression.  That change should only affect system principal urls, and you would never get a call into the listener, thus be unable to redirect at all.  It was also later updated:

https://dxr.mozilla.org/mozilla-central/source/toolkit/modules/addons/WebRequest.jsm#763
shane investigating more
Flags: needinfo?(mixedpuppy)
How are you seeing that the original request is still happening?  Are you using a network monitor, or dev tools?  See Bug 1339063.  Dev tools still picks up the start of the original request and displays it.
Flags: needinfo?(mixedpuppy) → needinfo?(fengyc.work)
Answered my own question, it's in the image attached.  I think you're seeing a dev tools bug, not an actual network request.

YF, I still don't know how you decided on a regression, did you develop some test for that?  If so please attach.
Flags: needinfo?(yfdyh000)
I actually see the requests with network monitor (archive attached), and it blocks the page if there are issues with the original request such as invalid certificate.
Attached image Fx52 vs Fx51
After re-checking through mozregression (with that add-on, browser set to 127.0.0.1:8888 via Fiddlet), I got the same regression range.
Flags: needinfo?(yfdyh000)
(In reply to YF (Yang) from comment #16)
> Created attachment 8847142 [details]
> Fx52 vs Fx51
> 
> "with that add-on"

What addon?  If you have test code please attach it to the bug.  I know I can make my own, I want to see what you're doing to generate your results.
Flags: needinfo?(yfdyh000)
STR:
1. Install the https://addons.mozilla.org/en/firefox/addon/disqus-undirect/
2. Set up a network capture, e.g. Fiddlet.
3. Open the https://disq.us/url?url=http%3A%2F%2Fimququ.com%3A{rand}


Check the network packets, it should not have to connect to disq.us.
Flags: needinfo?(yfdyh000)
Perfect, thank you!
(In reply to Shane Caraveo (:mixedpuppy) from comment #13)
> How are you seeing that the original request is still happening?  Are you
> using a network monitor, or dev tools?  See Bug 1339063.  Dev tools still
> picks up the start of the original request and displays it.

To reproduce:

1. In firefox 52, install URLRedirector https://addons.mozilla.org/firefox/addon/urlredirector/
2. Open the options page of the addon, add a user rule, e.g. origin ".*yahoo.*" to target "https://www.microsoft.com" , enable the rule and click Save button.
3. Open fiddler to monitor http requests
4. In firefox 52, visit "yahoo.com", firefox would redirect the url to "https://www.microsoft.com"
5. In fiddler, could see a log "Tunnel to www.yahoo.com:443" and then followed by a log "Tunnel to www.microsoft.com:443"

Not a bug of dev tools.
URLRedirector doesn't seem to work for me.

I was able to reproduce this, I traced port 80 using tcpflow (redirecting http urls).  The request that would make it through is a cache validation request.  Once I cleared the firefox cache, I was unable to reproduce this.  Could one of you verify by clearing your cache and reproducing?

FWIW this is my test:

==== str

- in terminal: tcpflow -p -c -i en0 port 80
- temp load the addon
- open tab, type caraveo.com press enter
- tab redirects as expected
- terminal shows requests for both domains

==== manifest.json
{
  "description": "Redirect Test",
  "manifest_version": 2,
  "name": "Redirect Test",
  "version": "1.0",
  "permissions": [
    "webRequest",
    "webRequestBlocking",
    "<all_urls>"
  ],
  "background": {
    "scripts": ["background.js"]
  }
}


==== background.js
browser.webRequest.onBeforeRequest.addListener(details => {
  console.log(`onBeforeRequest ${details.url}`);
  if (details.url.includes("caraveo.com")) {
    let redirectUrl = details.url
                             .replace("caraveo.com", "tinyshinyhouseonwheels.com");
    return {redirectUrl};
  }
  return {};
}, {urls: ["<all_urls>"]}, ["blocking"]);
browser.webRequest.onErrorOccurred.addListener(details => {
  console.log(`onErrorOccurred ${details.url}`);
}, {urls: ["<all_urls>"]});
browser.webRequest.onCompleted.addListener(details => {
  console.log(`onCompleted ${details.url}`);
}, {urls: ["<all_urls>"]});

==== tcpflow output

tcpflow: listening on en0
192.168.001.010.54586-208.113.154.037.00080: GET / HTTP/1.1
Host: www.caraveo.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:55.0) Gecko/20100101 Firefox/55.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1
If-Modified-Since: Tue, 05 Sep 2006 07:26:42 GMT
If-None-Match: "15d-41cafc7b19080"
 
 
208.113.154.037.00080-192.168.001.010.54586: HTTP/1.1 304 Not Modified
Date: Thu, 16 Mar 2017 23:56:09 GMT
Server: Apache
Connection: Keep-Alive
Keep-Alive: timeout=2, max=100
ETag: "15d-41cafc7b19080"
Vary: Accept-Encoding
 
 
192.168.001.010.54589-208.113.154.169.00080: GET / HTTP/1.1
Host: tinyshinyhouseonwheels.com
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:55.0) Gecko/20100101 Firefox/55.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Connection: keep-alive
Upgrade-Insecure-Requests: 1
(In reply to Shane Caraveo (:mixedpuppy) from comment #21)
> URLRedirector doesn't seem to work for me.
> 
> I was able to reproduce this, I traced port 80 using tcpflow (redirecting
> http urls).  The request that would make it through is a cache validation
> request.  Once I cleared the firefox cache, I was unable to reproduce this. 
> Could one of you verify by clearing your cache and reproducing?
> 
> FWIW this is my test:
> 
> ==== str
> 
> - in terminal: tcpflow -p -c -i en0 port 80
> - temp load the addon
> - open tab, type caraveo.com press enter
> - tab redirects as expected
> - terminal shows requests for both domains
> 
> ==== manifest.json
> {
>   "description": "Redirect Test",
>   "manifest_version": 2,
>   "name": "Redirect Test",
>   "version": "1.0",
>   "permissions": [
>     "webRequest",
>     "webRequestBlocking",
>     "<all_urls>"
>   ],
>   "background": {
>     "scripts": ["background.js"]
>   }
> }
> 
> 
> ==== background.js
> browser.webRequest.onBeforeRequest.addListener(details => {
>   console.log(`onBeforeRequest ${details.url}`);
>   if (details.url.includes("caraveo.com")) {
>     let redirectUrl = details.url
>                              .replace("caraveo.com",
> "tinyshinyhouseonwheels.com");
>     return {redirectUrl};
>   }
>   return {};
> }, {urls: ["<all_urls>"]}, ["blocking"]);
> browser.webRequest.onErrorOccurred.addListener(details => {
>   console.log(`onErrorOccurred ${details.url}`);
> }, {urls: ["<all_urls>"]});
> browser.webRequest.onCompleted.addListener(details => {
>   console.log(`onCompleted ${details.url}`);
> }, {urls: ["<all_urls>"]});
> 
> ==== tcpflow output
> 
> tcpflow: listening on en0
> 192.168.001.010.54586-208.113.154.037.00080: GET / HTTP/1.1
> Host: www.caraveo.com
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:55.0)
> Gecko/20100101 Firefox/55.0
> Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> Accept-Language: en-US,en;q=0.5
> Accept-Encoding: gzip, deflate
> Connection: keep-alive
> Upgrade-Insecure-Requests: 1
> If-Modified-Since: Tue, 05 Sep 2006 07:26:42 GMT
> If-None-Match: "15d-41cafc7b19080"
>  
>  
> 208.113.154.037.00080-192.168.001.010.54586: HTTP/1.1 304 Not Modified
> Date: Thu, 16 Mar 2017 23:56:09 GMT
> Server: Apache
> Connection: Keep-Alive
> Keep-Alive: timeout=2, max=100
> ETag: "15d-41cafc7b19080"
> Vary: Accept-Encoding
>  
>  
> 192.168.001.010.54589-208.113.154.169.00080: GET / HTTP/1.1
> Host: tinyshinyhouseonwheels.com
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:55.0)
> Gecko/20100101 Firefox/55.0
> Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> Accept-Language: en-US,en;q=0.5
> Accept-Encoding: gzip, deflate
> Connection: keep-alive
> Upgrade-Insecure-Requests: 1

I can reproduct the bug on firefox 52 32bit, windows 7 64bit, with fiddler, and firefox 52, macOS 10.12.3, using the code.

The code doesn't work, so I change it a little bit:

manifest.json:
{
  "description": "Redirect Test",
  "manifest_version": 2,
  "name": "Redirect Test",
  "version": "1.0",
  "permissions": [
    "webRequest",
    "webRequestBlocking",
    "<all_urls>"
  ],
  "background": {
    "scripts": ["background.js"]
  }
}
backgroud.js:
browser.webRequest.onBeforeRequest.addListener(
    function (details) {
        console.log("onBeforeRequest " + details.url);
        if (details.url.match(/google/)) {
            var newURL = "http://www.caraveo.com/";
            return {redirectUrl: newURL};
      }
      return {};
    },
    {urls: ["<all_urls>"]},
    ["blocking"]);

browser.webRequest.onErrorOccurred.addListener(
    function (details){
        console.log("onErrorOccurred " + details.url);
    },
    {urls: ["<all_urls>"]});

browser.webRequest.onCompleted.addListener(
    function(details){
        console.log("onCompleted " + details.url);
    },
    {urls: ["<all_urls>"]});

The code should redirect .*google.* to http://www.caraveo.com/. I clear the cache and try to visit www.google.com(www.google.com is blocked, and need to be redirected.)

The result on firefox 52, win 7 is:
Firefox can’t establish a connection to the server at www.google.com , do not redirect.

The result on firefox 52, macOS is:
After 13619ms (as google is blocked, a long wait), and then redirect to http://www.caraveo.com/

Then I change the code "details.url.match(/google/)" to "details.url.match(/thisisnotahost/)", and visit https://thisisnotahost.com, I got :

The result on firefox 52, win7 is:
Firefox can’t establish a connection to the server at thisisnotahost.com, do not redirect.

The result on firefox 52, macOS is:
Firefox wait 20ms (may be waitting for dns lookup), and redirect to http://www.caraveo.com/
(In reply to Shane Caraveo (:mixedpuppy) from comment #21)
> URLRedirector doesn't seem to work for me.
> 
> I was able to reproduce this, I traced port 80 using tcpflow (redirecting
> http urls).  The request that would make it through is a cache validation
> request.  Once I cleared the firefox cache, I was unable to reproduce this. 
> Could one of you verify by clearing your cache and reproducing?
> 
> FWIW this is my test:
> 
> ==== str
> 
> - in terminal: tcpflow -p -c -i en0 port 80
> - temp load the addon
> - open tab, type caraveo.com press enter
> - tab redirects as expected
> - terminal shows requests for both domains
> 
> ==== manifest.json
> {
>   "description": "Redirect Test",
>   "manifest_version": 2,
>   "name": "Redirect Test",
>   "version": "1.0",
>   "permissions": [
>     "webRequest",
>     "webRequestBlocking",
>     "<all_urls>"
>   ],
>   "background": {
>     "scripts": ["background.js"]
>   }
> }
> 
> 
> ==== background.js
> browser.webRequest.onBeforeRequest.addListener(details => {
>   console.log(`onBeforeRequest ${details.url}`);
>   if (details.url.includes("caraveo.com")) {
>     let redirectUrl = details.url
>                              .replace("caraveo.com",
> "tinyshinyhouseonwheels.com");
>     return {redirectUrl};
>   }
>   return {};
> }, {urls: ["<all_urls>"]}, ["blocking"]);
> browser.webRequest.onErrorOccurred.addListener(details => {
>   console.log(`onErrorOccurred ${details.url}`);
> }, {urls: ["<all_urls>"]});
> browser.webRequest.onCompleted.addListener(details => {
>   console.log(`onCompleted ${details.url}`);
> }, {urls: ["<all_urls>"]});
> 
> ==== tcpflow output
> 
> tcpflow: listening on en0
> 192.168.001.010.54586-208.113.154.037.00080: GET / HTTP/1.1
> Host: www.caraveo.com
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:55.0)
> Gecko/20100101 Firefox/55.0
> Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> Accept-Language: en-US,en;q=0.5
> Accept-Encoding: gzip, deflate
> Connection: keep-alive
> Upgrade-Insecure-Requests: 1
> If-Modified-Since: Tue, 05 Sep 2006 07:26:42 GMT
> If-None-Match: "15d-41cafc7b19080"
>  
>  
> 208.113.154.037.00080-192.168.001.010.54586: HTTP/1.1 304 Not Modified
> Date: Thu, 16 Mar 2017 23:56:09 GMT
> Server: Apache
> Connection: Keep-Alive
> Keep-Alive: timeout=2, max=100
> ETag: "15d-41cafc7b19080"
> Vary: Accept-Encoding
>  
>  
> 192.168.001.010.54589-208.113.154.169.00080: GET / HTTP/1.1
> Host: tinyshinyhouseonwheels.com
> User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.11; rv:55.0)
> Gecko/20100101 Firefox/55.0
> Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> Accept-Language: en-US,en;q=0.5
> Accept-Encoding: gzip, deflate
> Connection: keep-alive
> Upgrade-Insecure-Requests: 1

Tcpflow didn't log the request of https://www.google.com and https://thisisnotahost.com , firefox 52 on win 7 64bit just say could not connect to server, and not redirect to http://www.caraveo.com/ (firefox 52 on macOS could redirect to that target, though need to wait ...)
Flags: needinfo?(fengyc.work)
> Tcpflow didn't log the request of https://www.google.com and
> https://thisisnotahost.com , firefox 52 on win 7 64bit just say could not
> connect to server, and not redirect to http://www.caraveo.com/ (firefox 52
> on macOS could redirect to that target, though need to wait ...)

That's because you're using https which is not on port 80.  The redirect can be tested with plain http requests.
Verified on all current branches, the initial request and response happens and it is *not* a cache request.

final str

- open firefox with fresh profile
- open newtab (I change to blank to avoid noise in tcpflow)
- in terminal: tcpflow -p -c -i en0 port 80
- temp load the addon
- open tab, type caraveo.com press enter
- tab redirects as expected
- terminal shows request/response for both domains
webextensions: --- → +
Priority: -- → P2
The problem came down to the expectation that http-on-modify-request would be handled synchronously (see BeginConnect in nsHttpChannel.cpp).  WebExtensions does not, and suspends the channel during that notification.
Summary: Weird behavior of webRequest url redirection → suspending channel during http-on-modify-request breaks redirect
webextensions: + → ---
Component: WebExtensions: Request Handling → Networking: HTTP
Product: Toolkit → Core
Comment on attachment 8848647 [details]
Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler,

Seems like it should work to me, but want to run it by Honza.
Attachment #8848647 - Flags: review?(honzab.moz)
Attachment #8848647 - Flags: feedback+
Assignee: nobody → mixedpuppy
Whiteboard: [necko-active]
Comment on attachment 8848647 [details]
Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler,

https://reviewboard.mozilla.org/r/121544/#review125876

I want decent explanation of the patch before I'm going to review it.  Please add it to a bugzilla comment.
Attachment #8848647 - Flags: review?(honzab.moz) → review-
If you suspend the channel during http-on-modify-request, the request continues to connect because nsHttpChannel::BeginConnect is not expecting the channel to be suspended during that notfication.  Web extensions now does exactly this.

http-on-modify-request is notified, and immediately after a request for redirect is checked.  This patch properly handles a suspended channel after the notification.

One change to this I think should be done is to move the call to SetLoadGroupUserAgentOverride to still happen just after CallOnModifyRequestObservers, since there is the possibility code could suspend the channel during SetLoadGroupUserAgentOverride.  In fact, probably need to scan the code for safe suspend handling for any of the notifications:

https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpHandler.h#277
Flags: needinfo?(honzab.moz)
(In reply to Shane Caraveo (:mixedpuppy) from comment #31)

Thanks!

> If you suspend the channel during http-on-modify-request, the request
> continues to connect because nsHttpChannel::BeginConnect is not expecting
> the channel to be suspended during that notfication.  Web extensions now
> does exactly this.
> 
> http-on-modify-request is notified, and immediately after a request for
> redirect is checked.  

Not sure I follow this line, but now I know what I should find in the patch, so it's OK.

> This patch properly handles a suspended channel after
> the notification.
> 
> One change to this I think should be done is to move the call to
> SetLoadGroupUserAgentOverride to still happen just after
> CallOnModifyRequestObservers

No.  It's there on purpose.  The callback may want to remove UA.  SetLoadGroupUserAgentOverride would readd.

> , since there is the possibility code could
> suspend the channel during SetLoadGroupUserAgentOverride.  

Where exactly?

> In fact, probably
> need to scan the code for safe suspend handling for any of the notifications:
> 
> https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/
> nsHttpHandler.h#277

I think we need to check suspend > 0 only before we start doing something (open cache entry, hit network)

If you feel the patch is up to r, please re-ask for r now.  Thanks.
Flags: needinfo?(honzab.moz)
(In reply to Honza Bambas (:mayhemer) from comment #32)
> (In reply to Shane Caraveo (:mixedpuppy) from comment #31)
> 
> Thanks!
> 
> > If you suspend the channel during http-on-modify-request, the request
> > continues to connect because nsHttpChannel::BeginConnect is not expecting
> > the channel to be suspended during that notfication.  Web extensions now
> > does exactly this.
> > 
> > http-on-modify-request is notified, and immediately after a request for
> > redirect is checked.  
> 
> Not sure I follow this line, but now I know what I should find in the patch,
> so it's OK.

Prior to this patch the code was essentially this:

    // notify "http-on-modify-request" observers
    CallOnModifyRequestObservers();
    SetLoadGroupUserAgentOverride();
    // Check to see if we should redirect this channel elsewhere by
    // nsIHttpChannel.redirectTo API request
    if (mAPIRedirectToURI) {
        return AsyncCall(&nsHttpChannel::HandleAsyncAPIRedirect);
    }


The problem is the redirect is checked synchronously after the synchronous notifications in CallOnModifyRequestObservers and SetLoadGroupUserAgentOverride.  If you suspend the channel during either of those calls [most importantly after CallOnModifyRequestObservers], we need to wait before checking the redirect.  My patch inserts that suspend handing right after CallOnModifyRequestObservers.  My suggestion below is to do the suspend check after SetLoadGroupUserAgentOverride instead.

> > One change to this I think should be done is to move the call to
> > SetLoadGroupUserAgentOverride to still happen just after
> > CallOnModifyRequestObservers
> 
> No.  It's there on purpose.  The callback may want to remove UA. 
> SetLoadGroupUserAgentOverride would readd.
> 
> > , since there is the possibility code could
> > suspend the channel during SetLoadGroupUserAgentOverride.  
> 
> Where exactly?

In the observer listening for http-on-useragent-request.

> > In fact, probably
> > need to scan the code for safe suspend handling for any of the notifications:
> > 
> > https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/
> > nsHttpHandler.h#277
> 
> I think we need to check suspend > 0 only before we start doing something
> (open cache entry, hit network)

We need to do it before using any possible change from an observer that would affect code paths (e.g. the redirect handling).  

> If you feel the patch is up to r, please re-ask for r now.  Thanks.

I'll look to see if I think any of the other notifications are a problem first, and update the patch if necessary.
(In reply to Shane Caraveo (:mixedpuppy) from comment #33)
> (In reply to Honza Bambas (:mayhemer) from comment #32)
> > (In reply to Shane Caraveo (:mixedpuppy) from comment #31)
> > 
> > Thanks!
> > 
> > > If you suspend the channel during http-on-modify-request, the request
> > > continues to connect because nsHttpChannel::BeginConnect is not expecting
> > > the channel to be suspended during that notfication.  Web extensions now
> > > does exactly this.
> > > 
> > > http-on-modify-request is notified, and immediately after a request for
> > > redirect is checked.  
> > 
> > Not sure I follow this line, but now I know what I should find in the patch,
> > so it's OK.
> 
> Prior to this patch the code was essentially this:
> 
>     // notify "http-on-modify-request" observers
>     CallOnModifyRequestObservers();
>     SetLoadGroupUserAgentOverride();
>     // Check to see if we should redirect this channel elsewhere by
>     // nsIHttpChannel.redirectTo API request
>     if (mAPIRedirectToURI) {
>         return AsyncCall(&nsHttpChannel::HandleAsyncAPIRedirect);
>     }

Ah!  I got confused with https://bugzilla.mozilla.org/attachment.cgi?id=8811342&action=diff
Hmm.. needs thinking first.

> 
> 
> The problem is the redirect is checked synchronously after the synchronous
> notifications in CallOnModifyRequestObservers and
> SetLoadGroupUserAgentOverride.  If you suspend the channel during either of
> those calls [most importantly after CallOnModifyRequestObservers], we need
> to wait before checking the redirect.  My patch inserts that suspend handing
> right after CallOnModifyRequestObservers.  My suggestion below is to do the
> suspend check after SetLoadGroupUserAgentOverride instead.
> 
> > > One change to this I think should be done is to move the call to
> > > SetLoadGroupUserAgentOverride to still happen just after
> > > CallOnModifyRequestObservers
> > 
> > No.  It's there on purpose.  The callback may want to remove UA. 
> > SetLoadGroupUserAgentOverride would readd.
> > 
> > > , since there is the possibility code could
> > > suspend the channel during SetLoadGroupUserAgentOverride.  
> > 
> > Where exactly?
> 
> In the observer listening for http-on-useragent-request.

Got it.

> 
> > > In fact, probably
> > > need to scan the code for safe suspend handling for any of the notifications:
> > > 
> > > https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/
> > > nsHttpHandler.h#277
> > 
> > I think we need to check suspend > 0 only before we start doing something
> > (open cache entry, hit network)
> 
> We need to do it before using any possible change from an observer that
> would affect code paths (e.g. the redirect handling).  
> 
> > If you feel the patch is up to r, please re-ask for r now.  Thanks.
> 
> I'll look to see if I think any of the other notifications are a problem
> first, and update the patch if necessary.

OK, thanks.
So, http-on-examine-cached-response and http-on-examine-merged-response are suspicious in that we're not checking if the channel is suspended after those notifications.  WebExtensions definitely can suspend the channel during those.  I'm not familiar enough with the possible ramifications at that point to know if a) it doesn't matter, or b) waiting to un-suspend would be a problem.

I've updated the patch from before, it at least fixes http-on-modify-request, we can follow up with another bug (or second patch here) if examine-cached-response or examine-merged-response need to be fixed also.
A note on the latest patch, the small part of the patch removed was actually done in bug 1325054 which handled a different async issue.
Short chat with Kris: webextensions webrequest doesn't need to worry about suspend after http-on-examine-cached-response and http-on-examine-merged-response notifications.  Somebody else may run into a problem with it in the future.
Comment on attachment 8848647 [details]
Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler,

https://reviewboard.mozilla.org/r/121544/#review126260

This looks good when I understand the goal.

r- for (not just these) two things:

- missing a try push
- missing an automated test (an xpcshell test under netwerk/test/unit)

::: commit-message-ff04d:1
(Diff revision 2)
> +Bug 1345893 fix handling of suspended channel during http-on-modify-request, r?mayhemer

Should be:

Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler

::: netwerk/protocol/http/nsHttpChannel.cpp:5970
(Diff revision 2)
>    return AsyncOpen(listener, nullptr);
>  }
>  
> -// BeginConnect() SHOULD NOT call AsyncAbort(). AsyncAbort will be called by
> +// AsyncAbort() will be called on failure. Only AsyncOpen and
> -// functions that called BeginConnect if needed. Only AsyncOpen and
>  // OnProxyAvailable ever call BeginConnect.

leave the original comment

::: netwerk/protocol/http/nsHttpChannel.cpp:6107
(Diff revision 2)
>  
>      SetLoadGroupUserAgentOverride();
>  
> +    if (mSuspendCount) {
> +        LOG(("Waiting until resume to do BeginConnectContinue [this=%p]\n", this));
> +        mCallOnResume = &nsHttpChannel::HandleBeginConnectContinue;

assert !mCallOnResume

::: netwerk/protocol/http/nsHttpChannel.cpp:6110
(Diff revision 2)
> +    if (mSuspendCount) {
> +        LOG(("Waiting until resume to do BeginConnectContinue [this=%p]\n", this));
> +        mCallOnResume = &nsHttpChannel::HandleBeginConnectContinue;
> +        return NS_OK;
> +    }
> +    HandleBeginConnectContinue();

you want to call BeginConnectContinue()

tests would catch this...

::: netwerk/protocol/http/nsHttpChannel.cpp:6126
(Diff revision 2)
> +        mCallOnResume = &nsHttpChannel::HandleBeginConnectContinue;
> +        return;
> +    }
> +
> +    LOG(("nsHttpChannel::HandleBeginConnectContinue [this=%p]\n", this));
> +    nsresult rv = BeginConnectContinue();

nit: please define nsresult rv as the first declaration in a method.
Attachment #8848647 - Flags: review?(honzab.moz) → review-
(In reply to (offline 29.3. - 3.4.) Honza Bambas (:mayhemer) from comment #39)
> Comment on attachment 8848647 [details]
> Bug 1345893 fix handling of suspended channel during http-on-modify-request,
> 
> https://reviewboard.mozilla.org/r/121544/#review126260
> 
> This looks good when I understand the goal.
> 
> r- for (not just these) two things:
> 
> - missing a try push

There is a try push on reviewboard.

> - missing an automated test (an xpcshell test under netwerk/test/unit)

I have to think about how this will work.  The only way I verified there was a real problem was by watching tcp traffic.  If you have suggestions let me know.

> > -// BeginConnect() SHOULD NOT call AsyncAbort(). AsyncAbort will be called by
> > +// AsyncAbort() will be called on failure. Only AsyncOpen and
> > -// functions that called BeginConnect if needed. Only AsyncOpen and
> >  // OnProxyAvailable ever call BeginConnect.
> 
> leave the original comment
> 
> ::: netwerk/protocol/http/nsHttpChannel.cpp:6110
> (Diff revision 2)
> > +    if (mSuspendCount) {
> > +        LOG(("Waiting until resume to do BeginConnectContinue [this=%p]\n", this));
> > +        mCallOnResume = &nsHttpChannel::HandleBeginConnectContinue;
> > +        return NS_OK;
> > +    }
> > +    HandleBeginConnectContinue();
> 
> you want to call BeginConnectContinue()

I thought I had removed the error handling from AsyncOpen and OnProxyAvailable, that was the intention.  It didn't make sense to me to continue adding more of the same error handling.  I should remove the suspend check just above the call if we kept the error handling the way I did it.

> tests would catch this...

Not clear about what you mean here.
see comment 40
Flags: needinfo?(honzab.moz)
Blocks: 1351165
Priority: P2 → P1
Mass wontfix for bugs affecting firefox 52.
So, I went though the code more carefully and the patch seems to be OK, modulo the comments I've already added.

I think you need the same code be added to nsHttpChannel::DoAuthRetry too.

Other important thing is to return immediately mStatus when mCanceled after the callbacks notification.  This should already be there (is an existing bug: If someone sets the redirectTo URI and also cancels the channel, we should behave in a "canceled" way.)

Regarding a test, we have tests that just need to enhance.  Please _hg cp_ them to new tests and:

Test at [1]: should be enhanced to a test that doesn't call cancel(ERROR) on the channel prior to its resume ; should actually load some content (see some of the other tests in the same dir for an example of an http request handling, search "registerPathHandler")

Test at [2]: should just suspend the channel and later resume it while the expected redirected content should still be loaded


[1] https://dxr.mozilla.org/mozilla-central/source/netwerk/test/unit/test_suspend_channel_before_connect.js
[2] https://dxr.mozilla.org/mozilla-central/source/netwerk/test/unit/test_redirect_from_script.js
Flags: needinfo?(honzab.moz)
I've pushed a new patch with a test that fails as I expect without the fixes and works with them.

Fixing nsHttpChannel::DoAuthRetry is much more involved, it will require some significant restructuring of nsHttpChannel::OnStopRequest (where DoAuthRetry is used).  I'd like to see that as a followup rather than in this patch, since I feel the common case is handled by this patch.
Flags: needinfo?(honzab.moz)
(In reply to Shane Caraveo (:mixedpuppy) from comment #46)
> I've pushed a new patch with a test that fails as I expect without the fixes
> and works with them.

Cool, thanks

> 
> Fixing nsHttpChannel::DoAuthRetry is much more involved, it will require
> some significant restructuring of nsHttpChannel::OnStopRequest (where
> DoAuthRetry is used).  I'd like to see that as a followup rather than in
> this patch, since I feel the common case is handled by this patch.

Yes please, if that's too complicated, let's leave it for a followup or even wontfix it there (preferred).  Note that the transaction pump is suspended when the channel is suspended.  But IIUC, this will start the network request anyway.
Flags: needinfo?(honzab.moz)
Comment on attachment 8848647 [details]
Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler,

https://reviewboard.mozilla.org/r/121544/#review130582

::: netwerk/protocol/http/nsHttpChannel.cpp
(Diff revision 3)
>    }
>    return AsyncOpen(listener, nullptr);
>  }
>  
> -// BeginConnect() SHOULD NOT call AsyncAbort(). AsyncAbort will be called by
> +// AsyncAbort() will be called on failure. Only AsyncOpen and
> -// functions that called BeginConnect if needed. Only AsyncOpen and

keep the original comment (see below)

::: netwerk/protocol/http/nsHttpChannel.cpp:6107
(Diff revision 3)
>      CallOnModifyRequestObservers();
>  
>      SetLoadGroupUserAgentOverride();
>  
> +    HandleBeginConnectContinue();
> +    return NS_OK;

no, the code has to look like this (now I understand the comment change and agree with it even less):


    SetLoadGroupUserAgentOverride();

    // Check if request was cancelled from on-modify-request or on-useragent...blabla
    if (mCanceled) {
        return mStatus;
    }

    if (mSuspendCount) {
        MOZ_ASSERT(!mCallOnResume);
        LOG(("Waiting until resume BeginConnect [this=%p]\n", this));
        mCallOnResume = &nsHttpChannel::HandleBeginConnectContinue;
        return NS_OK;
    }

    return BeginConnectContinue();
}

::: netwerk/test/unit/test_suspend_channel_before_redirect.js:1
(Diff revision 3)
>  

please add a comment with a list of steps this test performs and what it is actually testing.

::: netwerk/test/unit/test_suspend_channel_before_redirect.js:20
(Diff revision 3)
> +var redirectedUrl;
>  
> -// A server that waits for a connect. If a channel is suspended it should not
> -// try to connect to the server until it is is resumed or not try at all if it
> +// A server that waits for a connect. If a channel is redirected this server
> +// should never receive a connection.
> -// is cancelled as in this test.
>  function TestServer() {

there is no need for this test server socket if you are using the http server (HttpServer()), please remove the raw server

::: netwerk/test/unit/test_suspend_channel_before_redirect.js:68
(Diff revision 3)
> -      // happen until we resume the channel.
> +      Promise.resolve().then(() => {
> -      let timer = Cc["@mozilla.org/timer;1"].createInstance(Ci.nsITimer);
> -      timer.initWithCallback(() => {
> -        chan.cancel(Cr.NS_BINDING_ABORTED);
>          chan.resume();
> -      }, 1000, Ci.nsITimer.TYPE_ONE_SHOT);
> +        chan.redirectTo(ios.newURI(redirectedUrl));

I wanted test_redirect_from_script.js be the one to do the redirect test, not test_suspend_channel_before_connect.js

::: netwerk/test/unit/test_suspend_channel_before_redirect.js:74
(Diff revision 3)
> +      });
>      }
>    }
>  };
>  
>  var listener = {

please use ChannelListener (in head_channels.js - automatically included for your convenience) instead

::: netwerk/test/unit/xpcshell.ini:368
(Diff revision 3)
>  # The local cert service used by this test is not currently shipped on Android
>  skip-if = os == "android"
>  [test_1073747.js]
>  [test_safeoutputstream_append.js]
>  [test_suspend_channel_before_connect.js]
> +[test_suspend_channel_before_redirect.js]

I wanted two separate test cases in comment 44, looks like you've merged to only one which tests only half of the functionality here.  Please create both or state why you believe this is sufficient.  thanks
Attachment #8848647 - Flags: review?(honzab.moz) → review-
Comment on attachment 8848647 [details]
Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler,

https://reviewboard.mozilla.org/r/121544/#review130582

> no, the code has to look like this (now I understand the comment change and agree with it even less):
> 
> 
>     SetLoadGroupUserAgentOverride();
> 
>     // Check if request was cancelled from on-modify-request or on-useragent...blabla
>     if (mCanceled) {
>         return mStatus;
>     }
> 
>     if (mSuspendCount) {
>         MOZ_ASSERT(!mCallOnResume);
>         LOG(("Waiting until resume BeginConnect [this=%p]\n", this));
>         mCallOnResume = &nsHttpChannel::HandleBeginConnectContinue;
>         return NS_OK;
>     }
> 
>     return BeginConnectContinue();
> }

Ok, I see where you're going now as well, I'll adjust the patch.  However, cancel there is still expected to work synchronously, if the channel is suspended and cancelled, code path will still go through HandleBeginConnectContinue.  Placing the cancel check at the top of BeginConnectContinue handles both.

> there is no need for this test server socket if you are using the http server (HttpServer()), please remove the raw server

Actually, the test server socket is great for verifying that the suspend handling is working correctly.  If it is not, you get a connection to that test server.

Without fixes in httpchannel, *at some point* the redirect does finally happen, so I'm uncertain that the original channel reqeust would always finish the request, whereas it does always make the initial socket connection.

> I wanted test_redirect_from_script.js be the one to do the redirect test, not test_suspend_channel_before_connect.js

For testing the specific failure with suspend/redirect test_suspend_channel_before_connect.js was a better match.

> I wanted two separate test cases in comment 44, looks like you've merged to only one which tests only half of the functionality here.  Please create both or state why you believe this is sufficient.  thanks

I was about to justify it but reread the comment and realize I misunderstood something there before.  You basically want a redirect test and a cancel test that both ensure there is no connection in this situation.
(In reply to Shane Caraveo (:mixedpuppy) from comment #49)
> However,
> cancel there is still expected to work synchronously, if the channel is
> suspended and cancelled, code path will still go through
> HandleBeginConnectContinue.  Placing the cancel check at the top of
> BeginConnectContinue handles both.

Putting it ALSO to that place does it, yes, I agree.

> 
> > there is no need for this test server socket if you are using the http server (HttpServer()), please remove the raw server
> 
> Actually, the test server socket is great for verifying that the suspend
> handling is working correctly.  If it is not, you get a connection to that
> test server.

Use the http server only, there is a way.  You can register more than one path handler.  This is testing the redirectTo api, right?  So, you open a channel to /content1 URL which gives 200 and "content #1" body.  But you want to redirect it to /content2 URL via the redirectTo api.  Hence, have two handlers registered on the http server, one for /content1 path and other for /content2 with "content #2" body.  When the test does what you want (redirects) you should get "content #2" response.  There are plenty of tests in this directory you can look for examples.

> 
> > I wanted test_redirect_from_script.js be the one to do the redirect test, not test_suspend_channel_before_connect.js
> 
> For testing the specific failure with suspend/redirect
> test_suspend_channel_before_connect.js was a better match.

Maybe it is a better match for both tests I wanted, I just wanted to give you a base test where you will have to do the least amount of change.  Up to you how you proceed, tho.

> 
> > I wanted two separate test cases in comment 44, looks like you've merged to only one which tests only half of the functionality here.  Please create both or state why you believe this is sufficient.  thanks
> 
> I was about to justify it but reread the comment and realize I misunderstood
> something there before.  You basically want a redirect test and a cancel
> test that both ensure there is no connection in this situation.

There are actually 2x2 cases when a channel is suspended from on-modify-request, we should test: 
- redirected and canceled (expected to not get any content and see the failure in onstoprequest)
- not-redirected and canceled (same result)
- redirected and not-canceled (expect to load the target content)
- not-redirected and not-canceled (expect to get content for the URL)

And it's also possible to cancel the channel in the time between on-modify-request and resume.  Which adds 4 more tests, I believe.

To build tests for all of them is simple when you have one good boilerplate.  It would be good to keep all of them as separate files to allow parallelism and keep the tests readable.

Thanks!
(In reply to Honza Bambas (:mayhemer) from comment #50)
> (In reply to Shane Caraveo (:mixedpuppy) from comment #49)
> > However,
> > cancel there is still expected to work synchronously, if the channel is
> > suspended and cancelled, code path will still go through
> > HandleBeginConnectContinue.  Placing the cancel check at the top of
> > BeginConnectContinue handles both.
> 
> Putting it ALSO to that place does it, yes, I agree.

Are you saying to do it in both places?  If so, I don't understand the necessity for doing anything beyond a check at the top of BeginConnectContinue.

> > > there is no need for this test server socket if you are using the http server (HttpServer()), please remove the raw server
> > 
> > Actually, the test server socket is great for verifying that the suspend
> > handling is working correctly.  If it is not, you get a connection to that
> > test server.
> 
> Use the http server only, there is a way. You can register more than one
> path handler. 

What I am trying to say is, I think (not entirely sure) that there may be a way to end up only with a connect, but not finishing the request.  From that perspective, I'm unsure the http server response handler would get called.  I'll have to look and try it.  I already knew it could have multiple handlers.

> And it's also possible to cancel the channel in the time between
> on-modify-request and resume.  Which adds 4 more tests, I believe.

The patch checks for redirects and cancel after resume, more tests shouldn't be necessary.
(In reply to Shane Caraveo (:mixedpuppy) from comment #51)
> (In reply to Honza Bambas (:mayhemer) from comment #50)
> > (In reply to Shane Caraveo (:mixedpuppy) from comment #49)
> > > However,
> > > cancel there is still expected to work synchronously, if the channel is
> > > suspended and cancelled, code path will still go through
> > > HandleBeginConnectContinue.  Placing the cancel check at the top of
> > > BeginConnectContinue handles both.
> > 
> > Putting it ALSO to that place does it, yes, I agree.
> 
> Are you saying to do it in both places?  If so, I don't understand the
> necessity for doing anything beyond a check at the top of
> BeginConnectContinue.

The channel can be canceled from on-modify etc or during the period it's suspended.  We should therefor check on both places: right after the callbacks notification and right after the resume.  If the code is duplicated for common paths I think it's OK.  Or am I missing something?

> 
> > > > there is no need for this test server socket if you are using the http server (HttpServer()), please remove the raw server
> > > 
> > > Actually, the test server socket is great for verifying that the suspend
> > > handling is working correctly.  If it is not, you get a connection to that
> > > test server.
> > 
> > Use the http server only, there is a way. You can register more than one
> > path handler. 
> 
> What I am trying to say is, I think (not entirely sure) that there may be a
> way to end up only with a connect, but not finishing the request.  

I didn't take a deep look at the test, but up to you.  If you believe the raw server is needed, leave it then.  Sorry for any excessive churn about it.

> From that
> perspective, I'm unsure the http server response handler would get called. 
> I'll have to look and try it.  I already knew it could have multiple
> handlers.

OK.

> 
> > And it's also possible to cancel the channel in the time between
> > on-modify-request and resume.  Which adds 4 more tests, I believe.
> 
> The patch checks for redirects and cancel after resume, more tests shouldn't
> be necessary.

I think it is.  I really would like to see tests also for the case we cancel during the suspend period.

Or outline what all code paths and cases there can be yourself here.  Maybe I'm actually exaggerating.

Thanks.
> There are actually 2x2 cases when a channel is suspended from
> on-modify-request, we should test: 
> - redirected and canceled (expected to not get any content and see the
> failure in onstoprequest)
> - not-redirected and canceled (same result)
> - redirected and not-canceled (expect to load the target content)
> - not-redirected and not-canceled (expect to get content for the URL)

That last case is the typical case (a successful un-modified request), wouldn't just about every http request test in m-c fail if that were broken?

> And it's also possible to cancel the channel in the time between
> on-modify-request and resume.  Which adds 4 more tests, I believe.

This patch is all about redirect (and now cancel) in the time between on-modify-request and resume.  I would expect that the synchronous paths are already well covered by tests, and that those would fail if that code path was broken.
(In reply to Shane Caraveo (:mixedpuppy) from comment #53)
> > There are actually 2x2 cases when a channel is suspended from
> > on-modify-request, we should test: 
> > - redirected and canceled (expected to not get any content and see the
> > failure in onstoprequest)
> > - not-redirected and canceled (same result)
> > - redirected and not-canceled (expect to load the target content)
> > - not-redirected and not-canceled (expect to get content for the URL)
> 
> That last case is the typical case (a successful un-modified request),
> wouldn't just about every http request test in m-c fail if that were broken?

Yes, you are right, don't add it.

> 
> > And it's also possible to cancel the channel in the time between
> > on-modify-request and resume.  Which adds 4 more tests, I believe.
> 
> This patch is all about redirect (and now cancel) in the time between
> on-modify-request and resume.  

Between?  I thought that you needed to suspend the channel from inside on-modify-request, but anyway.

What I want to make sure is that we don't break expectations when a channel is mainly canceled during the period.  On resume, it must notify the listener (AsyncAbort does that) and stop it's work.  We must have a test for at least this and I'm not aware of any existing one - maybe there already is?

There should also be a test to check that if the channel is redirected (either from on-modify-request or during the suspend period) we don't perform that redirect when the channel is _also_ canceled (either from on-modify-request or during the suspend period.)  That's IMO a second test that that should be added.

I understand that writing tests is more work but it saves future work - a lot!  Hence, at least these two (if not 4, individually for canceling from on-modify-request and during the suspend period) should be added.

If you know about tests that check these paths, just refer them here.

> I would expect that the synchronous 

synchronous?  You mean for the non-suspended case?  Yes, definitely.  But that is not my concern.

> paths are
> already well covered by tests, and that those would fail if that code path
> was broken.
Comment on attachment 8848647 [details]
Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler,

https://reviewboard.mozilla.org/r/121544/#review131356

hg cp makes sense when the number of changes in the copied file is small, but it seems that there is a big number of changes now in the test - you almost rewrote it completely.  it then should be a new file.

r- for the test which doesn't test much.  I commented out both |return mStatus;| statements and the test happily passed.

::: netwerk/protocol/http/nsHttpChannel.cpp:6114
(Diff revision 4)
> +        return mStatus;
> +    }
> +
> +    if (mSuspendCount) {
> +        LOG(("Waiting until resume BeginConnect [this=%p]\n", this));
> +        mCallOnResume = &nsHttpChannel::HandleBeginConnectContinue;

assert !mCallOnResume

::: netwerk/test/unit/test_suspend_channel_async_on_modified.js:45
(Diff revision 4)
> -var requestListenerObserver = {
> -
> -  QueryInterface: function queryinterface(iid) {
> -    if (iid.equals(Ci.nsISupports) ||
> -        iid.equals(Ci.nsIObserver))
> -      return this;
> +function startChannelRequest(baseUrl, flags) {
> +  var chan = NetUtil.newChannel({
> +    uri: baseUrl,
> +    loadUsingSystemPrincipal: true
> +  });
> +  chan.asyncOpen2(new ChannelListener(() => do_execute_soon(run_next_test), null, flags));

you want to check the result content, right?

::: netwerk/test/unit/test_suspend_channel_async_on_modified.js:63
(Diff revision 4)
> -      var chan = subject.QueryInterface(Ci.nsIHttpChannel);
> +  onModifyListener(chan => {
> +    // Suspend the channel then yield to make this async.
> -      chan.suspend();
> +    chan.suspend();
> -      var obs = Cc["@mozilla.org/observer-service;1"].getService();
> -      obs = obs.QueryInterface(Ci.nsIObserverService);
> -      obs.removeObserver(this, "http-on-modify-request");
> +    Promise.resolve().then(() => {
> +      chan.resume();
> +      chan.redirectTo(ios.newURI(redirectedUrl));

redirect before you resume

::: netwerk/test/unit/test_suspend_channel_async_on_modified.js:89
(Diff revision 4)
> +    // Suspend the channel then yield to make this async.
> +    chan.suspend();
> +    Promise.resolve().then(() => {
> -        chan.resume();
> +      chan.resume();
> -      }, 1000, Ci.nsITimer.TYPE_ONE_SHOT);
> -    }
> +      chan.cancel(Cr.NS_BINDING_ABORTED);
> +      chan.redirectTo(ios.newURI(redirectedUrl));

cancel and redirect before you resume.

::: netwerk/test/unit/test_suspend_channel_async_on_modified.js:110
(Diff revision 4)
>  
> -  onStopRequest: function test_onStopR(request, ctx, status) {
> -    do_execute_soon(run_next_test);
> -  }
> -};
> +  onModifyListener(chan => {
> +    chan.suspend();
> +    Promise.resolve().then(() => {
> +      chan.resume();
> +      chan.cancel(Cr.NS_BINDING_ABORTED);

cancel before you resume

::: netwerk/test/unit/test_suspend_channel_async_on_modified.js:135
(Diff revision 4)
> -  obs.addObserver(requestListenerObserver, "http-on-modify-request", false);
> -  var chan = NetUtil.newChannel({
> -    uri:"http://localhost:" + serv.port,
> -    loadUsingSystemPrincipal: true
> +  onModifyListener(chan => {
> +    chan.suspend();
> +    chan.cancel(Cr.NS_BINDING_ABORTED);
> +    Promise.resolve().then(() => {
> +      chan.resume();
> +      chan.redirectTo(ios.newURI(redirectedUrl));

redirect before you resume
Attachment #8848647 - Flags: review?(honzab.moz) → review-
(In reply to Honza Bambas (:mayhemer) from comment #55)
> (In reply to Shane Caraveo (:mixedpuppy) from comment #53)

> > > And it's also possible to cancel the channel in the time between
> > > on-modify-request and resume.  Which adds 4 more tests, I believe.
> > 
> > This patch is all about redirect (and now cancel) in the time between
> > on-modify-request and resume.  
> 
> Between?  I thought that you needed to suspend the channel from inside
> on-modify-request, but anyway.

In order to have an asynchronous code path in this situation, you must suspend the channel during http-on-modified and call cancel or redirect after the observer call is finished (thus between on-modify and resume).  That is why my tests use promise.then to force the cancel/redirect/resume into an async call.

If you call cancel/redirect *during* on-modify, those can be handled immediately after the call to CallOnModifyRequestObservers regardless if the channel was suspended, thus the results would be available synchronously.  The last test in the patch covers calling cancel synchronously, and it will hit the code path that checks for cancel before suspend is checked.

The way I read your request in comment 50 is that you are asking for tests that exercise the synchronous behavior by calling cancel/redirect during on-modify, after which you say that there should be more tests to exercise the async behavior by calling cancel/redirect between on-modify and resume.  I'm saying that the synchronous behavior should already be covered by existing tests (if not, they should have been).
(In reply to Honza Bambas (:mayhemer) from comment #56)
> Comment on attachment 8848647 [details]
> Bug 1345893 - Handle Suspend() called on an HTTP channel from
> http-on-modify-request handler,
> 
> https://reviewboard.mozilla.org/r/121544/#review131356
> 
> hg cp makes sense when the number of changes in the copied file is small,
> but it seems that there is a big number of changes now in the test - you
> almost rewrote it completely.  it then should be a new file.
> 
> r- for the test which doesn't test much.  I commented out both |return
> mStatus;| statements and the test happily passed.

As they do without the patch, I'm not sure why.  The test that does cancel and redirect seems like it should fail without the patch but it passes.  The original test I wrote I tested pre-patch specifically to make sure I covered the failure I'm fixing.  It still fails without the patch.  All the other tests pass either way.

> ::: netwerk/test/unit/test_suspend_channel_async_on_modified.js:45
> (Diff revision 4)
> > -var requestListenerObserver = {
> > -
> > -  QueryInterface: function queryinterface(iid) {
> > -    if (iid.equals(Ci.nsISupports) ||
> > -        iid.equals(Ci.nsIObserver))
> > -      return this;
> > +function startChannelRequest(baseUrl, flags) {
> > +  var chan = NetUtil.newChannel({
> > +    uri: baseUrl,
> > +    loadUsingSystemPrincipal: true
> > +  });
> > +  chan.asyncOpen2(new ChannelListener(() => do_execute_soon(run_next_test), null, flags));
> 
> you want to check the result content, right?

ChannelListener is doing some basic validity checks on the response (e.g. content length checks), otherwise I don't feel like it's necessary to verify the content itself, the important part is making sure the original request doesn't continue when you redirect.
(In reply to Shane Caraveo (:mixedpuppy) from comment #57)
> (In reply to Honza Bambas (:mayhemer) from comment #55)
> > (In reply to Shane Caraveo (:mixedpuppy) from comment #53)
> 
> > > > And it's also possible to cancel the channel in the time between
> > > > on-modify-request and resume.  Which adds 4 more tests, I believe.
> > > 
> > > This patch is all about redirect (and now cancel) in the time between
> > > on-modify-request and resume.  
> > 
> > Between?  I thought that you needed to suspend the channel from inside
> > on-modify-request, but anyway.
> 
> In order to have an asynchronous code path in this situation, you must
> suspend the channel during http-on-modified and call cancel or redirect
> after the observer call is finished (thus between on-modify and resume). 
> That is why my tests use promise.then to force the cancel/redirect/resume
> into an async call.
> 
> If you call cancel/redirect *during* on-modify, those can be handled
> immediately after the call to CallOnModifyRequestObservers regardless if the
> channel was suspended, thus the results would be available synchronously. 
> The last test in the patch covers calling cancel synchronously, and it will
> hit the code path that checks for cancel before suspend is checked.
> 
> The way I read your request in comment 50 is that you are asking for tests
> that exercise the synchronous behavior by calling cancel/redirect during
> on-modify, after which you say that there should be more tests to exercise
> the async behavior by calling cancel/redirect between on-modify and resume. 
> I'm saying that the synchronous behavior should already be covered by
> existing tests (if not, they should have been).

OK, we could get lost among too many comments probably.  I'll try to summarize more schematically.

You can do whichever combination of three calls (let's mark them as [123], see below) on the channel from on-modify-request or on-useragent that are all affected by this patch:

1 - suspend()
2 - cancel(some-error)
3 - redirectTo(target-URI)

Trivially, all tests obviously check for [] (none of the calls.)  We have tests for [2], [3] (cancel from the callback, or redirect from the callback).  

We definitely don't have tests of [1], [12], [13] - that's your new code.  These you definitely have to add.  

Ideally we should also have a test for [123] as well to check we don't redirect when also canceled.

Then, you need tests for [1] and later, between on-modify/on-useragent and resume(), [2], [3], [23].  These should be added as well.

To sum, I can 7 individual tests.  You have a very good boiler plate now, the test from the last patch just needs few adjustments to do all of this, so we are very close already.



Now we should state how the channel should exactly behave.  Sorry if I may have changed some details during our discussion here about this, I got lost in this myself (sorry, I'm really busy with number of other stuff these days so I can easily get distracted):

- When on-modify or on-useragent callback suspends the channel and also redirects it via redirectTo(), then the channel MUST NOT redirect until it's resumed.  This is your primary goal, as I understand it.

- When on-modify or on-useragent callback suspends the channel and also cancels it, then the channel MUST NOT do anything (return a failure from AsyncOpen, if it's on the stack, or call on the listener's OnStartRequest/OnStopRequest via AsyncAbort) until the channel is resumed again.

- And for clarity, when on-modify or on-useragent callback suspends the channel and that channel is later either canceled or redirected in the period between suspension and resuming, the channel again MUST NOT do anything until resumed.

- And of course, cancellation takes priority over redirection



(In reply to Shane Caraveo (:mixedpuppy) from comment #58)
> (In reply to Honza Bambas (:mayhemer) from comment #56)
> > Comment on attachment 8848647 [details]
> > Bug 1345893 - Handle Suspend() called on an HTTP channel from
> > http-on-modify-request handler,
> > 
> > https://reviewboard.mozilla.org/r/121544/#review131356
> > 
> > hg cp makes sense when the number of changes in the copied file is small,
> > but it seems that there is a big number of changes now in the test - you
> > almost rewrote it completely.  it then should be a new file.
> > 
> > r- for the test which doesn't test much.  I commented out both |return
> > mStatus;| statements and the test happily passed.
> 
> As they do without the patch, I'm not sure why.  The test that does cancel
> and redirect seems like it should fail without the patch but it passes.  The
> original test I wrote I tested pre-patch specifically to make sure I covered
> the failure I'm fixing.  It still fails without the patch.  All the other
> tests pass either way.

Apparently you are missing something in the test then.  More below.

> 
> > ::: netwerk/test/unit/test_suspend_channel_async_on_modified.js:45
> > (Diff revision 4)
> > > -var requestListenerObserver = {
> > > -
> > > -  QueryInterface: function queryinterface(iid) {
> > > -    if (iid.equals(Ci.nsISupports) ||
> > > -        iid.equals(Ci.nsIObserver))
> > > -      return this;
> > > +function startChannelRequest(baseUrl, flags) {
> > > +  var chan = NetUtil.newChannel({
> > > +    uri: baseUrl,
> > > +    loadUsingSystemPrincipal: true
> > > +  });
> > > +  chan.asyncOpen2(new ChannelListener(() => do_execute_soon(run_next_test), null, flags));
> > 
> > you want to check the result content, right?
> 
> ChannelListener is doing some basic validity checks on the response (e.g.
> content length checks), otherwise I don't feel like it's necessary to verify
> the content itself, 

Oh really?  No, it's the most important thing you have to do:

- if you get content of the URL you've opened the channel for you know that the channel has loaded that URL (has not been canceled and has not been redirected)
- if you get content of the URL you redirectTo() the channel you know the channel has successfully redirected (and has not been canceled)
- if you get an error from the channel (in OnStopRequest) you know you successfully canceled the channel ; to do this check pass CL_EXPECT_FAILURE flag to the channel listener when you create it

> the important part is making sure the original request
> doesn't continue when you redirect.

No, sorry, it's not enough.

Hopefully everything is clear now.
(In reply to Honza Bambas (:mayhemer) from comment #59)

> You can do whichever combination of three calls (let's mark them as [123],
> see below) on the channel from on-modify-request or on-useragent that are
> all affected by this patch:
> 
> 1 - suspend()
> 2 - cancel(some-error)
> 3 - redirectTo(target-URI)

Much of that was redundant, but fine, I added more tests.  Any test canceling a channel will still pass regardless due to what I describe later in this response.

> Now we should state how the channel should exactly behave.  Sorry if I may
> have changed some details during our discussion here about this, I got lost
> in this myself (sorry, I'm really busy with number of other stuff these days
> so I can easily get distracted):
> 
> - When on-modify or on-useragent callback suspends the channel and also
> redirects it via redirectTo(), then the channel MUST NOT redirect until it's
> resumed.  This is your primary goal, as I understand it.

That already works fine and is not the goal.

The STR for this bug is:

1. make request to http://foo/
2. suspend channel in on-modify
3. after on-modify, redirect to http://foo/bar then resume channel 

expected results:

- a request happens after resume to http://foo/bar

actual results:

- a request happens to http://foo/     <-- WRONG, should not happen at all
- a second request happens after resume to http://foo/bar

The goal is to prevent that first request because it should not happen.  That is fixed by handling suspend after on-modify.  The test testAsyncRedirect in the test file covers this issue, it fails appropriately pre-patch and succeeds correctly post-patch.


> > > r- for the test which doesn't test much.  I commented out both |return
> > > mStatus;| statements and the test happily passed.
> > 
> > As they do without the patch, I'm not sure why.  The test that does cancel
> > and redirect seems like it should fail without the patch but it passes.  The
> > original test I wrote I tested pre-patch specifically to make sure I covered
> > the failure I'm fixing.  It still fails without the patch.  All the other
> > tests pass either way.
> 
> Apparently you are missing something in the test then.  More below.

I traced every test, and the current httpchannel code handles every situation where the channel is canceled and I see behavior I expect (no request is made).  However, canceled channels are handled much later than necessary allowing unnecessary code execution.  The early checks we're adding simply prevent that unnecessary code execution.  This is why the tests pass pre-patch as well as post-patch.

As far as I can tell by examining httpchannel, there are no other hooks between the early-exit checks we're adding and the current checks that handle channel cancellation.  So I have no way to test the difference, I can only test that we didn't break existing behavior.

Here are some locations that cancel is handled pre-patch allowing these tests to pass without the added cancel checks.

testAsyncCancelRedirect:
tests cancel + redirect after on-modify
is canceled here:
https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpChannel.cpp#6393

testSyncCancelRedirect:
tests cancel in on-modify, redirect after on-modify
is canceled here:
https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpChannel.cpp#6197

testSyncCancelRedirect2:
tests redirect + cancel IN on-modify, channel is not suspended
canceled here:
https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/nsHttpChannel.cpp#2688

> > > you want to check the result content, right?
> > 
> > ChannelListener is doing some basic validity checks on the response (e.g.
> > content length checks), otherwise I don't feel like it's necessary to verify
> > the content itself, 
> 
> Oh really?  No, it's the most important thing you have to do:

I'll add the content check, but it is not necessary because the path handlers issue a fail or success.  We expect a response in only two tests.  You cannot get the response for an incorrect request and have the test pass.

For all the other tests where we cancel the channel: I already pass CL_EXPECT_FAILURE in addition to having a request handler that makes the test fail if the request is received.  I added another check on response as well.
Comment on attachment 8848647 [details]
Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler,

https://reviewboard.mozilla.org/r/121544/#review132962

thanks!  r+ with few very small details tweaked.

::: netwerk/protocol/http/nsHttpChannel.cpp:6110
(Diff revision 5)
>      SetLoadGroupUserAgentOverride();
>  
> +    // Check if request was cancelled during on-modify-request or on-useragent.
> +    if (mCanceled) {
> +        return mStatus;
> +    }

so, this doesn't conform to what I've outlined in comment 59, but let's go with it.  it's similar to what we've been doing before, so it's better from the regression point of view.

::: netwerk/test/unit/test_suspend_channel_on_modified.js:21
(Diff revision 5)
> +// redirected or cancelled.
> +var baseUrl;
> +
> +function failResponseHandler(metadata, response)
> +{
> +  var text = "Ooops!";

s/Ooops!/failed response/

::: netwerk/test/unit/test_suspend_channel_on_modified.js:29
(Diff revision 5)
> +  do_check_true(false, "Received request when we shouldn't.");
> +}
> +
> +function successResponseHandler(metadata, response)
> +{
> +  var text = "Hello!";

s/Hallo!/success response/

::: netwerk/test/unit/test_suspend_channel_on_modified.js:52
(Diff revision 5)
> +  var chan = NetUtil.newChannel({
> +    uri: baseUrl,
> +    loadUsingSystemPrincipal: true
> +  });
> +  chan.QueryInterface(Ci.nsIHttpChannel);
> +  chan.requestMethod = "GET";

I don't think you need to do these, it's the default.

::: netwerk/test/unit/test_suspend_channel_on_modified.js:75
(Diff revision 5)
> +
> +add_test(function testSimpleCancel() {
> +  onModifyListener(chan => {
> +    chan.cancel(Cr.NS_BINDING_ABORTED);
> +  });
> +  startChannelRequest(baseUrl, CL_EXPECT_FAILURE);

Just a note: for this case (and few others) asyncOpen2 won't throw only because the proxy preferences are by default set to resolve a proxy, which is an asynchronous operation (at least on windows).

::: netwerk/test/unit/test_suspend_channel_on_modified.js:170
(Diff revision 5)
> +  httpServer.registerPathHandler("/", failResponseHandler);
> +  httpServer.registerPathHandler("/fail", failResponseHandler);
> +  httpServer.registerPathHandler("/success", successResponseHandler);
> +  httpServer.start(-1);
> +
> +  baseUrl = `http://localhost:${httpServer.identity.primaryPort}/`;

nit: could you define baseUrl w/o the trailing slash?  the usage would then be more readable (more resembling a URL)
Attachment #8848647 - Flags: review?(honzab.moz) → review+
Pushed by mixedpuppy@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/2b66ecba042d
Handle Suspend() called on an HTTP channel from http-on-modify-request handler, r=mayhemer
https://hg.mozilla.org/mozilla-central/rev/2b66ecba042d
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Can we request an Aurora uplift (or a Beta uplift if it missed next merging) for this bug? 

----------------------------
[User impact if declined]
Redirecting extensions using WebRequest would keep broken for six more weeks:

https://addons.mozilla.org/zh-CN/firefox/addon/redirector/
The redirecting speed would be much slower than it supposed to be, or it would not work at all (if the origin site is down) as the origin request was not cancelled.

https://addons.mozilla.org/en-US/firefox/addon/smart-https-revived/
This kind of security add-ons would be invalid as the HTTP request would still be sent.

[Describe test coverage new/current, TBPL]
Automated test + landed on central and got confirmed from the reporter (I reported a duplicated bug on the same day...)

[Risks and why]
I am not confident enough to talk about the risks here. Maybe the bug owner can have a heads-up on this?
Flags: needinfo?(mixedpuppy)
Comment on attachment 8848647 [details]
Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler,

Approval Request Comment
[Feature/Bug causing the regression]:channel.suspend does not suspend during on-modify
[User impact if declined]:request redirection result in two requests, the initial plus the redirected request.  
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]:none
[Is the change risky?]:moderate
[Why is the change risky/not risky?]:it is a low level change to httpchannel 
[String changes made/needed]:none

I do think this should be uplifted to Fx54, not sure if we've missed the merge and would need beta uplift at this point.
Flags: needinfo?(mixedpuppy)
Attachment #8848647 - Flags: approval-mozilla-aurora?
Comment on attachment 8848647 [details]
Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler,

Fix a redirect issue and include test. Aurora54+.
Attachment #8848647 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Flags: qe-verify+
needs rebase
merging netwerk/protocol/http/nsHttpChannel.cpp
merging netwerk/protocol/http/nsHttpChannel.h
merging netwerk/test/unit/xpcshell.ini
2017-04-18 11:49:18.852 FileMerge[5589:2084360] Unable to load platform at path /Applications/Xcode.app/Contents/Developer/Platforms/AppleTVOS.platform
2017-04-18 11:49:18.853 FileMerge[5589:2084360] Unable to load platform at path /Applications/Xcode.app/Contents/Developer/Platforms/AppleTVSimulator.platform
2017-04-18 11:49:18.854 FileMerge[5589:2084360] Unable to load platform at path /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneOS.platform
2017-04-18 11:49:18.855 FileMerge[5589:2084360] Unable to load platform at path /Applications/Xcode.app/Contents/Developer/Platforms/iPhoneSimulator.platform
2017-04-18 11:49:18.858 FileMerge[5589:2084360] Unable to load platform at path /Applications/Xcode.app/Contents/Developer/Platforms/WatchOS.platform
2017-04-18 11:49:18.859 FileMerge[5589:2084360] Unable to load platform at path /Applications/Xcode.app/Contents/Developer/Platforms/WatchSimulator.platform
Flags: needinfo?(mixedpuppy)
Comment on attachment 8848647 [details]
Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler,

FYI, the conflict was in nsHttpChannel.h. Also, this'll need Beta approval (and the Aurora approval cleared) now that 54 has been uplifted.
Attachment #8848647 - Flags: approval-mozilla-beta?
Please consider fixing this in ESR, otherwise those who are on the ESR channel would be stuck with this bug for several months.
This patch applies to aurora and beta branches pull this morning.
Flags: needinfo?(mixedpuppy)
Try for aurora comment 73 and beta comment 74
Comment on attachment 8848647 [details]
Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler,

[Approval Request Comment]
See comment 67 for details.  Fx52 was the first version affected by this
Attachment #8848647 - Flags: approval-mozilla-esr52?
Comment on attachment 8848647 [details]
Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler,

Per comment #68, Beta54+.
Attachment #8848647 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment on attachment 8848647 [details]
Bug 1345893 - Handle Suspend() called on an HTTP channel from http-on-modify-request handler,

This seems like a core scenario, ESR52.2+
Attachment #8848647 - Flags: approval-mozilla-esr52? → approval-mozilla-esr52+
I was able to reproduce the issue here using URLRedirector and fiddler with instructions from comment 20. I used Firefox 52.0.2 to reproduce. 

Using Firefox 54.0 RC I could not repro this anymore so I'll go ahead and mark it as verified on 54 branch.
Flags: qe-verify+
Blocks: 1407384
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: