Stalled proxy connection reused when the proxy responses to CONNECT with 403
Categories
(Core :: Networking: HTTP, defect, P2)
Tracking
()
People
(Reporter: numsys, Assigned: mayhemer)
References
(Blocks 1 open bug, Regression)
Details
(Keywords: regression, Whiteboard: [necko-triaged])
Attachments
(1 file)
47 bytes,
text/x-phabricator-request
|
RyanVM
:
approval-mozilla-esr60+
|
Details | Review |
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:64.0) Gecko/20100101 Firefox/64.0
Steps to reproduce:
With a proxy when a SSL website is denied, a persistent connexion is maintained
Firefox at least 52 -> 64 with a web filter proxy
1 - I'm surfing to the website https://www.toto.fr
2 - I receive a 403 (blank page in FF)
3 - Right after I refresh the page, and I wait a long long time before timeout.
Despite the header "403 Forbidden" firefox still waiting connections, I guess that this behaviour could also generates some other bugs
Actual results:
After a refresh the browser still waiting, not happening with IE or Chrome
A real life issue is filtering ADS, surf to www.aaa.com and block www.bbb.com (ads), there are multiple links to bbb in aaa, in this case www.aaa.com never appears completely (or after a long time) the browser hangs and still waiting bbb (the name appears in bottom: waiting for bbb)
With many users the proxy resources are quickly exhausted
Expected results:
Firefox should totally drop the connexion when the 403 is received.
As a workaround you can disable client_persistent_connection in proxy (lost of performance) or use an another browser (too bad)
Here a short tcpdump trace https://nas.traceroot.fr:8081/owncloud/index.php/s/egrcXnU3lxiU0mi
Updated•6 years ago
|
Comment 1•6 years ago
|
||
Hi Reporter,
Could you please make a Http log?
Please see https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging.
Thanks.
toto.fr is denied -> 403, but right after that I refresh and FF is waiting a connection
Comment 4•6 years ago
|
||
The priority flag is not set for this bug.
:selenamarie, could you have a look please?
Comment 5•6 years ago
|
||
Junior -- could you have a look?
Comment 6•6 years ago
|
||
The second channel is canceled in the child process.
Hello FredB,
Could you please also provide the child log? i.e., log.txt-child.[number]
Thanks.
There is no such file with timestamp,rotate:200,nsHttp:5,cache2:5,nsSocketTransport:5,nsHostResolver:5
Comment 8•6 years ago
|
||
Try set security.sandbox.content.level to 0 in about:config during the experiment?
Thanks.
Assignee | ||
Comment 9•6 years ago
|
||
And also start logging from the very start of the application ([1]) and not through about:networking.
Reporter | ||
Comment 10•6 years ago
|
||
@junior nothing more
@honza no log at all (FI the path is wrong in documentation, here "c:\Program Files (x86)\Mozilla Firefox\firefox.exe" -MOZ_LOG=timestamp,rotate:200,nsHttp:5,cache2:5,nsSocketTransport:5,nsHostResolver:5 -MOZ_LOG_FILE=%TEMP%\log.txt"
Not
c:\Program Files\Mozilla Firefox (x86)
Assignee | ||
Comment 11•6 years ago
|
||
(In reply to FredB from comment #10)
@junior nothing more
@honza no log at all (FI the path is wrong in documentation, here "c:\Program Files (x86)\Mozilla Firefox\firefox.exe" -MOZ_LOG=timestamp,rotate:200,nsHttp:5,cache2:5,nsSocketTransport:5,nsHostResolver:5 -MOZ_LOG_FILE=%TEMP%\log.txt"Not
c:\Program Files\Mozilla Firefox (x86)
Fixed, thanks for spotting this!
Not that the command line args don't work with Firefox 60 and less. If you are trying any older such version, you will need to use environment variables [1]
If you are on 61+, then please try to create an empty folder in the c: root (e.g. "firefoxlog") and change to -MOZ_LOG_FILE=c:\firefoxlog\log.txt.
thanks.
Reporter | ||
Comment 12•6 years ago
|
||
Firefox 65 (32 bits)
"c:\Program Files (x86)\Mozilla Firefox\firefox.exe" -MOZ_LOG=timestamp,rotate:200,nsHttp:5,cache2:5,nsSocketTransport:5,nsHostResolver:5 -MOZ_LOG_FILE=c:\firefoxlog\log.txt
Nothing in c:\firefox.log
Assignee | ||
Comment 13•6 years ago
|
||
Create a folder such as c:\logs or something. User space applications usually can't write to c:\ root.
Assignee | ||
Comment 14•6 years ago
|
||
FredB, any news? Do you need any more help?
(Also note that 403 is a regular response when coming from the end server and doesn't mean to drop the keep-alive connection it was received on, if the server is HTTP/1.1 and doesn't explicitly response with connection: close; HTTP/2 is a completely different story, but w/o the log we can't say what's actually happening)
Reporter | ||
Comment 15•6 years ago
|
||
Yes It's what I'm doing, sorry I just forgot the directoy in my previous answer c:\firefox\
"c:\Program Files (x86)\Mozilla Firefox\firefox.exe" -MOZ_LOG=timestamp,rotate:200,nsHttp:5,cache2:5,nsSocketTransport:5,nsHostResolver:5 -MOZ_LOG_FILE=c:\firefoxlog\log.txt
c:\firefoxlog Is always empty, same thing with set environement
Yes maybe it's not really a bug, but Chrome and IE close the connection, so with a web filter there is a high performance issue with FF because many connections still opened for nothing
Assignee | ||
Comment 16•6 years ago
|
||
It's very strange that there are no logs produced. This is something us and number of other users do very often. I'm not sure we can move forward w/o them.
There is also another way, via about:networking#logging. Leave the default setup and just Start Logging before reproducing the issue and Stop Logging after.
Reporter | ||
Comment 17•6 years ago
|
||
Ok here https://nas.traceroot.fr:8081/owncloud/index.php/s/CYY6A0RgdW1XSlm
Stopped right after the timeout
Reporter | ||
Comment 18•6 years ago
|
||
Here I'm playing with the 403 answers, and now Firefox close the connection even with many f5 refresh https://github.com/e2guardian/e2guardian/blob/v5.3/src/ConnectionHandler.cpp#L1805
https://nas.traceroot.fr:8081/owncloud/index.php/s/xIuPPcwN8EY1eUp
But as far I can tell it's a corruped header now (\r\n is missing at the end) this could potentially lead to some others ICAP errors
Assignee | ||
Comment 19•6 years ago
|
||
(In reply to FredB from comment #17)
Ok here https://nas.traceroot.fr:8081/owncloud/index.php/s/CYY6A0RgdW1XSlm
Stopped right after the timeout
Thanks.
Based on the claim that this affect 52+, this is probably a regression (or more uncover of an existing bug) from bug 1345910.
This pretty much looks like a misconfiguration of the proxy, but we should handle it more gracefully. The 403 apparently comes from the proxy as a response to CONNECT.
There are two scenarios.
- behaving "correctly"
- a request to an https site
- we do a CONNECT request to the proxy
- get 407
- nsHttpChannel at [1] sets the sticky flag, which is then set on the transaction doing re-auth
- we do another CONNECT request with proxy auth headers
- get 403
- as the connection is marked sticky, it's not reclaimed during time of the first call to nsHttpTransaction::Close [2]
(- later the channel detects proxy conn failure [3]) - and the code from bug 1345910 closes (DontReuse()) the conn, so the bug doesn't manifest in this case (this is just a coincidence, the code has to react to a different kind of error)
- behaving "incorrectly"
- a request to an https site
- we do a CONNECT request to the proxy
- get 403
- the connection is not marked sticky (still, it's kept alive)
- we reclaim the conn (add it to the idle poll for reuse) at [2], note that it's keep-alive according the response header in the 403 head
Now this connection in an unexpectedly broken state is ready for reuse.
[1] https://searchfox.org/mozilla-central/rev/5e3bffe964110b8d1885b4236b8abd5c94d8f609/netwerk/protocol/http/nsHttpChannel.cpp#8427
[2] https://searchfox.org/mozilla-central/rev/00c0d068ece99717bea7475f7dc07e61f7f35984/netwerk/protocol/http/nsHttpTransaction.cpp#1162
[3] https://searchfox.org/mozilla-central/rev/5e3bffe964110b8d1885b4236b8abd5c94d8f609/netwerk/protocol/http/nsHttpChannel.cpp#1841
Possible fix could be to treat the connection as sticky (to not release it) when the transaction has mProxyConnectFailed set.
What I can see in the log for the bad case:
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: I/nsHttp http response [
HTTP/1.1 403 Forbidden
Content-Type: text/html
Content-Length: 142
Date: Mon, 11 Feb 2019 15:15:20 GMT
X-Cache: MISS from proxyorion_test
X-Cache-Lookup: NONE from proxyorion_test:3128
Via: 1.1 proxyorion_test (squid/4.5)
Connection: keep-alive
OriginalHeaders
Content-Type: text/html
Content-Length: 142
Date: Mon, 11 Feb 2019 15:15:20 GMT
X-Cache: MISS from proxyorion_test
X-Cache-Lookup: NONE from proxyorion_test:3128
Via: 1.1 proxyorion_test (squid/4.5)
Connection: keep-alive
]
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpTransaction::CheckForStickyAuthScheme this=0F9AAC00
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpConnection::OnHeadersAvailable [this=1186A400 trans=0F9AAC00 response-head=11838EE0]
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: V/nsHttp Connection can be reused [this=1186A400 idle-timeout=115sec]
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: V/nsHttp proxy CONNECT failed! endtoendssl=1 onlyconnect=0
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: E/nsHttp nsHttpTransaction::HandleContent [this=0F9AAC00 count=142 read=142 mContentRead=142 mContentLength=142]
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpTransaction::RemoveDispatchedAsBlocking this=0F9AAC00 not blocking
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpTransaction 0F9AAC00 request context set to null in ReleaseBlockingTransaction() - was 0BD46C10
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpConnection::OnSocketReadable 1186A400 trans->ws rv=0 n=142 socketin=0
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpTransaction::WriteSegments 0F9AAC00
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpConnection::OnSocketReadable 1186A400 trans->ws rv=80470002 n=142 socketin=0
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpConnection::CloseTransaction[this=1186A400 trans=0F9AAC00 reason=80470002]
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpTransaction::Close [this=0F9AAC00 reason=0]
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpConnectionMgr::RemoveActiveTransaction t=0F9AAC00 tabid=200000001(1) thr=0
2019-02-11 15:15:09.098000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpConnectionMgr::ReclaimConnection [conn=1186A400]
now the connection is already returned the connection to the pool as reusable, making it inaccessible to the http channel that later detects the proxy connection has failed and could mark the connection as non-reusable.
2019-02-11 15:15:09.099000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpTransaction::RemoveDispatchedAsBlocking this=0F9AAC00 not blocking
2019-02-11 15:15:09.099000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpTransaction 0F9AAC00 request context set to null in ReleaseBlockingTransaction() - was 00000000
2019-02-11 15:15:09.099000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpConnectionMgr::OnMsgReclaimConnection [ent=14F67A40 conn=1186A400]
2019-02-11 15:15:09.099000 UTC - [Parent 4392: Socket Thread]: E/nsHttp nsHttpConnection::SetupSSL 1186A400 caps=0x1029 PS.....[tlsflags0x00000000]www.toto.fr:443 (http:172.23.0.11:3128)[:]
2019-02-11 15:15:09.099000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpConnection::BeginIdleMonitoring [this=1186A400]
2019-02-11 15:15:09.099000 UTC - [Parent 4392: Socket Thread]: V/nsHttp Entering Idle Monitoring Mode [this=1186A400]
2019-02-11 15:15:09.099000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpConnection::TTL: 1186A400 www.toto.fr idle 2 timeout 115000
2019-02-11 15:15:09.107000 UTC - [Parent 4392: Main Thread]: V/nsHttp HttpBaseChannel::SetNewListener [this=11C86000, mListener=1023E73C, newListener=0F963380]
2019-02-11 15:15:09.109000 UTC - [Parent 4392: Main Thread]: D/nsHttp sending progress and status notification [this=11C86000 status=804b000a progress=0/0]
804b000a = STATUS_WAITING_FOR
2019-02-11 15:15:09.113000 UTC - [Parent 4392: Main Thread]: D/nsHttp nsHttpChannel::ResumeInternal [this=11C86000]
2019-02-11 15:15:09.113000 UTC - [Parent 4392: Main Thread]: D/nsHttp nsHttpChannel::OnStartRequest [this=11C86000 request=0E23D300 status=0]
2019-02-11 15:15:09.113000 UTC - [Parent 4392: Main Thread]: D/nsHttp nsHttpChannel::ProcessResponse [this=11C86000 httpStatus=403]
2019-02-11 15:15:09.113000 UTC - [Parent 4392: Main Thread]: D/nsHttp Cancelling failed proxy CONNECT [this=11C86000 httpStatus=403]
Here we know for sure we failed to connect the proxy, but can't close the connection, it's too late.
2019-02-11 15:15:09.113000 UTC - [Parent 4392: Main Thread]: D/nsHttp nsHttpChannel::Cancel [this=11C86000 status=804b0048]
2019-02-11 15:15:09.113000 UTC - [Parent 4392: Main Thread]: V/nsHttp nsHttpConnectionMgr::CancelTransaction [trans=0F9AAC00 reason=804b0048]
2019-02-11 15:15:09.113000 UTC - [Parent 4392: Main Thread]: D/cache2 CacheFileInputStream::Close() [this=0FDD09C0]
2019-02-11 15:15:09.113000 UTC - [Parent 4392: Main Thread]: D/nsHttp nsHttpChannel::CallOnStartRequest [this=11C86000]
2019-02-11 15:15:09.114000 UTC - [Parent 4392: Socket Thread]: V/nsHttp nsHttpTransaction::Close [this=0F9AAC00 reason=804b0048]
2019-02-11 15:15:09.114000 UTC - [Parent 4392: Socket Thread]: V/nsHttp already closed
Assignee | ||
Comment 20•6 years ago
|
||
Reporter | ||
Comment 21•6 years ago
|
||
Thank you, what should be a "good" answer from proxy when a ssl website is denied?
I will test without any identification (407) but I'm almost sure that I tried before without more success
I can't do more tests righ now, but at the end of the week
Assignee | ||
Comment 22•6 years ago
|
||
(In reply to FredB from comment #21)
Thank you, what should be a "good" answer from proxy when a ssl website is denied?
I think I understand now. The 403 response is how you implement site blocking directly on the proxy, right? Then I think 403 is just fine, you just found an actual bug in Firefox!
I will test without any identification (407) but I'm almost sure that I tried before without more success
I can't do more tests righ now, but at the end of the week
I've submitted a try build, soon I will provide a link to a test build. I would then ask you to test it. Please don't try any other workarounds or fixes in your environment until then, so we can confirm the fix. I have no way to test it locally right now.
Thanks!
Assignee | ||
Comment 23•6 years ago
|
||
Here https://queue.taskcluster.net/v1/task/DP63nIAmTs6jy7XivDKHsA/runs/0/artifacts/public/build/install/sea/target.installer.exe is a test build installer. This is based on Firefox 67 (current Nightly.) Please install it and run it for your test case.
A new profile will be used automatically for you: https://support.mozilla.org/en-US/kb/dedicated-profiles-firefox-installation
Thanks!
Assignee | ||
Comment 24•6 years ago
|
||
Comment on attachment 9043060 [details]
Bug 1522093 - Make sure to not reuse a connection to a proxy that returns an error response we can't gracefully handle, r=kershaw
Comment 25•6 years ago
|
||
Hello,
I work with FredB and he asked me to test the fix you sent us.
Good news, it seems to work, I don't have to wait a timeout after refreshing the page, it is dropped when 403 is received.
When there is multiple link in website (like the example with ads), it works too, the browser stop to hang
(the name of bad website do not appears in bottom : waiting for bbb)
I let you test on your side.
Regards
PS : Do you have an idea if can expect a fix for FF54 and more ? It would solve our problem
Assignee | ||
Comment 26•6 years ago
|
||
(In reply to gautier-ph from comment #25)
Hello,
I work with FredB and he asked me to test the fix you sent us.
Good news, it seems to work, I don't have to wait a timeout after refreshing the page, it is dropped when 403 is received.
Thank you! This verifies the fix for me, greatly appreciated.
When there is multiple link in website (like the example with ads), it works too, the browser stop to hang
(the name of bad website do not appears in bottom : waiting for bbb)I let you test on your side.
Regards
PS : Do you have an idea if can expect a fix for FF54 and more ? It would solve our problem
When this gets reviewed (in order of days) it will land in 67, Nightly. We will probably uplift to 66 Beta, but not any other branch, no. Definitely not to 54 (if that was not a typo) and not to 64 either (it's EOL already, current Release is 65.)
Assignee | ||
Updated•6 years ago
|
Comment 27•6 years ago
|
||
Pushed by rmaries@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6de9408be22f
Make sure to not reuse a connection to a proxy that returns an error response we can't gracefully handle, r=kershaw
Reporter | ||
Comment 28•6 years ago
|
||
Hi all,
Ok check done, I confirm it works
I also tested to inject some other responses (302,307, etc) all good !
Comment 29•6 years ago
|
||
bugherder |
Comment 30•6 years ago
|
||
Hi,
Thank you for your answer,
In our environment we only use the ESR release.
You said it will land in release 67 and not before but will it be also pushed in ESR release 60.7.0 ?
Regards
Assignee | ||
Comment 31•6 years ago
|
||
(In reply to gautier-ph from comment #30)
will it be also pushed in ESR release 60.7.0 ?
That sound more plausible, as this is something effecting corporations. I will nominate! Thanks.
Assignee | ||
Comment 32•6 years ago
|
||
Comment on attachment 9043060 [details]
Bug 1522093 - Make sure to not reuse a connection to a proxy that returns an error response we can't gracefully handle, r=kershaw
ESR Uplift Approval Request
If this is not a sec:{high,crit} bug, please state case for ESR consideration
This bug effects mostly corporate proxies configured as web blockers.
User impact if declined
The effect is that connections to outer servers may hang forever after connecting a filtered host/URI. User has to refresh but it doesn't always resolve the issue. So I consider this as a major usability problem.
Fix Landed on Version
67
Risk to taking this patch
Medium
Why is the change risky/not risky? (and alternatives if risky)
Low-Medium only because this has not been so far widely tested for side effects and this code is always sensitive.
We will try to create a test (bug 1527489), but it MAY depend on another infra bug to be fixed first (bug 469228)
String or UUID changes made by this patch
None
Assignee | ||
Comment 33•6 years ago
|
||
gautier-ph, FredB, let you know that there may also be a workaround on your side. instead of replying with 403+connection: keep-alive, you may answer with 403+connection: close and also actually close the connection on the proxy side. this is the bit that I think is a misconfiguration of your blocking proxy.
Reporter | ||
Comment 34•6 years ago
|
||
I tested, connection: close, but without more success (and with 302 and 307 requests also)
This bug seems fixed to me now
Thanks to all
Updated•6 years ago
|
Updated•6 years ago
|
Comment 35•6 years ago
|
||
I'd feel better about taking this on ESR if it had some bake time on Beta first.
Assignee | ||
Comment 36•6 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #35)
I'd feel better about taking this on ESR if it had some bake time on Beta first.
Sorry for late answer. It's less than two weeks to merge m-c to m-b. Honestly, from my experience, beta coverage won't tell us much. Proxies are more enterprise thing and the case this bug deals with is even more 'enterprisy'. Baking on release may tell us something, but Beta will likely not.
So, push to Beta now to get it to Release soon? :)
Updated•6 years ago
|
Comment 37•6 years ago
|
||
Hi FredB, I am trying to reproduce this issue using your steps but I'm not sure what web filter proxy did you use ? I tried setting up a proxy from the Firefox settings but I can't find a way to block that toto.fr website, IF I set up a Proxy server in Firefox Network settings its asking for a username and password and as soon as I login, the website loads correctly, if I do not use a password and just click cancel then the page fails to load with the 407 proxy error.
Can you help us with some steps on how to set up our environment so we can reproduce this issue?
Comment 38•6 years ago
|
||
Hi, After talking to Fred and Philippe via email, This issue has been Verified in our latest Beta builds and it's been confirmed as Verified, I will mark it accordingly. Thank you both.
Comment 39•6 years ago
|
||
Since there is a workaround, I am not sure I see the urgency. We could wait one more cycle for 68, which will be a new major ESR. Or, at that point, also uplift to 60.8.0 esr. mayhemer, what do you think?
Comment 40•6 years ago
|
||
Hi, I'm Fredb
This bug is especially annoying in professional environment (performance reduced with FF) and ESR version is the target for us.
There is no workaround except using FF 67
Assignee | ||
Comment 41•6 years ago
|
||
(In reply to Liz Henry (:lizzard) (use needinfo) from comment #39)
Since there is a workaround, I am not sure I see the urgency. We could wait one more cycle for 68, which will be a new major ESR. Or, at that point, also uplift to 60.8.0 esr. mayhemer, what do you think?
Let's uplift to esr 60.8.
Updated•6 years ago
|
Comment 42•5 years ago
|
||
Comment on attachment 9043060 [details]
Bug 1522093 - Make sure to not reuse a connection to a proxy that returns an error response we can't gracefully handle, r=kershaw
Fixes connection hangs for users in corporate environments. Baked on 67 release without known issues. Approved for 60.8esr.
Comment 43•5 years ago
|
||
bugherder uplift |
Comment 44•5 years ago
|
||
Backed out for bustage at nsHttpChannel.cpp:1657:
https://hg.mozilla.org/releases/mozilla-esr60/rev/409a5e270fa68682629a992def5fcd8a65fbc328
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=252659924&repo=mozilla-esr60
netwerk/protocol/http/nsHttpChannel.cpp:1657:17: error: 'class mozilla::net::nsHttpTransaction' has no member named 'DontReuseConnection'; did you mean 'SetConnection'?
Comment 45•5 years ago
|
||
Bugbug thinks this bug is a regression, but please revert this change in case of error.
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 46•5 years ago
|
||
This one can't be backported because we don't have nsHttpTransaction::DontReuseConnection() API; added in bug 1451293 (67).
Comment 47•5 years ago
|
||
Thanks, ESR68 it is then. For those involved in this bug from outside Mozilla, the first 68.0esr builds will be available in early July, though we won't force-migrate users from ESR60 until the 68.2esr release later this year.
Updated•5 years ago
|
Updated•5 years ago
|
Updated•3 years ago
|
Description
•