Closed
Bug 1409858
Opened 7 years ago
Closed 7 years ago
Proxy with authorization causes all networking to lock up
Categories
(Core :: Networking, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla58
People
(Reporter: yegors, Assigned: mcmanus)
Details
(Whiteboard: [necko-triaged])
Attachments
(1 file)
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/60.0.3112.90 Safari/537.36 OPR/47.0.2631.71 Steps to reproduce: Load a PAC file into the browser which connects to a HTTPS proxy server (nghttpx + squid backend) that requires authentication. function FindProxyForURL(url, host) { return "HTTPS proxy-domain.com"; } Actual results: All network requests lock up, which requires a browser restart to get it all working again. Unloading the PAC file has no effect. This issue only occurs when using the Webextensions API or loading the PAC file manually into the browser. Expected results: Should have gotten an authentication popup caused by a 407 response code, requiring the user to enter their username/password.
Updated•7 years ago
|
Component: Untriaged → WebExtensions: General
Product: Firefox → Toolkit
Updated•7 years ago
|
Component: WebExtensions: General → WebExtensions: Request Handling
Flags: needinfo?(mixedpuppy)
Comment 1•7 years ago
|
||
(In reply to yegors from comment #0) > This issue only occurs when using the Webextensions API or loading the PAC > file manually into the browser. Can you clarify this statement? It sounds like you're saying you can also create the problem without using webextensions, but loading the PAC in network preferences. In that case, I'm assuming that this will happen for you without any extensions installed. Is that correct?
Flags: needinfo?(yegors)
The issue occurs when loading the PAC file from disk, however when using a legacy extension to load a base64 encoded PAC file into FF56, the same issue does not occur. It happens only when using a pure webextension version + manual loading of PAC file from disk. You can confirm by using the above posted PAC file with the following hostname: ro-003.windscribe.com The issue occurs in 57 beta, and 58 nightly as well.
Flags: needinfo?(yegors)
Comment 3•7 years ago
|
||
I'm able to produce the failure without any involvement from any extension. Likely the fact that it also happens with a webextension installing the pac script is due to the underlying problem. STR create a pac file from comment 0 using domain from comment 2 install via preferences restart firefox go to mozilla.org The request is stuck with a message about doing the TLS handshake with mozilla.org. If I already had mozilla.org cached prior to installing the pac, and that tab was the selected tab, and you reopen tabs on restart, you'll get the proxy auth dialog. cancel that, try to reload, same failure, you never get an auth dialog again. Also interesting is that trying to open the browser toolbox in this scenario results in a blank window for the toolbox.
status-firefox57:
--- → affected
status-firefox58:
--- → affected
Component: WebExtensions: Request Handling → Networking
Flags: needinfo?(mixedpuppy)
Product: Toolkit → Core
Updated•7 years ago
|
Summary: WebExtensions - Proxy with authorization causes all networking to lock up → Proxy with authorization causes all networking to lock up
Updated•7 years ago
|
Status: UNCONFIRMED → NEW
Ever confirmed: true
Also encountered this bug. After network lockup, Web Extension is powerless to change anything.
Comment 5•7 years ago
|
||
Unclear if this is a regression or not. That's the first thing we should figure out. Either way this is nasty enough that a 57 uplift would be great if we can get a simple fix here. Honza is the logical person here, but he's out sick. I'll see if I can find someone else to get started at least.
Flags: needinfo?(honzab.moz)
Priority: -- → P1
Updated•7 years ago
|
Whiteboard: [necko-triaged]
Thanks for the concern. To be more specific. Before it was possible to access low-level API and set headers to prevent such situation. Like here: https://developer.mozilla.org/ru/docs/Setting_HTTP_request_headers. With Web Extension when response returns HTTP 407 it freezes everything. And we cannot set "Proxy-Authorization" header from within Web Extension according to policy rules to prevent it. There should be a way to handle it withing Web Extension. Something is definitely off: there is no response at all, not even error, timeout etc.
Reporter, is the certificate for the proxy server self-signed (or otherwise not something that firefox would normally accept without an exception being made)? If so, you probably need to visit the proxy server directly first (just put https://proxy-domain.com in the url bar) and add an exception for the certificate. Then we'll be able to make connections over that proxy.
Flags: needinfo?(yegors)
No, the certificate is not self signed. We use LetsEncrypt.
Flags: needinfo?(yegors)
Another piece of information: We noticed this issue only when we upgraded nghttpx to the latest version. Previously, when the proxy hostname was accessed directly (just go to it as if it's a URL), it would return a 400 bad Request error, see https://ro-002.windscribe.com This behavior changed recently to return a 407 instead, see https://ro-003.windscribe.com My best guess here is that this causes an infinite loop. This did not occur in the legacy extension API since we explicitly set the Proxy-Authorization header with the required credentials, but this API is no longer available in webextensions, as mentioned by Alex above. Link to the issue: https://github.com/nghttp2/nghttp2/issues/1030
Comment 10•7 years ago
|
||
It's definitely in a loop. Manually configuring the proxy again and turning on http logging, I got many megabytes of: [Socket Thread]: V/nsHttp nsHttpConnection::OnSocketReadable 0x11050a000 trans->ws rv=0 n=0 socketin=0 [Socket Thread]: D/nsHttp TLSFilterTransaction::WriteSegments 0x113e2f9d0 max=32768 [Socket Thread]: V/nsHttp nsHttpTransaction::WriteSegments 0x11dff9800 [Socket Thread]: V/nsHttp nsHttpConnectionMgr::TouchThrottlingTimeWindow [Socket Thread]: D/nsHttp TLSFilterTransaction::OnWriteSegment 0x113e2f9d0 max=32768 [Socket Thread]: D/nsHttp TLSFilterTransaction 0x113e2f9d0 called trans->WriteSegments rv=0 0 [Socket Thread]: V/nsHttp nsHttpConnection::OnSocketReadable 0x11050a000 trans->ws rv=0 n=0 socketin=0 [Socket Thread]: D/nsHttp TLSFilterTransaction::WriteSegments 0x113e2f9d0 max=32768 [Socket Thread]: V/nsHttp nsHttpTransaction::WriteSegments 0x11dff9800 [Socket Thread]: V/nsHttp nsHttpConnectionMgr::TouchThrottlingTimeWindow [Socket Thread]: D/nsHttp TLSFilterTransaction::OnWriteSegment 0x113e2f9d0 max=32768 [Socket Thread]: D/nsHttp TLSFilterTransaction 0x113e2f9d0 called trans->WriteSegments rv=0 0
Comment 11•7 years ago
|
||
(In reply to Alex from comment #6) > With Web Extension when response returns HTTP 407 it freezes everything. And > we cannot set "Proxy-Authorization" header from within Web Extension > according to policy rules to prevent it. Actually you can set that header, just not for system level requests. So it's not terribly helpful. Instead, you can use webRequest.onAuthRequired which will allow you to return credentials for system requests as of 57 (Bug 1401350). But otherwise, this bug has nothing to do with extensions. There is some lower level interaction happening that is causing a loop.
Reporter | ||
Comment 12•7 years ago
|
||
Alright. Just FYI, the test hosts above will no longer exhibit the issue. We downgraded the software on our end to allow access since these are production machines.
Comment 13•7 years ago
|
||
Yegors, Do you have any remaining machines that exhibit the issue? If we have a live URI it will make this easier to debug for us.
Flags: needinfo?(yegors)
Comment 14•7 years ago
|
||
Pat: rumor has it you're actually the best candidate for this bug while Honza is out.
Flags: needinfo?(mcmanus)
Assignee | ||
Comment 15•7 years ago
|
||
I really don't know if this is a regression or not.. or it is more about 407 or https-proxy or requires both. I'm juggling a few things today as well. Nick knows the https tunnel code too, but its really not clear what's going on. As for regression - that's not clear either. some form of auth definitely was tested and working with https proxies but I'm not sure if it was exactly this flavor or not. will research.
Flags: needinfo?(mcmanus)
Comment 16•7 years ago
|
||
If this is happening inside h2 or is h2 specific, then I'm probably not the right person. Patrick or Nick know these pieces better. A life machine to test with would be really helpful. Other option is a full log according: https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging#Logging_HTTP_activity_by_manually_setting_environment_variables Thanks.
Flags: needinfo?(honzab.moz)
Comment 17•7 years ago
|
||
Or, is it still reproducible with ro-003.windscribe.com ?
Reporter | ||
Comment 18•7 years ago
|
||
You can use this for testing: https://stg-ca-001.windscribe.com
Flags: needinfo?(yegors)
Comment 19•7 years ago
|
||
(In reply to yegors from comment #18) > You can use this for testing: https://stg-ca-001.windscribe.com Thanks, locally reproducible for me. > xul.dll!mozilla::net::TLSFilterTransaction::OnWriteSegment(0x000002116dd2d000, , 0x00000029685fd7e0) Line 281 C++ xul.dll!mozilla::net::nsHttpTransaction::WritePipeSegment(0x0000000000000000, 0x0000021162c50c00, 0x000002116dd2d000, 1715765368, 32768, 0x00000029685fd7e0) Line 830 C++ xul.dll!nsPipeOutputStream::WriteSegments(0x00007ffe6c4371dc, 0x0000021162c50c00, 32768, 0x00000029685fd918) Line 1835 C++ xul.dll!mozilla::net::nsHttpTransaction::WriteSegments(0x0000021166efd578, 32768, 0x00000029685fd918) Line 950 C++ xul.dll!mozilla::net::TLSFilterTransaction::WriteSegments(0x000002116df5bc08, 32768, 0x00000029685fd918) Line 360 C++ xul.dll!mozilla::net::nsHttpConnection::OnSocketReadable() Line 1914 C++ xul.dll!mozilla::net::nsHttpConnection::OnInputStreamReady(0x0000000000008000) Line 2250 C++ xul.dll!mozilla::net::SpdyConnectTransaction::WriteSegments(0x000002116dd640c8, 32768, 0x00000029685feaf8) Line 1208 C++ xul.dll!mozilla::net::Http2Stream::WriteSegments(0x0000021166ed9820, 32768, 0x00000029685feaf8) Line 308 C++ xul.dll!mozilla::net::Http2Session::WriteSegmentsAgain(0x000002116ac8c408, 32768, 0x00000029685feaf8, 0x00000029685feaf0) Line 3216 C++ xul.dll!mozilla::net::nsHttpConnection::OnSocketReadable() Line 1914 C++ xul.dll!mozilla::net::nsHttpConnection::OnInputStreamReady(0x0000000000000000) Line 2250 C++ xul.dll!mozilla::net::nsSocketInputStream::OnSocketReady(NS_OK) Line 299 C++ xul.dll!mozilla::net::nsSocketTransport::OnSocketReady(0x000002116a6be520, ) Line 2209 C++ xul.dll!mozilla::net::nsSocketTransportService::DoPollIteration() Line 1215 C++ xul.dll!mozilla::net::nsSocketTransportService::Run() Line 948 C++ PR_Read() returns -12263 == SSL_ERROR_RX_RECORD_TOO_LONG leaving number of read bytes at 0, which discards the error in nsPipeOutputStream::WriteSegments and makes us loop in nsHttpConnection::OnSocketReadable (n=0, mSocketInCondition=OK). Not sure if mFilterReadCode is implemented correctly and should be rewritten to break the loop in in this case. According [1] Patrick or Nick should look at this. [1] https://hg.mozilla.org/mozilla-central/annotate/20d57b9c4183973af4af5e078dff2aec0b74f928/netwerk/protocol/http/TunnelUtils.cpp#l277
Assignee | ||
Comment 20•7 years ago
|
||
yegors - thanks for the help and especially the test case. I can see with it that it triggers a different code path than my h2-proxy-with-407 test case (which is still working fine modulo a couple stale debug assertions that wouldn't impact you). I suspect this is a latent bug rather than a regression; but still need a little testing.
Assignee | ||
Comment 21•7 years ago
|
||
honza, what ff version(s) did you test with? (and assertions on or off?) thanks.
Comment 22•7 years ago
|
||
(In reply to Patrick McManus [:mcmanus] from comment #21) > honza, what ff version(s) did you test with? (and assertions on or off?) > thanks. opt m-c@2c17b581bc18 [Wed, 11 Oct 2017 08:10:50 -0400 (9 days ago)] (got the stacks from) and up to date nightly (just logs, but looks exactly the same).
Assignee | ||
Comment 23•7 years ago
|
||
I have tested with the proxy from comment 18 with several versions, all of which fail. So this isn't a P1 regression. I'll try and figure out a fix though. I have another proxy that is also HTTP/2 and 407 based and it doesn't have a serious problem on any of the versions. Not sure what the difference is. tested; 57b9, 56.0.1, 55.0.3, and 52.4.1-esr
Assignee | ||
Comment 25•7 years ago
|
||
yegors, I might have a fix. If you'd like to email me a credential at mcmanus@ducksong.com I can try your proxy for a full unit test.. but right now I can confirm that I can get the normal 407 popup and then fail to authenticate (i.e. the lockup does not happen). I would also like to give you a build to test - but our CI seems to be down at the moment so I'm not sure when I would have a build for you. thanks.
Flags: needinfo?(yegors)
Reporter | ||
Comment 26•7 years ago
|
||
Patrick, I emailed you the test credentials.
Flags: needinfo?(yegors)
Assignee | ||
Comment 27•7 years ago
|
||
the proxy in comment 18 is confirmed to work with the credentials from comment 26 and the patch from https://treeherder.mozilla.org/#/jobs?repo=try&revision=7ea399883d82c7b6ff5ce24db6138d48da1791dd
Assignee | ||
Comment 28•7 years ago
|
||
would you be willing to try out a bug fix build? This is a 64 bit windows build https://queue.taskcluster.net/v1/task/WtsX9s60QS-PZb2bde4KNw/runs/0/artifacts/public/build/target.zip you can also find a 64 bit linux build by clicking on the B in the linux-64 line in url in comment 27. thanks!
Comment hidden (mozreview-request) |
Assignee | ||
Comment 30•7 years ago
|
||
mozreview-review |
Comment on attachment 8920711 [details] Bug 1409858 - H2 CONNECT Tunnel Marked Plaintext Too Late https://reviewboard.mozilla.org/r/191714/#review196934 ::: netwerk/protocol/http/Http2Session.cpp:598 (Diff revision 1) > > void > Http2Session::DontReuse() > { > LOG3(("Http2Session::DontReuse %p\n", this)); > + if (!OnSocketThread()) { this is now called from nshttpchannel on some of the "sticky" failed auth code changes Honza made over the summer. It needs to be proxied over to the socket thread.. though that's not the root cause of the bug. ::: netwerk/protocol/http/Http2Stream.cpp:1055 (Diff revision 1) > if (mIsTunnel) { > LOG3(("Http2Stream %p Tunnel Response code %d", this, httpResponseCode)); > if ((httpResponseCode / 100) != 2) { > MapStreamToPlainText(); > } > + MapStreamToHttpConnection(); this is the root cause of the bug, the channel isn't marked plaintext in time and the 407 gets interpreted as ciphertext ::: netwerk/protocol/http/TunnelUtils.cpp:395 (Diff revision 1) > > if (!mSecInfo) { > return NS_ERROR_FAILURE; > } > > - uint32_t notUsed; > + nsCOMPtr<nsISSLSocketControl> ssl(do_QueryInterface(mSecInfo)); we no longer drive the handshake with "" in the 0-RTT era.. not the source of the bug, but it does create an assertion in the testcase (that is handled fine at runtime).
Reporter | ||
Comment 31•7 years ago
|
||
Patrick's fix is confirmed. Good work!
Updated•7 years ago
|
Assignee: nobody → mcmanus
Flags: needinfo?(jduell.mcbugs)
Priority: P1 → P2
Comment 32•7 years ago
|
||
mozreview-review |
Comment on attachment 8920711 [details] Bug 1409858 - H2 CONNECT Tunnel Marked Plaintext Too Late https://reviewboard.mozilla.org/r/191714/#review197268 ::: netwerk/protocol/http/Http2Session.cpp:609 (Diff revision 1) > + } > + > mShouldGoAway = true; > - if (!mStreamTransactionHash.Count()) > + if (!mClosed && !mStreamTransactionHash.Count()) { > Close(NS_OK); > -} > + } nit: I think reviewboard is claiming there's a tab here.
Attachment #8920711 -
Flags: review?(hurley) → review+
Comment 33•7 years ago
|
||
hg error in cmd: hg push -r tip ssh://hg.mozilla.org/integration/autoland: pushing to ssh://hg.mozilla.org/integration/autoland searching for changes remote: adding changesets remote: adding manifests remote: adding file changes remote: added 1 changesets with 3 changes to 3 files remote: remote: remote: ************************** ERROR **************************** remote: Rev 5a6479bf1fcb needs "Bug N" or "No bug" in the commit message. remote: Patrick McManus <mcmanus@ducksong.com> remote: Buf 1409858 - H2 CONNECT Tunnel Marked Plaintext Too Late r=nwgh remote: remote: MozReview-Commit-ID: IjuYDuvufRd remote: ************************************************************* remote: remote: remote: transaction abort! remote: rollback completed remote: pretxnchangegroup.c_commitmessage hook failed abort: push failed on remote
Comment hidden (mozreview-request) |
Comment 35•7 years ago
|
||
Pushed by mcmanus@ducksong.com: https://hg.mozilla.org/integration/autoland/rev/d1ae4a601616 H2 CONNECT Tunnel Marked Plaintext Too Late r=nwgh
Comment 36•7 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/d1ae4a601616
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Updated•7 years ago
|
Flags: needinfo?(jduell.mcbugs) → needinfo?(mcmanus)
Assignee | ||
Comment 38•7 years ago
|
||
no - this is a bug we've already had (not a regression) - and not common enough to take the risk for uplift.
Flags: needinfo?(mcmanus)
Updated•7 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•