Closed
Bug 1253582
Opened 9 years ago
Closed 9 years ago
Secure websocket using h2 coalescing map?
Categories
(Core :: Networking: WebSockets, defect)
Tracking
()
RESOLVED
FIXED
mozilla48
People
(Reporter: andrew, Assigned: mcmanus)
Details
(Whiteboard: [necko-active])
Attachments
(2 files)
3.66 MB, application/gzip
4.82 KB, patch (michal: review+, ritu: approval-mozilla-aurora+)
User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36
Steps to reproduce:
Open https://www.idokep.hu/idokep or https://www.idokep.hu/hoterkep; both pages use socket.io 1.4.5 websockets.
Actual results:
In most cases the websocket connection fails and, after a timeout, the page falls back to a failsafe page (idokep_old or hoterkep_old).
Expected results:
The browser should stay on the current page and display a zoomable map.
On the client side, the connection fails with error code 1006.
Inspecting the connection with Wireshark when the problem occurs shows that no connection attempt is made to the server, so I think there is no problem on the server side.
Every other browser works perfectly.
Component: Untriaged → Webapp Runtime
OS: Unspecified → All
Hardware: Unspecified → x86_64
Updated•9 years ago
Component: Webapp Runtime → Networking: WebSockets
Product: Firefox → Core
Comment 2•9 years ago (Assignee)
public testcase - worth checking
Flags: needinfo?(michal.novotny)
Whiteboard: [necko-active]
Comment 3•9 years ago
I can reproduce it, but AFAICS the problem is on the server side. On the first load everything works correctly; when I reload the page, the websocket connection cannot be established because the server responds to the upgrade request with a 404 error. I've attached a gzipped NSPR log.
First load:
request (time 00:06:41.824742)
http request [
GET /socket.io/?EIO=3&transport=websocket HTTP/1.1
Host: hoterkep.idokep.hu
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:47.0) Gecko/20100101 Firefox/47.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Sec-WebSocket-Version: 13
Origin: https://www.idokep.hu
Sec-WebSocket-Key: bvRpUdlviJzpTt4mqEp2ZA==
Cookie: _ga=GA1.2.1591016506.1457297472; __gfp_64b=k6dqHPpVSQzkaJriw47pN32wC_aUYpu1dffUbHMAsA7.27; _gat=1
Connection: keep-alive, Upgrade
Pragma: no-cache
Cache-Control: no-cache
Upgrade: websocket
]
response (time 00:06:41.977152)
http response [
HTTP/1.1 101 Switching Protocols
Server: nginx
Date: Mon, 07 Mar 2016 00:06:41 GMT
Connection: upgrade
Upgrade: websocket
Sec-WebSocket-Accept: dh9So/z/bev0BMf4W2+tJYA9ccU=
]
Second load:
request (time 00:09:36.482819)
http request [
GET /socket.io/?EIO=3&transport=websocket HTTP/1.1
Host: hoterkep.idokep.hu
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:47.0) Gecko/20100101 Firefox/47.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate, br
Sec-WebSocket-Version: 13
Origin: https://www.idokep.hu
Sec-WebSocket-Key: XKjo7UpLuYul0M9oQC/Uwg==
Cookie: _ga=GA1.2.1591016506.1457297472; __gfp_64b=k6dqHPpVSQzkaJriw47pN32wC_aUYpu1dffUbHMAsA7.27; _gat=1
Connection: keep-alive, Upgrade
Pragma: no-cache
Cache-Control: no-cache
Upgrade: websocket
]
response (time 00:09:36.648987)
http response [
HTTP/1.1 404 Not Found
Server: nginx
Date: Mon, 07 Mar 2016 00:09:36 GMT
Content-Type: text/html; charset=UTF-8
Content-Length: 162
Connection: keep-alive
]
There are two more attempts to establish the websocket connection, and the server fails both with 404.
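For readers comparing the two exchanges above, the following is a minimal sketch of what a client checks before treating an upgrade as successful: status 101, the `Upgrade` and `Connection` headers, and a `Sec-WebSocket-Accept` value derived from the request's `Sec-WebSocket-Key` as described in RFC 6455. The function names are illustrative and are not taken from Firefox or socket.io.

```python
import base64
import hashlib

# Fixed GUID that RFC 6455 appends to the client key before hashing.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def expected_accept(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value a server should return for a key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


def is_successful_upgrade(status: int, headers: dict, sec_websocket_key: str) -> bool:
    """Return True only if the response completes the websocket handshake."""
    # Header names are case-insensitive, so normalise them first.
    lowered = {name.lower(): value for name, value in headers.items()}
    connection_tokens = {
        token.strip().lower() for token in lowered.get("connection", "").split(",")
    }
    return (
        status == 101
        and lowered.get("upgrade", "").lower() == "websocket"
        and "upgrade" in connection_tokens
        and lowered.get("sec-websocket-accept") == expected_accept(sec_websocket_key)
    )
```

Under this check, the 404 response from the second load fails at the first condition, before any header is inspected.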
Flags: needinfo?(michal.novotny)
I found out that Firefox sometimes connects to the wrong server, and this is not related to websockets.
These 404 errors come from a different server, which doesn't have the websocket service.
The problem occurs even with simple static files.
www.idokep.hu has multiple A records, while hoterkep.idokep.hu has only one.
How can these IP addresses get mixed up?
The problem is hard to reproduce reliably.
I've created a test page: http://hoterkep.idokep.hu/index.html
After clicking back and forth several times between the two SSL pages and the dummy page, once the websocket fails, the dummy static page also returns a 404 error, because only the hoterkep.idokep.hu domain and its server at 79.172.211.37 have that page.
The DNS records seem fine to me.
Grepping through the site log shows that it's related only to Firefox user agents:
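One way to check whether two hostnames share an address is to resolve both and intersect the results. The sketch below does this with the standard `socket.getaddrinfo` call; `resolve_addresses` and `shared_addresses` are hypothetical helper names, and the result only reflects what the local resolver returns at that moment.

```python
import socket


def resolve_addresses(hostname: str, port: int = 443) -> set:
    """Return the IPv4 and IPv6 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def shared_addresses(host_a: str, host_b: str, resolver=resolve_addresses) -> set:
    """Return the addresses that both hostnames resolve to (empty set if none)."""
    return set(resolver(host_a)) & set(resolver(host_b))
```

Calling `shared_addresses("www.idokep.hu", "hoterkep.idokep.hu")` would list any address both names resolve to. The `resolver` parameter exists only so the logic can be exercised without network access.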
38 Mozilla/5.0 (Windows NT 10.0; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0
38 Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:43.0) Gecko/20100101 Firefox/43.0
40 Mozilla/5.0 (Windows NT 6.3; Win64; x64; rv:44.0) Gecko/20100101 Firefox/44.0
42 Mozilla/5.0 (Windows NT 5.1; rv:39.0) Gecko/20100101 Firefox/39.0
47 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:42.0) Gecko/20100101 Firefox/42.0
51 Mozilla/5.0 (Windows NT 6.3; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0
52 Mozilla/5.0 (Windows NT 6.2; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0
91 Mozilla/5.0 (Windows NT 5.1; rv:43.0) Gecko/20100101 Firefox/43.0
95 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0
100 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:38.0) Gecko/20100101 Firefox/38.0
115 Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:44.0) Gecko/20100101 Firefox/44.0
123 Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:44.0) Gecko/20100101 Firefox/44.0
124 Mozilla/5.0 (Windows NT 6.0; rv:44.0) Gecko/20100101 Firefox/44.0
124 Mozilla/5.0 (Windows NT 6.3; rv:44.0) Gecko/20100101 Firefox/44.0
132 Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:44.0) Gecko/20100101 Firefox/44.0
173 GemiusSDK/1.1 (iPhone; CPU iPhone OS 8_1 like Mac OS X) AppleWebKit/600.1.4 (KHTML, like Gecko) Version/8.0 Mobile/12B410 Safari/600.1.4
177 Mozilla/5.0 (Windows NT 6.1; rv:43.0) Gecko/20100101 Firefox/43.0
404 Mozilla/5.0 (Windows NT 10.0; rv:44.0) Gecko/20100101 Firefox/44.0
449 Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:44.0) Gecko/20100101 Firefox/44.0
1347 Mozilla/5.0 (Windows NT 6.3; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0
1568 Mozilla/5.0 (Windows NT 10.0; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0
1635 -
3241 Mozilla/5.0 (Windows NT 5.1; rv:44.0) Gecko/20100101 Firefox/44.0
3600 Mozilla/5.0 (Windows NT 6.1; rv:44.0) Gecko/20100101 Firefox/44.0
5637 Mozilla/5.0 (Windows NT 6.1; WOW64; rv:44.0) Gecko/20100101 Firefox/44.0
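The command used to produce the listing above is not shown. As a rough sketch of an equivalent count, the snippet below tallies user agents for responses with a given status. It assumes the server writes the common "combined" access-log layout and that 404 is the status of interest; both are assumptions, not details from the report.

```python
import re
from collections import Counter

# Assumed "combined" layout: ... "request" status bytes "referer" "user-agent"
LINE_RE = re.compile(r'"[^"]*" (?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"\s*$')


def count_agents(lines, status="404"):
    """Count how often each user agent appears on lines with the given status."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group("status") == status:
            counts[match.group("agent")] += 1
    return counts


def ascending(counts):
    """Return (agent, count) pairs sorted from least to most frequent."""
    return sorted(counts.items(), key=lambda item: item[1])
```

Passing an open log file to `count_agents` and printing `ascending(...)` gives a listing in the same least-to-most order as the one above.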
After disabling the browser's DNS cache, the connection problem no longer occurs:
http://ccm.net/faq/555-disabling-the-dns-cache-in-mozilla-firefox
(In reply to andrew from comment #7)
> After disabled the browser's DNS cache, the connection problem never occurs:
> http://ccm.net/faq/555-disabling-the-dns-cache-in-mozilla-firefox
The connection problem appeared again; this setting changed nothing.
Comment 9•9 years ago (Assignee)
Thanks for all the information. It's probably a bug, but I need to do a little more legwork to confirm. You can probably work around it until a fix goes out.
The issue is that www.idokep.hu has a cert valid for hoterkep.idokep.hu (probably a wildcard), www.idokep.hu has used HTTP/2, and there is a DNS overlap between them. That ties those hosts together for the purpose of HTTP/2 host coalescing (i.e. we intentionally route requests for hoterkep.idokep.hu to open connections made to the former hostname, though they are clearly labeled as hoterkep.idokep.hu).
It's a bug because we aren't using an h2 connection to do this websockets bootstrap, so we shouldn't coalesce.
Easy workarounds:
* don't have the hosts overlap in DNS
* don't have www use a wildcard cert; you can get free non-wildcard certs on demand from Let's Encrypt
I'll update the bug when I confirm the issue.
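The coalescing conditions described in this comment can be sketched as a small decision function. This is a toy model, not Firefox code: `OpenConnection`, `cert_covers` and `may_coalesce` are hypothetical names, the wildcard name `*.idokep.hu` is an assumption (the comment only says "probably a wildcard"), and the addresses are placeholders.

```python
from dataclasses import dataclass


def cert_covers(cert_names, host: str) -> bool:
    """Return True if any certificate name matches host (single-label wildcard only)."""
    for name in cert_names:
        if name == host:
            return True
        # "*.example.org" covers exactly one extra label, e.g. "a.example.org".
        if name.startswith("*.") and "." in host and host.split(".", 1)[1] == name[2:]:
            return True
    return False


@dataclass(frozen=True)
class OpenConnection:
    host: str
    uses_h2: bool
    cert_names: frozenset
    addresses: frozenset


def may_coalesce(conn, new_host, new_host_addresses, request_allows_h2=True) -> bool:
    """Decide whether a request for new_host may reuse the open connection."""
    return (
        request_allows_h2  # a websocket bootstrap over HTTP/1.1 should fail here
        and conn.uses_h2
        and cert_covers(conn.cert_names, new_host)
        and bool(conn.addresses & frozenset(new_host_addresses))
    )


# Placeholder data mirroring the situation in this bug.
www = OpenConnection(
    host="www.idokep.hu",
    uses_h2=True,
    cert_names=frozenset({"*.idokep.hu"}),          # assumed wildcard
    addresses=frozenset({"192.0.2.1", "192.0.2.2"}),  # placeholder IPs
)
```

With this data, an ordinary request for hoterkep.idokep.hu at `192.0.2.2` would coalesce onto the `www` connection, while the same request with `request_allows_h2=False` would not. Each workaround listed above breaks one of the conditions.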
Updated•9 years ago (Assignee)
Assignee: nobody → mcmanus
Summary: Secure websocket unreliable → Secure websocket using h2 coalescing map?
Comment 10•9 years ago (Reporter)
Removed the overlapping servers from the DNS records and changed the certificate of hoterkep.idokep.hu, but the problem still exists.
It seems that the only workaround is to disable HTTP/2 support on the site entirely.
Comment 11•9 years ago (Reporter)
(In reply to andrew from comment #10)
> Removed the overlapping servers form the DNS records, and changed the
> certificate of hoterkep.idokep.hu, but the problem still exists.
> It seems that the only workaround is to totally disable http2 support on the
> site.
Actually, disabling HTTP/2 and removing the overlapping IPs does the trick. I will do more detailed testing to find the proper workaround.
Comment 12•9 years ago (Assignee)
Changing the cert in that way probably didn't help because www still had a cert valid for hoterkep (i.e. a wildcard). I'm guessing without seeing a log.
Removing the overlapping IP should also work. Make sure that neither v4 nor v6 has any overlap.
Another option is to have the server that is getting the undesired upgrade respond with 421 instead of 404. I think that would work too, though there might be something special about websockets that would get in the way. It might be the easiest thing to try.
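The 421 idea can be sketched as a check on the request's `Host` header: if the header names a host this server does not serve, answer 421 (Misdirected Request) rather than 404. The helper below is illustrative only; `misdirected_status` is a hypothetical name, and how such a check would be wired into the actual nginx setup is not covered here.

```python
from http import HTTPStatus


def misdirected_status(host_header: str, served_hosts: set):
    """Return 421 if the Host header names a host not served here, else None."""
    # Drop an optional ":port" suffix; bracketed IPv6 literals are not handled.
    host = host_header.split(":", 1)[0].strip().lower()
    if host not in served_hosts:
        return HTTPStatus.MISDIRECTED_REQUEST
    return None
```

For example, a server that only serves `www.idokep.hu` would get `421` back from `misdirected_status("hoterkep.idokep.hu", {"www.idokep.hu"})`, signalling to the client that the request reached the wrong server.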
Comment 13•9 years ago (Assignee)
The DISALLOW_SPDY caps flag in the http channel (used to bootstrap websockets) is set implicitly based on the websockets upgrade callback being present. That's fine, and it works for the nsHttpTransaction, but it happens after the connectionInfo is established. The CI hash is what prevents the coalescing from happening.
The other changes in this patch are just drive-by improvements.
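The ordering problem described here can be illustrated with a toy model. This is not Gecko code: `ToyChannel`, its methods and the key format are invented for illustration. The only thing it aims to show is that a flag set after the connection key has been built cannot influence that key.

```python
class ToyChannel:
    """Hypothetical stand-in for a channel whose connection key snapshots a flag."""

    def __init__(self, host: str, port: int = 443):
        self.host = host
        self.port = port
        self.disallow_h2 = False
        self.connection_key = None

    def build_connection_key(self):
        # The key captures the flag's value at the moment it is built.
        marker = "no-h2" if self.disallow_h2 else "h2-ok"
        self.connection_key = f"{self.host}:{self.port}:{marker}"

    def mark_websocket_upgrade(self):
        self.disallow_h2 = True


def key_with_late_flag(host: str) -> str:
    """Flag set after the key is built: the key still says h2 is allowed."""
    channel = ToyChannel(host)
    channel.build_connection_key()
    channel.mark_websocket_upgrade()
    return channel.connection_key


def key_with_early_flag(host: str) -> str:
    """Flag set before the key is built: the key records that h2 is disallowed."""
    channel = ToyChannel(host)
    channel.mark_websocket_upgrade()
    channel.build_connection_key()
    return channel.connection_key
```

In this model only the second ordering yields a key that differs from one used by ordinary h2-capable requests, which is the property the comment attributes to the CI hash.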
Attachment #8729067 -
Flags: review?(michal.novotny)
Comment 14•9 years ago (Assignee)
could you download this trial build and see if it resolves your problem?
https://treeherder.mozilla.org/#/jobs?repo=try&revision=dd14dd06c604
Flags: needinfo?(andrew)
Comment 15•9 years ago (Reporter)
The connection problem remains with the nightly 48.0a1 (2016-03-11) build.
Test page: https://www.idokep.hu/hoterkep2
Flags: needinfo?(andrew)
Updated•9 years ago
Attachment #8729067 -
Flags: review?(michal.novotny) → review+
Comment 16•9 years ago (Assignee)
(In reply to andrew from comment #15)
> The connection problem remains with the nightly 48.0a1 (2016-03-11) build.
> Test page: https://www.idokep.hu/hoterkep2
Thanks for testing that. I was on the road when I did it and somehow screwed up the try build while using unfamiliar machines. You'll see that the patch you tested was empty instead of what michal reviewed:
https://hg.mozilla.org/try/rev/e03d4348a766
sorry! I will make a new one.
Comment 17•9 years ago (Assignee)
Comment 18•9 years ago (Assignee)
(In reply to Patrick McManus [:mcmanus] from comment #17)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=b28dec1c9a1e
can you verify this build? (I've checked it has the right code :)) Thanks!
Flags: needinfo?(andrew)
Comment 19•9 years ago (Reporter)
It seems to be OK; I can't reproduce the problem with this nightly build (48.0a1 (2016-03-16)).
Thank you very much for the fix!
Flags: needinfo?(andrew)
Comment 20•9 years ago (Assignee)
https://hg.mozilla.org/integration/mozilla-inbound/rev/1789a471b2d50ce505326b86a6931b8b2b3c0b08
Bug 1253582 - h2 coalescing impacts wss:// r=michal
Comment 21•9 years ago (bugherder)
Status: UNCONFIRMED → RESOLVED
Closed: 9 years ago
status-firefox48:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Comment 22•9 years ago (Assignee)
Comment on attachment 8729067 [details] [diff] [review]
0001-Bug-1253582-h2-coalescing-impacts-wss-r-michal.patch
this small patch fixes a websocket interop problem
Approval Request Comment
[Feature/regressing bug #]: long time ago.. at least 2 esrs
[User impact if declined]: some websockets configurations will have intermittent failures
[Describe test coverage new/current, TreeHerder]: manual verification of fix
[Risks and why]: small change that just moves an initialization earlier
[String/UUID change made/needed]: none
Attachment #8729067 -
Flags: approval-mozilla-aurora?
Comment on attachment 8729067 [details] [diff] [review]
0001-Bug-1253582-h2-coalescing-impacts-wss-r-michal.patch
This has been in Nightly for a few days, Aurora47+
Attachment #8729067 -
Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
status-firefox47:
--- → affected
Comment 24•9 years ago (bugherder uplift)