Websocket unable to connect through Cloudflare

VERIFIED FIXED in Firefox 65

Status

()

defect
P1
normal
VERIFIED FIXED
4 months ago
3 months ago

People

(Reporter: harald.schilly, Assigned: dragana)

Tracking

({regression})

65 Branch
mozilla66
Points:
---

Firefox Tracking Flags

(firefox-esr60 unaffected, firefox64 unaffected, firefox65+ verified, firefox66+ verified)

Details

(Whiteboard: [necko-triaged])

Attachments

(4 attachments)

(Reporter)

Description

4 months ago
User Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3578.80 Safari/537.36

Steps to reproduce:

* download Firefox 65 beta4 on Linux and started it
* Opened this page https://cocalc.com/app?skip_preflight



Actual results:

* No websocket connection (top right has a connection indicator)
* JS console (sorry, german, it says that it can't establish a wss:// connection)

> Firefox kann keine Verbindung zu dem Server unter wss://cocalc.com/hub/?_primuscb=MUxYhFa aufbauen.
> Die Verbindung zu wss://cocalc.com/hub/?_primuscb=MUxYhFa wurde unterbrochen, während die Seite geladen wurde.


Expected results:

Just like with Firefox 64 or Chrome, the connection should be established.

Additionally, trying to connect via the same way directly to the secret IP address works.

Also, this reminds me a lot of https://bugzilla.mozilla.org/show_bug.cgi?id=1453204
Component: Untriaged → Networking: WebSockets
Product: Firefox → Core
I can confirm that Firefox64 works and Firefox65b4 and Firefox66 are generating the error message
Status: UNCONFIRMED → NEW
Ever confirmed: true
(Assignee)

Updated

4 months ago
Assignee: nobody → dd.mozilla
Status: NEW → ASSIGNED
(Assignee)

Updated

4 months ago
Priority: -- → P1
Whiteboard: [necko-triaged]
(Assignee)

Comment 2

4 months ago
I cannot reproduce this. can you make http log for me:

https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Flags: needinfo?(bugzilla)
(Reporter)

Comment 3

4 months ago
network logfile
(Reporter)

Comment 4

4 months ago
second network log file
(Reporter)

Comment 5

4 months ago
Hmm, maybe you have to wait a few seconds? the top-right connection indicator is orange while it tries to connect. When it connects that orange goes away. Anyway, I've attached the logfiles that were generated.
I used mozregression-gui which gave me a wrong result in the past but this seems to be correct:

>Bug 1434137 - Implement websockets over http/2 - RFC 8441 r=michal,dragana
>
>https://tools.ietf.org/html/rfc8441
>
>This uses our existing http/2 CONNECT infrastructure (modified) to
>enable the new extended CONNECT form defined by 8441, and pretend for
>the websocket's sake that an http/2 stream is actually a socket. From
>the websocket's point of view, this is relatively non-invasive - a few
>things have changed (http response code, absence of some headers) versus
>http/1.1 websockets, but for the most part, the websocket code doesn't
>care.
Blocks: 1434137
Flags: needinfo?(bugzilla)
Keywords: regression
(Assignee)

Comment 7

4 months ago
There are 2 parts of the patch:
1) we should not try to use websockets over h2 if we already know that they are not supported.
2) make sure that we clean up websockets waiting for the settings frame when we close a h2 connection. (the part 1) will fix the issue, this part is only to be 100% that we some how do not retrigger the issue)
(Assignee)

Comment 8

4 months ago
What was happening:

We try to establish a websocket connection, we find a h2 connection that do not support websockets so we open a new one. The new connection have a tls ticket and can use early-data, so we immediately dispatch the websocket transaction to the new connection and the transaction waits in a queue for the settings frame. At some point we decide to close that h2 connection (because we have another one) and the transactions in the queue are just left hanging...

I should also disable 0-rtt for websocket because we will not know if we websockets are supported before connection is established. I will update the patch.
(Assignee)

Comment 9

4 months ago
(In reply to Dragana Damjanovic [:dragana] from comment #8)
> What was happening:
> 
> We try to establish a websocket connection, we find a h2 connection that do
> not support websockets so we open a new one. The new connection have a tls
> ticket and can use early-data, so we immediately dispatch the websocket
> transaction to the new connection and the transaction waits in a queue for
> the settings frame. At some point we decide to close that h2 connection
> (because we have another one) and the transactions in the queue are just
> left hanging...
> 
> I should also disable 0-rtt for websocket because we will not know if we
> websockets are supported before connection is established. I will update the
> patch.

I forgot that we dispatch all transaction to a h2 connection in ealy-data phase even if they cannot do early-data. We queue them if inside h2 session and websocket transactions will be queued inside h2 session so I do not need to do anything.
Hi Michal, can you please prioritize this review as we're running low on time for uplifts to 65 this cycle? Thanks!
Flags: needinfo?(michal.novotny)
Flags: needinfo?(michal.novotny)
See Also: → 1517782
(Assignee)

Updated

4 months ago
Keywords: checkin-needed
(Assignee)

Comment 11

4 months ago

This will need uplift, but I will wait that it lands on m.-c. first. I will need-info myself not to forget about it.

Flags: needinfo?(dd.mozilla)

Comment 12

4 months ago

Pushed by btara@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/62e3131970c0
In case we already have a h2 connection and it does not support websockets, we should not try again to use websockets over h2. r=michal

Keywords: checkin-needed

Comment 13

4 months ago
bugherder
Status: ASSIGNED → RESOLVED
Last Resolved: 4 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
Duplicate of this bug: 1517782

Hi Dragana, now that this has landed in Nightly66, could you please nominate for uplift to Beta65? Thanks!

(Assignee)

Comment 16

4 months ago

Comment on attachment 9033177 [details]
In case we already have a h2 connection and it does not support websockets, we should not try again to use websockets over h2. r=michal

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1434137

User impact if declined: Websocket do not work if tls early-data are used.

Is this code covered by automated tests?: No

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: No

If yes, steps to reproduce: This is a page that shows the problem: https://cocalc.com/app?skip_preflight .

List of other uplifts needed: None

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): One part of the patch is very simple, therefore low risk. The second part is a copy of a code that is already used and therefore tested and low risk.

String changes made/needed:

Flags: needinfo?(dd.mozilla)
Attachment #9033177 - Flags: approval-mozilla-beta?
Flags: qe-verify+

Comment on attachment 9033177 [details]
In case we already have a h2 connection and it does not support websockets, we should not try again to use websockets over h2. r=michal

[Triage Comment]
Fixes issues with websockets not working if tls early-data are used. Given the lack of automated test coverage here, I do think we should have QA verify the fix on Beta also. Approved for 65.0b11.

Attachment #9033177 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

I have managed to reproduce this issue using Firefox 66.0a1 (BuildId:20181217220100).

This issue is verified fixed using Firefox 66.0a1 (BuildId:20190116093310) and Firefox 65.0b11 (BuildId:20190114172331) on Windows 10 64bit, macOS 10.13.6 and Ubuntu 16.04 64bit. The connection is successfully established.

Status: RESOLVED → VERIFIED

Ups, forgot to remove this sneaky flag. Sorry for the noise.

Flags: qe-verify+
You need to log in before you can comment on or make changes to this bug.