Closed Bug 1453204 Opened 6 years ago Closed 6 years ago

Cannot connect to websocket: 425 error code

Categories

(Core :: Networking: WebSockets, defect, P3)

59 Branch
defect

Tracking

()

VERIFIED FIXED
mozilla63
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 --- fixed
firefox59 --- wontfix
firefox60 --- wontfix
firefox61 --- wontfix
firefox62 --- verified
firefox63 --- verified

People

(Reporter: raunakkathuria, Assigned: dragana)

Details

(Keywords: regression, Whiteboard: [necko-triaged])

Attachments

(3 files)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.13; rv:59.0) Gecko/20100101 Firefox/59.0
Build ID: 20180323154952

Steps to reproduce:

Connect to https://developers.binary.com/api/#active_symbols and check network tab for websocket. 

Or try this script (it connects only intermittently):

```
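// Open a WebSocket to the public API endpoint.
var ws = new WebSocket('wss://ws.binaryws.com/websockets/v3?l=en&app_id=1089');
var result;

// Parse and log each JSON message from the server.
ws.onmessage = function (msg) {
  result = JSON.parse(msg.data);
  console.log(result);
};

// Log the close event; on a failed handshake this fires without onopen.
ws.onclose = function (e) {
  console.log(e);
};

ws.onopen = function (e) {
  console.log("i am open send request", e);
};

// Helper for sending a JSON request once the socket is open.
function send(msg) {
  ws.send(JSON.stringify(msg));
}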
```


Actual results:

It returns a 425 status code and shows the URL as https://frontend.binaryws.com/websockets/v3?l=en&app_id=1089, while the console says: Firefox can’t establish a connection to the server at wss://frontend.binaryws.com/websockets/v3?l=en&app_id=1089

I have attached a file as well. (frontend is one of the aliases for wss)




Expected results:

It should connect to websocket with 101 Switching Protocols as status code.

Please note that we have enabled TLS 1.3, which Firefox supports after version 52 according to the blog.
Please note that we are using Cloudflare in front of an nginx server.
I am also having this exact issue.
First load of the affected site works as expected. Subsequent refresh presents this error. Waiting long periods (1hr+) or restarting Firefox "resets" the issue (1st connection is ok, etc)

Issue is not present in other browsers.


Clicking the right-side link in console provides this information:

Source map error: TypeError: NetworkError when attempting to fetch resource.
Resource URL: moz-extension://e602f1ee-863f-4c76-a298-40d77f8e15bb/contentscript.js
Source Map URL: ../sourcemaps/contentscript.js.map
Hello

I have tested on the following platforms and versions and had the following results:

 Windows 10 
- FF v59.0.2 - not reproducible
- FF v60.0b13 - not reproducible
- FF v61.0a1 - not reproducible

 Mac OS 10.12.6 
- FF v59.0.2 - not reproducible
- FF v60.0b13 - REPRODUCIBLE
- FF v61.0a1 - not reproducible

 Mac OS 10.13.2 
- FF v59.0.2 - not reproducible
- FF v60.0b13 - REPRODUCIBLE
- FF v61.0a1 - not reproducible

raunakkathuria, would you please retest and confirm the versions on which this issue is reproducible for you? We would like to know whether you can reproduce this issue on Release 59, Beta 60 and Nightly 61. It would be advisable to try the comment 0 STR using a new profile, in order to exclude external factors.

how to create and use a new profile - https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles?redirectlocale=en-US&redirectslug=Managing-profiles#w_starting-the-profile-manager

download pre-release versions here: https://www.mozilla.org/en-US/firefox/channel/desktop/
Status: UNCONFIRMED → NEW
Component: Untriaged → Networking: WebSockets
Ever confirmed: true
Flags: needinfo?(raunakkathuria)
Product: Firefox → Core
Issue is not present with new profile, but still exists on old profile.
FF 59.0.2 Windows 8.1

I would like to remedy this issue without losing my profile, is that possible?
Hi,

We suspect that an extension is causing the problem.
We will need a list of the extensions used on your profile. You can see them on the about:support page.
(Type "about:support" in the address bar and check the Extensions section of the loaded page.)

The next step would be to disable all your active extensions and check whether you can still reproduce the bug. If not, re-enable the extensions one by one, checking each time whether the bug is reproducible.
Flags: needinfo?(bugfreetech)
With all extensions disabled, issue is still present.

my list anyway:

Build ID 	20180323154952

Adblock Plus	3.0.2
ColorZilla	3.3	
Facebook Video Downloader
Greasemonkey	4.3	
Measure-it
MetaMask
uBlock Origin	
Valence	0.3.8
Video DownloadHelper
*Legacy*
Base64 ⇒ Encoder
colorPicker	3.0.1-signed.1-signed
Cookie Controller	6.1
FlashGot	1.5.6.14
User Agent Switcher	0.7.3.1-signed.1-signed
ZenMate Security, Privacy & Unblock VPN
Flags: needinfo?(bugfreetech)
Hi, we're experiencing the very same issue. A possibly interesting data point is that we also use Cloudflare! I don't know if they're to blame or not, but what happens is that -- due to whatever their service does -- the websocket connects over an IPv6 SSL-secured connection (my machine here also has an IPv6 address). So, in that IPv6 to IPv6 situation (via Cloudflare) the issue is exactly as explained above.

Now the twist: connecting to the hidden IPv4 address works just fine and I can't reproduce the problem.
Can you make an HTTP log for me:
https://developer.mozilla.org/en-US/docs/Mozilla/Debugging/HTTP_logging
Flags: needinfo?(daniel.bodea)
Attached file network logfile
This is a network log of what happens when I reconnect via websockets. The website where this was done is https://cocalc.com/app

First connection seems to work, but upon reconnect it is no longer ok.
Using STR from comment 0, I have retested and I have the following results:

- Windows 10 64-bit:
   - Release v60.0.1 - REPRODUCED (attached HTTP log with recognizable name)
   - Beta v61.0b5 - NOT REPRODUCIBLE
   - Nightly v62.0a1 - NOT REPRODUCIBLE

- Mac OS X 10.12.6 64-bit:
   - Release v60.0.1 - REPRODUCED (attached HTTP log with recognizable name)
   - Beta v61.0b5 - NOT REPRODUCIBLE
   - Nightly v62.0a1 - NOT REPRODUCIBLE

@dragana, I have assumed that you only need HTTP logs for the OS/Browser version combinations that reproduce the issue.
I will attach 2 text files with the requested HTTP logs.
Flags: needinfo?(daniel.bodea)
Apparently, Bugzilla does not let me attach these logs (probably because of their large size).
I have uploaded to G Drive: https://drive.google.com/drive/folders/1KBH7NXd-cynbQN02yHZTBt99hwhFTbrS?usp=sharing
info in comment 9,11
Flags: needinfo?(raunakkathuria)
Disabling TLS1.3 seems to help. This may be related to https://bugzilla.mozilla.org/show_bug.cgi?id=1429859
Dragana, please decide what to do with this bug. Bug 1429859 landed on 58 and this is reproducible on 60, so it's not a dupe. But according to comment #10 it looks like this might be fixed in 61 and later...
Assignee: nobody → dd.mozilla
Priority: -- → P3
Whiteboard: [necko-triaged]
Bad news.

I've downloaded the nightly version 62.0a1 (2018-05-29) (64-bit) on my Ubuntu Linux and I'm able to reproduce the issue  exactly as stated above in my comment 7. (filename: "firefox-62.0a1.en-US.linux-x86_64.tar.bz2")
Should I record the network traffic again?

Here are the exact steps I did to reproduce:

1. open https://cocalc.com/app (clean session, even tried in private browsing)
2. don't make an account, just wait a couple of seconds. In the top right is a "wifi" symbol for the websocket connection.
3. click on that connection indicator and you should see a connection info box. Click on "reconnect"

While doing the above, you can see in the js console that
* first, there is a websocket connection opening up. 
* after reconnect, red error lines appear. Network status tab shows the 425 errors.
* open a new tab, close the defunct session, and open https://cocalc.com/app again: the connection fails immediately.

... only restarting the browser allows the connection to be re-established.
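The open / reconnect sequence above can also be driven from a script. What follows is only a rough sketch: `probeReconnect` is a hypothetical helper name, not something from the affected sites, and the WebSocket constructor is passed in as a parameter so the same function can be tried against a stub.

```javascript
// Hypothetical helper, not part of the original report.
// Opens `attempts` connections one after another and records, for each,
// whether onopen fired before onclose and which close code was reported.
function probeReconnect(WebSocketCtor, url, attempts, onDone) {
  var outcomes = [];

  function attempt() {
    var ws = new WebSocketCtor(url);
    var opened = false;

    ws.onopen = function () {
      opened = true;
      ws.close(); // close right away so the next attempt is a "reconnect"
    };

    ws.onclose = function (e) {
      outcomes.push({ opened: opened, code: e.code });
      if (outcomes.length < attempts) {
        attempt();
      } else {
        onDone(outcomes);
      }
    };
  }

  attempt();
}

// In a browser console:
// probeReconnect(WebSocket, 'wss://ws.binaryws.com/websockets/v3?l=en&app_id=1089', 3, console.log);
```

Going by the reports in this bug, an affected build would presumably show `opened: true` for the first entry and `opened: false` for the later ones, but that is an expectation, not a recorded result.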
Another datapoint: disabling TLS 1.3 mitigates the issue.

In "about:config" I changed security.tls.version.max to 3 (instead of 4).

Then, reconnecting as explained above works fine.
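For anyone who prefers to keep this workaround in a file, the same pref can be set from a `user.js` file in the profile directory. This is a configuration fragment, not a fix: it caps Firefox at TLS 1.2 for all sites.

```javascript
// user.js in the Firefox profile directory (workaround only).
// 3 = TLS 1.2, 4 = TLS 1.3
user_pref("security.tls.version.max", 3);
```

Removing the line and resetting the pref in about:config restores the default.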
My FF version is now 60.0.1 and it still has the issue (see comment 4).

Testing:
As mentioned by raunakkathuria and harald.schilly -- disabling TLS 1.3 has solved the issue for me on my "broken" profile.
Also note that a clean new profile (created as per comment 3) is also issue-free.

Verifying my result:
Checking in about:config and comparing a search for "tls" between broken and good profile provided the following differences...
• broken profile has an extra config called extensions.systemAddonSet with the value of
{"schema":1,"directory":"{e8c7d8de-db3e-431c-ac2f-8e8a3dcaea8d}","addons":{"tls13-version-fallback-rollout-bug1462099@mozilla.org":{"version":"3.0"}}}       this looks like an attempted patch by mozilla but it did not work for me.
• broken profile security.tls.version.fallback-limit is default at 4.
• clean  profile security.tls.version.fallback-limit is default at 3.

Extra notes:
After messing about with security.tls.version.max (changed from 4 to 3) the issue was resolved, but I wanted to verify that it was still present on default configs. However, after changing security.tls.version.max back to the default 4, the issue is still resolved. Resetting the 'mozilla patch' to default has the same result: the issue is resolved and I cannot get it back.
Finally, security.tls.version.fallback-limit has changed its default from 4 to 3, but only after I messed about with the value and then 'reset' it.
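The broken-versus-clean comparison above was done by eye in about:config. A small sketch of the same comparison in script form, assuming the prefs of each profile have been copied into plain objects by hand (`diffPrefs` is a hypothetical helper; the values are the ones quoted in this comment):

```javascript
// Return the prefs whose values differ between two profiles,
// limited to pref names containing `needle`.
function diffPrefs(brokenPrefs, cleanPrefs, needle) {
  var names = new Set(Object.keys(brokenPrefs).concat(Object.keys(cleanPrefs)));
  var out = [];
  Array.from(names).sort().forEach(function (name) {
    if (!name.includes(needle)) return;
    if (brokenPrefs[name] !== cleanPrefs[name]) {
      out.push({ name: name, broken: brokenPrefs[name], clean: cleanPrefs[name] });
    }
  });
  return out;
}

// Values as reported above for the two profiles.
var brokenProfile = {
  "security.tls.version.max": 4,
  "security.tls.version.fallback-limit": 4
};
var cleanProfile = {
  "security.tls.version.max": 4,
  "security.tls.version.fallback-limit": 3
};

console.log(diffPrefs(brokenProfile, cleanProfile, "tls"));
```

A pref that exists in only one profile shows up with `undefined` on the other side.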
I just checked with the newly released v61: still the same issue, exactly as described above
Dragana, can you please take a look at this?
I have a patch to fix this problem.
The devtools will need an update as well. I will open another bug for that.
Flags: needinfo?(dd.mozilla)
On a 425 response we need to restart the connection. WebSockets are normally not restartable once they have started (they are sticky to a connection), but in the case of 425 it is OK to restart them.
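The rule in the comment above can be modelled in a few lines. This is purely an illustration in JavaScript and not the actual patch (the real change is in Gecko's C++ networking code); `mayRestart` and the `sticky` field are hypothetical names, and treating every non-sticky transaction as restartable is a simplifying assumption.

```javascript
// Illustrative model only; not the Gecko implementation.
var HTTP_TOO_EARLY = 425;

// Decide whether a transaction may be restarted on a new connection.
// `tx.sticky` models a transaction bound to one connection, as WebSockets are.
function mayRestart(tx) {
  if (tx.status === HTTP_TOO_EARLY) {
    // Rule from the comment above: a 425 response is an exception,
    // so even a sticky (WebSocket) transaction may be restarted.
    return true;
  }
  // Assumption for this sketch: anything not sticky can be restarted.
  return !tx.sticky;
}

console.log(mayRestart({ status: 425, sticky: true }));
console.log(mayRestart({ status: 101, sticky: true }));
```

The point of the exception is that a sticky transaction answered with 425 has not really started on that connection, so retrying it does not break the stickiness guarantee.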
Attachment #8988769 - Flags: review?(michal.novotny)
Attachment #8988769 - Flags: review?(michal.novotny) → review+
Keywords: checkin-needed
Pushed by dluca@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0349f4afb8d3
Fix 425 return code for websocket. r=michal
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/0349f4afb8d3
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
This issue reproduces constantly in the latest Nightly 63.0a1 (2018-07-01) on Windows 10 using the STR from comment 15, even if it cannot be reproduced using the STR from comment 0.

@Dragana Damjanovic, the issue is not fixed with this change.
Flags: needinfo?(dd.mozilla)
(In reply to Bodea Daniel [:danibodea] from comment #24)
> This issue reproduces constantly in the latest Nightly 63.0a1 (2018-07-01)
> on Windows 10 using the STR from comment 15, even if it cannot be reproduced
> using the STR from comment 0.
> 
> @Dragana Damjanovic, the issue is not fixed with this change.

comment 15 describes 2 ways to see that the code is broken:
1) devtools showing status code 425
2) "red error notification" shows up.

The second should be gone. The first is another bug in devtool (I have filed bug 1472240 for that).
Flags: needinfo?(dd.mozilla)
Considering the comment above, I have retested the issue using STR from comment 0, on the 3 main versions of Firefox (Nightly v63.0a1, Beta v62.0b4, Release v61.0) and the 3 main OSes (Windows 10, Ubuntu 16.04 and Mac OS X 10.12.6).

The issue does not occur in any of the combinations. Issue verified!
Dragana, is this something you might want to uplift to 62 beta? 
Or, should we leave it to ride the trains with 63?
Flags: needinfo?(dd.mozilla)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #27)
> Dragana, is this something you might want to uplift to 62 beta? 
> Or, should we leave it to ride the trains with 63?

We can uplift this one. It is a very small change, it is verified and well understood. I feel comfortable uplifting this.
Flags: needinfo?(dd.mozilla)
Comment on attachment 8988769 [details] [diff] [review]
bug_1453204.patch

Approval Request Comment
[Feature/Bug causing the regression]: bug 1406908 - implementation of '425 Too Early' http error code
[User impact if declined]: websocket can not connect if tls1.3 early data are used and server returns 425.
[Is this code covered by automated tests?]:no
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: The steps are described in comment 15 and comment 25
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: no
[Why is the change risky/not risky?]: The code is well understood and tested.
[String changes made/needed]:no
Attachment #8988769 - Flags: approval-mozilla-beta?
Comment on attachment 8988769 [details] [diff] [review]
bug_1453204.patch

Sounds like this shows up if TLS 1.3 is used, since 58. Fix verified in nightly, let's uplift for 62 beta 9.
Attachment #8988769 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Please nominate this for ESR60 also.
Flags: needinfo?(dd.mozilla)
Comment on attachment 8988769 [details] [diff] [review]
bug_1453204.patch

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration: This bug impacts connections to a websocket when TLS 1.3 with Early-Data is used and a server returns "425 Too Early". TLS 1.3 and Early-Data are on by default on esr60.
User impact if declined:  websocket cannot connect if tls1.3 early data are used and server returns 425.
Fix Landed on Version: 63 and also uplifted to beta 62
Risk to taking this patch (and alternatives if risky): low, The code is well understood and tested.
String or UUID changes made by this patch: none

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.
Flags: needinfo?(dd.mozilla)
Attachment #8988769 - Flags: approval-mozilla-esr60?
Hello, 

I have rechecked the reproduction of this issue in Beta 62.0b10 with Windows 10 and Mac OS X 10.12.6. The issue does not occur after the fix. I have to mention that according to comment 26, this issue wasn't occurring before the corresponding push. I have also rechecked the other main versions of Firefox (Release v61.0.1 and Nightly v63.0a1 and the same OSes) and the issue is not reproducible.
Comment on attachment 8988769 [details] [diff] [review]
bug_1453204.patch

Fixes a TLS 1.3 bug. Verified on Nightly and Beta. Approved for ESR 60.2.
Attachment #8988769 - Flags: approval-mozilla-esr60? → approval-mozilla-esr60+