Closed
Bug 1492181
Opened 7 years ago
Closed 5 years ago
Repeatable WebSocket Secure wss timeout
Categories
(Core :: Networking: WebSockets, defect, P2)
Tracking
()
RESOLVED
WORKSFORME
People
(Reporter: mondain, Unassigned)
Details
(Whiteboard: [necko-triaged])
Attachments
(1 file)
|
1.71 MB,
text/plain
|
Details |
User Agent: Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:62.0) Gecko/20100101 Firefox/62.0
Build ID: 20180913170346
Steps to reproduce:
Scenario 1
1. Navigate to https://ffws.red5.org/chat/index.html
2. WebSocket connection will fail about every other attempt
Alternate scenario
1. Navigate to https://ffws.red5.org/chat/index2.html
2. WebSocket connection fails less often than in scenario #1 above
The difference between scenarios is that index2.html does not attach onopen nor onclose to the WebSocket.
Actual results:
Upon timeout, the following error is displayed in the console:
Firefox can’t establish a connection to the server at wss://ffws.red5.org:8083/chat
Expected results:
No timeout should occur unless network conditions are the cause.
Comment 1•7 years ago
|
||
Hi,
I didn't manage to reproduce the issue mentioned above on the latest Release 62.0 Build Id: 20180913170346 on Ubuntu 18.04 and on the latest Nightly 64.0a1 Build ID:20180921100113 on Win10 x64. Leaving the Browser Console open in both cases no timeout was received, no error was shown, the only time the "WebSocket is closed" message was shown was when the page was refreshed.
Could you please retest using safe mode (https://goo.gl/AR5o9d), maybe even a new clean Firefox profile (https://goo.gl/AWo6h8), to eliminate add-ons or custom settings as a possible cause?
Flags: needinfo?(mondain)
(In reply to laszlo.bialis from comment #1)
> Hi,
>
> I didn't manage to reproduce the issue mentioned above on the latest Release
> 62.0 Build Id: 20180913170346 on Ubuntu 18.04 and on the latest Nightly
> 64.0a1 Build ID:20180921100113 on Win10 x64. Leaving the Browser Console
> open in both cases no timeout was received, no error was shown, the only
> time the "WebSocket is closed" message was shown was when the page was
> refreshed.
>
> Could you please retest using safe mode (https://goo.gl/AR5o9d), maybe even
> a new clean Firefox profile (https://goo.gl/AWo6h8), to eliminate add-ons or
> custom settings as a possible cause?
If you have macOS, give that a shot as well, we've seen the fail most often there. I will try in safemode and with a clean profile.
Flags: needinfo?(mondain)
I cannot reproduce it in safe-mode and I have attached the log showing both pages being hit.
Link to safe mode log since its over the upload limit of 10mb https://www.dropbox.com/s/x3nw06yy01rm78e/macos-ff-safemode.log?dl=0
I didn't create a profile for this macOS Firefox using the default theme and it has no extra add-ons, so I'm really failing to see why a standard firefox fails, but when its in safe-mode it works?? While the clean profile suggestion seems like a hassle, I'll try it and report back.
Brand new, clean profile works as well on both pages; so how do we proceed with real world users that don't run in safe-mode nor create clean profiles every so often, but have no additional themes nor add-ons, ie. regular users? The log attached shows such a scenario, is there enough in there to go forward with to locate the issue or not?
We have added better test for this issue, please test with macOS and Firefox 62 or newer https://ffws.red5.org/live/socket-test.html
Comment 8•7 years ago
|
||
Hi,
It seems that none of your links from https://ffws.red5.org are working anymore so I cannot retest the issue.
Please give us some new reference links on which we can reproduce the issue, or try to download Firefox Nightly from here: https://nightly.mozilla.org/ and retest the problem.
Flags: needinfo?(mondain)
(In reply to laszlo.bialis from comment #8)
> Hi,
>
> It seems that none of your links from https://ffws.red5.org are working
> anymore so I cannot retest the issue.
>
> Please give us some new reference links on which we can reproduce the issue,
> or try to download Firefox Nightly from here: https://nightly.mozilla.org/
> and retest the problem.
Hello,
Yes, we brought the server down after having it up for a month for this bug report. We've found a work-around and made some changes to our server-side WebSocket library. An issue remains however in Firefox that I don't have a test for, but we're about 90% its the underlying problem. When Firefox makes the WebSocket connection and is in the process of waiting for a handshake response, if the server / backend sends anything back, such as a ping, Firefox will send a pong and cause the handshake flow to deadlock. From the server-side the handshake response content seems to have been sent, but FireFox doesn't change its status to 101 and just hangs.
Flags: needinfo?(mondain)
Comment 10•7 years ago
|
||
Hi,
I am assigning a component to this issue in order to involve the development team and get an opinion on this.
Component: Untriaged → Networking: WebSockets
Product: Firefox → Core
Updated•7 years ago
|
Priority: -- → P2
Whiteboard: [necko-triaged]
Comment 12•7 years ago
|
||
(In reply to mondain from comment #9)
> test for, but we're about 90% its the underlying problem. When Firefox makes
> the WebSocket connection and is in the process of waiting for a handshake
> response, if the server / backend sends anything back, such as a ping,
> Firefox will send a pong and cause the handshake flow to deadlock.
mSocketIn->AsyncWait() is called in WebSocketChannel::StartWebsocketData() which is called after we finish the handshake, i.e. we don't process input before the handshake is finished.
Flags: needinfo?(michal.novotny)
| Reporter | ||
Comment 13•7 years ago
|
||
(In reply to Michal Novotny [:michal] from comment #12)
> (In reply to mondain from comment #9)
> > test for, but we're about 90% its the underlying problem. When Firefox makes
> > the WebSocket connection and is in the process of waiting for a handshake
> > response, if the server / backend sends anything back, such as a ping,
> > Firefox will send a pong and cause the handshake flow to deadlock.
>
> mSocketIn->AsyncWait() is called in WebSocketChannel::StartWebsocketData()
> which is called after we finish the handshake, i.e. we don't process input
> before the handshake is finished.
Hi Michal,
May I suggest that you test the flow I described to ensure FF behaves as you surmise. When we modified our library not to send any data until after the handshake, our problem was resolved.
Comment 14•7 years ago
|
||
I created a simple testcase with nodejs websocket server, but it probably cannot send ping early enough. Can you please provide us your testcase that I could run locally?
Flags: needinfo?(mondain)
| Reporter | ||
Comment 15•7 years ago
|
||
(In reply to Michal Novotny [:michal] from comment #14)
> I created a simple testcase with nodejs websocket server, but it probably
> cannot send ping early enough. Can you please provide us your testcase that
> I could run locally?
I'll get something together this weekend if time permits. Thanks for looking into this.
Flags: needinfo?(mondain)
Comment 16•7 years ago
|
||
(In reply to mondain from comment #0)
> Created attachment 9009976 [details]
> Log from hitting index.html 3 times showing 2 failures
When looking at the logs I can say it's not a websockets issue. nsHttpChannel which should open the connection for WebSocketChannel sends the request to the server but it never reads the response. After 20 seconds the websocket connection is timed out.
2018-09-18 15:13:12.401942 UTC - [16504:Socket Thread]: D/nsSocketTransport nsSocketTransport::OnSocketReady [this=0x7f881afe9c00 outFlags=1]
2018-09-18 15:13:12.401945 UTC - [16504:Socket Thread]: D/nsSocketTransport nsSocketInputStream::OnSocketReady [this=0x7f881afe9e40 cond=0]
2018-09-18 15:13:12.401948 UTC - [16504:Socket Thread]: V/nsHttp nsHttpConnection::OnSocketReadable [this=0x7f8819cb4800]
2018-09-18 15:13:12.401950 UTC - [16504:Socket Thread]: V/nsHttp nsHttpTransaction::WriteSegments 0x7f881afe6c00
2018-09-18 15:13:12.401953 UTC - [16504:Socket Thread]: V/nsHttp nsHttpConnectionMgr::ShouldThrottle trans=0x7f881afe6c00
2018-09-18 15:13:12.401968 UTC - [16504:Socket Thread]: D/nsSocketTransport nsSocketInputStream::Read [this=0x7f881afe9e40 count=32768]
2018-09-18 15:13:12.401971 UTC - [16504:Socket Thread]: D/nsSocketTransport calling PR_Read [count=32768]
2018-09-18 15:13:12.402109 UTC - [16504:Socket Thread]: D/nsSocketTransport PR_Read returned [n=-1]
2018-09-18 15:13:12.402115 UTC - [16504:Socket Thread]: V/nsHttp nsHttpConnection::OnSocketReadable 0x7f8819cb4800 trans->ws rv=0 n=0 socketin=80470007
2018-09-18 15:13:12.402117 UTC - [16504:Socket Thread]: V/nsHttp nsHttpConnection::ResumeRecv [this=0x7f8819cb4800]
The socket is readable, but PR_Read returns PR_WOULD_BLOCK_ERROR. Dragana, do you have an idea what is happening here?
Flags: needinfo?(dd.mozilla)
Comment 17•7 years ago
|
||
(In reply to Michal Novotny [:michal] from comment #16)
> (In reply to mondain from comment #0)
> > Created attachment 9009976 [details]
> > Log from hitting index.html 3 times showing 2 failures
>
> When looking at the logs I can say it's not a websockets issue.
> nsHttpChannel which should open the connection for WebSocketChannel sends
> the request to the server but it never reads the response. After 20 seconds
> the websocket connection is timed out.
>
> 2018-09-18 15:13:12.401942 UTC - [16504:Socket Thread]: D/nsSocketTransport
> nsSocketTransport::OnSocketReady [this=0x7f881afe9c00 outFlags=1]
> 2018-09-18 15:13:12.401945 UTC - [16504:Socket Thread]: D/nsSocketTransport
> nsSocketInputStream::OnSocketReady [this=0x7f881afe9e40 cond=0]
> 2018-09-18 15:13:12.401948 UTC - [16504:Socket Thread]: V/nsHttp
> nsHttpConnection::OnSocketReadable [this=0x7f8819cb4800]
> 2018-09-18 15:13:12.401950 UTC - [16504:Socket Thread]: V/nsHttp
> nsHttpTransaction::WriteSegments 0x7f881afe6c00
> 2018-09-18 15:13:12.401953 UTC - [16504:Socket Thread]: V/nsHttp
> nsHttpConnectionMgr::ShouldThrottle trans=0x7f881afe6c00
> 2018-09-18 15:13:12.401968 UTC - [16504:Socket Thread]: D/nsSocketTransport
> nsSocketInputStream::Read [this=0x7f881afe9e40 count=32768]
> 2018-09-18 15:13:12.401971 UTC - [16504:Socket Thread]: D/nsSocketTransport
> calling PR_Read [count=32768]
> 2018-09-18 15:13:12.402109 UTC - [16504:Socket Thread]: D/nsSocketTransport
> PR_Read returned [n=-1]
> 2018-09-18 15:13:12.402115 UTC - [16504:Socket Thread]: V/nsHttp
> nsHttpConnection::OnSocketReadable 0x7f8819cb4800 trans->ws rv=0 n=0
> socketin=80470007
> 2018-09-18 15:13:12.402117 UTC - [16504:Socket Thread]: V/nsHttp
> nsHttpConnection::ResumeRecv [this=0x7f8819cb4800]
>
> The socket is readable, but PR_Read returns PR_WOULD_BLOCK_ERROR. Dragana,
> do you have an idea what is happening here?
this is not using tls?
| Reporter | ||
Comment 18•7 years ago
|
||
> this is not using tls?
Was that a question for me? Yes, we use TLS for the page (https) and for the websocket (wss).
Comment 19•7 years ago
|
||
(In reply to mondain from comment #18)
> > this is not using tls?
>
> Was that a question for me? Yes, we use TLS for the page (https) and for the
> websocket (wss).
I Michal told me that on a meeting.
We could use a wireshark pcap together with a http log to diagnose the problem.
Flags: needinfo?(dd.mozilla)
| Reporter | ||
Comment 21•5 years ago
|
||
You can close this, I'm not aware of it being an issue with the latest releases.
Flags: needinfo?(mondain)
Updated•5 years ago
|
Status: UNCONFIRMED → RESOLVED
Closed: 5 years ago
Resolution: --- → WORKSFORME
You need to log in
before you can comment on or make changes to this bug.
Description
•