Closed
Bug 936979
Opened 11 years ago
Closed 10 years ago
websocket will never connected after a lot of failure
Categories
(Core :: Networking: WebSockets, defect)
RESOLVED
FIXED
mozilla29
People
(Reporter: fatmck, Assigned: jduell.mcbugs)
Attachments (3 files)
1.31 KB, application/x-compressed-tar
1.58 KB, patch
1.56 KB, patch (mcmanus: review+)
First, sorry for my poor English.

My code uses setInterval to check the connection state of a WebSocket every 3 seconds. In the setInterval callback, if I find that the WebSocket is not connected, I close it and create a new one.

To see this bug, shut down the server so that every connection attempt fails. After a number of failed connections (more than about 8 on my machine; you can just wait for one minute), start the server again: no connection happens. I used tcpdump to print the TCP packets: after about 8 failures, Firefox no longer sends any TCP SYN. (A TCP SYN means a client is trying to connect to a server.) Once you see this bug, refreshing the page does not help; connection attempts still fail. You must wait a long time before a connection succeeds.

Env: Ubuntu 12.04 64-bit + Firefox 25 (also buggy on Ubuntu 13.04 32-bit)

The attachment contains the following files:
1. test.html: the HTML file running the WebSocket client (trying to connect to 127.0.0.1 port 1026)
2. tcpdump.txt: the output of tcpdump, in which you will see 8 SYN packets, each followed by a RST packet. Lines marked [S] are TCP SYN packets sent by the client side, which is the Firefox WebSocket. Lines marked [R.] are TCP RST packets sent by the server machine, which means no server-side application is listening on port 1026.

To confirm this bug you don't even need a server; just run tcpdump with the following command:

sudo tcpdump -ilo tcp and port 1026

This prints any TCP packets on 127.0.0.1:1026. Then open test.html in Firefox. You will see SYN and RST packets only in the first few seconds (on my machine, 8 SYN packets in 24 seconds), and then nothing! That means Firefox stops making connection attempts after a lot of failures. Refresh the web page: still no connection attempt happens!

The same code runs perfectly on Google Chromium 30.
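The polling pattern described in this report can be sketched roughly as follows. This is not the contents of the attached test.html; it is a minimal TypeScript illustration, and the names `SocketLike` and `makeReconnector` are hypothetical. The socket factory is injected so the logic does not depend on a browser; in a page it would presumably be something like `() => new WebSocket("ws://127.0.0.1:1026")`.

```typescript
// Hypothetical minimal socket shape; a browser WebSocket satisfies it.
interface SocketLike {
  readyState: number; // 0 CONNECTING, 1 OPEN, 2 CLOSING, 3 CLOSED
  close(): void;
}

const OPEN = 1; // same numeric value as WebSocket.OPEN

// Returns a tick() function intended to be called every 3 seconds,
// e.g. setInterval(tick, 3000) in a page.
function makeReconnector(makeSocket: () => SocketLike): () => SocketLike {
  let socket = makeSocket();
  return function tick(): SocketLike {
    if (socket.readyState !== OPEN) {
      // Not connected yet: abandon this socket, even if it is still
      // connecting, and immediately start a fresh attempt.
      socket.close();
      socket = makeSocket();
    }
    return socket;
  };
}
```

Note that `tick` calls `close()` on a socket that may still be waiting to connect, which is the call pattern this bug is about.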
Component: General → Networking: WebSockets
Product: Firefox → Core
Comment 1•11 years ago
echo, this is an interesting case and indeed a bug. Basically, Firefox has some logic to back off our connection rate after failed connects; RFC 6455 section 7.2.3 encourages that. After some time goes by we reduce the backoff. Your test essentially closes the WebSocket and starts a new one every 3 seconds. The bug comes into play when your test closes the socket from JavaScript during that backoff timeout: we interpret that as a further failure and back off even more. The process repeats every 3 seconds, and the result is that we never end up with a backoff value of less than 3 seconds, so your test always cancels the connection before it is attempted. Deadlock.

The fix appears simple: when we fail to connect during the self-imposed backoff period (probably because JS closed the WebSocket), don't use that as input into extending/increasing the backoff period.
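The feedback loop described in this comment can be illustrated with a toy simulation. Everything in it is an assumption made for illustration: the function name `simulate`, the 0.5 second starting delay, the doubling rule, the 60 second cap, and the way the remaining wait is derived from the time of the last recorded failure. None of these are taken from the Firefox source. The flag `countCloseAsFailure` switches between the behaviour described as the bug and the proposed fix.

```typescript
// Toy model of a reconnect backoff; constants and rules are illustrative only.
// Returns the times (in seconds) at which a real TCP connect was attempted.
function simulate(countCloseAsFailure: boolean, totalSeconds: number): number[] {
  const closeAfter = 3;          // the page closes and recreates its socket every 3 s
  let delay = 0.5;               // current backoff delay in seconds (assumed start)
  let lastFailure = -Infinity;   // time of the last recorded failure
  const realAttempts: number[] = [];

  for (let now = 0; now < totalSeconds; now += closeAfter) {
    // A new socket is created at `now` and must wait out the remaining backoff.
    const wait = Math.max(0, lastFailure + delay - now);
    if (wait < closeAfter) {
      // The connect really happens (a SYN is sent) and is refused,
      // because the server is down: a genuine failure.
      const t = now + wait;
      realAttempts.push(t);
      lastFailure = t;
      delay = Math.min(delay * 2, 60);
    } else if (countCloseAsFailure) {
      // Behaviour described as the bug: close() during the wait is
      // recorded as one more failure, at the time of the close.
      lastFailure = now + closeAfter;
      delay = Math.min(delay * 2, 60);
    }
    // Proposed fix: a close() during the wait leaves the backoff state untouched.
  }
  return realAttempts;
}
```

In this model, `simulate(true, 600)` stops making real attempts once the delay first reaches 3 seconds or more, while `simulate(false, 600)` keeps attempting at growing intervals.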
Comment 2•11 years ago
bug 936979 - websocket will never connected after a lot of failure r?jduell
Attachment #832318 -
Flags: review?(jduell.mcbugs)
Comment 3•11 years ago
https://tbpl.mozilla.org/?tree=Try&rev=56e8775b207b
Wow, I was so happy to see this patch when I got up this morning. Thank you very much. So this patch will probably ship with Firefox 26? Currently I am using Chromium for development because of this bug.
Updated•11 years ago
status-firefox25:
--- → affected
status-firefox26:
--- → affected
status-firefox27:
--- → affected
status-firefox28:
--- → affected
Assignee
Comment 5•10 years ago
I think this patch is a more complete fix. The problem with filtering just on CONNECTING_DELAYED is that we can hit the same JS close() call when we're in CONNECTING_QUEUED (if a first WebSocket is trying to connect and a second is launched with the same "close after 3 seconds" logic), or in CONNECTING_IN_PROGRESS if the timing is right (we're starting to connect but the timeout/close happens before we're done). It can even happen in NOT_CONNECTING (AsyncOpen does a DNS lookup: if the timer/close happens before DNS calls OnLookupComplete, we're still in the NOT_CONNECTING state).

We can be fairly certain that rv == NS_ERROR_NOT_CONNECTED means JS has called close while mTransport == null (we don't call StopSession with that error code anywhere else), and that captures all of these cases:

http://mxr.mozilla.org/mozilla-central/source/netwerk/protocol/websocket/WebSocketChannel.cpp#2802

Patrick, let me know if you agree.
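The difference between the two filtering approaches discussed here can be sketched as two predicates. This is only an illustration in TypeScript, not code from WebSocketChannel.cpp: the function names `countsAsFailureV1` and `countsAsFailureV2` and the `"OTHER_FAILURE"` placeholder are hypothetical, while the state names and `NS_ERROR_NOT_CONNECTED` are the ones mentioned in the comment.

```typescript
type ConnectingState =
  | "NOT_CONNECTING"
  | "CONNECTING_QUEUED"
  | "CONNECTING_DELAYED"
  | "CONNECTING_IN_PROGRESS";

// "OTHER_FAILURE" is a hypothetical stand-in for any other error code.
type StopReason = "NS_ERROR_NOT_CONNECTED" | "OTHER_FAILURE";

// First approach as described: ignore a stop only while in CONNECTING_DELAYED.
function countsAsFailureV1(state: ConnectingState, _reason: StopReason): boolean {
  return state !== "CONNECTING_DELAYED";
}

// Second approach as described: ignore a stop whenever the error code says
// JS called close() before a transport existed, whatever the state is.
function countsAsFailureV2(_state: ConnectingState, reason: StopReason): boolean {
  return reason !== "NS_ERROR_NOT_CONNECTED";
}
```

Under these assumptions, a JS close() arriving in CONNECTING_QUEUED, CONNECTING_IN_PROGRESS, or NOT_CONNECTING would still extend the backoff with the first predicate but not with the second.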
Attachment #8358660 -
Flags: review?(mcmanus)
Comment 6•10 years ago
Comment on attachment 8358660: 936979.closedelay.v2

Review of attachment 8358660:

yes; better.
Attachment #8358660 -
Flags: review?(mcmanus) → review+
Updated•10 years ago
Attachment #832318 -
Flags: review?(jduell.mcbugs)
Assignee
Comment 7•10 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/9ba11d59bf3f
Comment 8•10 years ago
https://hg.mozilla.org/mozilla-central/rev/9ba11d59bf3f
Assignee: nobody → jduell.mcbugs
Status: UNCONFIRMED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Comment 9•10 years ago
echo, can you please verify that this bug is fixed for you in Firefox 29?
Flags: needinfo?(fatmck)