Closed Bug 782704 Opened 12 years ago Closed 12 years ago

Multiple initiated Websocket connections have an increased delay

Categories

Product/Component: Core :: Networking: WebSockets
Type: defect
Version: 15 Branch
Hardware: x86
OS: All
Priority: Not set
Severity: normal

RESOLVED INVALID

People

(Reporter: ricky, Unassigned)

User Agent: Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.1 (KHTML, like Gecko) Chrome/21.0.1180.57 Safari/537.1

Steps to reproduce:

Open 5 or more tabs with a website that uses websockets to communicate.


Actual results:

Tabs 2 through 5 (and any beyond that) take 5-10 seconds to load.


Expected results:

All versions of FF up to 14.0.1 open the website in under 1 second.
OS: Linux → All
This is an internal webapp I've developed. The following code is what I run as soon as the page is opened.

    websock.pWebsocket = new WebSocket("ws://"+engine.hostName+":"+io.pCommPort+"/uncompressed");
    websock.pWebsocket.onopen = websock.onOpen;
    websock.pWebsocket.onmessage = ui.render;
    websock.pWebsocket.onclose = websock.onClose;
    websock.pWebsocket.onerror = websock.onError;

websock.onOpen is taking 5-10 seconds to fire. Previous versions would fire it in less than a second. The client and server are on the same local LAN.
Component: Untriaged → Networking: WebSockets
Product: Firefox → Core
Hmm, my first thought was that this might be some buggy side-effect of bug 711793, but that code isn't in FF 14.

Ricky: So does this only happen if you use 5 different tabs (as opposed to opening 5 different websockets in the same tab)?

I'm the logical person to take this (unless we wind up finding it's some sort of HTTP bootstrap issue, in which case it might be for mcmanus) but I won't have much time to look at it before the end of the month.  If anyone can take it before then, feel free to assign it to yourself.
I've finally got some time to debug. The "delay" comes from my websocket server: after the initial accept() I call recv(), and recv() returns 0 (graceful shutdown). The server loops back to accept(), and the next recv() works as expected. This does NOT happen in FF 14; I started seeing this delay in FF 15+.
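
For reference, the zero-length read described above can be reproduced with a few lines of Python. This is only a sketch under assumptions: the loopback listener stands in for the real server, the two client sockets imitate a client that opens two connections and uses only one, and the request bytes are a placeholder rather than a real WebSocket handshake.

```python
import socket

# Hypothetical stand-in for the websocket server's listening socket.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
listener.listen(5)
addr = listener.getsockname()

# The client opens two connections but only ever uses one of them.
unused = socket.create_connection(addr)
used = socket.create_connection(addr)
used.sendall(b"GET /uncompressed HTTP/1.1\r\n\r\n")  # placeholder request
unused.close()                       # closed without sending a single byte

# Blocking accept()/recv() loop, one connection at a time.
results = []
for _ in range(2):
    conn, _peer = listener.accept()
    data = conn.recv(4096)           # blocks until data arrives or the peer closes
    results.append(data)             # b"" means the peer shut down gracefully
    conn.close()

used.close()
listener.close()
```

Here the unused connection is already closed when the server reads from it, so recv() returns immediately. If the client keeps that connection open for a while, the same blocking recv() waits until the client finally closes it, which would match the delay seen above.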
ricky: thanks for taking the time to look into this.

If I'm understanding correctly, you're seeing that an initial TCP connection is getting opened but shut down immediately (by the client), and then a 2nd connection is succeeding? 

Does your
meh--hit "submit" by accident :)

Does your server handle multiple simultaneous connections?  If not, you're probably running into the HTTP speculative connections feature we added in FF 15, where we open 2 TCP connections for an HTTP (or websockets) connection.  We generally use the first connection that gets established, but it's certainly possible that the first connection your server sees could wind up getting closed.  For any server that doesn't block waiting for recv() that shouldn't be an issue, but perhaps yours blocks?
Yes, mine blocks. Is this a normal practice (HTTP speculative connections)? Should websocket servers never block initial recv()?
I have confirmed that it is the HTTP speculative connections feature that is causing the delay.

Does this behavior make sense in new WebSocket()? 

I read: "That way any latency added by the cache can overlap with the connection establishment". 

Would this even apply to websocket connections?
Here's the deal: a single-threaded blocking I/O server can be DoS'd by a single telnet client, so it realistically has to be fixed. IMO you can't reasonably expect parallelism from 5 tabs and then not deal with parallel TCP sessions running at different rates.

As for websockets doing the speculation around the cache access, yeah, that's probably the result of the abstraction involved in the implementation. It could be changed but I don't see it as especially harmful.
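
To illustrate the point about a single idle client, here is a rough Python sketch. Everything in it is an assumption made for the demonstration: the loopback listener, the `idle` and `busy` client names, and the 0.5 second timeout, which exists only so the example terminates (a server with a plain blocking recv() would wait indefinitely at that point).

```python
import socket

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))      # port 0: let the OS pick a free port
listener.listen(5)
addr = listener.getsockname()

idle = socket.create_connection(addr)   # connects, then says nothing (like a bare telnet session)
busy = socket.create_connection(addr)   # a second client that does have data to send
busy.sendall(b"hello")

conn, _peer = listener.accept()         # the idle client is first in the accept queue
conn.settimeout(0.5)                    # demo only: lets the example give up instead of hanging
try:
    conn.recv(4096)
    stalled = False
except socket.timeout:
    stalled = True                      # no progress was made while 'busy' sat waiting
conn.close()

conn2, _peer = listener.accept()        # only now does the second client get served
second = conn2.recv(4096)

conn2.close()
idle.close()
busy.close()
listener.close()
```

The second client's data was ready the whole time; the server simply could not get to it while it was parked on the first connection.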
IIRC issuing a speculative, 2nd SYN also gets us much better performance if the first SYN gets lost (TCP waits a long time before reissuing those).

I'm marking this INVALID for now (not intended to be harsh--it basically means "you fix it, not us").   If we see evidence that many websockets servers out there have issues with this, we could turn this into evangelism or possibly disable preconnects only for websockets.  But the latter seems unlikely.
Status: UNCONFIRMED → RESOLVED
Closed: 12 years ago
Resolution: --- → INVALID
I think it would be fair to say that websocket use is still fairly young. Just for the record, here is my implementation. The server side is actually a very complex multi-threaded web app written in C. I have added websockets to enhance certain areas of the program, so the websocket code could be called a one-to-one, single-threaded server: it expects just one connection from the client. It was able to handle the bogus connection, but it did have to handle the shutdown, which introduced a delay in the connection. IMO there is no advantage to the new code in a websocket context, nor do I see a downside to skipping this behavior in that context.
(In reply to Jason Duell (:jduell) from comment #9)
> IIRC issuing a speculative, 2nd SYN also gets us much better performance if
> the first SYN gets lost (TCP waits a long time before reissuing those).
> 
> I'm marking this INVALID for now (not intended to be harsh--it basically
> means "you fix it, not us").   If we see evidence that many websockets
> servers out there have issues with this, we could turn this into evangelism
> or possibly disable preconnects only for websockets.  But the latter seems
> unlikely.

Can you point to some example code (or pseudocode) showing how FF expects the server to handle the speculative connection? I can't find anything on Google.
Your server will need to look more like this now. http://www.lowtek.com/sockets/select.html
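
As a rough illustration of that select()-style approach, here is a minimal sketch in Python using the standard `selectors` module. It is not what Firefox or any particular server does. The helper name `first_request`, the 2 second timeout, and the placeholder request bytes are all assumptions for the example. The idea is simply to wait on all sockets at once, so a connection that sends nothing (or closes without sending) never holds up one that has data.

```python
import selectors
import socket

def first_request(listener, timeout=2.0):
    """Hypothetical helper: return (conn, data, dropped) for the first client that sends data."""
    sel = selectors.DefaultSelector()
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ)
    dropped = 0            # connections that closed without sending anything
    keep = None            # the connection handed back to the caller
    try:
        while True:
            events = sel.select(timeout)
            if not events:
                return None, b"", dropped       # nothing happened within the timeout
            for key, _mask in events:
                sock = key.fileobj
                if sock is listener:
                    try:
                        conn, _peer = listener.accept()
                    except BlockingIOError:
                        continue                # pending connection went away
                    conn.setblocking(False)
                    sel.register(conn, selectors.EVENT_READ)
                    continue
                try:
                    data = sock.recv(4096)
                except BlockingIOError:
                    continue                    # spurious wakeup, try again later
                except OSError:
                    data = b""                  # treat a reset like a close
                if data:
                    keep = sock
                    return sock, data, dropped  # real client: continue with the handshake
                sel.unregister(sock)            # empty read: peer closed, just drop it
                sock.close()
                dropped += 1
    finally:
        for key in list(sel.get_map().values()):
            if key.fileobj is not listener and key.fileobj is not keep:
                key.fileobj.close()             # close leftover idle connections
        sel.close()

listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))                 # port 0: let the OS pick a free port
listener.listen(5)
addr = listener.getsockname()

spare = socket.create_connection(addr)          # stays open and idle, like an unused preconnect
real = socket.create_connection(addr)
real.sendall(b"GET /uncompressed HTTP/1.1\r\n\r\n")  # placeholder request

conn, data, dropped = first_request(listener)   # returns without waiting on 'spare'

conn.close()
spare.close()
real.close()
listener.close()
```

Note that the returned socket is left in non-blocking mode. A real server would keep the selector loop running for the lifetime of the process rather than returning after the first request.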
> Your server will need to look more like this now

At least for the next month or so:  we've decided to revert the 2nd connection for websockets, as it's causing too much web bustage.  See bug 789018 comment 1.  Of course if you want your server to avoid being DOSed you probably want to make the comment 15 fix anyway...