This is based on just a code read-through--I might be wrong. nsWebSocket::Init calls EstablishConnection(), but ignores the return value with a comment about WS constructor only throwing for bad URI/port. But AFAICT if there's any other error, we need to fire a close() event asynchronously, and by ignoring EstablishConnection, we don't, for a very large set of errors (anything leading up to and including the WS channel's AsyncOpen, including errors from SetLoadGroup, hitting max # of WS connections, etc--it's a long list). Fix is simple in theory? If EstablishConnection fails, post an async event that calls OnStop. But I'm not sure if the existing OnStop logic would be happy being called with a state of never having had an open cxn, or if any of the logic in StopSession is also needed.
I've verified that this is indeed a bug. I just went in and changed nsWebSocketHandler::AsyncOpen to immediately return NS_ERROR_UNEXPECTED. With that, content/base/test/test_websocket_hello.html hangs (never gets a close).
The e10s version of the code also has this bug, and needs to be fixed in WebSocketChannelChild::AsyncOpenFailed(). Note that as near as I can tell, we might also achieve this by scrapping the entire AsyncOpenFailed message, and just having the parent's CancelEarly() call SendClose. I haven't looked carefully enough to know if that would cover everything that needs to happen. We should rename CancelEarly, BTW, as we did in HTTP--it has nothing to do with cancellation (this was more confusing with HTTP, which has a Cancel operation, but still). FailedAsyncOpen is what we went with there.
Sorry, to be more precise: the EstablishConnection fix will be in nsWebsocket, so will be in both parent/child. But under e10s we still get the same symptoms if the parent's AsyncOpen fails for any reason, because we're not firing off a close() event in AsyncOpenFailed. So, 2 bugs with same symptom. might as well fix both here.
Assignee: nobody → mcmanus
Status: NEW → ASSIGNED
Attachment #544816 - Flags: review?(jduell.mcbugs)
Comment on attachment 544816 [details] [diff] [review] patch v1 Review of attachment 544816 [details] [diff] [review]: ----------------------------------------------------------------- ::: content/base/src/nsWebSocket.cpp @@ +774,1 @@ > Seems logical to call FailConnection() instead of close() (which calls FailConnection if CONNECTING, which should always be true here, right? But I see Close also does a kungFuDeathGrip, claiming that FailConnection can release nsWebSocket (but I don't see how, and we call it in other places w/o the deathgrip). Confusing. I guess it can stay this way for now. Grumble... ::: netwerk/protocol/websocket/WebSocketChannel.cpp @@ +1830,5 @@ > + // use AbortSession() as a replacement for the normal synchronous > + // error handling of AsyncOpen() > + mListener = aListener; > + mContext = aContext; > + Alas, even this isn't early enough if you call mChannel->Close before calling AsyncOpen. @@ +1988,5 @@ > + LOG(("WebSocketChannel::Close() without transport - aborting.")); > + AbortSession(NS_ERROR_NOT_CONNECTED); > + return NS_ERROR_NOT_CONNECTED; > + } > + I'm going to leave this in my version of the patch--doesn't seem to do any harm? I'm not clear on whether it's reachable. Let me know if it should come out. ::: netwerk/protocol/websocket/WebSocketChannelParent.cpp @@ +83,5 @@ > do_CreateInstance("@mozilla.org/network/protocol;1?name=ws", &rv); > } > if (NS_FAILED(rv)) > return CancelEarly(); > recvCancelEarly in its current incarnation doesn't wind up sending a close event to the websocket, so problem still exists. @@ +99,1 @@ > These mChannel->Close calls (before AsyncOpen) don't work either. mChannel->mListener hasn't been set yet, so StopSession() doesn't dispatch OnStop() to the parent (and thus the child doesn't get it either).
Attachment #544816 - Flags: review?(jduell.mcbugs) → review-
Comment on attachment 544816 [details] [diff] [review] patch v1 Review of attachment 544816 [details] [diff] [review]: ----------------------------------------------------------------- I'm working on a patch that will fix this all correctly I think, so hold off on making any changes, Patrick. ::: content/base/src/nsWebSocket.cpp @@ +774,1 @@ > Note that the Close call actually does work in this case (parent process websockets): onclose gets sent. It's the e10s parts of the patch that don't. I tested using test_websockets_basic.html, sticking in bogus unconditional errors returns.
Asking Patrick and jdm to both look over this: jdm just for the e10s bits, which Patrick is less familiar with IIRC.
> right? But I see Close also does a kungFuDeathGrip, claiming that > FailConnection can release nsWebSocket (but I don't see how, and we call it in > other places w/o the deathgrip). Confusing. I guess it can stay this way for > now. Grumble... Until the v.76 there were locks because the code had more than one thread running. So maybe it needs to be checked. I will fix the bug 664894, and I think that my work that will address that thing too.
Comment on attachment 549481 [details] [diff] [review] patch v2 check it in!
Attachment #549481 - Flags: review?(mcmanus) → review+
Comment on attachment 549481 [details] [diff] [review] patch v2 After some pondering, I approve.
Attachment #549481 - Flags: feedback?(josh) → feedback+
Target Milestone: --- → mozilla8
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.