Closed Bug 675784 Opened 8 years ago Closed 8 years ago

WebSockets - problem cancelling pre-bootstrapped connection

Categories

(Core :: Networking: WebSockets, defect)

x86_64
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla8

People

(Reporter: mcmanus, Assigned: mcmanus)

Details

Attachments

(2 files)

A canceled request can hang if it is canceled prior to the websockets bootstrap handshake being issued.
Attached patch patch 1Splinter Review
if you isolate test 2 of test_websocket.html and click reload a couple times while the test is outstanding (the test involves a multiple second connection timeout - reload during that) you can see the issue.
Assignee: nobody → mcmanus
Status: NEW → ASSIGNED
Attachment #550143 - Flags: review?(jduell.mcbugs)
Comment on attachment 550143 [details] [diff] [review]
patch 1

Review of attachment 550143 [details] [diff] [review]:
-----------------------------------------------------------------

Took me a while to grok it all, but I think it's good.

::: netwerk/protocol/websocket/WebSocketChannel.cpp
@@ +379,5 @@
> +                          "transaction already running");
> +
> +        (mData[index])->mChannel->mOpenBlocked = 0;
> +        (mData[index])->mChannel->mOpenRunning = 1;
> +        (mData[index])->mChannel->BeginOpen();

So if the page is in the middle of being Canceled by the LoadGroup, and the LG cancels this cxn but there's another WS in same page that's mOpenBlocked, it's going to get found by this logic and have BeginOpen->AsyncOpen called, just to then get Cancelled by the LG.  Right?  I guess that's fine--a little messy, but correct.

There's a big block of shared logic here with Complete--move to a new function called "ConnectNext" or whatever (and please declare a channel * variable so you don't need to repeat mData[index]>mChannel 5 times).

::: netwerk/protocol/websocket/WebSocketChannel.h
@@ +240,5 @@
>    PRUint32                        mAutoFollowRedirects       : 1;
>    PRUint32                        mReleaseOnTransmit         : 1;
>    PRUint32                        mTCPClosed                 : 1;
> +  PRUint32                        mOpenBlocked               : 1;
> +  PRUint32                        mOpenStarted               : 1;

Unlike mOpenBlocked and mOpenRunning, this one isn't temporary--it stays set once the http chan's asyncOpen has been called.  For parity with nsHttpChannel, let's call it mChannelWasOpened, unless you object.
Attachment #550143 - Flags: review?(jduell.mcbugs) → review+
Patrick, let's include these as well unless you object.
> So if the page is in the middle of being Canceled by the LoadGroup, and the
> LG cancels this cxn but there's another WS in same page that's mOpenBlocked,
> it's going to get found by this logic and have BeginOpen->AsyncOpen called,
> just to then get Cancelled by the LG.  Right?  I guess that's fine--a little
> messy, but correct.
> 

right.. the basic issue is that the websocket protocol defines a period of time (a potentially long async period of time) between calling websocketchannel->asyncopen() and when the http connection it uses is opened. The spec requires a dns lookup and potentially a queued wait before the http stuff can begin. If a cancel comes in during that time, we don't see it.

I went with this approach. The other approach is to implement nsirequest just to get the cancel, but that's a whole slew of code that is needless/dup'd-with-http to solve a very small problem even if it is a little cleaner in terms of the model. (i.e. it fits the model well, but it is annoying in practice).
Attachment #551227 - Flags: review+
Whiteboard: [inbound]
Target Milestone: --- → mozilla8
> The other approach is to implement nsirequest just to get the cancel, 
> but that's a whole slew of code that is needless/dup'd-with-http

Right.  But with the cleanup to eliminate WebsocketEstablishedConnection, it probably will make sense to more the nsIRequest code it implements into WebsocketChannel rather than nsWebSocket.  (I haven't looked at Wellington's patch to see if he's doing that already or not).
Whiteboard: [inbound]
You need to log in before you can comment on or make changes to this bug.