Closed Bug 562681 Opened 10 years ago Closed 10 years ago

Update the WebSockets implementation to support the latest version of the protocol

Categories

(Core :: General, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla2.0b1

People

(Reporter: wfernandom2004, Assigned: wfernandom2004)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

Attachments

(3 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; pt-BR; rv:1.9.1.9) Gecko/20100401 Ubuntu/9.10 (karmic) Firefox/3.5.9
Build Identifier: 

The current mozilla's implementation is available at:
https://bugzilla.mozilla.org/show_bug.cgi?id=472529

The current implemented version is the v.75 one. Available at:
http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-75

The latest version is available at:
http://www.whatwg.org/specs/web-socket-protocol/

Also, the API interface must be changed according to:
http://dev.w3.org/html5/websockets/


Reproducible: Always
Depends on: websocket
(See bug 472529 comment 165 and below for previous discussion on this.)
Status: UNCONFIRMED → NEW
Ever confirmed: true
Since our v75 implementation is over I would like to start this bug to much. However, there is a big discussion in the hybi list, and maybe the the protocol can change a lot. For instance, they are talking about using TLS frames as base for everything.

Hixie, I am adding you here. You can provide some feedback to us about that, can't you? Thanks.
Personally I don't think using TLS exclusively is a particularly good idea, but I don't have a good handle on what the working group is going to do.

My understanding is that Chrome implemented the new (current) handshake. If you agree that the current proposed handshake is a better idea than _requiring_ server-side developers to use TLS libraries to implement Web Socket, then you should implement it, since that would increase the number of implementations of the handshake and makes it easier to sell the idea of not replacing the handshake wholesale. On the other hand, if you think it would be better to always use TLS, then your best bet is to experiment with an implementation that does that, and send a report of your implementation experience to the list. That would make it easier to sell the idea that forcing TLS is a good idea.
Attached patch closing handshake (obsolete) — Splinter Review
This patch implements the closing handshake of the v76 and some minor changes of the protocol and the API.

In the following days I will implement what is missing in this patch, i.e implement the new connection handshake.

Olli, can you review this patch when you have some free time, please?

Thanks.
Attachment #449566 - Flags: review?(Olli.Pettay)
Comment on attachment 449566 [details] [diff] [review]
closing handshake


> NS_IMETHODIMP
>-nsWebSocket::Initialize(nsISupports* aOwner,
>+nsWebSocket::Initialize(nsISupports* owner,
>                         JSContext* cx,
>                         JSObject* obj,
>                         PRUint32 argc,
>                         jsval* argv)
Why this change? In fact, all the parameters should be in form
aParameter

>+class nsDOMCloseEvent : public nsDOMEvent,
>+                        public nsIDOMCloseEvent
>+{
>+public:
>+  nsDOMCloseEvent(nsPresContext* aPresContext, nsEvent* aEvent)
>+    : nsDOMEvent(aPresContext, aEvent)
>+  {
>+  }
Please initialize mWasClean in the constructor.

So far looking good. I'll lok at the code more once you finalize the patch.
Attachment #449566 - Flags: review?(Olli.Pettay) → review+
Attached patch version v76 (obsolete) — Splinter Review
Olli, this one is finished. I didn't tested in pywebsocket yet, but Jonathan Griffin is going to.
Attachment #449566 - Attachment is obsolete: true
Attachment #449913 - Flags: review?(Olli.Pettay)
I've just tested the pywebsocket's echo test. It worked :)

Olli, the code is again raising a assertion when the document is closed (for instance, when the page is reloaded). I don't know if the code has changed, but checking if the document is still there when dispatching the event isn't enough anymore... 

This assertion only occurs when running on debug mode and it neither closes the browser nor makes it to stop working. But the assertion is there... I tried to capture the "dom-window-destroyed" (nsObserverService) that nsGlobalWindows throws, but that didn't work either. I'm attaching the stack trace. Do you have any idea to solve it?
So would CheckInnerWindowCorrectness help?
I'll review the patch tomorrow.
Ah, there is the CheckInnerWindowCorrectness check.
The assertion may happen for example with XMLHttpRequest too.
I wouldn't worry about it too much atm.
Comment on attachment 449913 [details] [diff] [review]
version v76


>+  enum eRequestHeader { upgradeHeader = 0, connectionHeader, hostHeader,
>+                        originHeader, secWebSocketProtocolHeader,
>+                        authorizationHeaders, cookieHeaders,
>+                        secWebSocketKey1Header, secWebSocketKey2Header,
>+                        numberRequestHeaders };
>+  nsTArray<PRUint32> headersToSend(numberRequestHeaders);
You could use nsAutoTArray here. But doesn't matter much.

>+nsresult
>+nsWebSocketEstablishedConnection::GenerateRequestKeys(nsCString& aKey1,
>+                                                      nsCString& aKey2,
>+                                                      nsCString& aKey3)
>+{
>+  nsresult rv;
>+  PRUint32 i;
>+
>+  PRUint32 spaces_1 = rand() % 12 + 1;
>+  PRUint32 spaces_2 = rand() % 12 + 1;
>+
>+  PRUint32 max_1 = 4294967295u / spaces_1;
>+  PRUint32 max_2 = 4294967295u / spaces_2;
You could use PR_UINT32_MAX which is 4294967295


>+
>+  // XXX: hmm, RAND_MAX can be less than 4294967295, can't be?
"This value is library dependent, but is granted to be at least 32767."
Perhaps you should use random() and not rand()

>+  PRUint32 number_1 = rand() % max_1;
>+  PRUint32 number_2 = rand() % max_2;
>+
>+  PRUint32 product_1 = number_1 * spaces_1;
>+  PRUint32 product_2 = number_2 * spaces_2;
>+
>+  nsCAutoString key_1;
>+  nsCAutoString key_2;
>+
>+  key_1.AppendInt(product_1);
>+  key_2.AppendInt(product_2);
>+
>+  PRUint32 numberOfCharsToInsert_1 = rand() % 12 + 1;
>+  for (i = 0; i < numberOfCharsToInsert_1; ++i) {
>+    PRUint32 posToInsert = rand() % key_1.Length();
>+    PRUnichar charToInsert =
>+      rand() % 2 == 0 ?
>+        static_cast<PRUnichar>(0x0021 + (rand() % (0x002F - 0x0021 + 1))) :
>+        static_cast<PRUnichar>(0x003A + (rand() % (0x007E - 0x003A + 1)));
>+
>+    key_1.Insert(charToInsert, posToInsert);
>+  }
>+
>+  PRUint32 numberOfCharsToInsert_2 = rand() % 12 + 1;
>+  for (i = 0; i < numberOfCharsToInsert_2; ++i) {
>+    PRUint32 posToInsert = rand() % key_2.Length();
>+    PRUnichar charToInsert =
>+      rand() % 2 == 0 ?
>+        static_cast<PRUnichar>(0x0021 + (rand() % (0x002F - 0x0021 + 1))) :
>+        static_cast<PRUnichar>(0x003A + (rand() % (0x007E - 0x003A + 1)));
>+
>+    key_2.Insert(charToInsert, posToInsert);
>+  }
>+
>+  for (i = 0; i < spaces_1; ++i) {
>+    PRUint32 posToInsert = rand() % (key_1.Length() - 1) + 1;
>+    key_1.Insert(static_cast<PRUnichar>(0x0020), posToInsert);
>+  }
>+
>+  for (i = 0; i < spaces_2; ++i) {
>+    PRUint32 posToInsert = rand() % (key_2.Length() - 1) + 1;
>+    key_2.Insert(static_cast<PRUnichar>(0x0020), posToInsert);
>+  }

Could you actually make a helper method for creating key1 and key2.
Now you have duplicated the same code for both.


>+
>+  nsCAutoString key_3;
>+  for (i = 0; i < 8; ++i) {
>+    key_3 += static_cast<PRUnichar>(rand() % 0x0100);
>+  }
Why % 0x0100?
And should we prevent '\0'? Or does the string handling work
ok even if the string has \0 at random places?
(The protocol doesn't require stripping \0, but I wonder whether
 we need to limit that in our implementation.)



>+
>+  // since we have the keys, we calculate the server md5 challenge response
>+
>+  nsCOMPtr<nsICryptoHash> md5CryptoHash =
>+    do_CreateInstance(NS_CRYPTO_HASH_CONTRACTID, &rv);
>+  NS_ENSURE_SUCCESS(rv, rv);
>+
>+  rv = md5CryptoHash->Init(nsICryptoHash::MD5);
>+  NS_ENSURE_SUCCESS(rv, rv);
>+
>+  PRUint8 data[16];
>+  for (i = 1; i <= 4; ++i) {
>+    data[i - 1] = static_cast<PRUint8>(number_1 >> (32 - i * 8));
>+  }
>+  for (i = 1; i <= 4; ++i) {
>+    data[i + 3] = static_cast<PRUint8>(number_2 >> (32 - i * 8));
>+  }
>+  for (i = 0; i < 8; ++i) {
>+    data[i + 8] = static_cast<PRUint8>(key_3[i]);
>+  }
>+
>+  rv = md5CryptoHash->Update(data, 16);
>+  NS_ENSURE_SUCCESS(rv, rv);
>+
>+  rv = md5CryptoHash->Finish(PR_FALSE, mExpectedMD5Challenge);
>+  NS_ENSURE_SUCCESS(rv, rv);
>+
>+  aKey1 = key_1;
>+  aKey2 = key_2;
>+  aKey3 = key_3;
>+
>+  return NS_OK;
>+}
>+
> PRBool
>+class nsDOMCloseEvent : public nsDOMEvent,
>+                        public nsIDOMCloseEvent
>+{
>+public:
>+  nsDOMCloseEvent(nsPresContext* aPresContext, nsEvent* aEvent)
>+    : nsDOMEvent(aPresContext, aEvent),
>+      mWasClean(PR_FALSE)

mWasClean should fit in to the preview line


>diff --git a/modules/libpref/src/init/all.js b/modules/libpref/src/init/all.js
>--- a/modules/libpref/src/init/all.js
>+++ b/modules/libpref/src/init/all.js
>@@ -713,17 +713,17 @@ pref("network.gopher.qos", 0);
> // Section 4.8 "High-Throughput Data Service Class", and 80 (0x50, or AF22)
> // per Section 4.7 "Low-Latency Data Service Class".
> pref("network.ftp.data.qos", 0);
> pref("network.ftp.control.qos", 0);
> 
> // </http>
> 
> // <ws>: WebSocket
>-pref("network.websocket.enabled", false);
>+pref("network.websocket.enabled", true);
I think we could enable websockets when tests lands.
Though, in practice everything probably lands at the
same time.

If you could update the patch and I could re-review it then.
Attachment #449913 - Flags: review?(Olli.Pettay) → review+
> The assertion may happen for example with XMLHttpRequest too.
> I wouldn't worry about it too much atm.
Hmm. Well, perhaps we could change CheckInnerWindowCorrectness to make the same test that DEBUG_CheckForComponentsInScope(in xpcwrappednativescope.cpp) does. What do you think about?

> Why % 0x0100?
Because it should be a number between 0 and 255, inclusive.

> And should we prevent '\0'?
Well, the key3 is supposed to be just 8 bytes, not a string. I don't know if it could break websocket servers, but it is supposed to not break them. Since the key3 is a sequence of random numbers I think we can strip \0 bytes without problems. Do you want to strip them?

> mWasClean should fit in to the preview line
Preview line? Did you want to write "previous line", didn't you?
Depends on: 571013
Attached patch patch v76Splinter Review
Olli, I tested the \0 byte in the key3 and worked with pywebsocket. But I've removed, as we talked about
Attachment #449913 - Attachment is obsolete: true
Attachment #450228 - Flags: review?(Olli.Pettay)
Comment on attachment 450228 [details] [diff] [review]
patch v76


>+void
>+nsWebSocketEstablishedConnection::GenerateSecKey(nsCString& aKey,
>+                                                 PRUint32 *aNumber)
>+{
Couldn't this method just return PRUint32. No need for the aNumber parameter.
Attachment #450228 - Flags: review?(Olli.Pettay) → review+
Olli, I've finished to write my tests. However I found an issue during executing them that I think it should be corrected before landing the patch. The problem is related to the garbage collector. For instance, suppose there are two js functions:

function testUsingLocalWS()
{
  var ws = new WebSocket("ws://a.valid.ws.server/a.valid.resource");
  ws.onopen = function() {
    alert(1);
  }
}

function testUsingGlobalWS()
{
  window.ws = new WebSocket("ws://a.valid.ws.server/a.valid.resource");
  window.ws.onopen = function() {
    alert(2);
  }
}

Also, suppose that there is a small delay (5 seconds for instance) before the server can provide the response handshake. In my tests I found out that just the alert of testUsingGlobalWS will eventually be called, because the local ws object in the first function will be deleted from the CC before it can receive the onopen event.

Well, I thought that the event listeners held strong references to the object...
Actually, I think that the code is missing something, because it shouldn't happen (as it doesn't happen with xhr)... Well, I will take a look at this issue tomorrow.
We could perhaps fix that issue in a followup bug.
...since there will be followup bugs anyway. That happens always.

Now I need to fix the Windows building problem which tryserver shows...
The windows problem is just that it doesn't seem to find random().

I'll change that and push to tryserver again.
I'll fix the windows problem by changing random back to rand.

And about the event listener problem... nothing actually keeps those
listeners alive.
So, in this case it is probably the websocket connection which needs to keep
itself alive. So nsWebSocketEstablishedConnection shouldn't probably have weak mOwner.
I'll file a followup bug to fix that case.
Actually, per the spec the behavior in #16 is right.
Though there is the bug that connection isn't kept alive
if there are 'message' events listeners.
Attachment #452273 - Flags: review?(Olli.Pettay)
Comment on attachment 452273 [details] [diff] [review]
tests and small fixes

Good, you're testing very different things that what I have in my patch queue.
Attachment #452273 - Flags: review?(Olli.Pettay) → review+
Assignee: nobody → wfernandom2004
Status: NEW → ASSIGNED
I run the tests locally, and I got
"failed | the ws connection in test 15 shouldn't be closed cleanly"
I'll disable that test before landing and I'll file a followup bug to
investigate what is wrong with the test.
Hmm, test 15 only succeeds if the server doesn't send the close frame. It is done by the msgutil.close_connection() call in pywebsocket. That call is done in a "finally" block in the functions that calls the test handler. So I tried sys.exit(0) and that worked to me, but I am not really really sure if it is enough to prevent the finally block.
Seems like tests 9, 13, 14 fail randomly.
I'm going to try to increate finishWSTest timeout time and if that doesn't work
I'll disable the tests.
It is unfortunate that there really isn't any other way to test this all than
using timeouts.
Increasing timeout didn't help. So tests 9, 13, 14, 15 are disabled for now.
Note, the failures seem to be Linux only.
There was another random problem on linux which was about wasClean.
Depends on: 573227
Marking this fixed. Enabling those tests is tracked in Bug 573227.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.1b1
Target Milestone: mozilla1.9.1b1 → mozilla2.0b1
Blocks: 611127
Depends on: 621347
You need to log in before you can comment on or make changes to this bug.