Closed Bug 664344 Opened 13 years ago Closed 13 years ago

fuzz testing against websockets client implementation

Component: Core :: Networking: WebSockets
Type: defect, Priority: not set, Severity: normal
Status: RESOLVED FIXED
Reporter: imelven, Assignee: unassigned
the threat model for websockets (http://wiki.mozilla.org/Security/Reviews/Firefox6/ReviewNotes/WebSockets) has two threats related to input validation, revolving around correctly handling fragmented frames and protocol state.

we would like to create a fuzzing server that will test frames that are unexpected in the current state and unexpected cases involving fragmented frames and generally perform validation against the memory allocation etc. performed by the client websockets code. 

a specific case called out in the threat model review is getting a non-fragmented data frame while expecting more fragment continuation frames (but not control frames as these are explicitly stated in the spec to be allowed while a fragmented frame is 'open').
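
A minimal sketch of that sequencing rule, to make the case concrete. It is not taken from the Firefox code or from any fuzzing tool: `checkFrameSequence` and the frame objects with `fin`/`opcode` fields are hypothetical, and the opcode values assume the assignment in RFC 6455 (0x0 continuation, 0x1 text, 0x2 binary, 0x8 and above control).

```javascript
// Hypothetical helper: walks a list of already-parsed frames and reports the
// first frame that is illegal in the current fragmentation state.
function checkFrameSequence(frames) {
  let fragmentOpen = false; // true after a FIN=0 data frame, until a FIN=1 continuation
  for (let i = 0; i < frames.length; i++) {
    const { fin, opcode } = frames[i];
    const isControl = (opcode & 0x8) !== 0;
    if (isControl) {
      // Control frames may appear while a fragmented message is open,
      // but they must not be fragmented themselves.
      if (!fin) return { ok: false, failedAt: i, reason: "fragmented control frame" };
      continue;
    }
    if (opcode === 0x0) {
      if (!fragmentOpen) {
        return { ok: false, failedAt: i, reason: "continuation without open message" };
      }
      if (fin) fragmentOpen = false;
    } else {
      if (fragmentOpen) {
        return { ok: false, failedAt: i, reason: "new data frame inside fragmented message" };
      }
      if (!fin) fragmentOpen = true;
    }
  }
  return { ok: true, failedAt: -1, reason: "" };
}
```

A fuzzing server could use a check like this the other way around: generate a sequence, confirm it is illegal, and then verify that the client fails the connection.
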
I have started a project at https://github.com/oberstet/Autobahn which includes a fuzzing server for WebSockets. The fuzzing server is controlled by sending it JSON commands, for example "sendframe"/"testcase" commands, upon which it will happily send frames that are illegal in themselves or illegal in the current protocol state.

My main interest is getting my own stuff to work correctly, but I would also like to help make FF implement WebSockets "to the spec" and securely.

Before I put more work into documenting the issues I find dubious, is there interest from Mozilla developers?

To give examples of what I am after:

There are a couple of "minor" behaviours in Firefox (I'm using FF7 Aurora) which I think are "issues" (they contradict the draft spec for the WebSockets protocol).

One example in that "minor" category: FF (correctly) closes a connection upon receiving reserved control frame opcodes, but not upon receiving reserved data frame opcodes. The latter are just silently ignored.
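
To spell out which opcodes are meant, here is a small sketch. The function names are hypothetical, and the ranges assume the opcode table of RFC 6455, where 0x3-0x7 are reserved non-control (data) opcodes and 0xB-0xF are reserved control opcodes.

```javascript
// Hypothetical classifier for the 4-bit opcode field of a frame header.
function classifyOpcode(opcode) {
  const value = opcode & 0x0f; // only the low 4 bits carry the opcode
  switch (value) {
    case 0x0: return "continuation";
    case 0x1: return "text";
    case 0x2: return "binary";
    case 0x8: return "close";
    case 0x9: return "ping";
    case 0xa: return "pong";
    default:
      // Bit 3 distinguishes control opcodes (0x8-0xF) from data opcodes (0x0-0x7).
      return (value & 0x8) !== 0 ? "reserved-control" : "reserved-data";
  }
}

// Both reserved ranges should lead to failing the connection,
// not only the reserved control opcodes.
function mustFailConnection(opcode) {
  return classifyOpcode(opcode).startsWith("reserved");
}
```
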

Then there are subtle issues with fragmented messages. I'll give an example.

Below are 3 wire dumps (after the initial handshake) where the client sends a fuzzing server command, then 3 frames are sent from server to client. The frames are the same in each dump, but chopped up differently on the wire.

The 3 frames sent from server to client are:

1=frag. text message start
2=ping control frame
3=text message end.

Expected is a single PONG.

Only the first case seems to be handled by FF.

a)
All 3 frames from server to client in one chop (lines 7C-8C):

00000210  81 8b 5e d8 ef d6 05 fa  8c b7 2d bd df e6 6b fa ..^..... ..-...k.
00000220  b2                                               .
    0000007C  01 0a 66 72 61 67 6d 65  6e 74 31 2d 89 04 70 69 ..fragme nt1-..pi
    0000008C  6e 67 80 09 66 72 61 67  6d 65 6e 74 32          ng..frag ment2
00000221  8a 84 ae 3d b5 f7 de 54  db 90                   ...=...T ..

b)
each of 3 frames in one chop (lines 7C, 88 and 8E)

00000210  81 8b 83 e4 90 6a d8 c6  f3 0b f0 81 a0 5a b5 c6 .....j.. .....Z..
00000220  cd                                               .
    0000007C  01 0a 66 72 61 67 6d 65  6e 74 31 2d             ..fragme nt1-
    00000088  89 04 70 69 6e 67                                ..ping
00000221  8a 84 73 09 e4 00 03 60  8a 67                   ..s....` .g
    0000008E  80 09 66 72 61 67 6d 65  6e 74 32                ..fragme nt2
0000022B  8a 84 13 53 98 72 63 3a  18 7b                   ...S.rc: .{

c)
the 3 frames chopped up on the wire into individual octets (the extreme case)

00000210  81 8b 8b 2e 81 29 d0 0c  e2 48 f8 4b b1 19 b2 0c .....).. .H.K....
00000220  dc                                               .
    0000007C  01                                               .
    0000007D  0a                                               .
    0000007E  66                                               f
    0000007F  72                                               r
    00000080  61                                               a
    00000081  67                                               g
    00000082  6d                                               m
    00000083  65                                               e
    00000084  6e                                               n
    00000085  74                                               t
    00000086  31                                               1
    00000087  2d                                               -
    00000088  89                                               .
    00000089  04                                               .
    0000008A  70                                               p
    0000008B  69                                               i
    0000008C  6e                                               n
    0000008D  67                                               g
00000221  8a 84 18 f9 cc 02 68 90  a2 65                   ......h. .e
    0000008E  80                                               .
    0000008F  09                                               .
    00000090  66                                               f
    00000091  72                                               r
0000022B  8a 84 d2 b2 9e 74 a2 db  1e 7d                   .....t.. .}
    00000092  61                                               a
    00000093  67                                               g
    00000094  6d                                               m
    00000095  65                                               e
    00000096  6e                                               n
    00000097  74                                               t
    00000098  32                                               2

==

The data sent from server to client is exactly the same in all three cases. Only in a) does FF answer correctly (one PONG). With b) and c), there is a correct PONG, followed by a bogus 2nd PONG.

Also note that FF seems to be in a confused state afterwards. Subsequent communication will often fail with b) and c).
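
To illustrate why the chopping should not matter, here is a hedged sketch of an incremental frame reader. The server-to-client bytes are copied from dump a); `readFrames` is a hypothetical helper that only handles unmasked frames with a payload of at most 125 octets, which is all these dumps need. It is not how Firefox or the fuzzing server parses frames.

```javascript
// Server-to-client octets from dump a) above.
const WIRE_BYTES = [
  0x01, 0x0a, 0x66, 0x72, 0x61, 0x67, 0x6d, 0x65, 0x6e, 0x74, 0x31, 0x2d, // text, FIN=0, "fragment1-"
  0x89, 0x04, 0x70, 0x69, 0x6e, 0x67,                                     // ping, FIN=1, "ping"
  0x80, 0x09, 0x66, 0x72, 0x61, 0x67, 0x6d, 0x65, 0x6e, 0x74, 0x32,       // continuation, FIN=1, "fragment2"
];

// Hypothetical reader: buffers incoming chunks and emits a frame only once
// its header and full payload have arrived.
function readFrames(chunks) {
  const pending = [];
  const frames = [];
  for (const chunk of chunks) {
    pending.push(...chunk);
    while (pending.length >= 2) {
      const masked = (pending[1] & 0x80) !== 0;
      const length = pending[1] & 0x7f;
      if (masked || length > 125) throw new Error("outside the scope of this sketch");
      if (pending.length < 2 + length) break; // wait for more octets
      const frameBytes = pending.splice(0, 2 + length);
      frames.push({
        fin: (frameBytes[0] & 0x80) !== 0,
        opcode: frameBytes[0] & 0x0f,
        payload: String.fromCharCode(...frameBytes.slice(2)),
      });
    }
  }
  return frames;
}

// The same octets, chopped as in dumps a), b) and c).
const inOneChop = readFrames([WIRE_BYTES]);
const perFrame = readFrames([WIRE_BYTES.slice(0, 12), WIRE_BYTES.slice(12, 18), WIRE_BYTES.slice(18)]);
const perOctet = readFrames(WIRE_BYTES.map((octet) => [octet]));
const pingCount = perOctet.filter((frame) => frame.opcode === 0x9).length;
console.log(inOneChop.length, perFrame.length, perOctet.length, pingCount);
```

All three chop patterns yield the same three frames with exactly one ping among them, which is why a single PONG is the expected answer in every case.
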
> Before I put more work into documenting the issues I find dubious, is there
> interest by Mozilla developers?

absolutely - we will need to take them on a case by case basis, but what you have here is really helpful.

I'll look into the two things you mentioned and report back.

btw - it's worth testing against nightly instead of aurora, since nightly is where all the current dev work goes on.
Ok, I've created an easy-to-use test documenting/reproducing the issues in the GitHub-hosted repo mentioned above.

Here is the short version (first two as above):

1) FF accepts (silently ignores) frames with reserved data frame opcodes. It should immediately fail the connection.

2) FF gets confused by (legal) sequences of fragmented data messages with intermittent control frames when the octets on the wire are chopped up. It should process the data correctly.

3) FF accepts (silently ignores) continuation frames with FIN = 1 outside of a fragmented message. It should immediately fail the connection.

4) FF accepts/processes both data fragments of payload length 0 and complete (reassembled) messages of length 0, and will generate onMessage() events for those (with an empty string). I don't know if that conforms to the spec; I could not find anything about it there. However, I'm not sure it is helpful for the JS to get empty messages anyway.

==

Then, I had concerns with onClose() not providing code/reason, but that seems to be already filed: https://bugzilla.mozilla.org/show_bug.cgi?id=674716

Also, I had concerns with subprotocols not handled correctly during handshake, but that is also recently filed: https://bugzilla.mozilla.org/show_bug.cgi?id=674527

==

Currently, I have one additional issue:

5) When a WebSocket connection is opened from JS, FF opens 1 TCP connection to the server and performs the handshake, but immediately after that opens a 2nd TCP connection to the server. On that 2nd TCP connection, FF does not send anything .. and closes the connection by itself after approx. 5s.

I'm not sure why it does that. It might also lead to problems, for example when a server has protection mechanisms such as blocking an IP from which a TCP connection was accepted but no handshake was performed.

A related issue might be: when I have JS like this:

      <script type="text/javascript">
         function start()
         {
            webSocket = new MozWebSocket("ws://localhost:9000");
            webSocket.onclose =
            function(e)
            {
               start();
            }
         }
         window.onload =
            function()
            {
               start();
            }
      </script>

(naive automatic reconnect), and I kill the server, wait some time, and then restart it, the server is confronted with dozens of new TCP connections. All of those will be closed by FF; only 1 does the WebSockets handshake. This might also interfere with server protection mechanisms.

Somehow, FF seems to be "buffering" the WebSocket creates from JS .. and when the server comes back, it executes all the buffered connects. Or something else. It's just weird.
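
For comparison, a less naive reconnect would wait between attempts. The following is only a sketch of a capped exponential delay: `reconnectDelayMs` and its defaults (500 ms base, 30 s cap) are made-up values for illustration, not anything FF or the test page uses.

```javascript
// Hypothetical helper: delay before reconnect attempt number `attempt`
// (0 for the first retry), doubling each time up to a fixed cap.
function reconnectDelayMs(attempt, baseMs = 500, maxMs = 30000) {
  const delay = baseMs * Math.pow(2, attempt);
  return Math.min(delay, maxMs);
}

// Delays for the first few attempts, in milliseconds.
const firstDelays = [0, 1, 2, 3].map((attempt) => reconnectDelayMs(attempt));
console.log(firstDelays.join(", "));
```

In the page above, the onclose handler would then call something like setTimeout(start, reconnectDelayMs(attempt)) instead of calling start() directly, and reset the attempt counter once a connection opens.
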

==

Regarding Aurora: ok, I'll switch to nightly for testing.

==

So in total, there might be 5 issues. Personally, I find 2) and 5) more important than 1) and 3). Then, 4) might not be an issue anyway.

Please let me know if I should file bugs or if I can help further in some way.
update:

- the behavior 4) is no issue at all, and conforms to the spec (as you explained to me on the hybi list).

- I've run the tests against nightly (firefox-8.0a1.en-US.win64-x86_64 / 8.0a1 (2011-07-31)): results are the same
Depends on: 675919
Depends on: 675961
Depends on: 675983
(In reply to comment #3)
> 5) When a WebSocket connection is opened from JS, FF opens 1 TCP connection
> to server, performs handshake, but immediately after that opens a 2nd TCP
> connection to server. On that 2nd TCP, FF does not send anything .. and
> closes the connection after approx. 5s by itself.
> 

So this works for me. If I load fuzzing.html and click (as an example) 2.1, I see 1 TCP handshake, 1 WebSockets handshake, and the expected test 2.1 WebSockets exchange. That's all good.

Because websockets is bootstrapped with HTTP, if the TCP handshake exceeds 250ms you will see a second TCP session initiated. The extra one (which could be either one depending on the actual order of connection) might never see any TCP data and would close after a few seconds.

Is that what is happening for you? I would expect a localhost connection to connect much faster than that. Can you provide a packet trace? The best format is a binary pcap (just attach it).
(In reply to comment #3)

> 
> 1) FF accepts (silently ignores) frames with reserved data frame opcodes. It
> should immediately fail the connection.

bug 675919

> 
> 2) FF gets confused on (legal) sequences of fragmented data message with
> intermittent control frames, when the octets on wire are chopped up. It
> should correctly process data.
> 

bug 675961

> 3) FF accepts (silently ignores) continuation frames with FIN = 1 when
> outside of fragmented message. It should immediately fail the connection.
> 

bug 675983

> 4) FF accepts/processes both data fragments of payload length 0, and

invalid

> 
> 5) When a WebSocket connection is opened from JS, FF opens 1 TCP connection
> to server, performs handshake, but immediately after that opens a 2nd TCP
> connection to server. On that 2nd TCP, FF does not send anything .. and
> closes the connection after approx. 5s by itself.

wfm

we'll leave this bug open because it is the anchor for some other parallel fuzz testing.
I'll try to gather more data when I'm back in the office tomorrow.
On Windows, it's not possible to sniff the loopback device (since it's not really a device). I'll check in a Linux VM, or with the fuzzer running remotely on the LAN.
I have now extended the test suite: it runs fully automated and has nearly 70 test cases.

The results are here (the code is on GitHub - tagged v0.2):

http://www.tavendo.de/autobahn/testsuite/report/

Please have a look there first, I have invested considerable time to make it informative .. it even produces wire logs, so you can easily inspect what happens on wire.

Also: Included is the build you provided to me ("PM/Firefox/8.0a(20110802)").

The good news: I could verify that

675919

is fixed in your build (it also seems to have already landed on nightly).

=> Cases 4.2.1 and 4.2.2

Regarding 675961 / 675983: on your build, almost everything is green - only 5.15 is left.

This latter one could be related to the others (also new) that remain red: 3.2-3.4 and 4.1.3-4.1.5 and 4.2.3-4.2.5.

The point of these new, refined cases: they do something valid, then something invalid. They verify that the peer fails the connection, but only after having processed the valid data that came before. I think this is the right thing to do when "failing the connection immediately": process everything up to the invalid data, instead of discarding everything accumulated but not yet processed. One could argue, though, that the spec doesn't speak about this. What do you think?

Cases 1.2.x are for binary messages .. ignore.

Apart from 5.15, in every case at least one of FF (in some version) or Chrome is green. Therefore, I'm quite confident that all the cases are valid and the test suite is operating correctly.

==

The test suite still lacks cases .. areas I want to improve are: UTF-8, Performance/Limits, Close Handling and Handshaking.

Phew, I have to admit it all took much more work than I thought .. which is also the reason I have not yet had time for the "dubious double connection" stuff .. I'll come back once I have done my tcpdump homework.
(In reply to Tobias Oberstein from comment #8)
>
> The point with these new, refined cases: they do something valid, then
> invalid. It's verified that the peer fails the connections, but only after
> having processed the valid stuff before. I think this is the right thing to
> do when "failing the connection immediately". Process the stuff up until
> invalid, instead of discarding everything not yet processed, but
> accumulated. One could though argue the spec doesn't speak about this. What
> do you think?

That's nice, but sorry, I disagree that there is a requirement to do that. In the case of a protocol error we may proceed to shut down the connection immediately. This is going to unpredictably impact processing of the various asynchronous events already queued. Applications intolerant of that really need to speak a compliant protocol which has an orderly shutdown built into it ;) (i.e. don't try to build a reliable system on top of broken implementations - that way lies madness).

Reflecting that bit of information, can you revise the chart (or just plug it into this bug) where you think the inappropriate behavior still is? Your tool really is impressive.
Ok, somehow I felt you would argue that way ;) You're right, I agree, the spec does not require this behavior. When that is considered conformant, no failing cases are currently left (for your build).

That said: I prefer behavior that is as deterministic as possible. That means failing exactly at the point of the misbehaving data and processing everything before it. Otherwise the outcome varies with the amount of buffered, still unprocessed data at the time the implementation detects the misbehaving bit. Chrome seems to handle it that way, as do I. In a symmetric case, for a fragmented text message, I would want to fail on the first frame where non-UTF-8 data is seen, not only after the whole message. I guess you would argue the spec does not require that either, so both behaviors are conformant.
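
A sketch of what failing on the first bad fragment could look like, assuming a runtime that provides the standard `TextDecoder` with its `fatal` and `stream` options. `firstInvalidFragment` is a hypothetical helper, and this is not how FF, Chrome, or the test suite do it.

```javascript
// Hypothetical helper: returns the index of the first fragment at which the
// accumulated payload can no longer be valid UTF-8, or -1 if the whole
// message decodes cleanly. Each fragment is an array of octets.
function firstInvalidFragment(fragments) {
  const decoder = new TextDecoder("utf-8", { fatal: true });
  for (let i = 0; i < fragments.length; i++) {
    const isLast = i === fragments.length - 1;
    try {
      // stream: true keeps an incomplete multi-byte sequence buffered, so a
      // character split across two fragments is not reported as an error.
      decoder.decode(Uint8Array.from(fragments[i]), { stream: !isLast });
    } catch (error) {
      return i; // fail here, before looking at any later fragment
    }
  }
  return -1;
}

// The euro sign (E2 82 AC) split across two fragments is still valid.
console.log(firstInvalidFragment([[0xe2, 0x82], [0xac]]));
// A stray 0xFF in the second of three fragments is caught at that fragment.
console.log(firstInvalidFragment([[0x66], [0xff], [0x6f]]));
```
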

Anyway, I would like to leave such test cases in the suite. But in light of the above, it's inappropriate to mark them as fails. I would suggest designating them as "non-strict" in yellow. Would that be OK with you? Any better suggestion? I am asking because I don't want to spread FUD about FF. Not that anyone besides you and some Chrome devs has taken notice so far anyway, but who knows ;)
I have updated the report to reflect the above.
Just one more remark on the cases. For the binary message cases (1.2.x), I only now noticed that the cases fail for FF not because FF closes the connection, but because I time out waiting for an echo. Chrome does fail the connection, though not on the first fragment, only after the last (case 1.2.8). You can see the difference in the wire logs: TCP CLOSED BY CLIENT vs TCP CLOSED BY SERVER. The FF behavior is the same for all builds.
Just for the record: I've looked again into the "multiple TCP connections issue" I was talking about above.

It's invalid, there is no issue.

What got me confused is that FF closes previous connections to the same WS server asynchronously. That is, it will open a new connection and only after a while close the previous one. This is different from Chrome, which closes a connection first before opening a new one to the same WS server.
resolved or split off
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED