Open Bug 1555641 Opened 5 years ago Updated 1 month ago

Binary payload viewer

Categories

(DevTools :: Netmonitor, enhancement, P3)

enhancement

Tracking

(Not tracked)

People

(Reporter: Honza, Unassigned, NeedInfo)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Whiteboard: gsoc-2020)

Attachments

(5 files)

It should be possible to inspect binary WS frame payload. Such component could also be nicely reused for binary HTTP responses.

Honza

STR:

  1. Go to this test site and press Connect.
  2. Open Netmonitor > Click on request > Click on Messages panel.
  3. Send a non-ASCII message e.g. "б" or "他“.
  4. The message shown in DevTools is unreadable (refer to picture)

In nsIWebSocketListener.idl, the frame payload is defined to be of type ACString.

It would be nice if the payload returned is of a binary type.

Alternatively, it would be nice if the payload string returned is exactly the same string as what was sent over the wire. This way DevTools can parse it on the frontend by encoding the string into UTF-8 and then to a hex string to be shown to users.

:baku Do you know if this possible and easy to do? Thanks in advance :D

Flags: needinfo?(amarchesini)

I also found the following issue:

  1. Launch Firefox browser
  2. Navigate to: http://demos.kaazing.com/echo/index.html
  3. Press F12 to open Devtools and go to the Network tab
  4. Press "Connect" button from the page opened at Step 2
  5. Click on "demos.kaazing.com" websocket request
  6. Click on "Messages" panel from the right side
  7. Mouseover the empty message

Actual result:
A tooltip with "Sent/Received" binary data is displayed. See screenshot

Expected result:
The message should not be empty.

Sorry for the delay!

Alternatively, it would be nice if the payload string returned is exactly the same string as what was sent over the wire. This way DevTools can parse it on the frontend by encoding the string into UTF-8 and then to a hex string to be shown to users.

nsIWebSocketListener is what dom/websocket uses. Based on the 'BinaryType' you can show the nsCString as a Blob or as an ArrayBuffer.
In nsIWebSocketEventListener, each webSocketMessageAvailable() as its type.
Is it enough to display the content correctly?

Flags: needinfo?(amarchesini) → needinfo?(odvarko)

@Heng Yeow: does the comment from Andrea helps?
Honza

Flags: needinfo?(odvarko) → needinfo?(E0032242)

(In reply to Andrea Marchesini [:baku] from comment #5)

nsIWebSocketListener is what dom/websocket uses. Based on the 'BinaryType' you can show the nsCString as a Blob or as an ArrayBuffer.
In nsIWebSocketEventListener, each webSocketMessageAvailable() as its type.

I was testing this and I can confirm that webSocketMessageAvailable provides the type as an argument passed to this callback:

Here are the possible values:
https://searchfox.org/mozilla-central/rev/c80fa7258c935223fe319c5345b58eae85d4c6ae/netwerk/protocol/websocket/nsIWebSocketEventService.idl#52-54

But, we need to have the type accessible in frameSent and frameReceived since webSocketMessageAvailable doesn't provide access to the frame and we don't know whether it's sent or received frame.

Andrea: it looks like we should be able to use frame.opCode instead of type, is that correct? Are there any advantages/advantages over in using opCode over type?

Also, I tried to get the raw data from the payload just simply doing the following:

  var data = [];
  for (var i=0, i<frame.payload.length; i++) {
      data.push(frame.payload.charCodeAt(i));
  }
  console.log(data);

It seems to provide the right data but, is this what we should do? It doesn't feel ok.

Here is the place where I tried to get the raw data
https://searchfox.org/mozilla-central/rev/4d9cd186767978a99dafe77eb536a9525980e118/devtools/server/actors/network-monitor/websocket-actor.js#126

Honza

Flags: needinfo?(E0032242) → needinfo?(amarchesini)

I prefer to have somebody from the necko team to answer this question.

Flags: needinfo?(amarchesini) → needinfo?(valentin.gosu)
Flags: needinfo?(valentin.gosu) → needinfo?(michal.novotny)

(In reply to Jan Honza Odvarko [:Honza] (always need-info? me) from comment #7)

Andrea: it looks like we should be able to use frame.opCode instead of type, is that correct? Are there any advantages/advantages over in using opCode over type?

You should use opcode because type isn't applicable to control frames. Opcode can be any of these codes https://searchfox.org/mozilla-central/rev/fa2df28a49883612bd7af4dacd80cdfedcccd2f6/netwerk/protocol/websocket/nsIWebSocketEventService.idl#29-36. See also https://tools.ietf.org/html/rfc6455#section-5.2.

Flags: needinfo?(michal.novotny)

Thanks Michal, also how we should process/handle binary data passed to the WS service listener?
I added a code snippet into the comment #7, but it doesn't feel like the right way to deal with binary data...

Honza

Flags: needinfo?(michal.novotny)

Sorry, I really don't know what's the best way to get raw data from string.

Flags: needinfo?(michal.novotny)

Honza, I was checking the protocols we support that use binary data:

WAMP already converts the string to a typed array: https://searchfox.org/mozilla-central/rev/fa2df28a49883612bd7af4dacd80cdfedcccd2f6/devtools/client/netmonitor/src/components/websockets/parsers/wamp/serializers.js#20

Thanks Harald so, it isn't far from what I tried.

Now, how can we get the original string from the frame.payload?

Examples:

  • sending generates this payload (in dec): [ 228, 187, 150 ]
  • sending б generates this payload (in dec): [ 208, 177 ]

It's correct from the binary perspective and corresponds to what I can see in Chrome.
But, how can we convert it back to string when showing it to the user?

Or any tips who to ask?

Honza

Flags: needinfo?(hkirschner)

(In reply to Jan Honza Odvarko [:Honza] (always need-info? me) from comment #13)

But, how can we convert it back to string when showing it to the user?

I don't think you should try to interpret the binary data. See end of the section 1.2 in rfc 6455 (https://tools.ietf.org/html/rfc6455#section-1.2):

Broadly speaking, there are types for textual data (which is interpreted as UTF-8 [RFC3629] text), binary data (whose interpretation is left up to the application), and control frames ...

(In reply to Michal Novotny [:michal] from comment #14)

I don't think you should try to interpret the binary data.

But, in my case the opCode is 1 i.e. I just want to get the proper text data so, it should be safe no?

Honza

If the frame is a text message, then simply use the payload as UTF-8 string.

@jlast pointed me to this TextDecoder MDN Page:
https://developer.mozilla.org/en-US/docs/Web/API/TextDecoder

And it solves the problem I described in comment #13

let utf8decoder = new TextDecoder(); // default 'utf-8' or 'utf8'
let u8arr = new Uint8Array([ 228, 187, 150 ]);
console.log(utf8decoder.decode(u8arr));

Honza

AFAICS, TextDecoder() doesn't take string as an input, so you have to convert string to byte array and then back to string. You should probably do something like this https://searchfox.org/mozilla-central/rev/7fba7adfcd695343236de0c12e8d384c9b7cd237/third_party/webkit/PerformanceTests/SunSpider/sunspider-0.9.1/crypto-aes.js#359

Afaik, and please challenge me on that, the steps to Binary Viewer output in its many variations are the following:

  1. Take the utf-8 string from the message and convert it back to a binary representation, Uint8Array (see WAMP link)
  2. Take string or binary data to generate various different formats
    1. Hex: Iterate over binary to show binary[i].toString(16) in a 8 column hex table, aligned with utf-8 counterparts
    2. Base64: Ignore binary data, show btoa(string)
    3. UTF-8: Ignore binary data, just show the original utf-8 string
Flags: needinfo?(hkirschner)

Hi honza i want to work on this one, some points

i. I think we should add an extra property to the payload which will be utf8 representation of the payload. this property will be added in websocketFront(https://searchfox.org/mozilla-central/source/devtools/client/fronts/websocket.js#96) this will allow us to display strings correctly for both binary and text payload. The function for calculating utfString will be by utilizing TextDecorator, (we should not convert the payload to utf8 string because for binary payload it is much easier to work with the payload we have)

ii.We should make a new component for showing binary data which go sidebyside with RawData.js and will be rendered if data is binary we could check that by opcode, the component will display Hex,Base64 and UTF-8 representation of data , this will be quite similar to what google chorme have and we could use the harald's idea to implement (comment #19),

Flags: needinfo?(odvarko)

Pranav, thanks for the interest! But, this bug is supposed to be part of GSoC 2020 so, it needs to wait yet (it'd be opened for contribution in case it isn't done during the internship).

Flags: needinfo?(odvarko)
Whiteboard: gsoc-2020
Attached image image.png

Just attaching a screenshot of standard binary viewer.

Honza

Attachment #9158549 - Attachment filename: image.png → Binary Viewer
Severity: normal → S3
Attached image websocket-hex-view.png

Hi, I recently ran into situations where this could've been handy, and hacked up a hex viewer similar to Chrome's that I could try to land. Is this open to contribution now, and if so, would something that looks like the attached image be OK?

Flags: needinfo?(odvarko)

This looks awesome!

I am going to add [devtools-triage] keyword to this, so it goes through regular team triage where we discuss next steps.
If you have a patch we could try please attach it to this bug.

The screenshot looks great. I am curious about the performance. How it works for bigger binary responses?
Is your solution virtualized (i.e. only rendering what's visible)?

Thank you for the help!

Flags: needinfo?(odvarko)
Whiteboard: gsoc-2020 → gsoc-2020, [devtools-triage]
Assignee: nobody → krishna.ravi732
Status: NEW → ASSIGNED

If you have a patch we could try please attach it to this bug.

I have attached a phabricator request, apologies if you meant literally attach a diff file. The viewer I'd hacked up was basically just a simple/crude text area displaying a formatted string that looks like a hex display.

Is your solution virtualized (i.e. only rendering what's visible)?

No, even if text area only renders what's visible, it's not fully virtualized. As, what I'd hacked up ends up generating a hex display formatted string for the entire payload regardless of what's currently visible. Feel free to let me know if I should look into virtualizing the view for this to land.

I am curious about the performance. How it works for bigger binary responses?

Depending on what you mean by bigger, it could be fine or bad :D
I tested a binary frame against the default devtools.netmonitor.msg.messageDataLimit (100 KB iirc) and nothing was noticeably different/slow/bad.

After raising the pref limit to avoid truncation, I also tested a ~111 MB binary frame. All of Firefox crashed when trying to use the hex viewer. However, I also tested it on the raw payload viewer on bookmarks/central and all of Firefox was also crashing there. So this could be an existing issue or my configuration/setup is bust. If someone has any pointers on things I can be looking at (eg. a log file somewhere?) to confirm that this is in fact a bug, I can try them after work and file a separate bug if necessary.

Out of curiosity, I also checked the same frame in Chrome and it was crashing there too (however, only the DevTools window was crashing).

Flags: needinfo?(odvarko)

Discussed this during triage, given the 100kB limitation by default, plus the fact that the current implementation only renders one textarea element, going for a first version without virtualization sounds good.

Maybe we can restrict this to websocket, this way packets are expected to be relatively small.

In case the user overrides the preference, we could still disable the viewer if the size of the packet is too big.

Flags: needinfo?(odvarko)

Discussed this during triage, given the 100kB limitation by default, plus the fact that the current implementation only renders one textarea element, going for a first version without virtualization sounds good.

Awesome, the patch I've posted is ready for review and continues with the one textarea approach for now. It depends on a patch that adds base64/hex copy options for binary websocket frames, since that patch has some test changes and constants that are being used in this patch. Feel free to let me know if it's easier to review without the dependency and I'll split it out.

Maybe we can restrict this to websocket, this way packets are expected to be relatively small.

I've restricted the viewer to only be shown for not parsed binary websocket frames.

In case the user overrides the preference, we could still disable the viewer if the size of the packet is too big.

I was unsure of what limit to use, so if you have any suggestions I can add it in to the patch.

Separately, I also looked into virtualizing it for fun and it shouldn't be too hard to do given that there is a VirtualizedTree which basically worked out of the box. Though it might be better if VirtualizedTree is first changed or another similar/simpler component made to leverage knowing the total number of elements ahead of time. As VirtualizedTree currently is, a new array of size bytes.length / BYTES_PER_ROW would need to be constructed in getRoots which wastes time and space. It would be easier if we could just pass a length and the prop functions instead operated on indices up to the length instead of actual items.

(In reply to Krishna Ravishankar from comment #29)

Discussed this during triage, given the 100kB limitation by default, plus the fact that the current implementation only renders one textarea element, going for a first version without virtualization sounds good.

Awesome, the patch I've posted is ready for review and continues with the one textarea approach for now. It depends on a patch that adds base64/hex copy options for binary websocket frames, since that patch has some test changes and constants that are being used in this patch. Feel free to let me know if it's easier to review without the dependency and I'll split it out.

Maybe we can restrict this to websocket, this way packets are expected to be relatively small.

I've restricted the viewer to only be shown for not parsed binary websocket frames.

In case the user overrides the preference, we could still disable the viewer if the size of the packet is too big.

I was unsure of what limit to use, so if you have any suggestions I can add it in to the patch.

Separately, I also looked into virtualizing it for fun and it shouldn't be too hard to do given that there is a VirtualizedTree which basically worked out of the box. Though it might be better if VirtualizedTree is first changed or another similar/simpler component made to leverage knowing the total number of elements ahead of time. As VirtualizedTree currently is, a new array of size bytes.length / BYTES_PER_ROW would need to be constructed in getRoots which wastes time and space. It would be easier if we could just pass a length and the prop functions instead operated on indices up to the length instead of actual items.

Thanks for the investigation!

I was unsure of what limit to use, so if you have any suggestions I can add it in to the patch.

It's something we can handle in a follow up bug, no need to implement that here.

Separately, I also looked into virtualizing it for fun

It would be amazing to try this out and see if it improves performance compared to the textarea approach.

Flags: needinfo?(krishna.ravi732)
Whiteboard: gsoc-2020, [devtools-triage] → gsoc-2020

It would be amazing to try this out and see if it improves performance compared to the textarea approach.

I've updated the attached phabricator patch to virtualize instead.

Depending on which performance metrics are considered, it is definitely an improvement over the textarea approach. The initial draw is significantly faster with virtualization for bigger payloads and I can load payload such as the earlier mentioned 110MB frame on the hex viewer. For sufficiently sized payloads, scrolling also seems to be fine, if it has gotten slower I wasn't able to notice it. I haven't profiled memory, but having an additional array linearly proportional to the size of the payload with nothing but indices does feel wasteful. Though, maybe this is one of those things that can be fixed in a follow up if it ever becomes a problem.

Another problem I noticed is that the scrollbar seems to break on big payloads (eg. ~110 MB payload). When the height of the invisible presentation div of the VirtualizedTree is sufficiently large (eg. 124457000px) , the maxpos of the vertical scrollbar element ends up pretty low (eg. 31; as if some calculation result overflowed or such?). I am not sure if I am misusing VirtualizedTree, or if I wrote some horrible CSS/code, or there's some browser bug.

Let me know what you think about virtualizing the view this way and which approach would be better to continue with to land a patch.

Flags: needinfo?(krishna.ravi732) → needinfo?(jdescottes)

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: krishna.ravi732 → nobody
Status: ASSIGNED → NEW
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: