Unicode characters in URLs are not correctly displayed

NEW
Unassigned

Status

()

Firefox
Developer Tools: Netmonitor
P2
normal
2 years ago
a year ago

People

(Reporter: sebo, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(1 attachment)

(Reporter)

Description

2 years ago
+++ This bug was initially created as a clone of Bug #987069 +++

If URLs contain non-ASCII characters, they are not correctly Unicode encoded within the Network panel.

STR:
1. Open a new tab
2. Open the Network panel (Ctrl+Shift+Q)
3. Go to http://шомбург.рф/

=> The Network panel displays http://H><1C@3.@D/ as URL for the requests instead of http://шомбург.рф/.

It looks like it is not a Necko bug, because Firebug displays Punycode.

Sebastian
This string seems to come from this code in netmonitor-vew.js:

  NetworkHelper.convertToUnicode(unescape(aData.url))

which is calling a function in network-helper.js.

This function takes the given string and hands it off to nsIScriptableUnicodeConverter, with a charset of UTF-8 in this case.  What this last will do is treat each char of the JSString as a single byte (dropping its high byte) and then treat those bytes as a UTF-8 string.

Now the actual string that comes along in aData.url in this case is "http://шомбург.рф/" which is already Unicode and in particular is decidedly NOT byte-inflated UTF-8.  So treating it as byte-inflated UTF-8 will cause obvious problems.

Victor, you seem to have blame for this code.  Where is that string coming from?
Flags: needinfo?(vporof)
I don't particularly remember the code paths in there, but I believe the url is handed over by the webconsole backend actor, during a "networkEvent" event emitted by the client. If it's always the case that it is already unicode, then that conversion is not necessary. I don't see why it would be unicode in some cases and not in others.
Flags: needinfo?(vporof)
I tried to find where the networkEvent events are actually created, but I can't find it...  nor can I find any documentation for the expected format of the packets going back and forth between server and client for the networkEvent case.  :(

Is there someone who knows the devtools better than I do who can take this on?
(In reply to Boris Zbarsky [:bz] from comment #3)
> I tried to find where the networkEvent events are actually created, but I
> can't find it...  nor can I find any documentation for the expected format
> of the packets going back and forth between server and client for the
> networkEvent case.  :(

I believe the url property is taken from the HTTP channel[1].  Does that mean it should be be safe to display directly?  (Seems like we could just try it an see...)

[1]: https://dxr.mozilla.org/mozilla-central/rev/4ea7408b3eef059aa248f4b00328f8fdb4475112/devtools/shared/webconsole/network-monitor.js#863
Created attachment 8726282 [details] [diff] [review]
bug1250397-unicode.patch

(In reply to J. Ryan Stinnett [:jryans] (use ni?) from comment #4)
> I believe the url property is taken from the HTTP channel[1].  Does that
> mean it should be be safe to display directly?  (Seems like we could just
> try it an see...)

Yes, exactly. I looked at the packet using the RDPi [1] and its data looks just fine.

Here is a quick patch, not sure if that's enough, but the primary issue is fixed and I am curious what the try says.

Try
https://treeherder.mozilla.org/#/jobs?repo=try&revision=678b57aa5ce6

Honza
> Does that mean it should be be safe to display directly?

It... depends.  The .spec of a URI, in its ACString form, may contain unescaped UTF-8 characters (and clearly documents that).  Or it may contain escaped UTF-8.  It's actually totally up to whoever created the URI.

Now when you do .spec in JS, we will do UTF-8 to UTF-16 conversion, so if the URI had unescaped UTF-8 you will now have nontrivial (as in, high bytes not 0) UTF-16.

If you want to make sure things really get unescaped consistently here (e.g. if someone happens to use escaped UTF-8, not raw UTF-8, in a file path), you probably want nsITextToSubURI.unEscapeURIForUI.
Priority: -- → P2
You need to log in before you can comment on or make changes to this bug.