Closed
Bug 569154
Opened 15 years ago
Closed 15 years ago
escape encodes non-breaking space to 0xA0 for UTF-8 strings
Categories
(Core :: JavaScript Engine, defect)
RESOLVED
INVALID
People
(Reporter: ehsan.akhgari, Unassigned)
Details
I'm not 100% sure how this is supposed to work, but the result of escape in the test case in the URL field is 0xA0, which is not a valid UTF-8 byte sequence on its own.
Comment 1•15 years ago
Seems like all entities that match an ISO-8859-1 character will be encoded that way.
© is encoded as %A9, while characters that are not included in ISO-8859-1 are encoded in the %uXXXX Unicode form.
€ becomes %u20AC for example. € would be %A4 in ISO-8859-15, so I believe this bug is only present with 8859-1 chars.
Comment 2•15 years ago
> I'm not 100% sure how this is supposed to work
ECMA-262 section B.2.1 (so not part of the standard, but a "suggestion for how to do it interoperably") says:
6. Get the character (represented as a 16-bit unsigned integer) at position k
within Result(1).
7. If Result(6) is one of the 69 nonblank characters
“ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789@*_+-./”
then go to step 13.
8. If Result(6), is less than 256, go to step 11.
...
11. Let S be a String containing three characters “%xy” where xy are two
hexadecimal digits encoding the value of Result(6).
So the behavior is correct as far as that goes. Note that nowhere in that is the document encoding used. We _used_ to consider the document encoding, then stopped doing that; see bug 44272 (and especially bug 44272 comment 37).
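For illustration, the quoted steps can be sketched roughly as follows. This is only a sketch, not the engine's implementation: `escapeSketch` and `UNESCAPED` are hypothetical names, and the `%uXXXX` branch stands in for the steps elided above, based on the `%u20AC` form reported in comment 1.

```javascript
// Hypothetical helper, not a browser API: mirrors the quoted B.2.1 steps.
const UNESCAPED =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789@*_+-./";

function escapeSketch(str) {
  let out = "";
  for (let k = 0; k < str.length; k++) {
    // Step 6: take the 16-bit code unit at position k.
    const unit = str.charCodeAt(k);
    if (UNESCAPED.includes(str[k])) {
      // Step 7: one of the 69 listed characters is copied through unchanged.
      out += str[k];
    } else if (unit < 256) {
      // Steps 8 and 11: values below 256 become "%xy" (two hex digits).
      out += "%" + unit.toString(16).toUpperCase().padStart(2, "0");
    } else {
      // Assumed from the elided steps: larger values become "%uwxyz".
      out += "%u" + unit.toString(16).toUpperCase().padStart(4, "0");
    }
  }
  return out;
}
```

Note that the document encoding never enters the picture: the only input is the sequence of 16-bit values in the string.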
So the point is that unescaping the results of escape() actually gives UTF-16 codepoints, not bytes. If you want something that will unescape to UTF-8 bytes, you probably want encodeURIComponent, right?
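A quick comparison, as a sketch using the characters mentioned in this bug (the variable names are just for illustration):

```javascript
// Non-breaking space (U+00A0), copyright sign (U+00A9), euro sign (U+20AC).
const samples = ["\u00A0", "\u00A9", "\u20AC"];

// escape() encodes 16-bit values: below 256 as %XX, otherwise as %uXXXX.
const escaped = samples.map((s) => escape(s));

// encodeURIComponent() percent-encodes the UTF-8 bytes of each character.
const utf8Encoded = samples.map((s) => encodeURIComponent(s));

console.log(escaped);     // [ '%A0', '%A9', '%u20AC' ]
console.log(utf8Encoded); // [ '%C2%A0', '%C2%A9', '%E2%82%AC' ]
```

The two schemes round-trip only with their own counterparts: unescape() for escape(), and decodeURIComponent() for encodeURIComponent().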
Comment 3•15 years ago
(In reply to comment #2)
> So the point is that unescaping the results of escape() actually gives UTF-16
> codepoints, not bytes. If you want something that will unescape to UTF-8
> bytes, you probably want encodeURIComponent, right?
Thanks A LOT for that suggestion. I usually use that in my own scripts, it just didn't occur to me in this particular case.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → INVALID