Remove confusing escape() extra-argument extension behavior




8 years ago
8 years ago


(Reporter: vgeshel, Assigned: paul.biggar)


Firefox Tracking Flags

(Not tracked)


(Whiteboard: [fixed-in-tracemonkey])


(2 attachments)



8 years ago
User-Agent:       Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_7) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.24 Safari/535.1
Build Identifier: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0) Gecko/20100101 Firefox/5.0

["a"].map(escape)[0] === ["a"].map(function(s) {return escape(s);})[0]

=> false in FF 4 and 5, true in Chrome and Safari


["a", "b"].map(escape)

=> ["%61", "b"]

["a", "b"].map(function (s) {return escape(s); })

=> ["a", "b"]

Reproducible: Always

Steps to Reproduce:
The behavior above can be reproduced in a script on a page as well as in Web Console or Firebug.
Pretty sure this is your problem, transferred to escape from parseInt:
Last Resolved: 8 years ago
Resolution: --- → INVALID
This might be due to a Mozilla-specific extension to escape() semantics.  Perhaps we should remove those, but I wouldn't lay it at this bug's feet.
Yep, ECMA-262 escape() takes only one argument, while ours has a second optional argument: the escape mask.
Let's reopen and morph this bug into one about removing that ancient and non-standard Netscape-2-era extra parameter. It was a reflection of the underlying "netlib" utility function. ES3 and 5 advise against extending built-ins but we have a "pre-existing extension". Does anything need it?

Ever confirmed: true
Resolution: INVALID → ---
*shrug* Okay, then.  Paul, you want to take this maybe?
Assignee: general → pbiggar
Summary: and escape together act strange → Remove confusing escape() extra-argument extension behavior
I remember investigating this extension behavior at one time, I think because Gavin asked about its future-looking reliability.  I think there were no hits in Mozilla code for using this, and I think I dissuaded him from using it because it seemed of dubious value to merit preservation if it caused any issues (like here).  Could be wrong, of course, but I'm pretty sure it should be relatively easy to deal with any issues that might pop up in one way or another.

Comment 7

8 years ago
Remove the mask parameter, and everything that relies on it.

Doesn't look like any mozilla code uses it, but checking here:
Attachment #541839 - Flags: review?(jwalden+bmo)
Comment on attachment 541839 [details] [diff] [review]
Remove 2nd param

Review of attachment 541839 [details] [diff] [review]:

There's nothing wrong with this patch, to be sure, but it seems like a few changes would make it a bit nicer still.  I'm going to r- just because there are enough of them, but this is basically pretty good.

::: js/src/jsstr.cpp
@@ +171,5 @@
>   *                             converts spaces to plus and plus to %20
>   *      Bit 2 ...       path            -- as xalphas but doesn't escape '/'
>   */
>      /*   0 1 2 3 4 5 6 7 8 9 A B C D E F */
>      {    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,       /* 0x */

The comment above appears mostly obsolete.  All we need care about is whether the character's one of the 69 listed in B.2.1.  The extra bits aren't needed any more and now serve only to confuse.

The spec specifically calls out that there are 69 special characters with special handling.  Add some debug-only code that asserts that there are exactly 69 1s in this table, with a comment noting that step 7 specifically identifies there being that many.

Tables like this are always a mess to read (too bad we don't have C99 named initializers, or whatever |.'A' = 1| is called), but these changes should help.

@@ +175,5 @@
>      {    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,       /* 0x */
>           0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,       /* 1x */
> +         0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,1,       /* 2x   !"#$%&'()*+,-./  */
> +         1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,       /* 3x  0123456789:;<=>?  */
> +         1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,       /* 1x  @ABCDEFGHIJKLMNO  */

I think these 1x,2x,3x, etc. bits were supposed to indicate hexadecimal character codes.  It's a bit awkward how it was done and not entirely intuitively readable to me.  I'd suggest using "0x0_", "0x1_", etc. for these to be clearer.  Definitely you don't want to be changing these hints to be wrong, tho.

@@ +183,5 @@
>           0, };
>  /* This matches the ECMA escape set when mask is 7 (default.) */
> +#define IS_OK(C) ((bool)(urlCharType[((uint8) (C))]))

The comment above this is now unnecessary.  The macro's not really needed, either.  I'd just name the table shouldPassThrough or something and then use the element value directly as the check.  And for best clarity, I'd move the table into js_str_escape to make it absolutely clear that its relevance is limited to that one method.

@@ +185,5 @@
>  /* This matches the ECMA escape set when mask is 7 (default.) */
> +#define IS_OK(C) ((bool)(urlCharType[((uint8) (C))]))
> +
> +/* See ECMA-262 Edition 5 B.2.1 */

Minor nit, but /* ES5 B.2.1 */ is probably the prevalent style (by analogy to non-appendix references, at least).

@@ +203,5 @@
>      /* Take a first pass and see how big the result string will need to be. */
>      size_t newlength = length;
>      for (size_t i = 0; i < length; i++) {
>          jschar ch;
> +        if ((ch = chars[i]) < 128 && IS_OK(ch))

While you're here, mind moving the = chars[i] up a line, into the declaration?

@@ +210,1 @@
>              newlength += 2; /* The character will be encoded as %XX */

This if, now that it's not complicated, is probably better expressed as:

/* The character will be encoded as %XX or %uXXXX. */
newlength += (ch < 256) ? 2 : 5;

@@ +232,5 @@
>          return JS_FALSE;
>      size_t i, ni;
>      for (i = 0, ni = 0; i < length; i++) {
>          jschar ch;
> +        if ((ch = chars[i]) < 128 && IS_OK(ch)) {

Again move assignment up to declaration.
Attachment #541839 - Flags: review?(jwalden+bmo) → review-

Comment 9

8 years ago
I stripped a lot of the comments, since it was mostly history or verbosity, and neither seems to be useful anymore.


8 years ago
Attachment #544004 - Flags: review?(jwalden+bmo)
Comment on attachment 544004 [details] [diff] [review]
Handle Waldo's comments

Review of attachment 544004 [details] [diff] [review]:

If you want to rename js_str_escape to just str_escape, and make it not an extern exposed in jsstr.h, feel free.  There's no reason any more for it to be exposed (dunno what the reason was originally, to be honest).  I imagine someone will rename it at some point if it doesn't happen here.

A part of me wonders if it might not be simpler to use a StringBuffer to build up the string, rather than do the manual memory allocation thing and all that, but 1) that's more change than is really necessary, 2) it might be slower (or might not), and 3) who really cares.  :-)

::: js/src/jsstr.cpp
@@ +152,3 @@
>   */
> +/* See ES5 B.2.1 */

Just /* ES5 B.2.1 */ (no "See", that is), please.

@@ +251,5 @@
>  {
>      return js_str_escape(cx, argc, vp, vp);
>  }
>  /* See ECMA-262 Edition 3 B.2.2 */

Might as well change this to /* ES5 B.2.2 */ while you're making changes here.

@@ +252,5 @@
>      return js_str_escape(cx, argc, vp, vp);
>  }
>  /* See ECMA-262 Edition 3 B.2.2 */
> +/* Incomplete escapes are interpreted as literal characters by unescape. */

This comment should go somewhere inside the method, not at its top level.  Although, I think issues like the one in this recent es5-discuss thread argue for just not having it at all -- the algorithm defines the semantics:
Attachment #544004 - Flags: review?(jwalden+bmo) → review+
cdleary-bot mozilla-central merge info:
Note: not marking as fixed because fixed-in-tracemonkey is not present on the whiteboard.

Comment 12

8 years ago
(In reply to comment #11)
> Note: not marking as fixed because fixed-in-tracemonkey is not present on
> the whiteboard.

Last Resolved: 8 years ago8 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-tracemonkey]
You need to log in before you can comment on or make changes to this bug.