Last Comment Bug 666448 - Remove confusing escape() extra-argument extension behavior
: Remove confusing escape() extra-argument extension behavior
Status: RESOLVED FIXED
[fixed-in-tracemonkey]
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: x86 Mac OS X
: -- normal (vote)
: ---
Assigned To: Paul Biggar
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-06-22 17:12 PDT by Vadim Geshel
Modified: 2011-07-05 18:17 PDT (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Remove 2nd param (8.08 KB, patch)
2011-06-24 15:46 PDT, Paul Biggar
jwalden+bmo: review-
Details | Diff | Splinter Review
Handle Waldo's comments (10.34 KB, patch)
2011-07-05 11:26 PDT, Paul Biggar
jwalden+bmo: review+
Details | Diff | Splinter Review

Description Vadim Geshel 2011-06-22 17:12:54 PDT
User-Agent:       Mozilla/5.0 (Macintosh; Intel Mac OS X 10_6_7) AppleWebKit/535.1 (KHTML, like Gecko) Chrome/13.0.782.24 Safari/535.1
Build Identifier: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:5.0) Gecko/20100101 Firefox/5.0

["a"].map(escape)[0] === ["a"].map(function(s) {return escape(s);})[0]

=> false in FF 4 and 5, true in Chrome and Safari

Also,

["a", "b"].map(escape)

=> ["%61", "b"]

["a", "b"].map(function (s) {return escape(s); })

=> ["a", "b"]

Reproducible: Always

Steps to Reproduce:
The behavior above can be reproduced in a script on a page as well as in Web Console or Firebug.
Comment 1 Jeff Walden [:Waldo] (remove +bmo to email) 2011-06-22 17:40:47 PDT
Pretty sure this is your problem, transferred to escape from parseInt:

http://www.wirfs-brock.com/allen/posts/166
Comment 2 Jeff Walden [:Waldo] (remove +bmo to email) 2011-06-22 17:41:52 PDT
This might be due to a Mozilla-specific extension to escape() semantics.  Perhaps we should remove those, but I wouldn't lay it at this bug's feet.
Comment 3 Boris Zbarsky [:bz] 2011-06-22 20:41:44 PDT
Yep, ECMA-262 escape() takes only one argument, while ours has a second optional argument: the escape mask.
Comment 4 Brendan Eich [:brendan] 2011-06-23 00:28:25 PDT
Let's reopen and morph this bug into one about removing that ancient and non-standard Netscape-2-era extra parameter. It was a reflection of the underlying "netlib" utility function. ES3 and 5 advise against extending built-ins but we have a "pre-existing extension". Does anything need it?

/be
Comment 5 Jeff Walden [:Waldo] (remove +bmo to email) 2011-06-23 17:35:38 PDT
*shrug* Okay, then.  Paul, you want to take this maybe?
Comment 6 Jeff Walden [:Waldo] (remove +bmo to email) 2011-06-23 17:38:26 PDT
I remember investigating this extension behavior at one time, I think because Gavin asked about its future-looking reliability.  I think there were no hits in Mozilla code for using this, and I think I dissuaded him from using it because it seemed of dubious value to merit preservation if it caused any issues (like here).  Could be wrong, of course, but I'm pretty sure it should be relatively easy to deal with any issues that might pop up in one way or another.
Comment 7 Paul Biggar 2011-06-24 15:46:21 PDT
Created attachment 541839 [details] [diff] [review]
Remove 2nd param

Remove the mask parameter, and everything that relies on it.

Doesn't look like any mozilla code uses it, but checking here: http://tbpl.mozilla.org/?tree=Try&rev=8b08da141286
Comment 8 Jeff Walden [:Waldo] (remove +bmo to email) 2011-06-27 12:29:35 PDT
Comment on attachment 541839 [details] [diff] [review]
Remove 2nd param

Review of attachment 541839 [details] [diff] [review]:
-----------------------------------------------------------------

There's nothing wrong with this patch, to be sure, but it seems like a few changes would make it a bit nicer still.  I'm going to r- just because there are enough of them, but this is basically pretty good.

::: js/src/jsstr.cpp
@@ +171,5 @@
>   *                             converts spaces to plus and plus to %20
>   *      Bit 2 ...       path            -- as xalphas but doesn't escape '/'
>   */
>      /*   0 1 2 3 4 5 6 7 8 9 A B C D E F */
>      {    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,       /* 0x */

The comment above appears mostly obsolete.  All we need care about is whether the character's one of the 69 listed in B.2.1.  The extra bits aren't needed any more and now serve only to confuse.

The spec specifically calls out that there are 69 special characters with special handling.  Add some debug-only code that asserts that there are exactly 69 1s in this table, with a comment noting that step 7 specifically identifies there being that many.

Tables like this are always a mess to read (too bad we don't have C99 named initializers, or whatever |.'A' = 1| is called), but these changes should help.

@@ +175,5 @@
>      {    0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,       /* 0x */
>           0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,       /* 1x */
> +         0,0,0,0,0,0,0,0,0,0,1,1,0,1,1,1,       /* 2x   !"#$%&'()*+,-./  */
> +         1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,       /* 3x  0123456789:;<=>?  */
> +         1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,       /* 1x  @ABCDEFGHIJKLMNO  */

I think these 1x,2x,3x, etc. bits were supposed to indicate hexadecimal character codes.  It's a bit awkward how it was done and not entirely intuitively readable to me.  I'd suggest using "0x0_", "0x1_", etc. for these to be clearer.  Definitely you don't want to be changing these hints to be wrong, tho.

@@ +183,5 @@
>           0, };
>  
>  /* This matches the ECMA escape set when mask is 7 (default.) */
>  
> +#define IS_OK(C) ((bool)(urlCharType[((uint8) (C))]))

The comment above this is now unnecessary.  The macro's not really needed, either.  I'd just name the table shouldPassThrough or something and then use the element value directly as the check.  And for best clarity, I'd move the table into js_str_escape to make it absolutely clear that its relevance is limited to that one method.

@@ +185,5 @@
>  /* This matches the ECMA escape set when mask is 7 (default.) */
>  
> +#define IS_OK(C) ((bool)(urlCharType[((uint8) (C))]))
> +
> +/* See ECMA-262 Edition 5 B.2.1 */

Minor nit, but /* ES5 B.2.1 */ is probably the prevalent style (by analogy to non-appendix references, at least).

@@ +203,5 @@
>      /* Take a first pass and see how big the result string will need to be. */
>      size_t newlength = length;
>      for (size_t i = 0; i < length; i++) {
>          jschar ch;
> +        if ((ch = chars[i]) < 128 && IS_OK(ch))

While you're here, mind moving the = chars[i] up a line, into the declaration?

@@ +210,1 @@
>              newlength += 2; /* The character will be encoded as %XX */

This if, now that it's not complicated, is probably better expressed as:

/* The character will be encoded as %XX or %uXXXX. */
newlength += (ch < 256) ? 2 : 5;

@@ +232,5 @@
>          return JS_FALSE;
>      size_t i, ni;
>      for (i = 0, ni = 0; i < length; i++) {
>          jschar ch;
> +        if ((ch = chars[i]) < 128 && IS_OK(ch)) {

Again move assignment up to declaration.
Comment 9 Paul Biggar 2011-07-05 11:26:52 PDT
Created attachment 544004 [details] [diff] [review]
Handle Waldo's comments

I stripped a lot of the comments, since it was mostly history or verbosity, and neither seems to be useful anymore.
Comment 10 Jeff Walden [:Waldo] (remove +bmo to email) 2011-07-05 14:10:02 PDT
Comment on attachment 544004 [details] [diff] [review]
Handle Waldo's comments

Review of attachment 544004 [details] [diff] [review]:
-----------------------------------------------------------------

If you want to rename js_str_escape to just str_escape, and make it not an extern exposed in jsstr.h, feel free.  There's no reason any more for it to be exposed (dunno what the reason was originally, to be honest).  I imagine someone will rename it at some point if it doesn't happen here.

A part of me wonders if it might not be simpler to use a StringBuffer to build up the string, rather than do the manual memory allocation thing and all that, but 1) that's more change than is really necessary, 2) it might be slower (or might not), and 3) who really cares.  :-)

::: js/src/jsstr.cpp
@@ +152,3 @@
>   */
>  
> +/* See ES5 B.2.1 */

Just /* ES5 B.2.1 */ (no "See", that is), please.

@@ +251,5 @@
>  {
>      return js_str_escape(cx, argc, vp, vp);
>  }
>  
>  /* See ECMA-262 Edition 3 B.2.2 */

Might as well change this to /* ES5 B.2.2 */ while you're making changes here.

@@ +252,5 @@
>      return js_str_escape(cx, argc, vp, vp);
>  }
>  
>  /* See ECMA-262 Edition 3 B.2.2 */
> +/* Incomplete escapes are interpreted as literal characters by unescape. */

This comment should go somewhere inside the method, not at its top level.  Although, I think issues like the one in this recent es5-discuss thread argue for just not having it at all -- the algorithm defines the semantics:

https://mail.mozilla.org/pipermail/es5-discuss/2011-June/004001.html
Comment 11 Chris Leary [:cdleary] (not checking bugmail) 2011-07-05 18:06:22 PDT
cdleary-bot mozilla-central merge info:
http://hg.mozilla.org/mozilla-central/rev/cba007ad1747
Note: not marking as fixed because fixed-in-tracemonkey is not present on the whiteboard.
Comment 12 Paul Biggar 2011-07-05 18:17:01 PDT
(In reply to comment #11)
> Note: not marking as fixed because fixed-in-tracemonkey is not present on
> the whiteboard.

Whoops!

http://hg.mozilla.org/tracemonkey/rev/cba007ad1747

Note You need to log in before you can comment on or make changes to this bug.