Closed Bug 352044 Opened 18 years ago Closed 16 years ago

issues with Unicode escape sequences in JavaScript source code

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla1.9beta5

People

(Reporter: martin.honnen, Assigned: Waldo)

References

Details

Attachments

(2 files)

In <http://groups.google.com/group/mozilla.dev.tech.js-engine/browse_frm/thread/18df5163131f4b6b/d9cdf39351df3939?hl=en#d9cdf39351df3939>
recently some issues with Unicode escape sequences in JavaScript source code (outside of string literals) were discussed.
Based on that I file this bug to point out the problems found with Spidermonkey.

Spidermonkey (tested in shell built/pulled from Mozilla trunk CVS) parses the following statement
  var v\u0020 = 1;
(where \u0020 escapes the space ' ' character) without giving a syntax error and creates a variable with name 'v'.

I think a syntax error should be raised as the ECMAScript edition 3 specification allows Unicode escape sequences only in identifiers, string literals and regular expression literals but for an identifier such an escape sequence is not allowed to add a character disallowed by the grammar for identifiers (like the space character). So unless Spidermonkey intentionally drops the escaped space \u0020 to correct what it finds in the source code and such error correction is allowed I think giving a syntax error is what should happen.

Complete test case is <https://bugzilla.mozilla.org/attachment.cgi?id=237612>, created on bug <https://bugzilla.mozilla.org/show_bug.cgi?id=352042> for Rhino's different but also incorrect handling of the same code.


Second issue found in the newsgroup thread on Spidermonkey's parsing of Unicode escape sequences is that this program (where \u002b escapes the '+' character outside of any identifier, string literal or regular expression literal)

var i = 1;
i \u002b= 1; 
print(i);

is parsed without errors. Rhino gives a syntax error which I think is correct as that Unicode escape sequence is not allowed outside of an identifier, string literal or regular expression literal and += is a punctuator and not an identifier.

For the time being I file both these oddities in Spidermonkey's handling of Unicode escape sequences as one bug, assuming they are related. If they are not and you need different bugs for each issue let me know.
Assignee: general → jwalden+bmo
Blocks: acid3
Target Milestone: --- → mozilla1.9beta5
Attached patch PatchSplinter Review
GetUnicodeEscape in the if clobbered |c|, causing \u002b to be interpreted as +.  Same thing again for |x\u0020 = 3|, except inside the loop.  

The exact character indexes at which errors aren't always accurate, but it's hard to get right and not really worth caring about.  Something like ts->ungetpos+=5 is more scary than I want to have to think about, and at a glance it appears to not work if we failed to peek ahead 5 characters for the uXXXX.

js> 1 \u002b 1             // good index
typein:1: SyntaxError: illegal character:
typein:1: 1 \u002b 1             // good index
typein:1: ..^
js> var x\u0020 = 3        // should be at \, but hard to do
typein:2: SyntaxError: illegal character:
typein:2: var x\u0020 = 3        // should be at \, but hard to do
typein:2: ..........^
js> var x = 2;
js> x\u002b 3              // should be at \, but hard to do
typein:4: SyntaxError: illegal character:
typein:4: x\u002b 3              // should be at \, but hard to do
typein:4: ......^
js> x\u002b= 3             // should be at \, but hard to do
typein:5: SyntaxError: illegal character:
typein:5: x\u002b= 3             // should be at \, but hard to do
typein:5: ......^

I had no idea these were valid, but spec-reading seems to say they're IdentifierName but not ReservedWord due to the escape.  WebKit and Opera agree, so even if this were a spec bug it should remain as it is:

js> var x = {def\u0061ult: 17}; 
js> x.default
17
js> var def\u0061ult = 32
js> this.default
32
Attachment #307900 - Flags: review?(mrbkap)
Attachment #307900 - Flags: review?(mrbkap) → review+
Comment on attachment 307900 [details] [diff] [review]
Patch

Simple change, picks up an easy acid3 point, a syntax error in other browsers (not to mention complete idiocy too) so no compatibility concerns to do it...
Attachment #307900 - Flags: approval1.9?
Comment on attachment 307900 [details] [diff] [review]
Patch

a1.9+=damons
Attachment #307900 - Flags: approval1.9? → approval1.9+
Status: NEW → RESOLVED
Closed: 16 years ago
OS: Windows XP → All
Hardware: PC → All
Resolution: --- → FIXED
/cvsroot/mozilla/js/tests/ecma_3/Unicode/regress-352044-01.js,v  <--  regress-352044-01.js
initial revision: 1.1

/cvsroot/mozilla/js/tests/ecma_3/Unicode/regress-352044-02-n.js,v  <--  regress-352044-02-n.js
initial revision: 1.1

/cvsroot/mozilla/js/tests/public-failures.txt,v  <--  public-failures.txt
new revision: 1.67; previous revision: 1.66
Flags: in-testsuite+
Flags: in-litmus-
v 1.9.0
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: