toSource() method returns double byte function names incorrectly

VERIFIED FIXED in Future

Status

()

P3
normal
VERIFIED FIXED
18 years ago
14 years ago

People

(Reporter: amohr, Assigned: rogerl)

Tracking

({js1.5})

Trunk
Future
js1.5
Points:
---
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 4 obsolete attachments)

(Reporter)

Description

18 years ago
I created a function with a double byte name.

When evaluating the name of the double byte function, the source code of the 
function is returned with a mangled function name.   However, if the function 
contains a double byte body text, it is returned correctly.  

When looking at the string in the engine it seems as if the first byte of each 
double byte character is changed to zero.

This may be occuring in other platforms as well.

Comment 1

18 years ago
Alexander, would you be able to provide an example of the bug? 
Or attach a reduced testcase that illustrates the problem? 

Thanks -

Comment 2

18 years ago
Created attachment 18620 [details]
HTML attachment showing the problem

Comment 3

18 years ago
Created attachment 18623 [details]
Testcase for standalone JS Engine -

Comment 4

18 years ago
I see no error when I run the above testcase in the standalone JS shell.
Here is the output:


Setting obj = function ABC
obj.toSource()  =  function ABC(x, y, +) {print("nothing");}

Setting obj = function A¦
obj.toSource()  =  function A¦(x, y, +) {print("nothing");}



Notice the function name "function A¦" is displayed perfectly when 
obj.toSource() is applied. The correct non-ASCII character appears. 


However, when I run the HTML testcase in Mozilla, the function name
in obj.toSource() is not the same as the defined function name.
Using Mozilla trunk and branch binaries 20001031xx on WinNT and Linux.


Since the DOM escape function supersedes the JS Engine escape in the
browser, reassigning to DOM Level 0 for further triage - 


amohr@adobe.com, were you running your test in the standalone JS shell
or in the N6/Mozilla browser? If possible, could you attach your testcase
to this bug? Thank you - the more we can pin this down, the faster we'll
be able to resolve it. 
Assignee: rogerl → jst
Status: UNCONFIRMED → NEW
Component: Javascript Engine → DOM Level 0
Ever confirmed: true
OS: Windows 2000 → All
QA Contact: pschwartau → desale
Summary: JS Engine returns double byte function names incorrectly in toSource method → toSource() method returns double byte function names incorrectly
(Reporter)

Comment 5

18 years ago
Since the single byte execution seems to be working correctly (per Phil's test 
cases) this must be a unicode execution problem only.  In fact, in our product 
we only execute unicode scripts using the JS_EvaluateUCScript call.

Comment 6

18 years ago
Here is some data from Alex showing the problem:



Executing this:
 
033D9508  66 75 6E 63 74 69  functi
033D950E  6F 6E 20 82 D0 82  on $B$R!&(B
033D9514  E7 82 AA 82 C8 28  $Bmb%'$J(B(
033D951A  29 20 7B 20 76 61  ) { va
033D9520  72 20 61 20 3D 20  r a = 
033D9526  22 82 D0 82 E7 82  "$B$R$i!&(B
033D952C  AA 82 C8 22 3B 20  $B%'$J(B"; 
033D9532  7D 0D 0A 0D 0A 82  }....$B!&(B
033D9538  D0 82 E7 82 AA 82  $B%_$i$,!&(B
033D953E  C8 00
 

which gets transformed to Unicode:

 
0497E4F8  66 00 75 00 6E 00  f.u.n.
0497E4FE  63 00 74 00 69 00  c.t.i.
0497E504  6F 00 6E 00 20 00  o.n. .
0497E50A  72 30 89 30 4C 30  r0$B!&(BL0
0497E510  6A 30 28 00 29 00  j0(.).
0497E516  20 00 7B 00 20 00   .{. .
0497E51C  76 00 61 00 72 00  v.a.r.
0497E522  20 00 61 00 20 00   .a. .
0497E528  3D 00 20 00 22 00  =. .".
0497E52E  72 30 89 30 4C 30  r0$B!&(BL0
0497E534  6A 30 22 00 3B 00  j0".;.
0497E53A  20 00 7D 00 0D 00   .}...
0497E540  0A 00 0D 00 0A 00  ......
0497E546  72 30 89 30 4C 30  r0$B!&(BL0
0497E54C  6A 30 00 00
 

through:

 
JS_EvaluateUCScript(JSContext *cx, JSObject *obj,
      const jschar *chars, uintN length,
      const char *filename, uintN lineno,
      jsval *rval);

 
where:

 
- cx 0x03808a90
+ links {...}
 interpLevel 0
 version JSVERSION_DEFAULT
 jsop_eq 18 ''
 jsop_ne 19 ''
+ runtime 0x038063d8
+ stackPool {...}
+ fp 0x00000000
+ codePool {...}
+ notePool {...}
+ tempPool {...}
+ globalObject 0x0380a008
+ newborn 0x03808b20
+ regExpStatics {...}
+ sharpObjectMap {...}
 argumentFormatMap 0x00000000
+ lastMessage 0x00000000 ""
 branchCallback 0x00000000
 errorReporter 0x03c75070 iJSBranchCallBack(JSContext *, JSScript *)
 data 0x03c74fb1 iJSErrorReporter(JSContext *, const char *, JSErrorReport *)
+ dormantFrameChain 0x00000000
 throwing 0 ''
 exception 0
 options 58764952


chars
0497E4F8  66 00 75 00 6E 00  f.u.n.
0497E4FE  63 00 74 00 69 00  c.t.i.
0497E504  6F 00 6E 00 20 00  o.n. .
0497E50A  72 30 89 30 4C 30  r0$B!&(BL0
0497E510  6A 30 28 00 29 00  j0(.).
0497E516  20 00 7B 00 20 00   .{. .
0497E51C  76 00 61 00 72 00  v.a.r.
0497E522  20 00 61 00 20 00   .a. .
0497E528  3D 00 20 00 22 00  =. .".
0497E52E  72 30 89 30 4C 30  r0$B!&(BL0
0497E534  6A 30 22 00 3B 00  j0".;.
0497E53A  20 00 7D 00 0D 00   .}...
0497E540  0A 00 0D 00 0A 00  ......
0497E546  72 30 89 30 4C 30  r0$B!&(BL0
0497E54C  6A 30 00 00
 
length 43
 
+ filename "Console Exec"

lineno 1
then calling JS_ValueToString(rval) we get:

 
04962A38  0A 00 66 00 75 00  ..f.u.
04962A3E  6E 00 63 00 74 00  n.c.t.
04962A44  69 00 6F 00 6E 00  i.o.n.
04962A4A  20 00 72 00 89 00   .r.$B!&(B
04962A50  4C 00 6A 00 28 00  L.j.(.
04962A56  29 00 20 00 7B 00  ). .{.
04962A5C  0A 00 20 00 20 00  .. . .
04962A62  20 00 20 00 76 00   . .v.
04962A68  61 00 72 00 20 00  a.r. .
04962A6E  61 00 20 00 3D 00  a. .=.
04962A74  20 00 22 00 5C 00   .".\.
04962A7A  75 00 33 00 30 00  u.3.0.
04962A80  37 00 32 00 5C 00  7.2.\.
04962A86  75 00 33 00 30 00  u.3.0.
04962A8C  38 00 39 00 5C 00  8.9.\.
04962A92  75 00 33 00 30 00  u.3.0.
04962A98  34 00 43 00 5C 00  4.C.\.
04962A9E  75 00 33 00 30 00  u.3.0.
04962AA4  36 00 41 00 22 00  6.A.".
04962AAA  3B 00 0A 00 7D 00  ;...}.
04962AB0  0A 00 00
 

As you can see, the top bytes of the function name have been changed to "00"'s, 
while the var a values have been changed to \u representation

Comment 7

18 years ago
Reassigning back to JS Engine. Roger has pointed out to me that my approach
in the standalone JS shell is flawed, in that I am relying on the visual
representation of characters in DOS to decide whether two characters are equal.


However, DOS is not going to accurately distinguish characters with high-byte 
information; so you can't test accurately relying on visual cues alone - 
Assignee: jst → rogerl
Component: DOM Level 0 → Javascript Engine
QA Contact: desale → pschwartau
(Assignee)

Comment 8

18 years ago
Trying :

s = "function f\u02B2() { }"
eval(s)
e = f\u02B2.toSource()
print(e[10] == s[10]);

in the shell, gives false.
Status: NEW → ASSIGNED

Comment 9

18 years ago
Marking future.
Target Milestone: --- → Future
(Assignee)

Comment 10

17 years ago
Created attachment 91009 [details] [diff] [review]
Use QuoteString to encode function name

Is it ok to use QuoteString this way? (by supplying \0 as quote).
(Assignee)

Comment 11

17 years ago
cc'ing reviewers
Comment on attachment 91009 [details] [diff] [review]
Use QuoteString to encode function name

sr=brendan@mozilla.org

Thanks, this is fine.  So long as you know the string doesn't contain embedded
NULs, supplying '\0' as the quote char is ok.  Watch out for the tab hiding in
the return JS_FALSE line, tho:

+	    return JS_FALSE;

Expand that and sr=brendan@mozilla.org.

/be
Attachment #91009 - Flags: superreview+
Let's get this in soon, branch if possible, trunk for sure -- need an r=khanson
or shaver.

/be
Blocks: 149801
Keywords: js1.5, mozilla1.0.1, mozilla1.1
(Assignee)

Comment 14

17 years ago
Created attachment 91121 [details] [diff] [review]
Lots more of the same

Yikes. There's more to this than just the function names, I tried :

function f\u02b2(p\u02b2, b) { try { label\u02b2 : var v\u02b2 = p\u02b2; }
catch (e\u02b2) { for (var v\u02b2 in e\u02b2); } }
f\u02b2.toSource()

This patch catches all the ATOM_BYTES invocations and changes them. Passes the
test suite ok, so I don't think I did any harm. But on the other hand I'm not
sure if this is sufficient or necessary. I think we need a more thorough test
case.

(p.s. the tab snuck in from a copy-paste - they seem to be all over the file,
is it worth going through and removing them all?)
Attachment #91009 - Attachment is obsolete: true
Comment on attachment 91121 [details] [diff] [review]
Lots more of the same

Oops, don't you need to suppress '\0' insertion in QuoteString by predicating
the Sprint(sp, "%c", (char)quote) calls on if (quote)?	Then (anyway, even
without this fix), I'd pass just 0 and not (jschar)'\0' in all the calls you're
adding.

The rule for tab elimination is: fix what you touch or add, and anything nearby
that you can stand to take naive blame for (cvs annotate -r helps see past
whitespace only changes).

/be
Attachment #91121 - Flags: needs-work+
Comment on attachment 91121 [details] [diff] [review]
Lots more of the same

Oh, I see (did I see this last time?  No): Sprint given "%c" and '\0' will
append an empty string to the Sprinter.  No worries there. 
sr=brendan@mozilla.org with the 0 for (jschar)'\0' change, I say (did you need
the cast to avoid a warning?   I suppose signed char to unsigned jschar might
cause worries, but the compiler should notice that 0 isn't negative).

/be
Attachment #91121 - Flags: needs-work+ → superreview+
(Assignee)

Comment 17

17 years ago
Created attachment 91185 [details] [diff] [review]
Removed (jschar) casting

No, I wasn't getting warnings, just 'Monkey see, Monkey Do'.
Yes, the '0' through to Sprint seemed harmless - other than the waste of time.
It'd be a simple enough fix to QuoteString.
Attachment #91121 - Attachment is obsolete: true
Comment on attachment 91185 [details] [diff] [review]
Removed (jschar) casting

Thanks -- I guess the if (quote) predication helps save a few cycles, doesn't
add too much code to QuoteString, and seems worth doing to me. 
sr=brendan@mozilla.org if you do that.

/be
Attachment #91185 - Flags: superreview+
(Assignee)

Comment 19

17 years ago
Created attachment 91231 [details] [diff] [review]
Added 'if (quote)' predicate.

Just looking for an r= now...
Attachment #91185 - Attachment is obsolete: true
(Assignee)

Comment 20

17 years ago
Comment on attachment 91231 [details] [diff] [review]
Added 'if (quote)' predicate.

Presumptuous sr= per comment #18
Attachment #91231 - Flags: superreview+
Comment on attachment 91231 [details] [diff] [review]
Added 'if (quote)' predicate.

Nit: instead of if-if (which invites dangling else without bracing, in general
-- not really a risk here, but still...), how about if (quote && ...) ?

/be
(Assignee)

Comment 22

17 years ago
Created attachment 91282 [details] [diff] [review]
nit picked
Attachment #91231 - Attachment is obsolete: true
Comment on attachment 91282 [details] [diff] [review]
nit picked

Looks good -- now I wonder why quote is jschar instead of char (I'll try to
cleanup later, if it seems worth it).  sr=brendan@mozilla.org again.

/be
Attachment #91282 - Flags: superreview+

Updated

17 years ago
Attachment #91282 - Flags: review+

Comment 24

17 years ago
Comment on attachment 91282 [details] [diff] [review]
nit picked

r=khanson.  Works for me.

Comment 25

17 years ago
Two testcases added to JS testsuite:

      mozilla/js/tests/ecma_3/Function/regress-58274.js
      mozilla/js/tests/ecma_3/Unicode/uc-005.js


The former is based on the function-name test in Comment #8.
I use toString() instead of toSource() so it can run in Rhino.

The latter is based on the more general identifier test in 
Comment #14; again, using toString() instead of toSource().

Both tests are currently passing in Rhino, and failing in SpiderMonkey.
I will try out the above patch when I can. Right now I'm having trouble
applying it for some reason; will investigate -
(Assignee)

Comment 26

17 years ago
Fix checked in to trunk.
Status: ASSIGNED → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED

Comment 27

17 years ago
Verified Fixed -

Both testcases in Comment #25 now pass not only in Rhino, but in the
SpiderMonkey debug and optimized shells as well. To test this in the
browser, wait for tomorrow's trunk build, and try:

javascript: function f\u02B2\u0AAA () {}; document.open(); document.write(
f\u02B2\u0AAA.toString().toSource()); document.close(); 


OUTPUT BEFORE THE FIX:
(new String("\nfunction f\xB2\xAA() {\n}\n"))
Notice how the high-byte information has been dropped: (the 02 and the 0A).

OUTPUT AFTER THE FIX: (from current JS shell):
(new String("\nfunction f\\u02B2\\u0AAA() {\n}\n"))


This will be available in trunk builds of the browser shortly -
Status: RESOLVED → VERIFIED

Updated

14 years ago
Flags: testcase+
You need to log in before you can comment on or make changes to this bug.