Closed Bug 232182 Opened 21 years ago Closed 18 years ago

Can't display non-ascii characters in JS exceptions

Categories

(Core :: JavaScript Engine, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9alpha1

People

(Reporter: bugzilla, Assigned: daumling)

References

(Blocks 1 open bug)

Details

(Keywords: intl, js1.5)

Attachments

(9 files, 17 obsolete files)

5.92 KB, image/png
Details
3.36 KB, text/plain
Details
3.68 KB, text/plain
Details
49.63 KB, patch
shaver
: review+
Details | Diff | Splinter Review
4.52 KB, text/plain
Details
2.14 KB, patch
mrbkap
: review+
Details | Diff | Splinter Review
35.83 KB, image/png
Details
787 bytes, patch
brendan
: review+
Pike
: review+
mtschrep
: approval1.8.1+
Details | Diff | Splinter Review
787 bytes, patch
Details | Diff | Splinter Review
User-Agent:       
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; cs-CZ; rv:1.6) Gecko/20040113

JS console displayes non-ascii characters in a wrong way. This means problem,
because some Mozilla JS-messages are localizable.

Reproducible: Always
Steps to Reproduce:
1. Open JSconsole
2. In the textbox write some expresion which returns some output with accented
characters e.g. 'ל¹טר¾‎בםיתש'
3. Press Evaluate

Actual Results:  
The characters are not displayed correctly in the JS console (see screenshot in
attachement)


I have tried \uXXXX version of characters too:

'\u011B\u0161\u010D\u0159\u017E\u00FD\u00E1\u00ED\u00E9\u00FA\u016F'

but with the same result.

This is the problem, because some Mozilla messages displayed in the JS console
are localizable  - e.g. some from
http://lxr.mozilla.org/mozilla/source/caps/src/caps.properties so these
localized error messages are not displayed correctly
Here you can see output from the 'ל¹טר¾‎בםיתש' string.
The output for the \uXXXX version is the same.
(In reply to comment #0)
> This is the problem, because some Mozilla messages displayed in the JS console
> are localizable  - e.g. some from
> http://lxr.mozilla.org/mozilla/source/caps/src/caps.properties so these
> localized error messages are not displayed correctly

Is this just your suspicion, or have you verified that strings from
caps.properties display wrongly? those messages use a rather different codepath
than strings you enter in the js console.
(In reply to comment #2)
> Is this just your suspicion, or have you verified

Yes, I have verified this. We have this problem in the Czech Mozilla version.

See screenshot in our bugzilla
http://bugzilla.czilla.cz/attachment.cgi?id=30&action=view - instead of some
non-ascii character is diasplayed "Y" (in similar way as in the attachment in
the comment 1).

Problem with caps.properties is our main problem, the example in the comment 0
was created only as simple testcase for those who cannot test czech Mozilla
localization.
I'm not sure that the two issues are related - in Mozilla 1.5,
javascript:'%E2%82%AC' does not display the Euro sign that I am expecting.
Does adding alerts to consolebindings show the right message?  That is, is the
bug in the JSConsole display or in the data the error object has?
Evaluated in the JS console,
top.gConsole.mCService.logStringMessage('%E2%82%AC') does correctly display the
Euro symbol, but throw('%E2%82%AC') displays Error: uncaught exception: ¬
Zbarsky>Good idea. Well I have test it. And in alert() I see the same string as
in the JSconsole. So problem is not in JSconsole itself.
Neil>I have tried mCService.logStringMessage() too. And this is working
correctly for non-ascii character (I have tried different encoding utf8,
iso8859-2, win1250), so probably problen is not even in the nsIConsoleService.

(In reply to comment #2)
> those messages use a rather different codepath
> than strings you enter in the js console.

Well, you have true. I have loook in the source now.
a) When you enter expression into the JSconsole, it is paste in similar way as
in the Neil's comment 4 (by javascript:expression ).
b) But strings from the caps.properties are entered by the
nsScriptSecurityManager::ReportError() in the
http://lxr.mozilla.org/mozilla/source/caps/src/nsScriptSecurityManager.cpp#1294

Although a) and b) have same results, the prohlem may be different.
Ok, so the reason that throw('%E2%82%AC') doesn't display the correct exception
on the JS console is that the string gets truncated to 8 bits via
js_DeflateString and js_InflateString. I also happend to notice that line 521 of
jscntxt.c asserts the wrong test, it should be JS_ASSERT(d < argCount);
The relevant caps code is:

1326         JS_SetPendingException(cx,
1327             STRING_TO_JSVAL(JS_NewUCStringCopyZ(cx,
1328                 NS_REINTERPRET_CAST(const jschar*, message.get()))));

where "message" is an nsXPIDLString

brendan, it sounds like the JS engine is dropping the ball here...
Basically, all JS exception messages get truncated to 8 bits :-(
Assignee: js-console → general
Status: UNCONFIRMED → NEW
Component: JavaScript Console → JavaScript Engine
Ever confirmed: true
QA Contact: jrgmorrison → PhilSchwartau
Summary: JS console displayes non-ascii characters in a wrong way → Can't display non-ascii characters in JS exceptions
Keywords: intl
OS: Windows 2000 → All
Hardware: PC → All
Argh, messy jsexn.c and jscntxt.c code from the errors-as-exceptions days.

/be
Assignee: general → brendan
Keywords: js1.5
Priority: -- → P2
Target Milestone: --- → mozilla1.8alpha
Target Milestone: mozilla1.8alpha1 → mozilla1.9alpha
*** Bug 295112 has been marked as a duplicate of this bug. ***
Proposed fix:

in js_ReportUncaughtException(), assign the Unicode equivalent of the error
message to report.ucmessage.

If this is OK, assign the bug to me and I'll provide a fix for the Mozilla 1.8
branch.
Another one Michael has offered to fix -- thanks, Michael!

/be
Assignee: brendan → daumling
During the implementation of the fix, I found that using packed data types like
JSUint8 or JSInt16 in a public struct leads to alignment problems, so I propose
that we revert to intN members for JSErrorMessage. To save space, one could
remove the argCount element, because it is easy to determine the number of
arguments by scanning the message string for {n} and determine the highest number.
Attached patch The proposed fix. (obsolete) — Splinter Review
Please check this patch and let me know what you think. See also other
232182.txt for details.
Attachment #196722 - Flags: review?(brendan)
Attached file Detailed description of patch. (obsolete) —
Attachment #196723 - Flags: review?(brendan)
Attached file Regress test. (obsolete) —
Attachment #196724 - Flags: review?(bob)
The fix passed the test suites in the ExtendScript test environment (which is
slightly different from the Perl suites) with and without the #define
JS_HAS_UTF8_STRINGS enabled.
Status: NEW → ASSIGNED
Comment on attachment 196724 [details]
Regress test.


>if (!isUTF8Enabled())
>	quit();
>

quit isn't defined in the browser. I would prefer to actually run the test even
if the build doesn't have the fix so as to show failures in previous builds.

>var summary = 'Can't display non-ascii characters in JS 
		   ^ terminates string.

>status = summary + ': Acess undefined Unicode symbol';

nit: Acess

I'll attach a new version. Let me know what you think.
Attachment #196724 - Attachment is obsolete: true
Attachment #196724 - Flags: review?(bob) → review-
Attached file regression test 2 (obsolete) —
Attachment #196731 - Flags: review?(daumling)
Comment on attachment 196731 [details]
regression test 2

Much better. Test will fail if SM is not compiled for UTF-8.
Attachment #196731 - Flags: review?(daumling) → review+
RCS file: /cvsroot/mozilla/js/tests/js1_5/Exceptions/regress-232182.js,v
done
Checking in regress-232182.js;
/cvsroot/mozilla/js/tests/js1_5/Exceptions/regress-232182.js,v  <-- 
regress-232182.js
initial revision: 1.1
done
Flags: testcase+
Forgive me for asking a newbie question: How long does it take before a bug fix
is reviewed? The fix has been submitted on 9/19, which is almost three weeks
ago. I suspect that the diff file is almost worthless now after such a long time
has passed. Please provide some insight. I also want to fix 215173, but I want
to wait until the (long) list of changes in this fix has made it to the code base.

My intention was also to have the fix appear in the 1.8 branch. Is there still a
chance?
I had thought shaver was going to do the review, since I was abroad for a
conference.  Yes, I have mail to that effect -- but it looks as though the
request was not redirected in this bug to shaver.  Michael, can you do that? 
I'm back, but shaver should get dibs.  Another reviewer who is responsive: mrbkap.

/be
Attachment #196722 - Flags: review?(brendan) → review?(shaver)
Comment on attachment 196722 [details] [diff] [review]
The proposed fix.

Review redirected to shaver
Comment on attachment 196723 [details]
Detailed description of patch.

Review redirected to shaver
Attachment #196723 - Flags: review?(brendan) → review?(shaver)
Comment on attachment 196722 [details] [diff] [review]
The proposed fix.

>     {"pc2line",         PCToLine,       0},
>+	{"isUTF8Enabled",	IsUTF8Enabled,	0},
> #ifdef DEBUG
>     {"dis",             Disassemble,    1},

Looks like a tab character there, which are not welcome
in the JS sources.

>+    str = js_NewString(cx, chars, charsLen, 0);

I think that we generally want to use the "length" name-part
for these variables, so "charLength" is better.

>+#ifdef JS_HAS_UTF8_STRINGS

I _think_ this is a project-config-file thing, since it's
not tied to the JS version selected, which is what jsconfig.h
is for.

I think I'd also prefer the name JS_STRINGS_ARE_UTF8, since we
don't just have them in addition to ASCII strings, we have them
instead.

>+JS_PUBLIC_API(JSBool)
>+JS_IsUTF8Enabled()
>+{
>+    return JS_TRUE;
>+}

JS_StringsAreUTF8?  JS_UTF8IsEnabled?

>     str->chars = chars;
>-    str->length = length;
>+    str->length = len;

Having both "length" and "len" in the same function is asking
for trouble, IMO.  Can we use "charsLength" instead, or perhaps "inflatedLen"?

>+                        size_t len = strlen(charArg);
>                         reportp->messageArgs[i]
>-                            = js_InflateString(cx, charArg, strlen(charArg));
>+                            = js_InflateString(cx, charArg, &len);

"length" is better than "len", where you have to choose today.

>+                size_t buflen = MAX_KEYWORD_LENGTH;
>+                if (!js_DeflateStringToBuffer(cx, TOKENBUF_BASE(), TOKENBUF_LENGTH(),
>+                                                  buf, &buflen))
>+                    goto error;

Can we initialize buflen with sizeof(buf) - 1, to avoid duplication of
the constant, and keep it more clearly coupled?

>+jschar *
>+js_InflateString(JSContext *cx, const char *bytes, size_t *length)
>+{
>+    jschar *chars = NULL;
>+    size_t dstlen = 0;
>+
>+    if (js_InflateStringToBuffer(cx, bytes, *length, NULL, &dstlen)) {
>+        chars = (jschar *) JS_malloc(cx, (dstlen + 1) * sizeof (jschar));
>+        if (chars) {
>+            js_InflateStringToBuffer(cx, bytes, *length, chars, &dstlen);
>+            chars [dstlen] = 0;
>+            *length = dstlen;
>+        }
>+    }
>+    return chars;
>+}

House style would write that with some explicit |return NULL| after error
checks, rather than nesting-for-success.  (And similar in js_DeflateString,
where you already do that for the malloc-failure case.)

>+    jschar C, C2;

lowercase c, pls.

>+    uint32 V;

and v.

>+            utf8Len = js_OneUcs4ToUtf8Char(utf8buf, V);

Do we lose a lot here to the function-call overhead?  That function
might be too long to inline, hrm.

>+    size_t j, n, dstlen = *dstlenP, origDstlen = dstlen;

We use lowercase p for pointer-suffix, when we use it.

>+ * The buffer is NOT null-terminated.

We'll need to audit the various callers to make sure that
they can cope with this change, if you haven't already.

>+ * cx may be NULL, which means no errors are thrown.
>+ * The destination length needs to be initialized with the buffer size, takes the number of chars moved.

I'm not sure about the optional-cx-and-error-reporting angle here,
though I do like the consolidation of error reporting.

>+MSG_DEF(JSMSG_BAD_CHAR_CONVERSION,    201, 0, JSEXN_TYPEERR, "character conversion error")

Can we be more informative here?  (Giving the offending value, or something?)

I need to go over some of the *flation logic again more  closely, but I wanted
to get these mostly-cosmetic comments in to avoid serializing any more than
necessary.  Marking-, though with a smile and not a frown.
Attachment #196722 - Flags: review?(shaver) → review-
Attachment #196722 - Attachment is obsolete: true
Attachment #199538 - Flags: review?(shaver)
Attachment #196723 - Attachment is obsolete: true
Attachment #199539 - Flags: review?(shaver)
Thanks - the review was exactly what I needed to learn the SM lingo!

I have applied the proposed changes, and changed the wording as follows:

JS_HAS_UTF8_STRINGS => JS_STRINGS_ARE_UTF8 (much better!)
JS_IsUTF8Enabled() => JS_StringsAreUTF8()
JS_EncodeUTF8() => JS_EncodeCharacters()
JS_DecodeUTF8() => JS_DecodeBytes()

The latter two are now also present if UTF-8 is not enabled. If UTF-8 is not
enabled, a simple byte-jschar expansion is performed as it was before.

I was not sure whether it is appropriate to add many error codes. Now, there are
4 new errors. Please check js.msg for the wording, type and format of these errors.

In sh.c, isUTF8Enabled() has been renamed to stringsAreUtf8(), and a new
function testUtf8() tests the new error codes. See the updated docs for details.

JS_STRINGS_ARE_UTF8 is a project #define, not part of jsconfig.h. How should a
#define like this be documented?
Attached file Regression test 3 (obsolete) —
Added testUtf8() tests - please check tests if they are OK.
Attachment #196731 - Attachment is obsolete: true
Attachment #199541 - Flags: review?(bob)
...kudos to Jesse Ruderman!
Attachment #199538 - Attachment is obsolete: true
Attachment #199547 - Flags: review?(shaver)
Comment on attachment 199541 [details]
Regression test 3

These tests will run in the browser as well as the shell. You need to detect if
the functions stringsAreUtf8 and testUtf8 are available and skip the tests
depending on them if they are not.
Attachment #199541 - Flags: review?(bob) → review-
Bob, hwat is the standard procedure to find out if the test is run in the
browser environment?
(In reply to comment #35)

Use object detections such as |if (document) {...}| or test for the particular
functions which you wish to use.
Attached file Regression test 4
Of course...stupid question of mine...
Attachment #199541 - Attachment is obsolete: true
Attachment #199742 - Flags: review?(bob)
Comment on attachment 199742 [details]
Regression test 4

Ok with me. I'll check it in.
Attachment #199742 - Flags: review?(bob) → review+
/cvsroot/mozilla/js/tests/js1_5/Exceptions/regress-232182.js,v  <-- 
regress-232182.js
new revision: 1.2; previous revision: 1.1
done
(In reply to comment #36)
> (In reply to comment #35)
> 
> Use object detections such as |if (document) {...}|

if (document.all), e.g. -- just testing an unqualified name such as document,
which may not be found, needs to be done in a try/catch statement to keep the
ReferenceError from stopping the script.

/be
Comment on attachment 199547 [details] [diff] [review]
Updated patch with some backslashes removed

>+char *
>+js_DeflateString(JSContext *cx, const jschar *chars, size_t length)
>+{
>+    size_t size = 0;
>+    char *bytes = NULL;
>+    if (!js_DeflateStringToBuffer (cx, chars, length, NULL, &size))
>+        return NULL;
>+    bytes = (char *) (cx ? JS_malloc(cx, size+1) : malloc(size+1));
>+    if (!bytes)
>+        return NULL;
>+    js_DeflateStringToBuffer (cx, chars, length, bytes, &size);
>+    bytes [size] = 0;
>+    return bytes;
>+}

If cx is null here, then we could have a silent failure here, but
I guess that has to be handled by callers that have a cx on which
to report.

>+static void 
>+ReportConversionError (const jschar* src, size_t offset)
>+{
>+}

This doesn't look used, and does look static -- remove?

(Though a helper to unify the error reporting logic and reduce
code duplication would not be amiss, if that's the direction in which
you were headed.)

I think this is OK to land on trunk.  Unifying the error reporting
can wait for a follow-on bug, to keep this patch from aging further.
Attachment #199547 - Flags: review?(shaver) → review+
Yes, please remove ReportConversionError() - a leftover that I forgot to remove.

js_DeflateString() with cx == NULL is being used internally, so I had to pay
attention to this condition.

Newbie question: Is there anything that I need to do now? When will the fix be
integrated?
(In reply to comment #42)

This patch fails to apply to the trunk. Please attach a new patch synced to the
trunk. We can revisit whether this gets checked in then.
Attached patch Updated patch per 10/26/05 (obsolete) — Splinter Review
Hope this patch works...
Attachment #200860 - Flags: review?(shaver)
Attachment #199547 - Attachment is obsolete: true
+static void 
+ReportConversionError (const jschar* src, size_t offset)
+{
+}

This is still there.. 
D'oh. Got me...
Attachment #200860 - Attachment is obsolete: true
Attachment #200861 - Flags: review?(shaver)
Attachment #200860 - Flags: review?(shaver)
shaver, if you are too busy to review/integrate, can someone else take on this job? I am afraid that I run into a submit patch - patch too old - resubmit patch cycle.
Comment on attachment 200861 [details] [diff] [review]
Next attempt, with ReportConversionError() removed

r=shaver (I was away Wed and Thu at a workshop, sorry)
Attachment #200861 - Flags: review?(shaver) → review+
Michael, I applied the patch to the trunk and tested in the shell and browser on winxp with js1_5/Exceptions/regress-232182.js but still get a failure. 
The patch does not fix the bug, but rather introduces the ability to work with UTF-8 encoded strings, which would fix the bug.

I would like to ask the community: Should JS 1.6 work with UTF-8 encoded strings in general? This would eliminate the need for the JS_STRINGS_ARE_UTF8 project define, and it would fix the dreadfully simple conversion from bytes to UTF-16 by simple expanding bytes to words. 

It would break existing implementations that use charsets like ANSI or Mac Roman, like German, because of their usage of characters 128-255. But since JS 1.6 is a new release, this may OK. Also, performace may be slightly lower because of the string conversion overhead.

Your opinions, please...
We don't break API compatibility in SpiderMonkey, even just a "little".  Sorry.  That's why I suggested or agreed to (I forget, sorry if it was your suggestion) a compile-time option.

/be
It does not hurt to ask :) First time asked == last time asked. No more requests to break compatibility.

Still leaves the question of where to document compile-time options. 1.6 readme?

Bob, how would you create a test case that runs only if UTF-8 was enabled as a compile timne option? The error is reproducable as:

var utf8Enabled = false;
try { 
	throw new Error ("\u0441\u0462");
} catch (e) { 
 	utf8Enabled = (e.message != "AB");
}

Due to the byte/word expansion, 0x0441 and 0x0442 are returned as 0x0041 and 0x0042 ("AB"). The above code can thus be used to detect if UTF-8 has been enabled for this compilation, both in the browser and in the shell. I have attached an updated test case for Bob to review.
Attachment #201411 - Flags: review?(bob)
Attachment #201411 - Attachment description: Proposed patch with universal detection of UTF-8 presence → Test case with universal detection of UTF-8 presence
Fix checked into trunk.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
What's JS_STRINGS_ARE_UTF8?
Do we have reason the string is not UTF-8?
I think we don't need this ifdef or we should define this.
So we should always use UTF-8 string for l10n builds.
This problem is not solved now.

-> REOPEN
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
See comments #50 and 51. JS_STRINGS_ARE_UTF8 is a projrect file #define. Assuming all strings to be UTF-8 would break compatibility with implementations using the full 8-bit character set.
Please, we have UTF-16 strings in JS already, there is no i18n problem.  This bug is about supporting UTF-8 strings stored using char[] and passed in the API via the char * arguments to JS_NewString, etc. -- which is an API-incompatible change that requires an #ifdef.

/be
Status: REOPENED → RESOLVED
Closed: 19 years ago19 years ago
Resolution: --- → FIXED
(In reply to comment #56)
> Please, we have UTF-16 strings in JS already, there is no i18n problem.  This
> bug is about supporting UTF-8 strings stored using char[] and passed in the API
> via the char * arguments to JS_NewString, etc. -- which is an API-incompatible
> change that requires an #ifdef.

Brendan:

We have a l10n problem. If |caps.properties| has some characters that is lager than U+00ff in its error messages(e.g., the value of GetPropertyDenied), we cannot display the characters in JS Console. Becuase over U+00ff characters are broken by js_DeflateString.
Of course, if we can use UTF-8 string for Console related codes, we may fix this.
But I cannot understand JS engine codes and Console service codes.

Do you think for this?
And...

Isn't there any schedule which defines it on Firefox/Thunderbird/SeaMonkey?
(In reply to comment #58)
> And...
> 
> Isn't there any schedule which defines it on Firefox/Thunderbird/SeaMonkey?

My apologies, you are right that this needs to be done for these products, or really, for all Gecko-based products from now onward.

If we just define the JS_STRINGS_ARE_UTF8 macro, will those caps errors appear correctly in the JS console?

If so, I think we want a build/config bug blocking this one.  I'll reopen and leave it to someone else to file that build/config bug.

/be
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
An Japanese tester(his name is Masatoshi Kimura (emk), he is hacker of Mozilla, see bug 310174) said, if he defines JS_STRINGS_ARE_UTF8, it works fine.
Screenshot
http://bugzilla.mozilla.gr.jp/attachment.cgi?id=2947&action=view
Depends on: 315288
Excellent! Then this bug can be closed when bug 315288 is fixed.

/be
Hm... it looks to me like caps just calls JS_SetPendingException with a JSString jsval. shouldn't that work, independent of whether JS strings are UTF-8?
(In reply to comment #62)
> Hm... it looks to me like caps just calls JS_SetPendingException with a
> JSString jsval. shouldn't that work, independent of whether JS strings are
> UTF-8?
Unfortunately, JSString will be broken by js_ReportUncaughtException AFTER JS_SetPendingException unless JS strings are UTF-8.
sounds like that is the real cause of this bug.
It is a manifestation of the bug. The problem is the error message, which is a char*, and, unless you use utf-8, compression and expansion to and from char* is lossy. Since the way error reporting works cannot be changed without breaking backwards compatibiliy, allowing char*'s to be utf-8 is the only locigal solution (this also fixes the problem of localizing error messages into Unicode), and allows for the use of all char* APIs in a safe way).
Attached patch Bug fix for the bug fix :) (obsolete) — Splinter Review
The current patch causes a double-free if Unicode arguments are supplied to any error reporter. I forgot to create copies of the arguments - how embarrassing...

This patch fixes the issue. It should be applied ASAP.
Attachment #202143 - Flags: review?(shaver)
Comment on attachment 202143 [details] [diff] [review]
Bug fix for the bug fix :)

>@@ -844,18 +844,24 @@ js_ExpandErrorArguments(JSContext *cx, J
>                     if (charArgs) {
>                         char *charArg = va_arg(ap, char *);
>                         size_t charArgLength = strlen(charArg);
>                         reportp->messageArgs[i]
>                             = js_InflateString(cx, charArg, &charArgLength);
>                         if (!reportp->messageArgs[i])
>                             goto error;
>                     }
>-                    else
>-                        reportp->messageArgs[i] = va_arg(ap, jschar *);
>+                    else {

Pre-existing style deviation, I know, but could you fix that else to cuddle with the } on the previous line, per JS house style?  Thanks.

>+                        const jschar* jscharArg = va_arg(ap, jschar *);

Style nit: C-style declarator mode, that is, put the * before the declarator name, not cuddled with the typename.

>+                        reportp->messageArgs[i] = (jschar*)
>+                            JS_malloc (cx, (js_strlen (jscharArg) + 1) * sizeof (jschar*));

Another style nit: no space before ( in function calls of sizeof(type_expression).

Bug: you want sizeof(jschar) here, not sizeof(jschar*).

>+                        if (!reportp->messageArgs[i])
>+                            goto error;
>+                        memcpy (reportp->messageArgs[i], jscharArg, (js_strlen (jscharArg) + 1) * sizeof (jschar*));

Style nit: preserve the sacrosanct 80th column, fill only up to column 79 (numbering from 1), in the name of beloved Emacs.

/be
Attachment #202143 - Flags: review?(shaver) → review-
(In reply to comment #67)

> Another style nit: no space before ( in function calls of

s/of/or/

> sizeof(type_expression).

/be
Attached patch Bug fix for the bug's bug fix (obsolete) — Splinter Review
Not my day today...let me blame it on a slight fever...
Attachment #202143 - Attachment is obsolete: true
Attachment #202152 - Flags: review?(brendan)
Attachment #202152 - Attachment is obsolete: true
Attachment #202153 - Flags: review?(brendan)
Attachment #202152 - Flags: review?(brendan)
Comment on attachment 202153 [details] [diff] [review]
Removed unnecessary const declaration

>@@ -843,19 +843,26 @@ js_ExpandErrorArguments(JSContext *cx, J
>                 for (i = 0; i < argCount; i++) {
>                     if (charArgs) {
>                         char *charArg = va_arg(ap, char *);
>                         size_t charArgLength = strlen(charArg);
>                         reportp->messageArgs[i]
>                             = js_InflateString(cx, charArg, &charArgLength);
>                         if (!reportp->messageArgs[i])
>                             goto error;
>+                   } else {
>+                        jschar *jscharArg = va_arg(ap, jschar *);
>+                        reportp->messageArgs[i] = (jschar*)
>+                            JS_malloc(cx, (js_strlen(jscharArg) + 1) * sizeof(jschar));
>+                        if (!reportp->messageArgs[i])
>+                            goto error;
>+                        memcpy(reportp->messageArgs[i], 
>+                               jscharArg, 
>+                              (js_strlen(jscharArg) + 1) * sizeof(jschar));

Final nit: third line (second overflow line) is unindented one space, the ( should start in the same column as the j in the previous line.

r=me with that picked.

/be
Attachment #202153 - Flags: review?(brendan) → review+
(In reply to comment #65)
> It is a manifestation of the bug. The problem is the error message, which is a
> char*, and, unless you use utf-8, compression and expansion to and from char*
> is lossy.

But it looks to me like the error reporter gets a UTF-16 (UCS-2?) error message too, in report->ucmessage. Why can't that contain the correct message, even if strings are not UTF-8?
Because ucmessage only is non-NULL if the error is generated from within the engine. If the error is JS-generated, ucmessage is NULL, and the char* message is being used, which is the first argument to new Error(). JS_ReportError() and JS_ReportWarning work this way as well.

One could fix this, but this would be a more major undertaking. Conversions from jschar* to char* are all over the place in the engine, and it would take a lot of work to fix all places.
We need another blocker bug to adjust JS API users in XPConnect and possibly elsewhere to deal with this change.  It's hugely incompatible to now give UTF-8 in char[] storage to API clients who for six or seven years have expected to get ISO-Latin-1 or just the low byte of UTF-16 in each char.

/be
Depends on: 315974
Comment on attachment 201411 [details]
Test case with universal detection of UTF-8 presence

r+ with minor 80 col nits fixed.
Attachment #201411 - Flags: review?(bob) → review+
With minor nits fixed.

Checking in regress-232182.js;
/cvsroot/mozilla/js/tests/js1_5/Exceptions/regress-232182.js,v  <--  regress-232182.js
new revision: 1.3; previous revision: 1.2

Results with a recent trunk build:

STATUS: Display non-ascii characters in JS exceptions
STATUS: UTF8 is enabled
Failure messages were:
FAILED!: Display non-ascii characters in JS exceptions: Access undefined Unicode symbol
FAILED!: Expected value '@A is not defined', Actual value '\u0440\u0441 is not defined'
The test fails due to a printf() bug - see bug 315974.
I had to create a new patch. This patch saves two stack variables and a few js_strlen() calls and uses argLengths[i] directly.
Attachment #202153 - Attachment is obsolete: true
Attachment #212297 - Flags: review?(brendan)
315974 has been checked in. Testcase regress-232182 works now.
Attachment #212297 - Attachment is obsolete: true
Attachment #212298 - Flags: review?(brendan)
Attachment #212297 - Flags: review?(brendan)
Comment on attachment 212298 [details] [diff] [review]
Forgot a (void*) cast in memcpy - VC8 issued a warning

>                 } else {
>-                    reportp->messageArgs[i] = va_arg(ap, jschar *);
>+                    jschar *jscharArg = va_arg(ap, jschar *);
>+                    argLengths[i] = js_strlen(jscharArg);
>+                    reportp->messageArgs[i] = (jschar*)
>+                        JS_malloc(cx, (argLengths[i] + 1) * sizeof(jschar));
>+                    if (!reportp->messageArgs[i])
>+                        goto error;
>+                    memcpy((void*) reportp->messageArgs[i], 
>+                           jscharArg, 
>+                          (argLengths[i] + 1) * sizeof(jschar));
>                 }

This (void *) cast (note spacing style nit) is casting away the const in the type of messageArgs.  Might be nicer to use a jschar *jscopyArg or some such local and store that in report->messageArgs[i] only after the memcpy.

r=me with nits fussed over.

/be
Attachment #212298 - Flags: review?(brendan) → review+
Attachment #212298 - Attachment is obsolete: true
Attachment #212392 - Flags: review?(brendan)
Comment on attachment 212392 [details] [diff] [review]
De-nitted patch with extra stack variable

Same patch as previous (bugzilla's interdiff at top of diff view shows this).

/be
Attached patch Corrected patch (obsolete) — Splinter Review
Carefully hand-selected from a variety of exquisite directories.
Attachment #212424 - Flags: review?(brendan)
Attachment #212392 - Attachment is obsolete: true
Attachment #212392 - Flags: review?(brendan)
Comment on attachment 212424 [details] [diff] [review]
Corrected patch

>+                    memcpy(copiedArg, 
>+                           jscharArg, 
>+                          (argLengths[i] + 1) * sizeof(jschar));

Nit: The third line is unindented by one column.

/be
Attachment #212424 - Flags: review?(brendan) → review+
Space character aded to 3rd line and patch checked in.
Status: REOPENED → RESOLVED
Closed: 19 years ago18 years ago
Resolution: --- → FIXED
The fix was totally unnecessary!

In jscntxt.c, this comment states it:
/*
 * js_ExpandErrorArguments owns its messageArgs only if it had to
 * inflate the arguments (from regular |char *|s).
 */

Can someone, please, back out my last "fix"? It causes memory leaks.

Additional question: js_ExtandEerrorArguments() is also called by ReportCompileErrorNumber() in jsscan.c. This function, however, never frees any message arguments. Does this function never receive any arguments? Sounds unlikely. 

if (charArgs) {
    int i = 0;
    while (report->messageArgs[i])
        JS_free(cx, (void *)report->messageArgs[i++]);
}
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
mrbkap sorted out the ownership rules here last time, perhaps he can comment.  I'm going to fire myself and promote him into the job of reviewing here. :-/

You can back out your own changes by going to http://tinderbox.mozilla.org?showbuilds.cgi?tree=SeaMonkey and finding your checkin, then click on the

"Show commands which could be used to back out these changes"

link.  The cvs update -j (for join) command(s) given can be used in your up-to-date tree to revert your change.  You'll then commit again after testing and diffing and re-reviewing.

/be
(In reply to comment #87)
> Additional question: js_ExtandEerrorArguments() is also called by
> ReportCompileErrorNumber() in jsscan.c. This function, however, never frees any
> message arguments. Does this function never receive any arguments? Sounds
> unlikely. 

ReportCompileErrorNumber is only called by js_ReportCompileErrorNumber and js_ReportCompileErrorNumberUC, both of which do the necessary freeing. It isn't the most obvious thing in the world, but it made sense at the time! It also removes the need for the if (charArgs) test when we do need to free the arguments.
Attached patch Reverted patchSplinter Review
I looked up the latest version og jscntxt.c in lxr (3.75), then I did

cvs -j 3.75 -j 3.74 jscntxt.c

The attached patch is the result - it backs out the commit. I hope this was correct.
Attachment #212424 - Attachment is obsolete: true
Attachment #213281 - Flags: review?(mrbkap)
Comment on attachment 213281 [details] [diff] [review]
Reverted patch

Yeah.
Attachment #213281 - Flags: review?(mrbkap) → review+
Backed out the fix.
Status: REOPENED → RESOLVED
Closed: 18 years ago18 years ago
Resolution: --- → FIXED
This got backed out, right?  Shouldn't this bug be reopened?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I think that according to comment #87, the fix was backed out because it was unnecessary.
The fix was not only unnecessary, it created a memory leak. The fix assumed that error arguments in Unicode mode has to be allocated on the heap, which is incorrect. Unicode error arguments are just pointers and are never freed.

The bug does not need to be reopened.
Status: REOPENED → ASSIGNED
Status: ASSIGNED → RESOLVED
Closed: 18 years ago18 years ago
Resolution: --- → FIXED
removing js16 blocking. shouldn't this really be invalid instead of fixed?
No longer blocks: js1.6rc1
sorry for the spam. Only a small change was backed out and this does block js16
Blocks: js1.6rc1
js16 fixes should be in ff2 me thinks.
Flags: blocking1.8.1?
*** Bug 235811 has been marked as a duplicate of this bug. ***
Michael came up with a fix to properly detect utf8

Checking in regress-232182.js;
/cvsroot/mozilla/js/tests/js1_5/Exceptions/regress-232182.js,v  <--  regress-232182.js
new revision: 1.4; previous revision: 1.3
done

Did we end up landing a patch for this?  If so do we need to land for 1.8?
Flags: blocking1.8.1? → blocking1.8.1+
If this is going to be handled as part of a larger js landing by brendan, this bug does not need to be on the blocker list. minusing. 
Flags: blocking1.8.1+ → blocking1.8.1-
This is how it looks in 2.0 beta 2 - the bug is still reproducible.

In the Polish localization I could remove diacritics (for example, make "nie otrzymal uprawnien" out of "nie otrzymał uprawnień") and live with it (though it doesn't look professional), but the non-Latin languages can't do it easily...
Unghost suggested on IRC, that if this is not fixed for 2.0, a l10n note should be placed in caps.properties saying that the *Denied properties must be in plain ASCII.

Also a post to mozilla.dev.l10n is a must, since many locales do use non-ASCII characters for the *Denied properties.

(In reply to comment #101)
> Did we end up landing a patch for this?  If so do we need to land for 1.8?

Nothing more to do until bug 315288 (first dependency above) is fixed, which won't happen for Firefox 2 / Mozilla 1.8.1.

/be
Attached patch Add l10n note to caps.properties (obsolete) — Splinter Review
As mentioned in comment 104, a l10n note should be added in 1.8.1's caps.properties file if this is not going to be fixed for Fx 2.
Attachment #237145 - Flags: review?(brendan)
Attachment #237145 - Flags: approval1.8.1?
Comment on attachment 237145 [details] [diff] [review]
Add l10n note to caps.properties

+# LOCALIZATION NOTE: Due to bug 232182, do NOT use non-ASCII characters in: 

This is actually not really true. A better patch coming up.
Attachment #237145 - Attachment is obsolete: true
Attachment #237145 - Flags: review?(brendan)
Attachment #237145 - Flags: approval1.8.1?
Made the l10n note say Western European characters are OK (and not only plain ASCII).
Attachment #237148 - Flags: review?(brendan)
Attachment #237148 - Flags: approval1.8.1?
Comment on attachment 237148 [details] [diff] [review]
Add l10n note to caps.properties

Should cite bug 315288 to avoid more confusion, r=me with that.  I'd appreciate it if Axel would r+ too.

/be
Attachment #237148 - Flags: review?(brendan)
Attachment #237148 - Flags: review?(axel)
Attachment #237148 - Flags: review+
Comment on attachment 237148 [details] [diff] [review]
Add l10n note to caps.properties

r=the other me, with the nit that brendan had.
Attachment #237148 - Flags: review?(axel) → review+
This is the previous patch with the bug number changed to 315288.
Acutally, the *Error strings suffer from the same problem, as reported here:
http://bugs.aviary.pl/show_bug.cgi?id=930#c0

Updating the l10n note to add the *Error strings as well.
Attachment #237157 - Attachment is obsolete: true
I found out that "CheckLoadURIError" can be safely translated without any restrictions, there are steps to show this message:
1. Open http://www.greymagic.com/security/advisories/gm001-ns/
2. Click Sniff button
3. Go to Tools - Error Console.
Localized "CheckLoadURIError" message shows up without any problems in Russian build of Firefox ( Mozilla/5.0 (Windows; U; Windows NT 5.1; ru; rv:1.8.1b2) Gecko/20060904 BonEcho/2.0b2 ID:2006090404 )

I haven't found any testcase for "CheckSameOriginError", is there anyone? 
Comment on attachment 237157 [details] [diff] [review]
Add l10n note to caps.properties - for check-in

Actually, this patch was OK.

We did some testing in #l10n and #aviarypl and it's really the *Denied strings only.
Attachment #237157 - Attachment is obsolete: false
Attachment #237174 - Attachment is obsolete: true
Comment on attachment 237148 [details] [diff] [review]
Add l10n note to caps.properties

a=schrep for drivers.
Attachment #237148 - Flags: approval1.8.1? → approval1.8.1+
This bug morphed, it should not be assigned to daumling still.  But better would be to file a separate bug for what is patched by the "Add l10n note ..." patches and unmorph this bug, leaving it fixed but not its dependency bug 315288.

/be
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: