Closed Bug 687679 Opened 13 years ago Closed 12 years ago

JS_CompileFile* should handle UTF-8 (xpconnect doesn't convert non-ascii wchars correctly)

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla11

People

(Reporter: bholley, Assigned: johns)

References

Details

Attachments

(1 file, 7 obsolete files)

I have the following method in IDL:

wchar testWchar(in wchar a, inout wchar b);

The implementation is roughly the following:
foo(a, b) {
  rv = b.value;
  b.value = a;
  return rv;
}

This produces the expected results: foo("z", "q");
This returns a question mark: foo("z", "ア");

This works for wstring, but not wchar. I'm making a note in my testing framework (bug 684327) to investigate this.
Assignee: nobody → bobbyholley+bmo
To reproduce the problem, just change the relevant characters here: http://mxr.mozilla.org/mozilla-central/source/js/xpconnect/tests/unit/test_params.js#145
[ Adding jwalden since this messes with jsapi ]

Proposed fix attached - 
The JS_CompileFile* functions simply read and inflate to UTF16, nuking any encoding except ASCII. This attempts to run the script through JS_DecodeUTF8 first. Ideally we would have a encoding detection of some kind, but this at least lets UTF8 files (which many of these tests already were assuming xpcshell was using) load in properly.

The reason the wstring test was succeeding is because the utf8 was being loaded as ascii - so the resultant string did match the input, equally scrambled. (I found no actual error in the WCHAR/WCHAR_STR handling in XPCConvert)
Attachment #575517 - Flags: review?(jwalden+bmo)
Since this changes functionality, please also rename the API entry points for the benefit of users that currently expect Latin-1 to work:

  JS_CompileFile*  -->  JS_CompileUTF8File*

UTF8 is so much better. Thanks.
Assignee: bobbyholley+bmo → jschoenick
Component: XPConnect → JavaScript Engine
OS: Mac OS X → All
QA Contact: xpconnect → general
Hardware: x86 → All
Summary: xpconnect doesn't convert non-ascii wchars correctly → JS_CompileFile* should handle UTF-8 (xpconnect doesn't convert non-ascii wchars correctly)
Version: unspecified → Trunk
Updated patch. This is the same as before, but renames JS_CompileFile* entry points to JS_CompileUTF8File* to reflect the change. It also fixes some tests that were not assuming UTF8.

Most of these tests were the uconv tests, which had raw bytes in their strings, representing characters in the encoding they were testing. These were changed by replacing the strings with escape sequences.
Attachment #575517 - Attachment is obsolete: true
Attachment #575517 - Flags: review?(jwalden+bmo)
Attachment #576297 - Flags: review?(jorendorff)
Comment on attachment 576297 [details] [diff] [review]
Change JS_CompileFile* to JS_CompileUTF8File* and fix non-UTF8 tests.

Doesn't free decodebuf, fixed version inc
Attachment #576297 - Attachment is obsolete: true
Attachment #576297 - Flags: review?(jorendorff)
Comment on attachment 576311 [details] [diff] [review]
Change JS_CompileFile* to JS_CompileUTF8File* and fix non-UTF8 tests. (fixed)

Review of attachment 576311 [details] [diff] [review]:
-----------------------------------------------------------------

All the changes in js/src are fine. The test changes look fine. (I guess test_LightweightThemeManager.js just doesn’t do what it looks like at all without this change? good grief)

Clearing the review flag rather than setting r+ because of the change in mozJSComponentLoader.cpp. Ask an XPConnect peer about that.

::: ipc/testshell/XPCShellEnvironment.cpp
@@ +338,2 @@
>                                                     Environment(cx)->GetPrincipal());
>          fclose(file);

In all the places like this, also fix up the indentation on the following line.

::: js/src/jsapi.cpp
@@ +4828,1 @@
>              }

Please file a followup bug to replace this with a call to fread.

::: js/xpconnect/loader/mozJSComponentLoader.cpp
@@ +937,5 @@
>  #else  /* HAVE_PR_MEMMAP */
>  
>              /**
>               * No memmap implementation, so fall back to using
> +             * JS_CompileUTF8FileHandleForPrincipals().

With this change, we treat scripts as Latin-1 if HAVE_PR_MEMMAP and UTF-8 otherwise. It should be one or the other on all platforms.

But more importantly we should not switch to UTF-8 here without feedback from people who know what would break. I suspect it’s too big a breaking change.

I thought the idea was that the change would only affect xpcshell and the js shell.
Attachment #576311 - Flags: review?(jorendorff)
If I had a dime for every inexplicable bad thing Splinter ever did to me I would be all set for lunch, but not as all set for lunch as I’d be if I had a dime for every time I’ve complained about Splinter doing something inexplicably bad to me and got a response like “it always includes the preceding five lines, just plan accordingly”.

Not bitter!
per jst, just change this fallback code (which only runs on OS/2 w/o mmap) to read in the file and use CompileScript
Attachment #576311 - Attachment is obsolete: true
Attachment #577430 - Flags: review?(jst)
Comment on attachment 577430 [details] [diff] [review]
Change JS_CompileFile* to JS_CompileUTF8File* and fix non-UTF8 tests.

- In XPCShellEnvironment.cpp

         JSScript* script =
-            JS_CompileFileHandleForPrincipals(cx, obj, filename, file,
+            JS_CompileUTF8FileHandleForPrincipals(cx, obj, filename, file,
                                               env->GetPrincipal());

Fix second line argument indentation to line up with the arguments on the first line. Same thing in xpcshell.cpp.

- In mozJSComponentLoader::GlobalForLocation():

+            struct stat st;
+            int ok = fstat(fileno(fileHandle), &st);
+            if (ok != 0) {
+                NS_WARNING("Failed to stat file");
+                return NS_ERROR_FAILURE;
+            }
+
+            size_t len = st.st_size;

You can get the size of the file from aComponentFile->GetFileSize(), and thus avoid the explicit stat here, and the need for the extra #include at the top of this file.

+            size_t rlen = fread(buf, 1, len, fileHandle);
+            if (!rlen) {

We should probably check rlen != len here to catch cases where the read ends short of reading what it was asked to (due to errors of various kinds etc).

r=jst with that for the tests and xpconnect changes, jorendorff should look at the JS engine changes in this patch.
Attachment #577430 - Flags: review?(jst) → review+
Changes from jst, also fixed issue where JS_SetOptions(oldopts) wasn't being called on all return paths
Attachment #577430 - Attachment is obsolete: true
Comment on attachment 577735 [details] [diff] [review]
Change JS_CompileFile* to JS_CompileUTF8File* and fix non-UTF8 tests

Jason -
jst okay'd the component loader changes here, r?ing you to make sure you're happy with the final JS stuff
Attachment #577735 - Flags: review?(jorendorff)
Comment on attachment 577735 [details] [diff] [review]
Change JS_CompileFile* to JS_CompileUTF8File* and fix non-UTF8 tests

Great. Thanks for finishing this!
Attachment #577735 - Flags: review?(jorendorff) → review+
Can this be checked in? Jason?
Rebased against newest moz central

This is ready for checkin
Attachment #577735 - Attachment is obsolete: true
Keywords: checkin-needed
John, the indentation goes wrong in a number of cases; could you fix that?
Keywords: checkin-needed
Fixed indent in several spots

The funny thing is I specifically went through looking for indentation issues once. I'm not sure how I missed approximately all of them.
Attachment #582330 - Attachment is obsolete: true
Thanks, I pushed to inbound:

https://hg.mozilla.org/integration/mozilla-inbound/rev/ba447ace2594
Status: NEW → ASSIGNED
Flags: in-testsuite+
Target Milestone: --- → mozilla11
And backed because frontend::CompileScript grew an extra argument.

https://hg.mozilla.org/integration/mozilla-inbound/rev/a658b1e5d56b

I've fixed that locally, and I'm sending it to try with my own patches tonight. I'll land over the weekend if everything goes well.
Ack, the only change in the rebase was a stdint type in that block, so I missed the sneaky NULL arg.

I ran a full build/tests again with the arg fix (which I should've done in the first place). This has been through try once so hopefully it wont cause any issues, but it couldn't hurt to run it again, especially with the recent JS changes. Let me know if anything melts.
Attachment #582348 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/d89f3d80d3ec
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Depends on: 711980
You need to log in before you can comment on or make changes to this bug.