Closed Bug 1516697 Opened 6 years ago Closed 6 years ago

Implement asm.js support in UTF-8 source text

Categories

(Core :: JavaScript Engine, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
Tracking Status
firefox66 --- affected

People

(Reporter: Waldo, Assigned: Waldo)

References

Details

Attachments

(7 files)

At present (as an expediency thought initially to maybe even be a permanent choice) asm.js parsing is not supported in UTF-8 source text -- only in UTF-16. This needs to change if we are to pass UTF-8 source directly from the browser to the JS engine, because Luke thinks we're not quite at the point where asm.js users can be expected to have moved to wasm from asm.js. (I'm not sure I agree, but I sure as heck don't *disagree*.) To some degree this just requires adding templates out the wazoo to AsmJS.cpp (and changing source layout where possible to reduce the need for templates everywhere, because lots of AsmJS.cpp functionality requires no knowledge of source unit type). That's all easy and mostly straightforward. But eventually we run into the problem that struct AsmJSCacheOps { OpenAsmJSCacheEntryForReadOp openEntryForRead = nullptr; CloseAsmJSCacheEntryForReadOp closeEntryForRead = nullptr; OpenAsmJSCacheEntryForWriteOp openEntryForWrite = nullptr; CloseAsmJSCacheEntryForWriteOp closeEntryForWrite = nullptr; }; contains function pointers that look like using OpenAsmJSCacheEntryForReadOp = bool (*)(HandleObject global, const char16_t* begin, const char16_t* limit, size_t* size, const uint8_t** memory, intptr_t* handle); using OpenAsmJSCacheEntryForWriteOp = AsmJSCacheResult (*)( HandleObject global, const char16_t* begin, const char16_t* end, size_t size, uint8_t** memory, intptr_t* handle); that encode source unit type in their signatures. The super-easiest fix is to keep duplicating -- splitting these function pointers into UTF-8 and UTF-16 variants. I'll wait on figuring that out til the new year, I guess, seeing as Luke seems to be Out right now. In the meantime, tho, I can post all the patches leading up to the point where that wrinkle has to be sorted out. Commence primary ignition...
Assignee: nobody → jwalden
Status: NEW → ASSIGNED
Unfortunately ubsan doesn't pick this up because the UB occurs in the standard library, not in compiler-compiled code. :-(
Attachment #9033587 - Flags: review?(arai.unmht)
The number of things that have to take FunctionValidator& is super-unfortunate because each then requires templating, but I'm not sure there's anything better to do.
Attachment #9033588 - Flags: review?(arai.unmht)
Just stylistic improvements, made in passing.
Attachment #9033589 - Flags: review?(arai.unmht)
At this point, pretty much all of AsmJS.cpp is ready for UTF-8 except CompileAsmJS and the open/read/write-cache functions it directly calls, basically. And all the <Unit>-templated functions/classes are only instantiated for char16_t still. Doing anything for UTF-8 will require answers to the questions posed in comment 0.
Attachment #9033590 - Flags: review?(arai.unmht)
Attachment #9033583 - Flags: review?(arai.unmht) → review+
Attachment #9033587 - Flags: review?(arai.unmht) → review+
Luke, thoughts on the question in comment 0 -- should the read/write asm.js cache ops just be duplicated (i.e. overloads that call a single templated implementation) for UTF-8 and UTF-16? Or should some other thing be attempted?
Flags: needinfo?(luke)
Attachment #9033588 - Flags: review?(arai.unmht) → review+
Attachment #9033589 - Flags: review?(arai.unmht) → review+
Attachment #9033590 - Flags: review?(arai.unmht) → review+
Keywords: leave-open
Pushed by jwalden@mit.edu: https://hg.mozilla.org/integration/mozilla-inbound/rev/0e39815d1bce Make wasm/AsmJS.h minimally include only what it needs. r=arai https://hg.mozilla.org/integration/mozilla-inbound/rev/e58454c68f0f Don't use std::abs on a value that could possibly be -2147483648 (whose absolute value can't be computed when |int| is 32-bit-sized). r=arai https://hg.mozilla.org/integration/mozilla-inbound/rev/c7577c232591 Split {Module,Function}Validator classes into {Module,Function}Validator{,Shared} to segregate source-unit-agnostic parts from source-unit-aware parts, use the correct types in all signatures, and use *ValidatorShared::* instead of *Validator::* for nested classes that are common to both source types. r=arai https://hg.mozilla.org/integration/mozilla-inbound/rev/ea3ab4b83152 Use |using| to define various typedefs in AsmJS.cpp. r=arai https://hg.mozilla.org/integration/mozilla-inbound/rev/4a8940073f8c Make AsmJSParser a template typedef and {Module,Function}Validator template classes. r=arai
Pushed by jwalden@mit.edu: https://hg.mozilla.org/integration/mozilla-inbound/rev/df1b7a886a9d Split {Module,Function}Validator classes into {Module,Function}Validator{,Shared} to segregate source-unit-agnostic parts from source-unit-aware parts, use the correct types in all signatures, and use *ValidatorShared::* instead of *Validator::* for nested classes that are common to both source types. r=arai https://hg.mozilla.org/integration/mozilla-inbound/rev/d1267711aef0 Make AsmJSParser a template typedef and {Module,Function}Validator template classes. r=arai
Thanks for all the work here to keep asm.js working! Ugh, I forgot about the caching hooks. Ultimately, the browser-side implementation of these callbacks is just using the char16_t*'s to: (1) hash the first fixed number of chars to do a first-level lookup, (2) using the length found by the first-level lookup, do a memcmp() for the complete size. In theory, both of these could be done in a unit-agnostic manner, caring only about byte lengths, and thus passing void*'s. Could you perhaps make this change?
Flags: needinfo?(luke)
Pushed by jwalden@mit.edu: https://hg.mozilla.org/integration/mozilla-inbound/rev/dc3c29af8881 Make wasm/AsmJS.h minimally include only what it needs. r=arai https://hg.mozilla.org/integration/mozilla-inbound/rev/5f69b689d7c2 Don't use std::abs on a value that could possibly be -2147483648 (whose absolute value can't be computed when |int| is 32-bit-sized). r=arai https://hg.mozilla.org/integration/mozilla-inbound/rev/6bdb62915f68 Use |using| to define various typedefs in AsmJS.cpp. r=arai https://hg.mozilla.org/integration/mozilla-inbound/rev/cf7153f42f1c Split {Module,Function}Validator classes into {Module,Function}Validator{,Shared} to segregate source-unit-agnostic parts from source-unit-aware parts, use the correct types in all signatures, and use *ValidatorShared::* instead of *Validator::* for nested classes that are common to both source types. r=arai https://hg.mozilla.org/integration/mozilla-inbound/rev/11d9ca021f12 Make AsmJSParser a template typedef and {Module,Function}Validator template classes. r=arai

(In reply to Luke Wagner [:luke] from comment #9)
Alternatively, I suppose it'd be fine to simply kill dom/asmjscache. It's been slated for removal, ahead of full asm.js, for a while. I had been hoping to get bug 1487113 finished first, but given the close proximity, it seems wasteful to spend more than a few hours keeping it working right before removal.

I had also been concerned about keeping the serialization mechanism tested, but now we have bug 1330661 which is much better.

Mm. Killing dom/asmjscache looks like a bit of change, possibly more than I want to do with the concomitant risk. I spent a few hours today trying to do the desired adaptations to asm.js caching for UTF-8 support, and it appears not to have been too horrific or tricky to get right, so I will probably go with that for now. I have the patches for this locally, but I'm not yet 100% confident to the point of putting them up for review. Maybe tomorrow.

...okay, now I've spent too long (more than a more than a few hours, now) trying to figure out why my attempts at such are failing (test_cachingMulti.html starts indicating cache non-hits), probably should just do now what I should have done once comment 13 gave me permission. :-|

sgtm. Feel free just to do the minimal cut in one patch and I can do a follow-up to do the full expungement.

Okay, this mostly removes the calls to the caching API goo, and it removes *some* of the related code clearly only invoked for caching. Probably more to remove beyond this. The existing tests that query for cache-hitness had to be altered for all of this. I did what seemed like the most minimal fix: remove all functions that expose cache status, fix up callers, and -- because isCachingEnabled/setCachingEnabled are only relevant for this purpose -- removing them too. (And generally replacing them with a thing that is a subset of what the is-cached helper does, namely do the type-check that function did only to throw if the check missed.) That made the change sprawl some, but I'm hard-pressed to say what else could be done, really. This makes it dead simple to actually enable UTF-8 asm.js, but it doesn't actually do it -- Parser still bails if it encounters asm.js in UTF-8 with this patch alone in place.
Attachment #9037378 - Flags: review?(luke)
Very minimal testing indicates this does the expected thing -- namely, running the shell with -u -w and seeing that asm.js compilation was successful on a function. jstests also seem to pass with it, but I'm not entirely sure that I invoked them properly to test the UTF-8 path, so I hesitate to make the jstests claim confidently.
Attachment #9037379 - Flags: review?(arai.unmht)
Comment on attachment 9037378 [details] [diff] [review] Remove the implementation of the asm.js caching mechanism (APIs to invoke it still remain in place; they just don't do anything) Review of attachment 9037378 [details] [diff] [review]: ----------------------------------------------------------------- Good by asm.js caching; we had some good times. And few crashes. But mostly good times.
Attachment #9037378 - Flags: review?(luke) → review+
Blocks: 1520931
Attachment #9037379 - Flags: review?(arai.unmht) → review+
Priority: -- → P2
Pushed by jwalden@mit.edu: https://hg.mozilla.org/integration/mozilla-inbound/rev/dc3f5f51778c Remove the implementation of the asm.js caching mechanism. (APIs to invoke it still remain in place; they just don't do anything.) r=luke https://hg.mozilla.org/integration/mozilla-inbound/rev/0f49f3cc157d Make asm.js work in UTF-8 source text. r=arai

Turns out, per tryservering, the patch was missing a corresponding update to the binast tests to remove the latin1/asm.js-alikes. Landed with that additional tweak.

Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Keywords: leave-open
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: