Implement asm.js support in UTF-8 source text

RESOLVED FIXED

Status

()

enhancement
P2
normal
RESOLVED FIXED
5 months ago
4 months ago

People

(Reporter: Waldo, Assigned: Waldo)

Tracking

Trunk
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox66 affected)

Details

Attachments

(7 attachments)

Assignee

Description

5 months ago
At present (as an expediency thought initially to maybe even be a permanent choice) asm.js parsing is not supported in UTF-8 source text -- only in UTF-16.  This needs to change if we are to pass UTF-8 source directly from the browser to the JS engine, because Luke thinks we're not quite at the point where asm.js users can be expected to have moved to wasm from asm.js.  (I'm not sure I agree, but I sure as heck don't *disagree*.)

To some degree this just requires adding templates out the wazoo to AsmJS.cpp (and changing source layout where possible to reduce the need for templates everywhere, because lots of AsmJS.cpp functionality requires no knowledge of source unit type).  That's all easy and mostly straightforward.

But eventually we run into the problem that

struct AsmJSCacheOps {
  OpenAsmJSCacheEntryForReadOp openEntryForRead = nullptr;
  CloseAsmJSCacheEntryForReadOp closeEntryForRead = nullptr;
  OpenAsmJSCacheEntryForWriteOp openEntryForWrite = nullptr;
  CloseAsmJSCacheEntryForWriteOp closeEntryForWrite = nullptr;
};

contains function pointers that look like

using OpenAsmJSCacheEntryForReadOp =
    bool (*)(HandleObject global, const char16_t* begin, const char16_t* limit,
             size_t* size, const uint8_t** memory, intptr_t* handle);
using OpenAsmJSCacheEntryForWriteOp = AsmJSCacheResult (*)(
    HandleObject global, const char16_t* begin, const char16_t* end,
    size_t size, uint8_t** memory, intptr_t* handle);

that encode source unit type in their signatures.  The super-easiest fix is to keep duplicating -- splitting these function pointers into UTF-8 and UTF-16 variants.  I'll wait on figuring that out til the new year, I guess, seeing as Luke seems to be Out right now.

In the meantime, tho, I can post all the patches leading up to the point where that wrinkle has to be sorted out.  Commence primary ignition...
Assignee

Updated

5 months ago
Assignee: nobody → jwalden
Status: NEW → ASSIGNED
Assignee

Comment 2

5 months ago
Unfortunately ubsan doesn't pick this up because the UB occurs in the standard library, not in compiler-compiled code.  :-(
Attachment #9033587 - Flags: review?(arai.unmht)
Assignee

Comment 3

5 months ago
The number of things that have to take FunctionValidator& is super-unfortunate because each then requires templating, but I'm not sure there's anything better to do.
Attachment #9033588 - Flags: review?(arai.unmht)
Assignee

Comment 4

5 months ago
Just stylistic improvements, made in passing.
Attachment #9033589 - Flags: review?(arai.unmht)
Assignee

Comment 5

5 months ago
At this point, pretty much all of AsmJS.cpp is ready for UTF-8 except CompileAsmJS and the open/read/write-cache functions it directly calls, basically.  And all the <Unit>-templated functions/classes are only instantiated for char16_t still.  Doing anything for UTF-8 will require answers to the questions posed in comment 0.
Attachment #9033590 - Flags: review?(arai.unmht)
Attachment #9033583 - Flags: review?(arai.unmht) → review+
Attachment #9033587 - Flags: review?(arai.unmht) → review+
Assignee

Comment 6

5 months ago
Luke, thoughts on the question in comment 0 -- should the read/write asm.js cache ops just be duplicated (i.e. overloads that call a single templated implementation) for UTF-8 and UTF-16?  Or should some other thing be attempted?
Flags: needinfo?(luke)
Attachment #9033588 - Flags: review?(arai.unmht) → review+
Attachment #9033589 - Flags: review?(arai.unmht) → review+
Attachment #9033590 - Flags: review?(arai.unmht) → review+
Assignee

Updated

5 months ago
Keywords: leave-open

Comment 7

5 months ago
Pushed by jwalden@mit.edu:
https://hg.mozilla.org/integration/mozilla-inbound/rev/0e39815d1bce
Make wasm/AsmJS.h minimally include only what it needs.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/e58454c68f0f
Don't use std::abs on a value that could possibly be -2147483648 (whose absolute value can't be computed when |int| is 32-bit-sized).  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/c7577c232591
Split {Module,Function}Validator classes into {Module,Function}Validator{,Shared} to segregate source-unit-agnostic parts from source-unit-aware parts, use the correct types in all signatures, and use *ValidatorShared::* instead of *Validator::* for nested classes that are common to both source types.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/ea3ab4b83152
Use |using| to define various typedefs in AsmJS.cpp.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/4a8940073f8c
Make AsmJSParser a template typedef and {Module,Function}Validator template classes.  r=arai

Comment 8

5 months ago
Pushed by jwalden@mit.edu:
https://hg.mozilla.org/integration/mozilla-inbound/rev/df1b7a886a9d
Split {Module,Function}Validator classes into {Module,Function}Validator{,Shared} to segregate source-unit-agnostic parts from source-unit-aware parts, use the correct types in all signatures, and use *ValidatorShared::* instead of *Validator::* for nested classes that are common to both source types.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/d1267711aef0
Make AsmJSParser a template typedef and {Module,Function}Validator template classes.  r=arai
Thanks for all the work here to keep asm.js working!  Ugh, I forgot about the caching hooks.

Ultimately, the browser-side implementation of these callbacks is just using the char16_t*'s to: (1) hash the first fixed number of chars to do a first-level lookup, (2) using the length found by the first-level lookup, do a memcmp() for the complete size.  In theory, both of these could be done in a unit-agnostic manner, caring only about byte lengths, and thus passing void*'s.  Could you perhaps make this change?
Flags: needinfo?(luke)

Comment 10

5 months ago
Pushed by jwalden@mit.edu:
https://hg.mozilla.org/integration/mozilla-inbound/rev/dc3c29af8881
Make wasm/AsmJS.h minimally include only what it needs.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/5f69b689d7c2
Don't use std::abs on a value that could possibly be -2147483648 (whose absolute value can't be computed when |int| is 32-bit-sized).  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/6bdb62915f68
Use |using| to define various typedefs in AsmJS.cpp.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/cf7153f42f1c
Split {Module,Function}Validator classes into {Module,Function}Validator{,Shared} to segregate source-unit-agnostic parts from source-unit-aware parts, use the correct types in all signatures, and use *ValidatorShared::* instead of *Validator::* for nested classes that are common to both source types.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/11d9ca021f12
Make AsmJSParser a template typedef and {Module,Function}Validator template classes.  r=arai

Comment 12

4 months ago

(In reply to Luke Wagner [:luke] from comment #9)
Alternatively, I suppose it'd be fine to simply kill dom/asmjscache. It's been slated for removal, ahead of full asm.js, for a while. I had been hoping to get bug 1487113 finished first, but given the close proximity, it seems wasteful to spend more than a few hours keeping it working right before removal.

I had also been concerned about keeping the serialization mechanism tested, but now we have bug 1330661 which is much better.

Assignee

Comment 13

4 months ago

Mm. Killing dom/asmjscache looks like a bit of change, possibly more than I want to do with the concomitant risk. I spent a few hours today trying to do the desired adaptations to asm.js caching for UTF-8 support, and it appears not to have been too horrific or tricky to get right, so I will probably go with that for now. I have the patches for this locally, but I'm not yet 100% confident to the point of putting them up for review. Maybe tomorrow.

Assignee

Comment 14

4 months ago

...okay, now I've spent too long (more than a more than a few hours, now) trying to figure out why my attempts at such are failing (test_cachingMulti.html starts indicating cache non-hits), probably should just do now what I should have done once comment 13 gave me permission. :-|

sgtm. Feel free just to do the minimal cut in one patch and I can do a follow-up to do the full expungement.

Assignee

Comment 16

4 months ago
Okay, this mostly removes the calls to the caching API goo, and it removes *some* of the related code clearly only invoked for caching.  Probably more to remove beyond this.

The existing tests that query for cache-hitness had to be altered for all of this.  I did what seemed like the most minimal fix: remove all functions that expose cache status, fix up callers, and -- because isCachingEnabled/setCachingEnabled are only relevant for this purpose -- removing them too.  (And generally replacing them with a thing that is a subset of what the is-cached helper does, namely do the type-check that function did only to throw if the check missed.)  That made the change sprawl some, but I'm hard-pressed to say what else could be done, really.

This makes it dead simple to actually enable UTF-8 asm.js, but it doesn't actually do it -- Parser still bails if it encounters asm.js in UTF-8 with this patch alone in place.
Attachment #9037378 - Flags: review?(luke)
Assignee

Comment 17

4 months ago
Very minimal testing indicates this does the expected thing -- namely, running the shell with -u -w and seeing that asm.js compilation was successful on a function.  jstests also seem to pass with it, but I'm not entirely sure that I invoked them properly to test the UTF-8 path, so I hesitate to make the jstests claim confidently.
Attachment #9037379 - Flags: review?(arai.unmht)
Comment on attachment 9037378 [details] [diff] [review]
Remove the implementation of the asm.js caching mechanism (APIs to invoke it still remain in place; they just don't do anything)

Review of attachment 9037378 [details] [diff] [review]:
-----------------------------------------------------------------

Good by asm.js caching; we had some good times.  And few crashes.  But mostly good times.
Attachment #9037378 - Flags: review?(luke) → review+

Updated

4 months ago
Blocks: 1520931
Attachment #9037379 - Flags: review?(arai.unmht) → review+
Priority: -- → P2

Comment 19

4 months ago
Pushed by jwalden@mit.edu:
https://hg.mozilla.org/integration/mozilla-inbound/rev/dc3f5f51778c
Remove the implementation of the asm.js caching mechanism.  (APIs to invoke it still remain in place; they just don't do anything.)  r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/0f49f3cc157d
Make asm.js work in UTF-8 source text.  r=arai
Assignee

Comment 20

4 months ago

Turns out, per tryservering, the patch was missing a corresponding update to the binast tests to remove the latin1/asm.js-alikes. Landed with that additional tweak.

Assignee

Updated

4 months ago
Status: ASSIGNED → RESOLVED
Last Resolved: 4 months ago
Keywords: leave-open
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.