Closed Bug 1292724 Opened 8 years ago Closed 8 years ago

Baldr: remove call_import, add imports to function index space

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla51
Tracking Status
firefox51 --- fixed

People

(Reporter: luke, Assigned: luke)

References

Details

Attachments

(4 files, 4 obsolete files)

A recent change to the spec puts imports into the function index space (like arguments):
  https://github.com/WebAssembly/design/blob/master/Modules.md#function-index-space
so that they can be called by `call`. `call_import` is remoted.

Having imports in the index also means:
 * imports can be rexported; function identity should be preserved
 * imports can be used in elem segments meaning that even tables that are not imported/exported can be "external" (fortunately this can still be detected statically)
Attached patch late-throwSplinter Review
This patch is unrelated, but I noticed the discrepancy while working in this area for the later patch: as described in JS.md, errors are only thrown when calling an import/export that takes/returns an i64.  Fortunately, we already allow i64 in all these places in wasmTestMode, so it's quite easy to turn some asserts into dynamic checks.
Assignee: nobody → luke
Status: NEW → ASSIGNED
Attachment #8783798 - Flags: review?(sunfish)
Attached patch tidy-elem-segSplinter Review
Currently, ElemSegments store only the raw offsets for their entries and there is this kinda dirty hack where ElemSegments.elems starts as a vector of function indices and is then converted in-place into a vector of offsets.  For the later patch, it'd be a lot nicer if we instead (1) kept the original function indices around (b/c they can index imports), (2) stored CodeRange indices instead of offsets (since a CodeRange is more convenient).
Attachment #8783800 - Flags: review?(sunfish)
Attached patch fix-init-expr-offsets (obsolete) — Splinter Review
Technically unrelated, but I also noticed that memory data segments' initializer expressions are currently hardcoded to constants.  This patch generalizes them to InitExprs.
Attachment #8783801 - Flags: review?(sunfish)
Oh, one more thing about the last patch: when data/elem segments were given init-exprs, the "monotonically increasing and non-overlapping" constraints were dropped (b/c now it's not static) so the patch drops this constraint as well.
Attached patch call-import (obsolete) — Splinter Review
Finally, here's the real patch.  This patch keeps things the same for 0xb but removes Expr::CallImport and changes the index space for new-format (which is compiled by the new Module() ctor).  (Once we merge 0xc, we can remove all the !new-format paths.)

To write the patch I started by renaming *everything* from funcIndex to funcDefIndex.  A "funcDef" means "function definition" and is the opposite of a function import.  Then I audited each funcDefIndex and decided which ones should really be a funcIndex (and thus required generalization to deal with the import vs. def case).  There are four cases:
 - start functions
 - function exports
 - elem section elements
 - Expr::Call callees
The rest really are indexing [0, numFuncDefs).

Expr::Call case is an interesting case when it comes to asm.js: even once we've eliminated 0xb, asm.js still really wants Expr::Call to have a funcDefIndex.  The reason is that the set of function imports is not known until after seeing all the function bodies.  Thus, for both asm.js and 0xb, Expr::Call takes an index in the range [0, numFuncDefs), but new-format wasm takes a real function index.  This is a sad little extra bit of complexity which we'll have to wait for asm.js-removal to remove.
Attachment #8783803 - Flags: review?(sunfish)
Comment on attachment 8783798 [details] [diff] [review]
late-throw

Review of attachment 8783798 [details] [diff] [review]:
-----------------------------------------------------------------

The code looks good. One question: It looks like this change should make it possible for a wasm instance to call an exported function from another wasm instance with i64 args and returns. Is that true? If so, can we add a test for that?
Attachment #8783798 - Flags: review?(sunfish) → review+
(In reply to Dan Gohman [:sunfish] from comment #6)
Ah yes, of course.  Done (and it works!)
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8d266498a981
Baldr: don't fail to validate i64 imports and exports; throw runtime errors (r=sunfish)
Whiteboard: [leave open]
Comment on attachment 8783800 [details] [diff] [review]
tidy-elem-seg

Review of attachment 8783800 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/asmjs/WasmModule.h
@@ +174,1 @@
>      {}

I'd find it clarifying to MOZ_ASSERT(elemCodeRangeIndices.length() == elemFuncIndices.length()), to emphasize that they're using the same index space.
Attachment #8783800 - Flags: review?(sunfish) → review+
Comment on attachment 8783801 [details] [diff] [review]
fix-init-expr-offsets

Review of attachment 8783801 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/asmjs/WasmBinaryToExperimentalText.cpp
@@ +1728,5 @@
>          if (!PrintIndent(c))
>              return false;
>          if (!c.buffer.append("segment "))
>             return false;
> +        if (!PrintInt32(c, segment->offset()->as<AstConst>().val().i32()))

What if it's not an AstConst? Or is this just work-in-progress code? (ditto for WasmBinaryToText.cpp).

::: js/src/asmjs/WasmCompile.cpp
@@ -1422,5 @@
> -        if (!d.readVarU32(&seg.memoryOffset))
> -            return Fail(d, "expected segment destination offset");
> -
> -        if (seg.memoryOffset < prevEnd)
> -            return Fail(d, "data segments must be disjoint and ordered");

WasmBinaryToAST.cpp still has a version of this check too.
(In reply to Dan Gohman [:sunfish] from comment #11)
> What if it's not an AstConst? Or is this just work-in-progress code? (ditto
> for WasmBinaryToText.cpp).

WasmBinaryToAST.cpp does not yet know about newFormat and these can only be AstConst nodes.  Before making newFormat the only format, I'll need to go down and, in each place there is a newFormat branch in WasmTextToBinary.cpp make the symmetric generalization to WasmBinaryTo*.cpp and then update all the totext tests.

> WasmBinaryToAST.cpp still has a version of this check too.

Right, that can be removed.
Comment on attachment 8783803 [details] [diff] [review]
call-import

Clearing while we sort out https://github.com/WebAssembly/design/pull/779 and also all-or-nothing initialization.
Attachment #8783803 - Attachment is obsolete: true
Attachment #8783803 - Flags: review?(sunfish)
Comment on attachment 8783801 [details] [diff] [review]
fix-init-expr-offsets

Also dependent on https://github.com/WebAssembly/design/pull/780.
Attachment #8783801 - Flags: review?(sunfish)
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7486f26780ca
Baldr: simplify elem segments (r=sunfish)
Blocks: 1287220
Attached patch fix-init-expr-offsets (obsolete) — Splinter Review
This patch adds initialization expressions to memory segments.  The patch also removes the monotonic/non-overlapping checks and ensures no partial initialization which matches https://github.com/WebAssembly/design/pull/780.
Attachment #8783801 - Attachment is obsolete: true
Attachment #8787726 - Flags: review?(bbouvier)
Attached patch call-import (obsolete) — Splinter Review
See comment 5 for explanation.  This patch additional moves the element section to right before the code section based on https://github.com/WebAssembly/design/pull/779.
Attachment #8787729 - Flags: review?(bbouvier)
Oops, I left a "TODO: write a test" :)
Attachment #8787726 - Attachment is obsolete: true
Attachment #8787726 - Flags: review?(bbouvier)
Attachment #8787749 - Flags: review?(bbouvier)
Attached patch call-importSplinter Review
Rebased
Attachment #8787729 - Attachment is obsolete: true
Attachment #8787729 - Flags: review?(bbouvier)
Attachment #8787750 - Flags: review?(bbouvier)
(Both PRs now merged in the design repo.)
Comment on attachment 8787749 [details] [diff] [review]
fix-init-expr-offsets

Review of attachment 8787749 [details] [diff] [review]:
-----------------------------------------------------------------

Nice! (don't forget to change the changeset message)

::: js/src/asmjs/WasmModule.cpp
@@ +454,5 @@
> +
> +            if (offset > memoryLength || memoryLength - offset < seg.length) {
> +                JS_ReportErrorNumber(cx, GetErrorMessage, nullptr, JSMSG_WASM_BAD_FIT, "data", "memory");
> +                return false;
> +            }

Should we also do the same checks (or MOZ_ASSERT, at the very least) for seg.bytecodeOffset / bytecode_.length()?

Thinking about it, the checks could be done in WasmCompile (they don't seem to be done, as far as I can tell), and we could just MOZ_ASSERT here.

@@ +460,5 @@
> +    } else {
> +        MOZ_ASSERT(dataSegments_.empty());
> +    }
> +
> +    // The instance is now fully operational and initialization and can now be

nit: this part sounds weird: "and initialization"? Not sure what is meant here.

@@ +820,5 @@
>          return false;
>  
> +    // Now that the instance is fully live and initialized, the start function.
> +    // Note that failure may cause instantiation to throw, but the instance may
> +    // still be live via edges created by initSegments or the start function.

So if a function is referenced in a table, but the start function throws, you can still access the function via the function table? ... it sounds strange in terms of semantics (so not tied to this implementation in this bug): it sounds error prone. If the start() function is used to do some initialization of the module, and a wasm writer (via Emscripten e.g.) is not aware their start function may fail under certain conditions, then the functions could still be used while the module is not initialized, leading to hard to debug, confusing behavior, maybe?

::: js/src/asmjs/WasmModule.h
@@ +214,4 @@
>                            SharedTableVector* tables) const;
> +    bool initSegments(JSContext* cx,
> +                      HandleWasmInstanceObject instanceObj,
> +                      HandleWasmMemoryObject memoryObj,

pre-existing: could rename to instance and memory, here (for symmetry with the above)

::: js/src/asmjs/WasmTextToBinary.cpp
@@ +2456,5 @@
>  
>      while (c.ts.getIf(WasmToken::OpenParen)) {
>          if (!c.ts.match(WasmToken::Segment, c.error))
>              return false;
> +        AstDataSegment* segment = ParseDataSegment(c, false);

Can you add /* newFormat = */?

@@ +4206,5 @@
>      if (!EncodeCodeSection(e, module))
>          return false;
>  
> +    if (!EncodeElemSection(e, newFormat, module))
> +        return false;

Nice catch! And now we even have a test for that!

::: js/src/jit-test/tests/wasm/tables.js
@@ -22,5 @@
>  
>  assertErrorMessage(() => new Module(textToBinary(`(module (table (resizable 10)) (elem (i32.const 10) $f0) ${callee(0)})`)), TypeError, /element segment does not fit/);
>  assertErrorMessage(() => new Module(textToBinary(`(module (table (resizable 10)) (elem (i32.const 8) $f0 $f0 $f0) ${callee(0)})`)), TypeError, /element segment does not fit/);
> -assertErrorMessage(() => new Module(textToBinary(`(module (table (resizable 10)) (elem (i32.const 1) $f0 $f0) (elem (i32.const 0) $f0) ${callee(0)})`)), TypeError, /must be.*ordered/);
> -assertErrorMessage(() => new Module(textToBinary(`(module (table (resizable 10)) (elem (i32.const 1) $f0 $f0) (elem (i32.const 2) $f0) ${callee(0)})`)), TypeError, /must be.*disjoint/);

Can you replace these tests by some which check that they can be *not* ordered and *not* disjoint?
Attachment #8787749 - Flags: review?(bbouvier) → review+
(In reply to Benjamin Bouvier [:bbouvier] from comment #22)
> Should we also do the same checks (or MOZ_ASSERT, at the very least) for
> seg.bytecodeOffset / bytecode_.length()?
> 
> Thinking about it, the checks could be done in WasmCompile (they don't seem
> to be done, as far as I can tell), and we could just MOZ_ASSERT here.

Unless I've missed something, these values are valid by construction: seg.bytecodeOffset assigned d.currentOffset() (thus a valid offset) and then we readBytes(seg.length) and later finishSection() which ensures seg.length is valid.  I'll definitely add the MOZ_ASSERT() though.

> So if a function is referenced in a table, but the start function throws,
> you can still access the function via the function table? ... it sounds
> strange in terms of semantics (so not tied to this implementation in this
> bug): it sounds error prone. If the start() function is used to do some
> initialization of the module, and a wasm writer (via Emscripten e.g.) is not
> aware their start function may fail under certain conditions, then the
> functions could still be used while the module is not initialized, leading
> to hard to debug, confusing behavior, maybe?

Definitely a good question.  Since the start function code expects to execute in the same environment as the rest of the code, I think we don't have an option to only write the segments after start succeeds.  If the start function fails, I don't think there is a universally-valid way to do rollback.  So I think that forces us to do what we have now.

Realistically, I don't think it's too bad and I'd expect one of these cases:
 - start is infallible or fatal or assumed to never happen
 - on failure, the space allocated by the dynamic loader to write the elem/data segments is reclaimed and either never used (b/c the `dlopen` failed so noone gets a pointer into those data/elems) or clobbered by the next library that is loaded

In fact, the dynamic loader could do its own rollback, nulling out the elems sections on failure (so the GC could collect the now-dead instance).

This reminds me a little of what happens if an exception is uncaught in a normal webapp: event handlers are left registered and can be called in the future.
Comment on attachment 8787750 [details] [diff] [review]
call-import

Review of attachment 8787750 [details] [diff] [review]:
-----------------------------------------------------------------

Looks great! It makes so much sense with the naming. I've suggested a few API improvements that are not mandatory and could be included in this patch or done as a follow-up (or just not done). Nice to see call_import ~go!

::: js/src/asmjs/WasmBaselineCompile.cpp
@@ +5349,5 @@
>  }
>  
>  bool
> +BaseCompiler::emitCallImport(uint32_t callOffset)
> +{

Can you MOZ_ASSERT(!mg_.firstFuncDefIndex) here too?

::: js/src/asmjs/WasmCompile.cpp
@@ +72,5 @@
>              return iter().fail("can't touch memory without memory");
>          return true;
>      }
> +
> +    bool checkHasOldFormat() {

Light suggestion: check*Is*OldFormat?

@@ +1224,5 @@
>      if (!mg.startFuncDef(d.currentOffset(), &fg))
>          return false;
>  
>      ValTypeVector locals;
> +    const Sig& sig = mg.funcDefSig(funcDefIndex);

Random idea (it should be in the general comment, but I had it while reading here). If we'd use two structs FuncDefIndex / FuncIndex (both simply wrapping an uint32_t), we could then use these types in mg calls:

const Sig& sig
ModuleGenerator::funcDefSig(FuncDefIndex index);

const Sig& sig
ModuleGenerator::funcSig(FuncIndex index);

This would prevent misuse of the funcSig / funcDefSig (and other functions specialized for funcDef or func); not sure if the utility over work ratio is high enough.

::: js/src/asmjs/WasmGenerator.cpp
@@ +390,5 @@
>      return true;
>  }
>  
>  bool
>  ModuleGenerator::finishFuncExports()

How about renaming this to finishFuncDefExports? Imports can be re-exported too, but this doesn't happen here.

@@ +408,2 @@
>  
>      MOZ_ASSERT(metadata_->funcExports.empty());

And renaming metadata_->funcExports to funcDefExports?

@@ +822,5 @@
> +    MOZ_ASSERT(!startedFuncDefs_);
> +    MOZ_ASSERT(shared_->tables.length() == 1);
> +
> +    for (size_t i = 0; i < elemFuncIndices.length(); i++) {
> +        uint32_t funcIndex = elemFuncIndices[i];

We could use a ranged for loop here.

@@ +825,5 @@
> +    for (size_t i = 0; i < elemFuncIndices.length(); i++) {
> +        uint32_t funcIndex = elemFuncIndices[i];
> +        if (!funcIndexIsDef(funcIndex)) {
> +            shared_->tables[0].external = true;
> +            continue;

should be a break, right?

@@ +827,5 @@
> +        if (!funcIndexIsDef(funcIndex)) {
> +            shared_->tables[0].external = true;
> +            continue;
> +        }
> +

nit: blank line

::: js/src/asmjs/WasmGenerator.h
@@ +119,5 @@
>      IonCompileTaskVector            tasks_;
>      IonCompileTaskPtrVector         freeTasks_;
>  
>      // Assertions
>      DebugOnly<FunctionGenerator*>   activeFunc_;

At the risk of making it overly obvious, can you rename it activeFuncDef_?

::: js/src/asmjs/WasmIonCompile.cpp
@@ +1946,5 @@
>  }
>  
>  static bool
> +EmitCallImport(FunctionCompiler& f, uint32_t callOffset)
> +{

Can you MOZ_ASSERT(!f.mg().firstFuncDefIndex) here?

::: js/src/asmjs/WasmJS.cpp
@@ +641,5 @@
>      if (!fun)
>          return false;
>  
>      fun->setExtendedSlot(FunctionExtended::WASM_INSTANCE_SLOT, ObjectValue(*instanceObj));
> +    fun->setExtendedSlot(FunctionExtended::WASM_FUNC_INDEX_SLOT, Int32Value(funcDefIndex));

Should we rename WASM_FUNC_INDEX_SLOT to WASM_FUNC_DEF_INDEX_SLOT too?

@@ +667,5 @@
>      return fun->maybeNative() == WasmCall;
>  }
>  
>  bool
> +wasm::IsExportedNonAsmJSFunction(JSFunction* fun)

Or simply wasm::IsExportedWasmFunction?

::: js/src/asmjs/WasmModule.cpp
@@ +484,5 @@
>          for (uint32_t i = 0; i < seg.elemCodeRangeIndices.length(); i++) {
> +            uint32_t elemFuncIndex = seg.elemFuncIndices[i];
> +            if (elemFuncIndex < funcImports.length()) {
> +                MOZ_ASSERT(!metadata().isAsmJS());
> +                MOZ_ASSERT(!table.isTypedFunction());

Can you MOZ_ASSERT(seg.elemCodeRangeIndices[i] == UINT32_MAX) here...

@@ +496,5 @@
> +                WasmInstanceObject* exportInstanceObj = ExportedFunctionToInstanceObject(f);
> +                const CodeRange& cr = exportInstanceObj->getExportedFunctionCodeRange(f);
> +                Instance& exportInstance = exportInstanceObj->instance();
> +                table.set(offset + i, exportInstance.codeBase() + cr.funcTableEntry(), exportInstance);
> +            } else {

... and can you MOZ_ASSERT(seg.elemCodeRangeIndices[i] != UINT32_MAX) here?

@@ +859,5 @@
>      // still be live via edges created by initSegments or the start function.
>  
>      if (metadata_->hasStartFunction()) {
>          FixedInvokeArgs<0> args(cx);
> +        if (metadata_->startFuncIndex() < funcImports.length()) {

metadata_->startFuncIndex() is referenced three times => local?
Attachment #8787750 - Flags: review?(bbouvier) → review+
(In reply to Benjamin Bouvier [:bbouvier] from comment #24)
Thanks!

> Random idea (it should be in the general comment, but I had it while reading
> here). If we'd use two structs FuncDefIndex / FuncIndex (both simply
> wrapping an uint32_t), we could then use these types in mg calls:

That's a great idea and I like the general strategy of putting semantic meaning into the types.  To expedite some dependent stuff, how about I do that in a later patch, though.

Applied all other suggestions, though.

> > +            shared_->tables[0].external = true;
> > +            continue;
> 
> should be a break, right?

Hah, yes.  With two nits and this thinko, this was my weakest loop in the patch by far ;)
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/dd39ceedb7f0
Baldr: fix segment offsets (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/94befb88aee2
Baldr: remove call_import, add imports to function index space (r=bbouvier)
Whiteboard: [leave open]
https://hg.mozilla.org/mozilla-central/rev/dd39ceedb7f0
https://hg.mozilla.org/mozilla-central/rev/94befb88aee2
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla51
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: