Wasm structure types - initial prototype implementation

RESOLVED FIXED in Firefox 64

Status

()

enhancement
P3
normal
RESOLVED FIXED
Last year
8 months ago

People

(Reporter: lth, Assigned: lth)

Tracking

unspecified
mozilla64
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox64 fixed)

Details

Attachments

(9 attachments, 16 obsolete attachments)

28.39 KB, patch
luke
: review+
Details | Diff | Splinter Review
57.59 KB, patch
luke
: review+
Details | Diff | Splinter Review
101.01 KB, patch
luke
: review+
Details | Diff | Splinter Review
14.71 KB, patch
luke
: review+
Details | Diff | Splinter Review
12.78 KB, patch
luke
: review+
Details | Diff | Splinter Review
61.43 KB, patch
luke
: review+
Details | Diff | Splinter Review
35.63 KB, patch
luke
: review+
Details | Diff | Splinter Review
44.66 KB, patch
luke
: review+
Details | Diff | Splinter Review
25.87 KB, application/octet-stream
Details
Basically:

- design a simple struct type system (a subset of what we will eventually have)
- design an encoding
- design instructions to operate on it
- text->binary, binary->text
- validation
- code generation
- runtime support

Design happens here:
https://docs.google.com/document/d/1-oqmyS3sA9jiEvfSUVPl-VfnaOrigfZEp480xPfOQjw
Work in progress: parse, print, and validate struct types.
Work in progress: Implement the newstruct operator.
This changes ValType from being an enum to being a class with a couple of small integer fields, one for the type code and one for an index into a type table (not part of this patch).

In the process, ValType has become a little more disciplined: it has a notion of an invalid value that is opaque to clients, not some hack with TypeCode::Limit; and it can be constructed from integers only through explicit action, not via a cast.  There are checks that bits that are used to construct it it are meaningful.

This is fairly clean, so I'm looking for feedback to make sure we're more or less in agreement before cleaning it up.  Ideally I'd just land this so that we can move forward.

In three or so places I had to add constructors to unions because ValType no longer is an integral value.  I also had to explicitly initialize Val objects in a couple of places.  Furthermore, AddContainerToHash() can't be used, in two places.

Apart from that, and apart from the definition of ValType itself, the changes are all calls that explicitly extract the code from the ValType to switch on.
Attachment #8980347 - Flags: feedback?(luke)
Attachment #8980347 - Flags: feedback?(bbouvier)
Comment on attachment 8980347 [details] [diff] [review]
bug1459900-adapt-valtype.patch

Review of attachment 8980347 [details] [diff] [review]:
-----------------------------------------------------------------

Looks like a plausible path to me, so I think we could move forward with this.

::: js/src/wasm/WasmAST.h
@@ +161,5 @@
>  
>      typedef const AstSig& Lookup;
>      static HashNumber hash(Lookup sig) {
> +        HashNumber hn = HashNumber(sig.ret());
> +        const auto& args = sig.args();

nit: no auto when the implied typed isn't explicit somewhere else in the statement. Using a range based loop below could kill two birds with one stone...

::: js/src/wasm/WasmBinaryConstants.h
@@ +160,5 @@
>  
> +  private:
> +    ValType(uint8_t code, uint32_t refTypeIndex)
> +      : code_(code),
> +        refTypeIndex_(refTypeIndex)

MOZ_ASSERT(code_ == code); // although that one is supplanted by validate()
MOZ_ASSERT(refTypeIndex != InvalidIndex)
MOZ_ASSERT(refTypeIndex_ == refTypeIndex);

@@ +165,5 @@
> +    {
> +        validate();
> +    }
> +
> +    void validate() const {

nit: how about renaming to assertValid? Validate does suggest a fallible operation.
Attachment #8980347 - Flags: feedback?(bbouvier) → feedback+
Comment on attachment 8980347 [details] [diff] [review]
bug1459900-adapt-valtype.patch

Review of attachment 8980347 [details] [diff] [review]:
-----------------------------------------------------------------

Yes, excellent, thanks!

::: js/src/wasm/WasmBaselineCompile.cpp
@@ +8815,4 @@
>  
>      startCallArgs(stackArgAreaSize(sig), &baselineCall);
>      for (uint32_t i = 1; i < sig.length(); i++) {
> +        ValType t = ValType::I32;

Could this and other cases use the default ctor to get Invalid and better error checking?

::: js/src/wasm/WasmBinaryConstants.h
@@ +75,5 @@
>  };
>  
> +enum class ExprType;
> +
> +class ValType

nit: could you move this class def to WasmTypes.h, keeping this header to pure constants?

@@ +151,5 @@
> +    bool operator !=(const ValType& that) const {
> +        return !(*this == that);
> +    }
> +    bool operator ==(ValType::Code that) const {
> +        return code_ == uint32_t(that) && refTypeIndex_ == 0;

The condition on refTypeIndex_ confuses me and has questionable behavior for a Ref type that happens to have type-index 0.  Perhaps instead MOZ_ASSERT(that != Ref) (except there is no Ref yet, so in a future patch)?

::: js/src/wasm/WasmJS.cpp
@@ +241,4 @@
>              break;
>  
>            case DefinitionKind::Global:
> +            Val val(uint32_t(0));

Could Val have a default ctor producing an Invalid type and avoiding these scattered (uint32_t(0)) intializers?

::: js/src/wasm/WasmOpIter.h
@@ +62,4 @@
>  static inline StackType
>  ToStackType(ValType type)
>  {
> +    return StackType(type.bitsUnsafe());

I guess StackType would eventually be changed to the same class pattern as ValType and then StackType would have a MOZ_IMPLICIT ValType ctor?
Attachment #8980347 - Flags: feedback?(luke) → feedback+
(In reply to Benjamin Bouvier [:bbouvier](PTO until June 11th) from comment #4)
> Comment on attachment 8980347 [details] [diff] [review]
> bug1459900-adapt-valtype.patch
> 
> Review of attachment 8980347 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/wasm/WasmBinaryConstants.h
> @@ +160,5 @@
> >  
> > +  private:
> > +    ValType(uint8_t code, uint32_t refTypeIndex)
> > +      : code_(code),
> > +        refTypeIndex_(refTypeIndex)
> 
> MOZ_ASSERT(code_ == code); // although that one is supplanted by validate()
> MOZ_ASSERT(refTypeIndex != InvalidIndex)
> MOZ_ASSERT(refTypeIndex_ == refTypeIndex);

I don't understand the first and last of these.  Are you trying to assert that we don't lose information?  If so the first is a no-op because both types are uint8_t.

I've added some additional asserts in the spirit of validating better, and made the code parameter uint32_t to aid that.
(In reply to Luke Wagner [:luke] from comment #5)
> Comment on attachment 8980347 [details] [diff] [review]
> bug1459900-adapt-valtype.patch

> @@ +151,5 @@
> > +    bool operator !=(const ValType& that) const {
> > +        return !(*this == that);
> > +    }
> > +    bool operator ==(ValType::Code that) const {
> > +        return code_ == uint32_t(that) && refTypeIndex_ == 0;
> 
> The condition on refTypeIndex_ confuses me and has questionable behavior for
> a Ref type that happens to have type-index 0.  Perhaps instead
> MOZ_ASSERT(that != Ref) (except there is no Ref yet, so in a future patch)?

What we really want to say here is that you do not get to compare to ValType::Ref since a Ref always carries a payload.  As you observe, there's an implicit (future) assert that `that` is not ValType::Ref (which is not defined yet) and then we can check that refTypeIndex_ == NoIndex (what was previously known as InvalidIndex).

> ::: js/src/wasm/WasmOpIter.h
> @@ +62,4 @@
> >  static inline StackType
> >  ToStackType(ValType type)
> >  {
> > +    return StackType(type.bitsUnsafe());
> 
> I guess StackType would eventually be changed to the same class pattern as
> ValType and then StackType would have a MOZ_IMPLICIT ValType ctor?

As for ToExprType(), this change just makes explicit the relationship that already exists between ValType and StackType - they share representation, mostly.  So it's natural to imagine that StackType follows ValType here, especially once we've gotten rid of ExprType.  But whether it will be exactly what you suggest I don't know yet.

(Other issues addressed.  The bits with the default constructors were just missing edits.)
All issues addressed from previous round, I think, and the only major change relative to that patch was to move ValType to WasmTypes.h.

Right now a couple of things in ValType look weird because there is no Ref type; I'll add that next.  That will also cause the removal of the uses of bitsUnsafe() in writeValType in WasmValidate.h; bitsUnsafe() will then return the actual bits of the whole type, not just the bits of the code_ field as now.  The impact of that change for ExprType and StackType is slightly hazy to me right now.
Attachment #8980347 - Attachment is obsolete: true
Attachment #8981158 - Flags: review?(luke)
Comment on attachment 8981158 [details] [diff] [review]
bug1459900-adapt-valtype-v2.patch

Review of attachment 8981158 [details] [diff] [review]:
-----------------------------------------------------------------

Beautiful, thanks!

::: js/src/wasm/WasmJS.cpp
@@ +241,4 @@
>              break;
>  
>            case DefinitionKind::Global:
> +            Val val(uint32_t(0));

nit: one more

::: js/src/wasm/WasmTypes.h
@@ +177,5 @@
> +    static const uint32_t NoIndex = 0xFFFFFF;
> +
> +  public:
> +    enum Code {
> +        I32                              = uint8_t(TypeCode::I32),

nit: I think this column alignment was supposed to match that of other enums in WasmBinaryConstants.h.  Now it's in WasmTypes.h, could you align it to be two spaces past the end of "AnyRef"?
Attachment #8981158 - Flags: review?(luke) → review+
Memo to self: readRefNull() in this patch needs to change very slightly because fromTypeCode() can assert, but this is wrong for validation.  What we need are simple static methods on ValType that perform fallible validation in this case (validateCode, validateBitsUnsafe - though "Unsafe" is an unfortunate tag in this connection, so maybe just validateBits or validateRawBits).
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fa0f453c5be9
Adapt ValType to accomodate reference types.  r=luke
Component: JavaScript Engine: JIT → Javascript: Web Assembly
The minimum support for struct types (with anyref only), see commit msg for much information.
Attachment #8976114 - Attachment is obsolete: true
Attachment #8982199 - Flags: review?(luke)
Comment on attachment 8982199 [details] [diff] [review]
bug1459900-understand-struct-types-v2.patch

Review of attachment 8982199 [details] [diff] [review]:
-----------------------------------------------------------------

Withdrawing this, I have some minor bug fixes I would like to include.
Attachment #8982199 - Flags: review?(luke)
Parse and understand struct types.  (Additional features like name resolution come in a subsequent patch, for full Ref types.)
Attachment #8982199 - Attachment is obsolete: true
Attachment #8983721 - Flags: review?(luke)
Posted patch bug1459900-ref-types.patch (obsolete) — Splinter Review
Implement (ref T) with a simple type calculus.  This is large, but if you start with WasmTypes.h and WasmAST.h the rest is probably unsurprising.
Attachment #8984136 - Flags: review?(luke)
Implementation of the newstruct operation, rebased / adapted to the new reality with ref types.  Requires the TypedObject changes on the other bug.

This is pretty clean actually but there are two things holding it back:

- no int64 support in the TypedObject system
- no type constraints on pointer stores into TO instances from the JS side

The first one can be finessed (for now) with two int32 fields; wasm doesn't care, this is only for JS.

The second one needs to have a solution of some kind.  We can land without a fix just to keep our momentum, but it'll be brittle - it means JS can subvert wasm type discipline.  One solution is that pointer fields become r/o to JS, for now.

Anyway pointer fields are r/o to wasm because we don't have a set_field operation, and even if we did it will not do anything to pointer fields until we have write barriers in place.
Attachment #8976115 - Attachment is obsolete: true
Depends on: 1467632
Now with standard naming.
Attachment #8984187 - Attachment is obsolete: true
The struct.get and struct.set operations.  These can't land because they expose unchecked operations on pointers whose static type we think know, but that type can be subverted on the call-in interface from JS to wasm because the stubs have not been updated to perform type checks.  Also see earlier comment about struct.new and how JS can be used to store random pointers in TO pointer fields.
WIP, automatic upcast to wider struct type.
Posted patch bug1459900-downcast.patch (obsolete) — Splinter Review
WIP, downcast to narrower struct type.  This is hacky and slow and not what we really want but it works for now.
Comment on attachment 8983721 [details] [diff] [review]
bug1459900-understand-struct-types-v3.patch

Review of attachment 8983721 [details] [diff] [review]:
-----------------------------------------------------------------

Nice, makes sense!

::: js/src/wasm/WasmGenerator.cpp
@@ +243,4 @@
>      }
>  
>      if (!isAsmJS()) {
> +        for (TypeDef& ty : env_->types) {

nit: how about 'td' or 'typeDef' for the canonical name of a TypeDef that we use here and everywhere else, instead of 'ty'?

::: js/src/wasm/WasmOpIter.h
@@ +1745,5 @@
>      if (!popWithType(ValType::I32, callee))
>          return false;
>  
> +    if (!env_.types[*sigIndex].isSig())
> +        return false;

I think this needs 'return fail()' to give a validation error.

@@ +1796,4 @@
>          return fail("signature index out of range");
>  
> +    if (!env_.types[*sigIndex].isSig())
> +        return false;

ditto

::: js/src/wasm/WasmTypes.h
@@ +756,5 @@
> +// The Module owns a dense array of Struct values that represent the structure
> +// types that the module knows about.  It is created from the sparse array of
> +// types in the ModuleEnvironment when the Module is created.
> +
> +class Struct

nit: I suppose we'll later need an Array which makes me wonder if there will be conflicts with such fundamental names claimed.  What about (Struct|Array|...)Type (which also avoids "strukt" if args/methods are named "structType()"))?

For that matter (and probably in a separate rs=me patch if you agree), Sig could be renamed to FuncType, allowing it to more directly correspond to the spec (https://webassembly.github.io/spec/core/syntax/types.html#function-types).

@@ +1136,4 @@
>  typedef Vector<SigWithId, 0, SystemAllocPolicy> SigWithIdVector;
>  typedef Vector<const SigWithId*, 0, SystemAllocPolicy> SigWithIdPtrVector;
>  
> +// A tagged container for the various types that can be present in a Wasm

nit: in comments we try to use "wasm"

@@ +1200,5 @@
> +
> +    // p has to point to the sig_ embedded within a TypeDef for this to be
> +    // valid.
> +    static const TypeDef* fromSigWithIdPtr(const SigWithId* p) {
> +        const TypeDef* q = (const TypeDef*)((char*)p - offsetof(TypeDef, sig_));

Bug 1447591 should remove ModuleEnvironment::funcIndexToSigIndex() which iiuc removes the only use of this method.

@@ +1217,5 @@
> +    }
> +
> +    // p has to point to the struct_ embedded within a TypeDef for this to be
> +    // valid.
> +    static const TypeDef* fromStructPtr(const Struct* p) {

Is this method dead?

::: js/src/wasm/WasmValidate.cpp
@@ +1089,2 @@
>  
> +        if (form == uint8_t(TypeCode::Func)) {

nit: since more are coming and for symmetry with other similar decoding functions in WasmValidate.cpp, can this be a switch where the cases call factored-out DecodeXType() functions?
Attachment #8983721 - Flags: review?(luke) → review+
Comment on attachment 8984136 [details] [diff] [review]
bug1459900-ref-types.patch

Review of attachment 8984136 [details] [diff] [review]:
-----------------------------------------------------------------

(Not yet into the meat of the patch; one cursory comment since I'm off for the day:)

::: js/src/wasm/WasmTypes.h
@@ +294,5 @@
> +// ValType.)
> +
> +class ValType;
> +
> +class ExprType : public TypeCodeType<ExprType, true, true>

In bug 1401675, my plan was to change ExprType from being "a ValType with an extra case" to, as the comment suggests, "a list of ValType" (not represented of course as a Vector<ValType> but rather some packed word that is either "Empty", a ValType, or an index into the type section.

Separately, for StackType, we're going to have to deal with some annoying unification problems with "let"/"pick" which make "Any" insufficient.  Thus, StackType will need to be some logical Variant<ValType, PolymorphicType> where PolymorphicType sometimes holds an index (we can still probably pack it all into one word).

All this is to say that, rather than trying to reuse code between ValType/ExprType/StackType, I think it makes sense for us to, for the moment, duplicate code, and then let these three types go their own ways in the future.  Plus, it doesn't look like a ton of code.
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/82874b00bab0
Struct types: read, write, validate.  r=luke
(In reply to Luke Wagner [:luke] from comment #22)
> Comment on attachment 8983721 [details] [diff] [review]
> bug1459900-understand-struct-types-v3.patch
> 
> Review of attachment 8983721 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> 
> For that matter (and probably in a separate rs=me patch if you agree), Sig
> could be renamed to FuncType, allowing it to more directly correspond to the
> spec
> (https://webassembly.github.io/spec/core/syntax/types.html#function-types).

Will do.

> @@ +1200,5 @@
> > +
> > +    // p has to point to the sig_ embedded within a TypeDef for this to be
> > +    // valid.
> > +    static const TypeDef* fromSigWithIdPtr(const SigWithId* p) {
> > +        const TypeDef* q = (const TypeDef*)((char*)p - offsetof(TypeDef, sig_));
> 
> Bug 1447591 should remove ModuleEnvironment::funcIndexToSigIndex() which
> iiuc removes the only use of this method.

OK, we can remove later.

> @@ +1217,5 @@
> > +    }
> > +
> > +    // p has to point to the struct_ embedded within a TypeDef for this to be
> > +    // valid.
> > +    static const TypeDef* fromStructPtr(const Struct* p) {
> 
> Is this method dead?

Only mostly dead - I need this for a subsequent patch, though it may go away if we don't need it for binary->text.
Flags: needinfo?(lhansen)
(In reply to Lars T Hansen [:lth] from comment #25)

> > For that matter (and probably in a separate rs=me patch if you agree), Sig
> > could be renamed to FuncType, allowing it to more directly correspond to the
> > spec
> > (https://webassembly.github.io/spec/core/syntax/types.html#function-types).
> 
> Will do.

Actually, let's agree on scope:

  Sig -> FuncType
  SigWithId -> FuncTypeWithId
  SigWithIdVector -> FuncTypeWithIdVector
  SigWithIdPtrVector -> FuncTypeWithIdPtrVector
  'sig' locals -> 'funcType' or 'ft' ditto, I probably prefer the latter but we should pick one
  'sig_' members -> 'funcType_'
  'sig()' members -> 'funcType()'

  SigIdDesc - unchanged because this is really about signature?

  newSig() etc in AsmJS.cpp - unchanged because a lot of churn + asm.js?
Flags: needinfo?(luke)
(In reply to Lars T Hansen [:lth] from comment #27)
Great question!  Agreed on all suggestions with comments:

>   'sig' locals -> 'funcType' or 'ft' ditto, I probably prefer the latter but
> we should pick one

Agreed on 'ft' for locals and some permutation of "funcType" everywhere else.

>   SigIdDesc - unchanged because this is really about signature?

For me this raised the question of whether "id" is good name.  From a pure wasm spec pov, what we're doing is implementing structural type equality of func types by unifying/hash-consing Sigs at runtime with an additional inline-word optimization.  I tried to imagine not-terribly-long variations of naming with "Unified" or "Hashed", but the added complication is that SigIdDesc/SigWithId aren't the unified/hashed Sigs *themselves*, but rather only a description of how to materialize a unified/hashed Sig at runtime.  So unless "HashedFuncTypeDesc"/"UnifiedFuncTypeDesc" are attractive to you (they seem a bit long to me), FuncTypeIdDesc seems best to me.

>   newSig() etc in AsmJS.cpp - unchanged because a lot of churn + asm.js?

Agreed; mentally, I'm mostly in "keep it working and not too awful" mode for this asm.js.
Flags: needinfo?(luke)
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ca90f5e47f46
Struct types: read, write, validate.  r=luke
Comment on attachment 8984136 [details] [diff] [review]
bug1459900-ref-types.patch

Review of attachment 8984136 [details] [diff] [review]:
-----------------------------------------------------------------

Generally looks great and agreed with the approach!  Two other suggested changes, happy to discuss further:

::: js/src/wasm/WasmValidate.cpp
@@ +402,5 @@
> +        if (gcTypesEnabled == HasGcTypes::False)
> +            break;
> +        if (uncheckedRefTypeIndex >= types->length())
> +            return d.fail("ref index out of range");
> +        TypeDef& ty = (*types)[uncheckedRefTypeIndex];

transitive nit: 'td' in this patch

@@ +404,5 @@
> +        if (uncheckedRefTypeIndex >= types->length())
> +            return d.fail("ref index out of range");
> +        TypeDef& ty = (*types)[uncheckedRefTypeIndex];
> +        if (ty.isNone()) {
> +            ty.setForwardStruct();

IIUC, the "forward struct" case won't ever survive past decoding the types section.  I wonder if we can remove this case from the general TypeDef struct and instead encapsulate this whole transient state in the DecodeTypeSection() algorithm.  That way, after you have a ModuleEnvironment (anywhere else), you can simply know you have a non-None/ForwardStruct ref.  This probably means a bit of logic duplication for the DecodeTypeSection() but not much and anyway maybe the ValType-decoding one does in the type section is different than for locals/func-signatures.  With the RefTypePolicy removed (as suggested in the other comment), then I think we can avoid having two DecodeValType()s.

::: js/src/wasm/WasmValidate.h
@@ +736,5 @@
> +
> +enum class RefTypePolicy {
> +    KnownGood,                   // Ref type indices have been checked before
> +    Check                        // For internal use
> +};

Since Decoder::fail() asserts error_ and since the only use of KnownGood which benefits from its assertiveness uses a Decoder with null error_, maybe the enum isn't necessary to have and decoding can always check?
Attachment #8984136 - Flags: review?(luke)
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ca13bbbbb8c5
Rename Sig as FuncType, transitively.  rs=luke
(In reply to Luke Wagner [:luke] from comment #23)
> Comment on attachment 8984136 [details] [diff] [review]
> bug1459900-ref-types.patch
> 
> Review of attachment 8984136 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> (Not yet into the meat of the patch; one cursory comment since I'm off for
> the day:)
> 
> ::: js/src/wasm/WasmTypes.h
> @@ +294,5 @@
> > +// ValType.)
> > +
> > +class ValType;
> > +
> > +class ExprType : public TypeCodeType<ExprType, true, true>
> 
> In bug 1401675, my plan was to change ExprType from being "a ValType with an
> extra case" to, as the comment suggests, "a list of ValType" (not
> represented of course as a Vector<ValType> but rather some packed word that
> is either "Empty", a ValType, or an index into the type section.
> 
> Separately, for StackType, we're going to have to deal with some annoying
> unification problems with "let"/"pick" which make "Any" insufficient.  Thus,
> StackType will need to be some logical Variant<ValType, PolymorphicType>
> where PolymorphicType sometimes holds an index (we can still probably pack
> it all into one word).
> 
> All this is to say that, rather than trying to reuse code between
> ValType/ExprType/StackType, I think it makes sense for us to, for the
> moment, duplicate code, and then let these three types go their own ways in
> the future.  Plus, it doesn't look like a ton of code.

It's not really about reuse, it's more about ensuring that these types use entirely compatible encodings so that bit representations are compatible.  But as you say, it's not a great hardship to expand TypeCodeType into its users if ExprType and StackType will diverge from ValType anyway.
Ah, I see, because now we can't just simply rely on the TypeCode enum for that.  How about duplicating most of the logic (since it will be diverging), but factoring out the packing/unpacking logic via:
  enum class PackedTypeCode : uint32_t {};        // i.e., a strong typedef
  PackedValType PackTypeCode(TypeCode, uint32_t); // asserts a ref type
  PackedValType PackTypeCode(TypeCode);           // asserts not a ref type
  TypeCode UnpackTypeCode(PackedValType);
  uint32_t UnpackTypeCodeIndex(PackedValType);
and then ExprType/StackType are built up from this by composition (i.e., containing a PackedTypeCode and using the non-ValType TypeCodes (BlockVoid, Any) to express their special cases.
(In reply to Luke Wagner [:luke] from comment #32)
> Comment on attachment 8984136 [details] [diff] [review]
> bug1459900-ref-types.patch
> 
> Review of attachment 8984136 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> @@ +404,5 @@
> > +        if (uncheckedRefTypeIndex >= types->length())
> > +            return d.fail("ref index out of range");
> > +        TypeDef& ty = (*types)[uncheckedRefTypeIndex];
> > +        if (ty.isNone()) {
> > +            ty.setForwardStruct();
> 
> IIUC, the "forward struct" case won't ever survive past decoding the types
> section.  I wonder if we can remove this case from the general TypeDef
> struct and instead encapsulate this whole transient state in the
> DecodeTypeSection() algorithm.

Yeah, I like that.  Will give it a try.
 
> ::: js/src/wasm/WasmValidate.h
> @@ +736,5 @@
> > +
> > +enum class RefTypePolicy {
> > +    KnownGood,                   // Ref type indices have been checked before
> > +    Check                        // For internal use
> > +};
> 
> Since Decoder::fail() asserts error_ and since the only use of KnownGood
> which benefits from its assertiveness uses a Decoder with null error_, maybe
> the enum isn't necessary to have and decoding can always check?

I don't see that.  The purpose of this hacky flag is to avoid looking at the types array at all, since we can't set it up properly when calling into the decoding code from the debugger - we don't know its proper length or its contents, we just have to take it on faith that a reftype points to a valid entry.

Maybe the problem here is that debugGetLocalTypes reuses the DecodeLocalEntries machinery, which really is not suited for this.  Maybe it is DecodeLocalEntries that needs a proper specialization for use from the debugger and the hacky flag is a symptom of that.
I think this addresses all concerns, as follows:

- Following your suggestion there is now a PackedTypeCode that is shared as
  the representation for ValType, ExprType, and StackType, by composition.
  I would have loved to make this a struct with bitfields but that's non-POD,
  and I would like to use this type as a POD to address Waldo's concerns
  on bug 1467632.  But it's a very local fix to make it a struct instead;
  I tested, and there are no casts of values of this type left in the code.

- The forward-structure handling is indeed now isolated in a table created in
  DecodeTypeSection and passed around as needed.  This is called the 'typeType'
  table.  In connection with this I factored the checking of the typeType
  table entries out of DecodeValType into its own ValidateRefType predicate, 
  this makes it simpler to use DecodeValType overall.

- I specialized DecodeLocalEntries so that the variant used from the debugger
  does not need to pass a lot of unnecessary data; this simplifies the
  implementation to the point where it does not need to / should not call
  DecodeValType, but instead it just reads a ValType and asserts that it looks
  OK.  And it gets rid of the decoding-policy enum too.

Finally, there are two "standard" methods in StackType that are unused in this patch but I'll need them later, so here they are.
Attachment #8984136 - Attachment is obsolete: true
Comment on attachment 8987875 [details] [diff] [review]
bug1459900-ref-types-v2.patch

... aaaand we remember to ask for review; see previous comment for a rundown of changes.
Attachment #8987875 - Flags: review?(luke)
Track the mutability of fields in a structure.
Attachment #8988153 - Flags: review?(luke)
Automatic upcast to prefixes so long as fields have the same type and mutability.
Attachment #8986399 - Attachment is obsolete: true
Attachment #8988155 - Flags: review?(luke)
Attachment #8984419 - Attachment is obsolete: true
Re the last three patches:

struct.new, struct.get, and struct.set are reasonably clean but can create unsound behavior in that (a) we do not check, on entry to exported wasm functions, whether the values we receive are of the correct type, and (b) we expose TypedObject instances to JS but these do not prevent assignments to immutable fields and do not implement proper type checks on assignments to pointer fields.

struct.narrow is strictly WIP, it performs a very slow structural downcast and is not what we want at all, just something to hack around with.
As for unsound behavior, we also have not dealt with wrappers.
Comment on attachment 8987875 [details] [diff] [review]
bug1459900-ref-types-v2.patch

Review of attachment 8987875 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for all the fixes; great patch!

::: js/src/wasm/WasmBaselineCompile.cpp
@@ +8897,4 @@
>  bool
>  BaseCompiler::emitRefNull()
>  {
> +    ValType ty;

nit for the patch: s/ty/vt/

::: js/src/wasm/WasmOpIter.h
@@ +1015,5 @@
>  
> +    if (uncheckedCode == uint8_t(ExprType::Void))
> +        *type = ExprType::Void;
> +    else
> +        *type = ExprType(ExprType::Code(uncheckedCode), uncheckedRefTypeIndex);

Why is this `if` necessary; would the `else` case not handle the Void case as well?

::: js/src/wasm/WasmTypes.h
@@ +175,5 @@
> +const uint32_t NoTypeCode     = 0xFF;      // Only use these
> +const uint32_t NoRefTypeIndex = 0xFFFFFF;  //   with PackedTypeCode
> +
> +static inline PackedTypeCode
> +PackTypeCode() {

nit: here and below, '{' on new line

Also, while I appreciate the symmetry seeing these overloads next to each other, perhaps InvalidPackedTypeCode() so that, when a call is read in isolation, there's not a question of what is being packed.

@@ +197,5 @@
> +
> +static inline PackedTypeCode
> +PackTypeCodeBits(uint32_t bits) {
> +    return PackTypeCode(TypeCode(bits & 255), bits >> 8);
> +}

nit: How about renaming this and its sibling to PackedTypeCodeToBits()/PackedTypeCodeFromBits() since these are more of coercions than packing/unpacking operations?

@@ +230,5 @@
> +// ValType.)
> +
> +class ValType;
> +
> +class ExprType

nit: because it's more fundamental, could ValType go before ExprType (moving whatever inline functions out-of-line that depend on ValType; and moving the ExprType(ValType&) ctor back inline).

@@ +325,5 @@
> +    bool isRef() const {
> +        return UnpackTypeCodeType(tc_) == TypeCode::Ref;
> +    }
> +
> +    bool isRefOrAnyRef() const {

Since I think we're going to get other reference types in the future (optref, i31ref, rttiref, eqref), I think we need a name here (and in ValType/StackType) that doesn't enumerate the cases.  isAnyRefSubtype() would be super-clear about its meaning, so I like it, but maybe it's a bit long so maybe isRefType()?

::: js/src/wasm/WasmValidate.cpp
@@ +1108,4 @@
>      return true;
>  }
>  
> +enum class TypeType

nit: while this name delights me, it seems more like a "state" (which changes over time according to a state transition diagram that it would be good to briefly comment here) than "type" (invariant).  TypeState?

@@ +1249,5 @@
> +    if (!typeTypes.resize(numTypes))
> +        return false;
> +
> +    for (uint32_t typeIndex = 0; typeIndex < numTypes; typeIndex++)
> +        typeTypes[typeIndex];

I think this line was meant to initialize?  Instead, I think you could fuse the resize+init with an appendN(numTypes, TypeState::None).

::: js/src/wasm/WasmValidate.h
@@ +736,2 @@
>  MOZ_MUST_USE bool
> +DecodeLocalEntries(Decoder& d, ValTypeVector* locals);

nit: could you rename to DecodeValidatedLocalEntries()?
Attachment #8987875 - Flags: review?(luke) → review+
Attachment #8988153 - Flags: review?(luke) → review+
Comment on attachment 8988155 [details] [diff] [review]
bug1459900-automatic-upcast-v2.patch

Review of attachment 8988155 [details] [diff] [review]:
-----------------------------------------------------------------

Nice.
Attachment #8988155 - Flags: review?(luke) → review+
(In reply to Luke Wagner [:luke] from comment #47)
> Comment on attachment 8987875 [details] [diff] [review]
> bug1459900-ref-types-v2.patch
> 
> Review of attachment 8987875 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> @@ +230,5 @@
> > +// ValType.)
> > +
> > +class ValType;
> > +
> > +class ExprType
> 
> nit: because it's more fundamental, could ValType go before ExprType (moving
> whatever inline functions out-of-line that depend on ValType; and moving the
> ExprType(ValType&) ctor back inline).

I think I'll leave this alone for now, as ExprType is going to change anyway.  The types both reference the other, and whether one is more fundamental than the other ... at the bottom they're both TypeCode :)

> @@ +325,5 @@
> > +    bool isRef() const {
> > +        return UnpackTypeCodeType(tc_) == TypeCode::Ref;
> > +    }
> > +
> > +    bool isRefOrAnyRef() const {
> 
> Since I think we're going to get other reference types in the future
> (optref, i31ref, rttiref, eqref), I think we need a name here (and in
> ValType/StackType) that doesn't enumerate the cases.  isAnyRefSubtype()
> would be super-clear about its meaning, so I like it, but maybe it's a bit
> long so maybe isRefType()?

If we're going to change it at all I think isAnyRefSubtype() is the better choice.  isRefType() is confusing relative to isRef(), and isRef() is needed because the == operator can't be used with ValType::Ref.

But I will actually leave this alone.  I don't think we know what "refs" will really look like, though for sure they will evolve.  But then we should rename this when we've decided that.  This predicate currently asks, is this a pointer type.  We've discussed making "anyref" more an "any" type, which would break that.  For now I value the clarity of enumerating the cases.
Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9e3fb74a11ee
Ref types with a simple type calculus.  r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/3ed7ed0f031b
Track structure field mutability. r=luke
https://hg.mozilla.org/integration/mozilla-inbound/rev/732ee5601056
Automatic upcasts through prefix supertyping. r=luke
Posted file reftypes.bundle (obsolete) —
Backup of rebased / fixed versions of last three patches.
No longer blocks: 1472634
Depends on: 1472634
No longer blocks: 1472633
Depends on: 1472633
Posted file wasm-structs.bundle (obsolete) —
Remaining patches, about ready for review.
Attachment #8988711 - Attachment is obsolete: true
Summary: Wasm structure types internal MVP → Wasm structure types - initial prototype implementation
Posted file wasm-structs.bundle (obsolete) —
Current patch queue for this and related bugs.
Attachment #9004886 - Attachment is obsolete: true
Depends on: 1489994
Attachment #8988159 - Attachment is obsolete: true
Attachment #9007286 - Attachment is obsolete: true
Attachment #9008009 - Flags: review?(luke)
Attachment #8988161 - Attachment is obsolete: true
Attachment #9008011 - Flags: review?(luke)
BTW there are some comments in these patches that flag "optimization opportunities".  Not all of those are of long-lived importance; I'll file follow-up bugs for those that are.
Posted file wasm-structs.bundle
/me embraces more change.  (Patch queue rebased and updated to the new brace style, best-effort.  Otherwise no changes.)
Comment on attachment 9008009 [details] [diff] [review]
bug1459900-implement-struct.new-v4.patch

Review of attachment 9008009 [details] [diff] [review]:
-----------------------------------------------------------------

Great work, nice and clean!

(In case you haven't already, patch needs to embrace the change.)

::: js/src/jit-test/tests/wasm/binary.js
@@ +278,5 @@
> +// Illegal Misc opcodes
> +
> +var reservedMisc =
> +    { // Saturating conversions (standardized)
> +      0x00: true, 0x01: true, 0x02: true, 0x03: true, 0x04: true, 0x05: true, 0x06: true, 0x07: true,

TIL that an object literal field name "0x0f" is not treated as the string "0x0f" but as the string "15".

::: js/src/wasm/WasmBaselineCompile.cpp
@@ +9512,5 @@
>  #endif
>  
> +#ifdef ENABLE_WASM_GC
> +bool
> +BaseCompiler::emitStructNew()

In some cases struct.new is #ifdef ENABLE_WASM_GC, in some cases (e.g., in WasmInstance.h) it's not.  FWIW, I think it'd be fine to only use the #ifdef for controlling the basic enablement and having everything else unconditionally built.

::: js/src/wasm/WasmModule.cpp
@@ +982,5 @@
> +
> +
> +bool
> +Module::makeStructTypeDescrs(JSContext* cx,
> +                             MutableHandle<StructTypeDescrVector> structTypeDescrs) const

Just to check my assumption: long-term, the impl strategy we could use here instead of having each Module store all its type descriptors and then each instantiation copying all those into typed object descriptors could (eventually, if there's not a better idea) be:
 - process-global hash-consed type graph that modules lookup-or-create into during async parallel compilation
 - Modules' struct table stores a RefPtr for each element, pointing into the type graph
 - Module::instantiate() does need to make new typed objects (ctor+shape+prototype), but each of these are small, constant-sized wrappers that point into the hash-consed type for its substance

@@ +994,5 @@
> +    RootedNativeObject toModule(cx, &typedObjectModule->as<NativeObject>());
> +    RootedObject prototype(cx, &toModule->getReservedSlot(
> +                                   TypedObjectModuleObject::StructTypePrototype).toObject());
> +
> +    for ( const StructType& structType : structTypes_ ) {

nit: no spaces after ( or before ); here and below

@@ +1048,5 @@
> +        // for every TypedObject.  If they contain fields of type Ref T then we
> +        // prevent JS from constructing instances of them.
> +
> +        Rooted<StructTypeDescr*>
> +            structTypeDescr(cx, StructMetaTypeDescr::createFromArrays(cx,

missing OOM check

@@ +1194,5 @@
>          return false;
>  
> +    // Create type descriptors for any struct types that the module has.
> +
> +    Rooted<StructTypeDescrVector> structTypeDescrs(cx);

TIL that a single GCVector<HeapPtr<T*>> can be both stack-rooted and heap-traced (at different points in its lifetime).  Nice!

::: js/src/wasm/WasmOpIter.h
@@ +2001,5 @@
> +
> +    if (!argValues->resize(str.fields_.length()))
> +        return false;
> +
> +    for (int32_t i = str.fields_.length() - 1; i >= 0; i--) {

nit: can you static_assert(MaxStructFields < INT32_MAX)?

::: js/src/wasm/WasmTypes.h
@@ +828,5 @@
>  {
>    public:
> +    StructFieldVector fields_;   // Field type, offset, and mutability
> +    uint32_t      moduleIndex_;  // Index in a dense array of structs in the module
> +    bool          isInline_;     // True if this is an InlineTypedObject and we

nit: maybe no column alignment if 1 of 3 fields is unaligned

@@ +832,5 @@
> +    bool          isInline_;     // True if this is an InlineTypedObject and we
> +                                 //   interpret the offsets from the object pointer;
> +                                 //   if false this is an OutlineTypedObject and we
> +                                 //   interpret everything relative to the pointer to
> +                                 //   the attached storage.

For my own edification: is this imposed by the current design of typed objects or is just a simplification?  (The alternative being to stick the first N bytes worth of fields inline and only the overflow goes into outline slots.)

::: js/src/wasm/WasmValidate.h
@@ +165,4 @@
>      MemoryUsage               memoryUsage;
>      uint32_t                  minMemoryLength;
>      Maybe<uint32_t>           maxMemoryLength;
> +    uint32_t                  structTypes;

nit: how about 'numStructTypes'
Attachment #9008009 - Flags: review?(luke) → review+
Comment on attachment 9008010 [details] [diff] [review]
bug1459900-implement-struct.get_and_set-v3.patch

Review of attachment 9008010 [details] [diff] [review]:
-----------------------------------------------------------------

Beautiful!  (Ditto on bracing for impact.)
Attachment #9008010 - Flags: review?(luke) → review+
Comment on attachment 9008011 [details] [diff] [review]
bug1459900-implement-struct.narrow-v3.patch

Review of attachment 9008011 [details] [diff] [review]:
-----------------------------------------------------------------

This is surely going to change radically later, but if you think it makes sense for short-term experimentation purposes, seems fine.  (Sorry for taking so long to review this patchse.)

::: js/src/wasm/WasmInstance.cpp
@@ +733,5 @@
> +    // the prefix check.
> +
> +    uint32_t found = UINT32_MAX;
> +    for (uint32_t i = 0; i < instance->structTypes_.length(); i++) {
> +        if (instance->structTypeDescrs_[i] == typeDescr) {

I'm just going to pretend I never saw this.

::: js/src/wasm/WasmModule.cpp
@@ +1203,5 @@
> +    StructTypeVector structTypes;
> +    if (!structTypes.resize(structTypes_.length()))
> +        return false;
> +    for (uint32_t i = 0; i < structTypes_.length(); i++) {
> +        if (!structTypes[i].copyFrom(structTypes_[i]))

Instead of all this copying, could you move StructTypeVector to wasm::Code, where it will be naturally shared by Module and all its Instances?

::: js/src/wasm/WasmValidate.h
@@ +692,4 @@
>          }
>          return true;
>      }
> +    MOZ_MUST_USE bool readReferenceType(ValType* ty, uint32_t numTypes, const char* const context) {

I think this method might be better placed in OpIter where it wouldn't need to take numTypes and it could also validate that the refTypeIndex pointed to a validly-referencable type (not just that it was in-range).  This could remove the isStruct() checks in readStructNarrow() and also provide some tighter validation for ref.null.

@@ +699,5 @@
> +        if (!readValType(&code, &refTypeIndex))
> +            return failf("invalid reference type for %s", context);
> +
> +        if (code == uint8_t(TypeCode::Ref)) {
> +            if (refTypeIndex > MaxTypes || refTypeIndex >= numTypes)

Isn't numTypes < MaxTypes?
Attachment #9008011 - Flags: review?(luke) → review+
(In reply to Luke Wagner [:luke] from comment #62)
> Comment on attachment 9008009 [details] [diff] [review]
> bug1459900-implement-struct.new-v4.patch
> 
> Review of attachment 9008009 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Great work, nice and clean!
> 
> (In case you haven't already, patch needs to embrace the change.)
> 
> ::: js/src/jit-test/tests/wasm/binary.js
> @@ +278,5 @@
> > +// Illegal Misc opcodes
> > +
> > +var reservedMisc =
> > +    { // Saturating conversions (standardized)
> > +      0x00: true, 0x01: true, 0x02: true, 0x03: true, 0x04: true, 0x05: true, 0x06: true, 0x07: true,
> 
> TIL that an object literal field name "0x0f" is not treated as the string
> "0x0f" but as the string "15".

You suddenly made me doubt myself, but I thoroughly internalized 3rd Edition way back when and this still works :)

> ::: js/src/wasm/WasmBaselineCompile.cpp
> @@ +9512,5 @@
> >  #endif
> >  
> > +#ifdef ENABLE_WASM_GC
> > +bool
> > +BaseCompiler::emitStructNew()
> 
> In some cases struct.new is #ifdef ENABLE_WASM_GC, in some cases (e.g., in
> WasmInstance.h) it's not.  FWIW, I think it'd be fine to only use the #ifdef
> for controlling the basic enablement and having everything else
> unconditionally built.

Yeah, this is a minor mess frankly, and this is not the only feature suffering from the mess.  The ifdefs are useful sign posts for things that might change / might be removed, but they are indeed incoherently applied.

I think I will file a followup bug to fix this asap, it should not take a tremendous effort.

> ::: js/src/wasm/WasmModule.cpp
> @@ +982,5 @@
> > +
> > +
> > +bool
> > +Module::makeStructTypeDescrs(JSContext* cx,
> > +                             MutableHandle<StructTypeDescrVector> structTypeDescrs) const
> 
> Just to check my assumption: long-term, the impl strategy we could use here
> instead of having each Module store all its type descriptors and then each
> instantiation copying all those into typed object descriptors could
> (eventually, if there's not a better idea) be:
>  - process-global hash-consed type graph that modules lookup-or-create into
> during async parallel compilation
>  - Modules' struct table stores a RefPtr for each element, pointing into the
> type graph
>  - Module::instantiate() does need to make new typed objects
> (ctor+shape+prototype), but each of these are small, constant-sized wrappers
> that point into the hash-consed type for its substance

So this is an interesting question.  It depends a little bit on how we test type compatibility and so on.  If a module defines a type that is private and then the module is instantiated twice, then the objects produced by one instance are not of the same nominal type as the objects produced by the other instance, and since the objects can flow from one to the other as anyref we have to be sure that there's no confusion if there's a nominal downcast of a foreign object to the local type.

Generally we want to move in the direction of more sharing and what you outline seems fairly reasonable to me right now, since the last step probably performs the necessary disambiguation of types.

> @@ +832,5 @@
> > +    bool          isInline_;     // True if this is an InlineTypedObject and we
> > +                                 //   interpret the offsets from the object pointer;
> > +                                 //   if false this is an OutlineTypedObject and we
> > +                                 //   interpret everything relative to the pointer to
> > +                                 //   the attached storage.
> 
> For my own edification: is this imposed by the current design of typed
> objects or is just a simplification?  (The alternative being to stick the
> first N bytes worth of fields inline and only the overflow goes into outline
> slots.)

Well, "imposed"... that is the current design of typed objects, so far as I can see.  Whether it's beneficial to have a fat object with an overflow chunk or to have a slim object with the full outline chunk probably depends on what the memory manager likes the best, and then secondarily, what is most convenient for JS.  For Wasm we can easily cope with either design so long as we known at compile time where to look.  Clearly if we store more fields inline we avoid the extra pointer load for the first n fields, which would be nice for us.
(In reply to Luke Wagner [:luke] from comment #64)
> Comment on attachment 9008011 [details] [diff] [review]
> bug1459900-implement-struct.narrow-v3.patch
> 
> Review of attachment 9008011 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This is surely going to change radically later, but if you think it makes
> sense for short-term experimentation purposes, seems fine.  (Sorry for
> taking so long to review this patchse.)

This is only a stopgap solution and this will absolutely change radically.

> ::: js/src/wasm/WasmModule.cpp
> @@ +1203,5 @@
> > +    StructTypeVector structTypes;
> > +    if (!structTypes.resize(structTypes_.length()))
> > +        return false;
> > +    for (uint32_t i = 0; i < structTypes_.length(); i++) {
> > +        if (!structTypes[i].copyFrom(structTypes_[i]))
> 
> Instead of all this copying, could you move StructTypeVector to wasm::Code,
> where it will be naturally shared by Module and all its Instances?

I will investigate.  I have it in my head that this copying is (currently!) necessary to ensure per-instance type identity, but suddenly I'm not sure about that.

This really is gross, and there are better ways to do it.  I'll expedite a design that enables more sharing, so that each instance at least can hold something smaller than the full type.
(In reply to Lars T Hansen [:lth] from comment #66)
> (In reply to Luke Wagner [:luke] from comment #64)
> > Comment on attachment 9008011 [details] [diff] [review]
> > bug1459900-implement-struct.narrow-v3.patch
> 
> > ::: js/src/wasm/WasmModule.cpp
> > @@ +1203,5 @@
> > > +    StructTypeVector structTypes;
> > > +    if (!structTypes.resize(structTypes_.length()))
> > > +        return false;
> > > +    for (uint32_t i = 0; i < structTypes_.length(); i++) {
> > > +        if (!structTypes[i].copyFrom(structTypes_[i]))
> > 
> > Instead of all this copying, could you move StructTypeVector to wasm::Code,
> > where it will be naturally shared by Module and all its Instances?
> 
> I will investigate.  I have it in my head that this copying is (currently!)
> necessary to ensure per-instance type identity, but suddenly I'm not sure
> about that.

I think you're right - that ought to work.
(In reply to Lars T Hansen [:lth] from comment #65)
> > TIL that an object literal field name "0x0f" is not treated as the string
> > "0x0f" but as the string "15".
> 
> You suddenly made me doubt myself, but I thoroughly internalized 3rd Edition
> way back when and this still works :)

Haha, I doubted it until I tried it in the console ;)

> If a module defines a type that is private
> and then the module is instantiated twice, then the objects produced by one
> instance are not of the same nominal type as the objects produced by the
> other instance, and since the objects can flow from one to the other as
> anyref we have to be sure that there's no confusion if there's a nominal
> downcast of a foreign object to the local type.

Agreed.  So the tuple (typed object ctor, ctor.prototype, Object Group*, Shape*) would be unique per instantiation since these are all nominal things and, you're right, unique per instance.  But then it seems like all of these can point into the (process-wide shared) structural type for all the actual type information.
https://hg.mozilla.org/mozilla-central/rev/29614e10d512
https://hg.mozilla.org/mozilla-central/rev/b4bb1edd5539
https://hg.mozilla.org/mozilla-central/rev/56a2c01222f3
Status: ASSIGNED → RESOLVED
Closed: 10 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla64
Depends on: 1493476
You need to log in before you can comment on or make changes to this bug.