Closed Bug 1276028 Opened 8 years ago Closed 8 years ago

Baldr: add WebAssembly.(Module, Instance)

Categories

(Core :: JavaScript Engine, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla50
Tracking Status
firefox49 --- affected
firefox50 --- fixed

People

(Reporter: luke, Assigned: luke)

References

Details

Attachments

(14 files, 1 obsolete file)

5.46 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
5.10 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
16.18 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
111.81 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
51.79 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
12.31 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
4.18 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
22.85 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
365.74 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
22.46 KB, patch
shu
: review+
Details | Diff | Splinter Review
36.87 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
21.76 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
22.16 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
30.07 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
To implement the "real" WebAssembly JS API (https://github.com/WebAssembly/design/blob/master/JS.md), we need two JS objects: a Module and an Instance.  Logically a Module holds compiled code and the Instance holds code + instantiation parameters (memory, table, code, etc).  Modules are also structured cloneable, so must keep a copy of the original bytecode.  This means that Module holds data that is not necessary after instantiation/caching which suggests a further internal split of wasm::Module into a wasm::Code (which is the actual code and metadata necessary for the entire duration of an Instance) and wasm::Module (stuff that can be released earlier).  Thus, wasm::Instance and wasm::Module would both point to wasm::Code (but not the other way).

(Until we stop doing patching (and otherwise specializing code to instantiation parameters), we'll need to clone 1 wasm::Code for each wasm::Instance.)
Blocks: 1276029
Attached patch pod-vector-macroSplinter Review
First a little clean-up to remove the ghastly IsPod specialization junk.
Attachment #8758532 - Flags: review?(bbouvier)
Attached patch rm-num-funcsSplinter Review
I noticed this field isn't really necessary anymore.
Attachment #8758534 - Flags: review?(bbouvier)
Attached patch move-invoke-defsSplinter Review
This moves callImport to be an internal detail of Module and puts related code closer together.
Attachment #8758535 - Flags: review?(bbouvier)
This patch separates code from the metadata, splitting the vageuly-named ModuleData into CodeSegment (just the code) and Metadata (describes the code).  What's cool is that, except for the code, which currently needs to be cloned, the Metadata is all read-only and this means that we don't need to clone it, we can simply ref-count it.  The next patch will build on this, but with this patch, cloning an asm.js module becomes a ton cheaper and we can remove all the clone() code.  AsmJSModuleData is renamed to AsmJSMetadata for symmetry.
Attachment #8758539 - Flags: review?(bbouvier)
Comment on attachment 8758532 [details] [diff] [review]
pod-vector-macro

Review of attachment 8758532 [details] [diff] [review]:
-----------------------------------------------------------------

The embedding of closing / opening the namespaces is definitely not the most pretty, but as long as it stays under control...
Attachment #8758532 - Flags: review?(bbouvier) → review+
Comment on attachment 8758534 [details] [diff] [review]
rm-num-funcs

Review of attachment 8758534 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!
Attachment #8758534 - Flags: review?(bbouvier) → review+
Comment on attachment 8758535 [details] [diff] [review]
move-invoke-defs

Review of attachment 8758535 [details] [diff] [review]:
-----------------------------------------------------------------

Nice.

::: js/src/asmjs/WasmModule.cpp
@@ +1283,5 @@
>      exit.baselineScript = nullptr;
>  }
>  
> +static bool
> +ReadI64Object(JSContext* cx, HandleValue v, int64_t* i64)

Sweet! I like that it can be made static.

::: js/src/asmjs/WasmModule.h
@@ +512,5 @@
>      ImportExit& importToExit(const Import& import);
>  
> +    bool callImport(JSContext* cx, uint32_t importIndex, unsigned argc, const uint64_t* argv,
> +                    MutableHandleValue rval);
> +    static int32_t callImport_void(int32_t importIndex, int32_t argc, uint64_t* argv);

It might be nice to keep the comment about returning int32_t and not bool (because of the optimizing compiler bug), since it is weird and any weirdness will raise questions to the reader.

::: js/src/asmjs/WasmStubs.cpp
@@ +408,4 @@
>      static const MIRType typeArray[] = { MIRType::Pointer,   // ImportExit
>                                           MIRType::Int32,     // argc
>                                           MIRType::Pointer }; // argv
>      MIRTypeVector invokeArgTypes;

Should we also rename invokeArgTypes, as it was probably referring to InvokeImport_* too? (and in GenerateInterpExit too)
Attachment #8758535 - Flags: review?(bbouvier) → review+
Comment on attachment 8758539 [details] [diff] [review]
split-code-and-metadata

Review of attachment 8758539 [details] [diff] [review]:
-----------------------------------------------------------------

The naming scheme is so much better. I would always get confused by "module" in AsmJS.cpp, before this patch. Thank you!

::: js/src/asmjs/AsmJS.cpp
@@ +306,5 @@
>  };
>  
> +// Holds the immutable guts of an AsmJSModule.
> +//
> +// AsmJSModule is built incrementally by ModuleValidator and then shared

nit: AsmJSModule => AsmJSMetadata

@@ +395,2 @@
>          *data += mallocSizeOf(exportMap_.get()) + exportMap_->sizeOfExcludingThis(mallocSizeOf);
> +        *data += mallocSizeOf(asmJSMetadata_.get()) + asmJSMetadata_->sizeOfExcludingThis(mallocSizeOf);

Might need to take cacheResult_ into account too?

@@ +2112,5 @@
>          // the index will refer into the range of AsmJSExports.
> +        MOZ_ASSERT(exportIndex <= asmJSMetadata_->exports.length());
> +        return exportIndex < asmJSMetadata_->exports.length() ||
> +               asmJSMetadata_->exports.emplaceBack(func.srcBegin() - asmJSMetadata_->srcStart,
> +                                            func.srcEnd() - asmJSMetadata_->srcStart);

nit: alignment of the second line's arguments is wrong

@@ +2347,5 @@
>          moduleObj.set(WasmModuleObject::create(cx_));
>          if (!moduleObj)
>              return false;
>  
> +        auto module = js_new<AsmJSModule>(Move(code), *metadata, *staticLinkData, *exportMap,

nit: auto*

@@ +2348,5 @@
>          if (!moduleObj)
>              return false;
>  
> +        auto module = js_new<AsmJSModule>(Move(code), *metadata, *staticLinkData, *exportMap,
> +                                          *asmJSMetadata_, CacheResult::Miss);

Should it be asmJSMetadata_->forget() here? I think finish() is the last thing we call before the ModuleValidator gets finalized and so asmJSMetadata_'s dtor will update the ref count anyways, but the intent could be made more clear here.

@@ +8098,5 @@
> +    asmJSMetadata->srcBodyStart = parser.tokenStream.currentToken().pos.end;
> +    asmJSMetadata->strict = parser.pc->sc->strict() && !parser.pc->sc->hasExplicitUseStrict();
> +    asmJSMetadata->scriptSource.reset(parser.ss);
> +
> +    auto module = js_new<AsmJSModule>(Move(code), *metadata, *staticLinkData, *exportMap,

nit: auto*

@@ +8121,5 @@
> +    UniqueCodeSegment code = CodeSegment::clone(cx, codeSegment());
> +    if (!code)
> +        return false;
> +
> +    auto module = js_new<AsmJSModule>(Move(code), metadata(), *staticLinkData_, *exportMap_,

nit: auto* to indicate it's a pointer type

@@ +8122,5 @@
> +    if (!code)
> +        return false;
> +
> +    auto module = js_new<AsmJSModule>(Move(code), metadata(), *staticLinkData_, *exportMap_,
> +                                      *asmJSMetadata_, cacheResult_);

Although a module cloning is not a cache hit per se, it has the same nice property to not need full recompilation. Should we use CacheHit here independently of cacheResult_? Or have a third enum value that means cloned?

::: js/src/asmjs/WasmGenerator.cpp
@@ +910,3 @@
>  
>      // Allocate the code (guarded by a UniquePtr until it is given to the Module).
> +    UniqueCodeSegment code = CodeSegment::allocate(cx_, codeLength, globalDataLength_);

Naming this local `code` makes the below statement `code->code()` look weird. Can we rename code to codeSegment or codeSeg here?

::: js/src/asmjs/WasmModule.cpp
@@ +78,5 @@
>      unsigned permissions =
>          ExecutableAllocator::initialProtectionFlags(ExecutableAllocator::Writable);
>  
> +    void* p = AllocateExecutableMemory(nullptr, totalLength, permissions,
> +                                       "asm-js-code", gc::SystemPageSize());

Can we update the description string to "wasm-code-segment"?

::: js/src/asmjs/WasmModule.h
@@ +52,5 @@
> +
> +class CodeSegment;
> +typedef UniquePtr<CodeSegment> UniqueCodeSegment;
> +
> +class CodeSegment final

Correct me if I'm wrong, but I think adding `final` here is (harmless but) unnecessary, as CodeSegment is not involved in any inheritance hierarchy here?

@@ +419,2 @@
>  {
> +    uint32_t              functionLength;

Pre-existing, but this name was a bit unclear to me, as it's the length of *all* the functions, not a single function. Could we rename it to functionsLength? (or funcsLength or funcBodiesLength or funcDefsLength)
Attachment #8758539 - Flags: review?(bbouvier) → review+
(In reply to Benjamin Bouvier [:bbouvier] from comment #8)
> Might need to take cacheResult_ into account too?

It's included in the mallocSizeOf(this) in Module::addSizeOfMisc.

> Should it be asmJSMetadata_->forget() here?

That's a good question and one I thought about.  The forget() will leave an outstanding ref-count (it returns already_AddRefed<T>), so we'd have to take an already_AddRefed<T> and then there's the corner case to worry about where there is a failure before the already_AddRefed is stored to a RefPtr.  Altogether, it felt simpler to just pass around &'s.

> Although a module cloning is not a cache hit per se, it has the same nice
> property to not need full recompilation. Should we use CacheHit here
> independently of cacheResult_? Or have a third enum value that means cloned?

This is mostly preserving existing behavior.  Also, the "cloning" is about to change significantly in the latter patches (it won't be "cloning", just "multiply instantiating").

> Pre-existing, but this name was a bit unclear to me, as it's the length of
> *all* the functions, not a single function. Could we rename it to
> functionsLength? (or funcsLength or funcBodiesLength or funcDefsLength)

Agreed, and fixed in a forthcoming patch :)
Whiteboard: [leave open]
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e85e9883c8fc
Baldr: tidy up IsPod specialization (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/ac03826cd5a8
Baldr: remove numFuncs (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/525966b4be23
Baldr: move and privatize callImport (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/a35b9743df32
Baldr: split out CodeSegment and Metadata from Module (r=bbouvier)
This patch removes the need to trace AsmJSModule (which simplifies later patches) by replacing PropertyName*s with CacheableChars.
Attachment #8760970 - Flags: review?(bbouvier)
Attached patch tweak-export-mapSplinter Review
This small patch moves the function-index from being a separate ExportMap vector (exportFuncIndices) to simply being a member of the Export.
Attachment #8760972 - Flags: review?(bbouvier)
This patch moves x64 global patching out of the MacroAssembler since it does not depend on any MacroAssembler state; this is necessary for a later patch.
Attachment #8760973 - Flags: review?(bbouvier)
Attached patch rm-slow-funcsSplinter Review
This patch kills SlowFunctionVector: wasm won't have a console message and won't use this info.  Also, it's easy enough to recover by manual instrumentation (and in more detail, which is usually what you want).  This simplifies the interface slightly.
Attachment #8760974 - Flags: review?(bbouvier)
Depends on: 1278754
Depends on: 1277896
This is the main patch which splits out wasm::Instance from wasm::Module.  The JS object wrappers are moved into WasmJS.h/cpp which will be extended in subsequent patches to implement the JS WebAssembly.Module/Instance APIs.

With this split, an asm.js module function is simply a wasm::Module and each call the the module function produces a wasm::Instance.  This patch removes the remainder of the complicated cloning path: now a wasm::Module *always* stores the unlinked/unspecialized code so it is always able to simply clone and link/specialize each time instantiate() is called (w/ no difference between first and subsequent calls).  Much simpler, but this does mean more memory.  Fortunately, wasm::Module can be collected much earlier than wasm::Instance, releasing this unlinked code memory.  This means that this patch can actually *reduce* asm.js memory usage (after a GC which releases the asm.js module function) and I've observed this to be the case in asm.js AngryBots.

Since Metadata and (possibly) Bytecode can be shared between Modules/Instances, to avoid double-counting in about:memory, HashSets are used.
Attachment #8761458 - Flags: review?(bbouvier)
Comment on attachment 8760970 [details] [diff] [review]
remove-asmjsmodule-trace

Review of attachment 8760970 [details] [diff] [review]:
-----------------------------------------------------------------

Nice. I like the unification of all the *name() functions into field(). Thank you for the patch.

::: js/src/asmjs/AsmJS.cpp
@@ +1849,5 @@
>      void initModuleFunctionName(PropertyName* name) {
>          MOZ_ASSERT(!moduleFunctionName_);
>          moduleFunctionName_ = name;
>      }
> +    bool initGlobalArgumentName(PropertyName* n) {

here and below, MOZ_MUST_USE

@@ +7338,5 @@
>  
>  static bool
> +GetDataProperty(JSContext* cx, HandleValue objVal, const char* fieldChars, MutableHandleValue v)
> +{
> +    RootedAtom field(cx, AtomizeUTF8Chars(cx, fieldChars, strlen(fieldChars)));

I guess this doesn't regress memory usage too much that we do the PropertyName -> CacheableChars (~chars) -> Atoms round trip, because 1) we have the shared atom stuff now and 2) there aren't too many data properties?

@@ +7428,1 @@
>          RootedValue v(cx);

Won't removing the { break compilation, as there's a RootedValue local in this case block?

@@ +7919,5 @@
>      return module.dynamicallyLink(cx, moduleObj, buffer, imports, exportObj);
>  }
>  
>  static bool
> +MaybeAppendUTF8Name(JSContext* cx, const char* utf8Chars, MutableHandle<PropertyNameVector> names)

Let's MOZ_MOST_USE this.

@@ +8852,5 @@
> +    UniqueTwoByteChars twoByteChars(UTF8CharsToNewTwoByteCharsZ(cx, utf8, &length).get());
> +    if (!twoByteChars)
> +        return false;
> +
> +    return sb->append(sep, strlen(sep)) && sb->append(twoByteChars.get(), length);

style nit: could you put the second statement after the && on a new line, please?

::: js/src/asmjs/WasmCode.cpp
@@ +287,2 @@
>  {
> +    return chars ? strlen(chars) + 1 : 0;

nice catch, and this makes deserialize's code simpler (and not waste 1 byte for null strings).
Attachment #8760970 - Flags: review?(bbouvier) → review+
(In reply to Benjamin Bouvier [:bbouvier] from comment #17)
Thanks, and good catch with the init*ArgumentName!

> I guess this doesn't regress memory usage too much that we do the
> PropertyName -> CacheableChars (~chars) -> Atoms round trip, because 1) we
> have the shared atom stuff now and 2) there aren't too many data properties?

Correct, the number of exports is minute.

> Won't removing the { break compilation, as there's a RootedValue local in
> this case block?

Hah, it would but the underlying error is jumping *past* a local so the last case can have locals :).  That's why it compiles but indeed this was unintentional: I think I started by removing the local and the curlies, then re-adding the local and forgetting to re-add the curlies.  Anyhow, will re-add.

> nice catch, and this makes deserialize's code simpler (and not waste 1 byte
> for null strings).

Yeah, and it also allows (a later patch) to keep 0 vs. "" distinct.
Comment on attachment 8760972 [details] [diff] [review]
tweak-export-map

Review of attachment 8760972 [details] [diff] [review]:
-----------------------------------------------------------------

Sweet!
Attachment #8760972 - Flags: review?(bbouvier) → review+
Attachment #8760973 - Flags: review?(bbouvier) → review+
Comment on attachment 8760974 [details] [diff] [review]
rm-slow-funcs

Review of attachment 8760974 [details] [diff] [review]:
-----------------------------------------------------------------

Sounds good. It is still quite easy to compute the compilation time for a single function and prints that to stderr in the JS shell.

Maybe a follow-up: for the total compilation time, could we use mozilla::TimeStamp? It's supposedly faster than PRMJ_Now();
Attachment #8760974 - Flags: review?(bbouvier) → review+
Comment on attachment 8761458 [details] [diff] [review]
split-module-instance

Review of attachment 8761458 [details] [diff] [review]:
-----------------------------------------------------------------

It's been a few hours and I haven't yet finished looking at AsmJS.cpp, Wasm{,Code,Generator,Instance,JS,Module}.{h,cpp}. First round of comments, I'll finish on Monday. As discussed on irc, land this if you want with rs=me now.

Please, pretty please, -- and I'm not being passive-aggressive hopefully, just want to help your next reviewers, which might include me -- next time:
- put drive-by renaming changes in their own separate patches
- put code-motion changes in their own separate patches
- ask review on reviewboard for patches that include a lot of code motion (reviewboard helps identifying blocks of code that just moved across files)
- if the patch is going to be hard to rebase over a few days, then your reviewers are going to have a bad time
</rant>

That being said, the design is indeed clean and easy to think about, and it's nice that we end up removing all the module() getters and replace them by direct accesses to metadata/instance, making it easier to understand what we touch.

::: js/src/asmjs/AsmJS.cpp
@@ +2279,5 @@
> +        // AsmJSGlobal.
> +        ImportNameVector importNames;
> +
> +        // asm.js does not have any wasm bytecode to save; view-source is
> +        // provied through the ScriptSource.

nit: provided

::: js/src/asmjs/Wasm.cpp
@@ +1103,5 @@
> +static UniqueModule
> +DecodeModule(JSContext* cx, UniqueChars file, const ShareableBytes& bytecode,
> +             MutableHandleArrayBufferObject heap)
> +{
> +    UniqueModuleGeneratorData init = js::MakeUnique<ModuleGeneratorData>(cx);

pre-existing nit: there's a using namespace js:: at the top of this file, making the prefixing spurious here.

::: js/src/asmjs/WasmBinaryToExperimentalText.h
@@ +47,5 @@
>  // representation.
>  
>  MOZ_MUST_USE bool
>  BinaryToExperimentalText(JSContext* cx, const uint8_t* bytes, size_t length, StringBuffer& buffer,
> +                         const ExperimentalTextFormatting& formatting = ExperimentalTextFormatting());

I see the only single usage of this is with the default argument. The name stating that we want to the experimental text formatting, could we just remove this parameter and make formatting a local variable of this function instead?

::: js/src/asmjs/WasmCode.cpp
@@ +172,5 @@
> +        UniqueChars owner;
> +        const char* name = metadata.getFuncName(cx, codeRange.funcIndex(), &owner);
> +        if (!name) {
> +            return false;
> +        }

nit: no braces

::: js/src/asmjs/WasmCode.h
@@ +33,2 @@
>  // A wasm CodeSegment owns the allocated executable code for a wasm module.
>  // CodeSegment passed to the Module constructor must be allocated via allocate.

This comment is now outdated (allocate).

@@ +55,5 @@
> +
> +    // The profiling mode may be changed dynamically.
> +    bool profilingEnabled_;
> +
> +    CodeSegment() : bytes_(nullptr) {}

For static analysis purposes, can we PODZero this?

@@ +112,5 @@
> +        typename SeenSet::AddPtr p = seen->lookupForAdd(self);
> +        if (p)
> +            return 0;
> +        bool ok = seen->add(p, self);
> +        (void)ok;  // oh well

We probably need to better handle this OOM: either an unrecoverable OOM crash, or just a default value 0? It is likely that some place else will either trigger a GC or crash, making the end result nonsensical anyway.

@@ +424,5 @@
> +        return kind == ModuleKind::AsmJS;
> +    }
> +    const AsmJSMetadata& asAsmJS() const {
> +        MOZ_ASSERT(isAsmJS());
> +        return *(const AsmJSMetadata*)this;

We could use a static_cast here?

::: js/src/asmjs/WasmFrameIterator.cpp
@@ +326,5 @@
>  // any pc. In non-profiling mode, the only way to observe WasmActivation::fp is
>  // to call out to C++ so, as an optimization, we don't update fp. To avoid
>  // recompilation when the profiling mode is toggled, we generate both prologues
>  // a priori and switch between prologues when the profiling mode is toggled.
> +// Specifically, ToggleProfiling patches all callsites to either call the profiling or

nit: more than 80 chars

::: js/src/asmjs/WasmModule.cpp
@@ +332,4 @@
>  
> +    // Only save the source if the target compartment is a debuggee (which is
> +    // true when, e.g., the console is open, not just when debugging is active).
> +    const ShareableBytes* maybeBytecode = cx->compartment()->isDebuggee() 

nit: trailing whitespace

::: js/src/asmjs/WasmTypes.cpp
@@ +40,5 @@
> +{
> +    switch (type_) {
> +      case ValType::I32:
> +      case ValType::F32:
> +        *(uint32_t*)dst = u.i32_;

Is it fine to do so (even on ARM) because the global's data is SimdAligned? Otherwise, using memcpy everywhere in this function sounds fine to me too.

::: js/src/asmjs/WasmTypes.h
@@ +335,5 @@
> +        MOZ_ASSERT(type_ == ValType::F32x4);
> +        return u.f32x4_;
> +    }
> +
> +    void writePayload(uint8_t* dst);

Val is a POD, could we keep it that way and have a static function in the unique file where it's used, instead?

::: js/src/builtin/AtomicsObject.cpp
@@ +531,5 @@
>  static void
>  GetCurrentAsmJSHeap(SharedMem<void*>* heap, size_t* length)
>  {
>      JSRuntime* rt = js::TlsPerThreadData.get()->runtimeFromMainThread();
> +    wasm::Instance& module = rt->wasmActivationStack()->instance();

Can you rename the local too?

::: js/src/jsobj.cpp
@@ -3789,5 @@
>      } else if (is<SharedArrayBufferObject>()) {
>          SharedArrayBufferObject::addSizeOfExcludingThis(this, mallocSizeOf, info);
> -    } else if (is<WasmModuleObject>()) {
> -        as<WasmModuleObject>().addSizeOfMisc(mallocSizeOf, &info->objectsNonHeapCodeAsmJS,
> -                                             &info->objectsMallocHeapMisc);

Following the chain of callers in searchfox, I think that the heap allocation inspection tool in the devtools won't collect metadata/bytecode's sizes anymore. This might be a problem for people profiling their code. Could we address this?

::: js/src/vm/Debugger-inl.h
@@ -71,4 @@
>  {
> -    // Insert the wasm::Module into a compartment-wide list for discovery
> -    // later without a heap walk.
> -    cx->compartment()->wasmModuleWeakList.insertBack(&wasmModule->module());

I think this is unintentional. We still walk over the list of wasmModuleWeakList in Debugger.cpp, but this is always empty if we remove this line.

::: js/src/vm/Debugger.cpp
@@ +4227,5 @@
>  
>      /*
>       * Like above, but for wasm modules.
>       */
> +    Rooted<WasmInstanceObjectVector> instanceObjectVector;

Can you keep "wasm" in the name, please?
(In reply to Benjamin Bouvier [:bbouvier] from comment #21)
Thanks!  Lots of good points.  Happy to continue to follow-up as you post more comments.

> - put code-motion changes in their own separate patches

Yes, very sorry about this.  I realize now that what I perhaps could've done is to have moved the code from WasmModule.cpp to WasmInstance.cpp before introducing wasm::Instance (so it was *pure* code motion) and renamed things from 'module' to 'instance' (so *pure* renaming) and then then next patch could've introduced Instance and tweaked everything in-place.  That being said, the actual process I used was to basically delete everything and rebuild these two classes from scratch, putting each piece where it belonged (since it wasn't quite clear at the beginning how these two things would look).  Anyhow, I'll try to do better in the future (although I think now things have finally settled down and we're done with the great wasmification).

> > +    UniqueModuleGeneratorData init = js::MakeUnique<ModuleGeneratorData>(cx);
> 
> pre-existing nit: there's a using namespace js:: at the top of this file,
> making the prefixing spurious here.

I certainly would have, but there's some arcane C++ Argument Dependent Lookup logic going on that finds an ambiguity between js::MakeUnique and mozilla::MakeUnique, so I need to qualify.

> >  MOZ_MUST_USE bool
> >  BinaryToExperimentalText(JSContext* cx, const uint8_t* bytes, size_t length, StringBuffer& buffer,
> > +                         const ExperimentalTextFormatting& formatting = ExperimentalTextFormatting());
> 
> I see the only single usage of this is with the default argument. The name
> stating that we want to the experimental text formatting, could we just
> remove this parameter and make formatting a local variable of this function
> instead?

I assume Yury or Dan have a patch that pulls out errors and passes them in the caller of this function in TestingFunctions.cpp, so I didn't want to break that, just tidy up the non-testing callsite.

> We probably need to better handle this OOM: either an unrecoverable OOM
> crash, or just a default value 0? It is likely that some place else will
> either trigger a GC or crash, making the end result nonsensical anyway.

This is actually the precedent used in MemoryMetrics.cpp for the seenScripts HashSet.  I think it makes sense too: if we're performing a diagnostic about:memory report, it may be b/c we're close to OOM.  In that case I'd rather have some modest over-counting than failure to get a report out at all.  For the most part the about:memory sweep uses hardly any memory; it just accumulates, so I expect the reports won't be too nonsensical anyway.

> > +        return *(const AsmJSMetadata*)this;
> 
> We could use a static_cast here?

Not without the AsmJSMetadata static type visible and I'd really like to keep asm.js entirely encapsulated in AsmJS.cpp.

> > +        *(uint32_t*)dst = u.i32_;
> 
> Is it fine to do so (even on ARM) because the global's data is SimdAligned?
> Otherwise, using memcpy everywhere in this function sounds fine to me too.

Correct, although might as well use the memcpy since then you don't even need to ask this question.

> > +    void writePayload(uint8_t* dst);
> 
> Val is a POD, could we keep it that way and have a static function in the
> unique file where it's used, instead?

You can have member functions and still be a POD; it's just things like virtual functions and destructors that take away PODness.  Also, I expect there are more uses coming in the future as wasm::Val is passed out/in of imports instead of js::Value.

> Following the chain of callers in searchfox, I think that the heap
> allocation inspection tool in the devtools won't collect metadata/bytecode's
> sizes anymore. This might be a problem for people profiling their code.
> Could we address this?

This logic had to move into MemoryMetrics.cpp so the HashSets could be use to avoid double-counting the shared components.  (I also hand-verified about:memory works for asm.js and wasm.)

> I think this is unintentional. We still walk over the list of
> wasmModuleWeakList in Debugger.cpp, but this is always empty if we remove
> this line.

Now it's in the Instance constructor so that it matches the destructor.  I also hand-verified that Debugger wasm pretty-printing works.

> Can you keep "wasm" in the name, please?

Oops, that was a typo.
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5dc26d954924
Odin: remove trace hook from AsmJSModule (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/3f58e166483b
Baldr: tidy up ExportMap (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/e6c1eef4b409
Baldr: move x64 global patching out of MacroAssembler (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/5b796f519ed0
Odin: remove slow-function reporting (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/b40b713ab502
Baldr: split out wasm::Instance from wasm::Module (rs=bbouvier)
Attached patch weak-mapSplinter Review
(Unrelated, but sits atop the last patch) This patch removes the WasmInstanceObject back-pointer and readBarrier, using a WeakCache instead.
Attachment #8762155 - Flags: review?(shu)
Comment on attachment 8762155 [details] [diff] [review]
weak-map

Review of attachment 8762155 [details] [diff] [review]:
-----------------------------------------------------------------

Improved ease of working with GC things makes me happy.

::: js/src/jscompartment.h
@@ +516,5 @@
>  
> +    // All wasm live instances in the compartment.
> +    using WasmInstanceObjectSet = js::GCHashSet<js::HeapPtr<js::WasmInstanceObject*>,
> +                                                js::MovableCellHasher<js::HeapPtr<js::WasmInstanceObject*>>,
> +                                                js::SystemAllocPolicy>;

o wow

::: js/src/vm/Debugger.h
@@ +686,5 @@
>      static bool slowPathOnLeaveFrame(JSContext* cx, AbstractFramePtr frame, jsbytecode* pc, bool ok);
>      static JSTrapStatus slowPathOnDebuggerStatement(JSContext* cx, AbstractFramePtr frame);
>      static JSTrapStatus slowPathOnExceptionUnwind(JSContext* cx, AbstractFramePtr frame);
>      static void slowPathOnNewScript(JSContext* cx, HandleScript script);
> +    static bool slowPathOnNewWasmInstance(JSContext* cx, Handle<WasmInstanceObject*> wasmInstance);

Hm, now that you've changed this I wonder if the |ReportOutOfMemory(cx)| in slowPathOnNewScript is correct. I wonder if any callers of it end up continue executing with a pending exception on the cx.

I'm not sure why slowPathOnNewScript is infallible in the first place, perhaps for OMT parsing?
Attachment #8762155 - Flags: review?(shu) → review+
(In reply to Shu-yu Guo [:shu] from comment #25)
> o wow

Heh, yeah, I was recommending some sort of template alias to Terrence.

> Hm, now that you've changed this I wonder if the |ReportOutOfMemory(cx)| in
> slowPathOnNewScript is correct. I wonder if any callers of it end up
> continue executing with a pending exception on the cx.

All the callers seem to assume immutable so perhaps we should simply remove the ReportOutOfMemory() b/c it's *definitely* wrong to keep running with the pending execution.  I also noticed I forgot to make the caller check the return value...
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3bdb96c5a530
Baldr: replace JSCompartment::weakInstanceWeakList with WeakCache (r=shu)
Thank you with the answer. There's just one thing...

(In reply to Luke Wagner [:luke] from comment #22)
> > Following the chain of callers in searchfox, I think that the heap
> > allocation inspection tool in the devtools won't collect metadata/bytecode's
> > sizes anymore. This might be a problem for people profiling their code.
> > Could we address this?
> 
> This logic had to move into MemoryMetrics.cpp so the HashSets could be use
> to avoid double-counting the shared components.  (I also hand-verified
> about:memory works for asm.js and wasm.)
Yes, although now we have two ways to report used memory: about:memory + the memory inspector [1]. For about:memory, I've understood (thanks to searchfox) that MemoryMetrics was fixing it; but the memory inspector uses raw calls to JSObject::addSizeOfMisc:

- called by http://searchfox.org/mozilla-central/source/js/src/jsobj.cpp#3842
- called by http://searchfox.org/mozilla-central/source/js/public/UbiNode.h#773
- called by http://searchfox.org/mozilla-central/source/devtools/shared/heapsnapshot/HeapSnapshot.cpp#1252

So this is used by the heap snapshot mechanism, and the metadata/instance won't appear anymore in any heap snapshot, giving different results for heap snapshot vs about:memory (and the difference may be quite large). Can we address this? The solution is to find the right place where to put the seen sets in the heap snapshot mechanism...

[1] https://developer.mozilla.org/en-US/docs/Tools/Memory
Flags: needinfo?(luke)
Great catch!  I wasn't aware that that tool was using the about:memory infrastructure; probably because the tool also appears to not be reporting ScriptSources either (which are doing the same seen-set thing for the same sharing reason).  Nick, is this a known bug or is there anything on file?  Likely the fix is just moving these HashSets into some context parameter that is passed to these shared memory-reporting methods.
Flags: needinfo?(luke) → needinfo?(nfitzgerald)
Attached patch dont-dup-names (obsolete) — Splinter Review
This patch kills 3 birds with one stone:
 - stores offsets of names in bytecode instead of storing a copy of names
 - trigger preservation of bytecode in Instance when names section present
 - remove ascii name encoding limitation, matching Web.md#names
Attachment #8762576 - Flags: review?(bbouvier)
Attached patch dont-dup-namesSplinter Review
Oops, forgot to test profiling on ARM simulator.
Attachment #8762576 - Attachment is obsolete: true
Attachment #8762576 - Flags: review?(bbouvier)
Attachment #8762593 - Flags: review?(bbouvier)
Comment on attachment 8761458 [details] [diff] [review]
split-module-instance

Review of attachment 8761458 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you for all the new comments! The architecture is really clean. I haven't looked at all the code (only glanced at the new code), so it is hard to spot any hidden regression, but the architecture makes sense and high level interfaces as described in the spec repo are nicely split. Thank you for the patch!

::: js/src/asmjs/WasmCode.cpp
@@ +222,5 @@
> +    MOZ_ASSERT(code.length() % gc::SystemPageSize() == 0);
> +    MOZ_ASSERT(linkData.globalDataLength % gc::SystemPageSize() == 0);
> +    MOZ_ASSERT(linkData.functionCodeLength < code.length());
> +
> +    auto cs = cx->make_unique<CodeSegment>();

style nit: auto*

::: js/src/asmjs/WasmFrameIterator.cpp
@@ +814,5 @@
>  
> +    uint8_t* profilingEntry     = instance.codeSegment().code() + codeRange.funcProfilingEntry();
> +    uint8_t* tableProfilingJump = instance.codeSegment().code() + codeRange.funcTableProfilingJump();
> +    uint8_t* profilingJump      = instance.codeSegment().code() + codeRange.funcProfilingJump();
> +    uint8_t* profilingEpilogue  = instance.codeSegment().code() + codeRange.funcProfilingEpilogue();

Maybe time to use a temporary for instance.codeSegment().code()?

::: js/src/asmjs/WasmGenerator.cpp
@@ +542,5 @@
>  bool
>  ModuleGenerator::allocateGlobalBytes(uint32_t bytes, uint32_t align, uint32_t* globalDataOffset)
>  {
> +    uint32_t pad = ComputeByteAlignment(linkData_.globalDataLength, align);
> +    if (UINT32_MAX - linkData_.globalDataLength < pad + bytes)

pre-existing: this looks like a strategy to avoid unsigned add overflow (of globalDataLength + pad + bytes), but what if pad + bytes overflows in the first place?

@@ +908,3 @@
>  
> +    // Zero the padding, since we used resizeUninitialized above.
> +    memset(code.begin() + bytesNeeded, 0, padding);

Should we inhibit icache flushing for this memset too by putting this code under the AutoFlushICache'd block above?

::: js/src/asmjs/WasmInstance.cpp
@@ +375,5 @@
> +                                           JSFunction::ASMJS_CTOR);
> +    if (!fun)
> +        return nullptr;
> +
> +    fun->setExtendedSlot(FunctionExtended::WASM_MODULE_SLOT, ObjectValue(*instanceObj));

WASM_MODULE_SLOT should be renamed into WASM_INSTANCE_SLOT.

@@ +383,5 @@
> +
> +static bool
> +CreateExportObject(JSContext* cx,
> +                   Handle<WasmInstanceObject*> instanceObj,
> +                   Handle<ArrayBufferObjectMaybeShared*> heap,

There's now a typedef for that!

@@ +472,5 @@
> +                 Handle<FunctionVector> funcImports,
> +                 const ExportMap& exportMap,
> +                 MutableHandleWasmInstanceObject instanceObj)
> +{
> +    // Ensure that WasmInstance

Looks like a truncated comment here.

@@ +474,5 @@
> +                 MutableHandleWasmInstanceObject instanceObj)
> +{
> +    // Ensure that WasmInstance
> +
> +    instanceObj.set(WasmInstanceObject::create(cx));

So if you create the WasmInstanceObject after the Instance itself, you could pass the Instance as an argument to the WasmInstanceObject, removing the need for the (infallible) WasmInstanceObject::{init,isNewBorn} methods.

@@ +479,5 @@
> +    if (!instanceObj)
> +        return false;
> +
> +    {
> +        auto instance = cx->make_unique<Instance>(Move(codeSegment), metadata, maybeBytecode,

style nit: auto*

@@ +688,5 @@
> +    if (args.isConstructing()) {
> +        // By spec, when a function is called as a constructor and this function
> +        // returns a primary type, which is the case for all wasm exported
> +        // functions, the returned value is discarded and an empty object is
> +        // returned instead.

(Not related to your patch, but I think this might not be true anymore because of SIMD as long as they are represented as objects. Not a problem when SIMD vectors will become primary value types, though, depending on how the value type spec turns out)

::: js/src/asmjs/WasmInstance.h
@@ +145,5 @@
> +
> +    // Instances maintain a GC pointer to their wrapping WasmInstanceObject.
> +    //
> +    // NB: when this or any GC field of an instance is accessed through a weak
> +    // references (viz., via JSCompartment::wasmInstanceWeakList), readBarrier()

nit: reference

::: js/src/asmjs/WasmJS.cpp
@@ +120,5 @@
> +
> +const Class WasmModuleObject::class_ =
> +{
> +    "WasmModuleObject",
> +    JSCLASS_IS_ANONYMOUS | JSCLASS_DELAY_METADATA_BUILDER |

I've tried to find out what IS_ANONYMOUS means, but the only useful comment was that it is "internal only" and won't have a prototype name. Can you explain what does this mean, for history, please?

@@ +135,5 @@
> +/* static */ WasmModuleObject*
> +WasmModuleObject::create(ExclusiveContext* cx, UniqueModule module)
> +{
> +    AutoSetNewObjectMetadata metadata(cx);
> +    auto obj = NewObjectWithGivenProto<WasmModuleObject>(cx, nullptr);

style nit: auto*

::: js/src/asmjs/WasmModule.cpp
@@ +313,5 @@
>  
>  bool
> +Module::instantiate(JSContext* cx,
> +                    Handle<FunctionVector> funcImports,
> +                    Handle<ArrayBufferObjectMaybeShared*> heap,

I think there's now a typedef for this?

@@ +327,3 @@
>      }
>  
> +    auto cs = CodeSegment::create(cx, code_, linkData_, *metadata_, heapBase, heapLength);

style nit: auto*

::: js/src/asmjs/WasmModule.h
@@ +134,5 @@
>  
>      WASM_DECLARE_SERIALIZABLE(ExportMap)
>  };
>  
> +// Module represents a compiled wasm module and primarily provides two 

nit: trailing space

@@ +138,5 @@
> +// Module represents a compiled wasm module and primarily provides two 
> +// operations: instantiation and serialization. A Module can be instantiated any
> +// number of times to produce new Instance objects. A Module can be serialized
> +// any number of times such that the serialized bytes can be deserialized later
> +// to produce and new, equivalent Module.

nit: and -> a

::: js/src/jscompartment.h
@@ +516,5 @@
>  
>      // All unboxed layouts in the compartment.
>      mozilla::LinkedList<js::UnboxedLayout> unboxedLayouts;
>  
>      // All wasm modules in the compartment. Weakly held.

This part of the comment is outdated now: modules -> instances.
Attachment #8761458 - Flags: review?(bbouvier) → review+
(In reply to Benjamin Bouvier [:bbouvier] from comment #33)
Thanks!  Good comments, will land fixes to all these.  (Some where actually fixed already in the other patch.)

> > +    auto cs = cx->make_unique<CodeSegment>();
> 
> style nit: auto*

Can't, b/c make_unique returns a UniquePtr.

> pre-existing: this looks like a strategy to avoid unsigned add overflow (of
> globalDataLength + pad + bytes), but what if pad + bytes overflows in the
> first place?

Indeed, it's CheckedInt time!

> > +    memset(code.begin() + bytesNeeded, 0, padding);
> 
> Should we inhibit icache flushing for this memset too by putting this code
> under the AutoFlushICache'd block above?

The icache flushing happens implicitly from within various MacroAssembler operations so the AutoFlushICache is only necessary in where masm is being used.

> So if you create the WasmInstanceObject after the Instance itself, you could
> pass the Instance as an argument to the WasmInstanceObject, removing the
> need for the (infallible) WasmInstanceObject::{init,isNewBorn} methods.

The problem is that Instance needs to be traced and allocating WasmInstanceObject can trigger GC.

> I've tried to find out what IS_ANONYMOUS means, but the only useful comment
> was that it is "internal only" and won't have a prototype name. Can you
> explain what does this mean, for history, please?

It seems to affect how standard classes are resolved.  Module/Instance don't have a JSProto, so it's probably superfluous here; I'll try removing.
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c70c3d933ee8
Baldr: address review comments (r=bbouvier)
Backed out for SM(nu) failures (in WasmGenerator.cpp):

https://hg.mozilla.org/integration/mozilla-inbound/rev/afb16c8ddaa46acd8d8b1e0e2204fd0936c7fda4

Push with failure: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=c70c3d933ee89de37cb7c308045482436db4dc89
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=30152657&repo=mozilla-inbound

/home/worker/workspace/build/src/js/src/asmjs/WasmGenerator.cpp:547:5: error: 'CheckedInt' was not declared in this scope
/home/worker/workspace/build/src/js/src/asmjs/WasmGenerator.cpp:547:24: error: expected primary-expression before '>' token
/home/worker/workspace/build/src/js/src/asmjs/WasmGenerator.cpp:547:72: error: 'newGlobalDataLength' was not declared in this scope
Flags: needinfo?(luke)
Comment on attachment 8762593 [details] [diff] [review]
dont-dup-names

Review of attachment 8762593 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me, thank you for the patch!

::: js/public/CharacterEncoding.h
@@ +180,5 @@
> +    const mozilla::Range<const char16_t> tbchars(begin, length);
> +    return JS::LossyTwoByteCharsToNewLatin1CharsZ(cx, tbchars);
> +}
> +
> +

nit: 2 blank lines

::: js/src/asmjs/WasmCode.cpp
@@ +228,5 @@
>          ReportOutOfMemory(cx);
>          return nullptr;
>      }
>  
> +    if (!SendCodeRangesToProfiler(cx, *cs, code, metadata))

cs contains code already, so we could probably get rid of the `code` parameter in SendCodeRangesToProfiler.

@@ +555,5 @@
> +        // return a Vector directly.
> +        size_t twoByteLength;
> +        UniqueTwoByteChars chars(JS::UTF8CharsToNewTwoByteCharsZ(cx, utf8, &twoByteLength).get());
> +        if (!chars) {
> +            cx->clearPendingException();

This is the only place where we clear the exception after a failed call to UTF8CharsToNewTwoByteCharsZ. I am not sure which functions need to clear the pending exception or not, but the inconsistency here seems to indicate there's something wrong.

@@ +556,5 @@
> +        size_t twoByteLength;
> +        UniqueTwoByteChars chars(JS::UTF8CharsToNewTwoByteCharsZ(cx, utf8, &twoByteLength).get());
> +        if (!chars) {
> +            cx->clearPendingException();
> +            goto invalid;

Also, the invalid path also tries to do some allocation, and if it failed here to alloc, it's probably going to fail below too (unless these are in different allocation pools?). Should we just return false?

::: js/src/asmjs/WasmCode.h
@@ +383,5 @@
>  {
>      return bool(heapUsage);
>  }
>  
> +// TODO

To do or not to do, that is the question.

@@ +395,5 @@
> +    NameInBytecode(uint32_t offset, uint32_t length) : offset(offset), length(length) {}
> +};
> +
> +typedef Vector<NameInBytecode, 0, SystemAllocPolicy> NameInBytecodeVector;
> +typedef Vector<char16_t, 32> TwoByteName;

As the max inline capacity of a vector is 1024 bytes, you could actually initialize TwoByteName with 64 elements, if you wanted to.

::: js/src/asmjs/WasmInstance.h
@@ +129,5 @@
>      // string will be returned.
>  
>      JSString* createText(JSContext* cx);
>  
> +    // TODO: should this be returning Vector<char16_t>?  Everyone wants it...

Sounds good to me: it would introduce a nice symmetry with getFuncAtom which returns its string too.
Attachment #8762593 - Flags: review?(bbouvier) → review+
Arg, unified build bootlegging.
Flags: needinfo?(luke)
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a509094fc1f8
Baldr: address review comments (r=bbouvier)
(In reply to Benjamin Bouvier [:bbouvier] from comment #37)
> > +    if (!SendCodeRangesToProfiler(cx, *cs, code, metadata))
> 
> cs contains code already, so we could probably get rid of the `code`
> parameter in SendCodeRangesToProfiler.

Ah, 'code' here should have been named 'bytecode' which is different than cs.code() :)

> Also, the invalid path also tries to do some allocation, and if it failed
> here to alloc, it's probably going to fail below too (unless these are in
> different allocation pools?). Should we just return false?

There's a general distinction in wasm between functions that report their errors (which typically take a 'cx' parameter) and those that don't.  There is a bug here you've spotted which is that when I changed this function from the latter to the former, I didn't ReportOutOfMemory() when calling the (non-reporting) JS_smprintf.  However, UTF8CharsToNewTwoByteCharsZ can also fail due to invalid utf8 bytes, so that's why we retry here.

> > +// TODO
> 
> To do or not to do, that is the question.

lol, good spot, I typically add "TODO"s for all things I mean to comment and fill them in at the end...

> > +    // TODO: should this be returning Vector<char16_t>?  Everyone wants it...
> 
> Sounds good to me: it would introduce a nice symmetry with getFuncAtom which
> returns its string too.

... and I clearly forgot to do one last "TODO" sweep of the patch :)  To wit, this is what I've implemented already, I just forgot to delete the comment.
(In reply to Benjamin Bouvier [:bbouvier] from comment #29)
> Thank you with the answer. There's just one thing...
> 
> (In reply to Luke Wagner [:luke] from comment #22)
> > > Following the chain of callers in searchfox, I think that the heap
> > > allocation inspection tool in the devtools won't collect metadata/bytecode's
> > > sizes anymore. This might be a problem for people profiling their code.
> > > Could we address this?
> > 
> > This logic had to move into MemoryMetrics.cpp so the HashSets could be use
> > to avoid double-counting the shared components.  (I also hand-verified
> > about:memory works for asm.js and wasm.)
> Yes, although now we have two ways to report used memory: about:memory + the
> memory inspector [1]. For about:memory, I've understood (thanks to
> searchfox) that MemoryMetrics was fixing it; but the memory inspector uses
> raw calls to JSObject::addSizeOfMisc:
> 
> - called by http://searchfox.org/mozilla-central/source/js/src/jsobj.cpp#3842
> - called by
> http://searchfox.org/mozilla-central/source/js/public/UbiNode.h#773
> - called by
> http://searchfox.org/mozilla-central/source/devtools/shared/heapsnapshot/
> HeapSnapshot.cpp#1252
> 
> So this is used by the heap snapshot mechanism, and the metadata/instance
> won't appear anymore in any heap snapshot, giving different results for heap
> snapshot vs about:memory (and the difference may be quite large). Can we
> address this? The solution is to find the right place where to put the seen
> sets in the heap snapshot mechanism...
> 
> [1] https://developer.mozilla.org/en-US/docs/Tools/Memory

Recap of non-JS::ubi::Node tooling: DMD instruments malloc/free directly and so it sees everything. about:memory requires manual instrumentation so it won't ever have 100% coverage in practice. Instead, it uses DMD to guide where to add instrumentation to the best effect and expands its coverage out towards where DMD is.

Similarly, we would like to use about:memory in the same way for JS::ubi::Node: the same way that about:memory keeps trying to expand out closer towards what DMD sees, we wish to expand JS::ubi::Node towards what about:memory sees. There are limits on this, as we will never see as much as about:memory, let alone DMD, because we only care about retained/reachable memory, eg we don't care about anything that is allocated but in a free list. Such things just aren't actionable for our users.

So, that's our general philosophy.

The way to expand what we see is to make more JS::ubi::Concrete<T> specializations for the new things we want to measure, and then add custom edges from existing JS::ubi::Concrete<T>::edges() implementations to connect them to the heap graph. A heap snapshot is just a JS::ubi::Node heap graph traversal that makes an offline copy for analyses. If the new thing we want to measure isn't associated with any specific edge from a class of JSObjects or something, then we can probably add an edge from JS::ubi::RootList to dangle it off the roots.

Does that make sense?
Flags: needinfo?(nfitzgerald)
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f1209b27c6a8
Baldr: store names as offsets into bytecode (r=bbouvier)
(In reply to Nick Fitzgerald [:fitzgen] [⏰PDT; UTC-7] from comment #42)
Thanks, makes sense.  Sounds like something we can address in the future for both kinds of objects.
Ugh, the patch uses initWithLength which leaves capacity at precisely the given length (not rounded up to almost a power of 2) which breaks the somewhat overzealous CapacityHasExcessSpace assert.  Fixed by using growBy.
Flags: needinfo?(luke)
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/3ddf6bae09dc
Baldr: store names as offsets into bytecode (r=bbouvier)
This patch stores DataSegments in Module so that the copy of data from bytecode into wasm memory can happen instead in Module::instantiate which is necessary for splitting up wasm::Eval.
Attachment #8763416 - Flags: review?(bbouvier)
Attached patch split-wasm-evalSplinter Review
This is just code movement, splitting compilation from extracting imports.  Ideally, extracting imports would be in Module::instantiate, but asm.js FFIs are significantly different that wasm and asm.js need to do it separately, so this is kept in AsmJS.cpp/WasmJS.cpp.
Attachment #8763417 - Flags: review?(bbouvier)
Comment on attachment 8763416 [details] [diff] [review]
create-heap-later

Review of attachment 8763416 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks. Arguably, the memory could Just Be part of the Module, because I don't see how (as of today) instantiate can give different memory to a module. Although I can foresee how this will happen in the future with the shared memory table for dynamic linking.

::: js/src/asmjs/Wasm.cpp
@@ +1252,5 @@
>      Rooted<FunctionVector> funcImports(cx, FunctionVector(cx));
>      if (!ImportFunctions(cx, importObj, module->importNames(), &funcImports))
>          return false;
>  
> +    return module->instantiate(cx, funcImports, nullptr, instanceObj);

Can you add a comment /* heap = */ before nullptr?

::: js/src/asmjs/WasmGenerator.h
@@ +157,2 @@
>      bool usesHeap() const;
> +    uint32_t initialHeapLength() const { return metadata_->initialHeapLength; }

Can we MOZ_ASSERT(!isAsmJS())?

@@ +157,3 @@
>      bool usesHeap() const;
> +    uint32_t initialHeapLength() const { return metadata_->initialHeapLength; }
> +    MOZ_MUST_USE bool addDataSegment(DataSegment s) { return dataSegments_.append(s); }

DataSegment is *slightly* bigger than a word... Rvalue here?

::: js/src/asmjs/WasmModule.cpp
@@ +335,5 @@
> +    // asm.js module instantiation supplies its own heap, but for wasm, create
> +    // and initialize the heap if one is requested.
> +
> +    Rooted<ArrayBufferObjectMaybeShared*> heap(cx, asmJSHeap);
> +    if (metadata_->usesHeap() && !heap) {

Can we make this a stronger assertion? TryInstantiate in AsmJS.cpp will fail if a heap is used but not passed at runtime. Maybe just before this if:

MOZ_ASSERT(IsAsmJS() == heap);

@@ +346,5 @@
> +    uint8_t* memoryBase = heap ? heap->dataPointerEither().unwrap(/* code patching */) : nullptr;
> +    uint32_t memoryLength = heap ? heap->byteLength() : 0;
> +
> +    const uint8_t* bytecode = bytecode_->begin();
> +    for (const DataSegment& seg : dataSegments_)

Can we also check that if we're in AsmJS mode, dataSegments_ is empty?
Attachment #8763416 - Flags: review?(bbouvier) → review+
Comment on attachment 8763417 [details] [diff] [review]
split-wasm-eval

Review of attachment 8763417 [details] [diff] [review]:
-----------------------------------------------------------------

Thank you! And thank you moreover for doing a simple code motion patch, this is greatly appreciated :-)

::: js/src/asmjs/WasmTypes.h
@@ +822,5 @@
>  typedef int32_t (*ExportFuncPtr)(ExportArg* args, uint8_t* global);
>  
>  // Constants:
>  
> +static const unsigned PageSize = 64 * 1024;

Although the comment for the ifdef below wasn't that useful, I think the one here was explaining the constant 64 and was nice for this.
Attachment #8763417 - Flags: review?(bbouvier) → review+
(In reply to Benjamin Bouvier [:bbouvier] from comment #51)
> Thanks. Arguably, the memory could Just Be part of the Module, because I
> don't see how (as of today) instantiate can give different memory to a
> module. Although I can foresee how this will happen in the future with the
> shared memory table for dynamic linking.

Well, Module::instantiate will ArrayBufferObject::createForWasm each time it is called so there can be N memories for each Module.

> DataSegment is *slightly* bigger than a word... Rvalue here?

For an out-of-line function, I would, but I'm anticipating inlining here which should totally decompose this struct into scalars.

> Can we also check that if we're in AsmJS mode, dataSegments_ is empty?

We could, b/c it's true, but I'm not sure if that helps us understand the code since, if asm.js were to have a way to express data segments, that'd be fine and this code would do the right thing.
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/29bc719f41f0
Baldr: create heap during instantiation, not compilation (r=bbouvier)
https://hg.mozilla.org/integration/mozilla-inbound/rev/19f38f104a09
Baldr: split wasm::Compile out of wasm::Eval (r=bbouvier)
Attached patch js-apiSplinter Review
Finally, the last patch.  With all the preceding refactoring, this one just does the simple boilerplate work of building the WebAssembly.(Module,Instance) JS wrappers as described in https://github.com/WebAssembly/design/blob/master/JS.md.
Attachment #8766588 - Flags: review?(bbouvier)
Comment on attachment 8766588 [details] [diff] [review]
js-api

Review of attachment 8766588 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, thanks. It's nice to be able to read the spec and have the code corresponding line by line (one day we can even include step numbers, as in JS stdlib algorithms).

::: js/src/asmjs/WasmJS.cpp
@@ +271,5 @@
> +    if (!GetProperty(cx, callee, callee, cx->names().prototype, &proto))
> +        MOZ_CRASH("non-configurable data property of known constructor");
> +
> +    if (!proto.isObject())
> +        MOZ_CRASH("readonly property of known constructor");

I guess there is no way to change the callee while still calling the constructor? (just raising this question sounds silly, but as this is JavaScript, there might be a way using obscure introspection features...)

@@ +391,3 @@
>  {
>      AutoSetNewObjectMetadata metadata(cx);
> +    auto obj = NewObjectWithGivenProto<WasmInstanceObject>(cx, proto);

pre-existing uber-nit: auto*

::: js/src/asmjs/WasmJS.h
@@ +53,5 @@
>  
>  } // namespace wasm
>  
> +// 'Wasm' and its one function 'instantiateModule' are transitional APIs and
> +// will be removed (replaced by 'WebAssembly') before release.

If I understand correctly, we keep them both to ensure that the demo is still runnable?

::: js/src/jit-test/tests/wasm/jsapi.js
@@ +28,5 @@
> +assertEq(Module.name, "Module");
> +assertErrorMessage(() => Module(), TypeError, /constructor without new is forbidden/);
> +assertErrorMessage(() => new Module(1), TypeError, "first argument must be a typed array");
> +assertErrorMessage(() => new Module({}), TypeError, "first argument must be a typed array");
> +assertErrorMessage(() => new Module(new Uint8Array()), /* TODO: WebAssembly.CompileError */ TypeError, /wasm validation error/);

Maybe can do:

var CompileError = WebAssembly.CompileError || TypeError;

somewhere above this code, and this code can use CompileError here, so we don't need to update the tests later?

@@ +100,5 @@
> +
> +// Exports object:
> +// Note: at some point the exports object should become an ES6 module namespace
> +// exotic object. For now, don't probe too hard on the property descriptors or
> +// the exports objec titself.

nit: object itself

@@ +109,5 @@
> +
> +var code = wasmTextToBinary('(module (func) (export "foo" 0))');
> +var e = new Instance(new Module(code)).exports;
> +assertEq(Object.keys(e).join(), "foo");
> +assertEq(e.foo(), undefined);

Can you maybe add a test where the export is unnamed, matching the default export of the module?

::: js/src/js.msg
@@ +344,5 @@
>  MSG_DEF(JSMSG_WASM_FAIL,               1, JSEXN_TYPEERR,     "wasm error: {0}")
>  MSG_DEF(JSMSG_WASM_DECODE_FAIL,        2, JSEXN_TYPEERR,     "wasm validation error at offset {0}: {1}")
>  MSG_DEF(JSMSG_WASM_TEXT_FAIL,          1, JSEXN_SYNTAXERR,   "wasm text error: {0}")
>  MSG_DEF(JSMSG_WASM_BAD_IND_CALL,       0, JSEXN_ERR,         "wasm indirect call signature mismatch")
>  MSG_DEF(JSMSG_WASM_BAD_BUF_ARG,        0, JSEXN_TYPEERR,     "first argument must be a typed array")

pre-existing, but this is misleading: it can be any BufferSource, so Typed array or ArrayBuffer.
Attachment #8766588 - Flags: review?(bbouvier) → review+
(In reply to Benjamin Bouvier [:bbouvier] from comment #58)
Thanks, good catches!

> I guess there is no way to change the callee while still calling the
> constructor? (just raising this question sounds silly, but as this is
> JavaScript, there might be a way using obscure introspection features...)

It's definitely the right question to be asking in JS-land :)  No, if you arrive in a particular js::Native, at least at the entry-point (before you clobber it with the rval), your callee is a JSFunction for that native (or, in this One Weird Cornercase involving CallNonGenericMethod, a wrapper for the native, but that can't happen unless you CallNonGenericMethod which we aren't here).  I'll add an assert, though.

> > +// 'Wasm' and its one function 'instantiateModule' are transitional APIs and
> > +// will be removed (replaced by 'WebAssembly') before release.
> 
> If I understand correctly, we keep them both to ensure that the demo is
> still runnable?

Correct, and until V8 supports the same APIs (so that demos can have something portable to run).

> var CompileError = WebAssembly.CompileError || TypeError;
> 
> somewhere above this code, and this code can use CompileError here, so we
> don't need to update the tests later?

That's a good idea, but I actually want the TODO to stick out since this is on the list of things to add before shipping and this test case may be shared with other engines and they should know this isn't the right behavior.

> Can you maybe add a test where the export is unnamed, matching the default
> export of the module?

So even though we've implemented that (b/c asm.js), this is actually wrong and will need to be fixed since the 'exports' object will always be a Module Namespace object.  Scanning the spec now, it appears that default exports simply don't show up in the Module Namespace object:
  https://tc39.github.io/ecma262/#sec-resolveexport
When we do ES6 module integration, I expect (but I'm not sure) that since "" isn't a valid lexical binding, that we could take "" to be the default export.  That or wasm modules just can't have default exports.
Whiteboard: [leave open]
Pushed by lwagner@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6668e747f0df
Baldr: add WebAssembly.(Module, Instance) (r=bbouvier)
https://hg.mozilla.org/mozilla-central/rev/6668e747f0df
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
Blocks: 1284155
Depends on: 1345453
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: