Closed
Bug 1234985
Opened 10 years ago
Closed 10 years ago
BaldrMonkey: land initial wasm compilation and testing functions
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla47
| Tracking | Status | |
|---|---|---|
| firefox47 | --- | fixed |
People
(Reporter: luke, Assigned: luke)
References
Details
Attachments
(9 files, 6 obsolete files)
|
2.97 KB,
patch
|
sfink
:
review+
|
Details | Diff | Splinter Review |
|
27.83 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
|
9.47 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
|
19.87 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
|
39.12 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
|
1.47 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
|
1.84 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
|
70.56 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
|
2.91 KB,
patch
|
luke
:
review+
|
Details | Diff | Splinter Review |
This bug is for introducing a path to compile wasm, a shell testing function 'wasmEval(bytecode)' (which compiles an ArrayBuffer of wasm bytecode), and a shell 'wasmTextToBinary(text)' (which encodes the given wasm text format into the wasm binary format). Together this will allow writing unit tests in the wasm text format a la `wasmEval(wasmTextToBinary('...'))` while also allowing direct fuzzing of the binary format via wasmEval.
Initially, the binary format will simply be that of the current asm.js bytecode (introduced by optimizations in bug 1157624, bug 1181612), but it will evolve to match the binary format being defined in the wasm spec (see https://github.com/WebAssembly/design/issues/497).
Also, initially the text format will be the temporary s-expression syntax (https://github.com/WebAssembly/spec/tree/master/ml-proto#s-expression-syntax), but it will evolve to match the spec-defined text format (https://github.com/WebAssembly/design/blob/master/TextFormat.md#official-text-format).
This bug will just establish the pipeline (text->binary->code) and iterate in future bugs.
| Assignee | ||
Comment 1•10 years ago
|
||
| Assignee | ||
Comment 2•10 years ago
|
||
Attachment #8701694 -
Attachment is obsolete: true
| Assignee | ||
Comment 3•10 years ago
|
||
Attachment #8708647 -
Flags: review?(sphink)
| Assignee | ||
Comment 4•10 years ago
|
||
This patch changes the interface between MG and clients to make things smoother for the baldr patch.
Attachment #8708649 -
Flags: review?(bbouvier)
| Assignee | ||
Comment 5•10 years ago
|
||
Building on bug 1239177, this patch adds the signature table so we can actually start validating function bodies. But to make more progress on wasm function bodies, we'll need stmt-expr unification.
Attachment #8703225 -
Attachment is obsolete: true
| Assignee | ||
Comment 6•10 years ago
|
||
Attachment #8708650 -
Attachment is obsolete: true
Comment 7•10 years ago
|
||
Comment on attachment 8708651 [details] [diff] [review]
baldr (WIP 3)
Review of attachment 8708651 [details] [diff] [review]:
-----------------------------------------------------------------
Random driveby review comments:
::: js/src/asmjs/WasmBinary.h
@@ +396,5 @@
>
> + template <class T>
> + MOZ_WARN_UNUSED_RESULT
> + bool writeEnum(T v, size_t* offset) {
> + return write(uint32_t(v), offset);
For added safety, this could static_assert mozilla::IsEnum<T>::value and sizeof(T) <= sizeof(uint32_t).
::: js/src/asmjs/WasmText.cpp
@@ +191,5 @@
> + SigVector sigs_;
> + SigMap sigMap_;
> +
> + public:
> + WasmAstModule(LifoAlloc& lifo)
explicit
@@ +657,5 @@
> + return nullptr;
> + }
> +
> + return exp;
> +
Stray whitespace.
| Assignee | ||
Comment 8•10 years ago
|
||
For now, the ops are serialized as a simple uint8. This patch hoists the serialization functions to wasm::Encoder/Decoder so they can be changed later.
Attachment #8709265 -
Flags: review?(bbouvier)
| Assignee | ||
Comment 9•10 years ago
|
||
I realized neither of these methods belonged in the wasm::Module and they can be pushed into the AsmJSModule now that it derives wasm::Module.
Attachment #8709266 -
Flags: review?(bbouvier)
Updated•10 years ago
|
Attachment #8708647 -
Flags: review?(sphink) → review+
Comment 10•10 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #0)
> This bug is for introducing a path to compile wasm, a shell testing function
> 'wasmEval(bytecode)' (which compiles an ArrayBuffer of wasm bytecode), and a
> shell 'wasmTextToBinary(text)' (which encodes the given wasm text format
> into the wasm binary format). Together this will allow writing unit tests
> in the wasm text format a la `wasmEval(wasmTextToBinary('...'))` while also
> allowing direct fuzzing of the binary format via wasmEval.
Any chance I could convince you to name those wasm.eval and wasm.textToBinary? The shell has an insane number of functions these days, making help() mostly useless, and I've started trying to separate things out into module-ish things (see for example shell/OSObject.cpp, which defines os.getenv, os.system, etc. Together with aliases for the old names, but wasm wouldn't need that part.)
I still need to fix up the help stuff a little for this (these not-modules should have help text, and the aliases should inherit help text from the "real" names). But that shouldn't block anything.
| Assignee | ||
Comment 11•10 years ago
|
||
(In reply to Steve Fink [:sfink, :s:] from comment #10)
Actually, you'll get your wish before long: what we'll probably standardize is a new 'wasm' or 'WASM' namespace object. 'wasmEval' is a temporary thing (and note that the tests are mostly not calling it directly so it'll be easy to switch).
Comment 12•10 years ago
|
||
Comment on attachment 8708649 [details] [diff] [review]
tweak-generator
Review of attachment 8708649 [details] [diff] [review]:
-----------------------------------------------------------------
I see where this is going. Thanks for the patch.
::: js/src/asmjs/WasmBinary.h
@@ +377,5 @@
> typedef Vector<uint8_t, 0, SystemAllocPolicy> Bytecode;
> typedef UniquePtr<Bytecode> UniqueBytecode;
>
> // The Encoder class recycles (through its constructor) or creates a new Bytecode (through its
> // init() method). Its Bytecode is released when it's done building the wasm IR in finish().
This comment needs an update.
@@ +392,4 @@
> }
>
> public:
> + Encoder(Bytecode& bytecode) : bytecode_(bytecode) {}
Please MOZ_ASSERT(empty()) in the ctor.
::: js/src/asmjs/WasmGenerator.cpp
@@ +108,2 @@
> {
> +
nit: blank line
@@ +126,2 @@
> shared_ = Move(shared);
> + MOZ_ASSERT(shared_->importSigs.length() == shared_->importGlobalDataOffsets.length());
This hasn't been updated to the latest shared_.imports vector, right?
@@ +313,5 @@
>
> void
> ModuleGenerator::initSig(uint32_t sigIndex, Sig&& sig)
> {
> + MOZ_ASSERT(sigIndex == numSigs_);
Perhaps also MOZ_ASSERT that we're in AsmJS mode?
@@ +318,3 @@
> MOZ_ASSERT(shared_->sigs[sigIndex] == Sig());
> +
> + numSigs_++;
Another way to keep numSigs_ (resp. numFuncSigs_) in sync would be to just take out the arrays' lengths in finishFuncDefs. I actually prefer your way, as we can assert order.
@@ +330,5 @@
>
> bool
> +ModuleGenerator::initFuncSig(uint32_t funcIndex, uint32_t sigIndex)
> +{
> + MOZ_ASSERT(funcIndex == numFuncSigs_);
Also can assert that we're in AsmJS mode.
Attachment #8708649 -
Flags: review?(bbouvier) → review+
Comment 13•10 years ago
|
||
Comment on attachment 8709266 [details] [diff] [review]
rm mutedErrors and displayURL
Review of attachment 8709266 [details] [diff] [review]:
-----------------------------------------------------------------
Very nice.
::: js/src/asmjs/WasmGenerator.cpp
@@ -602,2 @@
> CacheableChars filename,
> - CacheableTwoByteChars displayURL,
Unused parameter is very unused :-)
Attachment #8709266 -
Flags: review?(bbouvier) → review+
Comment 14•10 years ago
|
||
Comment on attachment 8709265 [details] [diff] [review]
hoist Expr encoder/decoder ops
Review of attachment 8709265 [details] [diff] [review]:
-----------------------------------------------------------------
Cheers!
::: js/src/asmjs/AsmJS.cpp
@@ +2787,5 @@
> return encoder().writeU8(uint8_t(Expr::Unreachable), offset);
> }
> MOZ_WARN_UNUSED_RESULT
> bool tempOp(size_t* offset) {
> + return encoder().writeExpr(Expr::Unreachable, offset);
This can be unified with tempU8(); and temp32() can use tempOp() as well.
::: js/src/asmjs/WasmBinary.h
@@ +326,5 @@
> + // Variable-length is somewhat annoying at the moment due to the
> + // pre-order encoding and back-patching; let's see if we switch to
> + // post-order first.
> + static_assert(mozilla::IsEnum<T>::value, "is an enum");
> + MOZ_ASSERT(uint64_t(v) < UINT8_MAX);
static_assert(sizeof(T) == sizeof(uint8_t)); ?
@@ +334,5 @@
> + template <class T>
> + void patchEnum(size_t pc, T v) {
> + // See writeEnum comment.
> + static_assert(mozilla::IsEnum<T>::value, "is an enum");
> + MOZ_ASSERT(uint64_t(v) < UINT8_MAX);
ditto
@@ +457,5 @@
>
> + template <class T>
> + MOZ_WARN_UNUSED_RESULT bool
> + readEnum(T* out) {
> + // See Encoder::writeEnum.
The static_assert would be nice here too.
@@ +476,5 @@
> }
>
> + template <class T>
> + T uncheckedReadEnum() {
> + // See Encoder::writeEnum.
And here as well.
Attachment #8709265 -
Flags: review?(bbouvier) → review+
| Assignee | ||
Comment 15•10 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #14)
> > bool tempOp(size_t* offset) {
> > + return encoder().writeExpr(Expr::Unreachable, offset);
>
> This can be unified with tempU8(); and temp32() can use tempOp() as well.
Well my intention here is that, while *U8/32 will always write exactly those sizes (and, ultimately, be eliminated since I think practically all ints will be VarU32 eventually), Op will become variable length one day.
> static_assert(sizeof(T) == sizeof(uint8_t)); ?
I would've, but that assert won't hold in the Baldr patch where ValType is passed (b/c of that GCC bug preventing a uint8 enum class which I was hoping we could resolve by making configure fail...).
| Assignee | ||
Updated•10 years ago
|
Whiteboard: [leave open]
Comment 16•10 years ago
|
||
Comment 17•10 years ago
|
||
| Assignee | ||
Comment 18•10 years ago
|
||
This patch makes names optional, auto-generating wasm names like "wasm-function[0]" in their absence.
Attachment #8710243 -
Flags: review?(bbouvier)
| Assignee | ||
Comment 19•10 years ago
|
||
This trivial patch switches integer immediates of I32Const to be VarU32 from U32. Later I'd actually like to introduce VarS32 so that small negative integers use a small number of bytes instead of 5 with VarU32, but this is fine for now.
Attachment #8710245 -
Flags: review?(bbouvier)
| Assignee | ||
Comment 20•10 years ago
|
||
Ok, I think I have enough of a "Hello, World" working to land and then start iterating in small patches.
In summary:
- Wasm.cpp implements the binary -> wasm::Module compilation using the same WasmGenerator.(h,cpp) abstraction as asm.js. Most of the work up to this point has just been pulling all the asm.js/JS-specific stuff out of this pipeline.
- Wasm.cpp decodes all the module-level stuff (imports, exports, etc) but hands over the function body bytes to the shared pipeline for decoding in WasmIonCompile.cpp.
- Since WasmIonCompile.cpp assumes well-formed bytes, to avoid a bunch of fuzzer asserts, Wasm.cpp performs a validation pass before OK'ing the bytes to be passed to WasmIonCompile.cpp. This is suboptimal (the bytes are decoded twice) and later this decoding could be moved into baseline-wasm.
- Decoding is also recursive now, but should be made iterative later (when folded into baseline).
- The wasm binary is reminiscent of, but not exactly the same as, the v8 binary format in https://github.com/WebAssembly/design/issues/497. The expectation is that we'll be iterating on the binary format design over the next few months towards convergence, so SM is basically our take on where we should be going.
Attachment #8708651 -
Attachment is obsolete: true
Attachment #8710250 -
Flags: review?(bbouvier)
| Assignee | ||
Comment 21•10 years ago
|
||
Oops, this patch goes before baldr. It avoids the need for a final explicit return opcode, making function bodies expressions for wasm.
Attachment #8710251 -
Flags: review?(bbouvier)
Comment 22•10 years ago
|
||
| bugherder | ||
Comment 23•10 years ago
|
||
Comment on attachment 8710245 [details] [diff] [review]
var-len-lit
Review of attachment 8710245 [details] [diff] [review]:
-----------------------------------------------------------------
Can't wait for writeVarS32
Attachment #8710245 -
Flags: review?(bbouvier) → review+
Comment 24•10 years ago
|
||
Comment on attachment 8710251 [details] [diff] [review]
final-return
Review of attachment 8710251 [details] [diff] [review]:
-----------------------------------------------------------------
Nice. I can already imagine all the tests checking that control flow expressions yield values!
::: js/src/asmjs/AsmJS.cpp
@@ +3424,3 @@
> }
>
> return true;
Control flow can be simplified a bit:
if (!lastNonEmpty... && !IsVoid(...))
return f.fail(...);
return true;
Attachment #8710251 -
Flags: review?(bbouvier) → review+
Comment 25•10 years ago
|
||
Comment on attachment 8710243 [details] [diff] [review]
optional-names
Review of attachment 8710243 [details] [diff] [review]:
-----------------------------------------------------------------
Quite elegant.
::: js/src/asmjs/WasmGenerator.cpp
@@ +123,5 @@
> // already initialized.
> shared_ = Move(shared);
> if (kind == ModuleKind::Wasm) {
> numSigs_ = shared_->sigs.length();
> + module_->numFuncs = shared_->funcSigs.length();
note to self: No need to initialize in asm.js mode because ModuleData is PodZero'd.
::: js/src/asmjs/WasmGenerator.h
@@ +35,5 @@
>
> struct SlowFunction
> {
> + SlowFunction(uint32_t index, unsigned ms, unsigned lineOrBytecode)
> + : index(index), ms(ms)
nit: don't forget to set lineOrBytecode here!
::: js/src/asmjs/WasmModule.cpp
@@ +681,5 @@
> + if (!PerfFuncEnabled())
> + return true;
> +#elif defined(MOZ_VTUNE)
> + if (!IsVTuneProfilingActive())
> + return true;
If Spidermonkey is built with Perf support *and* VTune support, this means we'll have to enable both or none. Maybe
if (false
#ifdef JS_ION_PERF
|| !PerfFuncEnabled()
#endif
#ifdef MOZ_VTUNE
|| !IsVTuneProfilingActive()
#endif
)
This is a bit ugly though...
Or another way is to unify what enables Perf *and* VTune.
Or just get back to what it was?
@@ +710,3 @@
> unsigned method_id = iJIT_GetNewMethodID();
> if (method_id == 0)
> return;
This should return a boolean, not void;
@@ +1424,5 @@
> +
> +const char*
> +Module::getFuncName(JSContext* cx, uint32_t funcIndex, UniqueChars* owner) const
> +{
> + if (!module_->prettyFuncNames.empty())
In webassembly, can anonymous and named functions live in a same module? (I would say so, assuming that you've moved numFuncs to the pod so as to have the upper bound of the number of names). If so, this condition will trigger a OOB if we're trying to get the name of an anonymous function.
@@ +1427,5 @@
> +{
> + if (!module_->prettyFuncNames.empty())
> + return prettyFuncName(funcIndex);
> +
> + char* chars = JS_smprintf("wasm-function[%u]", funcIndex);
I guess annoying know-it-all who'll explicitly name their function "wasm-function[0]" won't be able to distinguish it from unnamed functions. That's a reasonable price to pay for being annoying :-)
::: js/src/asmjs/WasmModule.h
@@ +219,5 @@
>
> CodeRange() = default;
> CodeRange(Kind kind, Offsets offsets);
> CodeRange(Kind kind, ProfilingOffsets offsets);
> + CodeRange(uint32_t nameIndex, uint32_t lineOrBytecode, FuncOffsets offsets);
s/nameIndex/funcIndex
Attachment #8710243 -
Flags: review?(bbouvier) → review+
| Assignee | ||
Comment 26•10 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #25)
Thanks! Good catches.
> In webassembly, can anonymous and named functions live in a same module? (I
> would say so, assuming that you've moved numFuncs to the pod so as to have
> the upper bound of the number of names). If so, this condition will trigger
> a OOB if we're trying to get the name of an anonymous function.
I'm expecting it'll be all-or-nothing, but if it isn't, then yeah, we'd have prettyFuncNames either be empty or numFuncs-sized and have null elements that we'd have to check for in getFuncName().
Comment 27•10 years ago
|
||
| Assignee | ||
Comment 28•10 years ago
|
||
Rebased and fixed so that on platforms where wasm isn't supported (JS_CODEGEN_NONE, non-hardware-fp, non-standard-page-size (we could drop this requirement pretty easily)) the wasmEval/wasmTextToBinary functions simply aren't available so all the wasm tests can start with "if (!wasmEval) quit()".
Attachment #8710250 -
Attachment is obsolete: true
Attachment #8710250 -
Flags: review?(bbouvier)
Attachment #8710818 -
Flags: review?(bbouvier)
Comment 29•10 years ago
|
||
Comment on attachment 8710250 [details] [diff] [review]
baldr
Review of attachment 8710250 [details] [diff] [review]:
-----------------------------------------------------------------
First set of comments that I made yesterday before the rebased patch.
::: js/src/asmjs/Wasm.cpp
@@ +65,5 @@
> + JSContext* cx() const { return cx_; }
> + Decoder& d() const { return d_; }
> + ModuleGenerator& mg() const { return mg_; }
> + FunctionGenerator& fg() const { return fg_; }
> + uint32_t funcIndex() const { return funcIndex_; }
These getters are unused so far (cx(), mg(), fg(), funcIndex()). How about we add them only when we need them?
(FunctionGenerator is entirely unused (yet))
@@ +93,5 @@
> + switch (expr) {
> + case Expr::Nop:
> + return CheckType(f, ExprType::Void, expected);
> + case Expr::I32Const:
> + return f.d().readVarU32() && CheckType(f, ExprType::I32, expected);
If readVarU32() fails, the whole validation will fail with oom. Could you f.fail("unable to read variable-length u32") here instead?
It would actually be even nicer to have helpers for this,
bool DecodeVarU32(FunctionContext& f)
{
if (!f.d().readVarU32())
return f.fail("blablabla");
return true;
}
::: js/src/asmjs/Wasm.h
@@ +27,5 @@
> +// Compile and instantiate the given binary-encoded wasm module and return the
> +// resulting export object, which contains the module's exported functions as
> +// properties.
> +bool
> +Eval(JSContext* cx, UniqueChars filename, Handle<ArrayBufferObject*> code,
If you needed js/Utility.h in WasmText.h for UniqueChars, it's likely you'll need it here as well.
::: js/src/asmjs/WasmBinary.h
@@ +379,5 @@
> MOZ_WARN_UNUSED_RESULT bool
> writeExpr(Expr expr, size_t* offset = nullptr) {
> return writeEnum(expr, offset);
> }
> + MOZ_WARN_UNUSED_RESULT bool writeValType(ValType type, size_t* offset = nullptr) {
Can you please unify the signature style with writeExpr above (in one way or the other)?
@@ +423,5 @@
> + return bytecode_.append(reinterpret_cast<const uint8_t*>(cstr), strlen(cstr) + 1);
> + }
> +
> + MOZ_WARN_UNUSED_RESULT bool startSection(size_t* offset) {
> + if (!writeU32(0))
If you wanted, you could put a sentinel value that gets checked in finishSection afterwards before patching.
@@ +429,5 @@
> + *offset = bytecode_.length();
> + return true;
> + }
> + void finishSection(size_t offset) {
> + uint8_t* patchAt = bytecode_.begin() + offset - sizeof(uint32_t);
MOZ_ASSERT(patchAt < bytecode_.end());
@@ +430,5 @@
> + return true;
> + }
> + void finishSection(size_t offset) {
> + uint8_t* patchAt = bytecode_.begin() + offset - sizeof(uint32_t);
> + uint32_t numBytes = bytecode_.length() - offset;
MOZ_ASSERT(numBytes <= bytecode_.length());
@@ +509,5 @@
> public:
> + Decoder(const uint8_t* begin, const uint8_t* end)
> + : beg_(begin),
> + end_(end),
> + cur_(begin)
MOZ_ASSERT(begin <= end);
@@ +532,2 @@
> void assertCurrentIs(const DebugOnly<size_t> offset) const {
> MOZ_ASSERT(size_t(cur_ - beg_) == offset);
Can you use currentOffset() in that function please?
@@ +619,5 @@
> + MOZ_WARN_UNUSED_RESULT bool startSection(uint32_t* offset) {
> + uint32_t unused;
> + if (!readU32(&unused))
> + return false;
> + *offset = cur_ - beg_;
You can use currentOffset() here
::: js/src/asmjs/WasmText.h
@@ +26,5 @@
> +namespace wasm {
> +
> +// Translate the textual representation of a wasm module (given by a
> +// null-terminated char16_t array) into a Bytecode object. If there is an error
> +// other than out-of-memory an error message string will be stored in 'error'
nit: end with period
::: js/src/jit-test/lib/asserts.js
@@ +74,5 @@
> if (!(e instanceof ctor))
> throw new Error("Assertion failed: expected exception " + ctor.name + ", got " + e);
> if (typeof test == "string") {
> if (test != e.message)
> + throw new Error("Assertion failed: expected " + test + ", got " + e.message);
I hear we even have template strings now...
::: js/src/jsutil.h
@@ +136,5 @@
> }
>
> +template <class Container1, class Container2>
> +static inline bool
> +EqualContainers(const Container1& lhs, const Container2& rhs)
Could you describe in a comment what's expected from the classes Container1/Container2 (namely, they need to implement length() and the const operator[])
@@ +149,5 @@
> +}
> +
> +template <class Container>
> +static inline HashNumber
> +HashContainer(const Container& c, HashNumber hn = 0)
Ditto
Considering the signature, would a better name be AddContainerToHash? Ideally, that'd be called HashContainer if hn = 0 and AddContainerToHash otherwise, but we can't go to that level of detail.
::: js/src/vm/ArrayBufferObject.h
@@ -276,5 @@
> - /*
> - * Ensure data is not stored inline in the object. Used when handing back a
> - * GC-safe pointer.
> - */
> - static bool ensureNonInline(JSContext* cx, Handle<ArrayBufferObject*> buffer);
nice catch (compilers are really permissive, aren't they)
Attachment #8710250 -
Attachment is obsolete: false
Comment 30•10 years ago
|
||
| bugherder | ||
| Assignee | ||
Comment 31•10 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #29)
Great comments, thanks!
> These getters are unused so far (cx(), mg(), fg(), funcIndex()). How about
> we add them only when we need them?
Ah, they were used by DecodeReturn in the previous patch which I was able to take out with 9c9ae4b5cacc. I could remove them but they're going to come back immediately as soon as we implement anything more complicated than nop/i32.const so I'd rather keep them to show the intended structure of the code.
> It would actually be even nicer to have helpers for this,
Yes, that's a good idea, probably as members of the FunctionContext which maybe should be renamed to FunctionDecoder for symmetry with the other Function*s. Perhaps that could be in a future patch that introduces formatting strings though so we could write `f.readVarU32("i32.const")` and get an error that mentioned the operator.
> MOZ_ASSERT(numBytes <= bytecode_.length());
This won't hold for wasm input since the bits are arbitrary.
> Could you describe in a comment what's expected from the classes
> Container1/Container2 (namely, they need to implement length() and the const
> operator[])
I could, but I felt like the name, tiny function body, and compile errors would make that sufficiently clear.
Comment 32•10 years ago
|
||
Comment on attachment 8710818 [details] [diff] [review]
baldr
Review of attachment 8710818 [details] [diff] [review]:
-----------------------------------------------------------------
Still need to give a look at the tests and re-read all of WasmText, but this looks very nice and elegant. A few more comments below.
::: js/src/asmjs/Wasm.cpp
@@ +297,5 @@
> + if (!d.finishSection(sectionStart))
> + return Fail(cx, d, "func section byte size mismatch");
> +
> + int64_t after = PRMJ_Now();
> + unsigned generateTime = (after - before) / PRMJ_USEC_PER_SEC;
We probably want PRMJ_USEC_PER_MSEC here, don't we?
@@ +319,5 @@
> + if (!d.readVarU32(&numFuncs))
> + return Fail(cx, d, "expected number of functions");
> +
> + for (uint32_t i = 0; i < numFuncs; i++) {
> + if (!DecodeFunc(cx, d, mg, funcIndex++))
Probably want to limit the number of functions or to check funcIndex for overflow here?
::: js/src/asmjs/WasmBinary.h
@@ +411,5 @@
> return true;
> }
>
> + MOZ_WARN_UNUSED_RESULT bool writeCString(const char* cstr) {
> + return bytecode_.append(reinterpret_cast<const uint8_t*>(cstr), strlen(cstr) + 1);
MOZ_ASSERT(cstr);
@@ +601,5 @@
> + MOZ_WARN_UNUSED_RESULT bool readCString(const char** cstr = nullptr) {
> + if (cstr)
> + *cstr = reinterpret_cast<const char*>(cur_);
> + for (; cur_ != end_; cur_++) {
> + if (*cur_ == '\0') {
For consistency with the next function, can we use: if (!*cur_) ?
@@ +631,5 @@
> + MOZ_WARN_UNUSED_RESULT bool finishSection(uint32_t offset) {
> + const uint8_t* start = beg_ + offset;
> + uint32_t numBytes;
> + memcpy(&numBytes, start - sizeof(uint32_t), sizeof(uint32_t));
> + return numBytes == uintptr_t(cur_ - start);
So if I understand correctly and look at all the uses of startSection/finishSection, the apis could be simpler:
- startSection could write the number of bytes in an outparam uint32_t
- then finishSection could directly use this number of bytes, instead of recording the offset to where the number of bytes can be read.
Also, skipSection() could just read the value recorded by startSection(), instead of being more aware of the section format and explicitly readU32() the size of bytes, in DecodeUnknownSection.
What do you think?
@@ +636,5 @@
> + }
> + MOZ_WARN_UNUSED_RESULT bool skipSection(uint32_t numBytes) {
> + if (uintptr_t(end_ - cur_) < numBytes)
> + return false;
> + cur_ += numBytes;
Arbitrary bits could overflow addition this while having the first assertion being correct, e.g. if numBytes == UINT32_MAX we can effectively provoke an infinite loop (see an unknown section of size -1, go back byte per byte to the start of a previous unknown section, skip it, see the unknown section of size -1 again).
If this is true, could you add a test, please?
::: js/src/asmjs/WasmText.cpp
@@ +47,5 @@
> +using WasmAstHashMap = HashMap<K, V, HP, LifoAllocPolicy<Fallible>>;
> +
> +typedef WasmAstVector<ValType> WasmAstValTypeVector;
> +
> +struct WasmAstBase
Would using TempObject from jit/JitAllocPolicy work here? Seems we could create a TempAllocator from the LifoAlloc to the same effect.
@@ +119,5 @@
> + return static_cast<T&>(*this);
> + }
> +};
> +
> +struct WasmAstNop : WasmAstExpr
no public inheritance for Nop?
@@ +167,5 @@
> + : WasmAstNode(WasmAstKind::Export),
> + externalName_(externalNameBegin),
> + externalNameLength_(externalNameEnd - externalNameBegin),
> + internalIndex_(UINT32_MAX)
> + {}
MOZ_ASSERT(externalNameBegin <= externalNameEnd);
@@ +314,5 @@
> + }
> + const char16_t* textEnd() const {
> + MOZ_ASSERT(kind_ == Text);
> + MOZ_ASSERT(end_[-1] == '"');
> + return end_ - 1;
Per the previous assertion this will be '"' all the time, is this off by one?
@@ +329,5 @@
> +
> +static bool
> +IsWasmNewLine(char16_t c)
> +{
> + return c == '\n';
nitpicking here, but what about '\r'?
@@ +335,5 @@
> +
> +static bool
> +IsWasmSpace(char16_t c)
> +{
> + return c == ' ' || c == '\t';
Should a new line be a space? I think so, considering the beginning of WasmTokenStream::next()
@@ +407,5 @@
> + case ')':
> + return WasmToken(WasmToken::CloseParen, begin, cur_);
> +
> + case '0': case '1': case '2': case '3': case '4':
> + case '5': case '6': case '7': case '8': case '9': {
What about signed numbers? (later patch?)
@@ +413,5 @@
> + while (cur_ != end_ && IsWasmDigit(*cur_)) {
> + u32 *= 10;
> + u32 += *cur_ - '0';
> + if (!u32.isValid())
> + break;
If we break here, then we call u32.value() which MOZ_ASSERT(isValid()), so this will trigger an error. In non-debug builds, this will just yield the first digit. Can we fail instead here?
::: js/src/builtin/TestingFunctions.cpp
@@ +3692,5 @@
> fuzzingSafe = true;
>
> disableOOMFunctions = disableOOMFunctions_;
>
> + if (!wasm::DefineTestingFunctions(cx, obj))
This isn't very idiomatic, but I guess you're doing this because wasm functions will get into a WASM global object later?
| Assignee | ||
Comment 33•10 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #32)
Thanks! Great catches; I'll upload a new patch with all these fixed.
> - startSection could write the number of bytes in an outparam uint32_t
> - then finishSection could directly use this number of bytes, instead of
> recording the offset to where the number of bytes can be read.
That would almost work but if finishSection is given the number of bytes, it doesn't know how to verify that this is correct: you need to know where you were at startSection. A less trixie strategy would be to return/pass both offset and the numBytes, but that didn't seem worth it.
> > + MOZ_WARN_UNUSED_RESULT bool skipSection(uint32_t numBytes) {
> > + if (uintptr_t(end_ - cur_) < numBytes)
> > + return false;
> > + cur_ += numBytes;
>
> Arbitrary bits could overflow addition this while having the first assertion
> being correct,
I don't see how: if the test doesn't return false then cur_ should be in the range [beg_, end_].
> Would using TempObject from jit/JitAllocPolicy work here? Seems we could
> create a TempAllocator from the LifoAlloc to the same effect.
It could perhaps but I kinda like decoupling the two (this is outside the JIT) and the code-reuse is pretty minimal. E.g., I don't want to do the ballast thing.
> > +struct WasmAstNop : WasmAstExpr
>
> no public inheritance for Nop?
structs default to public inheritance
> > + MOZ_ASSERT(end_[-1] == '"');
> > + return end_ - 1;
>
> Per the previous assertion this will be '"' all the time, is this off by one?
The end in a [begin, end) pair is usually non-inclusive so "one past the end" which, for string literals is what we want.
> What about signed numbers? (later patch?)
Yeah, this is super-bare-bones; *lots* left to do here if you compare with https://github.com/WebAssembly/spec/blob/master/ml-proto/host/lexer.mll.
> > + if (!wasm::DefineTestingFunctions(cx, obj))
>
> This isn't very idiomatic, but I guess you're doing this because wasm
> functions will get into a WASM global object later?
That's right and this function would be called instead from the vm/GlobalObject machinery. Even then, though, if the device doesn't support wasm (no hardfp, old SSE) I was thinking we'd represent that by not having a 'WASM' object.
| Assignee | ||
Comment 34•10 years ago
|
||
Updated with comments addressed.
Attachment #8710818 -
Attachment is obsolete: true
Attachment #8710818 -
Flags: review?(bbouvier)
Attachment #8711210 -
Flags: review?(bbouvier)
| Assignee | ||
Updated•10 years ago
|
Attachment #8710250 -
Attachment is obsolete: true
Comment 35•10 years ago
|
||
Comment on attachment 8711210 [details] [diff] [review]
baldr
Review of attachment 8711210 [details] [diff] [review]:
-----------------------------------------------------------------
Great work! This is so nice that no other pieces in the module generation et al. don't change in this patch, so thank you for all the previous refactorings.
This is very nice to read and to understand in general, we can feel the asmjs/ subdir style touch (maybe time to rename this directory?).
I am excited to see this landing. Could we expose the wasmEval/wasmTextToBinary on Nightly only and/or behind a flag, so that we can demo it?
::: js/src/asmjs/WasmBinary.h
@@ +345,5 @@
> bytecode_[pc] = uint8_t(v);
> }
>
> + template <class T>
> + static const T loadUnaligned(const uint8_t* p) {
the "unaligned" is implicitly assumed in the Decoder class, maybe we could just assume it here too?
::: js/src/asmjs/WasmText.cpp
@@ +269,5 @@
> + ValType valueType_;
> + } u;
> + public:
> + WasmToken() = default;
> + WasmToken(Kind kind, const char16_t* begin, const char16_t* end)
Can we make it explicit as the other non-default ctors?
@@ +331,5 @@
> +
> +static bool
> +IsWasmNewLine(char16_t c)
> +{
> + return c == '\n' || c == '\r';
Hah, getting more information about EOL encoding, there could \r\n to indicate a single end of line, although this will be counted as two lines per the current impl. Can you fix this, please?
@@ +337,5 @@
> +
> +static bool
> +IsWasmSpace(char16_t c)
> +{
> + return IsWasmNewLine(c) || c == ' ' || c == '\t';
According the comment below FirstCharKind in TokenStream.cpp, there are also \v and \f in this category.
Bonus points if you can share code with TokenStream.cpp/h.
@@ +370,5 @@
> + uint32_t lookaheadIndex_;
> + uint32_t lookaheadDepth_;
> + WasmToken lookahead_[LookaheadSize];
> +
> + bool finishKeyword(const char16_t* end, const char16_t* keyword) {
Considering that this is much used in the context of a prefix tree, can you name it "startsWith" or "prefixedBy"? The name is misleading in the sense that you can do finishKeyword('e') and then finishKeyword('xport') but this is a single keyword.
@@ +513,5 @@
> + WasmToken get() {
> + if (lookaheadDepth_) {
> + lookaheadDepth_--;
> + uint32_t got = lookaheadIndex_;
> + lookaheadIndex_ ^= 1;
nit: please static_assert LookaheadSize == 2 or MOZ_ASSERT(lookaheadIndex <= 1)
@@ +514,5 @@
> + if (lookaheadDepth_) {
> + lookaheadDepth_--;
> + uint32_t got = lookaheadIndex_;
> + lookaheadIndex_ ^= 1;
> + return lookahead_[got];
You can just store the WasmToken before xor-ing the lookaheadIndex, and return the token rather than storing the lookahead index?
@@ +520,5 @@
> + return next();
> + }
> + void unget(WasmToken token) {
> + MOZ_ASSERT(lookaheadDepth_ <= 1);
> + lookaheadDepth_++;
Is it me or lookaheadDepth_ === lookaheadIndex_ + 1 everywhere? Maybe we could spare one variable here...
@@ +614,5 @@
> + if (!args.append(valueType.valueType()))
> + return nullptr;
> + break;
> + }
> + case WasmToken::Result: {
Can Param and Result be put in any order and interleaved? It seems that (func) (param) (result) (param) would validate, and it doesn't look very intuitive. Or is your intent to let wasmTextToBinary be a bit loose (thus not exposed to the open web), so that all errors get caught by wasmEval? In any cases, this needs tests.
@@ +616,5 @@
> + break;
> + }
> + case WasmToken::Result: {
> + if (result != ExprType::Void) {
> + c.ts.generateError(field, c.error);
Could we have a more explicit message, "can't have several result types"?
@@ +685,5 @@
> +
> + while (c.ts.getIf(WasmToken::OpenParen)) {
> + WasmToken section = c.ts.get();
> +
> + switch (section.kind()) {
Sections can be put in any order/interleaved, and there can be several of them? In any cases, this needs tests!
::: js/src/jit-test/tests/wasm/basic.js
@@ +27,5 @@
> +var ver1 = 0xff;
> +var ver2 = 0xff;
> +var ver3 = 0xff;
> +
> +assertErrorMessage(() => wasmEval(Uint8Array.of().buffer), Error, magicError);
If wasmEval is supposed to be long lived and callable on the wasm global object, you might want to test it as well: 0 args, 3 args, 1 arg but not an array buffer, 2 args and second one isn't an object. Ditto for wasmTextToBinary. Of course, if these are just testing functions, no need to test them.
@@ +38,5 @@
> +assertErrorMessage(() => wasmEval(Uint8Array.of(magic0, magic1, magic2, magic3, ver0, ver1, ver2).buffer), Error, versionError);
> +assertErrorMessage(() => wasmEval(Uint8Array.of(magic0, magic1, magic2, magic3, ver0, ver1, ver2, ver3, 0, 1).buffer), Error, extraError);
> +
> +var o = wasmEval(Uint8Array.of(magic0, magic1, magic2, magic3, ver0, ver1, ver2, ver3, 0).buffer);
> +assertEq(Object.getOwnPropertyNames(o).length, 0);
Binary decoding can go wrong in so many ways that are untested here. Do you mind to move these first tests to a binary.js file, and add more binary tests there? This can be done in a later patch / bug; actually I'd be volunteering to write such test cases.
@@ +44,5 @@
> +assertErrorMessage(() => wasmEvalText(""), Error, parsingError);
> +assertErrorMessage(() => wasmEvalText("("), Error, parsingError);
> +assertErrorMessage(() => wasmEvalText("(m"), Error, parsingError);
> +assertErrorMessage(() => wasmEvalText("(module"), Error, parsingError);
> +assertErrorMessage(() => wasmEvalText("(module"), Error, parsingError);
Can you add a test that "(moduler)" is also a parse error (to try the prefix tree)?
Attachment #8711210 -
Flags: review?(bbouvier) → review+
| Assignee | ||
Comment 36•10 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #35)
Thanks, great comments.
> I am excited to see this landing. Could we expose the
> wasmEval/wasmTextToBinary on Nightly only and/or behind a flag, so that we
> can demo it?
The plan is to expose behind a pref (default off) however, that will likely be a separate WASM.compile/eval function which returns a Promise (which will be easier to do once we have bug 911216). Thus, wasmEval itself should remain a testing-only function.
> Is it me or lookaheadDepth_ === lookaheadIndex_ + 1 everywhere? Maybe we
> could spare one variable here...
When the lookaheadDepth is > 0, the lookaheadIndex can be either 0 or 1. You can think of this as a circular buffer specialized to the case where the buffer size is 2.
> Can Param and Result be put in any order and interleaved?
Yep: https://github.com/WebAssembly/spec/blob/master/ml-proto/host/parser.mly#L293
> Or is your intent to let wasmTextToBinary be a bit loose (thus
> not exposed to the open web), so that all errors get caught by wasmEval?
This is ultimately a temporary text format (https://github.com/WebAssembly/design/blob/master/TextFormat.md#official-text-format) and so the goal is really just to have "enough to write tests". Later a more precisely-defined text language would be specified which gets a more careful treatment (and probably a switch to lex/yacc). Note that, in any case, this won't ever be content-visible.
> Could we have a more explicit message, "can't have several result types"?
Given the testing-only goals and possible rewrite to lex/yacc, I'm not too interested in building up the error reporting. Patches welcome, of course.
Comment 37•10 years ago
|
||
Comment 38•10 years ago
|
||
| bugherder | ||
Comment 39•10 years ago
|
||
decoder wants to build the files in asmjs/ in non-unified mode. This patch fixes some compile issues blocking that.
Attachment #8712209 -
Flags: review?(luke)
| Assignee | ||
Comment 40•10 years ago
|
||
Comment on attachment 8712209 [details] [diff] [review]
Fix non-unified build
Review of attachment 8712209 [details] [diff] [review]:
-----------------------------------------------------------------
Thanks!
::: js/src/asmjs/Wasm.cpp
@@ +122,5 @@
> uintptr_t bodyLength = bodyEnd - bodyBegin;
> if (!fg.bytecode().resize(bodyLength))
> return false;
>
> + mozilla::PodCopy(fg.bytecode().begin(), bodyBegin, bodyLength);
Can you add a 'using mozilla::PodCopy' up at the top?
Attachment #8712209 -
Flags: review?(luke) → review+
Comment 41•10 years ago
|
||
| Assignee | ||
Updated•10 years ago
|
Whiteboard: [leave open]
Comment 42•10 years ago
|
||
| bugherder | ||
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
status-firefox47:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
You need to log in
before you can comment on or make changes to this bug.
Description
•