Closed Bug 1277246 Opened 3 years ago Closed 3 years ago

[wasm] Add support for names section

Categories

(Core :: JavaScript Engine: JIT, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla49
Tracking Status
firefox49 --- fixed

People

(Reporter: yury, Assigned: yury)

References

Details

Attachments

(1 file)

The binary WebAssembly format may contain names section that has original function (and locals) names [1]. It's expected to have these function name in the stack traces if the valid names section is present. Notice that this section does not cause validation for the whole module to fail. 

  [1] https://github.com/WebAssembly/design/blob/master/BinaryEncoding.md#name-section
Comment on attachment 8758868 [details]
Bug 1277246 - Add support for names section.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56980/diff/1-2/
Comment on attachment 8758868 [details]
Bug 1277246 - Add support for names section.

https://reviewboard.mozilla.org/r/56980/#review53752

Thanks!  Just some small requests:

::: js/src/asmjs/Wasm.cpp:1030
(Diff revision 2)
>  
>      return true;
>  }
>  
> +// Rejecting names that are empty or contain non-ASCII characters.
> +// TODO Extend validation to include valid UTF-8 characters.

Actually, now that I think about it, I think we don't need the UTF8 limitation at all: these byte strings go into a UniqueChars which is treated like UTF8 by consumers (see FrameIterator::functionDisplayAtom).

::: js/src/asmjs/Wasm.cpp:1052
(Diff revision 2)
> +MaybeDecodeNameSectionBody(JSContext* cx, Decoder& d, uint32_t sectionStart,
> +                           uint32_t sectionSize, CacheableCharsVector& funcNames)
> +{
> +    uint32_t numFuncNames;
> +    if (!d.readVarU32(&numFuncNames))
> +        return false; // failed to read number of function names

I appreciate the thought behind the comment (since there is no Fail() message), but I think the cause is pretty much always obvious when reading the code, so I'd remove all these fail comments.

::: js/src/asmjs/Wasm.cpp:1067
(Diff revision 2)
> +        uint32_t funcNameNumBytes;
> +        if (!d.readVarU32(&funcNameNumBytes))
> +            return false; // expected function name length
> +
> +        const uint8_t* funcName;
> +        if (!d.hasSectionBytesRemain(sectionStart, sectionSize, funcNameNumBytes) ||

I think the hasSectionBytesRemain() check is unnecessary: readBytesRaw() catches going past the end of the whole buffer and finishSection() makes sure you end at exactly the end of the section.

::: js/src/asmjs/Wasm.cpp:1068
(Diff revision 2)
> +        if (!d.readVarU32(&funcNameNumBytes))
> +            return false; // expected function name length
> +
> +        const uint8_t* funcName;
> +        if (!d.hasSectionBytesRemain(sectionStart, sectionSize, funcNameNumBytes) ||
> +            !d.readBytesRaw(funcNameNumBytes, &funcName))

DecodeExportName() is doing a very similar sort of task here: could you factor out the common code in to a static function that returns a UniqueChars?

::: js/src/asmjs/Wasm.cpp:1098
(Diff revision 2)
> +    funcNames = Move(result);
> +    return true;
> +}
> +
> +static bool
> +DecodeNameSection(JSContext* cx, Decoder& d, CacheableCharsVector& funcNames)

SM style is to take outparams by *, not &, to indicate, so that it's more evident at the callsite which parameters are being mutated.

::: js/src/asmjs/WasmBinary.h:805
(Diff revision 2)
>      }
> +    void ignoreSection(uint32_t startOffset, uint32_t size) {
> +        cur_ = (beg_ + startOffset) + size;
> +        MOZ_ASSERT(cur_ <= end_);
> +    }
> +    MOZ_MUST_USE bool hasSectionBytesRemain(uint32_t sectionStart, uint32_t sectionSize,

Given above, I think you can remove this function.

::: js/src/asmjs/WasmModule.cpp:1629
(Diff revision 2)
>  
>  const char*
>  Module::prettyFuncName(uint32_t funcIndex) const
>  {
> +    if (funcIndex >= module_->prettyFuncNames.length())
> +        return nullptr;

Can you rename this function maybePrettyFuncName?

::: js/src/asmjs/WasmModule.cpp:1637
(Diff revision 2)
>  
>  const char*
>  Module::getFuncName(JSContext* cx, uint32_t funcIndex, UniqueChars* owner) const
>  {
> -    if (!module_->prettyFuncNames.empty())
> -        return prettyFuncName(funcIndex);
> +    const char* prettyName = prettyFuncName(funcIndex);
> +    if (prettyName)

if (const char* = ...)

::: js/src/jit-test/tests/wasm/names.js:37
(Diff revision 2)
> +function runTest(namesSection, expectedName) {
> +  if (namesSection.length >= 128) throw new Error('namesSection is too big');
> +
> +  var headerSize = 1 + 4 + 1;
> +  var ar = new Uint8Array(wasmBase.length + headerSize + namesSection.length);
> +  ar.set(wasmBase);

Really happy you wrote a test, but could you put it into tests/wasm/binary.js which will let you reuse functions like `moduleWithSections` to programmatically create modules?
Attachment #8758868 - Flags: review?(luke)
https://reviewboard.mozilla.org/r/56980/#review53752

> Actually, now that I think about it, I think we don't need the UTF8 limitation at all: these byte strings go into a UniqueChars which is treated like UTF8 by consumers (see FrameIterator::functionDisplayAtom).

FrameIterator::functionDisplayAtom displays empty name when bytes contain invalid sequence. Do we want similar behavoir?
https://reviewboard.mozilla.org/r/56980/#review53752

> FrameIterator::functionDisplayAtom displays empty name when bytes contain invalid sequence. Do we want similar behavoir?

Well you're right that there is the limitation for null, but I think non-ASCII characters work fine (they are ultimately decoded with AtomizeUTF8Chars).
https://reviewboard.mozilla.org/r/56980/#review53752

> Well you're right that there is the limitation for null, but I think non-ASCII characters work fine (they are ultimately decoded with AtomizeUTF8Chars).

The "te\xE0\xFF" test I provided fails with "InternalError: buffer too small" when I start using this byte sequence with AtomizeUTF8Chars. Our binary format spec explicitly wants "valid utf8 encoding", so there will be nothing wrong to reject entire section or name if invalid UTF8 is used.

We need validation, but currently we don't have that code in js to perform this task. I can add something to js/src/vm/CharacterEncoding.cpp to do just that? With null check?
https://reviewboard.mozilla.org/r/56980/#review53752

> The "te\xE0\xFF" test I provided fails with "InternalError: buffer too small" when I start using this byte sequence with AtomizeUTF8Chars. Our binary format spec explicitly wants "valid utf8 encoding", so there will be nothing wrong to reject entire section or name if invalid UTF8 is used.
> 
> We need validation, but currently we don't have that code in js to perform this task. I can add something to js/src/vm/CharacterEncoding.cpp to do just that? With null check?

Ah, I see now.  Perhaps then keep the current ascii + non-null requirement but could you refactor to reuse for the import and export strings?  According to https://github.com/WebAssembly/design/blob/dynamic-linking/Web.md#names these should all be using the same valid-utf8 checks.
Comment on attachment 8758868 [details]
Bug 1277246 - Add support for names section.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56980/diff/2-3/
Attachment #8758868 - Attachment description: MozReview Request: Bug 1277246 - Add support for names section. r?luke → Bug 1277246 - Add support for names section.
Attachment #8758868 - Flags: review?(luke)
Attachment #8758868 - Flags: review?(luke) → review+
Comment on attachment 8758868 [details]
Bug 1277246 - Add support for names section.

https://reviewboard.mozilla.org/r/56980/#review54080

Thanks for addressing all the feedback!

::: js/src/asmjs/Wasm.cpp:697
(Diff revision 3)
> +
> +    if (!bytes.append(0))
> +        return nullptr;
> +
> +    UniqueChars name((char*)bytes.extractOrCopyRawBuffer());
> +    return Move(name);

I think you can write:
```
return UniqueChars((char*)bytes.extractOrCopyRawBuffer();
```

::: js/src/asmjs/Wasm.cpp:1205
(Diff revision 3)
>      }
>      return true;
>  }
>  
>  static bool
> -GetProperty(JSContext* cx, HandleObject obj, const Bytes& bytes, MutableHandleValue v)
> +GetProperty(JSContext* cx, HandleObject obj, const UniqueChars& chars, MutableHandleValue v)

I'd just pass a `const char*` here

::: js/src/asmjs/WasmModule.cpp:1660
(Diff revision 3)
>          return nullptr;
>  
>      JSAtom* atom = AtomizeUTF8Chars(cx, chars, strlen(chars));
> -    if (!atom)
> +    if (!atom) {
> +        // Failing due to invalid UTF8 sequence, standard name again
> +        char* chars = JS_smprintf("wasm-function[%u]", funcIndex);

How is it possible to hit this with your <128 check in MaybeDecodeName?  This path is also a bit dangerous: after AtomizeUTF8Chars() fails, it leaves an exception pending.  If the JS_smprintf() below succeeds, then we'll be returning success even though an exception is pending (which is a general engine error).  If JS_smprintf fails, then ReportOutOfMemory(cx) is a double-report which is also an error.  Ideally, we'd know here that we can't fail (b/c asm.js only feeds us well-formed utf8 and we have our ascii limitation for wasm).
Comment on attachment 8758868 [details]
Bug 1277246 - Add support for names section.

Review request updated; see interdiff: https://reviewboard.mozilla.org/r/56980/diff/3-4/
Keywords: checkin-needed
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d02ab40ae319
Add support for names section. r=luke
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/d02ab40ae319
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla49
You need to log in before you can comment on or make changes to this bug.