Closed Bug 1501155 Opened 6 years ago Closed 6 years ago

[BinAST] Use WTF-8 in string encoding

Categories

(Core :: JavaScript Engine, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
mozilla65
Tracking Status
firefox65 --- fixed

People

(Reporter: arai, Assigned: arai)

References

Details

Attachments

(2 files, 1 obsolete file)

The string literal in binjs file is raw UTF-8 string which doesn't contain any escape sequence like \uXXXX, which means a lone surrogate should be representable there. We can use WTF-8 as the string encoding in the binjs file. related issue in binjs-ref https://github.com/binast/binjs-ref/issues/190
corresponds to https://github.com/binast/binjs-ref/pull/193 will ask review once binjs-ref is patched
Blocks: 1507368
This adds API for WTF-8 [1] that is used by BinAST parser which is fixed in Part 2. * add WTF8Chars * add AtomizeWTF8Chars * templatize methods/classes in CharacterEncoding.cpp to accept both UTF8Chars and WTF8Chars given AtomizeWTF8Chars needs the exact same optimization added by bug 1494942, I've cloned them by template. [1] https://simonsapin.github.io/wtf-8/
Attachment #9019270 - Attachment is obsolete: true
Attachment #9025227 - Flags: review?(jwalden+bmo)
And just replaced the consumer in BinAST side.
Attachment #9025228 - Flags: review?(dteller)
Comment on attachment 9025228 [details] [diff] [review] Part 2: Use WTF-8 as BinAST string encoding. Review of attachment 9025228 [details] [diff] [review]: ----------------------------------------------------------------- Most complicated patch ever :)
Attachment #9025228 - Flags: review?(dteller) → review+
Comment on attachment 9025227 [details] [diff] [review] Part 1: Add AtomizeWTF8Chars. Review of attachment 9025227 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/vm/CharacterEncoding.cpp @@ +210,2 @@ > */ > +template <typename InputCharsT> Use class for this, because the parameter *is* always a class. Same for the other InputCharsT below. @@ +214,2 @@ > { > MOZ_ASSERT(1 <= utf8Length && utf8Length <= 4); Add static_assert(std::is_same<InputCharsT, UTF8Chars>::value || std::is_same<InputCharsT, WTF8Chars>::value, "must be either UTF-8 or WTF-8"); at top here. @@ +404,2 @@ > static void > +CopyAndInflateUTF8IntoBuffer(JSContext* cx, const InputCharsT src, CharT *dst, size_t outlen, bool allASCII) Put the * by CharT while you're touching.
Attachment #9025227 - Flags: review?(jwalden+bmo) → review+
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: