Closed Bug 1428863 Opened 2 years ago Closed 2 years ago

Grab-bag of various token stream/parser changes to make those classes amenable to instantiation for single-byte characters

Categories

(Core :: JavaScript Engine, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox59 --- fixed
firefox60 --- fixed

People

(Reporter: Waldo, Assigned: Waldo)

References

Details

Attachments

(12 files)

26.52 KB, patch
arai
: review+
Details | Diff | Splinter Review
4.71 KB, patch
arai
: review+
Details | Diff | Splinter Review
4.77 KB, patch
arai
: review+
Details | Diff | Splinter Review
5.70 KB, patch
arai
: review+
Details | Diff | Splinter Review
2.38 KB, patch
arai
: review+
Details | Diff | Splinter Review
2.32 KB, patch
arai
: review+
Details | Diff | Splinter Review
2.87 KB, patch
arai
: review+
Details | Diff | Splinter Review
4.13 KB, patch
arai
: review+
Details | Diff | Splinter Review
4.19 KB, patch
arai
: review+
Details | Diff | Splinter Review
2.00 KB, patch
arai
: review+
Details | Diff | Splinter Review
4.28 KB, patch
arai
: review+
Details | Diff | Splinter Review
3.89 KB, patch
arai
: review+
Details | Diff | Splinter Review
I've had these patches since before the all-hands.  With them, and then with one additional +149,-29 patch that does known-stupid things just to make something compilable, I can instantiate a single-byte token stream and parser.  (Some of the extra implementation is stub |return false;| sorts of things -- this is just to prove it's far enough to compile, not to work.)

I'm not sure I have entirely coherent ideas how to describe the purposes of all these (except to the extent of "get us to the point to do that show-it's-possible" patch), so the bug description is a bit silly.  Not worth trying to come up with a better one.
This eliminates the need to duplicate constant-folding for one/two-byte parsing.  Obviously, such duplication would be very silly and mostly useless, as the little changes involved in doing this show.
Attachment #8940808 - Flags: review?(arai.unmht)
TokenStream no longer has the meaning it's traditionally had -- rather than Parser containing a TokenStream, you have a TokenStreamAnyChars in ParserBase and a TokenStreamSpecific in GeneralParser.  TokenStream exists only for tokenizing/parsing regexps any more, and really that use is just silly and should go away.  (I don't know whether I'll do that soon or not.  Hopefully.)  So this stuff should use TokenStreamShared as sort of the closest thing to what TokenStream used to be, for this purpose.
Attachment #8940812 - Flags: review?(arai.unmht)
TokenStream's historic meaning is going away (see the prior patch).  Position is really |typename TokenStreamCharsBase<CharT>::Position|.  Obviously it's silly to type out *that*.  So really it makes sense to just |using| this into the Parser classes that really need it.
Attachment #8940813 - Flags: review?(arai.unmht)
Without this, obviously you get compile errors trying to assign to |const char16_t*| from a single-byte-derived thing.  The single-byte instantiation will look quite different (probably it'll be explode the string to UTF-16 and then handle that by an entirely separate code path...except then pointing at the location of an error will be wrong, so I'm not sure about that), so no reason for this function to be defined in a CharT-agnostic way, versus specializing.
Attachment #8940814 - Flags: review?(arai.unmht)
This arguably is a little a matter of taste, but it seems better to me this way.
Attachment #8940824 - Flags: review?(arai.unmht)
This is sort of weirdly inlined right now, and if we do UTF-8 single-byte handling, then we can't just do a literal element-by-element copy.  Better to split it out, then there's one thing to specialize when UTF-8 happens.
Attachment #8940825 - Flags: review?(arai.unmht)
General access hygiene.
Attachment #8940826 - Flags: review?(arai.unmht)
Attachment #8940808 - Flags: review?(arai.unmht) → review+
Attachment #8940810 - Flags: review?(arai.unmht) → review+
Attachment #8940812 - Flags: review?(arai.unmht) → review+
Comment on attachment 8940813 [details] [diff] [review]
Add non-qualified Position typenames to various parser structs

Review of attachment 8940813 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/Parser.h
@@ +648,5 @@
>  class GeneralParser
>    : public PerHandlerParser<ParseHandler>
>  {
> +  public:
> +    using TokenStream = TokenStreamSpecific<CharT, ParserAnyCharsAccess<GeneralParser>>;

so, "TokenStream" identifier means different things between inside and outside GeneralParser, right?
it's a bit scary, but I'm fine as long as the another definition is renamed shortly.
Attachment #8940813 - Flags: review?(arai.unmht) → review+
Attachment #8940814 - Flags: review?(arai.unmht) → review+
Attachment #8940815 - Flags: review?(arai.unmht) → review+
Attachment #8940816 - Flags: review?(arai.unmht) → review+
Status: ASSIGNED → NEW
Priority: -- → P2
Comment on attachment 8940824 [details] [diff] [review]
De-indent TokenStreamSpecific::getDirective a level

Review of attachment 8940824 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/frontend/TokenStream.cpp
@@ +1133,5 @@
>  {
>      MOZ_ASSERT(directiveLength <= 18);
>      char16_t peeked[18];
>  
> +    if (!peekChars(directiveLength, peeked) || !CharsMatch(peeked, directive))

I prefer separating them into 2 if-s, since now they're just for early-return.
Attachment #8940824 - Flags: review?(arai.unmht) → review+
Attachment #8940825 - Flags: review?(arai.unmht) → review+
Attachment #8940826 - Flags: review?(arai.unmht) → review+
Attachment #8941248 - Flags: review?(arai.unmht) → review+
Turns out I can introduce this a little bit higher up in the class hierarchy.
Attachment #8943439 - Flags: review?(arai.unmht)
Pushed by jwalden@mit.edu:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d01ca5c5f3d0
Pass a PerHandlerParser<ParseHandler> to FoldConstants rather than a GeneralParser<ParseHandler, CharT>, eliminating the need to duplicate constant-folding for one/two-byte parsing.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/2a4be20ecde2
Remove unused chars/length arguments from the ParserBase and PerHandlerParser constructors.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/98378bbaf807
Change a few TokenStream::* uses in AsmJS.cpp to TokenStreamShared::*, because TokenStream no longer has the meaning it's traditionally had.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/ae260591e20e
Add non-qualified Position typenames to various parser structs.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/591f66dd6bcf
Adjust Parser::newRegExp to be roughly CharT-agnostic (up to a static_assert restricting it to char16_t, as a flag of a place requiring changes for UTF-8 handling).  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/c1406363fb17
Make TokenStreamChars have a delegated constructor.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/118c5618f348
Add TokenStreamChars::asSpecific to consolidate a downcast into a helper function.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/325b6c9e845e
De-indent TokenStreamSpecific::getDirective a level.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/6349fe7d8cbf
Add and use TokenStreamChars::copyTokenbufTo.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/515639ccfaf2
Minimize access to various base-class |using| declarations in TokenStreamSpecific.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/24b899860ad0
Move TokenStreamChars::atomizeChars into TokenStreamCharsBase -- no need for it in the more-derived class as long as it can be specialized for multiple character types.  r=arai
Attachment #8943439 - Flags: review?(arai.unmht) → review+
Actually, let's just keep this open for now.  I'll keep piling on small bits of patch that move us toward being able to instantiate the various TokenStream classes for Utf8Unit (bug 1426909, still WIP but the Utf8Unit bits seem solid) here until Utf8Unit (and a way to have UTF-8-containing strings from which a char* can be extracted when needed) has actually landed.
Status: RESOLVED → REOPENED
Keywords: leave-open
Resolution: FIXED → ---
Target Milestone: mozilla59 → ---
Given the merge next week.. It might make sense to file a new bug for patches landing in 60 :)
Pushed by jwalden@mit.edu:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9eecb5e20e10
Move TokenStreamChars::copyTokenbufTo to TokenStreamCharsBase.  r=arai
Oh, that's probably fair.  Will remove the keyword.
Keywords: leave-open
https://hg.mozilla.org/mozilla-central/rev/9eecb5e20e10
Status: REOPENED → RESOLVED
Closed: 2 years ago2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
You need to log in before you can comment on or make changes to this bug.