Closed Bug 1478587 Opened Last year Closed Last year

Instantiate TokenStreamSpecific for UTF-8

Categories

(Core :: JavaScript Engine, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla63
Tracking Status
firefox63 --- fixed

People

(Reporter: Waldo, Assigned: Waldo)

References

Details

Attachments

(10 files)

1.76 KB, patch
arai
: review+
Details | Diff | Splinter Review
2.04 KB, patch
arai
: review+
Details | Diff | Splinter Review
829 bytes, patch
arai
: review+
Details | Diff | Splinter Review
3.60 KB, patch
arai
: review+
Details | Diff | Splinter Review
2.27 KB, patch
arai
: review+
Details | Diff | Splinter Review
828 bytes, patch
arai
: review+
Details | Diff | Splinter Review
2.29 KB, patch
froydnj
: review+
Details | Diff | Splinter Review
1.09 KB, patch
arai
: review+
Details | Diff | Splinter Review
7.04 KB, patch
arai
: review+
Details | Diff | Splinter Review
1.02 KB, patch
arai
: review+
Details | Diff | Splinter Review
Many of these changes are just for TokenStream.cpp-internal thinkos, easy fixes no thinking.  But some places are trickier.  I'll get out the easy changes first.

The trickier ones I *have* patches for, but they varyingly hackish, and I want to think a little more before posting them.
Easy-peasy with the newer function for this.
Attachment #8995097 - Flags: review?(arai.unmht)
Entertainingly, without this fix trying to instantiate TokenStreamSpecific exhausts the 50k lines of scrollback my terminal's currently set to keep.  ;-)
Attachment #8995098 - Flags: review?(arai.unmht)
I don't know if the extra uint8_t cast will speed anything up here over the previous two-byte comparison we were using, but you never know.
Attachment #8995099 - Flags: review?(arai.unmht)
It's either add this or recreate similar reinterpret_cast<> multiple places.  This seems best.

Conveniently we at least don't have to think about mutation being a possibility in this stuff yet.  Even after all my patching is wrapped up, we may still not need to.  That will be great if it holds up.  :-|
Attachment #8995386 - Flags: review?(nfroyd)
Attachment #8995387 - Flags: review?(arai.unmht)
Attachment #8995389 - Flags: review?(arai.unmht)
Blocks: 1478892
Attachment #8995096 - Flags: review?(arai.unmht) → review+
Attachment #8995097 - Flags: review?(arai.unmht) → review+
Attachment #8995098 - Flags: review?(arai.unmht) → review+
Attachment #8995099 - Flags: review?(arai.unmht) → review+
Attachment #8995100 - Flags: review?(arai.unmht) → review+
Attachment #8995101 - Flags: review?(arai.unmht) → review+
Attachment #8995387 - Flags: review?(arai.unmht) → review+
Attachment #8995388 - Flags: review?(arai.unmht) → review+
Attachment #8995389 - Flags: review?(arai.unmht) → review+
Comment on attachment 8995386 [details] [diff] [review]
Implement mozilla::Utf8AsUnsignedChars to centralize UTF-8-to-unsigned-chars casts and their justifications

Review of attachment 8995386 [details] [diff] [review]:
-----------------------------------------------------------------

::: mfbt/Utf8.h
@@ +205,5 @@
> + * may validly be dereferenced.
> + *
> + * No access is provided to mutate this underlying memory as |unsigned char|.
> + * Presently memory inside |Utf8Unit| is *only* stored as |char|, and we are
> + * loathe to offer a way to write non-|char| data until absolutely necessary.

Nit: I think this is "we are loath", although "loathe" does express the proper sentiment here!
Attachment #8995386 - Flags: review?(nfroyd) → review+
Pushed by jwalden@mit.edu:
https://hg.mozilla.org/integration/mozilla-inbound/rev/a47cd872a700
Invoke TokenStreamChars::getNonAsciiCodePointDontNormalize passing CharT lead (not int32_t).  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/b335930f96b4
Implement correct handling of Unicode line/paragraph separators in template/string literals for UTF-8.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/9546bfc93c7a
Return the result of CodeUnitValue(), not a CharT, from GeneralTokenStreamChars::getCodeUnit.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/756cbd89533a
Add CharT casts when comparing code units to particular ASCII characters.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/3a111fdc418d
Switch on the numeric type underlying CharT, not on CharT itself, in a couple switches in TokenStreamSpecific::getTokenInternal.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/63ccd68e1da3
Make ReportUnterminatedRegExp accept an int32_t unit, not a CharT unit.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/d1cd66e6d3c3
Implement mozilla::Utf8AsUnsignedChars to centralize UTF-8-to-unsigned-chars casts and their justifications.  r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/aa8a0735f303
Make FindReservedWord work for CharT=Utf8Unit.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/1a4669e53b46
Make number-parsing functions deal with UTF-8 input.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/c4b339472f4b
Instantiate TokenStreamSpecific for UTF-8.  r=arai
Backed out 11 changesets (Bug 1478892, Bug 1478587) for build bustages in ../../../dist/bin/gdb-tests on a CLOSED TREE

Failure: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-classifiedState=unclassified&selectedJob=190761702
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=190761702&repo=mozilla-inbound&lineNumber=32210

[task 2018-07-29T04:32:00.324Z] 04:32:00     INFO -  make[4]: *** [../../../dist/bin/gdb-tests] Error 254
[task 2018-07-29T04:32:00.325Z] 04:32:00     INFO -  make[4]: Leaving directory '/builds/worker/workspace/build/src/obj-firefox/js/src/gdb'
[task 2018-07-29T04:32:00.325Z] 04:32:00     INFO -  /builds/worker/workspace/build/src/config/recurse.mk:74: recipe for target 'js/src/gdb/target' failed
[task 2018-07-29T04:32:00.326Z] 04:32:00     INFO -  make[3]: *** [js/src/gdb/target] Error 2
[task 2018-07-29T04:32:00.326Z] 04:32:00     INFO -  make[3]: *** Waiting for unfinished jobs....
Flags: needinfo?(jwalden+bmo)
Pushed by jwalden@mit.edu:
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb0dd8b2d4e2
Invoke TokenStreamChars::getNonAsciiCodePointDontNormalize passing CharT lead (not int32_t).  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/d3595d3fd0c0
Implement correct handling of Unicode line/paragraph separators in template/string literals for UTF-8.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/805445fc768f
Return the result of CodeUnitValue(), not a CharT, from GeneralTokenStreamChars::getCodeUnit.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/70b1c176aa7b
Add CharT casts when comparing code units to particular ASCII characters.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/caaac3580bd7
Switch on the numeric type underlying CharT, not on CharT itself, in a couple switches in TokenStreamSpecific::getTokenInternal.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/8f8857dc9348
Make ReportUnterminatedRegExp accept an int32_t unit, not a CharT unit.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/aa240ce55658
Implement mozilla::Utf8AsUnsignedChars to centralize UTF-8-to-unsigned-chars casts and their justifications.  r=froydnj
https://hg.mozilla.org/integration/mozilla-inbound/rev/f1de95a9cc69
Make FindReservedWord work for CharT=Utf8Unit.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/69a4d4210881
Make number-parsing functions deal with UTF-8 input.  r=arai
https://hg.mozilla.org/integration/mozilla-inbound/rev/eea8bd3d8e45
Instantiate TokenStreamSpecific for UTF-8.  r=arai
Flags: needinfo?(jwalden+bmo)
You need to log in before you can comment on or make changes to this bug.