Implement String.prototype.matchAll proposal
Categories
(Core :: JavaScript Engine, enhancement, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox66 | --- | disabled |
People
(Reporter: till, Assigned: anba)
References
(Blocks 1 open bug, )
Details
(Keywords: dev-doc-complete)
Attachments
(6 files, 10 obsolete files)
27.79 KB,
patch
|
anba
:
review+
|
Details | Diff | Splinter Review |
11.86 KB,
patch
|
anba
:
review+
|
Details | Diff | Splinter Review |
12.94 KB,
patch
|
anba
:
review+
|
Details | Diff | Splinter Review |
10.94 KB,
patch
|
anba
:
review+
|
Details | Diff | Splinter Review |
1.49 KB,
patch
|
anba
:
review+
|
Details | Diff | Splinter Review |
52.93 KB,
patch
|
anba
:
review+
|
Details | Diff | Splinter Review |
The String.prototype.matchAll proposal is at stage 3 and we should implement. I wouldn't rule out slight changes to the spec, so we shouldn't immediately let this ride the trains.
Updated•6 years ago
|
Updated•6 years ago
|
Assignee | ||
Comment 1•6 years ago
|
||
This implements the current proposal text including https://github.com/tc39/proposal-string-matchall/pull/33, except for the second IsRegExp call in MatchAllIterator because of [1] and [2]. [1] https://github.com/tc39/proposal-string-matchall/pull/33#pullrequestreview-94400839 [2] https://github.com/tc39/proposal-string-matchall/pull/33#discussion_r166352273
Assignee | ||
Comment 2•6 years ago
|
||
Adds inlining support for IsRegExpStringIterator and NewRegExpStringIterator based on the existing inlining support for Array and String iterators.
Assignee | ||
Comment 3•6 years ago
|
||
Adds an optimisation to reuse the input RegExp of MatchAllIterator. I've verified in µ-benchmarks that this optimisation does pay off. And, even though this is the initial implementation of the proposal, landing this optimisation now is useful because we can give implementer's feedback to the proposal author.
Assignee | ||
Comment 4•6 years ago
|
||
And some initial tests, mostly focusing on the optimised path, because I expect the test262 tests will cover the functional tests more thoroughly.
Comment 5•6 years ago
|
||
Comment on attachment 8956024 [details] [diff] [review] bug1435829-part1-implementation.patch Review of attachment 8956024 [details] [diff] [review]: ----------------------------------------------------------------- Crystal clear as always. ::: js/src/builtin/RegExp.js @@ +1188,5 @@ > + return callFunction(CallRegExpStringIteratorMethodIfWrapped, obj, > + "RegExpStringIteratorNext"); > + } > + > + var result = { value: null, done: false }; If I'm reading this right, the proposal returns an object with `value: undefined` in step 4 and step 10. ::: js/src/vm/Iteration.h @@ +148,5 @@ > > StringIteratorObject* > NewStringIteratorObject(JSContext* cx, NewObjectKind newKind = GenericObject); > > +class RegExpStringIteratorObject : public NativeObject Please put all this somewhere in js/src/builtin, since it is standard library stuff, not part of the VM. It's a little odd for GlobalObject::initRegExpStringIteratorProto to live in a different directory, but we already do that for some builtin objects: https://searchfox.org/mozilla-central/search?q=%5EGlobalObject%3A%3Ainit%5Cw%2B&case=false®exp=true&path=js%2Fsrc%2Fbuiltin I hope this is not onerous. (It's so hard to tell when something like this will be.) A separate patch would be fine, and if it's a pain, just file a followup bug...
Comment 6•6 years ago
|
||
Comment on attachment 8956025 [details] [diff] [review] bug1435829-part2-jit-support.patch Review of attachment 8956025 [details] [diff] [review]: ----------------------------------------------------------------- Seems obviously correct to me, but I'd better defer to a jit peer.
Comment 7•6 years ago
|
||
Comment on attachment 8956026 [details] [diff] [review] bug1435829-part3-reuse-regexp.patch Review of attachment 8956026 [details] [diff] [review]: ----------------------------------------------------------------- r=me with a few comments. I guess I'm a little surprised this is a winning optimization. It looks like all that's being optimized away is a few straightforward method calls and a `new RegExp(existingRegExp)` ...which should be fairly quick, relative to the cost of actually searching strings. ::: js/src/builtin/RegExp.js @@ +1142,5 @@ > var C = SpeciesConstructor(R, GetBuiltinConstructor("RegExp")); > > + if (IsMatchAllIteratorOptimizable(R, C)) { > + // Step 2.c. > + source = UnsafeGetStringFromReservedSlot(R, REGEXP_SOURCE_SLOT); I think the source slot of a RegExp is immutable (once it's initialized). If so, we don't need REGEXP_STRING_ITERATOR_SOURCE_SLOT. We can extract the source later, if we ever need it. ::: js/src/vm/Iteration.cpp @@ +1109,2 @@ > RegExpStringIteratorSlotFlags, > + RegExpStringIteratorSlotLastIndex, Now these need short comments explaining why they aren't identical to what's in the spec, and how they correspond to the spec. In particular, the comment on LastIndex should explain - that it's used while we're on the fast path, and if we have to jump to the slower path, then this state is stored in the RegExp's .lastIndex property instead. - how this field doubles as .[[Done]]
Comment 8•6 years ago
|
||
(In reply to Jason Orendorff [:jorendorff] from comment #7) > I think the source slot of a RegExp is immutable (once it's initialized). I don't remember ever encountering RegExp.prototype.compile() before. Amazing.
Comment 9•6 years ago
|
||
Comment on attachment 8956027 [details] [diff] [review] bug1435829-part4-tests.patch Review of attachment 8956027 [details] [diff] [review]: ----------------------------------------------------------------- Nice tests. Thank you! ::: js/src/tests/non262/String/matchAll.js @@ +30,5 @@ > +const RegExp_prototype_match = RegExp.prototype[Symbol.match]; > + > +function assertEqIterMatchResult(actual, expected) { > + assertEq(actual.done, expected.done); > + if (actual.value === null || expected.value === null) { undefined, not null @@ +102,5 @@ > + regexp.compile("b+"); > + assertEqMatchResults(iterator, matchResults("aabb", /a+/)); > +} > + > +// Recompile RegExp (source) before first match. I think this one comment should say (flags) rather than (source).
Comment 10•6 years ago
|
||
Comment on attachment 8956025 [details] [diff] [review] bug1435829-part2-jit-support.patch Review of attachment 8956025 [details] [diff] [review]: ----------------------------------------------------------------- Sorry for the delay. Looks great!
Assignee | ||
Comment 11•6 years ago
|
||
(In reply to Jason Orendorff [:jorendorff] from comment #5) > ::: js/src/builtin/RegExp.js > @@ +1188,5 @@ > > + return callFunction(CallRegExpStringIteratorMethodIfWrapped, obj, > > + "RegExpStringIteratorNext"); > > + } > > + > > + var result = { value: null, done: false }; > > If I'm reading this right, the proposal returns an object with `value: > undefined` in step 4 and step 10. Yup, this was changed per the review comments in the spec repo -> https://github.com/tc39/proposal-string-matchall/commit/f368659e372b37afe241496db3425209d24f14ec > ::: js/src/vm/Iteration.h > @@ +148,5 @@ > > > > StringIteratorObject* > > NewStringIteratorObject(JSContext* cx, NewObjectKind newKind = GenericObject); > > > > +class RegExpStringIteratorObject : public NativeObject > > Please put all this somewhere in js/src/builtin, since it is standard > library stuff, not part of the VM. > > [...] > > I hope this is not onerous. (It's so hard to tell when something like this > will be.) A separate patch would be fine, and if it's a pain, just file a > followup bug... js/src/vm/Iteration.{h,cpp} currently also contains Array and String iterators and the Iterator prototype object [1]. Do you think it's acceptable to have a single js/src/builtin/Iterators.{h,cpp} file or do you prefer separate files for each iterator class? [1] https://searchfox.org/mozilla-central/source/js/src/vm/Iteration.cpp#1035-1097,1370-1435
Assignee | ||
Comment 12•6 years ago
|
||
(In reply to André Bargull [:anba] from comment #11) > Do you think it's > acceptable to have a single js/src/builtin/Iterators.{h,cpp} file or do you > prefer separate files for each iterator class? Or alternatively we can place the code in js/src/{Array,String,RegExp}.cpp so it matches the code organisation in js/src/{Array,String,RegExp}.js.
Assignee | ||
Comment 13•6 years ago
|
||
(In reply to Jason Orendorff [:jorendorff] from comment #7) > I guess I'm a little surprised this is a winning optimization. It looks like > all that's being optimized away is a few straightforward method calls and a > `new RegExp(existingRegExp)` ...which should be fairly quick, relative to > the cost of actually searching strings. I haven't investigated this properly, but I guess there are just multiple little things which add up and lead to the performance difference between the optimised and non-optimised version.
Comment 14•5 years ago
|
||
<anba> jorendorff: There was/is still some discussion around tc39/proposal-string-matchall#37 (and the various linked issues/prs), which I wanted to wait for. But I can upload new patches including tc39/proposal-string-matchall#38.
Assignee | ||
Comment 15•5 years ago
|
||
Filed bug 1485457 for comment #5.
Assignee | ||
Comment 16•5 years ago
|
||
Re-requesting review because the changes are probably a bit too large to simply carry the r+. This implements the current spec proposal, including <https://github.com/tc39/proposal-string-matchall/pull/38>.
Assignee | ||
Comment 17•5 years ago
|
||
Rebased and updated to match the current mechanics to check for object-types. Carrying r+.
Assignee | ||
Comment 18•5 years ago
|
||
Rebased patch, carrying r+. (Decision to carry r+ a bit borderline given the changes necessary for part 1....)
Assignee | ||
Comment 19•5 years ago
|
||
Updated part 4, carrying r+.
Assignee | ||
Comment 20•5 years ago
|
||
Adding properties to |RegExp.prototype| requires review from an XPConnect peer. I don't know if any additional tests are necessary for Xrays or if it's sufficient to simply add |Symbol.matchAll| to the prototype properties of |RegExp.prototype|.
Assignee | ||
Comment 21•5 years ago
|
||
Enable test262 tests for matchAll.
Updated•5 years ago
|
Comment 22•5 years ago
|
||
Comment on attachment 9003253 [details] [diff] [review] bug1435829-part1-implementation.patch Review of attachment 9003253 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/builtin/Symbol.cpp @@ +70,5 @@ > WellKnownSymbols* wks = cx->runtime()->wellKnownSymbols; > for (size_t i = 0; i < JS::WellKnownSymbolLimit; i++) { > +#ifndef NIGHTLY_BUILD > + if (i == SymbolCode::matchAll) > + continue; Wow, good catch. I wouldn't have looked for this!
Comment 23•5 years ago
|
||
Comment on attachment 9003263 [details] [diff] [review] bug1435829-part6-test262.patch Review of attachment 9003263 [details] [diff] [review]: ----------------------------------------------------------------- Great. Thank you, anba. ::: js/src/tests/test262/GIT-INFO @@ +1,2 @@ > commit ab436c465106be86719c4849c9cedecd7b570ff9 > +Merge: d901922 ff475fc This doesn't seem right.
Assignee | ||
Comment 24•5 years ago
|
||
(In reply to Jason Orendorff [:jorendorff] from comment #23) > ::: js/src/tests/test262/GIT-INFO > @@ +1,2 @@ > > commit ab436c465106be86719c4849c9cedecd7b570ff9 > > +Merge: d901922 ff475fc > > This doesn't seem right. The commit [1] has two parents and at least git 2.7.4 and 2.17.0 both include the merge info in the `git show` output for me. Not sure why it wasn't already present in GIT-INFO... [1] https://github.com/tc39/test262/commit/ab436c465106be86719c4849c9cedecd7b570ff9
Assignee | ||
Comment 25•5 years ago
|
||
Updated part 1 to include <https://github.com/tc39/proposal-string-matchall/pull/41>. Rebased part 1 to apply cleanly on inbound after the recent formatting changes. Carrying r+.
Assignee | ||
Comment 26•5 years ago
|
||
Rebased part 2 to apply cleanly on inbound after the recent formatting changes. Carrying r+.
Assignee | ||
Comment 27•5 years ago
|
||
Fixed a small bug where the optimized path didn't set |lastIndex| to 0 for global-or-sticky regular expressions. Rebased part 3 to apply cleanly on inbound after the recent formatting changes. Carrying r+.
Assignee | ||
Comment 28•5 years ago
|
||
Updated part 4 to include a regression test for the issue noted in the updated part 3 patch. Carrying r+.
Assignee | ||
Comment 29•5 years ago
|
||
Rebased part 5. No changes, carrying r+.
Assignee | ||
Comment 30•5 years ago
|
||
Regenerated the updated test262 to apply cleanly on inbound. Carrying r+.
Assignee | ||
Comment 31•5 years ago
|
||
Try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=41dae78d338c8cc12abb7d9d63772fb6f29f0ae2
Comment 32•5 years ago
|
||
Pushed by apavel@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/85654ecf0d0c Part 1: Implement String.prototype.matchAll proposal. r=jorendorff https://hg.mozilla.org/integration/mozilla-inbound/rev/a32778df6583 Part 2: Add inline support for RegExp-String-Iterator. r=jorendorff https://hg.mozilla.org/integration/mozilla-inbound/rev/cd2f20cc6d23 Part 3: Add fast path to share input RegExp object. r=jorendorff https://hg.mozilla.org/integration/mozilla-inbound/rev/1608deab3d3d Part 4: Tests for String.prototype.matchAll. r=jorendorff https://hg.mozilla.org/integration/mozilla-inbound/rev/f79c5d5bea25 Part 5: Update xray tests for RegExp.prototype. r=peterv https://hg.mozilla.org/integration/mozilla-inbound/rev/5142cc07565a Part 6: Enable test262 tests for String.matchAll. r=jorendorff
Comment 33•5 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/85654ecf0d0c https://hg.mozilla.org/mozilla-central/rev/a32778df6583 https://hg.mozilla.org/mozilla-central/rev/cd2f20cc6d23 https://hg.mozilla.org/mozilla-central/rev/1608deab3d3d https://hg.mozilla.org/mozilla-central/rev/f79c5d5bea25 https://hg.mozilla.org/mozilla-central/rev/5142cc07565a
Comment 34•5 years ago
|
||
Performance improvement found == Change summary for alert #18224 (as of Fri, 14 Dec 2018 10:37:21 GMT) == Improvements: 3% sessionrestore opt e10s stylo 852.62 -> 828.50 For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=18224
Comment 35•5 years ago
•
|
||
Notes for anyone willing to help with writing MDN documentation:
(I started this work, because I thought this was on my docs to-do list for Firefox 65, but it is Nightly only for now)
matchAll is linked from (this is done)
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/prototype#Methods
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String#Methods_2
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/prototype#Methods
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp#Methods_2
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol#Well-known_symbols
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Text_formatting#Methods_of_String
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Regular_Expressions#Working_with_regular_expressions
Browser compat data PR: https://github.com/mdn/browser-compat-data/pull/3289
(data needs updating once Firefox actually ships in a release version)
New reference pages to write:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/matchAll
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/RegExp/@@matchAll
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Symbol/matchAll
New interactive examples for the reference pages to add here:
https://github.com/mdn/interactive-examples/tree/master/live-examples/js-examples/string (string-matchAll.html)
https://github.com/mdn/interactive-examples/tree/master/live-examples/js-examples/regexp (regexp-prototype-@@match.html)
https://github.com/mdn/interactive-examples/tree/master/live-examples/js-examples/symbol (string-matchAll.html)
If this makes the train in Firefox, we also need to add it to "Firefox 66 for developers" (or 67, or whenever it lands outside of Nightly)
https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/66#JavaScript
Feel free to reach out to me if you want to document this and need help.
Updated•5 years ago
|
Comment 36•5 years ago
|
||
(In reply to Florian Scholz [:fscholz] (MDN) from comment #35)
... because I thought this was on my docs to-do list for Firefox 65, but it is Nightly only for now
is that still true? Chrome 73 ships it in March
https://www.chromestatus.com/feature/5520028858318848
Comment 37•5 years ago
|
||
Till, can we ship this as well or are we shipping already?
Reporter | ||
Comment 38•5 years ago
|
||
(In reply to Florian Scholz [:fscholz] (MDN) from comment #37)
Till, can we ship this as well or are we shipping already?
Looks like it's in 66, so will be in release soon?
Assignee | ||
Comment 39•5 years ago
|
||
It's still behind NIGHTLY_BUILD
.
Reporter | ||
Comment 40•5 years ago
|
||
Oh, ok! Is there any reason we shouldn't unflag and ship it?
Assignee | ||
Comment 41•5 years ago
|
||
I think it should be ready to ship. Do you agree Jason?
Comment 42•5 years ago
|
||
Yes, let's ship it. Filed bug 1531830 for this.
Updated•5 years ago
|
Comment 43•5 years ago
|
||
setting "disabled" for Firefox 66 to avoid confusion
Comment 44•5 years ago
|
||
Doc work has been completed, see comment 35 for details.
Marked is shipping in 67 https://developer.mozilla.org/en-US/docs/Mozilla/Firefox/Releases/67#JavaScript
Review appreciated.
Description
•