Use YARR's new MatchOnly JIT mode

RESOLVED FIXED in mozilla20

Status

()

Core
JavaScript Engine
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: dvander, Assigned: sstangl)

Tracking

(Blocks: 1 bug)

unspecified
mozilla20
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(6 attachments, 3 obsolete attachments)

(Reporter)

Description

5 years ago
On benchmarks, particularly v8-regexp (and on the web, too), it's common to have code like:

       /(pattern)/.exec(string);

And never consume the matched subexpressions. We currently optimize this by not creating the output array, but we can do better by using YARR's new MatchOnly mode. This mode compiles the regex code to not bother saving the subexpression positions in the first place.

Unfortunately there is one complication: the global RegExp object has unspecified, defacto behavior that the last match results are available via RegExp.$1, RegExp.$2, etc. In SpiderMonkey, these are implemented in RegExpStatics.*. To avoid breaking this, JavaScriptCore has an internal variable on the global RegExp constructor that caches the last match inputs (that is, the string and regex pattern). If the user requests RegExp.$n, and the results were optimized away, it will re-run the regex to compute the results.

We expect this to fix bug 806646, which would account for the largest single performance improvement we can currently make on our V8 score. It should be pretty straightforward.
(Assignee)

Comment 1

5 years ago
If the result of exec() is unused, we can actually get away with executing almost nothing at all -- all that's necessary is to cache the last input for reasons given in the above comment.
(Reporter)

Comment 2

5 years ago
(In reply to Sean Stangl from comment #1)
> If the result of exec() is unused, we can actually get away with executing
> almost nothing at all -- all that's necessary is to cache the last input for
> reasons given in the above comment.

Good point!
(Assignee)

Updated

5 years ago
Assignee: general → sstangl
(Reporter)

Updated

5 years ago
Status: NEW → ASSIGNED

Comment 3

5 years ago
How much of an improvement is this bug expected to make?  Currently, our v8-regex score is really bad.  ETA on fix?
(Assignee)

Comment 4

5 years ago
(In reply to Paul Wright from comment #3)
> How much of an improvement is this bug expected to make?  Currently, our
> v8-regex score is really bad.  ETA on fix?

Adding MatchOnly mode moves v8-regexp's score from 1800 to 2000 locally; we require 3600. MatchOnly mode roughly matches our performance on .test() and unused .exec() with the v8 engine (8800 versus 9000 on a v8-regexp benchmark with only .exec() calls), but our benchmark score is held back by poor .replace() performance.

I have patches to address that, but I'm working on breaking them up to post in separate bugs for ease of review. The patch I originally wrote for this bug turned out to be way too complicated.

Comment 5

5 years ago
> but our benchmark score is held back by poor .replace() performance.
> 
> I have patches to address that, but I'm working on breaking them up to post
> in separate bugs for ease of review.

Can you point me in the direction of the applicable bugs / patches?
(Assignee)

Updated

5 years ago
Blocks: 820118
(Assignee)

Comment 6

5 years ago
Created attachment 690572 [details] [diff] [review]
Implement MatchOnly support.

This is the first patch in a three-part series, the latter two to be resolved by Bug 820118 and Bug 820124. I have broken it up in this way in the hope of making the patches easier to review.

This patch implements YARR MatchOnly support and uses it in regexp_test(). In order to support this new mode, it was necessary to change the behavior of ExecuteRegExp() and its callsites. Because ExecuteRegExp() behavior is baked into the JSAPI, a function ExecuteRegExpLegacy() now exists that provides the old behavior, including output array creation.

A number of data structure changes facilitate the MatchOnly support. Instead of requiring ExecuteRegExp() callers to hold a LifoAllocScope, re.execute() now accepts a |MatchPairs| outparam, which is a resizable array of |MatchPair| objects, with memory details fully handled by the caller via |ScopedMatchPairs| and |VectorMatchPairs| (for RegExpStatics). This allows the same code to store directly into the RegExpStatics matches buffer without copying in the lazy evaluation case. Most uses of raw integers in matches have also been changed into uses of MatchPair objects, improving readability.

In total, the patch makes the following changes:

> Removes the RegExpCode/RegExpShared distinction, keeping RegExpShared,
> Compiles RegExp code at execution time, not when getting the RegExpShared,
> Changes RegExp execution to always output MatchPairs or a MatchPair,
> Uses a MatchConduit (union) to switch between MatchOnly/IncludeSubpatterns,
> Moves most RegExpShared querying logic into MatchPairs,
> Lazifies RegExpShared.

Local improvement to v8-regexp is about 1280 -> 1380, but the score is primarily held back by str_replace() performance. Bug 820124 will fix that, but requires MatchOnly mode to exist. On a version of the v8-regexp benchmark with only calls to .exec() (which are executed as .test()), we roughly match v8, ~8800 versus ~9000.
(Assignee)

Updated

5 years ago
Attachment #690572 - Flags: review?(dvander)
(Reporter)

Updated

5 years ago
Attachment #690572 - Flags: feedback?(choller)
(Reporter)

Updated

5 years ago
Attachment #690572 - Flags: feedback?(gary)
(Assignee)

Comment 7

5 years ago
Created attachment 691128 [details] [diff] [review]
Implement MatchOnly support, v2

Version 2, with some additional cleanup and bugfixes around lazy RegExpStatics::makeMatch().
Attachment #690572 - Attachment is obsolete: true
Attachment #690572 - Flags: review?(dvander)
Attachment #690572 - Flags: feedback?(gary)
Attachment #690572 - Flags: feedback?(choller)
Attachment #691128 - Flags: review?(dvander)
(Assignee)

Updated

5 years ago
Attachment #691128 - Flags: feedback?(gary)
Attachment #691128 - Flags: feedback?(choller)
Comment on attachment 691128 [details] [diff] [review]
Implement MatchOnly support, v2

feedback+ given that it didn't blow up jsfunfuzz after about half an hour, but this may be a red herring given that the regex fuzzer in jsfunfuzz keeps hitting bug 808478 before anything else.
Attachment #691128 - Flags: feedback?(gary) → feedback+
(Assignee)

Comment 9

5 years ago
To make the patch easier to review, I have retroactively broken it up into 5 parts. I didn't go too crazy with rewriting data structures, so occasionally a component from a later patch will creep into an earlier one. I verified that all of the patches compile, but did not bother running or testing them.

The series appears to be green on try, with some random orange: https://tbpl.mozilla.org/?tree=Try&rev=050160f7bc4e
(Assignee)

Comment 10

5 years ago
Created attachment 691640 [details] [diff] [review]
Part 0/5 - Fix some unrelated nits

Changes are mostly to vim modelines. At some point I will run through the repo and update all of them, but these were the ones I changed when my editor behaved obnoxiously.
Attachment #691128 - Attachment is obsolete: true
Attachment #691128 - Flags: review?(dvander)
Attachment #691128 - Flags: feedback?(choller)
Attachment #691640 - Flags: review?(dvander)
(Assignee)

Comment 11

5 years ago
Created attachment 691641 [details] [diff] [review]
Part 1/5 - Remove the hackedSource RegExp option

Precursor to supporting compilation at execution time. Unused, and OK'd by Luke.
Attachment #691641 - Flags: review?(dvander)
(Assignee)

Comment 12

5 years ago
Created attachment 691642 [details] [diff] [review]
Part 2/5 - Merge RegExpCode into RegExpShared

Data structure simplification.
Attachment #691642 - Flags: review?(dvander)
(Assignee)

Comment 13

5 years ago
Created attachment 691643 [details] [diff] [review]
Part 3/5 - Compile at execution time
Attachment #691643 - Flags: review?(dvander)
(Assignee)

Comment 14

5 years ago
Created attachment 691646 [details] [diff] [review]
Part 4/5 - Use MatchPairs for RegExp output

This is the main data structure change patch, which rewrites re.execute() and all associated functions to pass the full matches list up to the caller. The interface to MatchPairs is also improved.

Most of the changes to jsstr.cpp are in preparation for lazifying RegExpStatics. They were too obnoxious to rip out.
Attachment #691646 - Flags: review?(dvander)
(Assignee)

Comment 15

5 years ago
Created attachment 691647 [details] [diff] [review]
Part 5/5 - Add MatchOnly mode and lazify RegExpStatics

Adds executeMatchOnly(). In order to use MatchOnly mode in regexp_test(), lazy RegExpStatics need to be in place, so this patch does that also. The cleanup to RegExpStatics was done as part of the previous patch -- I just deleted anything to do with the lazy mode.
Attachment #691647 - Flags: review?(dvander)
(Reporter)

Updated

5 years ago
Attachment #691640 - Flags: review?(dvander) → review+
(Reporter)

Updated

5 years ago
Attachment #691641 - Flags: review?(dvander) → review+
(Reporter)

Updated

5 years ago
Attachment #691642 - Flags: review?(dvander) → review+
(Reporter)

Comment 16

5 years ago
Comment on attachment 691643 [details] [diff] [review]
Part 3/5 - Compile at execution time

Review of attachment 691643 [details] [diff] [review]:
-----------------------------------------------------------------

talked IRL, map code needs to be added back
Attachment #691643 - Flags: review?(dvander)
(Assignee)

Comment 17

5 years ago
Created attachment 692092 [details] [diff] [review]
Part 3/5 - Compile at execution time [v2]

Oh my. This is the correct version of the patch. dvander pointed out that I got carried away with code removal, and removed waaaay too much.

1) Adding the RegExpShared into map_ is necessary for code reuse.
2) Adding the RegExpShared into inUse_ is necessary for object cleanup.
Attachment #691643 - Attachment is obsolete: true
Attachment #692092 - Flags: review?(dvander)
(Reporter)

Updated

5 years ago
Attachment #691647 - Flags: review?(dvander) → review+
(Reporter)

Updated

5 years ago
Attachment #691646 - Flags: review?(dvander) → review+
(Reporter)

Comment 18

5 years ago
Comment on attachment 692092 [details] [diff] [review]
Part 3/5 - Compile at execution time [v2]

Review of attachment 692092 [details] [diff] [review]:
-----------------------------------------------------------------

Nice, glad to see all this RegExp stuff get cleaned up.
Attachment #692092 - Flags: review?(dvander) → review+
(Assignee)

Comment 19

5 years ago
Thank you for the quick reviews!

Try run submitted for final state, green with random orange:
https://tbpl.mozilla.org/?tree=Try&rev=d8a45a696e8b
(Assignee)

Comment 20

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/9121bae603a2
https://hg.mozilla.org/integration/mozilla-inbound/rev/787527f064da
https://hg.mozilla.org/integration/mozilla-inbound/rev/ea7d93401f96
https://hg.mozilla.org/integration/mozilla-inbound/rev/d229008d60ac
https://hg.mozilla.org/integration/mozilla-inbound/rev/7711a36c2771
https://hg.mozilla.org/integration/mozilla-inbound/rev/b7e2ba73b2ff
https://hg.mozilla.org/mozilla-central/rev/9121bae603a2
https://hg.mozilla.org/mozilla-central/rev/787527f064da
https://hg.mozilla.org/mozilla-central/rev/ea7d93401f96
https://hg.mozilla.org/mozilla-central/rev/d229008d60ac
https://hg.mozilla.org/mozilla-central/rev/7711a36c2771
https://hg.mozilla.org/mozilla-central/rev/b7e2ba73b2ff
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20

Updated

5 years ago
Depends on: 822145
Comment on attachment 691646 [details] [diff] [review]
Part 4/5 - Use MatchPairs for RegExp output

Review of attachment 691646 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/vm/RegExpObject.cpp
@@ +122,5 @@
>  
> +    /* Initialize all MatchPair objects to invalid locations. */
> +    for (size_t i = 0; i < pairCount; i++) {
> +        pairs_[i].start = size_t(-1);
> +        pairs_[i].limit = size_t(-1);

Sooo... start and limit are ints, which gives me

js/src/vm/RegExpObject.cpp:125:36 [-Woverflow] overflow in implicit constant conversion
js/src/vm/RegExpObject.cpp:126:36 [-Woverflow] overflow in implicit constant conversion
(Assignee)

Comment 23

5 years ago
(In reply to :Ms2ger from comment #22)
> Comment on attachment 691646 [details] [diff] [review]
> Part 4/5 - Use MatchPairs for RegExp output
> 
> Review of attachment 691646 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/vm/RegExpObject.cpp
> @@ +122,5 @@
> >  
> > +    /* Initialize all MatchPair objects to invalid locations. */
> > +    for (size_t i = 0; i < pairCount; i++) {
> > +        pairs_[i].start = size_t(-1);
> > +        pairs_[i].limit = size_t(-1);
> 
> Sooo... start and limit are ints, which gives me
> 
> js/src/vm/RegExpObject.cpp:125:36 [-Woverflow] overflow in implicit constant
> conversion
> js/src/vm/RegExpObject.cpp:126:36 [-Woverflow] overflow in implicit constant
> conversion

That initialization isn't even necessary, since Yarr's JIT performs the same initialization as the first step of execution. I'll fix it as part of one of the other regexp bugs.
Depends on: 826581
Depends on: 826588
Depends on: 831055
You need to log in before you can comment on or make changes to this bug.