Reduce allocations in RegExpGetSubstitution

RESOLVED FIXED in Firefox 57

Status

()

enhancement
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: anba, Assigned: anba)

Tracking

(Blocks 1 bug)

Trunk
mozilla57
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox57 fixed)

Details

Attachments

(3 attachments)

(Assignee)

Description

2 years ago
RegExpGetSubstitution is often called when the RegExp contains capturing groups and the replacement string uses "$n" replacement formats. 

To optimize this case, we should avoid copying the capturing groups in http://searchfox.org/mozilla-central/rev/b52285fffc13f36eca6b47de735d4e4403b3859e/js/src/builtin/RegExp.js#443-466 and give the captures vector in http://searchfox.org/mozilla-central/rev/b52285fffc13f36eca6b47de735d4e4403b3859e/js/src/builtin/RegExp.cpp#1434-1436 a bit of stack space.
(Assignee)

Comment 1

2 years ago
The patch changes js::RegExpGetSubstitution to directly take the regexp match-result object instead of separate arguments for the matched string and the captured groups. This allows us to directly pass the match-result to RegExpGetSubstitution in the replace fast paths, which avoids the extra array-copy for the capturing groups.
Attachment #8894423 - Flags: review?(till)
(Assignee)

Comment 2

2 years ago
This gives a bit of stack allocated space for the captures vector in js::RegExpGetSubstitution (2 would be enough for Speedometer, but it probably doesn't hurt to double the stack allocated space to 4 elements). It also optimizes the array accesses slightly by directly accessing the dense elements instead of going through the more general GetElement(). And in intrinsic_RegExpGetSubstitution, we can easily avoid rooting the strings twice by directly calling ensureLinear() on the arguments, so let's do that, too.
Attachment #8894429 - Flags: review?(till)
(Assignee)

Comment 3

2 years ago
And a bit of additional clean-up for RegExp code:
- IsTrailSurrogateWithLeadSurrogate had an unused JSContext*, so removed that.
- ExecuteRegExp was re-rooting an already rooted object.
- It should be able to avoid the extra rooting in GetFirstDollarIndex for the arguments string.
- And intrinsic_GetStringDataProperty had some unnecessary rooting, too.

I'll run the hazard analysis in the try-push to make sure the rooting changes are all okay to do.
Attachment #8894431 - Flags: review?(till)
(Assignee)

Comment 4

2 years ago
(In reply to André Bargull from comment #3)
> I'll run the hazard analysis in the try-push to make sure the rooting
> changes are all okay to do.

Try looks good: https://treeherder.mozilla.org/#/jobs?repo=try&revision=22c5f47704485b72a606cc88fbf541897d4b0811
Comment on attachment 8894423 [details] [diff] [review]
bug1387968-part1-reduce-array-copies.patch

Review of attachment 8894423 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good, thanks. I left a bunch of comments for some further optimizations. I can't see a reason why those wouldn't work, but applying them is optional. r=me either way.

::: js/src/builtin/RegExp.cpp
@@ +1420,5 @@
>  
> +    // Step 10 (reordered).
> +    uint32_t matchResultLength;
> +    if (!GetLengthProperty(cx, matchResult, &matchResultLength))
> +        return false;

If you change this function to take matchResult as a HandleArrayObject - see comments for SelfHosting.cpp below -, you can replace this with
uint32_t matchResultLength = matchResult->length()

@@ +1424,5 @@
> +        return false;
> +    MOZ_ASSERT(matchResultLength > 0);
> +
> +    RootedValue matchedValue(cx);
> +    if (!GetElement(cx, matchResult, matchResult, 0, &matchedValue))

This and the GetElement calls below should be replaceable by matchResult->getDenseElement(n), assuming you change matchResult to be an ArrayObject.

::: js/src/vm/SelfHosting.cpp
@@ +1536,5 @@
> +#ifdef DEBUG
> +    bool isArray = false;
> +    MOZ_ALWAYS_TRUE(IsArray(cx, matchResult, &isArray));
> +    MOZ_ASSERT(isArray);
> +#endif

You should be able to replace all this with MOZ_ASSERT(matchResult->is<ArrayObject>()).

Or, even better, actually coerce to ArrayObject and pass that to RegExpGetSubstitution:

RootedArrayObject matchResult(cx, &args[0].toObject().as<ArrayObject>())
Attachment #8894423 - Flags: review?(till) → review+
(In reply to Till Schneidereit [:till] from comment #5)
> Comment on attachment 8894423 [details] [diff] [review]
> bug1387968-part1-reduce-array-copies.patch
> 
> Review of attachment 8894423 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This looks good, thanks. I left a bunch of comments for some further
> optimizations. I can't see a reason why those wouldn't work, but applying
> them is optional. r=me either way.

Oh, now I do see a reason: an UnboxedArrayObject isn't a NativeObject. That probably makes these optimizations too annoying to be worth it.
Comment on attachment 8894429 [details] [diff] [review]
bug1387968-part2-optimize-getsubstitution.patch

Review of attachment 8894429 [details] [diff] [review]:
-----------------------------------------------------------------

Hah, and now I see that part of these optimizations largely boil down to those I suggested in the previous review. Excellent.
Attachment #8894429 - Flags: review?(till) → review+
Comment on attachment 8894431 [details] [diff] [review]
bug1387968-part3-extra-rooting.patch

Review of attachment 8894431 [details] [diff] [review]:
-----------------------------------------------------------------

r=me

::: js/src/builtin/RegExp.cpp
@@ +897,5 @@
>       *          otherwise.  YOU HAVE BEEN WARNED.
>       */
>  
>      /* Steps 1-2 performed by the caller. */
> +    Handle<RegExpObject*> reobj = regexp.as<RegExpObject>();

There must be thousands of places in our code base where we could avoid re-rooting like this ...
Attachment #8894431 - Flags: review?(till) → review+
(Assignee)

Comment 9

2 years ago
(In reply to Till Schneidereit [:till] from comment #8)
> ::: js/src/builtin/RegExp.cpp
> @@ +897,5 @@
> >       *          otherwise.  YOU HAVE BEEN WARNED.
> >       */
> >  
> >      /* Steps 1-2 performed by the caller. */
> > +    Handle<RegExpObject*> reobj = regexp.as<RegExpObject>();
> 
> There must be thousands of places in our code base where we could avoid
> re-rooting like this ...

Nah, we don't seem to do that bad! :-)

http://searchfox.org/mozilla-central/search?q=Rooted.*%26%5Cw%2B(%5C.%7C-%3E)as%3C%5Cw%2B%3E%5C(%5C)%5C)&case=false&regexp=true&path=js%2Fsrc
(In reply to André Bargull from comment #9)
> Nah, we don't seem to do that bad! :-)

"Number of results: 1000 (maximum is 1000)"

I fear we do ;)
(Assignee)

Comment 11

2 years ago
(In reply to Till Schneidereit [:till] from comment #10)
> (In reply to André Bargull from comment #9)
> > Nah, we don't seem to do that bad! :-)
> 
> "Number of results: 1000 (maximum is 1000)"
> 
> I fear we do ;)

That's strange, I get: "Number of results: 141 (maximum is 1000)".
(In reply to André Bargull from comment #11)
> That's strange, I get: "Number of results: 141 (maximum is 1000)".

Oh, interesting: I'm using the Skip Redirect extension[1], and that somehow messes up the link when clicked in gmail. If I disable that or click the link in bugzilla, I get your result. 141 is indeed much better than I'd have hoped.

[1] https://addons.mozilla.org/en-US/firefox/addon/skip-redirect/

Comment 14

2 years ago
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/769b9e5ce199
Part 1: Reduce array copies for RegExpGetSubstitution with captures. r=till
https://hg.mozilla.org/integration/mozilla-inbound/rev/cd1d769cabe8
Part 2: Optimize array accesses and allocations in RegExpGetSubstitution. r=till
https://hg.mozilla.org/integration/mozilla-inbound/rev/1f294885ea45
Part 3: Remove unnecessary or duplicate rooting in RegExp code. r=till
Keywords: checkin-needed
Depends on: 1389904
You need to log in before you can comment on or make changes to this bug.