Closed Bug 1264264 Opened 5 years ago Closed 5 years ago

Enable optimization for packers again

Categories

(Core :: JavaScript Engine: JIT, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla48
Tracking Status
firefox48 --- fixed

People

(Reporter: h4writer, Assigned: arai)

References

Details

Attachments

(3 files, 2 obsolete files)

In bug 887016 an important optimization for packers was removed. This is quite noticable on ss-string-unpack-code.js.

Our speed used to be:
29ms

After bug 887016 landed our speed is:
80ms

Quick and dirty reimplementing this gives me:
40ms

The optimization is one for packers that use a function similar to "return b[match]".
The original code is visible on the lines beneath:
https://hg.mozilla.org/mozilla-central/file/28226109634e/js/src/jsstr.cpp#l3763
https://hg.mozilla.org/mozilla-central/file/28226109634e/js/src/jsstr.cpp#l3647
Blocks: 887016
Attached patch Quick'nDirtySplinter Review
I noticed that following optimization used in RegExpGlobalReplaceOpt should be applicable to functional replace (and substitution replace too).
  * call RegExpMatcher directly instead of RegExpExec
  * avoid checking the return value type/range from RegExpExec
  * avoid creating results Array

While the `replaceValue` function there could have side-effect, it won't affect RegExp operation, as whole RegExp operation should be done *before* accumulation.

https://tc39.github.io/ecma262/#sec-regexp.prototype-@@replace
> 7. Let global be ToBoolean(? Get(rx, "global")).
> 8. If global is true, then
>   a. Let fullUnicode be ToBoolean(? Get(rx, "unicode")).
>   b. Perform ? Set(rx, "lastIndex", 0, true).
> ...
> 11. Repeat, while done is false
>   a. Let result be ? RegExpExec(rx, S).
>   ...
>   c. Else ...
>     iii. Else ...
>       2. If ...
>         a. Let thisIndex be ? ToLength(? Get(rx, "lastIndex")).
>         ...
>         c. Perform ? Set(rx, "lastIndex", nextIndex, true).
> ...
> 14. Repeat, for each result in results,
>   j. If functionalReplace is true, then
>     i. Let replacerArgs be « matched ».
>     ii. Append in list order the elements of captures to the end of the List replacerArgs.
>     iii. Append position and S as the last two elements of replacerArgs.
>     iv. Let replValue be ? Call(replaceValue, undefined, replacerArgs).
>     v. Let replacement be ? ToString(replValue).

Whole operations to `rx` is done in steps 7, 8, and 11, and the `replaceValue` function is called in step 14, so, even if the `rx` object is modified, it have no effect to step 11.  Also, `rx.lastIndex` should be set to 0 when step 14 is executed, so we can set to it *before* loop.


Prepared 2 optimized paths:

  A. check LambdaIsGetElem, and use elemBase[matched]
  B. directly call RegExpMatcher and use same function call

A is still buggy, as we should fallback to generic path when elemBase's `matched` property is not a data property, and I don't yet prepared that function, especially an effective one for JIT.

Anyway, here's the comparison of ss-string-unpack-code.js.

  before bug 887016:                   23.9 ms
  after bug 887016:                    64.8 ms
  A. optimize LambdaIsGetElem (buggy): 30.9 ms
  B. optimize whole function replace:  38.9 ms

B is still a big improvement, so we should try it first, as that optimization should be applicable to more cases.
Depends on: 1262967
(re-posting from bug 1260435)

Here's WIP patch that adds dedicated optimized functions for functional replacement and substitution.
There I added 3 JS files that is included by #include.

  RegExpGlobalReplaceOpt.h.js
    template function for @@replace with global=true case
    it becomes following function:
      * RegExpGlobalReplaceOpt
          `replaceValue` is a string without "$" (Same function as now)
      * RegExpGlobalReplaceOptFunc
          `replaceValue` is a function
      * RegExpGlobalReplaceOptSubst
          `replaceValue` is a string with "$"

  RegExpLocalReplaceOpt.h.js
    template function for @@replace with global=false case
    it becomes following function:
      * RegExpLocalReplaceOpt
          `replaceValue` is a string without "$" (Same function as now)
      * RegExpLocalReplaceOptFunc
          `replaceValue` is a function
      * RegExpLocalReplaceOptSubst
          `replaceValue` is a string with "$"

  RegExpGetComplexReplacement.h.js
    template codelet for calculating `replacement` variable from `result` object
    it's included from following functions:
      * RegExpReplace
          (The result is same as now)
      * RegExpGlobalReplaceOptFunc
      * RegExpGlobalReplaceOptSubst
      * RegExpLocalReplaceOptFunc
      * RegExpLocalReplaceOptSubst

And here's performance comparison of ss-string-unpack-code.js.

  before bug 887016:  23.9 ms
  after bug 887016:   64.8 ms
  WIP patch:          37.2 ms
  variant A[*1]:      39.8 ms (6.9% regress from WIP patch)
  variant B[*2]:      44.0 ms (18%  regress from WIP patch)

[*1]  make RegExpGetComplexReplacement a function
[*2]  make RegExpGetComplexReplacement a function
      + use same function for no-substitution/function/substitution,
        with if-else

The content of RegExpGetComplexReplacement.h.js is not a function, so it might not be nice to store that codelet in a separated file, as it makes hard to read.
If 6.9% regression is acceptable, it might be better to make RegExpGetComplexReplacement.h.js a function and call it from other functions.

Can I get some feedback on whole approach?
Attachment #8742048 - Flags: feedback?(hv1989)
Comment on attachment 8742048 [details] [diff] [review]
Add optimized path for RegExp.prototype[@@replace] with functional replace and substitution.

Review of attachment 8742048 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!

I didn't know something like this was possible. Cool!
I think for the quick fix having it as a define (and different functions) might be easiest.

Unifying this would be cool, but it seems that causes some regressions.
Currently there is some time pressure, but afterwards we should probably look into why/how. And see how we can improve our engine to not have different perf characteristics when unifying the functions.

If you want feedback if this is a good approach selfhosted code, I would suggest ping'ing till. He knows the "style rules" of selfhosted code better.
Attachment #8742048 - Flags: feedback?(hv1989) → feedback+
Comment on attachment 8742048 [details] [diff] [review]
Add optimized path for RegExp.prototype[@@replace] with functional replace and substitution.

(In reply to Hannes Verschore [:h4writer] from comment #4)
> I think for the quick fix having it as a define (and different functions)
> might be easiest.

Yeah, using macro will make it possible to embed it in same file, but that makes it impossible to use syntax highlight on editor.

> If you want feedback if this is a good approach selfhosted code, I would
> suggest ping'ing till. He knows the "style rules" of selfhosted code better.

I see :)


till, can I get some feedback on using #define+#include for each function/codelet in self-hosted JS?
Is there any existing or planned alternative?
Attachment #8742048 - Flags: feedback?(till)
See Also: → 1265992
Comment on attachment 8742048 [details] [diff] [review]
Add optimized path for RegExp.prototype[@@replace] with functional replace and substitution.

Review of attachment 8742048 [details] [diff] [review]:
-----------------------------------------------------------------

I'm quite unhappy about how much harder it is to follow the logic of these functions now. Given that we need to have a fix here in the next few days, I agree that we should land this without too much churn, however.

Two things I would like to see before landing:
- Comments explaining what this is all about. The explanation from comment 3 is a good starting point for that.
- A change to RegExpGetComplexReplacement. I think this is too hard to understand as it is now, and really should be a function. What happens to performance if you move all of step 14 into that function? That might eliminate much of the slowdown. But even if not: 7% is acceptable for this, I think.
Attachment #8742048 - Flags: feedback?(till) → feedback+
See Also: → 1263490
Thank you for your feedback :)

(In reply to Till Schneidereit [:till] from comment #6)
> Two things I would like to see before landing:
> - Comments explaining what this is all about. The explanation from comment 3
> is a good starting point for that.

Okay, I'll add comments there.

> - A change to RegExpGetComplexReplacement. I think this is too hard to
> understand as it is now, and really should be a function.
> What happens to performance if you move all of step 14 into that function?
> That might eliminate much of the slowdown.

The motivation of making RegExpGetComplexReplacement codelet/function is to share the code to create the replacement string for functional or substitution replace, between generic path and optimized paths.
So I cannot move whole step 14 into a function because it cannot be used by optimized paths.

> But even if not: 7% is acceptable for this, I think.

I see. will make it a function.
Changed RegExpGetComplexReplacement to a function, added comments, and splitted out non-optimized path into different function, to reduce the script length (for bug 1265992).

Here's the comparison of ss-string-unpack-code.js on m-i:

  before: 59.5
  after:  39.6
Assignee: nobody → arai.unmht
Attachment #8742048 - Attachment is obsolete: true
Attachment #8743667 - Flags: review?(till)
Comment on attachment 8743667 [details] [diff] [review]
Add optimized path for RegExp.prototype[@@replace] with functional replace and substitution.

Review of attachment 8743667 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/builtin/RegExp.js
@@ +224,5 @@
> +// ES 2017 draft rev 03bfda119d060aca4099d2b77cf43f6d4f11cfa2 21.2.5.8
> +// steps 8-16.
> +// Not optimized case.
> +function RegExpReplaceNoOpt(rx, S, lengthS, replaceValue,
> +                            functionalReplace, firstDollarIndex)

you forgot to add the "global" arugment!
wow, thank you!
will attach fixed one after test.
Attachment #8743667 - Flags: review?(till)
Fixed the parameter and the order of some code.
Attachment #8743667 - Attachment is obsolete: true
Attachment #8743961 - Flags: review?(till)
Based on previous patch and Bug 1263340 Part 3, added optimized path for packer.

Added 2 functions
  GetElemBaseForLambda(func)
    returns an object that is `b` of `functions(a) { return b[a]; }`
    if the func doesn't match to the pattern, it returns null
  GetStringDataProperty(obj, name)
    tries to get string-typed data property of the `obj` object, named `name`

ss-string-unpack-code.js score:
  before:               58.9
  Bug 1264264 Part 1:   40.0
  + Bug 1263340 Part 3: 39.6
  + Bug 1264264 Part 2: 33.7

I don't see any notable regression in SunSpider/Kraken/Octane other benchmarks.


So, it seems that not-inlinable function is still usable here.
maybe because the original property access is problematic?
Attachment #8744264 - Flags: review?(hv1989)
Comment on attachment 8743961 [details] [diff] [review]
Add optimized path for RegExp.prototype[@@replace] with functional replace and substitution.

Review of attachment 8743961 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks. This is still not pretty, but it gets the job done.

One concern I didn't think of earlier is code size. How much larger does obj-dir/js/src/selfhosted.js get with this patch? The embedded version of this is zip-compressed, so I'm not too worried about binary size increase. I am however worried about startup time and memory usage. Eventually we'll have to address at least the first of those in some better way, such as precompiling and bundling bytecode, but for now this does worry me.

Still, r=me given the time constraints.

::: js/src/builtin/RegExp.js
@@ +161,5 @@
>             RegExpInstanceOptimizable(rx, RegExpProto) &&
>             RegExpProto.exec === RegExp_prototype_Exec;
>  }
>  
> +// ES 2017 draft rev 03bfda119d060aca4099d2b77cf43f6d4f11cfa2 21.2.5.8.

I'm actually in favor of date-based draft revision naming, but I don't care too much, so this is fine.

@@ +222,5 @@
> +}
> +
> +// ES 2017 draft rev 03bfda119d060aca4099d2b77cf43f6d4f11cfa2 21.2.5.8
> +// steps 8-16.
> +// Not optimized case.

"Unoptimized" or, probably better, "slow path". Maybe also rename the function to RegExpReplaceSlowPath.

@@ +331,5 @@
>  
> +// ES 2017 draft rev 03bfda119d060aca4099d2b77cf43f6d4f11cfa2 21.2.5.8
> +// steps 14.g-k.
> +// Calculates functional/substitution replaceement from match result.
> +// Shared between following:

"Used in the following functions"

@@ +346,5 @@
> +    var captures = [];
> +    var capturesLength = 0;
> +
> +    // Step 14.j.i (reordered).
> +    // For nCaptures <= 4 case, call replaceValue directly, otherwise

Please mark identifiers in comments with either `ident` or |ident|. Otherwise it's sometimes quite hard to follow the comment. In this sentence, the "captures" at the end could easily also mean something other than an identifier.

@@ +397,5 @@
> +}
> +
> +// ES 2017 draft rev 03bfda119d060aca4099d2b77cf43f6d4f11cfa2 21.2.5.8
> +// steps 8-16.
> +// Optimized path for @@replace with following conditions:

"the following"

@@ +455,5 @@
>  }
>  
> +// ES 2017 draft rev 03bfda119d060aca4099d2b77cf43f6d4f11cfa2 21.2.5.8
> +// steps 8.b-16.
> +// Optimized path for @@replace with following conditions each:

// Optimized paths for @@replace:
and then
// Conditions:
above each of the blocks below.

::: js/src/builtin/RegExpGlobalReplaceOpt.h.js
@@ +1,1 @@
> +// Function template for following functions:

"the following"

@@ +1,5 @@
> +// Function template for following functions:
> +//   * RegExpGlobalReplaceOpt
> +//   * RegExpGlobalReplaceOptFunc
> +//   * RegExpGlobalReplaceOptSubst
> +// Define following macro and include this file to declare function:

"the following"

@@ +5,5 @@
> +// Define following macro and include this file to declare function:
> +//   * FUNC_NAME     -- function name (required)
> +//       e.g.
> +//         #define FUNC_NAME RegExpGlobalReplaceOpt
> +// Define following macro (without value) to switch the code:

"the following"

@@ +7,5 @@
> +//       e.g.
> +//         #define FUNC_NAME RegExpGlobalReplaceOpt
> +// Define following macro (without value) to switch the code:
> +//   * SUBSTITUTION     -- replaceValue is a string with "$"
> +//   * FUNCTIONS        -- replaceValue is a function

"FUNCTIONAL"

@@ +12,5 @@
> +//   * neither of above -- replaceValue is a string without "$"
> +
> +// ES 2017 draft 03bfda119d060aca4099d2b77cf43f6d4f11cfa2 21.2.5.8
> +// steps 8-16.
> +// Optimized path for @@replace with following conditions:

"the following"

::: js/src/builtin/RegExpLocalReplaceOpt.h.js
@@ +1,1 @@
> +// Function template for following functions:

"the following", here and everywhere below.

@@ +7,5 @@
> +//       e.g.
> +//         #define FUNC_NAME RegExpLocalReplaceOpt
> +// Define following macro (without value) to switch the code:
> +//   * SUBSTITUTION     -- replaceValue is a string with "$"
> +//   * FUNCTIONS        -- replaceValue is a function

"FUNCTIONAL"
Attachment #8743961 - Flags: review?(till) → review+
Thank you for reviewing :D

(In reply to Till Schneidereit [:till] from comment #13)
> Comment on attachment 8743961 [details] [diff] [review]
> Add optimized path for RegExp.prototype[@@replace] with functional replace
> and substitution.
> 
> Review of attachment 8743961 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Thanks. This is still not pretty, but it gets the job done.
> 
> One concern I didn't think of earlier is code size. How much larger does
> obj-dir/js/src/selfhosted.js get with this patch? The embedded version of
> this is zip-compressed, so I'm not too worried about binary size increase. I
> am however worried about startup time and memory usage. Eventually we'll
> have to address at least the first of those in some better way, such as
> precompiling and bundling bytecode, but for now this does worry me.

Here's the size of selfhosted.js, in non-debug build.
  before: 235718 bytes
  after:  242030 bytes (+6312 bytes, +2.7%)
Keywords: leave-open
https://hg.mozilla.org/integration/mozilla-inbound/rev/b9ac34e99e3048648c839de22de82f97eb4ce3a7
Bug 1264264 - Add optimized path for RegExp.prototype[@@replace] with functional replace and substitution. r=till
Blocks: 1265307
Comment on attachment 8744264 [details] [diff] [review]
Part 2: Enable optimization for packers again in RegExp.prototype[@@replace].

Review of attachment 8744264 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/builtin/RegExp.cpp
@@ +1595,5 @@
> +    CallArgs args = CallArgsFromVp(argc, vp);
> +    MOZ_ASSERT(args.length() == 1);
> +
> +    JSObject& lambda = args[0].toObject();
> +    args.rval().setNull();

Why null and not undefined?
Attachment #8744264 - Flags: review?(hv1989) → review+
(In reply to Hannes Verschore [:h4writer] from comment #16)
> > +    args.rval().setNull();
> 
> Why null and not undefined?

ah, sorry I should've asked about that.
I was thinking about type inference, and I thought "object or null" is better than "object or undefined", as both are a kind of "object".
doesn't it matter?
if it doesn't I'll change it to undefined.
Flags: needinfo?(hv1989)
Got answer in IRC.
object and null are also different type.
changed it to undefined locally.

will land it with bug 1263340 Part 3 shortly.
Flags: needinfo?(hv1989)
Keywords: leave-open
https://hg.mozilla.org/integration/mozilla-inbound/rev/8b08faebf408b352fde6504d33b3faaf6149a62f
Bug 1264264 - Part 2: Enable optimization for packers again in RegExp.prototype[@@replace]. r=h4writer
https://hg.mozilla.org/mozilla-central/rev/b9ac34e99e30
https://hg.mozilla.org/mozilla-central/rev/8b08faebf408
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Depends on: 1267364
Arai: just a question. Can we remove sunspider/check-string-unpack-code.js from the disabled list in devtools/automation/cgc-jittest-timeouts.txt now? Or is it still too slow?
Flags: needinfo?(arai.unmht)
Will check it shortly.
it's still hitting timeout on cgc and ggc
Flags: needinfo?(arai.unmht)
You need to log in before you can comment on or make changes to this bug.