Closed
Bug 765325
Opened 12 years ago
Closed 12 years ago
IonMonkey: Inline small functions more aggressively, turn off insertRecompileCheck for inliningDepth > 0
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
People
(Reporter: djvj, Unassigned)
Details
Attachments
(2 files)
1.80 KB,
patch
|
sstangl
:
review+
|
Details | Diff | Splinter Review |
27.71 KB,
text/plain
|
Details |
Small functions should be inlined more aggressively, because we have less to lose from doing that. Also, removing recompile check when compiling inlined functions. This was suggested by jandem and seems to yield some nice performance boost when combined with the small function inlining.
Reporter | ||
Comment 1•12 years ago
|
||
This patch does the following: Turns off insertRecompileCheck when inliningDepth > 0. During inline check, if the function we're trying to inline (the target function) is "small" (<100 bytes in bytecode), then we increase the inliningDepth by 1, and decrease the useCount requirement on the caller from usesBeforeInlining to usesBeforeInlining/4 (with a lower threshold of 10). These are the perf results on various benches. Note the decent boost all around on V8: ========================== | ========================== ========================== | ========================== |||| | |||| |||| REFERENCE_AFTERPULL/bench.ion-n-jm.kraken | |||| SMALLTARGET_INLINE_DEPTH3_AND_SKIPRECOMPILECHECK_AFTERPULL/bench.ion-n-jm.kraken |||| | |||| ========================== | ========================== ========================== | ========================== | RESULTS (means and 95% confidence intervals) | RESULTS (means and 95% confidence intervals) ----------------------------------------------- | ----------------------------------------------- Total: 1975.2ms +/- 0.5% | Total: 1982.5ms +/- 0.4% ----------------------------------------------- | ----------------------------------------------- | ai: 118.1ms +/- 0.3% | ai: 118.8ms +/- 0.4% astar: 118.1ms +/- 0.3% | astar: 118.8ms +/- 0.4% | audio: 621.7ms +/- 0.5% | audio: 622.3ms +/- 0.4% beat-detection: 178.8ms +/- 2.0% | beat-detection: 177.5ms +/- 0.2% dft: 177.1ms +/- 0.5% | dft: 178.6ms +/- 0.7% fft: 115.2ms +/- 0.3% | fft: 115.7ms +/- 0.7% oscillator: 150.6ms +/- 1.4% | oscillator: 150.5ms +/- 1.4% | imaging: 584.2ms +/- 1.1% | imaging: 588.1ms +/- 1.1% gaussian-blur: 200.4ms +/- 2.9% | gaussian-blur: 203.4ms +/- 2.9% darkroom: 248.8ms +/- 0.4% | darkroom: 249.8ms +/- 0.7% desaturate: 135.0ms +/- 1.0% | desaturate: 134.9ms +/- 0.9% | json: 95.0ms +/- 0.7% | json: 95.2ms +/- 0.6% parse-financial: 60.9ms +/- 0.7% | parse-financial: 61.3ms +/- 0.8% stringify-tinderbox: 34.1ms +/- 1.2% | stringify-tinderbox: 33.9ms +/- 0.7% | stanford: 556.2ms +/- 0.6% | stanford: 558.1ms +/- 0.5% crypto-aes: 120.2ms +/- 0.8% | crypto-aes: 119.4ms +/- 0.3% crypto-ccm: 121.9ms +/- 0.5% | crypto-ccm: 122.4ms +/- 0.7% crypto-pbkdf2: 241.3ms +/- 1.0% | crypto-pbkdf2: 243.2ms +/- 1.0% crypto-sha256-iterative: 72.8ms +/- 0.6% | crypto-sha256-iterative: 73.1ms +/- 0.6% | Results are located at kraken-1.1-results/sunspider-results-2012-06-14-16.43.35.js | Results are located at kraken-1.1-results/sunspider-results-2012-06-14-16.35.51.js | ========================== | ========================== ========================== | ========================== |||| | |||| |||| REFERENCE_AFTERPULL/bench.ion-n-jm.sunspider | |||| SMALLTARGET_INLINE_DEPTH3_AND_SKIPRECOMPILECHECK_AFTERPULL/bench.ion-n-jm.sunspider |||| | |||| ========================== | ========================== ========================== | ========================== | RESULTS (means and 95% confidence intervals) | RESULTS (means and 95% confidence intervals) -------------------------------------------- | -------------------------------------------- Total: 288.2ms +/- 0.1% | Total: 289.7ms +/- 0.2% -------------------------------------------- | -------------------------------------------- | 3d: 34.1ms +/- 0.8% | 3d: 34.8ms +/- 1.8% cube: 12.4ms +/- 2.1% | cube: 13.1ms +/- 4.8% morph: 6.2ms +/- 0.2% | morph: 6.2ms +/- 0.2% raytrace: 15.5ms +/- 0.1% | raytrace: 15.5ms +/- 0.2% | access: 124.9ms +/- 0.1% | access: 125.9ms +/- 0.1% binary-trees: 111.4ms +/- 0.1% | binary-trees: 112.4ms +/- 0.1% fannkuch: 7.2ms +/- 0.2% | fannkuch: 7.2ms +/- 0.2% nbody: 3.6ms +/- 0.3% | nbody: 3.6ms +/- 0.4% nsieve: 2.7ms +/- 0.2% | nsieve: 2.7ms +/- 0.3% | bitops: 11.7ms +/- 0.3% | bitops: 11.6ms +/- 0.2% 3bit-bits-in-byte: 1.0ms +/- 0.4% | 3bit-bits-in-byte: 1.0ms +/- 0.4% bits-in-byte: 4.5ms +/- 0.6% | bits-in-byte: 4.5ms +/- 0.5% bitwise-and: 2.9ms +/- 0.2% | bitwise-and: 2.9ms +/- 0.1% nsieve-bits: 3.3ms +/- 0.3% | nsieve-bits: 3.3ms +/- 0.2% | controlflow: 2.1ms +/- 0.2% | controlflow: 2.1ms +/- 0.1% recursive: 2.1ms +/- 0.2% | recursive: 2.1ms +/- 0.1% | crypto: 18.2ms +/- 0.2% | crypto: 18.2ms +/- 0.2% aes: 8.9ms +/- 0.2% | aes: 8.9ms +/- 0.2% md5: 5.8ms +/- 0.3% | md5: 5.8ms +/- 0.4% sha1: 3.6ms +/- 0.3% | sha1: 3.6ms +/- 0.2% | date: 23.1ms +/- 0.5% | date: 23.1ms +/- 0.5% format-tofte: 13.3ms +/- 0.3% | format-tofte: 13.3ms +/- 0.3% format-xparb: 9.8ms +/- 1.1% | format-xparb: 9.8ms +/- 0.9% | math: 14.1ms +/- 0.1% | math: 14.1ms +/- 0.1% cordic: 3.3ms +/- 0.3% | cordic: 3.3ms +/- 0.2% partial-sums: 8.2ms +/- 0.1% | partial-sums: 8.3ms +/- 0.2% spectral-norm: 2.6ms +/- 0.2% | spectral-norm: 2.6ms +/- 0.2% | regexp: 9.8ms +/- 0.2% | regexp: 9.8ms +/- 0.1% dna: 9.8ms +/- 0.2% | dna: 9.8ms +/- 0.1% | string: 50.3ms +/- 0.2% | string: 50.1ms +/- 0.2% base64: 3.6ms +/- 0.4% | base64: 3.6ms +/- 0.3% fasta: 6.1ms +/- 0.4% | fasta: 6.0ms +/- 0.2% tagcloud: 17.5ms +/- 0.4% | tagcloud: 17.4ms +/- 0.2% unpack-code: 17.1ms +/- 0.4% | unpack-code: 17.0ms +/- 0.4% validate-input: 6.0ms +/- 0.2% | validate-input: 6.0ms +/- 0.3% | Results are located at sunspider-1.0-results/sunspider-results-2012-06-14-16.44.01.js | Results are located at sunspider-1.0-results/sunspider-results-2012-06-14-16.36.18.js | ========================== | ========================== ========================== | ========================== |||| | |||| |||| REFERENCE_AFTERPULL/bench.ion-n-jm.v8 | |||| SMALLTARGET_INLINE_DEPTH3_AND_SKIPRECOMPILECHECK_AFTERPULL/bench.ion-n-jm.v8 |||| | |||| ========================== | ========================== ========================== | ========================== | Richards: 9267 | Richards: 9592 DeltaBlue: 6248 | DeltaBlue: 6863 Crypto: 13033 | Crypto: 13170 RayTrace: 3612 | RayTrace: 3678 EarleyBoyer: 9034 | EarleyBoyer: 8917 RegExp: 2172 | RegExp: 2177 Splay: 11270 | Splay: 11295 NavierStokes: 17049 | NavierStokes: 17459 ---- | ---- Score (version 7): 7524 | Score (version 7): 7688 |
Attachment #633606 -
Flags: review?(sstangl)
Reporter | ||
Comment 2•12 years ago
|
||
This is the benchmark results attached as a file. The without-patch is on the left and with-patch on the right. The first 3 sections are kraken, sunspider, and v8 with '--ion -n -m', and the last 3 sections are kraken, sunspider, and v8 with '--ion -n --no-jm'
Comment 3•12 years ago
|
||
Comment on attachment 633606 [details] [diff] [review] Patch Review of attachment 633606 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/ion/IonBuilder.cpp @@ +2827,5 @@ > } > }; > > +static bool IsSmallFunction(JSFunction *target) { > + return target->isInterpreted() && target->script()->length < 100; How was 100 determined? What happens if other values are used? @@ +2835,5 @@ > IonBuilder::makeInliningDecision(JSFunction *target) > { > + bool smallTarget = IsSmallFunction(target); > + > + if ((smallTarget && inliningDepth >= 3) || (!smallTarget && inliningDepth >= 2)) A patch from jandem recently changed the maximum inlining depth from 2 to 3. Three comments: 1) What are the results from applying the patch after adding yet another inlining depth? Is it still a small win? 2) Instead of hardcoding constants, we should use named static const uint32. 3) The rationale behind extra inlining for small functions should be provided in a comment above. @@ +2842,5 @@ > + uint32 checkUses = js_IonOptions.usesBeforeInlining; > + if (smallTarget) { > + checkUses = js_IonOptions.usesBeforeInlining / 4; > + if (checkUses < 10) > + checkUses = 10; This seems more arbitrary than usual. I would almost prefer setting some js_IonOptions.smallFunctionUsesBeforeInlining rather than having a distant calculation in the IonBuilder.
Attachment #633606 -
Flags: review?(sstangl) → review+
Reporter | ||
Comment 4•12 years ago
|
||
(In reply to Sean Stangl from comment #3) > How was 100 determined? What happens if other values are used? Empirically. Just tried a bunch of different values, ran full benches, compared results. > 1) What are the results from applying the patch after adding yet another > inlining depth? Is it still a small win? Re-ran with jandem's patches included. The gains are still there. But I removed the inlining depth change only (leaving in the lowering of the usecount check), and realized that the depth change wasn't having a direct effect. It was the other aggressiveness (reducing the use count requirements) that were leading to the gains. So I removed that. > 2) Instead of hardcoding constants, we should use named static const uint32. > > 3) The rationale behind extra inlining for small functions should be > provided in a comment above. Done and done :) > This seems more arbitrary than usual. I would almost prefer setting some > js_IonOptions.smallFunctionUsesBeforeInlining rather than having a distant > calculation in the IonBuilder. I added both smallFunctionMaxBytecodeLength and smallFunctionUsesBeforeInlining to js_IonOptions, with a short explanation of how they were arrived at. Pushed: https://hg.mozilla.org/projects/ionmonkey/rev/a0c81536dd0b Waiting for tbpl and awfy before closing.
Reporter | ||
Updated•12 years ago
|
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•