Closed
Bug 822116
Opened 12 years ago
Closed 12 years ago
IonMonkey: Backtracking allocator x86/x64 tuning
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
mozilla20
People
(Reporter: bhackett1024, Assigned: bhackett1024)
References
Details
Attachments
(3 files, 1 obsolete file)
Bug 814966 focused on making an am3 microbenchmark faster on x86 than LSRA. Currently, these results do not extrapolate at all to other benchmarks, or even to the am3() used in v8-crypto. The only difference between the two is that the microbenchmark adds an argument 't' and substitutes a this.array for a t.array (outside the loop body), which manages to increase the spill code % from 25% to 36%! The amount of spill code generated using the backtracking allocator is also greater than with LSRA on all the benchmarks. The backtracking allocator needs to be far more robust against such changes, and to be able to consistently improve on the code generated using LSRA. This bug is about the tuning work to get there, focusing on x86/x64 behavior on SS/v8/kraken. The improvements here should hopefully carry over to ARM, but the architectures are different enough (memory vs. register operands, two vs. three operand arithmetic, register requirements, ...) to focus on the former for now.
Assignee | ||
Comment 1•12 years ago
|
||
Current comparison between the spill code generated by the LSRA and Backtracking allocators, for x86/x64 and benchmarks. For each test, e.g. check-3d-cube.js :: 4299964 x86 old 2.6 new 27.4 x64 old 0.4 new 23.5 After the :: is the number of non-spill Ion instructions executed on x86, for weighting the benchmark. After 'old' is the % of Ion instructions which were spill code for LSRA, after 'new' is the same for Backtracking. (Few benchmarks are as bad as this one.) At the end of each suite, ratios are given for Backtracking spill code / LSRA spill code, totaled across all tests. > 1 means LSRA did better, < 1 means Backtracking did better. Current ratios: sunspider x86 ratio 1.069 x64 ratio 1.195 v8 x86 ratio 1.319 x64 ratio 1.109 kraken x86 ratio 1.073 x64 ratio 1.287
Assignee | ||
Comment 2•12 years ago
|
||
WIP after a few hours, putting up with new numbers so that comment 1 doesn't look so bleak. New ratios: sunspider x86 ratio 0.837 x64 ratio 1.006 v8 x86 ratio 1.103 x64 ratio 1.266 kraken x86 ratio 0.731 x64 ratio 0.849 Some outliers in ss/kraken to look at, but most remaining work needs to see what's going on with v8bench.
Comment 3•12 years ago
|
||
(In reply to Brian Hackett (:bhackett) from comment #2) > Created attachment 692775 [details] [diff] [review] > WIP (dc4887f61d2e) > > WIP after a few hours, putting up with new numbers so that comment 1 doesn't > look so bleak. I wasn't worried :-), I knew it would get better. It looks like this is going very well.
Assignee | ||
Comment 4•12 years ago
|
||
There is of course always more that could be done here, but this patch puts allocator performance in a much better place than it is now. New ratios: sunspider x86 ratio 0.8 x64 ratio 0.874 v8 x86 ratio 0.809 x64 ratio 0.995 kraken x86 ratio 0.716 x64 ratio 0.885
Assignee: general → bhackett1024
Attachment #692775 -
Attachment is obsolete: true
Attachment #693390 -
Flags: review?(jdemooij)
Assignee | ||
Comment 5•12 years ago
|
||
Comment 6•12 years ago
|
||
Comment on attachment 693390 [details] [diff] [review] patch (dc4887f61d2e) Review of attachment 693390 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/ion/Lowering.cpp @@ +429,5 @@ > // The second operand has known null/undefined type, so just test the > // first operand. > if (IsNullOrUndefined(comp->specialization())) { > LIsNullOrUndefinedAndBranch *lir = new LIsNullOrUndefinedAndBranch(ifTrue, ifFalse); > + if (!useBoxAtStart(lir, LIsNullOrUndefinedAndBranch::Value, left)) Just curious, why does this make a difference?
Attachment #693390 -
Flags: review?(jdemooij) → review+
Assignee | ||
Comment 7•12 years ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #6) > ::: js/src/ion/Lowering.cpp > @@ +429,5 @@ > > // The second operand has known null/undefined type, so just test the > > // first operand. > > if (IsNullOrUndefined(comp->specialization())) { > > LIsNullOrUndefinedAndBranch *lir = new LIsNullOrUndefinedAndBranch(ifTrue, ifFalse); > > + if (!useBoxAtStart(lir, LIsNullOrUndefinedAndBranch::Value, left)) > > Just curious, why does this make a difference? Maybe something to be fixed in the allocator itself, but if there are two intervals for the vreg, one ending at the end of the comparev and the other starting at the beginning of it (intervals can overlap, and the new one starts at the instruction's input in case the operand is clobbered within), we would generate spill code for the transition between the intervals at both the start of the comparev and at the start of any successor blocks.
Assignee | ||
Comment 8•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/70f1b2db9f5f
Comment 9•12 years ago
|
||
Backed out for mochitest-2 crashes: https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=mochitest-2&fromchange=d0797dfcab56&tochange=1df75697408f https://hg.mozilla.org/integration/mozilla-inbound/rev/1df75697408f
Comment 10•12 years ago
|
||
Any numbers regarding the effects this new allocator has on our benchmark scores?
Assignee | ||
Comment 11•12 years ago
|
||
Oof, the one change this made to code that is actually run by default interacted badly with the bug 792108 changes. Measuring ss/v8/kraken on x86 with off thread compilation I see a <1% difference between the allocators on all benchmarks. Need to investigate this more. https://hg.mozilla.org/integration/mozilla-inbound/rev/976f2d5ad400
Comment 12•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/976f2d5ad400
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla20
You need to log in
before you can comment on or make changes to this bug.
Description
•