Closed
Bug 877338
Opened 11 years ago
Closed 11 years ago
OdinMonkey: Integrate {heap,global} loads/stores in Alias Analysis
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
mozilla27
People
(Reporter: bbouvier, Assigned: bbouvier)
References
Details
Attachments
(2 files, 4 obsolete files)
5.48 KB,
patch
|
bbouvier
:
review+
|
Details | Diff | Splinter Review |
6.38 KB,
patch
|
jandem
:
review+
|
Details | Diff | Splinter Review |
Currently, loads and stores are all considered to be effectful and hence cannot be considered as aliases during alias analysis. This results in a lot of unnecessary moves in the generated code. For instance, the following test case generates useless moves: function storeGlobal() { myGlobalInt = myGlobalInt; // 2 mov myGlobalInt = 3; // 1 mov (ideally, see bug 877321) myGlobalInt = 5; // 1 mov -- the previous mov's are not observable and could be removed return myGlobalInt | 0; // 1 mov for the load } This could be reduced to 2 mov's instead of : mov $5, eax mov eax, @myGlobalInt ret // myGlobalInt is already in eax, no need to reload it for the return instruction
Comment 1•11 years ago
|
||
For heap loads/stores, we can generally assume LLVM has already done a good job hoisting this already. However, you make a great point about global variables. (Also, when there are too many globals, Emscripten uses heap loads/stores at constant offsets; these should be just as optimizable as globals.)
Assignee | ||
Comment 2•11 years ago
|
||
I made this patch, which does what we want on micro benchmarks (checked with IONFLAGS=logs). While it doesn't introduce particular speedups on the asm.js benchmarks (except for primes, which gets better by 6%), it seems to slow down a few benchmarks by 2 or 3% (fannkuch and fasta). However, as these are microbenchmarks, I am not sure whether this slowdowns are provoked by this patch or are just random noise. Jan, if you have some time, can you please give a look at this pretty simple patch and tell me if you see any obvious error in my patch, or something that I would have forgotten to do?
Attachment #774379 -
Flags: feedback?(jdemooij)
Comment 3•11 years ago
|
||
Comment on attachment 774379 [details] [diff] [review] alias analysis + gvn for global variables Review of attachment 774379 [details] [diff] [review]: ----------------------------------------------------------------- Nice! ::: js/src/ion/MIR.cpp @@ +2262,5 @@ > + return true; > +} > + > +bool > +MAsmJSLoadGlobalVar::congruentTo(MDefinition* const &ins) const { Nit: { on its own line ::: js/src/ion/MIR.h @@ +8047,5 @@ > + > + bool congruentTo(MDefinition* const &ins) const; > + > + AliasSet getAliasSet() const { > + return AliasSet::Load(AliasSet::FixedSlot); Globals are not stored in fixed JSObject slots right? We should add a separate AliasSet flag, AsmJSGlobal or something.
Attachment #774379 -
Flags: feedback?(jdemooij) → feedback+
Assignee | ||
Comment 4•11 years ago
|
||
Thanks for the feedback, Jan! Indeed, a new enum field should be created for AsmJSGlobal, probably. I've used this one instead of creating another one, as it was to me the closest in semantics to a global variable. However, I won't ask for review for this patch as I can still see a slow down of ~15% on asmjs-ubench/primes.js. Analysis with perf shows that the new version spends more time in the main function, block 23 (which contains the call to sqrt), but there is absolutely no difference in the generated MIR / LIR / code for this particular block. It might be an code alignment issue or something I haven't seen. The only observable differences are that there are less loads in the patched version. While it might increase register pressure, it doesn't seem like it happens on the hot blocks. If somebody else wanted to give it a try and could print the results here, that'd be nice. The only affected benchmarks are the asmjs microbenchmarks, present on AWFY (benchmarks/asmjs-ubench).
Assignee | ||
Comment 5•11 years ago
|
||
The problem was that calls can also modify global variables, so they must be considered as a possible "store" operations. With this change, there are no regressions anymore and some speedups (asmjs-ubench: copy 4%, primes 9%).
Assignee: general → bbouvier
Attachment #774379 -
Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #799747 -
Flags: review?(jdemooij)
Comment 6•11 years ago
|
||
Great work Benjamin! Any chance you could optimize MAsmJSHeapLoad/Store in the same manner when the index is a constant?
Assignee | ||
Comment 7•11 years ago
|
||
(In reply to Luke Wagner [:luke] from comment #6) > Great work Benjamin! Any chance you could optimize MAsmJSHeapLoad/Store in > the same manner when the index is a constant? Yep.
Attachment #799821 -
Flags: review?(jdemooij)
Assignee | ||
Comment 8•11 years ago
|
||
The last patch shows some slow-down, especially on primes.js (it gets 2% worse than initial with no patch). Either I might be doing something wrong, or this script is *really* sensitive to code alignment.
Comment 9•11 years ago
|
||
Comment on attachment 799747 [details] [diff] [review] global var Review of attachment 799747 [details] [diff] [review]: ----------------------------------------------------------------- Nice, r=me with comment below addressed. ::: js/src/jit/MIR.cpp @@ +2285,5 @@ > > bool > +MAsmJSLoadGlobalVar::mightAlias(MDefinition *def) > +{ > + if (def->isAsmJSStoreGlobalVar() && def->toAsmJSStoreGlobalVar()->globalDataOffset() != globalDataOffset_) Style nit: doesn't fit within 99 characters. (Either rewrite it to look like the method below, or just make the condition multi-line and add {} with { on its own line) ::: js/src/jit/MIR.h @@ +8425,5 @@ > return true; > } > + > + AliasSet getAliasSet() const { > + return AliasSet::Store(AliasSet::AsmJSGlobalVar); The callee can do ffi stuff, modify the asm.js heap and call arbitrary JS right? I think we shouldn't override getAliasSet for MAsmJSCall to get the default behavior (aliases anything).
Attachment #799747 -
Flags: review?(jdemooij) → review+
Comment 10•11 years ago
|
||
Comment on attachment 799821 [details] [diff] [review] heap loads Review of attachment 799821 [details] [diff] [review]: ----------------------------------------------------------------- Just curious, does this have any effect on compilation time? I can imagine alias analysis and LICM/GVN doing a bit more work now, but it's hopefully negligible. ::: js/src/jit/MIR.cpp @@ +2298,5 @@ > + > +bool > +MAsmJSLoadHeap::congruentTo(MDefinition *ins) const > +{ > + if (ins->isAsmJSLoadHeap() && ptr()->isConstant()) { return congruentIfOperandsEqual(ins); should also work and handle more cases (not just constants).
Attachment #799821 -
Flags: review?(jdemooij) → review+
Assignee | ||
Comment 11•11 years ago
|
||
I actually have a correctness issue with this latter patch. jit-tests pass but bullet.js fails by throwing an assertion. I reworked it and found some obvious issues, but it still asserts. Trying to reduce the test case but it's quite hard as it's an emscripten generated script.
Assignee | ||
Comment 12•11 years ago
|
||
Fixed nits, carrying over r+ from jandem.
Attachment #799747 -
Attachment is obsolete: true
Attachment #805722 -
Flags: review+
Assignee | ||
Comment 13•11 years ago
|
||
Reflagging for review as almost everything changed. The issue was that stores of global variables should be considered as a dependency for a heap load too, otherwise this would fail: function f() { var x = 0, y = 0; u8[0] = 1; u8[1] = 2; x = 0|u8[i]; // where i is a global variable initialized to 0 i = x; y = 0|u8[i]; // redundant if global stores are not dependencies of heap loads // while it is a different value return y|0; } Added this test case in testHeapAccesses, to be sure it never happens again. (In reply to Jan de Mooij [:jandem] from comment #10) > Comment on attachment 799821 [details] [diff] [review] > heap loads > > Review of attachment 799821 [details] [diff] [review]: > ----------------------------------------------------------------- > > Just curious, does this have any effect on compilation time? I can imagine > alias analysis and LICM/GVN doing a bit more work now, but it's hopefully > negligible. 20ms more on Epic Citadel, 10ms more on BananaBread, so yes, it's negligible :) However, the speedups seen before are not present anymore (the patch was incorrect) and there is almost no difference between a version with or without this patch (<1% better with this patch), on my terribly-noisy computer. We can suppose, as Luke said before, that LLVM already hoisted all the possible loads.
Attachment #799821 -
Attachment is obsolete: true
Attachment #805731 -
Flags: review?(jdemooij)
Comment 14•11 years ago
|
||
Comment on attachment 805731 [details] [diff] [review] heap loads + tests Review of attachment 805731 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/MIR.h @@ +8496,5 @@ > unsigned globalDataOffset() const { return globalDataOffset_; } > MDefinition *value() const { return getOperand(0); } > > AliasSet getAliasSet() const { > + return AliasSet::Store(AliasSet::AsmJSGlobalVar | AliasSet::AsmJSHeap); The AsmJSHeap here is not correct/necessary. I think there's a problem somewhere else. It should work like this: x = 0|u8[i]; // AsmJSLoadGlobalVar i i = x; // AsmJSStoreGlobalVar i y = 0|u8[i]; // AsmJSLoadGlobalVar i The AsmJSLoadGlobalVar on line 3 should depend on the AsmJSStoreGlobalVar on line 2. This should be enough to keep us from eliminating the second u8[i], because the "i" operand is no longer the same. Let me know if you need help tracking this down :)
Attachment #805731 -
Flags: review?(jdemooij)
Assignee | ||
Comment 15•11 years ago
|
||
Doh, you're right. My previous rewriting of the global var patch was actually incorrect (inverted condition in mightAlias) and it resulted in the second load not being made. I fixed it locally AFTER I wrote this latter patch, so the AsmJSHeap store in the StoreGlobal alias set isn't needed anymore.
Attachment #805731 -
Attachment is obsolete: true
Attachment #806167 -
Flags: review?(jdemooij)
Updated•11 years ago
|
Attachment #806167 -
Flags: review?(jdemooij) → review+
Assignee | ||
Comment 16•11 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/6c99d5808529 https://hg.mozilla.org/integration/mozilla-inbound/rev/71f2968c7359
Comment 17•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/6c99d5808529
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Comment 18•11 years ago
|
||
(In reply to Benjamin Bouvier [:bbouvier] from comment #16) ... > https://hg.mozilla.org/integration/mozilla-inbound/rev/71f2968c7359 This one has an incorrect bug number and this might need to be corrected.
Assignee | ||
Comment 19•11 years ago
|
||
(In reply to Douglas Crosher [:dougc] from comment #18) > This one has an incorrect bug number and this might need to be corrected. Thanks. Posted a message in bug 877378.
Comment 20•11 years ago
|
||
(In reply to Douglas Crosher [:dougc] from comment #18) > (In reply to Benjamin Bouvier [:bbouvier] from comment #16) > ... > > https://hg.mozilla.org/integration/mozilla-inbound/rev/71f2968c7359 > > This one has an incorrect bug number and this might need to be corrected. It also causes regressions on many of the asm.js demos, such as BananaBench and Citadel. Could I suggest testing such changes against the larger demos in future.
Assignee | ||
Comment 21•11 years ago
|
||
(In reply to Douglas Crosher [:dougc] from comment #20) > It also causes regressions on many of the asm.js demos, such as BananaBench > and Citadel. Could I suggest testing such changes against the larger > demos in future. How can it be? This patch is supposed to enhance performance, not decreasing it. The only thing I can imagine is that register pressure gets higher, so there are more move groups, but it seems surprising that it decreases performance a lot. Do the regressions happen on all platforms?
Comment 23•11 years ago
|
||
Oops, ignore comment 22, wrong bug.
You need to log in
before you can comment on or make changes to this bug.
Description
•