Closed Bug 877338 Opened 7 years ago Closed 6 years ago

OdinMonkey: Integrate {heap,global} loads/stores in Alias Analysis

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla27

People

(Reporter: bbouvier, Assigned: bbouvier)

References

Details

Attachments

(2 files, 4 obsolete files)

Currently, loads and stores are all considered to be effectful and hence cannot be considered as aliases during alias analysis. This results in a lot of unnecessary moves in the generated code. For instance, the following test case generates useless moves:

function storeGlobal()
{
  myGlobalInt = myGlobalInt; // 2 mov
  myGlobalInt = 3; // 1 mov (ideally, see bug 877321)
  myGlobalInt = 5; // 1 mov -- the previous mov's are not observable and could be removed
  return myGlobalInt | 0; // 1 mov for the load
}

This could be reduced to 2 mov's instead of :
 mov $5, eax
 mov eax, @myGlobalInt
 ret // myGlobalInt is already in eax, no need to reload it for the return instruction
For heap loads/stores, we can generally assume LLVM has already done a good job hoisting this already.  However, you make a great point about global variables.  (Also, when there are too many globals, Emscripten uses heap loads/stores at constant offsets; these should be just as optimizable as globals.)
I made this patch, which does what we want on micro benchmarks (checked with IONFLAGS=logs). While it doesn't introduce particular speedups on the asm.js benchmarks (except for primes, which gets better by 6%), it seems to slow down a few benchmarks by 2 or 3% (fannkuch and fasta). However, as these are microbenchmarks, I am not sure whether this slowdowns are provoked by this patch or are just random noise.

Jan, if you have some time, can you please give a look at this pretty simple patch and tell me if you see any obvious error in my patch, or something that I would have forgotten to do?
Attachment #774379 - Flags: feedback?(jdemooij)
Comment on attachment 774379 [details] [diff] [review]
alias analysis + gvn for global variables

Review of attachment 774379 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!

::: js/src/ion/MIR.cpp
@@ +2262,5 @@
> +    return true;
> +}
> +
> +bool
> +MAsmJSLoadGlobalVar::congruentTo(MDefinition* const &ins) const {

Nit: { on its own line

::: js/src/ion/MIR.h
@@ +8047,5 @@
> +
> +    bool congruentTo(MDefinition* const &ins) const;
> +
> +    AliasSet getAliasSet() const {
> +        return AliasSet::Load(AliasSet::FixedSlot);

Globals are not stored in fixed JSObject slots right? We should add a separate AliasSet flag, AsmJSGlobal or something.
Attachment #774379 - Flags: feedback?(jdemooij) → feedback+
Thanks for the feedback, Jan!
Indeed, a new enum field should be created for AsmJSGlobal, probably. I've used this one instead of creating another one, as it was to me the closest in semantics to a global variable.

However, I won't ask for review for this patch as I can still see a slow down of ~15% on asmjs-ubench/primes.js. Analysis with perf shows that the new version spends more time in the main function, block 23 (which contains the call to sqrt), but there is absolutely no difference in the generated MIR / LIR / code for this particular block. It might be an code alignment issue or something I haven't seen.

The only observable differences are that there are less loads in the patched version. While it might increase register pressure, it doesn't seem like it happens on the hot blocks.

If somebody else wanted to give it a try and could print the results here, that'd be nice. The only affected benchmarks are the asmjs microbenchmarks, present on AWFY (benchmarks/asmjs-ubench).
Attached patch global var (obsolete) — Splinter Review
The problem was that calls can also modify global variables, so they must be considered as a possible "store" operations.

With this change, there are no regressions anymore and some speedups (asmjs-ubench: copy 4%, primes 9%).
Assignee: general → bbouvier
Attachment #774379 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #799747 - Flags: review?(jdemooij)
Great work Benjamin!  Any chance you could optimize MAsmJSHeapLoad/Store in the same manner when the index is a constant?
Attached patch heap loads (obsolete) — Splinter Review
(In reply to Luke Wagner [:luke] from comment #6)
> Great work Benjamin!  Any chance you could optimize MAsmJSHeapLoad/Store in
> the same manner when the index is a constant?

Yep.
Attachment #799821 - Flags: review?(jdemooij)
The last patch shows some slow-down, especially on primes.js (it gets 2% worse than initial with no patch). Either I might be doing something wrong, or this script is *really* sensitive to code alignment.
Comment on attachment 799747 [details] [diff] [review]
global var

Review of attachment 799747 [details] [diff] [review]:
-----------------------------------------------------------------

Nice, r=me with comment below addressed.

::: js/src/jit/MIR.cpp
@@ +2285,5 @@
>  
>  bool
> +MAsmJSLoadGlobalVar::mightAlias(MDefinition *def)
> +{
> +    if (def->isAsmJSStoreGlobalVar() && def->toAsmJSStoreGlobalVar()->globalDataOffset() != globalDataOffset_)

Style nit: doesn't fit within 99 characters. (Either rewrite it to look like the method below, or just make the condition multi-line and add {} with { on its own line)

::: js/src/jit/MIR.h
@@ +8425,5 @@
>          return true;
>      }
> +
> +    AliasSet getAliasSet() const {
> +        return AliasSet::Store(AliasSet::AsmJSGlobalVar);

The callee can do ffi stuff, modify the asm.js heap and call arbitrary JS right? I think we shouldn't override getAliasSet for MAsmJSCall to get the default behavior (aliases anything).
Attachment #799747 - Flags: review?(jdemooij) → review+
Comment on attachment 799821 [details] [diff] [review]
heap loads

Review of attachment 799821 [details] [diff] [review]:
-----------------------------------------------------------------

Just curious, does this have any effect on compilation time? I can imagine alias analysis and LICM/GVN doing a bit more work now, but it's hopefully negligible.

::: js/src/jit/MIR.cpp
@@ +2298,5 @@
> +
> +bool
> +MAsmJSLoadHeap::congruentTo(MDefinition *ins) const
> +{
> +    if (ins->isAsmJSLoadHeap() && ptr()->isConstant()) {

return congruentIfOperandsEqual(ins);

should also work and handle more cases (not just constants).
Attachment #799821 - Flags: review?(jdemooij) → review+
I actually have a correctness issue with this latter patch. jit-tests pass but bullet.js fails by throwing an assertion. I reworked it and found some obvious issues, but it still asserts. Trying to reduce the test case but it's quite hard as it's an emscripten generated script.
Fixed nits, carrying over r+ from jandem.
Attachment #799747 - Attachment is obsolete: true
Attachment #805722 - Flags: review+
Attached patch heap loads + tests (obsolete) — Splinter Review
Reflagging for review as almost everything changed.

The issue was that stores of global variables should be considered as a dependency for a heap load too, otherwise this would fail:

function f() {
  var x = 0, y = 0;
  u8[0] = 1;
  u8[1] = 2;
  x = 0|u8[i]; // where i is a global variable initialized to 0
  i = x;
  y = 0|u8[i]; // redundant if global stores are not dependencies of heap loads
               // while it is a different value
  return y|0;
}

Added this test case in testHeapAccesses, to be sure it never happens again.

(In reply to Jan de Mooij [:jandem] from comment #10)
> Comment on attachment 799821 [details] [diff] [review]
> heap loads
> 
> Review of attachment 799821 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Just curious, does this have any effect on compilation time? I can imagine
> alias analysis and LICM/GVN doing a bit more work now, but it's hopefully
> negligible.
20ms more on Epic Citadel, 10ms more on BananaBread, so yes, it's negligible :)
However, the speedups seen before are not present anymore (the patch was incorrect) and there is almost no difference between a version with or without this patch (<1% better with this patch), on my terribly-noisy computer. We can suppose, as Luke said before, that LLVM already hoisted all the possible loads.
Attachment #799821 - Attachment is obsolete: true
Attachment #805731 - Flags: review?(jdemooij)
Comment on attachment 805731 [details] [diff] [review]
heap loads + tests

Review of attachment 805731 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/MIR.h
@@ +8496,5 @@
>      unsigned globalDataOffset() const { return globalDataOffset_; }
>      MDefinition *value() const { return getOperand(0); }
>  
>      AliasSet getAliasSet() const {
> +        return AliasSet::Store(AliasSet::AsmJSGlobalVar | AliasSet::AsmJSHeap);

The AsmJSHeap here is not correct/necessary. I think there's a problem somewhere else. It should work like this:

  x = 0|u8[i];   // AsmJSLoadGlobalVar i
  i = x;         // AsmJSStoreGlobalVar i
  y = 0|u8[i];   // AsmJSLoadGlobalVar i

The AsmJSLoadGlobalVar on line 3 should depend on the AsmJSStoreGlobalVar on line 2. This should be enough to keep us from eliminating the second u8[i], because the "i" operand is no longer the same.

Let me know if you need help tracking this down :)
Attachment #805731 - Flags: review?(jdemooij)
Doh, you're right. My previous rewriting of the global var patch was actually incorrect (inverted condition in mightAlias) and it resulted in the second load not being made. I fixed it locally AFTER I wrote this latter patch, so the AsmJSHeap store in the StoreGlobal alias set isn't needed anymore.
Attachment #805731 - Attachment is obsolete: true
Attachment #806167 - Flags: review?(jdemooij)
Attachment #806167 - Flags: review?(jdemooij) → review+
https://hg.mozilla.org/mozilla-central/rev/6c99d5808529
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
(In reply to Benjamin Bouvier [:bbouvier] from comment #16)
...
> https://hg.mozilla.org/integration/mozilla-inbound/rev/71f2968c7359

This one has an incorrect bug number and this might need to be corrected.
(In reply to Douglas Crosher [:dougc] from comment #18)
> This one has an incorrect bug number and this might need to be corrected.

Thanks. Posted a message in bug 877378.
(In reply to Douglas Crosher [:dougc] from comment #18)
> (In reply to Benjamin Bouvier [:bbouvier] from comment #16)
> ...
> > https://hg.mozilla.org/integration/mozilla-inbound/rev/71f2968c7359
> 
> This one has an incorrect bug number and this might need to be corrected.

It also causes regressions on many of the asm.js demos, such as BananaBench
and Citadel.  Could I suggest testing such changes against the larger
demos in future.
(In reply to Douglas Crosher [:dougc] from comment #20)
> It also causes regressions on many of the asm.js demos, such as BananaBench
> and Citadel.  Could I suggest testing such changes against the larger
> demos in future.

How can it be? This patch is supposed to enhance performance, not decreasing it. The only thing I can imagine is that register pressure gets higher, so there are more move groups, but it seems surprising that it decreases performance a lot.
Do the regressions happen on all platforms?
Depends on: 919958
Depends on: 920452
Oops, ignore comment 22, wrong bug.
You need to log in before you can comment on or make changes to this bug.