Closed Bug 1167677 Opened 6 years ago Closed 6 years ago

Try harder to find scratch registers for memory->memory MoveGroup moves

Categories

(Core :: JavaScript Engine: JIT, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla41
Tracking Status
firefox41 --- fixed

People

(Reporter: bhackett1024, Assigned: bhackett1024)

References

Details

Attachments

(2 files)

Attached patch patchSplinter Review
Bug 1067610 changed the number of memory->memory push/pop moves we do on various tests, some for the better (e.g. octane-zlib) and some for the worse (e.g. octane-crypto).  While there might be better heuristics we can use to avoid these moves in the first place, there is more on top of bug 1124377 that can be done to find scratch registers to execute these moves more efficiently.  The attached patch adds a couple optimizations.

First, when adding moves to resolve phis and other control flow stuff, the backtracking allocator tries to use ranges with a register allocation rather than ones with a stack allocation.  This only helps a tiny bit with crypto, but it might help on other tests and is basically free, so why not.

Second, when resolving memory->memory moves in a MoveGroup and there is no scratch register available throughout (i.e. all registers are either used by the move group or are live afterwards), the move emitter looks at later moves in the sequence for a register whose current value will be clobbered, and which can be used as a scratch register.  This eliminates more than 60% of the memory->memory push/pops on crypto, bringing the number of such moves below the number we had before bug 1067610.
Attachment #8609530 - Flags: review?(sunfish)
Attached patch part 2Splinter Review
This builds on the previous patch to try to slide moves around so that there will be a scratch register available for them.  This pretty much eliminates the remaining memory push/pops on octane-crypto, though it's hard to tell if the benchmark score is affected.
Attachment #8610754 - Flags: review?(sunfish)
Comment on attachment 8609530 [details] [diff] [review]
patch

Review of attachment 8609530 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/x86-shared/MoveEmitter-x86-shared.cpp
@@ +106,5 @@
>  #endif
>  
>      for (size_t i = 0; i < moves.numMoves(); i++) {
> +#if defined(JS_CODEGEN_X86) && defined(DEBUG)
> +        if (!scratchRegister_.isSome()) {

The hasScratchRegister() helper had the nice property of hiding the JS_CODEGEN_X86 test. Why was it removed?

@@ +558,5 @@
> +        } else if (move.to().isMemoryOrEffectiveAddress()) {
> +            regs.takeUnchecked(move.to().base());
> +        }
> +    }
> +    */

Did you mean to leave this block of code commented out?
Comment on attachment 8610754 [details] [diff] [review]
part 2

Review of attachment 8610754 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/CodeGenerator.cpp
@@ +2157,4 @@
>      if (group->maybeScratchRegister().isGeneralReg())
>          emitter.setScratchRegister(group->maybeScratchRegister().toGeneralReg()->reg());
> +    else
> +        resolver.sortMemoryToMemoryMoves();

It seems like optimizing memory-to-memory moves would be beneficial on all platforms, not just x86?
(In reply to Dan Gohman [:sunfish] from comment #2)
> ::: js/src/jit/x86-shared/MoveEmitter-x86-shared.cpp
> @@ +106,5 @@
> >  #endif
> >  
> >      for (size_t i = 0; i < moves.numMoves(); i++) {
> > +#if defined(JS_CODEGEN_X86) && defined(DEBUG)
> > +        if (!scratchRegister_.isSome()) {
> 
> The hasScratchRegister() helper had the nice property of hiding the
> JS_CODEGEN_X86 test. Why was it removed?

hasScratchRegister() wasn't so much removed as replaced by findScratchRegister(), and having both functions in the interface seems like it would cause confusion over which is correct to use.  In this function there is already a test for JS_CODEGEN_X86 && DEBUG.  While we could remove the test here and always call findScratchRegister(), on x64 we would just clobber the general purpose scratch register before every move, which seems kind of silly and wasteful.

> @@ +558,5 @@
> > +        } else if (move.to().isMemoryOrEffectiveAddress()) {
> > +            regs.takeUnchecked(move.to().base());
> > +        }
> > +    }
> > +    */
> 
> Did you mean to leave this block of code commented out?

Oops, no, this was in for testing.
(In reply to Dan Gohman [:sunfish] from comment #3)
> ::: js/src/jit/CodeGenerator.cpp
> @@ +2157,4 @@
> >      if (group->maybeScratchRegister().isGeneralReg())
> >          emitter.setScratchRegister(group->maybeScratchRegister().toGeneralReg()->reg());
> > +    else
> > +        resolver.sortMemoryToMemoryMoves();
> 
> It seems like optimizing memory-to-memory moves would be beneficial on all
> platforms, not just x86?

sortMemoryToMemoryMoves() just reorders moves to help findScratchRegister() on x86.  This should only help performance if we are executing a move group with no available scratch register.
Comment on attachment 8609530 [details] [diff] [review]
patch

Review of attachment 8609530 [details] [diff] [review]:
-----------------------------------------------------------------

Ok. If we ever have another architecture that lacks a ScratchRegister, it'd be nice to generalize this, but it's ok for now.
Attachment #8609530 - Flags: review?(sunfish) → review+
Attachment #8610754 - Flags: review?(sunfish) → review+
It looks like you forgot to enable the code in findScratchRegister.
(In reply to Tom Schuster [:evilpie] from comment #8)
> It looks like you forgot to enable the code in findScratchRegister.

Oops, thanks.
https://hg.mozilla.org/mozilla-central/rev/bb1fe6825826
https://hg.mozilla.org/mozilla-central/rev/82322f431ac1
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
You need to log in before you can comment on or make changes to this bug.