Closed
Bug 1077014
Opened 11 years ago
Closed 10 years ago
Generate better code when the result is not used
Categories
(Core :: JavaScript Engine: JIT, defect)
Tracking
()
RESOLVED
FIXED
mozilla39
| Tracking | Status | |
|---|---|---|
| firefox39 | --- | fixed |
People
(Reporter: lth, Assigned: lth)
References
(Blocks 1 open bug)
Details
Attachments
(4 files, 1 obsolete file)
|
8.02 KB,
patch
|
mjrosenb
:
review+
|
Details | Diff | Splinter Review |
|
27.86 KB,
patch
|
h4writer
:
review+
|
Details | Diff | Splinter Review |
|
5.54 KB,
patch
|
rankov
:
review+
|
Details | Diff | Splinter Review |
|
39.07 KB,
patch
|
h4writer
:
review+
|
Details | Diff | Splinter Review |
Followup work on 979594.
When the result of an atomic ADD or SUB is not used we can generate simply 'LOCK ADD' or 'LOCK SUB' instead of 'MOV; LOCK XADD' for add or 'MOV; NEG; LOCK XADD' for subtract. Immediate operands will fit right into the instruction.
When the result of an atomic AND, OR, or XOR is not used we can generate simply 'LOCK AND', etc, instead of a loop with CMPXCHG that we have to use if the result is needed. Again, immediate operands will fit right into the instruction.
(I'm not aware of tweaks to atomic CMPXCHG, and in any case the result of that operation will usually be used.)
| Assignee | ||
Comment 1•10 years ago
|
||
A fair amount of the code is crossplatform and even on ARM there are pleasant effects of this optimization (less register pressure).
Summary: x86 atomics: Generate better code when the result is not used → Generate better code when the result is not used
| Assignee | ||
Comment 2•10 years ago
|
||
Simple variant on existing code.
Attachment #8575167 -
Flags: review?(mrosenberg)
| Assignee | ||
Comment 4•10 years ago
|
||
(There are additional opportunities, many will be taken care of by bug 1141121.)
Attachment #8575226 -
Flags: review?(hv1989)
Comment 5•10 years ago
|
||
Comment on attachment 8575225 [details] [diff] [review]
Macroassembler changes
Review of attachment 8575225 [details] [diff] [review]:
-----------------------------------------------------------------
Can you open a follow-up bug for the mips changes and needinfo rankov?
::: js/src/jit/MacroAssembler.cpp
@@ +641,5 @@
> MacroAssembler::atomicBinopToTypedIntArray(AtomicOp op, Scalar::Type arrayType,
> const Register &value, const BaseIndex &mem,
> Register temp1, Register temp2, AnyRegister output);
>
> +// For effect
??
Attachment #8575225 -
Flags: review?(hv1989) → review+
Comment 6•10 years ago
|
||
Comment on attachment 8575226 [details] [diff] [review]
Optimizations and tests
Review of attachment 8575226 [details] [diff] [review]:
-----------------------------------------------------------------
I think it would be better to add a new LIR for this case. LAsmJSAtomicBinopHeap has a 'def', where the register allocator need to worry about. If we add a new LIR we can remove that 'def'.
Attachment #8575226 -
Flags: review?(hv1989)
| Assignee | ||
Comment 7•10 years ago
|
||
MIPS stubs factored out from the MacroAssembler patch, as requested by h4writer.
Attachment #8577118 -
Flags: review?(branislav.rankov)
| Assignee | ||
Comment 8•10 years ago
|
||
Refactored as requested.
Attachment #8575226 -
Attachment is obsolete: true
Attachment #8577171 -
Flags: review?(hv1989)
Comment 9•10 years ago
|
||
Comment on attachment 8577171 [details] [diff] [review]
Optimizations and tests, v2
Review of attachment 8577171 [details] [diff] [review]:
-----------------------------------------------------------------
I only had some reservations with the function "asmJSAtomicComputeAddress". Feel free to contact me in chat upon questions about it or other possible suggestions.
r+ when agreeing on the "asmJSAtomicComputeAddress" changes.
::: js/src/jit/LIR-Common.h
@@ +5094,5 @@
> return mir_->toAtomicTypedArrayElementBinop();
> }
> };
>
> +class LAtomicTypedArrayElementBinopForEffect : public LInstructionHelper<0, 3, 0>
Can you add an explanation above of what this LIR does?
@@ +6523,5 @@
> return mir_->toAsmJSAtomicBinopHeap();
> }
> };
>
> +class LAsmJSAtomicBinopHeapForEffect : public LInstructionHelper<0, 2, 1>
Can you add an explanation above of what this LIR does?
::: js/src/jit/x86/CodeGenerator-x86.cpp
@@ +694,1 @@
> {
Oh this is strange. Can we make sure the name "asmJSAtomicComputeAddress" matches what happens in here?
This function does more than expected. It also does: "set out to zero upon boundscheckfail", instead of only "computeAddress".
So can we move this out of this function:
> memoryBarrier(MembarFull);
> if (out != InvalidReg)
> masm.xorl(out,out);
To something like this:
> asmJSAtomicComputeAddress(Register addrTemp, Register ptrReg, bool boundsCheck,
> int32_t offset, int32_t endOffset, Label &fail)
> {
> if (mir->needsBoundsCheck()) {
> ...
> masm.j(Assembler::Above, &fail);
> }
> ...
> }
>
> visitAsmJSAtomicBinopHeap(..)
> {
> asmJSAtomicComputeAddress(...);
>
>
> if (fail.used) {
> masm.jump(&rejoin);
> masm.bind(&fail);
> memoryBarrier(MembarFull);
> if (out != InvalidReg)
> masm.xorl(out,out);
>
> }
> masm.bind(&rejoin);
> }
If the "fail" case is very uncommon we can even make it an oolcode, removing the jump.
Attachment #8577171 -
Flags: review?(hv1989) → review+
| Assignee | ||
Comment 10•10 years ago
|
||
Discussion with Hannes on IRC: we'll add a comment to the existing asmJSAtomicComputeAddress() method to clarify what it does (bounds check, barrier + clear result on failure + compute address).
Comment 11•10 years ago
|
||
Comment on attachment 8575167 [details] [diff] [review]
Atomic binops for effect for ARM
Review of attachment 8575167 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/arm/MacroAssembler-arm.cpp
@@ +5049,5 @@
> + const T &mem)
> +{
> + // Fork for non-word operations on ARMv6.
> + //
> + // Bug 1077321: We may further optimize for ARMv8 here.
ARMv8 is different enough from ARMv7 that it is getting an entirely new arch directory.
Attachment #8575167 -
Flags: review?(mrosenberg) → review+
Comment 12•10 years ago
|
||
Comment on attachment 8577118 [details] [diff] [review]
Macroassembler changes (MIPS)
Review of attachment 8577118 [details] [diff] [review]:
-----------------------------------------------------------------
::: js/src/jit/mips/MacroAssembler-mips.h
@@ +1016,5 @@
> template<typename T, typename S>
> void atomicFetchAdd32(const S &value, const T &mem, Register temp, Register output) {
> MOZ_CRASH("NYI");
> }
> + template <typename T, typename S> void atomicAdd8(const T &value, const S &mem) {
NIT: Add new line after template <typename T, typename S>
Do the same for all other instances.
Attachment #8577118 -
Flags: review?(branislav.rankov) → review+
| Assignee | ||
Comment 13•10 years ago
|
||
(In reply to Marty Rosenberg [:mjrosenb] from comment #11)
> Comment on attachment 8575167 [details] [diff] [review]
> Atomic binops for effect for ARM
>
> Review of attachment 8575167 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: js/src/jit/arm/MacroAssembler-arm.cpp
> @@ +5049,5 @@
> > + const T &mem)
> > +{
> > + // Fork for non-word operations on ARMv6.
> > + //
> > + // Bug 1077321: We may further optimize for ARMv8 here.
>
> ARMv8 is different enough from ARMv7 that it is getting an entirely new arch
> directory.
You're probably thinking about 64-bit? I was thinking about the fact that there are new AArch32 instructions for synchronization (some release-acquire thing that will allow for the insertion of fewer memory barrier).
(I'll clarify the comment.)
| Assignee | ||
Comment 14•10 years ago
|
||
| Assignee | ||
Comment 15•10 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/1949f82ba1ba
https://hg.mozilla.org/mozilla-central/rev/01e1baebdd64
https://hg.mozilla.org/mozilla-central/rev/c741c522c108
https://hg.mozilla.org/mozilla-central/rev/5ddeb7d76c6c
Status: NEW → RESOLVED
Closed: 10 years ago
status-firefox39:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
You need to log in
before you can comment on or make changes to this bug.
Description
•