Bug 1077014 (Closed) — Opened 5 years ago, Closed 5 years ago

Generate better code when the result is not used

Categories

(Core :: JavaScript Engine: JIT, defect)

x86_64
All
defect
Not set

Tracking


RESOLVED FIXED
mozilla39
Tracking Status
firefox39 --- fixed

People

(Reporter: lth, Assigned: lth)

References

(Blocks 1 open bug)

Details

Attachments

(4 files, 1 obsolete file)

Followup work on 979594.

When the result of an atomic ADD or SUB is not used, we can simply generate 'LOCK ADD' or 'LOCK SUB' instead of 'MOV; LOCK XADD' for add or 'MOV; NEG; LOCK XADD' for subtract.  Immediate operands fit directly into the instruction.

When the result of an atomic AND, OR, or XOR is not used, we can simply generate 'LOCK AND', etc, instead of the CMPXCHG loop that we have to use when the result is needed.  Again, immediate operands fit directly into the instruction.

(I'm not aware of similar tweaks for atomic CMPXCHG, and in any case the result of that operation will usually be used.)
A fair amount of the code is cross-platform, and even on ARM this optimization has pleasant effects (reduced register pressure).
Summary: x86 atomics: Generate better code when the result is not used → Generate better code when the result is not used
Depends on: 1141067
Blocks: 1141121
Simple variant on existing code.
Attachment #8575167 - Flags: review?(mrosenberg)
Simple API changes.
Attachment #8575225 - Flags: review?(hv1989)
Attached patch Optimizations and tests (obsolete) — Splinter Review
(There are additional opportunities, many will be taken care of by bug 1141121.)
Attachment #8575226 - Flags: review?(hv1989)
Comment on attachment 8575225 [details] [diff] [review]
Macroassembler changes

Review of attachment 8575225 [details] [diff] [review]:
-----------------------------------------------------------------

Can you open a follow-up bug for the mips changes and needinfo rankov?

::: js/src/jit/MacroAssembler.cpp
@@ +641,5 @@
>  MacroAssembler::atomicBinopToTypedIntArray(AtomicOp op, Scalar::Type arrayType,
>                                             const Register &value, const BaseIndex &mem,
>                                             Register temp1, Register temp2, AnyRegister output);
>  
> +// For effect

??
Attachment #8575225 - Flags: review?(hv1989) → review+
Comment on attachment 8575226 [details] [diff] [review]
Optimizations and tests

Review of attachment 8575226 [details] [diff] [review]:
-----------------------------------------------------------------

I think it would be better to add a new LIR for this case. LAsmJSAtomicBinopHeap has a 'def' that the register allocator needs to worry about; if we add a new LIR we can remove that 'def'.
Attachment #8575226 - Flags: review?(hv1989)
MIPS stubs factored out from the MacroAssembler patch, as requested by h4writer.
Attachment #8577118 - Flags: review?(branislav.rankov)
Refactored as requested.
Attachment #8575226 - Attachment is obsolete: true
Attachment #8577171 - Flags: review?(hv1989)
Comment on attachment 8577171 [details] [diff] [review]
Optimizations and tests, v2

Review of attachment 8577171 [details] [diff] [review]:
-----------------------------------------------------------------

I only had some reservations about the function "asmJSAtomicComputeAddress". Feel free to contact me in chat with questions about it or other possible suggestions.
r+ once we agree on the "asmJSAtomicComputeAddress" changes.

::: js/src/jit/LIR-Common.h
@@ +5094,5 @@
>          return mir_->toAtomicTypedArrayElementBinop();
>      }
>  };
>  
> +class LAtomicTypedArrayElementBinopForEffect : public LInstructionHelper<0, 3, 0>

Can you add an explanation above of what this LIR does?

@@ +6523,5 @@
>          return mir_->toAsmJSAtomicBinopHeap();
>      }
>  };
>  
> +class LAsmJSAtomicBinopHeapForEffect : public LInstructionHelper<0, 2, 1>

Can you add an explanation above of what this LIR does?

::: js/src/jit/x86/CodeGenerator-x86.cpp
@@ +694,1 @@
>  {

Oh, this is strange. Can we make sure the name "asmJSAtomicComputeAddress" matches what happens here?
This function does more than expected: it also sets 'out' to zero when the bounds check fails, instead of only computing the address.

So can we move this out of this function:
> memoryBarrier(MembarFull);
> if (out != InvalidReg)
>     masm.xorl(out,out);

To something like this:
> asmJSAtomicComputeAddress(Register addrTemp, Register ptrReg, bool boundsCheck,
>                           int32_t offset, int32_t endOffset, Label &fail)
> {
>     if (mir->needsBoundsCheck()) {
>         ...
>         masm.j(Assembler::Above, &fail);
>     }
>     ...
> }
> 
> visitAsmJSAtomicBinopHeap(..)
> {
>     asmJSAtomicComputeAddress(...);
> 
> 
>     if (fail.used) {
>         masm.jump(&rejoin);
>         masm.bind(&fail);
>         memoryBarrier(MembarFull);
>         if (out != InvalidReg)
>             masm.xorl(out,out);
> 
>     }
>     masm.bind(&rejoin);
> }

If the "fail" case is very uncommon we can even make it out-of-line code, removing the jump.
Attachment #8577171 - Flags: review?(hv1989) → review+
Discussion with Hannes on IRC: we'll add a comment to the existing asmJSAtomicComputeAddress() method to clarify what it does (bounds check, barrier + clear result on failure + compute address).
Comment on attachment 8575167 [details] [diff] [review]
Atomic binops for effect for ARM

Review of attachment 8575167 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm/MacroAssembler-arm.cpp
@@ +5049,5 @@
> +                                        const T &mem)
> +{
> +    // Fork for non-word operations on ARMv6.
> +    //
> +    // Bug 1077321: We may further optimize for ARMv8 here.

ARMv8 is different enough from ARMv7 that it is getting an entirely new arch directory.
Attachment #8575167 - Flags: review?(mrosenberg) → review+
Comment on attachment 8577118 [details] [diff] [review]
Macroassembler changes (MIPS)

Review of attachment 8577118 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/mips/MacroAssembler-mips.h
@@ +1016,5 @@
>      template<typename T, typename S>
>      void atomicFetchAdd32(const S &value, const T &mem, Register temp, Register output) {
>          MOZ_CRASH("NYI");
>      }
> +    template <typename T, typename S> void atomicAdd8(const T &value, const S &mem) {

NIT: Add new line after template <typename T, typename S>

Do the same for all other instances.
Attachment #8577118 - Flags: review?(branislav.rankov) → review+
(In reply to Marty Rosenberg [:mjrosenb] from comment #11)
> Comment on attachment 8575167 [details] [diff] [review]
> Atomic binops for effect for ARM
> 
> Review of attachment 8575167 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: js/src/jit/arm/MacroAssembler-arm.cpp
> @@ +5049,5 @@
> > +                                        const T &mem)
> > +{
> > +    // Fork for non-word operations on ARMv6.
> > +    //
> > +    // Bug 1077321: We may further optimize for ARMv8 here.
> 
> ARMv8 is different enough from ARMv7 that it is getting an entirely new arch
> directory.

You're probably thinking of 64-bit?  I was thinking about the fact that there are new AArch32 instructions for synchronization (some release-acquire thing that will allow the insertion of fewer memory barriers).

(I'll clarify the comment.)