Instructions like testq $0x8, %rax can be rewritten to testb $0x8, %al, which is a smaller instruction due to using a smaller immediate field. cmpq $0, %rax can be rewritten to testq %rax, %rax, which is smaller.
Created attachment 753924 [details] [diff] [review] a proposed fix
So is it only 16-bit operations that are to be avoided? The 8-bit ones are fine?
I attempted to time the cost of |testb| versus |testq|. Given the following code: > section .text > global _start > _start: > mov rdi, dword 0x100000 > xor rcx, rcx > > _loop: > times 1000 test bx, 0x8 > add rcx, dword 1 > cmp rcx, rdi > jne _loop > > _exit: > xor rdi,rdi > mov rax, dword 0x3c > syscall and a version that replaces "test bx, 0x8" with "test rbx, 0x8", the |testb| version runs in 900ms, while the |testq| version runs in 100ms. Based on the execution time of |testb|, it sounds like making this change might not be advantageous.
Luke: Yes, it's generally only the 16-bit operations that are to be avoided. Sean: "test bx, 0x8", is |testw|, the 16-bit form. My patch doesn't use the 16-bit form. With your testcase on a Sandy Bridge, testw is slow, while testb, testl, and testq are all fast.
Comment on attachment 753924 [details] [diff] [review] a proposed fix Review of attachment 753924 [details] [diff] [review]: ----------------------------------------------------------------- You're right -- 8-bit test operations appear to be around the same speed as 32-bit test operations with smaller code size. Nice, I had no idea! ::: js/src/assembler/assembler/X86Assembler.h @@ +1278,5 @@ > spew("testb %s, %s", > nameIReg(1,src), nameIReg(1,dst)); > m_formatter.oneByteOp(OP_TEST_EbGb, src, dst); > } > There's some preexisting trailing whitespace here, would you mind removing it?
I'll submit a separate patch to clean up the preexisting whitespace. I checked in the reviewed patch here: https://hg.mozilla.org/integration/mozilla-inbound/rev/ef66f749167d
Created attachment 760506 [details] [diff] [review] trailing whitespace cleanup