Closed Bug 533854 Opened 15 years ago Closed 14 years ago

x64 implementation of XORPS is wrong

Categories

(Tamarin Graveyard :: Baseline JIT (CodegenLIR), defect, P1)

x86_64
All
defect

Tracking

(Not tracked)

VERIFIED FIXED
flash10.1

People

(Reporter: stejohns, Assigned: stejohns)

References

Details

Attachments

(1 file)

It incorrectly assumes that this instruction has a prefix, which it doesn't, which gets the REX value set wrong... net effect is that it works as expected for XORPS(XMM8) gets emitted as XORPS(XMM0) (etc for 8..15), so bug is hard to notice unless you are using a lot of XMM regs.
Attached patch PatchSplinter Review
Assignee: nobody → stejohns
Attachment #416809 - Flags: review?(edwsmith)
Attachment #416809 - Flags: review?(nnethercote)
Blocks: 532240
You could clarify the comment in your patch that XORPS has a 1-byte prefix 0F whereas most SSE2 instructions have a two byte prefix 66,0F.

Also, the original comment reads: 

// XORPS is a 4x32f vector operation, we use it instead of the more obvious
// XORPD because it's one byte shorter.  This is ok because it's only used for
// zeroing an XMM register;  hence the single argument.

Note that XORPD (2x64f) and XORPS (4x32f) are interchangeable since XOR is a bitwise operation.  Both instructions modify the whole register; XORPD is just a one-byte-longer encoding for exactly the same operation.  (zeroing or not).  Why intel has both isn't clear to me, but if we start criticizing this instruction set, why stop there?  ha ha.  Maybe one or the other is designed to pipeline better in conjection with other PS or PD operations, but that's just heresay and seems unlikely since XOR itself is an ALU operation.
Attachment #416809 - Flags: review?(edwsmith) → review+
> Why intel has both isn't clear to me

Marketing influenced?  They probably translate to the exact same micro-ops.
Comment on attachment 416809 [details] [diff] [review]
Patch

You mistyped "instructions", r=me with that fixed.
Attachment #416809 - Flags: review?(nnethercote) → review+
pushed to nj-c as changeset:   1105:49246c195323
pushed to TR as changeset:   3306:1c7c9b53d3c3
Pretty sure that this has been pushed to all relevant repos, shall we close it?
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: --- → flash10.1
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Steven is it possible to generate a testcase for this or should it be verified fixed by the fact that the patch has landed?
Flags: in-testsuite?
QA Contact: nanojit → brbaker
I don't think a testcase is realistically possible.
Verified fixed and set in-testsuite to - based on stevens comment.
Status: RESOLVED → VERIFIED
Flags: in-testsuite? → in-testsuite-
QA Contact: brbaker → nanojit
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: