Closed Bug 869525 Opened 12 years ago Closed 12 years ago

Use xor+setcc instead of setcc+movzbl in BaselineIC-x86-shared.cpp

Categories

(Core :: JavaScript Engine, enhancement)

x86_64
All
enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla23

People

(Reporter: sunfish, Assigned: sunfish)

References

Details

Attachments

(1 file, 1 obsolete file)

BaselineIC-x86-shared.cpp has code that does this:

  ucomisd %xmm0, %xmm1
  seta %cl
  movzbl %cl, %ecx

Since it happens to be convenient to do so, this would be better:

  xorl %ecx, %ecx
  ucomisd %xmm0, %xmm1
  seta %cl

because it has smaller code size, and on common processors today the xor of a register with itself is handled specially.
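For readers less familiar with setcc: it writes only the low 8 bits of its destination, so the upper bits of %ecx have to be cleared either afterwards (movzbl) or beforehand (xorl). The xorl has to come before the compare because it overwrites the flags that seta reads. The following is a minimal C model of that register behaviour, not code from the patch; the function names are hypothetical and exist only for illustration.

```c
#include <stdint.h>

/* Hypothetical model of setcc: only the low byte of the 32-bit
 * register is written, the upper 24 bits keep their old contents. */
uint32_t setcc_low_byte(uint32_t reg, int cond) {
    return (reg & 0xFFFFFF00u) | (cond ? 1u : 0u);
}

/* Models "seta %cl; movzbl %cl, %ecx":
 * write the low byte, then zero-extend it into the full register. */
uint32_t bool_via_movzbl(uint32_t stale, int cond) {
    uint32_t reg = setcc_low_byte(stale, cond);
    return reg & 0xFFu;
}

/* Models "xorl %ecx, %ecx; <compare>; seta %cl":
 * clear the whole register first, then write the low byte. */
uint32_t bool_via_xor(uint32_t stale, int cond) {
    uint32_t reg = stale ^ stale;
    return setcc_low_byte(reg, cond);
}
```

Both models return the same 0 or 1 for any stale register contents; the difference between the real instruction sequences is in encoding size and in how the processor handles them, not in the result.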
Blocks: 869532
No longer blocks: 869532
This patch depends on the patch in bug 869532.
Depends on: 869532
Attachment #746482 - Attachment is obsolete: true
Attachment #747421 - Flags: review?(nicolas.b.pierron)
Attachment #747421 - Flags: review?(nicolas.b.pierron) → review+
Assignee: general → sunfish
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Are you sure that this is better? Both gcc and clang use the movzbl variant.
GCC (4.6 and 4.8) on my system uses the xor variant, for this C code with -O2 for example:

  long foo(double x) { return x > 0; }

The xor variant is part of the "peephole2" pass in GCC, which specifically rewrites setcc+movzbl to xor+setcc when it can.

Also, the xor variant is recommended first by Agner's "Optimizing subroutines in assembly language", section "Replacing conditional jumps with conditional set instructions" [0].

Also, Intel uses the xor variant in their x86 optimizing manual [1]. It isn't a specific recommendation, but they do use it in an example (Example 3-2. "Code Optimization to Eliminate Branches").

Also, the xor trick has a smaller overall encoding.

[0] http://www.agner.org/optimize/optimizing_assembly.pdf
[1] http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf
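To check locally which variant a given compiler emits, the one-line function from this comment can be placed in a file of its own and compiled to assembly. This is only a sketch of that experiment; the output depends on the compiler version and target, so no particular instruction sequence is assumed here.

```c
/* The comparison yields an int that is 0 or 1, which is then widened
 * to long. On x86-64 this is typically lowered to a compare followed
 * by setcc, plus either a preceding xor or a trailing movzbl. */
long foo(double x) {
    return x > 0;
}
```

Compiling with something like `gcc -O2 -S foo.c` (the file name is an assumption) and reading the generated foo.s shows whether the register is zeroed with xor before the compare or zero-extended with movzbl after the setcc.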
(In reply to Dan Gohman from comment #6)
> GCC (4.6 and 4.8) on my system uses the xor variant, for this C code with
> -O2 for example:

Very true. I forgot my gcc was very old. I've filed a bug against clang to add the optimization there: http://llvm.org/bugs/show_bug.cgi?id=15946
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
