Use xor+setcc instead of setcc+movzbl in BaselineIC-x86-shared.cpp

RESOLVED FIXED in mozilla23

Status

Core
JavaScript Engine
--
enhancement
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: sunfish, Assigned: sunfish)

Tracking

Trunk
mozilla23
x86_64
All
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

4 years ago
BaselineIC-x86-shared.cpp has code that does this:

  ucomisd %xmm0, %xmm1
  seta    %cl
  movzbl  %cl, %ecx

Since it happens to be convenient to do so, this would be better:

  xorl    %ecx, %ecx
  ucomisd %xmm0, %xmm1
  seta    %cl

because it has a smaller code size, and on common processors today the xor of a register with itself is handled specially as a zeroing idiom.
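As a rough illustration of why the two sequences above are interchangeable, here is a minimal sketch in C that models a 32-bit register whose low byte can be written on its own. The function names are hypothetical and exist only for this illustration; they are not taken from BaselineIC-x86-shared.cpp. The point it shows is that setcc writes only the low 8 bits, so the upper 24 bits have to be cleared either afterwards (movzbl) or beforehand (xor).

```c
#include <stdint.h>

/* Hypothetical model of setcc: only the low byte of the register is written,
   the upper 24 bits keep whatever value they had before. */
uint32_t write_low_byte(uint32_t reg, uint8_t value) {
    return (reg & 0xFFFFFF00u) | value;
}

/* setcc + movzbl: set the low byte, then zero-extend it afterwards. */
uint32_t setcc_then_movzbl(uint32_t stale, int condition) {
    uint32_t reg = write_low_byte(stale, condition ? 1 : 0); /* seta %cl */
    return reg & 0xFFu;                                      /* movzbl %cl, %ecx */
}

/* xor + setcc: clear the whole register first, then set the low byte. */
uint32_t xor_then_setcc(uint32_t stale, int condition) {
    uint32_t reg = stale ^ stale;                   /* xorl %ecx, %ecx */
    return write_low_byte(reg, condition ? 1 : 0);  /* seta %cl */
}

/* setcc alone, shown only to illustrate why one of the two fixes is needed. */
uint32_t setcc_only(uint32_t stale, int condition) {
    return write_low_byte(stale, condition ? 1 : 0);
}
```

With a stale register value such as 0xDEADBEEF, both two-instruction variants yield 1 for a true condition, while setcc_only leaves 0xDEADBE01. On real hardware the xor also modifies the flags, which is why it is placed before the ucomisd in the proposed sequence rather than between the compare and the seta.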
(Assignee)

Comment 1

4 years ago
Created attachment 746482 [details] [diff] [review]
a patch containing a proposed fix
(Assignee)

Updated

4 years ago
Blocks: 869532
(Assignee)

Updated

4 years ago
No longer blocks: 869532
(Assignee)

Comment 2

4 years ago
This patch depends on the patch in bug 869532.
Depends on: 869532
(Assignee)

Comment 3

4 years ago
Created attachment 747421 [details] [diff] [review]
refresh patch to apply to trunk
Attachment #746482 - Attachment is obsolete: true
Attachment #747421 - Flags: review?(nicolas.b.pierron)
Attachment #747421 - Flags: review?(nicolas.b.pierron) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/0f05638c8f26
Assignee: general → sunfish
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Are you sure that this is better? Both gcc and clang use the movzbl variant.
(Assignee)

Comment 6

4 years ago
GCC (4.6 and 4.8) on my system uses the xor variant, for this C code with -O2 for example:

long foo(double x) { return x > 0; }

The xor variant is part of the "peephole2" pass in GCC, which specifically rewrites setcc+movzbl to xor+setcc when it can.

Also, the xor variant is recommended first by Agner's "Optimizing subroutines in assembly language", section "Replacing conditional jumps with conditional set instructions" [0].

Also, Intel uses the xor variant in their x86 optimization manual [1]. It isn't a specific recommendation, but they do use it in an example (Example 3-2. "Code Optimization to Eliminate Branches").

Also, the xor trick has a smaller overall encoding.
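To make the encoding claim concrete, here is a small sketch in C that lists the machine-code bytes for the instructions from the description and adds up the two sequences. The byte values are hand-assembled for the specific %cl/%ecx and %xmm0/%xmm1 operands used there and are worth double-checking with a disassembler; the array and function names are hypothetical, and other registers (for example ones that need a REX prefix) would give different sizes.

```c
#include <stddef.h>

/* Hand-assembled encodings for the operands used in the description. */
static const unsigned char UCOMISD_XMM0_XMM1[] = { 0x66, 0x0F, 0x2E, 0xC8 }; /* ucomisd %xmm0, %xmm1 */
static const unsigned char SETA_CL[]           = { 0x0F, 0x97, 0xC1 };       /* seta %cl */
static const unsigned char MOVZBL_CL_ECX[]     = { 0x0F, 0xB6, 0xC9 };       /* movzbl %cl, %ecx */
static const unsigned char XORL_ECX_ECX[]      = { 0x31, 0xC9 };             /* xorl %ecx, %ecx */

/* Total bytes for ucomisd + seta + movzbl. */
size_t setcc_movzbl_size(void) {
    return sizeof UCOMISD_XMM0_XMM1 + sizeof SETA_CL + sizeof MOVZBL_CL_ECX;
}

/* Total bytes for xorl + ucomisd + seta. */
size_t xor_setcc_size(void) {
    return sizeof XORL_ECX_ECX + sizeof UCOMISD_XMM0_XMM1 + sizeof SETA_CL;
}
```

Under these assumptions the xor sequence comes out one byte shorter, because xorl %ecx, %ecx takes two bytes where movzbl %cl, %ecx takes three.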

[0] http://www.agner.org/optimize/optimizing_assembly.pdf
[1] http://www.intel.com/content/dam/doc/manual/64-ia-32-architectures-optimization-manual.pdf
(In reply to Dan Gohman from comment #6)
> GCC (4.6 and 4.8) on my system uses the xor variant, for this C code with
> -O2 for example:


Very true. I forgot my gcc was very old. I've filed a bug against clang to add the optimization there:
http://llvm.org/bugs/show_bug.cgi?id=15946
https://hg.mozilla.org/mozilla-central/rev/0f05638c8f26
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23