Closed Bug 1517553 Opened 5 years ago Closed 5 years ago

js::jit::CodeGenerator::visitLoadSlotT with testInt32 lacks scratch registers.

Categories

(Core :: JavaScript Engine: JIT, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla66
Tracking Status
firefox66 --- fixed

People

(Reporter: nbp, Assigned: nbp)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

      No description provided.
Ok, looking at the code and available assembly instructions … I think the best way to proceed for the ARM64 unboxing both for the C++ code and the JIT code is to change the strategy for comparing Value tags.

Clang generates the following assembly sequence for Value::isInt32:

  lsr     x0, x0, #47
  mov     w8, #0xfff1
  movk    w8, #0x1, lsl #16
  subs    w8, w0, w8
  cset    w0, eq

In our JIT we use UBFX instruction to extract form the bit field corresponding to the mask of the JSVAL_TAG_*, which is similar to a the LSR instruction generated by clang.

I think we can optimize ARM64 use of Values tags by working with the sign-extended version of the tags. All Tags have a 1 bit as the top-most bit. Doing a sign-extension will set all the high bits to 1.  Doing so, we can test for equality with the CMN/ADDS instruction instead of CMP/SUBS instruction.

  asr     x0, x0, #47
  cmn     x0, #(= (0 - (int64_t(JSVAL_SHIFTED_TAG_INT64) >> JSVAL_TAG_SHIFT) ) )
(In reply to Nicolas B. Pierron [:nbp] from comment #1)
> I think we can optimize ARM64 …

And fix this issue, as the proposed generated assembly does not require any scratch register.

So far the patch that I am working on sounds good because extractTag / splitTag extracted values are only flowing in branchTest functions which are using the test functions based on CMN.
As opposed to other architecture, splitTag is replaced by splitSignExtTag
which extract the tag with a sign extension. The reason being that cmp32
with a tag value would be too large to fit as a 12 bits immediate value,
and would require the VIXL macro assembler to add an extra instruction and
require extra scratch register to load the Tag value.

Instead, we compare with the negative value of the sign extended tag with
the CMN instruction. The sign extended tag is expected to be a negative
value. Therefore the negative of the sign extended tag is expected to be
near 0 and fit on 12 bits.

Ignoring the sign extension, the logic is the following:

  CMP32(Reg, Tag) = Reg - Tag
                  = Reg + (-Tag)
                  = CMN32(Reg, -Tag)

Note: testGCThing, testPrimitive and testNumber which are checking for
inequalities should use unsigned comparisons (as done by default) in
order to keep the same relation order after the sign extension, i.e.
using Above or Below which are based on the carry flag.
Attachment #9034754 - Flags: review?(sstangl)
Attachment #9034754 - Flags: review?(kvijayan)

Comment on attachment 9034754 [details] [diff] [review]
ARM64 JIT: Change JSVAL_TAG comparisons to require only 2 instructions and 1 register.

Review of attachment 9034754 [details] [diff] [review]:

::: js/src/jit/arm64/MacroAssembler-arm64.h
@@ +1436,5 @@

  • //
  • // Note: testGCThing, testPrimitive and testNumber which are checking for
  • // inequalities should use unsigned comparisons (as done by default) in
  • // order to keep the same relation order after the sign extension, i.e.
  • // using Above or Below which are based on the carry flag.

Please add a brief one-line comment to those functions referring to this one, something like // Requires unsigned comparison due to cmpTag internals.

@@ +1439,5 @@

  • // order to keep the same relation order after the sign extension, i.e.
  • // using Above or Below which are based on the carry flag.
  • uint32_t hiShift = JSVAL_TAG_SHIFT - 32;
  • int32_t seTag = int32_t(ref.value);
  • seTag = (seTag << hiShift) >> hiShift;

Is this behavior well-defined? I would be worried that if you passed -O3 the compiler would decide to helpfully compile this into a NOP, since sign-extension isn't part of the C++ spec, as far as I'm aware.

But the assertions below would catch that, at least, in debug mode.

Attachment #9034754 - Flags: review?(sstangl) → review+

(In reply to Sean Stangl [:sstangl] from comment #4)

  • seTag = (seTag << hiShift) >> hiShift;

Is this behavior well-defined?

Strangely no, but it is not undefined either.

From http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3797.pdf , Section 5.8 Shift operator:

(context E1 >> E2) If E1 has a signed type and a negative value, the resulting value is implementation-defined.

If you prefer, I can use a or-mask to set to 1 all the hi-bits with:

uint32_t signExt = 0xffffffff;
signExt = (signExt << hiShift) >> hiShift;
MOZ_ASSERT(JSVAL_TAG_MAX_DOUBLE <= seTag && seTag <= JSVAL_TAG_OBJECT);
seTag = seTag | signExt;

But I think we are safe unless this got added to the list of undefined behaviour of clang.

Comment on attachment 9034754 [details] [diff] [review]
ARM64 JIT: Change JSVAL_TAG comparisons to require only 2 instructions and 1 register.

Review of attachment 9034754 [details] [diff] [review]:
-----------------------------------------------------------------

I am satisfied with sean reviewing this - I don't think the changes really require my go-ahead.  Thanks for the heads up nbp.
Attachment #9034754 - Flags: review?(kvijayan)
Pushed by npierron@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9429979e423c
ARM64 JIT: Change JSVAL_TAG comparisons to require only 2 instructions and 1 register. r=sstangl
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: