Closed Bug 1702259 Opened 3 years ago Closed 3 years ago

Mismatch for float32 size in ARM64 MacroAssembler in MacroAssembler::storeRegsInMask and FloatRegister::GetPushSizeInBytes

Categories

(Core :: JavaScript Engine: JIT, defect, P1)

ARM64
All
defect

Tracking

()

RESOLVED FIXED
89 Branch
Tracking Status
firefox89 --- fixed

People

(Reporter: yury, Assigned: jseward)

References

Details

Attachments

(1 file)

I'm getting an assertion mentioned in https://bugzilla.mozilla.org/show_bug.cgi?id=1516701#c1 : "I suspect this assertion would trigger in the future with float32 test cases as FloatRegister::GetPushSizeInBytes [1] does not compute float32 storage space correctly, as opposed to x64 [2]."

[task 2021-03-30T19:50:32.344Z] Assertion failure: diffF == 0, at /builds/worker/checkouts/gecko/js/src/jit/arm64/MacroAssembler-arm64.cpp:866
[task 2021-03-30T19:50:32.344Z] Exit code: -11
[task 2021-03-30T19:50:32.344Z] FAIL - baseline/bug844470.js
[task 2021-03-30T19:50:32.344Z] TEST-UNEXPECTED-FAIL | js/src/jit-test/tests/baseline/bug844470.js | Assertion failure: diffF == 0, at /builds/worker/checkouts/gecko/js/src/jit/arm64/MacroAssembler-arm64.cpp:866 (code -11, args "") [0.2 s]
[task 2021-03-30T19:50:32.344Z] INFO exit-status     : -11
[task 2021-03-30T19:50:32.344Z] INFO timed-out       : False
[task 2021-03-30T19:50:32.344Z] INFO stderr         2> Assertion failure: diffF == 0, at /builds/worker/checkouts/gecko/js/src/jit/arm64/MacroAssembler-arm64.cpp:866

Reproducible with bug 1687629 patches: https://treeherder.mozilla.org/logviewer?job_id=334938352&repo=try&lineNumber=4696

I does not look related to (Wasm) SIMD since can trigger that with baseline/bug844470.js --jitflags=none --args=--baseline-eager

Nice find. As you note, this will block wasm on ARM64 (targeted for FF90), setting priority level to something visible.

Severity: -- → S3
OS: Unspecified → All
Priority: -- → P1
Hardware: Unspecified → ARM64

Note (for clarity): I can only reproduce this with the ARM64 SIMD patch stack applied:
[1687629/D100656, 1678097/D100116, 1687629/D100659]
Without that, I can't reproduce it.

The register set patch in that stack has a lot of moving parts iirc, among other things it switches the order of singles and doubles in the register type enum, to correspond with other architectures. That could have uncovered a latent bug.

Minimal repro:

./dist/bin/js \
    -f /home/sewardj/MOZ/MU/js/src/jit-test/lib/prologue.js \
    --blinterp-eager \
    -f /home/sewardj/MOZ/MU/js/src/jit-test/tests/baseline/bug844470.js

On initial investigation, it appears that the failure occurs due to an
inconsistency between FloatRegisterSet::GetPushSizeInBytes and
FloatRegister::size. The failing function, MacroAssembler::storeRegsInMask
expects them to be consistent (or, more precisely, for an arbitrary mixture
of float and double registers, expects them to differ overall by a maximum
of sizeof(uintptr_t).

One the one hand:

jit/arm64/Architecture-arm64.cpp
uint32_t FloatRegister::GetPushSizeInBytes(const FloatRegisterSet& s) {
  return s.size() * sizeof(double);
}

but on the other

jit/arm64/Architecture-arm64.h
FloatRegister::size
  constexpr uint32_t size() const {
    MOZ_ASSERT(!invalid_);
    if (kind_ == FloatRegisters::Double) {
      return sizeof(double);
    }
    if (kind_ == FloatRegisters::Single) {
      return sizeof(float);        <---- this is != sizeof(double)
    }

The question that comes to mind is: which is right? Specifically, where
is the single-point-of-truth for layout of this dump area?

(In reply to Julian Seward [:jseward] from comment #5)

The question that comes to mind is: which is right?

storeRegsInMask.

If we can save stack space, we should.

Looking deeper, it seems like there are at least the following problems to
resolve, on arm64:

  • The layout implemented by MacroAssembler::storeRegsInMask is inconsistent
    with what's computed by GetPushSizeInBytes for FloatRegisterSet.

  • It's unclear and unstated (at least AFAICS) which routines generate code
    that restore registers from the area created by
    MacroAssembler::storeRegsInMask. Is it
    MacroAssembler::PopRegsInMaskIgnore? Something else? Are there others?

  • If MacroAssembler::PopRegsInMaskIgnore is indeed a restorer for the
    MacroAssembler::storeRegsInMask area, then it seems to be inconsistent.
    It appears to sometimes use load-pair instructions even though
    MacroAssembler::storeRegsInMask doesn't use store-pair instructions, and
    it's less than obvious that the two routines will match in layout given a
    mixture of float and double registers.

Note that during initial arm64 work, the register sets did not have simd registers but the stubs still needed to have meaningful save/restore routines that took simd into account b/c of the baseline compiler, and so there are variants of some of the PopRegsInMask etc routines used by stubs on ARM64+SIMD called PushRegsInMaskForWasmStubs etc. These may have bug fixes (or additional bugs). They should be removed by the patch stack and the baseline compiler should use the register sets with simd registers instead. I believe I asked Yury to do this but since I haven't seen those patches in a long time I don't know what's been done.

(In reply to Julian Seward [:jseward] from comment #7)

  • It's unclear and unstated (at least AFAICS) which routines generate code
    that restore registers from the area created by
    MacroAssembler::storeRegsInMask. Is it
    MacroAssembler::PopRegsInMaskIgnore? Something else? Are there others?

This comment describes that PushRegsInMask and storeRegsInMask should follow the [same] stack layout.

It seems the only use of storeRegsInMask is used to save registers on the stack after pushing arguments to the stack within CacheRegisterAllocator::saveIonLiveRegisters, which are then restored with PopRegsInMask.

(In reply to Nicolas B. Pierron [:nbp] from comment #9)

It seems the only use of storeRegsInMask is used to save registers on the stack after pushing arguments to the stack within CacheRegisterAllocator::saveIonLiveRegisters, which are then restored with PopRegsInMask.

It would be nice to replace this one-and-only use of storeRegsInMask by
PushRegsInMask, since that would reduce our proof-of-correctness burden
to showing that MacroAssembler::PushRegsInMask and
MacroAssembler::PopRegsInMaskIgnore are consistent. Currently the ARM64
storeRegsInMask is way different from either of them.

(In reply to Julian Seward [:jseward] from comment #10)

It would be nice to replace this one-and-only use of storeRegsInMask by
PushRegsInMask

Forget that. It would create worse code on non-arm64 targets. The problem is
easy enough to fix directly in the arm64 storeRegsInMask. Patch to follow.

The functions MacroAssembler::PushRegsInMask,
MacroAssembler::storeRegsInMask and MacroAssembler::PopRegsInMaskIgnore
generate code to save/restore registers in a memory area. They must agree on
the format of this area, and that must also be consistent with what
FloatRegisterSet::getPushSizeInBytes claims.

On arm64, MacroAssembler::storeRegsInMask is inconsistent with the other
three, and that leads to this assertion failure. As far as I can see, this
has been the case since at least late 2018 (505083:fcf60924da90). It places
32-bit floating registers back-to-back in the memory area. But the restore
code created by MacroAssembler::PopRegsInMaskIgnore reads them back as
64-bit floats. Hence, not only does the assertion fail, but the stored format
is actually different -- floating point registers are stored as 64-bit
entities with no holes in between, even if the MacroAssembler regards them as
32-bit floats (meaning, they satisfy reg.isSingle()).

This patch changes MacroAssembler::storeRegsInMask to be consistent with the
other three routines.

Assignee: nobody → jseward
Status: NEW → ASSIGNED
Pushed by jseward@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/ba66cd86f6e9
Mismatch for float32 size in ARM64 MacroAssembler in MacroAssembler::storeRegsInMask and FloatRegister::GetPushSizeInBytes.  r=nbp.
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 89 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: