Closed Bug 1943704 Opened 20 days ago Closed 5 days ago

Assertion failure: cx_->hadResourceExhaustion(), at js/src/jit/WarpOracle.cpp:206

Categories

(Core :: JavaScript Engine: JIT, defect, P3)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
137 Branch
Tracking Status
firefox-esr128 --- unaffected
firefox134 --- unaffected
firefox135 --- unaffected
firefox136 --- wontfix
firefox137 --- fixed

People

(Reporter: decoder, Assigned: iain)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: assertion, regression, testcase, Whiteboard: [bugmon:update,bisected,confirmed])

Attachments

(3 files)

The following testcase crashes on mozilla-central revision 20250123-0da4e2f4d975 (debug build, run with --fuzzing-safe --ion-offthread-compile=off test.js):

for (;;)[][-this.a]

Backtrace:

received signal SIGSEGV, Segmentation fault.
0x00005d590b228c6d in js::jit::WarpOracle::createSnapshot() ()
#0  0x00005d590b228c6d in js::jit::WarpOracle::createSnapshot() ()
#1  0x00005d590b16b9a1 in js::jit::Compile(JSContext*, JS::Handle<JSScript*>, js::jit::BaselineFrame*, unsigned char*) ()
#2  0x00005d590b16c830 in IonCompileScriptForBaseline(JSContext*, js::jit::BaselineFrame*, unsigned char*) ()
#3  0x00005d590b16d19a in js::jit::IonCompileScriptForBaselineOSR(JSContext*, js::jit::BaselineFrame*, unsigned int, unsigned char*, js::jit::IonOsrTempData**) ()
#4  0x000033ce53a060a6 in ?? ()
#5  0x0000000000000000 in ?? ()
rax	0x5d5908a74f37	102636978655031
rbx	0x7ffde1960e50	140728388161104
rcx	0x5d590bbe17d0	102637030479824
rdx	0x7e114e181723	138612789745443
rsi	0x0	0
rdi	0x7e114e182a60	138612789750368
rbp	0x7ffde1960db0	140728388160944
rsp	0x7ffde1960d10	140728388160784
r8	0x79	121
r9	0x0	0
r10	0x0	0
r11	0x18	24
r12	0x0	0
r13	0x0	0
r14	0x7e114ad68518	138612735116568
r15	0xd7b8b2cb	3619205835
rip	0x5d590b228c6d <js::jit::WarpOracle::createSnapshot()+1885>
=> 0x5d590b228c6d <_ZN2js3jit10WarpOracle14createSnapshotEv+1885>:	movl   $0xce,0x0
   0x5d590b228c78 <_ZN2js3jit10WarpOracle14createSnapshotEv+1896>:	call   0x5d590a09dac0 <abort>
Attached file Testcase

Verified bug as reproducible on mozilla-central 20250124210220-39fbc3b843e9.
The bug appears to have been introduced in the following build range:

Start: fed3206664a81d1830242d0172f4e1919210692b (20250122214009)
End: c019b65d56985472dbfbd7a910f25afeaafc139f (20250122230128)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=fed3206664a81d1830242d0172f4e1919210692b&tochange=c019b65d56985472dbfbd7a910f25afeaafc139f

Whiteboard: [bugmon:update,bisect] → [bugmon:update,bisected,confirmed]

:iain there are two bugs in Comment 3 that might be the possible regressor?
Possibly Bug 1941947 or Bug 1941932
Could you confirm which bug is the correct regressor?

Flags: needinfo?(iireland)

This is a regression from bug 1941947.

In baseline, we get this CacheIR:

 WarpCacheIR (offset 18, JSOp::Neg)
    stubCode: 0x31968e872088
    stubInfo: 0x707513d2a260
    stubData: 0x7075121be3c0
    IR:
      GuardIsUndefined              inputId 0
      LoadDoubleConstant            valOffset 0, resultId 1
      DoubleNegationResult          inputId 1
      ReturnFromIC                  

  WarpCacheIR (offset 19, JSOp::GetElem)
    stubCode: 0x31968e8720b0
    stubInfo: 0x707513d28b80
    stubData: 0x7075121be420
    IR:
      GuardToObject                 inputId 0
      GuardSpecificValue            valId 1, expectedOffset 0
      ...

this.a is undefined. Converting undefined to a number produces NaN. Negating the NaN sets the sign bit, although it's obviously still NaN. So we guard that the input is undefined, then guard that the result is -GenericNaN().

When we transpile to Ion, GVN helpfully folds away the negation of the constant NaN.

Our transpiled GuardValue will see an input NaN with positive sign, but expect an input NaN with negative sign. branchTestValue does a bitwise comparison, so the two values aren't equal and we bail. Nothing has changed in baseline, so we recompile and trigger a bailout loop.

Doing a quick audit to see if this problem exists elsewhere:

MGuardValue::New is called in six places. Three of those only guard specific magic/null/undefined values. The others three come from transpilation of GuardSpecificValue, GuardFixedSlotValue, and GuardDynamicsSlotValue. The only potentially interesting use of Guard(Fixed|Dynamic)SlotValue is this GuardFixedSlotValue in tryAttachSpecializedFunctionBind, because the length is initialized as a double, but we've already checked that it's an Int32 before attaching. The only uses of GuardSpecificValue are the ones I added in bug 1941947.

There are three other callers of branchTestValue. branchTestMagicValue and loadDOMExpandoValueGuardGeneration are both guaranteed to be guarding values that aren't NaN. At first glance it looks like this debug assert in MacroAssembler::dateTimeFromSecondsIntoYear could have the same confusion between -NaN and NaN, but it turns out that we canonicalize before storing into the internal slots of the Date object, so we don't have to worry about it.

So this should be the only instance of the problem. Not entirely sure what the best fix is; it might make sense to special-case NaN in MacroAssembler::branchTestValue?

Flags: needinfo?(iireland)
Regressed by: 1941947

Set release status flags based on info from the regressing bug 1941947

Severity: -- → S3
Priority: -- → P3
Blocks: sm-opt-jits
Severity: S3 → S4
Duplicate of this bug: 1944297

We support two canonical NaN values (0x7ff8_0000_0000_0000 and 0xfff8_0000_0000_0000), which differ only in the sign bit. Our platforms with JIT support all generate one or the other for (eg) Infinity - Infinity, and you can convert between the two values by negating them (it's cheapest for hardware to just flip the sign bit instead of making an exception for NaNs.) If we're guarding on NaN, we ideally want to catch both cases. We can accomplish this by masking off the sign bit before comparing.

I used godbolt to find good codegen for this patch.

The 32-bit platforms are easy.

On x86-64, Clang generates the current sequence (load a large constant + and). GCC uses BTR (bit test and reset) to clear the sign bit. It looks like LLVM currently only generates BTR for -Os. We could consider using BTR here, but it seemed like overkill to add a whole new instruction to the assembler for this.

Arm64 and Loong64 have nice single-instruction ways to implement this. On risc-v, both GCC and Clang generate two shifts to clear the top bit, so I followed them.

The MIPS64 build currently appears to be broken, and godbolt wasn't giving an obvious good option, so I didn't touch it.

Assignee: nobody → iireland
Status: NEW → ASSIGNED
Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/bc445d1470af Mask off sign bit of NaN in branchTestValue r=jandem
Status: ASSIGNED → RESOLVED
Closed: 5 days ago
Resolution: --- → FIXED
Target Milestone: --- → 137 Branch

Testcase crashes using the initial build (mozilla-central 20250123095238-0da4e2f4d975) but not with tip (mozilla-central 20250208091603-053595a05e65.)

The bug appears to have been fixed in the following build range:

Start: c3fad748e37c53bcbcc913168cd9570a13ba4c0f (20250207175414)
End: 4e9c0a710588f889ae6c5b7272ea111c05515014 (20250207193227)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=c3fad748e37c53bcbcc913168cd9570a13ba4c0f&tochange=4e9c0a710588f889ae6c5b7272ea111c05515014

iain, can you confirm that the above bisection range is responsible for fixing this issue?
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.

Flags: needinfo?(iireland)
Keywords: bugmon
Flags: needinfo?(iireland) → in-testsuite+

The patch landed in nightly and beta is affected.
:iain, is this bug important enough to require an uplift?

  • If yes, please nominate the patch for beta approval.
  • If no, please set status-firefox136 to wontfix.

For more information, please visit BugBot documentation.

Flags: needinfo?(iireland)
Regressions: 1947125

This is just a perf issue in code that nobody would write in practice. It can ride the trains.

Flags: needinfo?(iireland)
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: