Closed Bug 1694600 Opened 4 years ago Closed 4 years ago

Assertion failure: cx_->hadNondeterministicException(), at /jit/WarpOracle.cpp:188

Categories

(Core :: JavaScript Engine: JIT, defect, P2)

x86_64
Linux
defect

Tracking

()

VERIFIED FIXED
88 Branch
Tracking Status
firefox-esr78 --- unaffected
firefox86 --- wontfix
firefox87 --- wontfix
firefox88 --- verified

People

(Reporter: decoder, Assigned: iain)

References

(Regression)

Details

(Keywords: assertion, regression, testcase, Whiteboard: [bugmon:update,bisected,confirmed])

Attachments

(3 files)

The following testcase crashes on mozilla-central revision 20210219-f2cb3ed27e68 (debug build, run with --fuzzing-safe --no-threads --ion-eager --baseline-warmup-threshold=0):

mathy0 = function() {};
mathy1 = (function(y) {
  (0x07fffffff >= y) ? undefined : (y ? undefined : y >>> 0);
  mathy0()(y >>> y);
})
try {
  mathy1(0);
} catch(e) {}
mathy0 = function() { throw 0; };
for (i = 0; i < 50; ++i)
  try {
    mathy1(-0x080000000);
  } catch(e) {}

Backtrace:

received signal SIGSEGV, Segmentation fault.
#0  0x0000555557922aeb in js::jit::WarpOracle::createSnapshot() ()
#1  0x00005555578bee96 in js::jit::CreateWarpSnapshot(JSContext*, js::jit::MIRGenerator*, JS::Handle<JSScript*>) ()
#2  0x000055555789e1d8 in js::jit::Compile(JSContext*, JS::Handle<JSScript*>, js::jit::BaselineFrame*, unsigned char*) ()
#3  0x000055555789efdd in IonCompileScriptForBaseline(JSContext*, js::jit::BaselineFrame*, unsigned char*) ()
#4  0x000029554de87be5 in ?? ()
[...]
#11 0x0000000000000000 in ?? ()
rax	0x55555572dfe3	93824994172899
rbx	0x7ffff6024000	140737320730624
rcx	0x555557fe9608	93825036883464
rdx	0x0	0
rsi	0x7ffff7105770	140737338431344
rdi	0x7ffff7104540	140737338426688
rbp	0x7fffffffb7a0	140737488336800
rsp	0x7fffffffb700	140737488336640
r8	0x7ffff7105770	140737338431344
r9	0x7ffff7f99840	140737353717824
r10	0x58	88
r11	0x7ffff6dac7a0	140737334921120
r12	0x7ffff5de9970	140737318394224
r13	0x7fffffffb7c0	140737488336832
r14	0x7ffff60ea498	140737321542808
r15	0x87a364d6	2275632342
rip	0x555557922aeb <js::jit::WarpOracle::createSnapshot()+1627>
=> 0x555557922aeb <_ZN2js3jit10WarpOracle14createSnapshotEv+1627>:	movl   $0xbc,0x0
   0x555557922af6 <_ZN2js3jit10WarpOracle14createSnapshotEv+1638>:	callq  0x555556a7ea5c <abort>
Attached file Testcase

Bugmon Analysis:
Verified bug as reproducible on mozilla-central 20210224100119-b3eb91f0b5a7.
The bug appears to have been introduced in the following build range:

Start: f4af0087a1b49c221f54143a10b7bebca35db49c (20210111195436)
End: febd0fad07331284c49334bab4d9c653f2c80275 (20210111195806)
Pushlog: https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=f4af0087a1b49c221f54143a10b7bebca35db49c&tochange=febd0fad07331284c49334bab4d9c653f2c80275

Whiteboard: [bugmon:update,bisect] → [bugmon:update,bisected,confirmed]

Here's a cleaned up version of the test:

function foo(fn, y) {
    var dummy = y > 0 ? 1 : 2;
    fn();
    return y * y;
}

function nop() {}
function throws() { throw 1; }

with ({}) {}
for (var i = 0; i < 500; i++) {
    foo(nop, 0)
    try {
	foo(throws, 0x7fffffff)
    } catch {}
}

In foo, we only reach y * y for non-overflowing values of y. In the cases where y * y would overflow, fn() throws first. However, the instruction reordering pass hoists the multiplication past the call. When it bails out, we don't change the CacheIR, because we never reach the Mul with the "bad" value.

The fix, as usual, is to disable this optimization if we have this kind of bailout. I considered doing something similar to LICM, where we check to see whether we attach a new stub before deciding to disable the optimization. Unlike LICM, though, which can easily cause bailouts but is also a big performance win, this bug is hard to trigger, and instruction reordering is just an incremental improvement. (The original bug says: "improves our octane-zlib score by 5% or so without seeming to move other octane tests".

(If we wanted, though, we could unify our handling of LICM, InstructionReordering, and HoistBoundsCheck to all use noteBaselineFallback.)

While simplifying this testcase, I also noticed an issue with ICScript::hash that could hypothetically cause us to miss some bailout loops. We currently hash together the address of the first stub with the entry counts for the subsequent stubs. But once the entry count of a subsequent stub is greater than 0, incrementing it further won't affect our transpilation choices. We can end up in a situation where we miss a bailout loop because of an unrelated IC with multiple active stubs. The fix is to instead hash enteredCount == 0 for each non-first stub.

Severity: -- → S4
Priority: -- → P2

Working under the assumptions that instructions can be reordered pretty often, and bailouts will very rarely occur because of reordering, I made this act like TranspiledCacheIR aside from setting the flag.

Driveby: tidied up some of the script flags to separate the bailout flags from the rest, and removed two unused flags.

Assignee: nobody → iireland
Status: NEW → ASSIGNED

Once the entry count of a non-first stub is greater than 0, incrementing it further won't affect our decision to transpile, so it shouldn't change the hash.

Depends on D106514

Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/691b620b590c Add BailoutKind::InstructionReordering r=jandem https://hg.mozilla.org/integration/autoland/rev/80f5cb43d64e Make ICScript::hash more precise r=jandem
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 88 Branch

Bugmon Analysis:
Verified bug as fixed on rev mozilla-central 20210227214359-abe97e99af31.
Removing bugmon keyword as no further action possible.
Please review the bug and re-add the keyword for further analysis.

Status: RESOLVED → VERIFIED
Keywords: bugmon
Has Regression Range: --- → yes
Regressions: 1745907
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: