Closed Bug 1892184 Opened 10 months ago Closed 9 months ago

Assertion failure: kind != BailoutKind::Unknown, at jit/shared/Lowering-shared.cpp:275

Categories

(Core :: JavaScript Engine: JIT, defect, P2)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
127 Branch
Tracking Status
firefox-esr115 --- unaffected
firefox125 --- wontfix
firefox126 --- wontfix
firefox127 --- fixed

People

(Reporter: gkw, Assigned: iain)

References

(Blocks 3 open bugs)

Details

(Keywords: regression, reporter-external, testcase)

Attachments

(2 files)

Attached file debug stack
function f() {}
function g() {
  ("" + 5).x;
}
Object.defineProperty(f, "toString", { e: true, e: true, value: g });
function h(x, y) {
  y.compile(x);
}
h(f, /x/);
oomTest(function() {
  function n() {
    return n;
  }
  class C extends n {}
  new n(Infinity, ..."ZP");
  this.trialInline();
  z;
});
js::jit::LIRGeneratorShared::assignSnapshot (this=<optimized out>, ins=<optimized out>, kind=<optimized out>) at /home/ubumain/trees/mozilla-central/js/src/jit/shared/Lowering-shared.cpp:275
275       MOZ_ASSERT(kind != BailoutKind::Unknown);
(gdb) bt
#0  js::jit::LIRGeneratorShared::assignSnapshot (this=<optimized out>, ins=<optimized out>, kind=<optimized out>) at /home/ubumain/trees/mozilla-central/js/src/jit/shared/Lowering-shared.cpp:275
#1  0x0000555558404658 in js::jit::LIRGenerator::visitToNumberInt32 (this=0x7ffff6000808, convert=0x7ffff4d49cd8) at /home/ubumain/trees/mozilla-central/js/src/jit/Lowering.cpp:2988
#2  0x000055555843fd0c in js::jit::LIRGenerator::visitInstruction (this=this@entry=0x7ffff6000808, ins=ins@entry=0x7ffff4d49cd8) at /home/ubumain/trees/mozilla-central/js/src/jit/Lowering.cpp:7348
#3  0x000055555844042f in js::jit::LIRGenerator::visitBlock (this=this@entry=0x7ffff6000808, block=block@entry=0x7ffff4d3f4d0) at /home/ubumain/trees/mozilla-central/js/src/jit/Lowering.cpp:7418
#4  0x000055555844081e in js::jit::LIRGenerator::generate (this=0x7ffff6000808) at /home/ubumain/trees/mozilla-central/js/src/jit/Lowering.cpp:7492
#5  0x000055555836a71d in js::jit::GenerateLIR (mir=mir@entry=0x7ffff4d33180) at /home/ubumain/trees/mozilla-central/js/src/jit/Ion.cpp:1546
#6  0x000055555836aba1 in js::jit::CompileBackEnd (mir=0x7ffff4d33180, snapshot=<optimized out>) at /home/ubumain/trees/mozilla-central/js/src/jit/Ion.cpp:1635
#7  0x000055555838f29e in js::jit::IonCompileTask::runTask (this=0x7ffff4d347b8) at /home/ubumain/trees/mozilla-central/js/src/jit/IonCompileTask.cpp:52
/snip

Run with --fuzzing-safe --ion-eager --fast-warmup --trial-inlining-warmup-threshold=0, compile with AR=ar sh ../configure --enable-debug --enable-debug-symbols --with-ccache --enable-nspr-build --enable-ctypes --enable-gczeal --enable-rust-simd --disable-tests, tested on m-c rev 795f884f7ce1.

Note that this testcase may be slightly intermittent, but is still fairly reproducible.

Setting s-s as a start. I'm working on finding a better regression range but this issue seems to already exist as of m-c rev 443c7bf9d76b (Aug 2023).

Flags: sec-bounty?
Group: core-security → javascript-core-security

This one is weird. On the original machine where this happened, it's a regression range between m-c rev 98796da1f171 (mid-May 2023) to m-c rev e963fffcb3a0 (late-Jun 2023), while accounting for the intermittence, but on my current machine it seems to pointing at something more recent, somewhere between m-c rev 443c7bf9d76b (late-Aug 2023) and m-c rev 7eddf33b21eb (late-Oct 2023).

I'll try and get the latter, because in any case, it's likely from last year.

The first bad revision is:
changeset:   https://hg.mozilla.org/mozilla-central/rev/b3b64c6b12cd
user:        Jan de Mooij
date:        Thu Aug 31 14:06:30 2023 +0000
summary:     Bug 1850814 - Add post barrier fast path for last-buffered-cell also to Baseline and IC code. r=jonco

Jan, is bug 1850814 a likely regressor?

Flags: needinfo?(jdemooij)
Keywords: regression
Regressed by: 1850814

Set release status flags based on info from the regressing bug 1850814

On the original machine where this happened, it's a regression range between m-c rev 98796da1f171 (mid-May 2023) to m-c rev e963fffcb3a0 (late-Jun 2023), while accounting for the intermittence

And on the original machine, it seems to point here:

The first bad revision is:
changeset:   https://hg.mozilla.org/mozilla-central/rev/1a1a14b6af4e
user:        Iain Ireland
date:        Tue Jun 13 19:19:16 2023 +0000
summary:     Bug 1837157: Don't transpile ICs with failures r=jandem

I'll leave it to Jan to decide which one is correct.

This is flaky, but I managed to reproduce it locally.

The problem is that this code in ComparePolicy::adjustInputs, which currently has no test coverage, generates an MToNumberInt32 node without a bailout kind, which triggers an assertion. This only matters for bailout loop detection, which only matters for performance, so this is not security sensitive.

I will try to write a more robust testcase and then figure out what we want to do if this bails out.

Group: javascript-core-security
Flags: needinfo?(jdemooij)

:jandem, since you are the author of the regressor, bug 1850814, could you take a look? Also, could you set the severity field?

For more information, please visit BugBot documentation.

Flags: needinfo?(jdemooij)
Flags: needinfo?(jdemooij)
No longer regressed by: 1850814
Blocks: sm-opt-jits
Severity: -- → S4
Priority: -- → P2

Writing a more robust testcase turns out to be extremely difficult and not worth it.

Roughly speaking, what's going on here is that we inline string iteration code (for ..."ZP") into the function being OOM-tested. We end up with something along the lines of:

var result = {index: 0};
for (...) {
  ...
  if (result.index >= ...) {...}
  result.index += ...;
}

Because this is implemented in self-hosted code, we use UnsafeGetInt32FromReservedSlot to load index. We generate an Int32-typed comparison for the >=. In scalar replacement, we manage to eliminate the object allocation entirely. We replace the loadSlot for result.index with a phi, where one input to the phi is 0 and the other is the result of the += in the loop body.

This is almost always fine: both inputs to the phi are Int32, so we can conclude that the phi is also Int32. However, in this testcase, a perfectly timed OOM prevents us from attaching an IC stub for the addition. We end up generating an Ion IC instead. Now, one of the inputs to the phi is Value-typed, making the phi Value-typed, so the compare's type policy kicks in and generates a conversion node. It doesn't look like there's any normal way to reach this code, so we hadn't previously assigned a bailout kind. In practice, this conversion will never bail out, but the assertion doesn't know that.

It doesn't feel great to have a bunch of code in ComparePolicy::adjustInputs that almost never runs. As far as I can tell, the only effectful code anywhere in this function that is reachable via our testing suite is this code that converts from Double to Float32, which kicks in for this test when we compare two floating point constants, because this code changes the type of an MCompare from double to float32 and relies on the type policy to insert conversions.

We should clean this up somehow.

To bang my favourite drum for a moment: this is a (relatively innocuous) example of the kind of unexpected edge case we can hit because of small OOM recovery.

Most of this code will do nothing under normal circumstances, because we already inserted the correct conversions in the warp transpiler. The exception is double to float32 conversion, which could be eliminated by rewriting MCompare::trySpecializeFloat32. However, since there's at least one abnormal case where this code does something, it might be best to leave it in place for now until we come up with something better.

The existing code here generated infallible conversions. I've changed it to use fallible conversions instead, because performance doesn't matter here, and I would prefer to bail out safely if something funny is going on.

The part of this patch that actually fixes the bug is the line that assigns BailoutKind::TypePolicy. I've verified that it prevents the bug from triggering in the original testcase over several hundred runs.

I'm leaving out the testcase because it depends on a precisely timed OOM preventing us from attaching a single IC stub in self-hosted code, and there's no way to write that that isn't hilariously fragile. The original testcase only triggers ~1/10 times on my machine.

Assignee: nobody → iireland
Status: NEW → ASSIGNED
Pushed by iireland@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/20d7fb4043b6 Simplify ComparePolicy::adjustInputs r=jandem
Status: ASSIGNED → RESOLVED
Closed: 9 months ago
Resolution: --- → FIXED
Target Milestone: --- → 127 Branch
Flags: sec-bounty? → sec-bounty-
Regressions: 1894456
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: