Closed Bug 1914821 Opened 6 months ago Closed 6 months ago

Assertion failure: cx_->hadResourceExhaustion(), at /builds/worker/checkouts/gecko/js/src/jit/WarpOracle.cpp:206

Tracking

()

Status:

RESOLVED FIXED

Milestone:

132 Branch

Tracking Flags:

Tracking

Status

firefox-esr115

---

wontfix

firefox-esr128

---

wontfix

firefox130

---

wontfix

firefox131

---

wontfix

firefox132

---

fixed

People

(Reporter: tsmith, Assigned: iain)

References

(Blocks 2 open bugs,
URL
)

Details

(Keywords: assertion, pernosco)

Attachments

(1 file)

Bug 1914821: Clear IC hash when folding stub r=jandem 6 months ago Iain Ireland [:iain] 48 bytes, text/x-phabricator-request		Details \| Review

Tyson Smith [:tsmith]

Reporter

Description

•

6 months ago

Found with m-c 20240824-4da5ac911a89 (--enable-debug --enable-fuzzing)

This was found by visiting a live website with a debug build.

STR:

Launch browser and visit site

This issue was triggered by visiting http://www.tandfonline.com/doi/epdf/10.1080/01603477.2024.2355452.

Assertion failure: cx_->hadResourceExhaustion(), at /builds/worker/checkouts/gecko/js/src/jit/WarpOracle.cpp:206

#0 0x75d575ecd9bc in js::jit::WarpOracle::createSnapshot() /builds/worker/checkouts/gecko/js/src/jit/WarpOracle.cpp:205:5
#1 0x75d57636af73 in js::jit::CreateWarpSnapshot(JSContext*, js::jit::MIRGenerator*, JS::Handle<JSScript*>) /builds/worker/checkouts/gecko/js/src/jit/Ion.cpp:1653:48
#2 0x75d57632a236 in IonCompile /builds/worker/checkouts/gecko/js/src/jit/Ion.cpp:1723:41
#3 0x75d57632a236 in js::jit::Compile(JSContext*, JS::Handle<JSScript*>, js::jit::BaselineFrame*, unsigned char*) /builds/worker/checkouts/gecko/js/src/jit/Ion.cpp:1916:24
#4 0x75d57632b56d in BaselineCanEnterAtEntry /builds/worker/checkouts/gecko/js/src/jit/Ion.cpp:2048:25
#5 0x75d57632b56d in IonCompileScriptForBaseline(JSContext*, js::jit::BaselineFrame*, unsigned char*) /builds/worker/checkouts/gecko/js/src/jit/Ion.cpp:2173:12
#6 0x875c6f8f9d5  ([anon:js-executable-memory]+0x129d5)

Tyson Smith [:tsmith]

Reporter

Comment 1

•

6 months ago

A Pernosco session is available here: https://pernos.co/debug/UzwlKYcvsmyfOH8Zn58ATQ/index.html

Keywords: pernosco

Iain Ireland [:iain]

Assignee

Comment 2

•

6 months ago

Thanks for the pernosco recording. It is a bit tricky to debug bailout jitcode with an optimized build, but I think I've worked out roughly what's going on. At the very least, here's a testcase that triggers the assertion in the shell.

var arr = [];
for (var i = 0; i < 8; i++) {
  arr.push({x: 1, ["y" + i]: 1});
}

function foo(o, o2, cond) {
  var result = 0;
  for (var i = 0; i < 2; i++) {
    if (cond) {
      result += o.x; // Potentially invalid unbox
    }
  }
  result += o2.x; // Stub folded GetProp
  return result;
}

with ({}) {}
// Ion-compile
for (var i = 0; i < 2000; i++) {
  foo({x: 1}, arr[i%6], i % 2 == 0);
}

// Bail out 
for (var i = 0; i < 10; i++) {
  foo(undefined, arr[6], false);
}
foo({x:1}, arr[7], false);

for (var i = 0; i < 2000; i++) {
  foo({x: 1}, arr[i%6], i % 2 == 0);
}

This assertion doesn't affect correctness. It's just a way of catching potential performance problems due to unnecessary bailout loops..

The issue is a combination of our handling of LICM-induced bailouts and stub folding. We Ion-compile foo. Inside a conditional, inside a loop, we have an MUnbox that is speculatively hoisted outside the loop. After the loop, we have a getprop that is called with multiple shapes, in a way that allows us to do stub folding.

After we've compiled, we call with a carefully crafted set of arguments. The hoisted unbox fails (because undefined is not an object). We bail out with BailoutKind::LICM, updating licmState from NeverBailed to Bailed. Because cond is false, we don't actually execute the corresponding unbox in baseline. Normally, this would cause us to invalidate next time we did an LICM bail. However, after exiting the loop, we see a new shape and do stub folding. This accomplishes two things. First, it calls js::jit::MaybeNotifyWarp to inform Warp that we hit a fallback stub, which updates licmState to BailedAndHitFallback. Secondly, it eats the bailout in a way that doesn't change the CacheIR.

On the next iteration, we bail out at the same unbox. This time, though, because licmState is BailedAndHitFallback, we treat it like a regular bailout. We increment the numFixableBailouts counter and bail out to blinterp. In blinterp, nothing gets updated, because we don't hit the unbox and have already folded in the new shape.

This continues until numFixableBailouts hits 9. This time, we pass in a new shape and don't trigger the unbox failure. We bail out. Normally, we would bail out, fold the shape into the getprop stub, and avoid invalidating. However, because we've been messing around with LICM bailouts, numFixableBailouts hits 10 and triggers an immediate invalidation. We fold the stub in blinterp, but there's no IonScript left to notify. When we recompile, the only thing that's changed is the number of shapes in the stub, which isn't included in the hash, so we trigger an assertion.

We did actually make progress here, so this isn't a real bailout loop. I don't think this should be a performance issue in practice. I haven't managed to figure out how the bug corresponds to the website's source code, although I can say that we inline this function:

function nextChar() { return this.currentChar = this.stream.getByte() }

inside this function:

function getObj() {
  for (var e = !1, t = this.currentChar;;) {
    if (t < 0) return n.EOF;
    if (e) 10 !== t && 13 !== t || (e = !1);
    else if (37 === t) e = !0;
    else if (1 !== w[t]) break;
    t = this.nextChar()
  }
  /* a bunch of code omitted here, but it doesn't include loops so I think it's mostly irrelevant? */
}

(both from the script here).

Not entirely sure what the best fix is. I'll have to think about it.

Nicolas B. Pierron [:nbp]

Updated

•

6 months ago

Blocks: sm-opt-jits

Severity: -- → S4

Priority: -- → P1

Iain Ireland [:iain]

Assignee

Comment 3

•

6 months ago

After thinking about this, I think the best fix is to just clear out the failed IC hash if we fold a transpiled stub and there's no IonScript.

The ultimate problem is that, when we bail out from an LICM-hoisted guard and check to see if we hit a fallback in baseline, we don't have sufficiently precise tracking to validate that the fallback corresponds to the failing guard. If we hit a different fallback, we get a false negative on the "was this bailout caused by LICM?" question. I was temporarily optimistic that we could check whether we hit a fallback in subsequent bailouts and deduce something from that, but upon reflection we expect to see the same thing in both the good and bad cases, so it doesn't work.

This situation is a bit awkward, but has limited downside: we end up with one extra cycle of invalidation/recompilation before we disable LICM. Given how rarely we expect to end up in this case, I think that's basically fine.

So the remaining problem is just that we report this as a bailout loop because we don't think we've made any progress, and we don't think we've made any progress because we save the hash before stub folding has a chance to reset the counter. If we just plug that hole, I think everything works out.

Iain Ireland [:iain]

Assignee

Comment 4

•

6 months ago

Attached file Bug 1914821: Clear IC hash when folding stub r=jandem — Details

Phabricator Automation

Updated

•

6 months ago

Assignee: nobody → iireland

Status: NEW → ASSIGNED

Mayank Bansal

Comment 5

•

6 months ago

The patch here may need some help in landing.

Matthew Gaudet (he/him) [:mgaudet]

Comment 6

•

6 months ago

Thanks -- I have hit the button. Fingers crossed it sticks :P

Pulsebot

Comment 7

•

6 months ago

Pushed by mgaudet@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/fb1823163a73 Clear IC hash when folding stub r=jandem

Norisz Fay [:noriszfay]

Comment 8

•

6 months ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/fb1823163a73

Status: ASSIGNED → RESOLVED

Closed: 6 months ago

status-firefox132: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 132 Branch

BugBot [:suhaib / :marco/ :calixte]

Comment 9

•

6 months ago

The patch landed in nightly and beta is affected.
:iain, is this bug important enough to require an uplift?

If yes, please nominate the patch for beta approval.
If no, please set status-firefox131 to wontfix.

For more information, please visit BugBot documentation.

Flags: needinfo?(iireland)

Nicolas B. Pierron [:nbp]

Comment 10

•

6 months ago

Given comment 2: “This assertion doesn't affect correctness. It's just a way of catching potential performance problems due to unnecessary bailout loops..”
I guess this can wait if we are not confident.

Jan, what do you think, should this follow the train or be uplifted? (see comment 9)

Flags: needinfo?(iireland) → needinfo?(jdemooij)

Ryan VanderMeulen [:RyanVM]

Updated

•

6 months ago

status-firefox130: --- → wontfix

status-firefox-esr115: --- → wontfix

status-firefox-esr128: --- → wontfix

Flags: in-testsuite+

Jan de Mooij [:jandem]

Updated

•

6 months ago

status-firefox131: affected → wontfix

Flags: needinfo?(jdemooij)

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Assertion failure: cx_->hadResourceExhaustion(), at /builds/worker/checkouts/gecko/js/src/jit/WarpOracle.cpp:206

Categories

(Core :: JavaScript Engine: JIT, defect, P1)

Tracking

()

People

(Reporter: tsmith, Assigned: iain)

References

(Blocks 2 open bugs,
URL
)

Details

(Keywords: assertion, pernosco)

Crash Data

Security

(public)

User Story

Attachments

(1 file)

Description

Comment 1

Comment 2

Updated

Comment 3

Comment 4

Updated

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Updated

Updated

Attachment

General

Description

File Name

Content Type