Crash [@ js::NativeObject::setEmptyDynamicSlots] with OOM
Categories
(Core :: JavaScript Engine, defect, P1)
Tracking
()
People
(Reporter: decoder, Assigned: jonco)
Details
(Keywords: crash, regression, testcase, Whiteboard: [bugmon:update,bisected,confirmed])
Crash Data
Attachments
(3 files)
The attached testcase crashes on mozilla-central revision 20230629-c93a9e0ad90d (debug build, run with --fuzzing-safe --ion-offthread-compile=off --fast-warmup).
Backtrace:
received signal SIGSEGV, Segmentation fault.
#0 0x0000555556d50f4f in js::NativeObject::setEmptyDynamicSlots(unsigned int) ()
#1 0x0000555557070eae in js::NativeObject::allocateInitialSlots(JSContext*, unsigned int) ()
#2 0x00005555579f5762 in js::jit::CreateMatchResultFallbackFunc(JSContext*, js::gc::AllocKind, unsigned long) ()
#3 0x0000183b4b611533 in ?? ()
#4 0x0000000000000000 in ?? ()
rax 0x0 0
rbx 0x25e6e8b60128 41673676947752
rcx 0x7fffffffaa90 140737488333456
rdx 0x555555909201 93824996119041
rsi 0x0 0
rdi 0x25e6e8b60128 41673676947752
rbp 0x7fffffffa350 140737488331600
rsp 0x7fffffffa350 140737488331600
r8 0x0 0
r9 0x0 0
r10 0x7fffffffa608 140737488332296
r11 0x7ffff3e24c30 140737285082160
r12 0x7ffff3e2e100 140737285120256
r13 0x40 64
r14 0x0 0
r15 0x6 6
rip 0x555556d50f4f <js::NativeObject::setEmptyDynamicSlots(unsigned int)+15>
=> 0x555556d50f4f <_ZN2js12NativeObject20setEmptyDynamicSlotsEj+15>: mov 0x8(%rax),%eax
0x555556d50f52 <_ZN2js12NativeObject20setEmptyDynamicSlotsEj+18>: test $0x10,%al
This testcase is super fragile, likely some form of OOM and I can't reduce it further. I've hit this particular crash quite a few times already but wasn't able to reproduce, so now that we have a reproducer, I suggest we try to diagnose this as quickly as possible before it breaks again.
| Reporter | ||
Comment 1•2 years ago
|
||
| Reporter | ||
Comment 2•2 years ago
|
||
Comment 3•2 years ago
|
||
Verified bug as reproducible on mozilla-central 20230705154257-9ab7ea679b12.
Unable to bisect testcase (Unable to launch the start build!):
Start: fc7fbf3a78e0776b0a0ff21cf27151e6b6aa5970 (20220706094022)
End: c93a9e0ad90d7eca1077e5b52a84b577549bc02e (20230629085424)
BuildFlags: BuildFlags(asan=False, tsan=False, debug=True, fuzzing=False, coverage=False, valgrind=False, no_opt=False, fuzzilli=False, nyx=False)
Comment 4•2 years ago
|
||
Reproduced this locally. I can see why it's so fragile.
We're allocating the match result for a regexp. We try to allocate it in the nursery, but the nursery is full, so we call into the VM to allocate it. We will initialize it in jitcode after we return from the VM call.
Inside the VM call, we successfully allocate a Cell for the ArrayObject, but when we go to allocate its initial slots, we trigger a simulated OOM, with impeccably bad timing. To avoid breaking GC compartment checks, we call initEmptyDynamicSlots. That's fine in release builds, but in a debug build, it hits a bunch of assertions. The assertions look at the shape of the object, which is a problem because the object isn't initialized and the shape pointer is null. We crash.
I think we could fix this by calling a different helper (initEmptyDynamicSlotsForUninitializedObject? makeUninitializedObjectSafeForGC?) that does the same thing without the assertions, but maybe there's a better approach.
Jon, you have been poking around at this code a bit recently. Mind taking another look?
| Assignee | ||
Comment 5•2 years ago
|
||
I think it would be simpler to create the match array using NewDenseFullyAllocatedArrayWithTemplate in the fallback (as CreateRegExpMatchResult does), rather splitting cell creation and initialisation between VM and JIT code as happens currently. Then the issue should not arise.
I noticed that the fallback uses the NoGC option when creating the cell though. Don't we use the fallback path when the nursery is full and we need to GC? I'm wondering if this will ever succeed in that case. How does fallback allocation normally work?
| Assignee | ||
Updated•2 years ago
|
Updated•2 years ago
|
| Assignee | ||
Comment 6•2 years ago
|
||
I wrote a patch for this but it turned out that it wasn't as simple as I thought.
| Assignee | ||
Comment 7•2 years ago
|
||
This changes the fallback path to use NewDenseFullyAllocatedArrayWithTemplate
to allocate this object as happens in the interpreter and removes the JIT
object initialisation.
There are a couple of wrinkles with this:
-
the contract seems to be not to GC, which required added an AllowGC template
parameter in a few places -
we can't depend on the template object object existing any more so we may
have to re-create it (arguably these should not be weakly held as there's
very little benefit)
Updated•2 years ago
|
Comment 8•2 years ago
|
||
Unable to reproduce bug 1841771 using build mozilla-central 20230629085424-c93a9e0ad90d. Without a baseline, bugmon is unable to analyze this bug.
Removing bugmon keyword as no further action possible. Please review the bug and re-add the keyword for further analysis.
Updated•2 years ago
|
Updated•2 years ago
|
Comment 10•2 years ago
|
||
| bugherder | ||
Updated•2 years ago
|
Description
•