Closed Bug 1559099 Opened 2 years ago Closed 2 years ago

Cranelift: Segfault crash

Categories

(Core :: Javascript: WebAssembly, defect, P3)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla69
Tracking Status
firefox-esr60 --- unaffected
firefox67 --- wontfix
firefox68 --- wontfix
firefox69 --- fixed

People

(Reporter: gkw, Assigned: bbouvier)

References

Details

(4 keywords, Whiteboard: [jsbugmon:][adv-main69-])

Attachments

(4 files, 2 obsolete files)

The following testcase crashes on mozilla-central revision 1cd7d9c0a6c3 (build with --enable-debug --enable-more-deterministic, run with --fuzzing-safe --no-threads --no-baseline --no-ion --wasm-compiler=cranelift w10-out.wrapper w10-out.wasm):

See attachment.

Backtrace:

Thread 1 "js-dbg-64-dm-li" received signal SIGILL, Illegal instruction.
0x0000389b3f0ef641 in ?? ()
(gdb) bt
#0 0x0000389b3f0ef641 in ?? ()
#1 0x00007fffffffb790 in ?? ()
#2 0x0000000000000000 in ?? ()
(gdb)

For detailed crash information, see attachment.

Setting s-s as a start because I'm not sure how bad this SIGILL crash is.

Summary: SIGILL crash with wasm and cranelift → Crash with Illegal instruction (SIGILL) with wasm and cranelift

Gary, did this ever reproduce for you outside of GDB? I can reproduce it in GDB, but I can continue. WebAssembly uses SIGILL for traps, in that case it could not be a bug.

Flags: needinfo?(nth10sd)

Actually, after running for a long time I got a segfault outside of GDB. I will try to get a correct backtrace.

[fuzz-exec] calling $func_4519
[LoggingExternalInterface logging i32.const -1236489146]

Thread 1 "js" received signal SIGSEGV, Segmentation fault.
0x000005511644f7bc in ?? ()
(gdb) c
Continuing.

Thread 1 "js" received signal SIGSEGV, Segmentation fault.
0x000005511644f7bc in ?? ()
(gdb) bt
#0  0x000005511644f7bc in ?? ()
#1  0x0000000000000000 in ?? ()
(gdb) x /i $pc
=> 0x5511644f7bc:       movslq (%rax,%rcx,4),%rcx
(gdb) info reg rax rcx
rax            0x5511644fa68    5845824109160
rcx            0xfffffff2       4294967282
(gdb) c
Continuing.

Program terminated with signal SIGSEGV, Segmentation fault.
The program no longer exists.
(gdb)
Flags: needinfo?(nth10sd)

autobisectjs shows this is probably related to the following changeset:

The first bad revision is:
changeset: https://hg.mozilla.org/mozilla-central/rev/17ee66cc26ef
user: Mike Hommey
date: Wed Feb 06 03:05:52 2019 +0000
summary: Bug 1522609 - Pass compilers and flags down to cargo on sanitizer/fuzzying/ccov builds. r=ted

Totally unsure if this is correct. Might be something related to cranelift/rust component compilation that's throwing it off. Setting needinfo? from Benjamin and Lars as a start.

Flags: needinfo?(lhansen)
Flags: needinfo?(bbouvier)

This is not urgent, as Cranelift is not in any product (it's in Nightly by opt-in only), but it would be good to make sure we're not putting partners at risk. Will attempt to look into this before long.

Severity: critical → major
Priority: -- → P3

Regressed bug is probably incorrect; I suspect Cranelift just never generated the right code for this.

No longer regressed by: 1522609
Flags: needinfo?(bbouvier)
Summary: Crash with Illegal instruction (SIGILL) with wasm and cranelift → Segfault with wasm and cranelift
Flags: needinfo?(bbouvier)
Attached file wrapper.js

Note: the following JS file does run into the same crash, but only calls one wasm function. I wonder if we could use the wasm-bindgen (esp. the wasm-gc tool) to remove all functions that aren't transitively called by this wasm function, to make compilation faster.

Flags: needinfo?(lhansen)
Summary: Segfault with wasm and cranelift → Cranelift: Segfault crash
Attached file Smaller wat test case

A smaller wat test case, extracted from the big wasm thanks to binaryen. Unfortunately, it seems that we need binaryen's wasm-as to make it a correct wasm file, because Spidermonkey's wasmTextToBinary creates an invalid binary (the error is that we're trying to pop an empty stack at some point; that's something else to investigate).

Assignee: nobody → bbouvier
Status: NEW → ASSIGNED
Flags: needinfo?(bbouvier)

Found it: TLDR is that LICM isn't aware that jump_table_entry can load data, thus it would hoist it above a loop before the out-of-bounds check (for the jump table) was done. Patch incoming.

PR is in https://github.com/CraneStation/cranelift/pull/805 and will need a review. I'll add the Cranelift bump here, as well as supplementary code needed to not panic in our bindings.

Changes have been reviewed on the Cranelift side; this is just a bump to the
latest version.

Attachment #9074775 - Attachment description: Bug 1559099: Don't check metadata for JumpTableEntry instructions; r?lth → Bug 1559099: Don't check metadata for JumpTableEntry instructions; r=lth

This has landed on inbound: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=c4ed2de25e6db29bc346f82b8a198c5b4a533b22

b684d4532b91 Bug 1559099: Don't check metadata for JumpTableEntry instructions; r=lth
e73151bcaff8 Bug 1559099: Bump Cranelift in Spidermonkey to e455f6ae; r=lth

Group: javascript-core-security → core-security-release
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla69
Blocks: 1560277
Attachment #9071850 - Attachment is obsolete: true
Whiteboard: [jsbugmon:] → [jsbugmon:][adv-main69-]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.