Loading site triggers SIGILL
Categories
(Core :: JavaScript: WebAssembly, defect, P1)
People
(Reporter: tsmith, Assigned: jseward)
References
(Blocks 1 open bug, )
Details
(Keywords: crash, pernosco, sec-moderate, Whiteboard: [adv-main127+r])
Attachments
(8 files)
- 852 bytes, application/x-javascript
- 4.27 KB, patch
- 6.08 KB, text/plain
- 10.82 KB, text/plain
- 3.67 KB, text/plain
- 7.35 KB, text/plain
- 48 bytes, text/x-phabricator-request
- 48 bytes, text/x-phabricator-request
Found with m-c 20240423-4630feab4425 (--enable-debug)
This was found by visiting a live website with a debug build.
STR:
- Launch browser and visit site
This issue was triggered by visiting http://interactive.index.hr/.
A Pernosco session is available here: https://pernos.co/debug/Dboj3Qti_SpayKs1wCrQHA/index.html#f{m[AcwJ,AbT5_,t[AcI,KZE_,f{e[Acl1,jgy2_,s{acNH/AAAA,bAbs,uLgy95w,oLiPC7g___/
Comment 1•2 years ago
It seems like the Pernosco session might be broken - I don't see a SIGILL, and opening the GDB panel actually produces an internal error, at which point the whole Pernosco UI stops working properly. Could you look into it and possibly send us another trace?
Reporter
Comment 2•2 years ago
Open toolbox -> signals received.
I will also contact the Pernosco team about the Pernosco UI.
Comment 3•2 years ago
Comment 4•2 years ago
Comment on attachment 9399235 [details]
jsshell --wasm-compiler=baseline test case
Pernosco recording https://pernos.co/debug/qoPS6ZHr1SYudsji8u0jsQ/index.html#f{m[A7s,AR/u_,t[AQ,NFGz_,f{e[A7c,BFOR_,s{aVdmgPqAA,bBQ,uBVZEjA,oBWVL7w___/
Assignee
Comment 5•2 years ago
Some random observations (may be irrelevant):
I can reproduce on x86_64-linux with m-c 737006:650dda918743, unoptimised
debug build, with the test case in comment 3, with
--no-threads --wasm-compiler=baseline.
With --wasm-compiler=ion, it does not reproduce.
The two x.exports.f() lines are both necessary; with only one, it does not
reproduce.
With valgrind and also GDB I see the pattern of two SIGILLs followed by a
SIGTRAP. The two SIGILLs have the same stack:
valgrind: Unrecognised instruction at address 0x215e1447c10e.
at 0x215E1447C10E: ???
by 0x215E1447C16A: ???
by 0x215E1447C2AA: ???
by 0x215E1447C4B3: ???
by 0x3C77A0F: js::wasm::Instance::callExport(JSContext*, unsigned int, JS::CallArgs const&, js::wasm::CoercionLevel) (src/wasm/WasmInstance.cpp:3268)
and the SIGTRAP has the same two frames at the bottom ('A0F, 'C4B3) but a
different tail:
Process terminating with default action of signal 5 (SIGTRAP)
at 0x215E1447C3AE: ???
by 0x215E1447C2E0: ???
by 0x215E1447C4B3: ???
by 0x3C77A0F: js::wasm::Instance::callExport(JSContext*, unsigned int, JS::CallArgs const&, js::wasm::CoercionLevel) (src/wasm/WasmInstance.cpp:3268)
It might be that the routine 0x215E1447C2AA in the SIGILLs runs a few
instructions further (to 0x215E1447C2E0), makes another call, and that traps
(at 0x215E1447C3AE).
With --wasm-compiler=ion, I see the two SIGILLs but no SIGTRAP.
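The shape of the reduction described above can be sketched as follows. This is a hypothetical stand-in, not the attached testcase: the module bytes below encode a trivial single-function module exporting `f` (returning 42), whereas the real reduction had three functions ($f1, $f2, and the export). Only the calling pattern, including the two `x.exports.f()` calls, mirrors the report.

```javascript
// Hypothetical sketch of the testcase shape: instantiate a wasm module
// and call the exported "f" twice, as in the reduction. The bytes below
// encode a trivial module whose f() returns 42; they are a stand-in,
// NOT the module from the actual attachment.
const bytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic + version
  0x01, 0x05, 0x01, 0x60, 0x00, 0x01, 0x7f,       // type section: () -> i32
  0x03, 0x02, 0x01, 0x00,                         // func 0 has type 0
  0x07, 0x05, 0x01, 0x01, 0x66, 0x00, 0x00,       // export "f" = func 0
  0x0a, 0x06, 0x01, 0x04, 0x00, 0x41, 0x2a, 0x0b, // body: i32.const 42; end
]);
const x = new WebAssembly.Instance(new WebAssembly.Module(bytes));
// Both calls were needed to reproduce; with only one, it did not reproduce.
x.exports.f();
x.exports.f();
```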
Assignee
Comment 6•2 years ago
Using the attached patch, and
./dist/bin/js --no-ion --no-threads --wasm-compiler=baseline \
../../bug1893915-testcase-from-comment4.js
it is possible to control individually which compiler (Ion vs Baseline) is
used for each wasm function (see hacks in ExecuteCompileTask).
From this it is clear that it doesn't matter which compiler compiles
func $f1 and func (export "f"). It is func $f2 that matters; with
Baseline we get SIGTRAP, with Ion there's no SIGTRAP. So it seems like it's
the code generation of $f2 on Baseline that is somehow wrong.
Of course that is hardly a surprise -- of the 3 functions this is by far the
most complex.
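The per-function bisection described above (selecting the compiler per wasm function until the culprit is isolated) can be sketched generically. Everything here is illustrative: `findCulprit` and the `reproduces` callback are invented names, standing in for "recompile with this per-function compiler assignment and check whether the SIGTRAP appears"; the actual hack lived in ExecuteCompileTask.

```javascript
// Generic sketch of the bisection idea: given N wasm functions and a way
// to choose a compiler per function, find which single function, when
// compiled with Baseline, makes the failure appear.
// `reproduces(assignment)` is a hypothetical callback: run the testcase
// with that compiler assignment and report whether it crashed.
function findCulprit(numFuncs, reproduces) {
  for (let i = 0; i < numFuncs; i++) {
    const assignment = Array(numFuncs).fill("ion");
    assignment[i] = "baseline"; // compile only function i with Baseline
    if (reproduces(assignment)) return i;
  }
  return -1; // no single-function culprit found
}
```

In this bug, only compiling $f2 with Baseline reproduced the SIGTRAP, so a simulated run with three functions where index 2 is the culprit returns 2.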
Assignee
Comment 8•2 years ago
Per comment 6, Ion code for function $f2. Perhaps we can deduce what the
problem is by comparing these two.
Assignee
Comment 9•2 years ago
Cleaned up Baseline code for function $f2 (with redundant labels removed)
Assignee
Comment 10•2 years ago
Cleaned up Ion code for function $f2
Comment 11•2 years ago
Comment 12•2 years ago
We ran this in Ion for comparison, since Ion does not crash. Pernosco session: https://pernos.co/debug/T0cREOVqbf1RT4mYE86UGw/index.html#f{m[A+s,1UY_,t[AQ,jtU_,f{e[A+s,0/o_,s{aYP6a/sAA,bBQ,uBU63Fg,oBVNjHQ___/
The difference with Ion seems to be that some other instance calls run, which end up setting and clearing packedExitFP_. Therefore, by the time we call the imported JS function, packedExitFP_ is null again and the assert does not fire.
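A minimal model of the invariant under discussion, in the spirit of the analysis above. This is a toy sketch: the field name packedExitFP_ mirrors the engine's, but the class and methods are invented for illustration and are not SpiderMonkey API.

```javascript
// Toy model of the exit-FP bookkeeping. Recording an exit FP asserts that
// no stale exit FP is still set on the activation; the bug was that an
// exception unwinding out of wasm could leave it set, so the next call to
// an imported JS function tripped the (debug-only) assert.
class Activation {
  constructor() { this.packedExitFP_ = null; }
  setExitFP(fp) {
    // The debug assert that fires in the bug.
    if (this.packedExitFP_ !== null) {
      throw new Error("MOZ_ASSERT failure: exit FP already set");
    }
    this.packedExitFP_ = fp;
  }
  clearExitFP() { this.packedExitFP_ = null; }
}

const act = new Activation();

// Call out of wasm: the exit FP is recorded...
act.setExitFP(0x1000);
// ...an exception unwinds; the buggy path skipped the clear here. The fix
// ("Always clear exit FP when handling wasm exceptions") does:
act.clearExitFP();

// Now the next imported-JS call can record its exit FP without asserting.
act.setExitFP(0x2000);
act.clearExitFP();
```

This also matches the Ion observation above: intervening instance calls happened to set and clear packedExitFP_, so it was null again by the time the assert was checked.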
Comment 13•2 years ago
Depends on D209039
Comment 14•2 years ago
This bug probably does not need a very high security rating. We trip a debug assert, but the exit FP information in JIT activations is often stale and regularly reset to other values whenever it would be important (e.g. when starting a trap). It's not clear to any of us how this could actually be exploited. Still, we might as well keep the bug secure, just with a low or moderate severity.
Comment 15•2 years ago
Comment 16•2 years ago
Comment 17 hidden (obsolete)
Comment 18•1 year ago
(In reply to Acasandrei Beatrice (needinfo me) from comment #17)
(In reply to Pulsebot from comment #15)
Pushed by bvisness@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/a414439df5c0
Always clear exit FP when handling wasm exceptions. r=jandem
Perfherder has detected a talos performance change from push a414439df5c0f6563443f055e6f53e4fec5f3b94.
Not sure it is us. I see pdf updates in the same pushlog https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=91cfcfc5935b5ac51c36c1a08a064c33a2d96fd5&tochange=a414439df5c0f6563443f055e6f53e4fec5f3b94. Bug 1894727 or bug 1894705?
Comment 19•1 year ago
Should we backport this to ESR also? We'll need a rebased patch and an uplift request if yes.
Assignee
Comment 20•1 year ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #19)
Should we backport this to ESR also? We'll need a rebased patch and an uplift request if yes.
We decided "no" (but thanks for the query):
(1) it seems like the fix only fixes an assertion failure.
(2) we don't have a good way to know that we won't break something else in esr115.
(3) in any case esr115 is late in cycle and esr128, with this fix already in, will be along soon.
(4) we don't have any indication of this crashing in the wild; the bug came from tsmith on the fuzzing team.
(5) if esr115 does start crashing we can always request uplift later.
Comment 21•3 months ago
Comment 22•3 months ago
bugherder