Closed Bug 1569137 Opened 7 months ago Closed 7 months ago

Cranelift doesn't use the baldrdash calling convention for libcall calls


(Core :: Javascript: WebAssembly, defect)

Not set



Tracking Status
firefox70 --- fixed


(Reporter: jseward, Assigned: bbouvier)




(1 file)

Bug 1564789 changes Cranelift's register-selection heuristics in an attempt to
improve the gains made by the redundant-reload remover as tracked in
bug 1564772.

Prior to bug 1564789, CL (in cranelift-codegen/src/regalloc/solver,rs,
function find_solution) would select the lowest-numbered register of the
relevant class. However, bug 1564789 modifies this so as to select the
highest-numbered available register for reloaded values. This is a heuristic
hack which aims -- partially successfully -- to keep registers for
reloaded and non-reloaded (ordinary) values disjoint, to the extent possible.

It should be the case that a correct allocation is
performed regardless of which of the available registers is selected.
However, this does not appear to be the case. In particular, with the bug
1564789 heuristic, test case wasm_box2d.js segfaults with %r14 holding 0x362c,
an obviously non-pointer value, but being used as a memory reference:
mov 0x20(%r14),%r14. Debug printing in find_solution confirms that it
sometimes does allocate %r14. I'd guess that SM believes %r14 is the TLS
register at this point.

This failure happens even with unmodified CL sources (apart of course from a
modification to change the allocation order; see below), so it's not a bug
caused by the redundant-reload remover itself.

I am not sure on which side of the SM/CL interface the error lies. Or, for
that matter, even if my analysis is correct. In CL, I tried removing %r14 from
usable_regs in and in an attempt to verify the theory,
but got bogged down in regalloc assertion failures as a result.

Here's the bit of code where the fault happens. It looks suspiciously like a
stub of some kind to me, not CL generated code. Unfortunately Gdb couldn't
find the caller.

   0x000020b530287420:	push   %r14
   0x000020b530287422:	push   %rbp
   0x000020b530287423:	mov    %rsp,%rbp
=> 0x000020b530287426:	mov    0x20(%r14),%r10
   0x000020b53028742a:	mov    0x208(%r10),%r10
   0x000020b530287431:	movl   $0x17,0x78(%r10)
   0x000020b530287439:	or     $0x1,%rbp
   0x000020b53028743d:	mov    %rbp,0x70(%r10)
   0x000020b530287441:	and    $0xfffffffffffffffe,%rbp
   0x000020b530287445:	sub    $0x8,%rsp
   0x000020b530287449:	test   $0xf,%spl
   0x000020b53028744d:	je     0x20b530287454
   0x000020b530287453:	int3   
   0x000020b530287454:	callq  0x20b530289750
   0x000020b530287459:	add    $0x8,%rsp
   0x000020b53028745d:	mov    0x20(%r14),%r10
   0x000020b530287461:	mov    0x208(%r10),%r10
   0x000020b530287468:	movq   $0x0,0x70(%r10)
   0x000020b530287470:	movl   $0x0,0x78(%r10)
   0x000020b530287478:	pop    %rbp
   0x000020b530287479:	pop    %r14
   0x000020b53028747b:	retq

Here's the make-it-crash patch:

diff --git a/cranelift-codegen/src/regalloc/ b/cranelift-codegen/src/regalloc/
index 0d6a816d..18a809ef 100644
--- a/cranelift-codegen/src/regalloc/
+++ b/cranelift-codegen/src/regalloc/
@@ -917,7 +917,21 @@ impl Solver {
         for v in &mut self.vars {
             let rc = v.constraint;
-            let reg = match v.iter(&iregs, &oregs, &gregs).next() {
+            let mut chosen: Option<RegUnit> = None;
+            {
+                let mut avails = Vec::<RegUnit>::new();
+                let mut rsi = v.iter(&iregs, &oregs, &gregs);
+                while let Some(r) = {
+                    avails.push(r);
+                }
+                if !avails.is_empty() {
+                    //chosen = Some(avails[0]);  // Works (== the default setting)
+                    chosen = Some(avails[avails.len() - 1]); // Fails, but should work
+                }
+            }
+            let reg = match chosen {
                 Some(reg) => reg,
                 None => {
                     // If `v` must avoid global interference, there is not point in requesting
Blocks: 1564789

I think the root cause is that when translating the wasm code in Baldrdash, we include the VMContext (= WasmTlsReg) argument in direct call for non-imported functions. The stub looks like a fast wasm -> jit code stub (see the 1-or'ing of RBP that marks a fast-wasm-to-jit-frame). In baldrdash an indirect (CL) call is made for direct (wasm) calls to imported functions (= JS functions), and it doesn't pass the VmContext argument, so this could be the source of the issue.

Julian, what do I need to reproduce the issue? Just applying the small diff in comment 0 onto Cranelift trunk + Spidermonkey inbound using local Cranelift doesn't reproduce the issue on my machine.

Flags: needinfo?(jseward)

Initial intuition is probably incorrect, as usual. This code stems from SetExitFP, and the code around tells me that it's in GenerateExitPrologue, which is used by debug trap stub (so not used in our case, since we're compiling with CL that doesn't have debug support), or the import exit stub, or the builtin thunk.

The builtin thunk looks like a strong candidate for causing trouble, since these can result from libcall calls in Cranelift. In particular, these are created at legalization time, and don't use a client-provided function to create the function signature, so they're unaware of the VmContext special argument requirement that Baldrdash has. I'll try to write a patch tomorrow.

(In reply to Benjamin Bouvier [:bbouvier] from comment #2)

Julian, what do I need to reproduce the issue?

Hmm, not sure why that doesn't fail for you. I am using git rev e7f2b719eebfb9280c3e38eae42a9ee25221a4e3 (Dan Gohman <>, Fri Jul 19 16:28:40 2019 -0700, "Bump version to 0.36.0") and mozilla-central 483772:db8f3ee41bdf.

Configured as below, although I don't think this should matter (should it?)

CC="ccache clang" CXX="ccache clang++" ../js/configure \
   --enable-debug --enable-optimize="-g -Og" --enable-valgrind \
   --disable-jemalloc --enable-wasm-reftypes --enable-cranelift

and run

/home/sewardj/MOZ/MC-WASM/js/BUILD64/dist/bin/js --no-threads --wasm-compiler=cranelift wasm_box2d.js
Flags: needinfo?(jseward)

Another way I've seen programs fail -- maybe more common than segfaulting -- is an out of range exception, eg:

exception thrown: RuntimeError: index out of bounds,@wasm_lua_binarytrees.c.js line 1692 > WebAssembly.Module:wasm-function[203]:0xfbf9
@wasm_lua_binarytrees.c.js line 1692 > WebAssembly.Module:wasm-function[226]:0x11e07
@wasm_lua_binarytrees.c.js line 1692 > WebAssembly.Module:wasm-function[234]:0x1292f
@wasm_lua_binarytrees.c.js line 1692 > WebAssembly.Module:wasm-function[215]:0x10d01

although maybe that's really just a failure in a wasm memory access, which leads to a segfault, which is caught by the VM. Whereas the segfault in the code in comment 0 isn't caught since it is in a function prologue and not associated with any wasm memory access. Just a guess.

I tried this again just now and I can still reproduce the segfault. Note that
it's important not have the redundant-reload-removal patch applied to the CL
tree. That will cause a failure during compilation, due to missing bits on
the SM side, rather than failures in the Jit-generated code.

So, my recipe is:

mozilla-central 485382:d497a1f58d27
cranelift c6f08e26e6ca99f414f41ff83ec7870ebef84c9f
(although I suspect the exact versions don't matter much)

m-c using external cranelift, obviously

apply make-it-crash patch from comment 0 to the external cranelift tree

cd <top of mozilla-central tree>
cd js
mkdir BUILD64
cd BUILD64

(cd ../src && autoconf-2.13) && \
CC="ccache clang" CXX="ccache clang++" ../src/configure \
  --enable-debug --enable-optimize="-g -Og" \
  --enable-valgrind --disable-jemalloc --enable-wasm-reftypes \

and then run the test, per comment 4. Note that wasm_box2d.js is the one in
Lars' embenchen repo, not the in-SM-tree version.

ni? myself, to make sure I don't forget about it. I've entered a yak shaving state.

Flags: needinfo?(bbouvier)
Assignee: nobody → bbouvier

Posted a patch for Baldrdash; see accompanying pull request on the Cranelift side:

Flags: needinfo?(bbouvier)
Summary: Confusion on the role of x86_64 %r14 in the wasm-via-Cranelift pipeline → Cranelift doesn't use the baldrdash calling convention for libcall calls

Thanks for the patches. I have verified that it fixes the resulting segfault,
and furthermore, that there are no further failures on the set of 6 benchmark
programs we are using, both without and with my per-EBB
redundant-reload-remover patch, using 8 different register selection strategies
(128 total test configurations).

Duplicate of this bug: 1573001
Depends on: 1573409
Attachment #9082966 - Attachment description: Bug 1569137: Use the baldrdash calling convention for libcall expansion; r?jseward → Bug 1569137: Use the baldrdash calling convention for libcall expansion; r=jseward
Pushed by
Use the baldrdash calling convention for libcall expansion; r=jseward
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla70
You need to log in before you can comment on or make changes to this bug.