Bug 1625891 Comment 7 Edit History

The fib benchmark performs 10 indirect calls in a row in the base case of the recursion, so disabling bounds checks should help quite a bit. It also has the most to gain from bounds check elimination (BCE), since all calls use the same known table index. BCE would eliminate the last nine checks in this innermost block.
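
To make the BCE argument concrete, here is a minimal sketch in Python. It is a toy model, not anything resembling the actual code generator: `Table`, `check`, and the two `base_case_*` functions are hypothetical names, and the model assumes the table cannot shrink between the calls, which is what would let the first check cover the other nine.

```python
class Trap(Exception):
    """Stands in for a trap on an out-of-bounds table index."""


class Table:
    """Hypothetical function table that counts its bounds checks."""

    def __init__(self, funcs):
        self.funcs = list(funcs)
        self.checks = 0

    def check(self, index):
        # The bounds check that normally precedes every indirect call.
        self.checks += 1
        if not 0 <= index < len(self.funcs):
            raise Trap(f"table index {index} out of bounds")

    def call_indirect_checked(self, index, arg):
        self.check(index)
        return self.funcs[index](arg)

    def call_indirect_unchecked(self, index, arg):
        # Only valid when an earlier check already covered this index.
        return self.funcs[index](arg)


def base_case_naive(table, index, arg, calls=10):
    # One check per call: 10 checks for 10 calls.
    for _ in range(calls):
        arg = table.call_indirect_checked(index, arg)
    return arg


def base_case_with_bce(table, index, arg, calls=10):
    # The first check dominates the rest, so the remaining nine are
    # redundant (assumption: the table does not shrink in between).
    table.check(index)
    for _ in range(calls):
        arg = table.call_indirect_unchecked(index, arg)
    return arg
```

With a one-entry table, the naive version records 10 checks and the BCE version records 1, while both return the same result.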

The raybench benchmark traverses a scene graph and calls a virtual intersect method on each object. Most of these do a fair amount of computation. The index is usually a constant vtable index off a variable index representing the start of the object's vtable; the latter index is loaded from the object (and emscripten then masks it, at least in the code for this benchmark). There should be some benefit from disabling bounds checking, but BCE would probably have a hard time removing the checks in practice.
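
A rough sketch of that dispatch pattern in Python shows why the checks are hard to remove. Everything named here is an assumption for illustration: `INTERSECT_SLOT`, `INDEX_MASK`, the `vtable_start` field, and the order of masking and adding are not taken from the benchmark's generated code.

```python
INTERSECT_SLOT = 2    # hypothetical constant offset of `intersect` in a vtable
INDEX_MASK = 0xFFFF   # hypothetical mask; the real value is not stated above


def intersect_all(table, objects, ray):
    """Call each object's intersect method through the table.

    Returns (results, number_of_bounds_checks).
    """
    results = []
    checks = 0
    for obj in objects:
        # Variable part: loaded from the object, then masked.
        vtable_start = obj["vtable_start"] & INDEX_MASK
        # Constant part: the slot of the method within the vtable.
        index = vtable_start + INTERSECT_SLOT
        # The index differs per object, so no earlier check covers it.
        checks += 1
        if not 0 <= index < len(table):
            raise IndexError(f"table index {index} out of bounds")
        results.append(table[index](obj, ray))
    return results, checks
```

Unlike the fib case, the number of checks here grows with the number of objects, because the compiler cannot prove that two objects share a vtable start.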

Results, showing running time reduction for removing the bounds check altogether:

fib arm64/Apple-M1: 6%
fib x64/Intel-i7: 20% (yeah, actually)
raybench arm64/Apple-M1: 5%
raybench x64/Intel-i7: 2% (if we're charitable)

The advantage of removing bounds checks might shrink if we rewrote our code generator to jump to an out-of-line trap and fall through to the success code, since conditional forward branches are statically predicted as not taken (for sure on Intel, and I believe on the M1 as well).
