Open Bug 1900336 Opened 1 month ago Updated 16 days ago

Code generation aspects for wasm lazy tiering

Categories

(Core :: JavaScript: WebAssembly, enhancement, P1)

People

(Reporter: jseward, Unassigned)

References

(Blocks 1 open bug)

Details

This is as described in
https://docs.google.com/document/d/1F-Nah2fQjdbx-D_cqD6jU2CBB36l3GijG1kP6dmEiso,
section "Support for lazy tiering -- simplest version"

Because the rest of the tier-up mechanism is not yet in place, the end result
of the steps below is that we'll generate code to call into a new method,
Instance::requestTierUp, which serves as the connection point to the rest of
the tier-up system, but which currently does nothing. Hence we can run/test
the below without having the rest of the mechanism in place.

(0) Build this on top of the patches in bug 1891182; not doing so will create
rebase hassle later.

(1) When baseline compiling, collect some estimation of the cost of Ion
compiling each function. Or at least collect info from which some estimate
can be made. The simplest possible metric is probably the bytecode length of
each function. Anyway, create these estimates somehow and store them in a
vector (one int32_t per function) that hangs off of wasm::ModuleMetadata.

Estimates must be in the range 1 .. 2^31-1 inclusive. From discussion in the
abovementioned gdoc, it would be wise to keep them no larger than (eg) 1
million and no smaller than (eg) 100.
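
A minimal sketch of one way to produce such an estimate, assuming bytecode
length is the metric and using the (eg) bounds above. The names
EstimateIonCost, kMinEstimate and kMaxEstimate are hypothetical and do not
exist in the tree.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical bounds, taken from the "(eg)" figures in the description.
constexpr int32_t kMinEstimate = 100;
constexpr int32_t kMaxEstimate = 1000000;

// Hypothetical helper: turn a function's bytecode length into a cost
// estimate that is safe to use as an initial hotness counter value.
inline int32_t EstimateIonCost(uint32_t bytecodeLength) {
  int32_t estimate;
  if (bytecodeLength < uint32_t(kMinEstimate)) {
    estimate = kMinEstimate;
  } else if (bytecodeLength > uint32_t(kMaxEstimate)) {
    estimate = kMaxEstimate;
  } else {
    estimate = int32_t(bytecodeLength);
  }
  // The clamped value always lies within 1 .. 2^31-1.
  assert(estimate >= 1);
  return estimate;
}
```

Comparing in uint32_t before converting avoids any signed overflow for very
large functions.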

(2) To wasm::Instance, add one counter (int32_t) per function. This is
contained within the Instance itself; it is not a vector that is pointed to by
Instance.

Fix up CodeMetadata::doInstanceLayout to compute instance offsets for these
counters. Add method Instance::offsetOfHotnessCounter(uint32_t fnIndex).
Note that CodeMetadata::doInstanceLayout runs before baseline compilation, so
these offsets will be available during baseline compilation.
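
As a rough illustration of the offset computation, assuming the counters are
laid out as a contiguous int32_t array after the other instance data:
InstanceLayoutSketch, LayoutHotnessCounters and their fields are hypothetical
stand-ins, not the real doInstanceLayout.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result of laying out the per-function counters inline
// in the instance.
struct InstanceLayoutSketch {
  uint32_t hotnessCountersOffset;  // byte offset of counter 0
  uint32_t numFuncs;
  uint32_t totalSize;              // instance data size including counters

  // Byte offset of the counter for function fnIndex.
  uint32_t offsetOfHotnessCounter(uint32_t fnIndex) const {
    assert(fnIndex < numFuncs);
    return hotnessCountersOffset + fnIndex * uint32_t(sizeof(int32_t));
  }
};

// Hypothetical layout step: place the counter array after `sizeSoFar`
// bytes of existing instance data, aligned for int32_t access.
inline InstanceLayoutSketch LayoutHotnessCounters(uint32_t sizeSoFar,
                                                  uint32_t numFuncs) {
  const uint32_t align = uint32_t(alignof(int32_t));
  uint32_t start = (sizeSoFar + align - 1) & ~(align - 1);
  uint32_t end = start + numFuncs * uint32_t(sizeof(int32_t));
  return InstanceLayoutSketch{start, numFuncs, end};
}
```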

(3) In baseline compilation, when generating the function prologue, create a
tiering up check. For context see
https://bug1879010.bmoattachments.org/attachment.cgi?id=9395921 and search for
the text "tiering_up_check".

The check is the following

subl $1, $offsetOfCounter(%r14)
js OOLCode // it went negative
resume:

and

OOLCode:
call Instance::requestTierUp, passing it the fn index as param
jmp resume

(and obviously the arm64 etc equivalents)

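Note the use of "js" instead of the check "jz": the branch is taken whenever
the counter has gone negative, so it still fires if a decrement larger than 1
steps past zero without ever landing exactly on it.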

Do this by creating a MacroAssembler method that generates both the check and
the OOL call.

Also place this check (with $1 possibly replaced by some larger number) at
loop heads, at least at innermost loops. The check itself is cheap, but the
OOLCode is expensive in space, so minimising the total number of checks seems
worthwhile.
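
For reference, a small C++ model of what the inline check does, assuming the
counter is non-negative on entry and the decrement is positive.
HotnessCheckFires is a hypothetical name used only for this sketch; the real
thing is generated machine code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the inline check. Returns true when the
// out-of-line path (the call to Instance::requestTierUp) would be taken.
// `step` is 1 in the prologue and may be larger at loop heads.
inline bool HotnessCheckFires(int32_t& counter, int32_t step) {
  // Assumed preconditions; under these the subtraction cannot overflow.
  assert(step > 0);
  assert(counter >= 0);
  counter -= step;     // subl $step, counter
  return counter < 0;  // js: taken when the result is negative
}
```

With a step of 3 and a counter starting at 100, the counter goes 97, 94, ...,
1, -2 and never equals zero, which is the case where testing for zero would
miss the trigger.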

(4) Create new Instance method Instance::requestTierUp(uint32_t fnIndex)

This must "ignore" the request by setting the Instance counter slot for
function fnIndex to its maximum allowable value, 2^31-1.

(5) When instantiating a module (producing the initial Instance fields),
initialise the counter array as a copy of the cost-estimates array created in
(1).
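
A minimal sketch of steps (4) and (5) together, assuming the placeholder
method is named requestTierUp as in the introduction. InstanceSketch is a
hypothetical stand-in, and it uses a std::vector only for brevity; per (2) the
real counters live inline in the Instance.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Hypothetical stand-in for wasm::Instance, modelling only the counters.
class InstanceSketch {
 public:
  // Step (5): the counters start out as a copy of the cost estimates.
  explicit InstanceSketch(const std::vector<int32_t>& costEstimates)
      : hotnessCounters_(costEstimates) {}

  // Step (4): placeholder that "ignores" the request by resetting the
  // counter to its maximum allowable value, 2^31-1.
  void requestTierUp(uint32_t fnIndex) {
    assert(fnIndex < hotnessCounters_.size());
    hotnessCounters_[fnIndex] = std::numeric_limits<int32_t>::max();
  }

  int32_t hotnessCounter(uint32_t fnIndex) const {
    assert(fnIndex < hotnessCounters_.size());
    return hotnessCounters_[fnIndex];
  }

 private:
  std::vector<int32_t> hotnessCounters_;
};
```

Resetting to 2^31-1 means the check for that function will not fire again for
a very long time, so the placeholder is effectively a no-op for testing.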

Severity: -- → N/A
Priority: -- → P1
Depends on: 1903539