Closed Bug 1756792 Opened 3 years ago Closed 3 years ago

Don't generate a checked call prologue if the function is not first-class

Categories

(Core :: JavaScript: WebAssembly, enhancement, P3)

Tracking

RESOLVED FIXED
110 Branch
Tracking Status
firefox110 --- fixed

People

(Reporter: lth, Assigned: bvisness)

References

(Blocks 2 open bugs)

Details

Attachments

(2 files)

The checked call prologue is 16 bytes on x64 and 32 bytes on arm64 - and hard to shrink due to alignment constraints and complications around what the largest signature check sequences could be, though see bug 1705495. In principle I believe we don't need this prologue at all if the function's address is never taken and the function is not exported, and we can know that statically when we compile it because it is declared in the module prefix. We should therefore try not to generate it. The savings will depend on the program and how it was compiled - simulated exception handling will tend to expose most functions, presently - but are worth measuring, if we can.
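As a rough illustration of the static check described above, here is a minimal sketch. It assumes a hypothetical `ModuleInfo` summary of the module's declarations; the type, its field names, and the helper functions are made up for this example and are not SpiderMonkey code.

```python
from dataclasses import dataclass


@dataclass
class ModuleInfo:
    """Hypothetical summary of what a module declares up front."""
    num_funcs: int
    # Indices of functions that the module exports.
    exported: frozenset = frozenset()
    # Indices of functions whose address is taken (for example, placed in a table).
    address_taken: frozenset = frozenset()


def needs_checked_prologue(module: ModuleInfo, func_index: int) -> bool:
    """A function needs the checked prologue if it is exported or its address is taken."""
    return func_index in module.exported or func_index in module.address_taken


def eligible_funcs(module: ModuleInfo) -> list:
    """Indices of functions for which the checked prologue could be omitted."""
    return [i for i in range(module.num_funcs)
            if not needs_checked_prologue(module, i)]
```

For a module with four functions where function 0 is exported and function 2 is in a table, `eligible_funcs` would return functions 1 and 3.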

I ran a small experiment to see what the gains might be. For baseline code, it looks like a 2.5% reduction in code size; for ion code, about a 4% reduction (ZenGarden, release shell, x64, disable the checked prologue for all functions defined in a module). This is the maximal reduction; in reality, the gains would be smaller, as many functions are either exported or in tables. There's a little bit of hair associated with computing whether a function needs the checked prologue or not, and piping this information into the point that calls the prologue generator, but overall it does not look like this would create a lot of complexity.

In programs like Photoshop and LibreOffice, a 4% reduction would mean 10-20MB of code saved.

For baseline, we could combine this with the TLS pinning optimization (5% reduction) for a healthy gain there as well.

The corresponding savings on ARM64 are 4.5% for baseline and 6.5% for ion - a result of a very large area set aside for the checked prologue due to alignment requirements and a possibly long code sequence for the signature check. Bug 1705495 might reduce that, and might therefore reduce the savings from the present change, but it would be best to do both.

From a conversation with Lars, we may need to be careful with the hardcoded offsets that are used in GenerateFunctionPrologue and StartUnwinding. Before this change, they are the same for every function and can be assumed. After this change, that may no longer hold.
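To make the offset concern concrete, here is a small sketch. It assumes the checked prologue sits in front of the unchecked entry point, so omitting it moves the unchecked entry to offset 0. The prologue sizes are the ones quoted earlier in this bug; the function names and the layout are assumptions for illustration, not the actual GenerateFunctionPrologue or StartUnwinding logic.

```python
# Checked call prologue sizes quoted in this bug; everything else is illustrative.
CHECKED_PROLOGUE_BYTES = {"x64": 16, "arm64": 32}


def unchecked_entry_offset(arch: str, has_checked_prologue: bool) -> int:
    """Offset of the unchecked entry from the start of the function's code.

    Assumed layout: the checked prologue, when present, comes first.
    """
    return CHECKED_PROLOGUE_BYTES[arch] if has_checked_prologue else 0


def in_checked_prologue(arch: str, has_checked_prologue: bool,
                        offset_in_func: int) -> bool:
    """Whether a code offset falls inside the checked prologue of this function."""
    return offset_in_func < unchecked_entry_offset(arch, has_checked_prologue)
```

The point is that the answer now depends on a per-function flag, so any code that previously assumed one constant offset would need that flag available.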

I went through a corpus of WASM files to estimate the impact of this optimization. Unfortunately my numbers are a bit more pessimistic than Lars's. A majority of functions wouldn't be eligible for this optimization due to being exported or placed in a table for call_indirect. My numbers also exclude the data segments and only focus on machine code - I'm not sure whether data segments were factored into Lars's work.

This may still be worth doing, since larger modules may benefit from even a 2-3% reduction in code size. But on average I'm only seeing a reduction of about 1%, and the 99th percentile of size savings was 1.73MB - nowhere near the 10-20MB we were hoping for.
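A per-module estimate of this kind can be sketched as follows. This is not the script used for the numbers above; it assumes each eligible function saves exactly one prologue's worth of bytes, and the function names and the sample figures in the usage note are hypothetical.

```python
from statistics import mean


def prologue_savings_bytes(num_eligible_funcs: int, prologue_bytes: int) -> int:
    """Bytes saved if every eligible function drops its checked prologue."""
    return num_eligible_funcs * prologue_bytes


def savings_fraction(num_eligible_funcs: int, prologue_bytes: int,
                     code_bytes: int) -> float:
    """Savings as a fraction of the module's machine code size (data excluded)."""
    return prologue_savings_bytes(num_eligible_funcs, prologue_bytes) / code_bytes


def mean_savings_fraction(modules, prologue_bytes: int) -> float:
    """Average savings over (num_eligible_funcs, code_bytes) pairs."""
    return mean(savings_fraction(n, prologue_bytes, size) for n, size in modules)
```

For example, a made-up module with 100 eligible functions and 160,000 bytes of x64 code would save 1,600 bytes, or 1% of its code size.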

The checked call prologue is only required for first-class functions; that is, functions that can be referenced in a table. We can omit this from the generated code where possible to save up to 3% of generated code size for many real-world modules.

This patch also removes code related to checked tail entries, which no longer exist.

Assignee: nobody → ben
Status: NEW → ASSIGNED
Pushed by rhunt@eqrion.net: https://hg.mozilla.org/integration/autoland/rev/3fefd0bf6da3 Omit the checked call prologue for some WASM functions. r=rhunt
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 110 Branch