Bug 1442540 Comment 9 Edit History

SoftIron Overdrive 1000 (4x Cortex-A57, pretty spiffy), representative runs. This is baseline only, since we have no Ion:

```
                bytecode  machinecode   machine/byte   ms   bytecodes/ms
zlib               66551       209840       3.15        3          22183
raybench           44504       173952       3.91        4          11126
box2d              97060       323200       3.33        6          16176
lua_binarytrees   215718       705952       3.27       12          17976
bullet            382678      1114016       2.91       17          22510
angrybots       10413501     29747744       2.86      414          25153
tanks           11039867     31967296       2.90      445          24808
zengarden       23891164     72693744       3.04     1032          23150
```

zlib highlights some cache effects.  The benchmark appears to compile the same wasm program twice (i.e., it creates two new modules from the same bytecode array).  The second compilation is about twice as fast as the first, presumably because both bytecode and compiler are in-cache.  (What's reported here is the second compilation.)

As programs get larger, we see machine code bytes per bytecode byte approach 3.0.  Since this is baseline, compile times are really not that interesting, but if we have a plausible Ion/baseline compilation time factor from another platform we can use that as a placeholder.  On x64, for the larger programs, that factor is about 4 on my Xeon system; let's say 5 to use a round number.
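To make the trend easier to check, here is a minimal Python sketch that recomputes the two derived columns from the raw numbers in the table above and averages them over the larger programs. The names `RUNS`, `code_ratio`, `bytecodes_per_ms` and the 1,000,000-byte cutoff for "larger" are assumptions made for illustration only; they are not part of the measurement setup.

```python
# (name, bytecode bytes, machine code bytes, compile time in ms),
# copied from the table above.
RUNS = [
    ("zlib", 66551, 209840, 3),
    ("raybench", 44504, 173952, 4),
    ("box2d", 97060, 323200, 6),
    ("lua_binarytrees", 215718, 705952, 12),
    ("bullet", 382678, 1114016, 17),
    ("angrybots", 10413501, 29747744, 414),
    ("tanks", 11039867, 31967296, 445),
    ("zengarden", 23891164, 72693744, 1032),
]


def code_ratio(bytecode, machinecode):
    """Machine code bytes emitted per bytecode byte."""
    return machinecode / bytecode


def bytecodes_per_ms(bytecode, ms):
    """Compile throughput, truncated to a whole number as in the table."""
    return bytecode // ms


for name, bytecode, machinecode, ms in RUNS:
    print(f"{name:16s} {code_ratio(bytecode, machinecode):5.2f} "
          f"{bytecodes_per_ms(bytecode, ms):8d}")

# Assumption: "larger programs" means more than 1,000,000 bytecode bytes.
LARGE = [run for run in RUNS if run[1] > 1_000_000]
avg_ratio = sum(code_ratio(b, m) for _, b, m, _ in LARGE) / len(LARGE)
avg_rate = sum(bytecodes_per_ms(b, t) for _, b, _, t in LARGE) / len(LARGE)
print(f"large programs: ratio {avg_ratio:.2f}, rate {avg_rate:.0f} bytecodes/ms")
```

With that cutoff, only angrybots, tanks and zengarden count as large, and their averages land a little under 3.0 bytes per bytecode byte and in the neighborhood of 24000 bytecodes/ms.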

In addition, the ARM64 system I'm using for this is pretty fast, so let's assume that current consumer hardware will take twice as long to compile.  (Bug 1496325 is about monitoring these values continually across our user population.)  High-end phones now have 4 fast cores so the fact that my dev system is quad-core should not be a factor.

Thusly, we get the following initial values for ARM64:

```
arm64BaselineBytesPerBytecode = 3.0
arm64IonBytesPerBytecode = 2.75 (this is a guess, we have no data)
arm64BytecodesPerMs = 24000/5/2 = 2400
```
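The arithmetic behind the last value can be spelled out as a small Python sketch. The function and constant names are hypothetical; the inputs are the rounded baseline rate for the larger programs (24000), the assumed Ion/baseline factor (5) and the assumed consumer-hardware slowdown (2) from the text above.

```python
def estimate_ion_bytecodes_per_ms(baseline_rate, ion_factor, hardware_factor):
    """Scale a measured baseline compile rate down to an estimated Ion rate
    on slower consumer hardware. Both factors are guesses, not measurements."""
    return baseline_rate / ion_factor / hardware_factor


ARM64_BASELINE_RATE = 24000  # bytecodes/ms, rounded from the larger programs
ION_BASELINE_FACTOR = 5      # placeholder borrowed from x64 (about 4, rounded up)
CONSUMER_HW_FACTOR = 2       # assumption: consumer devices compile half as fast

arm64_bytecodes_per_ms = estimate_ion_bytecodes_per_ms(
    ARM64_BASELINE_RATE, ION_BASELINE_FACTOR, CONSUMER_HW_FACTOR
)
print(arm64_bytecodes_per_ms)
```

Keeping the two factors as separate inputs makes it easy to redo the estimate once real Ion measurements on ARM64 or telemetry from user devices become available.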

Remarkably, 2400 bytecodes/ms is faster than the value we're using for x64 (2100), but the reference hardware for x64 is fairly old, so it might be worth re-measuring.  On my Xeon system I see stable rates of 15K bytecodes/ms with Ion, and 60K-100K bytecodes/ms with baseline.