[exploration] Consider growing table capacity exponentially
Categories
(Core :: JavaScript: WebAssembly, task, P3)
Tracking
()
Tracking | Status | |
---|---|---|
firefox109 | --- | fixed |
People
(Reporter: luke, Assigned: bvisness)
References
(Blocks 2 open bugs)
Details
Attachments
(2 files)
Currently, we just realloc() each time so, if realloc() was O(n), then n grows would be O(n^2). Realloc() mostly helps out out here by doing some internal rounding-up, but it seems dangerous to depend on this. I bring this up b/c there was a Chrome bug where they were, I'm guessing, not using realloc, and thus they were getting O(n^2), and someone reported a bug, and so they are switching to exponential growth, bringing them from slower than us to faster than us.
Updated•5 years ago
|
Updated•3 years ago
|
Comment 1•3 years ago
|
||
Instead of growing a Table by the requested delta (and relying on
realloc to do the heavy lifting for us to keep this efficient),
increase the table capacity by at least a factor of 1.5 when we need
to grow it.
Comment 2•3 years ago
|
||
I believe the attached patch does exactly what we need, for a growth factor of 1.5x. This does not improve the performance on Linux. On Linux, Nightly is quite a bit faster than Chromium [sic] on the benchmark attached to the v8 bug, and the patch regresses performance for the part of the benchmark in the build we care about. (You want to look at the "__assign_got_entries" in the console, and be sure to disable javascript.options.wasm_verbose before you run this, as there's tremendous spew that slows everything down, due to thousands of modules being constructed to simulate WebAssembly.Function.)
It may be that 1.5x is much too aggressive and that this will result in copying because realloc already was implementing this optimization for us and now it doesn't have the breathing room to do so. But 1.1x isn't really much better.
(I need to measure this on other systems where other mallocs may be behaving differently.)
Comment 3•3 years ago
•
|
||
On macOS X (M1 MacBook Pro, OS 12.3), Chrome is about 30% faster than Firefox on that segment of the test. Growing the tables exponentially does not improve the Firefox score in my local testing; the cost is spent elsewhere, TBD. The profiler shows almost no time being spent in Table::grow.
Some of the time is background compilation noise: running with either just baseline or ion, the time improves by about 15%. But the profiler still shows almost all time everywhere being spent in compilation.
(I need to test Windows too.)
Comment 4•3 years ago
•
|
||
On Windows, basically no time is spent in Table::grow, the profiler has just 13 hits, for a total of 26ms (this is Nightly 100, without the patch). Chrome 100 is about 15% faster than Nightly 100 on the segment of the test in question so they're winning on something. A lot of time in Firefox is being spent in AllocateCodeBytes (for the memcpy it looks like) and VirtualProtect - this is essentially a compiler benchmark, from what I can see, and what we're likely seeing the most is that when new WebAssembly.Module is being used to simulate WebAssembly.Function, we have more overhead than Chrome.
In short, I think it might be worth digging into this test case some more to see if we can speed up compilation, but the table growth does not seem to be a factor. It's possible we're saved by good realloc implementations.
Profile from Windows: https://share.firefox.dev/3DMQVDn
Comment 5•3 years ago
|
||
Blocking wasm-large-web-properties since they will suffer the most from the compilation overhead of lots of tiny modules, see comment 4, and also from our WebAssembly.Function implementation, which basically does the same thing.
Updated•3 years ago
|
Updated•2 years ago
|
Assignee | ||
Comment 6•2 years ago
|
||
Rather than sticking with our current realloc
strategy and growing exponentially, we decided to simply store func refs in a vector. Microbenchmark performance indicates that this is 7-8x faster.
(The current implementation spends a lot of time in memset
for reasons unclear to me - I imagine it's repeatedly re-zeroing memory, but I don't have a good explanation as to why.)
Updated•2 years ago
|
Comment 8•2 years ago
|
||
bugherder |
Description
•