Improve performance of throw-heavy code
Categories
(Core :: JavaScript: WebAssembly, enhancement, P3)
People
(Reporter: rhunt, Unassigned)
References
(Depends on 1 open bug, Blocks 3 open bugs)
Details
Attachments
(2 files)
OCaml can use exception handling for normal control flow. I've heard that in V8, ~40% of the run time of certain OCaml programs can be spent in the code for throwing an exception. We should investigate whether we're in a similar situation and whether something can be done about it.
Comment 1 • 2 years ago
Jerome, have you observed this to be the case in SM too? If so, do you have any interesting benchmarks we could look at?
This program is slow in Chrome because of exceptions. It is even slower in Firefox, but I'm not sure that this is due to exceptions.
Here is a version of the JavaScript loader script which works with the JS shell (with option --wasm-tail-calls).
It seems a lot of time is spent in js::wasm::GetNearestEffectiveInstance (which is apparently inlined in js::wasm::WasmFrameIter::WasmFrameIter).
- 48,89% WasmTrapHandler
- 22,55% js::jit::JitActivation::startWasmTrap
+ 21,11% js::wasm::GetNearestEffectiveInstance
+ 14,41% js::wasm::LookupCode
5,18% js::wasm::Code::lookupCallSite
+ 21,47% js::wasm::GetNearestEffectiveInstance
+ 1,10% 0xffffffff93400b2b
1,07% js::wasm::Code::lookupTrap
- 24,68% WasmHandleThrow
- 21,52% js::wasm::WasmFrameIter::WasmFrameIter
+ 14,64% js::wasm::LookupCode
5,25% js::wasm::Code::lookupCallSite
+ 2,25% js::wasm::HandleThrow
So the issue is that one typically needs to traverse the whole Wasm stack to get the Wasm instance (in function js::wasm::GetNearestEffectiveInstance), even when just a few frames need to be unwound to reach an exception handler.
This is currently done three times for each exception thrown, in the following functions:
js::wasm::HandleExceptionWasm
js::wasm::HandleTrap
js::jit::JitActivation::startWasmTrap
I think the last traversal could be avoided by passing the instance from HandleTrap, but this would not save that much.
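To make the cost of the repeated traversals concrete, here is a toy model of the situation. It is only a sketch: `Frame`, `Instance`, `makeStack`, `findNearestInstance` and `lookupsPerThrow` are hypothetical names invented for illustration and are not SpiderMonkey code. It assumes the worst case, where only the outermost frame identifies the instance, and counts one simulated code lookup per visited frame.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; these are not SpiderMonkey types.
struct Instance { int id; };

struct Frame {
    Instance* instance;  // nullptr if this frame does not identify its instance
};

// Builds a stack of `depth` frames, innermost first, where only the
// outermost frame identifies the instance (the worst case for the walk).
std::vector<Frame> makeStack(std::size_t depth, Instance* instance) {
    assert(depth > 0);
    std::vector<Frame> stack(depth, Frame{nullptr});
    stack.back().instance = instance;
    return stack;
}

// Walks outwards from the innermost frame, counting one simulated code
// lookup per visited frame, until a frame identifies the instance.
Instance* findNearestInstance(const std::vector<Frame>& stack, std::size_t& lookups) {
    for (const Frame& frame : stack) {
        ++lookups;
        if (frame.instance) return frame.instance;
    }
    return nullptr;
}

// Simulates one throw. With passInstance == false the stack is walked three
// times; with passInstance == true the third walk reuses the instance found
// by the second one.
std::size_t lookupsPerThrow(const std::vector<Frame>& stack, bool passInstance) {
    std::size_t lookups = 0;
    findNearestInstance(stack, lookups);                       // first walk
    Instance* instance = findNearestInstance(stack, lookups);  // second walk
    if (!passInstance) {
        instance = findNearestInstance(stack, lookups);        // third walk
    }
    (void)instance;  // a real handler would go on to use the instance
    return lookups;
}
```

In this model a 100-frame stack costs 300 lookups per throw with three walks and 200 when the instance is passed along, which matches the remark that removing one traversal helps only modestly.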
Traversing the stack is not very fast, since each frame requires a code lookup. One could use a cache, though it might be tricky to make it really efficient. I implemented a per-isolate cache in V8 to avoid any locking, since using a lock had a significant performance impact even when the exception was immediately caught in the calling function.
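As a rough illustration of the caching idea, here is a minimal sketch of a lock-free fast path in front of a locked code lookup. It is not the V8 or SpiderMonkey implementation: `CodeRange`, `CodeRegistry` and `PerThreadLookupCache` are hypothetical names, the cache holds a single entry, and it is assumed that each thread owns its own cache object (for example through a `thread_local` variable), which stands in for the per-isolate ownership mentioned above. A generation counter is one possible way to invalidate stale entries without taking the lock.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>

// Hypothetical description of a region of generated code.
struct CodeRange {
    std::uintptr_t begin;
    std::uintptr_t end;  // exclusive
    int id;
};

// Shared table of code ranges; every lookup here takes the lock.
class CodeRegistry {
  public:
    void add(CodeRange range) {
        assert(range.begin < range.end);
        std::lock_guard<std::mutex> guard(lock_);
        ranges_[range.begin] = range;
        generation_.fetch_add(1, std::memory_order_release);
    }

    // Slow path: finds the range containing pc, if any.
    bool lookupSlow(std::uintptr_t pc, CodeRange* out) {
        std::lock_guard<std::mutex> guard(lock_);
        slowLookups_.fetch_add(1, std::memory_order_relaxed);
        auto it = ranges_.upper_bound(pc);
        if (it == ranges_.begin()) return false;
        --it;
        if (pc >= it->second.end) return false;
        *out = it->second;
        return true;
    }

    std::uint64_t generation() const {
        return generation_.load(std::memory_order_acquire);
    }
    std::size_t slowLookups() const {
        return slowLookups_.load(std::memory_order_relaxed);
    }

  private:
    std::mutex lock_;
    std::map<std::uintptr_t, CodeRange> ranges_;
    std::atomic<std::uint64_t> generation_{0};
    std::atomic<std::size_t> slowLookups_{0};
};

// One-entry cache meant to be owned by a single thread, so a hit needs
// only an atomic load of the generation counter and no lock.
class PerThreadLookupCache {
  public:
    explicit PerThreadLookupCache(CodeRegistry& registry) : registry_(registry) {}

    bool lookup(std::uintptr_t pc, CodeRange* out) {
        std::uint64_t generation = registry_.generation();
        if (generation != seenGeneration_) {
            valid_ = false;  // the registry changed; drop the cached entry
            seenGeneration_ = generation;
        }
        if (valid_ && pc >= last_.begin && pc < last_.end) {
            *out = last_;  // fast path
            return true;
        }
        if (!registry_.lookupSlow(pc, out)) return false;
        last_ = *out;
        valid_ = true;
        return true;
    }

  private:
    CodeRegistry& registry_;
    CodeRange last_{};
    bool valid_ = false;
    std::uint64_t seenGeneration_ = 0;
};
```

Consecutive frames that belong to the same code range hit the fast path, so the locked lookup runs only when the walk crosses into a different range or the registry has changed. A single entry is the simplest choice; whether it is enough in practice would need measuring.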