[exploration] Experiment with AVX encoding and (maybe) assumed-aligned loads in simd wormhole
Categories
(Core :: JavaScript: WebAssembly, task, P3)
Tracking
()
People
(Reporter: lth, Unassigned)
References
(Blocks 2 open bugs)
Details
WHPMADDUBSW is the workhorse of inner loops in the machine learning code, so it could usefully get an AVX encoding: the three-operand form avoids clobbering a register whose value is still needed, which otherwise requires an additional move to preserve that value. This will be a little tricky, because we do not want to enable AVX for any other instructions at all, yet the encoding is chosen fairly deep down in the pipeline. Probably this means changing the AVX test in the encoder from if (AVXPresent(...)) { ... } else { ... }
to if (AVXPresent(...) || (op == WHPMADDUBSW && AVXReallyPresent(...))) { ... } else { ... }
since the AVXPresent predicate is subject to various switches that are off (and shall remain off).
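To make the proposed predicate change concrete, here is a minimal standalone sketch. It is not the actual encoder code: the `CpuFeatures` struct, the `SimdOp` enum, and the `UseVexEncoding` helper are hypothetical names invented for illustration, and `AVXReallyPresent` is assumed to mean "the hardware supports AVX, regardless of the switches".

```cpp
// Hypothetical model of the encoder's AVX decision; none of these types
// exist in the real code base.
enum class SimdOp { WHPMADDUBSW, Other };

struct CpuFeatures {
  bool avxHardware;  // assumption: the CPU itself reports AVX support
  bool avxSwitchOn;  // assumption: the switches that currently keep AVX off
};

// Models the existing predicate: subject to the switches.
bool AVXPresent(const CpuFeatures& f) {
  return f.avxHardware && f.avxSwitchOn;
}

// Models the proposed predicate: looks at the hardware only.
bool AVXReallyPresent(const CpuFeatures& f) {
  return f.avxHardware;
}

// The proposed test: AVX encoding for everything when AVX is switched on,
// and for WHPMADDUBSW alone when the hardware has AVX but the switch is off.
bool UseVexEncoding(const CpuFeatures& f, SimdOp op) {
  return AVXPresent(f) ||
         (op == SimdOp::WHPMADDUBSW && AVXReallyPresent(f));
}
```

With the switch off on AVX-capable hardware, `UseVexEncoding` returns true only for `SimdOp::WHPMADDUBSW`, which is the narrow carve-out described above.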
Another issue here is that we're not able to fuse a v128.load into a WHPMADDUBSW. I'm not sure how valuable this would be: if the code preloads a bunch of registers and then operates on them, there's no sense in trying to fuse anything, but if it consists of load-and-operate pairs the matter is different. The problem is that fusing only works if the load is aligned, and we have no guarantee of that. We could fix up unaligned loads in an exception handler, but that is basically going to be a mess. For starters, we could look at the code to see whether it matches the pattern; if it does, we could experimentally try fusing and then measure whether there's an improvement.
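The "look at the code to see if it would match the pattern" step could be prototyped roughly as below. This is only a sketch over a toy instruction list: `Opcode`, `Instr`, `CountUses`, and `CountFusableCandidates` are hypothetical and do not correspond to the real compiler IR. It assumes that only the second operand could become a memory operand and that a load is a candidate only when the WHPMADDUBSW immediately following it is its sole use. It says nothing about alignment, which remains the open problem.

```cpp
#include <cstddef>
#include <vector>

// Toy IR for illustration only (hypothetical, not the real compiler IR).
enum class Opcode { V128Load, WHPMADDUBSW, Other };

struct Instr {
  Opcode op;
  int dest;  // virtual register defined by this instruction, -1 if none
  int lhs;   // first operand register, -1 if unused
  int rhs;   // second operand register, -1 if unused
};

// Count how many operand slots in the whole list read the given register.
int CountUses(const std::vector<Instr>& code, int vreg) {
  int uses = 0;
  for (const Instr& i : code) {
    if (i.lhs == vreg) uses++;
    if (i.rhs == vreg) uses++;
  }
  return uses;
}

// Count load-and-operate pairs: a v128.load directly followed by a
// WHPMADDUBSW that consumes the loaded value as its second operand,
// where that is the only use of the loaded value.
int CountFusableCandidates(const std::vector<Instr>& code) {
  int candidates = 0;
  for (std::size_t k = 1; k < code.size(); k++) {
    const Instr& load = code[k - 1];
    const Instr& use = code[k];
    if (load.op != Opcode::V128Load || use.op != Opcode::WHPMADDUBSW) {
      continue;
    }
    if (use.rhs == load.dest && CountUses(code, load.dest) == 1) {
      candidates++;
    }
  }
  return candidates;
}
```

A low candidate count on real inner loops would suggest that the code preloads registers and that fusing is not worth pursuing.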
Related discussion here: https://github.com/mozilla-extensions/bergamot-browser-extension/issues/75
Comment 1•4 years ago (Reporter)
We may solve this differently and it's not a priority right now to investigate this.
Comment 2•3 years ago
This optimization is too narrow. Also, looking at the intgemm multiply code, it rarely uses direct memory operands for pmaddubsw.
Comment 3•3 years ago
We're intending to phase out the wormhole.