Remove redundant bitwise {and/or/xor}{32,64} for wasm-via-Ion
Categories
(Core :: JavaScript: WebAssembly, enhancement, P3)
Tracking
()
| Tracking | Status | |
|---|---|---|
| firefox96 | --- | fixed |
People
(Reporter: lth, Assigned: jseward)
References
(Blocks 1 open bug)
Details
Attachments
(2 files, 1 obsolete file)
|
20.37 KB,
patch
|
Details | Diff | Splinter Review | |
|
48 bytes,
text/x-phabricator-request
|
Details | Review |
For int32, there is code in the JIT to remove redundant bit operations of the type n|0, see MBinaryBitwiseInstruction::foldUnnecessaryBitop. It would be useful to port this function to also support int64, as we are beefing up int64 support in the JIT to better support wasm.
This is slightly ambitious as a first bug, but let's see.
Comment 1•4 years ago
|
||
I looked into this bug and have a few questions regarding the implementation. initWasmOptimizationInfo sets autoTruncate_ to false, hence removeUnnecessaryBitops never removes anything for wasm code (as bitops stays empty). This is not limited to int64 but applies to int32 as well.
Hence my first question: how would one restructure the code such that wasm profits from bitop folding w/o invalidating other assumptions? RangeAnalysis::removeUnnecessaryBitops mentions something regarding bailouts.
As far as I understand some of the bitop folding might be optimized further, in particular:
MBitOr::foldIfAllBitsSet // e.g. for uint16: x | 0xffff => 0xffff;
MBitXor::foldIfEqual // => 0
MBitXor::foldIfNegOne // => MBitNot?
| Reporter | ||
Comment 2•4 years ago
|
||
Hah, that's a nice find! My gut feeling is that it probably doesn't matter a great deal for wasm in practice (unnecessary bitops are probably more common in JS and especially asm.js code), but it would be nice to clean it up even so.
(I'll give this some more thought on Monday and come back with a proper answer.)
| Reporter | ||
Comment 3•4 years ago
|
||
(Memo to self: test this on x64 and not on arm64, because on arm64 the Vixl macroassembler also has logic to remove redundant bit operations, thus the code looks pretty good anyway.)
In RangeAnalysis::truncate (which is the function that drives truncation), we find this comment, which relates to what comment 1 says about RangeAnalysis::removeUnnecessaryBitops:
// Automatic truncation is disabled for wasm because the truncation logic
// is based on IonMonkey which assumes that we can bailout if the truncation
// logic fails. As wasm code has no bailout mechanism, it is safer to avoid
// any automatic truncations.
It's unnecessary for wasm to rely on range analysis to determine whether folding these operations is safe or not, it is always safe. We probably will need a flag that distinguishes wasm semantics from JS semantics in each node, so that we can perform folding on wasm nodes. This leads to a situation where the bitops probably all acquire a NewWasm constructor that sets a flag defined as part of MBinaryBitwiseInstruction. (MUrsh has something like this already, just as a for-instance.)
In practice it may be best to put logic into MBitOr::foldsTo etc to delegate to MBinaryBitwiseInstruction::foldsTo for constant evaluation and then apply whatever folding optimizations are appropriate for the node type, when the wasm flag is set.
Comment 4•4 years ago
|
||
Before working on the issue, I'd like to make sure my understanding of the existing code is correct:
In the JS case, folding of bitops (via foldUnnecessaryBitop) needs to run after the Sink pass. This is necessary because some kind of interference during bailout. I can't really come up with an example that would suffer from an earlier (before sinking) folding, but I can just take this as an axiom for now.
In contrast to JS, wasm doesn't have a bailout mechanism. All subclasses of MBinaryBitwiseInstruction could have a NewWasm constructor (analogue to MUrsh), flagging the instruction as non-bailing. As a consequence MBinaryBitwiseInstruction::foldsTo (running earlier, during GVN) could already simplify some of the bitops iff there is no bailing.
This raises the question: why is MRsh implementing foldsTo? The statement ((x << 16) >> 16) gets folded to a MSignExtendInt32 during GVN (before Sinking). Is this some kind of special case that has no negative side-effects on bailout code?
It feels like the implementations of foldsTo and foldUnnecessaryBitop would duplicate logic. Should this be avoided by implementing a single method (per class) reasoning about folding? This method could be called during foldsTo (in the wasm case) or after sinking (in the JS case).
According to bug 1109195 (and the default prefs is JitOptions.cpp), sinking is disabled for quite some time (see regression in bug 1108413). If there are no plans to enabling sinking at some point, maybe MBinaryBitwiseInstruction::foldsTo could just run instead of foldUnnecessaryBitop, making the delayed folding of bitops obsolete?
| Reporter | ||
Comment 5•4 years ago
|
||
I need to look at this a little before I can give you a good answer here (will find time next week). I'm guessing that successive iterations of optimizations have piled up a bit, with JS, asm.js and wasm having somewhat different needs. The MRsh optimization could look like an asm.js special case (it's really the only way to express 16-to-32 sign extension in JS which would be a common operation when operating on 16-bit data maybe), but it could also be something that got added to deal with a benchmark, for all I know.
(I also had the feeling when I was looking at this code that there was duplication that could be tidied up. Tidying up tends to be a larger project with more stakeholders. Hence one tends to just add an optimization to scratch the itch of the day. Hence the current mess.)
| Reporter | ||
Comment 6•4 years ago
|
||
What a mess. Lukas, good detective work on your part, and now it feels like this is a problem that needs more than a local fix. I suspect that what we need here is a rethink regarding how and when to do constant folding for wasm and whether to decouple that from the constant folding for JS in some way.
Since you seem to be well out of "first bug" territory, how do you feel about bug 1709209 as an alternative to this one?
| Reporter | ||
Updated•4 years ago
|
| Assignee | ||
Comment 7•4 years ago
|
||
WIP patch. Folds all simple cases for {and,or,xor}{32,64} for wasm only,
excluding xor64 against all 1s, as the result is not representable. JS
handling of and/or/xor is unchanged; this patch adds a new path (MIR node) for
the wasm/machine-level-semantics versions of these operations, which then
participates in constant folding (via GVN) in the "normal" way. Test cases
are to follow.
| Assignee | ||
Comment 8•4 years ago
|
||
Now with test cases. Needs cleanup and review.
| Assignee | ||
Updated•4 years ago
|
| Assignee | ||
Comment 9•4 years ago
|
||
This patch performs MIR-level folding of bitwise {and/or/xor}{32,64} arising
from wasm inputs. JS handling of and/or/xor is unchanged -- this patch adds a
new path (MIR node) for the wasm/machine-level-semantics versions of these
operations, which then participates in constant folding (via GVN) in the
"normal" way. Changes:
jit-test/lib/codegen-x64-test.js: codegenTestX64_adhoc:
- new option
must_not_matchto negate the sense of the final test
jit-test/tests/wasm/binop-x64-ion-folding.js:
- new file with test cases
jit/MIR.h:
- new node kind MWasmBinaryBitwiseInstruction for wasm-only {and/or/xor}{32,64}
jit/MIR.cpp:
- new method MWasmBinaryBitwiseInstruction::foldsTo to handle all folding cases
- added a few small helper functions for ::foldsTo
wasm/WasmIonCompile.cpp
- a new overload for MDefinition* binary(..), taking a MIR opcode
- generate MWasmBinaryBitwiseInstruction as appropriate, via function
EmitBitwiseAndOrXor - remaining uses of EmitBitwise renamed to EmitShift
Comment 10•4 years ago
|
||
Comment 11•4 years ago
|
||
| bugherder | ||
Description
•