A couple of things:

- Structure fields don't fit very nicely into the existing PostBarrierArg mechanism; even if we were to widen the argument to uintptr_t size, we could not hope to pass e.g. object pointer + offset within the one word that is allowed us. Instead we'd probably prefer to compute the cell address and then pass that address directly to the barrier code; then it fits in one word again. (We can tag this address if we like, to distinguish it from the PostBarrierArg for globals; the two low bits should for sure be available. But see below.)

- The common case for globals will be that the prebarrier does not fire but the postbarrier does. The prebarrier does not usually fire because incremental GC is not usually ongoing. The postbarrier will usually fire because the pointer being stored will usually point into the nursery, and because the container holding the global cell will usually be non-nursery: either it is a private global whose storage is always non-nursery, or the heap-allocated cell for the global has become tenured, as we expect for cells. In this common case, Instance::postBarrier() will compute the location pointer for the global cell. The jitted code has an elaborate song and dance to compute the location pointer only when necessary, but since it will usually be necessary to compute it anyway, we should just compute it in the jitted code always. This complexity is my fault - I made Benjamin split the prebarrier code so that we only compute the pointer when it's needed, not realizing that it is needed much more often than I expected.

In consequence of these two observations - the code for field setting will always compute the cell address, and the common path for global setting will also compute the cell address - I think we should simplify the code so that the barrier always receives the cell address.
(This also paves the way for a possible later optimization where we just call directly into the GC's StoreBuffer::putCell code because we have all the information we need in our own TLS, but that's for another day.)

The one remaining sticking point is that we must use the PreBarrierReg for the cell address because that's where the preBarrier needs to find it; this may increase spilling in the baseline compiler if the preBarrier ends up *not* being called, since that register may be in use already. But we should be able to select the register for the address conditionally based on the result of testNeedPreBarrier(); it shouldn't be awful.
(In reply to Lars T Hansen [:lth] from comment #0) > ... But we should be able to select the register for the address conditionally > based on the result of testNeedPreBarrier(), it shouldn't be awful. That's an insane remark. Please disregard.
Generalize the tracing of global cells to also trace them when they are Ref, not just AnyRef.
Attachment #8995223 - Flags: review?(bbouvier)
Generalize the barrier code as described in comment 0. (I also ended up moving the code in WasmBaselineCompiler.cpp because it was in the middle of the section for branch optimization, so the diff looks more severe than it really is.)
Attachment #8995225 - Flags: review?(bbouvier)
Attachment #8995223 - Flags: review?(bbouvier) → review+
Comment on attachment 8995225 [details] [diff] [review]
bug1478616-generalize-write-barrier.patch

Review of attachment 8995225 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks.

::: js/src/wasm/WasmInstance.cpp
@@ +482,4 @@
>
>  #ifdef ENABLE_WASM_GC
>  /* static */ void
> +Instance::postBarrier(Instance* instance, void* location)

We could also cheat and pretend location had a gc::Cell** type.
Attachment #8995225 - Flags: review?(bbouvier) → review+
The second patch is incorrect, because it calls needRef() to obtain a second scratch register inside one branch of a control flow diamond. needRef() may sync(), which means the join point may have different expression stacks coming in from the two arms, which is bad. This is basically why there is an unstated rule that register allocation does not happen on this level, but in the higher-level code (in this case emitSetGlobal); when that can't happen we introduce weirdo structures like the PopAtomicXchg64Regs RAII classes.

(Didn't find this in testing; realized it before falling asleep last night.)

I wish there were some way of guarding against this problem, but it would probably involve several RAII classes and some way of connecting them during compilation to model stack operations - complicated & expensive. So we just try to do our best.
Pushed by firstname.lastname@example.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/55e008024f24
Wasm, trace Ref as well as Anyref global and Val. r=bbouvier
https://hg.mozilla.org/integration/mozilla-inbound/rev/e713a94f6e55
Generalize the wasm write barrier. r=bbouvier