Closed Bug 1312751 Opened 3 years ago Closed 3 years ago

Merge ARM load and store code for Wasm

Categories

(Core :: JavaScript Engine: JIT, defect, P3)

ARM
All
defect

Tracking

()

RESOLVED FIXED
mozilla53
Tracking Status
firefox53 --- fixed

People

(Reporter: lth, Assigned: lth)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

When I landed bug 1277011 I sweet-talked Hannes into letting me land code in WasmBaselineCompile.cpp for load and store operations rather than in MacroAssembler-arm.cpp, which avoided me having to try to merge my code with the code already in CodeGenerator-arm.cpp and other places to support WasmIonCompile.  We should try to merge those paths, if we can.
Version: 51 Branch → unspecified
Priority: P2 → P1
Assignee: nobody → lhansen
Part 1: Refactor ARM code for wasm load and store from CodeGenerator-arm to MacroAssembler-arm.  (The new MASM APIs are not quite compatible with those on x86, but by and large they mirror them.)
Attachment #8815720 - Flags: review?(hv1989)
Part 2: Use the refactored load/store code from the wasm baseline compiler, and remove the now-obsolete support code from the compiler.
Attachment #8815734 - Flags: review?(hv1989)
Comment on attachment 8815720 [details] [diff] [review]
bug1312751-refactor-wasm-load-store.patch

Review of attachment 8815720 [details] [diff] [review]:
-----------------------------------------------------------------

My quick round of an incomplete review.

::: js/src/jit/arm/MacroAssembler-arm.cpp
@@ +5532,5 @@
> +
> +            load = ma_dataTransferN(IsLoad, 32, /* signed = */ false, HeapReg, ptr, out64.low);
> +            append(access, load.getOffset(), framePushed);
> +
> +            as_add(ptr, ptr, Imm8(INT64HIGH_OFFSET));

Can we use ScratchRegisterScope here instead of overwriting ptr?

@@ +5570,5 @@
> +    wasmLoadImpl(access, ptr, output, Register64::Invalid());
> +}
> +
> +void
> +MacroAssembler::wasmLoadI64(const wasm::MemoryAccessDesc& access, Register ptr, Register64 output)

For making it hard to shoot in our foot, should we change this to:

MacroAssembler::wasmLoadI64(const wasm::MemoryAccessDesc& access, Register ptr, Register scratch, Register64 output)
{
    MOZ_ASSERT(ptr == scratch);

I would prefer using the ScratchRegister though, which would solve this better.

@@ +5636,5 @@
> +}
> +
> +void
> +MacroAssembler::wasmStoreI64(const wasm::MemoryAccessDesc& access, Register64 value, Register ptr)
> +{

ditto

::: js/src/jit/arm/MacroAssembler-arm.h
@@ +465,5 @@
> +
> +    // The value to be stored is in `floatValue` (if not invalid), `val64` (if not invalid),
> +    // or in `valOrTmp` (if `floatValue` and `val64` are both invalid).
> +    void wasmUnalignedStoreImpl(const wasm::MemoryAccessDesc& access, FloatRegister floatValue,
> +                                Register64 val64, Register ptr, Register valOrTmp);

I assume we would like to have these functions private/protected?
Masm users should use the wasmStore/wasmStoreI64/wasmLoad/wasmLoadI64.
(In reply to Hannes Verschore [:h4writer] from comment #3)
> Comment on attachment 8815720 [details] [diff] [review]
> bug1312751-refactor-wasm-load-store.patch

Sorry to let this sit untouched for so long.

> Review of attachment 8815720 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> My quick round of an incomplete review.
> 
> ::: js/src/jit/arm/MacroAssembler-arm.cpp
> @@ +5532,5 @@
> > +
> > +            load = ma_dataTransferN(IsLoad, 32, /* signed = */ false, HeapReg, ptr, out64.low);
> > +            append(access, load.getOffset(), framePushed);
> > +
> > +            as_add(ptr, ptr, Imm8(INT64HIGH_OFFSET));
> 
> Can we use ScratchRegisterScope here instead of overwriting ptr?

Well, in principle we can do that right here, but it won't matter.  ptr is already updated higher up in the function, and we can't obviously use ScratchRegisterScope as the target for that update, because it's already being used as a temp in that update.  In practice we'd probably introduce a temp, but we'd like to avoid that if we can, since ptr is only conditionally updated.

I would like to argue that we should not try to fix this problem.  The calling code in Lowering is already set up to handle a ptr that may be overwritten, and the existing API happens to work pretty well for both Baldr and Rabaldr.  I know it looks a little hairy, but this is intended to be a refactoring of working code.

Same comment applies to your two next remarks, I think.

> ::: js/src/jit/arm/MacroAssembler-arm.h
> @@ +465,5 @@
> > +
> > +    // The value to be stored is in `floatValue` (if not invalid), `val64` (if not invalid),
> > +    // or in `valOrTmp` (if `floatValue` and `val64` are both invalid).
> > +    void wasmUnalignedStoreImpl(const wasm::MemoryAccessDesc& access, FloatRegister floatValue,
> > +                                Register64 val64, Register ptr, Register valOrTmp);
> 
> I assume we would like to have these functions private/protected?
> Masm users should use the wasmStore/wasmStoreI64/wasmLoad/wasmLoadI64.

protected might work; I'll investigate.
Flags: needinfo?(hv1989)
(In reply to Lars T Hansen [:lth] from comment #4)
> (In reply to Hannes Verschore [:h4writer] from comment #3)
> > Comment on attachment 8815720 [details] [diff] [review]
> > bug1312751-refactor-wasm-load-store.patch
> 
> Sorry to let this sit untouched for so long.
> 
> > Review of attachment 8815720 [details] [diff] [review]:
> > -----------------------------------------------------------------
> > 
> > My quick round of an incomplete review.
> > 
> > ::: js/src/jit/arm/MacroAssembler-arm.cpp
> > @@ +5532,5 @@
> > > +
> > > +            load = ma_dataTransferN(IsLoad, 32, /* signed = */ false, HeapReg, ptr, out64.low);
> > > +            append(access, load.getOffset(), framePushed);
> > > +
> > > +            as_add(ptr, ptr, Imm8(INT64HIGH_OFFSET));
> > 
> > Can we use ScratchRegisterScope here instead of overwriting ptr?
> 
> Well, in principle we can do that right here, but it won't matter.  ptr is
> already updated higher up in the function, and we can't obviously use
> ScratchRegisterScope as the target for that update, because it's already
> being used as a temp in that update.  In practice we'd probably introduce a
> temp, but we'd like to avoid that if we can, since ptr is only conditionally
> updated.
> 
> I would like to argue that we should not try to fix this problem.  The
> calling code in Lowering is already set up to handle a ptr that may be
> overwritten, and the existing API happens to work pretty well for both Baldr
> and Rabaldr.  I know it looks a little hairy, but this is intended to be a
> refactoring of working code.
> 
> Same comment applies to your two next remarks, I think.

I can understand that we cannot use "ScratchRegisterScope". Though I would prefer to have the "scratch" argument in the signature. Even if we assert that it should be equal to "ptr" in the body. It is just a way to signal that "ptr" is used as scratch in that code too for the people wanting to use that function.
Flags: needinfo?(hv1989)
Deferring this since I will not have time to do this reengineering before we branch.
Priority: P1 → P3
Hannes: so we can reap the maintainability rewards of code-sharing sooner than later, do you suppose we could land this in two steps: factor out the common code first and improve the interface after?
(In reply to Luke Wagner [:luke] from comment #7)
> Hannes: so we can reap the maintainability rewards of code-sharing sooner
> than later, do you suppose we could land this in two steps: factor out the
> common code first and improve the interface after?

Don't think the changes that I requested are that big. I was under the impression Lars is running out of time for FF53 and wants to take this up again after merge. Which I understand. In most cases I'm quite allowing in deferring "refactors" to follow-up bugs (if possible). This bug is precisely such a follow-up bug.

There could be miscommunication. I haven't done a full review either. I think this happened during the work week and only had time for a quick round and I think Lars said he would address those first and re-request review. But I could it now too. Either way is fine for me.
(In reply to Hannes Verschore [:h4writer] from comment #8)
> (In reply to Luke Wagner [:luke] from comment #7)
> > Hannes: so we can reap the maintainability rewards of code-sharing sooner
> > than later, do you suppose we could land this in two steps: factor out the
> > common code first and improve the interface after?
> 
> Don't think the changes that I requested are that big.

That's not really the point; it is no longer a refactoring of working code that has already been tested, but a change to the register management logic :-)  But see below.

> I was under the
> impression Lars is running out of time for FF53 and wants to take this up
> again after merge. 

I do now, because I have a lot of things that need to happen in the next week -- the FF53 fork and the TC39 meeting are happening at the same time and I have more important deadlines for both than this patch.

(So unless the patch can land basically as-is I will defer until early February.)
Comment on attachment 8815720 [details] [diff] [review]
bug1312751-refactor-wasm-load-store.patch

Review of attachment 8815720 [details] [diff] [review]:
-----------------------------------------------------------------

Talked over IRC about the register issue. Seems there was indeed miscommunication about it.
And reassured the change to the arguments is not a functional change, but only argument clarification and to make sure we don't shoot in our foot.
The logic should be the same. Since we will take the same registers nonetheless. 
As a result Lars asked over IRC to do the review with that change accepted.

R+ with following issues addressed and the argument clarification.

::: js/src/jit/arm/MacroAssembler-arm.cpp
@@ +5532,5 @@
> +
> +            load = ma_dataTransferN(IsLoad, 32, /* signed = */ false, HeapReg, ptr, out64.low);
> +            append(access, load.getOffset(), framePushed);
> +
> +            as_add(ptr, ptr, Imm8(INT64HIGH_OFFSET));

Given Lars said no. Scratch above question.

@@ +5644,5 @@
> +void
> +MacroAssemblerARM::wasmUnalignedLoadImpl(const wasm::MemoryAccessDesc& access, Register ptr,
> +                                         AnyRegister outAny, Register64 out64, Register tmp,
> +                                         Register tmp2, Register tmp3)
> +{

Can you put an assert that only out64 or outAny can exist and not at the same time.
And can you put asserts that assert the needed tmp3 registers exists for the specific cases.

@@ +5689,5 @@
> +            else
> +                ma_mov(Imm32(0), out64.high);
> +        }
> +    }
> +    else if (outAny.isFloat()) {

Nit: put the else if on the previous line.

@@ +5727,5 @@
> +MacroAssembler::wasmUnalignedLoadI64(const wasm::MemoryAccessDesc& access, Register ptr,
> +                                     Register64 out64, Register tmp)
> +{
> +    wasmUnalignedLoadImpl(access, ptr, AnyRegister(), out64, tmp, Register::Invalid(),
> +                          Register::Invalid());

Could be a follow-up bug. But we should have an AnyRegister::Invalid() static function here. And anywhere AnyRegister is initialized to contain nothing in this patch.

@@ +5733,5 @@
> +
> +void
> +MacroAssemblerARM::wasmUnalignedStoreImpl(const wasm::MemoryAccessDesc& access, FloatRegister floatValue,
> +                                          Register64 val64, Register ptr, Register tmp)
> +{

Again. Can you add the asserts for which Register should definitely or not exist in which cases... I.e. that val64 cannot coexist with floatValue.

@@ +5744,5 @@
> +        ScratchRegisterScope scratch(asMasm());
> +        ma_add(Imm32(offset), ptr, scratch);
> +    }
> +
> +    asMasm().memoryBarrier(access.barrierBefore());

Is there a reason that this is now before the "ma_add". If not please put it back after the "ma_add"

@@ +5753,5 @@
> +    if (val64 != Register64::Invalid()) {
> +        if (val64.low != tmp)
> +            ma_mov(val64.low, tmp);
> +    }
> +    else if (!floatValue.isInvalid()) {

Style nit: remove the newline before the "else if"
Attachment #8815720 - Flags: review?(hv1989) → review+
Comment on attachment 8815734 [details] [diff] [review]
bug1312751-baseline-use-refactored-wasm-load-store.patch

Review of attachment 8815734 [details] [diff] [review]:
-----------------------------------------------------------------

We should create a static function in WasmBaselineCompile.cpp that encapsulates:
"return access.isUnaligned() && access.byteSize() > 1" for now.
And use that function everywhere instead of having this statement everywhere.


Thanks for doing this!

::: js/src/wasm/WasmBaselineCompile.cpp
@@ +3314,5 @@
>              if (byteRegConflict)
>                  masm.mov(ScratchRegX86, dest.i32());
>          }
>  #elif defined(JS_CODEGEN_ARM)
> +        if (access.isUnaligned() && access.byteSize() > 1) {

Talking briefly with Lars on irc about this. Seems that isUnaligned is suboptimal here and should/could maybe report true for this case. Lars also mentioned there is another case here: "there's another case on ARM, where iirc doubles only require four-byte alignment".
Seems like a good follow-up bug to investigate and make our "unaligned" definition same for wasmUnalignedLoad and access.isUnaligned.

For now can you add a comment saying we can always do an aligned load for byteSize() == 0, even if access is reporting that is not true.

@@ +3320,5 @@
> +              case AnyReg::I64:
> +                masm.wasmUnalignedLoadI64(access, ptr, dest.i64(), tmp1);
> +                break;
> +              case AnyReg::F32:
> +                masm.wasmUnalignedLoadFP(access, ptr, dest.f32(), tmp1, tmp2, tmp3);

masm.wasmUnalignedLoadFP(access, ptr, dest.f32(), tmp1, tmp2, Register::Invalid());

@@ +3348,5 @@
> +    MOZ_MUST_USE size_t storeTemps(MemoryAccessDesc& access) {
> +#if defined(JS_CODEGEN_ARM)
> +        if (access.isUnaligned() && access.byteSize() > 1) {
> +            // See comment in store() about how this temp could be avoided for
> +            // unaligned i8/i16/i32 stores with some restructuring elsewhere.

We should open a bug for this and put // FIXM bug XXX here (with your comment).

@@ -3561,5 @@
> -                masm.ma_vstr(src, VFPAddr(ptr, VFPOffImm(0)), Assembler::Always);
> -            masm.append(access, st.getOffset(), masm.framePushed());
> -        }
> -    }
> -#endif // JS_CODEGEN_ARM

bye bye :D
Attachment #8815734 - Flags: review?(hv1989) → review+
(In reply to Hannes Verschore [:h4writer] from comment #10)
> Comment on attachment 8815720 [details] [diff] [review]
> bug1312751-refactor-wasm-load-store.patch
> 

> @@ +5644,5 @@
> > +void
> > +MacroAssemblerARM::wasmUnalignedLoadImpl(const wasm::MemoryAccessDesc& access, Register ptr,
> > +                                         AnyRegister outAny, Register64 out64, Register tmp,
> > +                                         Register tmp2, Register tmp3)
> > +{
> 
> Can you put an assert that only out64 or outAny can exist and not at the
> same time.

No, because if out64 is valid then outAny has no defined value, see below.

> @@ +5727,5 @@
> > +MacroAssembler::wasmUnalignedLoadI64(const wasm::MemoryAccessDesc& access, Register ptr,
> > +                                     Register64 out64, Register tmp)
> > +{
> > +    wasmUnalignedLoadImpl(access, ptr, AnyRegister(), out64, tmp, Register::Invalid(),
> > +                          Register::Invalid());
> 
> Could be a follow-up bug. But we should have an AnyRegister::Invalid()
> static function here. And anywhere AnyRegister is initialized to contain
> nothing in this patch.

Yes indeed.  This is bug 1320353 - there is no way to construct an "invalid" AnyRegister, the concept is not currently defined.

We should fix that, but I have not done so here because there is a risk that there's code manipulating AnyRegister values that might not handle the new range of AnyRegister properly.  (Unlikely but possible.)  That is, it seemed like a larger job.  The logic in the present patch is correct but as you note it is hard to assert certain invariants when some inputs can have undefined value.
https://hg.mozilla.org/integration/mozilla-inbound/rev/06c91f4ef053f69435ee491cd32a2f26cda544ad
Bug 1312751 - Refactor wasm{Load,Store}{,I64} to MacroAssembler-arm. r=h4writer

https://hg.mozilla.org/integration/mozilla-inbound/rev/bc9b79d56651c8fca79aa0178b5576d5c96111fe
Bug 1312751 - Wasm baseline, use refactored ARM load/store methods. r=h4writer
https://hg.mozilla.org/mozilla-central/rev/06c91f4ef053
https://hg.mozilla.org/mozilla-central/rev/bc9b79d56651
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla53
You need to log in before you can comment on or make changes to this bug.