Tracking Bug for AArch64 JIT Support

RESOLVED FIXED

Status

()

RESOLVED FIXED
4 years ago
3 years ago

People

(Reporter: sstangl, Assigned: sstangl)

Tracking

(Depends on: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(20 attachments)

700.02 KB, patch
jbramley
: review-
Details | Diff | Splinter Review
43.87 KB, patch
Details | Diff | Splinter Review
12.20 KB, patch
Details | Diff | Splinter Review
133.07 KB, patch
Details | Diff | Splinter Review
30.34 KB, patch
Details | Diff | Splinter Review
60.60 KB, patch
Details | Diff | Splinter Review
23.70 KB, patch
djvj
: review+
Details | Diff | Splinter Review
46.91 KB, patch
Details | Diff | Splinter Review
2.91 KB, patch
Details | Diff | Splinter Review
18.48 KB, patch
Details | Diff | Splinter Review
13.61 KB, patch
sfink
: review+
glandium
: review-
Details | Diff | Splinter Review
47.57 KB, patch
nbp
: review+
Details | Diff | Splinter Review
6.67 KB, patch
nbp
: review+
Details | Diff | Splinter Review
115.84 KB, patch
Details | Diff | Splinter Review
2.62 KB, patch
Details | Diff | Splinter Review
8.74 KB, patch
bbouvier
: review+
Details | Diff | Splinter Review
27.11 KB, patch
h4writer
: review+
Details | Diff | Splinter Review
24.45 KB, patch
luke
: review+
Details | Diff | Splinter Review
15.42 KB, patch
Details | Diff | Splinter Review
14.84 KB, patch
Details | Diff | Splinter Review
(Assignee)

Description

4 years ago
Tracking bug for ARM64 / AArch64 JIT support pre-landing. The platform introduces some new requirements for shared code that are best handled by breaking off into separate, landable patches:

- Memory accesses using the stack pointer require 16-byte alignment on the stack pointer. In practice, this means that to preserve common code, a "pseudo" stack pointer (PSP) is used as an offset base, while care is made to keep SP <= PSP. Common JIT code can no longer use BaselineStackReg/StackPointer directly, but must use special SP-manipulating functions that preserve the SP <= PSP invariant. Care must be maintained across function call boundaries to not use the wrong stack pointer.

- The PC register is no longer accessible. LR must always be pushed callee-side (Bug 1088316).
(Assignee)

Comment 1

4 years ago
The branch is being handled in a git repository tracking gecko-dev master: https://github.com/sstangl/js-arm64
(Assignee)

Comment 2

4 years ago
The branch now passes all jit-tests that don't forcibly enable ionmonkey.

The patch is very large -- talking with Jan, we are going to break it into manageable chunks and review it via the normal processes.
(Assignee)

Comment 3

4 years ago
Just for an update -- the ARM64 repo is still being maintained, and mjrosenb is working on finishing up the asm.js and Ion code.

We've been asked to hold off on landing until ARM64 becomes more of a company-wide priority, at which point it should be relatively simple to whip the branch back into modernity and land it.
Depends on: 1054103
(Assignee)

Updated

4 years ago
Depends on: 986680
(Assignee)

Updated

4 years ago
Depends on: 1147629
Created attachment 8585685 [details] [diff] [review]
add_vixl.patch

I ended up deciding that the best thing to do would be to give you the patch as-is, and let you find the right diff environment to make it easy for you to review, since there are a large number of changes that we made to VIXL. (also, the names no longer match)
Attachment #8585685 - Flags: review?(Jacob.Bramley)
Created attachment 8585686 [details] [diff] [review]
add_trampoline.patch
Attachment #8585686 - Flags: review?(sstangl)
Created attachment 8585688 [details] [diff] [review]
add_moveemitter.patch
Attachment #8585688 - Flags: review?(benj)
Created attachment 8585691 [details] [diff] [review]
add_macroassembler.patch
Attachment #8585691 - Flags: review?(jdemooij)
Created attachment 8585693 [details] [diff] [review]
add_lir.patch
Attachment #8585693 - Flags: review?(sstangl)
Created attachment 8585698 [details] [diff] [review]
add_codegen.patch
Attachment #8585698 - Flags: review?(jdemooij)
Created attachment 8585699 [details] [diff] [review]
add_baseline.patch
Attachment #8585699 - Flags: review?(kvijayan)
Created attachment 8585701 [details] [diff] [review]
add_assembler.patch
Attachment #8585701 - Flags: review?(hv1989)
Created attachment 8585702 [details] [diff] [review]
add_bailouts.patch
Attachment #8585702 - Flags: review?(nicolas.b.pierron)
Created attachment 8585703 [details] [diff] [review]
add_arch.patch
Attachment #8585703 - Flags: review?(nicolas.b.pierron)
Created attachment 8585705 [details] [diff] [review]
fix_buildconfig.patch

Looking at this patch, I believe it also attempts to add configure support for directories outside of js/, I can easily strip those out if having them is a bad idea.
Attachment #8585705 - Flags: review?(sphink)
Created attachment 8585707 [details] [diff] [review]
fix_simple_cpp.patch
Attachment #8585707 - Flags: review?(nicolas.b.pierron)
Created attachment 8585708 [details] [diff] [review]
fix_simple_cpp2.patch

part of the similarly named fixe_simple_cpp.patch, these two will be folded before getting merged.
Created attachment 8585710 [details] [diff] [review]
fix_stackpointer.patch
Attachment #8585710 - Flags: review?(bhackett1024)
Created attachment 8585714 [details] [diff] [review]
fix_stackpointer2.patch

This should be part of the similarly named fix_stackpointer.patch, I'll fold hem into one before committing them.
Attachment #8585714 - Flags: review?(bhackett1024)
Created attachment 8585715 [details] [diff] [review]
fix_ret.patch
Attachment #8585715 - Flags: review?(benj)
Created attachment 8585716 [details] [diff] [review]
fix_mov.patch
Attachment #8585716 - Flags: review?(hv1989)
Created attachment 8585717 [details] [diff] [review]
integrate_baseline.patch

Sorry, my git-foo is not very strong.  I meant for these to be two different patches, but evidently, that was not meant to be.
Attachment #8585717 - Flags: review?(luke)
Attachment #8585717 - Flags: review?(kvijayan)
Created attachment 8585718 [details] [diff] [review]
integrate_other_arches.patch
Attachment #8585718 - Flags: review?(sstangl)
Created attachment 8585719 [details] [diff] [review]
last_kipple.patch
Attachment #8585719 - Flags: review?(sstangl)
Attachment #8585708 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8585705 [details] [diff] [review]
fix_buildconfig.patch

Review of attachment 8585705 [details] [diff] [review]:
-----------------------------------------------------------------

Looks ok to me afaict, but this requires a build peer review. I can add an arm64 build to treeherder whenever someone tells me we want it.

::: build/autoconf/android.m4
@@ +43,5 @@
>      android_tool_prefix="arm-linux-androideabi"
>      ;;
> +aarch64-linux*-android*)
> +    android_tool_prefix="aarch64-linux-android"
> +    echo "FOUND AARCH54"

Removing debugging printout

@@ +55,5 @@
>  *)
>      android_tool_prefix="$target_os"
>      ;;
>  esac
> +echo "android_tool_prefix=$android_tool_prefix"

Remove debugging printout

::: nsprpub/configure.in
@@ +230,5 @@
>              ;;
>          esac
>  
>          android_platform="$android_ndk"/platforms/android-"$android_version"/arch-"$target_name"
> +        echo "android_platform: ($android_platform)"

debugging printout
Attachment #8585705 - Flags: review?(sphink)
Attachment #8585705 - Flags: review?(mh+mozilla)
Attachment #8585705 - Flags: review+
The stack pointer patches are also in bug 1147629, there's some discussion there.
Comment on attachment 8585702 [details] [diff] [review]
add_bailouts.patch

Review of attachment 8585702 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm64/Bailouts-arm64.cpp
@@ +18,5 @@
> +
> +class BailoutStack
> +{
> +    uintptr_t frameClassId_;
> +    uintptr_t frameSize_;

Why do you need a frameClassID and a frameSize?!
You should remove them and just use the frameSize, as pushed from the CodeGen, as done on x64.

@@ +23,5 @@
> +
> +    RegisterDump::FPUArray fpregs_;
> +    RegisterDump::GPRArray regs_;
> +    uintptr_t snapshotOffset_;
> +    uintptr_t padding_;

I do not see any reference to this padding in the CodeGeneratorARM64::visitOutOfLineBailout.
Why not pushing the frameSize and the snapshotOffset from the visitOutOfLineBailout?

@@ +29,5 @@
> +    MachineState machineState() {
> +        return MachineState::FromBailout(regs_, fpregs_);
> +    }
> +    uint32_t snapshotOffset() const {
> +        printf("&snapshotOffset = %p(0x%x)\n", &snapshotOffset_, snapshotOffset_);

Remove these printf.
Attachment #8585702 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8585703 [details] [diff] [review]
add_arch.patch

Review of attachment 8585703 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/arm64/Architecture-arm64.h
@@ +54,5 @@
> +
> +class Registers {
> +  public:
> +    enum RegisterID {
> +        w0  =  0, x0  =  0,

What are the w* registers ?

@@ +133,5 @@
> +
> +    static const uint32_t AllMask = 0xFFFFFFFF;
> +
> +    // FIXME: Validate
> +    static const uint32_t ArgRegMask =

Replace uint32_t by SetType, and identically for other Registers::Mask.

@@ +160,5 @@
> +        (1 << Registers::x21) | (1 << Registers::x22) |
> +        (1 << Registers::x23) | (1 << Registers::x24) |
> +        (1 << Registers::x25) | (1 << Registers::x26) |
> +        (1 << Registers::x27) |
> +        (1 << Registers::x29) | (1 << Registers::x30);

What about x28, I thought, based on the previous comment that it was callee-saved?

@@ +239,5 @@
> +        s30 = 30, d30 = 30, v30 = 30,
> +        s31 = 31, d31 = 31, v31 = 31, // Scratch register.
> +        invalid_fpreg
> +    };
> +    typedef FPRegisterID Code;

Use an integral type instead of FPRegisterID.  This prevent mixing up the typed register (Code) with the untyped register (FPRegisterID).

@@ +243,5 @@
> +    typedef FPRegisterID Code;
> +
> +    typedef uint64_t SetType;
> +
> +    static const char *GetName(Code code) {

Either use Encoding, or add single register in the list of names.

@@ +254,5 @@
> +    }
> +
> +    static const char *GetName(uint32_t i) {
> +        MOZ_ASSERT(i < Total);
> +        return GetName(Code(i));

MOZ_ASSERT(i < TotalPhys)

@@ +265,5 @@
> +    static const uint32_t Total = 64;
> +    static const uint32_t TotalPhys = 32;
> +    static const SetType AllMask = 0xFFFFFFFFFFFFFFFFULL;
> +    static const SetType AllPhysMask = 0xFFFFFFFFULL;
> +    static const SetType SpreadCoefficent = 0x100000001ULL;

Do I understand correctly, that no vector interpretation is used at all?

@@ +271,5 @@
> +    static const uint32_t Allocatable = 62; // Without d31, the scratch register.
> +
> +    // FIXME: Validate
> +    // d31 is the ScratchFloatReg.
> +    static const uint32_t NonVolatileMask =

use SetType instead of uint32_t.

@@ +323,5 @@
> +struct FloatRegister
> +{
> +    typedef FloatRegisters Codes;
> +    typedef Codes::Code Code;
> +    typedef Codes::Code Encoding;

typedef Codes::Encoding Encoding;

@@ +326,5 @@
> +    typedef Codes::Code Code;
> +    typedef Codes::Code Encoding;
> +    typedef Codes::SetType SetType;
> +    union RegisterContent {
> +        double d;

Also add,

  float s;

as you have both kind of content.

@@ +349,5 @@
> +        return mozilla::CountPopulation32(x);
> +    }
> +
> +    Code code_;
> +    FloatRegisters::Kind k_;

Can you use some bit packing, such that one FloatRegister can be a pod.

@@ +373,5 @@
> +    bool operator !=(FloatRegister other) const {
> +        return other.code_ != code_;
> +    }
> +    bool operator ==(FloatRegister other) const {
> +        return other.code_ == code_;

You should compare the typed register, and not the physical register, otherwise use the aliases function.

@@ +398,5 @@
> +    // to D16, since S0 holds a float32, and D16 holds a Double.
> +    // Since all floating point registers on x86 and x64 are equivalent, it is
> +    // reasonable for this function to do the same.
> +    bool equiv(FloatRegister other) const {
> +        return true;

Compare the kind of the register. A FloatRegister has a specific type, and we should avoid mixing uses of the float register type.  This is useful to ensure that the register allocator is aware of the type in MoveEmitters.

@@ +404,5 @@
> +    MOZ_CONSTEXPR uint32_t size() const {
> +        return k_ == FloatRegisters::Double ? sizeof(double) : sizeof(float);
> +    }
> +    uint32_t numAlignedAliased() {
> +        return 1;

return numAliased(); // == 2

@@ +408,5 @@
> +        return 1;
> +    }
> +    void alignedAliased(uint32_t aliasIdx, FloatRegister *ret) {
> +        MOZ_ASSERT(aliasIdx == 0);
> +        *ret = *this;

aliased(aliasIdx, ret);
Attachment #8585703 - Flags: review?(nicolas.b.pierron)
Comment on attachment 8585707 [details] [diff] [review]
fix_simple_cpp.patch

Review of attachment 8585707 [details] [diff] [review]:
-----------------------------------------------------------------

I will do another review later, I have too look more at the context around the code path enabled by this patch.
Especially since I would have expected this Arm64 code to be shared more often with x64 than ARM.

::: js/src/asmjs/AsmJSFrameIterator.cpp
@@ +156,5 @@
>      // The x86/x64 call instruction pushes the return address.
>  #endif
>  }
> +static void
> +popReturn(MacroAssembler &masm)

Capitalize static function.
Why are these function not part of the generic macro assembler?

::: js/src/jit/shared/Assembler-shared.h
@@ +753,5 @@
>  // everywhere. Values are asserted in AsmJSModule.h.
>  static const unsigned AsmJSActivationGlobalDataOffset = 0;
>  static const unsigned AsmJSHeapGlobalDataOffset = sizeof(void*);
> +#if defined(JS_CODEGEN_X64) || defined(JS_CODEGEN_ARM64)
> +static const unsigned AsmJSNaN64GlobalDataOffset = 3*sizeof(void*);

nit: add spaces around the '*' operator.

::: js/src/shell/js.cpp
@@ +6308,5 @@
>          PropagateFlagToNestedShells("--enable-avx");
>      }
>  #endif
> +    if (op.getBoolOption("no-asmjs")) {
> +        PropagateFlagToNestedShells("--no-asmjs");

Shouldn't this change belong to a different patch?
Attachment #8585708 - Flags: review?(nicolas.b.pierron) → review+
Comment on attachment 8585705 [details] [diff] [review]
fix_buildconfig.patch

Review of attachment 8585705 [details] [diff] [review]:
-----------------------------------------------------------------

::: build/autoconf/android.m4
@@ +125,5 @@
>          target_name=arm
>          ;;
> +    aarch64)
> +        target_name=arm64
> +        ;;

ANDROID_CPU_ARCH and android_version need to be set too.

::: js/src/configure.in
@@ +3151,5 @@
> +MOZ_ARG_ENABLE_BOOL(arm64-simulator,
> +[  --enable-arm64-simulator Enable ARM64 simulator for JIT code],
> +    JS_ARM64_SIMULATOR=1,
> +    JS_ARM64_SIMULATOR= )
> +

You need to make it an error to use this flag with --enable-arm-simulator and --enable-mips-simulator... which begins to be cumbersome.

I think all those options should be consolidated as --enable-simulator=(arm|arm64|mips)

@@ +3164,5 @@
>      AC_DEFINE(JS_CODEGEN_ARM)
>      JS_CODEGEN_ARM=1
> +elif test -n "$JS_ARM64_SIMULATOR"; then
> +    if test "$CPU_ARCH" != "x86_64"; then
> +        AC_MSG_ERROR([The ARM64 simuator only works on x86_64.])

typo

::: media/webrtc/trunk/build/common.gypi
@@ +1029,5 @@
>                }],
> +              ['target_arch=="aarch64"', {
> +                'android_app_abi%': 'arm64',
> +                'android_ndk_sysroot%': '<(android_ndk_root)/platforms/android-L/arch-arm64',
> +              }]

shouldn't you set an android_app_abi? Note this should be in a separate patch and reviewed by, probably, jesup.

::: mozglue/linker/Elfxx.h
@@ +136,5 @@
> +#define RELOC(n) DT_REL ## n
> +#define UNSUPPORTED_RELOC(n) DT_RELA ## n
> +#define STR_RELOC(n) "DT_REL" # n
> +#define Reloc Rel
> +

This should be a separate patch. R_AARCH64_ABS32 is not used, why add it? Also, does that actually work? (that is, aren't there other relocation types actually necessary?)

::: nsprpub/configure.in
@@ +118,5 @@
>      android_version=5
>      ;;
> +aarch64)
> +    android_version=L
> +    ;;

patches for nspr should be separate.
Attachment #8585705 - Flags: review?(mh+mozilla) → review-
Comment on attachment 8585685 [details] [diff] [review]
add_vixl.patch

Review of attachment 8585685 [details] [diff] [review]:
-----------------------------------------------------------------

It's pretty hard to review this. I tried looking at the diff from VIXL 1.5 to your changes (file by file, because they've been renamed). However, the formatting changes hide the real changes, so it's quite difficult to actually see what you've added. Nonetheless, I've had a go!

I made lots of comments but they have widely varying severity. If you'd rather commit now and fix them later, that's up to you.

- Most files have been reformatted to match Mozilla's style, but not all of them. Why not?
- I can't reasonably review the new stuff that wasn't part of VIXL. For example, the interactions with BufferOffset; VIXL does (what looks like) a lot of the same things already and I'm not sure how your buffer differs. The literal pools are another example of this.

::: js/src/jit/arm64/vixl/Architecture-vixl.h
@@ +189,5 @@
> +
> +    MOZ_CONSTEXPR inline explicit ARMRegister(const CPURegister& other)
> +      : CPURegister(other.code(), other.size(), other.type())
> +    {
> +        //      VIXL_ASSERT(IsValidRegister());

Why is this commented out?

@@ +215,5 @@
> +    static const ARMRegister wregisters[];
> +    static const ARMRegister xregisters[];
> +};
> +
> +class ARMFPRegister : public CPURegister

Why is ARMFPRegister in Architecture-vixl.h, whilst ARMRegister is in Assembler-vixl.h?

@@ +223,5 @@
> +
> +    MOZ_CONSTEXPR inline explicit ARMFPRegister(const CPURegister& other)
> +      : CPURegister(other.code(), other.size(), other.type())
> +    {
> +        //  VIXL_ASSERT(IsValidARMFPRegister());

Why is this commented out?

@@ +237,5 @@
> +
> +    static const ARMFPRegister& SRegFromCode(unsigned code);
> +    static const ARMFPRegister& DRegFromCode(unsigned code);
> +
> +    // V8 compatibility.

You probably don't need these aliases. (They're not in newer VIXLs anyway.)

::: js/src/jit/arm64/vixl/Assembler-vixl.cpp
@@ +74,5 @@
> +                is_valid &= CPURegister(i, size_, type_).IsValid();
> +        }
> +        return is_valid;
> +    }
> +    

There are several examples of trailing whitespace in the patch.

@@ +385,5 @@
> +AssemblerVIXL::Reset()
> +{
> +#ifdef DEBUG
> +    // TODO: MOZ_ASSERT((pc_ >= buffer_) && (pc_ < buffer_ + buffer_size_));
> +    // TODO: memset(buffer_, 0, pc_ - buffer_);

What makes these TODOs difficult? Note that if the pc_ check doesn't hold, other bits of VIXL might not work properly.

Are you using the assembler buffer in a way that we didn't anticipate? (It looks like it, since you set pc_ to nullptr in a few lines.)

@@ +715,5 @@
> +    BufferOffset offset = Emit(0);
> +    Instruction *ins = getInstructionAt(offset);
> +
> +    // Encode the relative offset.
> +    // TODO: This is probably incorrect -- ADR needs patching

Allowing generation of a probably-incorrect address-calculation instruction opens a security hole. This should be fixed, marked UNREACHABLE, or at least sprinkled with release-mode assertions to check that the constant pools won't affect the offset.

@@ +1531,5 @@
> +    LoadStoreExclusive op = rt.Is64Bits() ? STLXP_x : STLXP_w;
> +    Emit(op | Rs(rs) | Rt(rt) | Rt2(rt2) | RnSP(dst.base()));
> +}
> +
> +void AssemblerVIXL::ldaxp(const ARMRegister& rt, const ARMRegister& rt2, const MemOperand& src)

This one hasn't been reformatted.

@@ +1617,5 @@
> +    Emit(MSR | Rt(rt) | ImmSystemRegister(sysreg));
> +}
> +
> +BufferOffset
> +AssemblerVIXL::hint(SystemHint code)

This is a bit ugly; everything else emits into the instruction stream, but this one returns the encoding. I would make `hint` emit into the stream, then make something like `Encode_hint` to do a one-off.

(Yes, it's a shame that VIXL can't encode without a buffer to put the results into, but the problem is pretty rare in practice and it would require a lot of changes to allow it everywhere.)

@@ +2729,5 @@
> +
> +    struct Header
> +    {
> +        // The size should take into account the pool header.
> +        // The size is in units of Instruction (4bytes), not byte.

VIXL's `Instruction` is an empty class (with a size of 1). Have you changed that?

(We do have `Instr`, with a size of 4.)

::: js/src/jit/arm64/vixl/Assembler-vixl.h
@@ +222,5 @@
> +
> +    static const ARMRegister& WRegFromCode(unsigned code);
> +    static const ARMRegister& XRegFromCode(unsigned code);
> +
> +    // V8 compatibility.

You don't need this.

@@ +259,5 @@
> +
> +    static const ARMFPRegister& SRegFromCode(unsigned code);
> +    static const ARMFPRegister& DRegFromCode(unsigned code);
> +
> +    // V8 compatibility.

Ditto.

@@ +290,5 @@
> +REGISTER_CODE_LIST(DEFINE_FPREGISTERS)
> +#undef DEFINE_FPREGISTERS
> +
> +// ARMRegisters aliases.
> +const ARMRegister ip0_64 = x16;

What's the `_64` suffix for? The 32-bit version is called `wip0`.

::: js/src/jit/arm64/vixl/MacroAssembler-vixl.cpp
@@ +58,5 @@
> +
> +void
> +MacroAssemblerVIXL::And(const ARMRegister& rd, const ARMRegister& rn, const Operand& operand)
> +{
> +    LogicalMacro(rd, rn, operand, AND);

The `allow_macro_instructions_` checks have been removed from the MacroAssembler. These were used to enforce InstructionAccurateScope. If you don't use the scope, that's fine, but you should probably remove it so that it doesn't _appear_ to work (but not actually do the checks it claims to do).

@@ +824,5 @@
> +    //   str sp, [sp, #-8]!
> +    // which pre-decrements the stack pointer, storing the decremented value.
> +    //
> +    // Instead, we must use a scratch register. This hijacks ip1.
> +    if (src0.Is(GetStackPointer64())) {

Perhaps this should be a separate function (`PushStackPointer`).

@@ +919,5 @@
> +void
> +MacroAssemblerVIXL::PushPseudoStackPointerHelper()
> +{
> +    // FIXME: This is gross. The helper register should be required in the call.
> +    ARMRegister scratch(ip1, 64);

Use `UseScratchRegisterScope` to acquire a scratch register (or you will have lots of pain debugging things later).

@@ +921,5 @@
> +{
> +    // FIXME: This is gross. The helper register should be required in the call.
> +    ARMRegister scratch(ip1, 64);
> +
> +    add(scratch, GetStackPointer64(), Operand(0));

The MacroAssembler's `Mov` can do this for you: `Mov(scratch, GetStackPointer64())`. (It uses `add` if the stack pointer is the architectural sp.)

::: js/src/jit/arm64/vixl/Simulator-vixl.h
@@ +888,5 @@
> +};
> +
> +
> +
> +// FIXME: This is MPLv2, and probably shouldn't be here...

Licensing concerns aren't something you can avoid (even temporarily) with a FIXME. We wanted VIXL's licence to be permissive, but I have no idea if you can relicense MPLv2 as BSD-3-clause. (My gut says probably not, but you could probably arrange something.)

::: js/src/jit/arm64/vixl/VIXL-Globals-vixl.h
@@ +1,1 @@
> +// -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 4 -*-

Files from `vixl/src` have been called `VIXL-<name>-vixl.h`. Why not just `<name>-vixl.h` like the others?
Attachment #8585685 - Flags: review?(Jacob.Bramley) → review-
(In reply to Jacob Bramley [:jbramley] from comment #30)
> Comment on attachment 8585685 [details] [diff] [review]
> add_vixl.patch
> 
> Review of attachment 8585685 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> It's pretty hard to review this. I tried looking at the diff from VIXL 1.5
> to your changes (file by file, because they've been renamed). However, the
> formatting changes hide the real changes, so it's quite difficult to
> actually see what you've added. Nonetheless, I've had a go!
> 
> I made lots of comments but they have widely varying severity. If you'd
> rather commit now and fix them later, that's up to you.
> 
> - Most files have been reformatted to match Mozilla's style, but not all of
> them. Why not?
> - I can't reasonably review the new stuff that wasn't part of VIXL. For
> example, the interactions with BufferOffset; VIXL does (what looks like) a
> lot of the same things already and I'm not sure how your buffer differs. The
> literal pools are another example of this.
> 
> ::: js/src/jit/arm64/vixl/Architecture-vixl.h
> @@ +189,5 @@
> > +
> > +    MOZ_CONSTEXPR inline explicit ARMRegister(const CPURegister& other)
> > +      : CPURegister(other.code(), other.size(), other.type())
> > +    {
> > +        //      VIXL_ASSERT(IsValidRegister());
> 
> Why is this commented out?
> 
> @@ +215,5 @@
> > +    static const ARMRegister wregisters[];
> > +    static const ARMRegister xregisters[];
> > +};
> > +
> > +class ARMFPRegister : public CPURegister
> 
> Why is ARMFPRegister in Architecture-vixl.h, whilst ARMRegister is in
> Assembler-vixl.h?
> 
Good question.  I'm not entirely sure where Architecture-vixl.h came from.  It isn't in my local checkout.
I've just ignored all comments from this file, since it doesn't actually exist.
> @@ +223,5 @@
> > +
> > +    MOZ_CONSTEXPR inline explicit ARMFPRegister(const CPURegister& other)
> > +      : CPURegister(other.code(), other.size(), other.type())
> > +    {
> > +        //  VIXL_ASSERT(IsValidARMFPRegister());
> 
> Why is this commented out?
> 
> @@ +237,5 @@
> > +
> > +    static const ARMFPRegister& SRegFromCode(unsigned code);
> > +    static const ARMFPRegister& DRegFromCode(unsigned code);
> > +
> > +    // V8 compatibility.
> 
> You probably don't need these aliases. (They're not in newer VIXLs anyway.)
> 
> ::: js/src/jit/arm64/vixl/Assembler-vixl.cpp
> @@ +74,5 @@
> > +                is_valid &= CPURegister(i, size_, type_).IsValid();
> > +        }
> > +        return is_valid;
> > +    }
> > +    
> 
> There are several examples of trailing whitespace in the patch.
all trailing whitespace has been removed.
> 
> @@ +385,5 @@
> > +AssemblerVIXL::Reset()
> > +{
> > +#ifdef DEBUG
> > +    // TODO: MOZ_ASSERT((pc_ >= buffer_) && (pc_ < buffer_ + buffer_size_));
> > +    // TODO: memset(buffer_, 0, pc_ - buffer_);
> 
> What makes these TODOs difficult? Note that if the pc_ check doesn't hold,
> other bits of VIXL might not work properly.
> 
> Are you using the assembler buffer in a way that we didn't anticipate? (It
> looks like it, since you set pc_ to nullptr in a few lines.)
> 
We actively don't use pc_.  I can try to remove it alltogether, but this was the fastest route to getting rid of the assertion failures.
> @@ +715,5 @@
> > +    BufferOffset offset = Emit(0);
> > +    Instruction *ins = getInstructionAt(offset);
> > +
> > +    // Encode the relative offset.
> > +    // TODO: This is probably incorrect -- ADR needs patching
> 
> Allowing generation of a probably-incorrect address-calculation instruction
> opens a security hole. This should be fixed, marked UNREACHABLE, or at least
> sprinkled with release-mode assertions to check that the constant pools
> won't affect the offset.
> 
yeah.  The fix for this was pretty simple.
> @@ +1531,5 @@
> > +    LoadStoreExclusive op = rt.Is64Bits() ? STLXP_x : STLXP_w;
> > +    Emit(op | Rs(rs) | Rt(rt) | Rt2(rt2) | RnSP(dst.base()));
> > +}
> > +
> > +void AssemblerVIXL::ldaxp(const ARMRegister& rt, const ARMRegister& rt2, const MemOperand& src)
> 
> This one hasn't been reformatted.
> 
That one isn't more than 100 characters. (Unless you were talking about something other than the fact that it isn't line wrapped.)

> @@ +1617,5 @@
> > +    Emit(MSR | Rt(rt) | ImmSystemRegister(sysreg));
> > +}
> > +
> > +BufferOffset
> > +AssemblerVIXL::hint(SystemHint code)
> 
> This is a bit ugly; everything else emits into the instruction stream, but
> this one returns the encoding. I would make `hint` emit into the stream,
> then make something like `Encode_hint` to do a one-off.
> 
> (Yes, it's a shame that VIXL can't encode without a buffer to put the
> results into, but the problem is pretty rare in practice and it would
> require a lot of changes to allow it everywhere.)
> 
As far as I can tell, it does in fact emit the specified hint into the instruction stream.  It also happens to return
the offset of the instruction that was emitted.
> @@ +2729,5 @@
> > +
> > +    struct Header
> > +    {
> > +        // The size should take into account the pool header.
> > +        // The size is in units of Instruction (4bytes), not byte.
> 
> VIXL's `Instruction` is an empty class (with a size of 1). Have you changed
> that?
> 
> (We do have `Instr`, with a size of 4.)
> 
the comment was wrong, I was not attempting to refer to the class Instruction, but rather an actual instruction, which is 4 bytes.  This has been changed to 'Instr'.
> ::: js/src/jit/arm64/vixl/Assembler-vixl.h
> @@ +222,5 @@
> > +
> > +    static const ARMRegister& WRegFromCode(unsigned code);
> > +    static const ARMRegister& XRegFromCode(unsigned code);
> > +
> > +    // V8 compatibility.
> 
> You don't need this.
I think we actually need XRegFromCode, and DRegFromCode.  I'd be happy to replace it with something else that does the same thing.
> 
> @@ +259,5 @@
> > +
> > +    static const ARMFPRegister& SRegFromCode(unsigned code);
> > +    static const ARMFPRegister& DRegFromCode(unsigned code);
> > +
> > +    // V8 compatibility.
> 
> Ditto.
> 
> @@ +290,5 @@
> > +REGISTER_CODE_LIST(DEFINE_FPREGISTERS)
> > +#undef DEFINE_FPREGISTERS
> > +
> > +// ARMRegisters aliases.
> > +const ARMRegister ip0_64 = x16;
> 
> What's the `_64` suffix for? The 32-bit version is called `wip0`.
Because we moved vixl out of its own namespace, ip0 conflicted with the global Register (not ARMRegister) definition of it.
I've changed all of the ARMRegisters to have the actual names, and the Registers to have some other name.
> 
> ::: js/src/jit/arm64/vixl/MacroAssembler-vixl.cpp
> @@ +58,5 @@
> > +
> > +void
> > +MacroAssemblerVIXL::And(const ARMRegister& rd, const ARMRegister& rn, const Operand& operand)
> > +{
> > +    LogicalMacro(rd, rn, operand, AND);
> 
> The `allow_macro_instructions_` checks have been removed from the
> MacroAssembler. These were used to enforce InstructionAccurateScope. If you
> don't use the scope, that's fine, but you should probably remove it so that
> it doesn't _appear_ to work (but not actually do the checks it claims to do).
> 
> @@ +824,5 @@
> > +    //   str sp, [sp, #-8]!
> > +    // which pre-decrements the stack pointer, storing the decremented value.
> > +    //
> > +    // Instead, we must use a scratch register. This hijacks ip1.
> > +    if (src0.Is(GetStackPointer64())) {
> 
> Perhaps this should be a separate function (`PushStackPointer`).
That isn't really practical.  For the most part, the only time this will happen is when we are pushing all of the registers from PushRegsInMask.  That usually doesn't even know it is pushing the stack pointer.
> 
> @@ +919,5 @@
> > +void
> > +MacroAssemblerVIXL::PushPseudoStackPointerHelper()
> > +{
> > +    // FIXME: This is gross. The helper register should be required in the call.
> > +    ARMRegister scratch(ip1, 64);
> 
> Use `UseScratchRegisterScope` to acquire a scratch register (or you will
> have lots of pain debugging things later).
> 
> @@ +921,5 @@
> > +{
> > +    // FIXME: This is gross. The helper register should be required in the call.
> > +    ARMRegister scratch(ip1, 64);
> > +
> > +    add(scratch, GetStackPointer64(), Operand(0));
> 
> The MacroAssembler's `Mov` can do this for you: `Mov(scratch,
> GetStackPointer64())`. (It uses `add` if the stack pointer is the
> architectural sp.)
> 
Ok, this has been added.
> ::: js/src/jit/arm64/vixl/Simulator-vixl.h
> @@ +888,5 @@
> > +};
> > +
> > +
> > +
> > +// FIXME: This is MPLv2, and probably shouldn't be here...
> 
> Licensing concerns aren't something you can avoid (even temporarily) with a
> FIXME. We wanted VIXL's licence to be permissive, but I have no idea if you
> can relicense MPLv2 as BSD-3-clause. (My gut says probably not, but you
> could probably arrange something.)
> 
Doesn't matter that code was commented out (and now has been removed)
> ::: js/src/jit/arm64/vixl/VIXL-Globals-vixl.h
> @@ +1,1 @@
> > +// -*- Mode: C++; tab-width: 8; indent-tabs-mode: nil; c-basic-offset: 4 -*-
> 
> Files from `vixl/src` have been called `VIXL-<name>-vixl.h`. Why not just
> `<name>-vixl.h` like the others?
I'm guessing those were added the last time we pulled from upstream VIXL,  they've been renamed.

I'm going to give you a diff on the old patch, as well as the new patch.
Comment on attachment 8585710 [details] [diff] [review]
fix_stackpointer.patch

Review of attachment 8585710 [details] [diff] [review]:
-----------------------------------------------------------------

What is the difference between these patches and the one in bug 1147629?
Comment on attachment 8585715 [details] [diff] [review]
fix_ret.patch

Review of attachment 8585715 [details] [diff] [review]:
-----------------------------------------------------------------

It looks good, but to be sure to understand: wasn't the ret/abiret distinction already acting like this? I mean that popReturn seems to match the old ret, while abiret is the real ARM return instruction?
Comment on attachment 8585707 [details] [diff] [review]
fix_simple_cpp.patch

Review of attachment 8585707 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/asmjs/AsmJSValidate.cpp
@@ +8268,5 @@
>  #if defined(JS_CODEGEN_ARM)
> +    masm.pushReturnAddress();
> +#elif defined(JS_CODEGEN_ARM64)
> +    masm.initStackPtr();
> +    masm.pushReturnAddress();

And I guess we cannot swap the 2 instructions?
Do we make sure that we are properly aligned after the return address in Jit code?

@@ +9031,5 @@
>      return m.finishGeneratingInlineStub(&m.onOutOfBoundsLabel()) && !masm.oom();
>  }
>  
> +// TODO: the stack pointer is kind of determined at runtime on arm64,
> +#ifndef JS_CODEGEN_ARM64

nit: I would prefer to read

  #if defined(JS_CODEGEN_ARM64)
  …
  #else
  …
  #endif

@@ +9036,4 @@
>  static const LiveRegisterSet AllRegsExceptSP(
> +    GeneralRegisterSet(Registers::AllMask &
> +                                   ~(uint32_t(1) << Registers::StackPointer)),
> +                FloatRegisterSet(FloatRegisters::AllMask));

nit: Align properly.

::: js/src/jit/BaselineCompiler.h
@@ +15,5 @@
>  #elif defined(JS_CODEGEN_ARM)
>  # include "jit/arm/BaselineCompiler-arm.h"
> +#elif defined(JS_CODEGEN_ARM64)
> +# include "jit/arm64/BaselineCompiler-arm64.h"
> +

nit: remove extra empty line.

::: js/src/jit/JitCommon.h
@@ +11,5 @@
>  
>  #if defined(JS_ARM_SIMULATOR)
> +# include "jit/arm/Simulator-arm.h"
> +#elif defined(JS_ARM64_SIMULATOR)
> +# include "jit/arm64/vixl/Simulator-vixl.h"

Is there any reason to have a different directory, and a different name instead of having arm64/Simulator-arm64.h?

::: js/src/jit/MacroAssembler.h
@@ +241,5 @@
>          initWithAllocator();
>          m_buffer.id = jcx->getNextAssemblerId();
>  #endif
> +
> +#ifdef JS_CODEGEN_ARM64

nit:
  #if defined(JS_CODEGEN_ARM)
  …
  #elif defined(JS_CODEGEN_ARM64)

::: js/src/jit/shared/CodeGenerator-shared.cpp
@@ +1404,5 @@
>      else
>          masm.callWithABI(BitwiseCast<void*, int32_t(*)(double)>(JS::ToInt32));
>      masm.storeCallResult(dest);
>  
> +#if !defined(JS_CODEGEN_ARM) &&  !defined(JS_CODEGEN_ARM64)

nit: remove the extra space between && and !defined.
Attachment #8585707 - Flags: review?(nicolas.b.pierron) → review+
Comment on attachment 8585710 [details] [diff] [review]
fix_stackpointer.patch

Bug 1147629 has landed, canceling these reviews.
Attachment #8585710 - Flags: review?(bhackett1024)
Attachment #8585714 - Flags: review?(bhackett1024)
Comment on attachment 8585701 [details] [diff] [review]
add_assembler.patch

Review of attachment 8585701 [details] [diff] [review]:
-----------------------------------------------------------------

Can somebody with more arm knowledge review the functions containing arm instructions?

::: js/src/jit/arm64/Assembler-arm64.cpp
@@ +24,5 @@
> +using mozilla::CountLeadingZeroes32;
> +using mozilla::DebugOnly;
> +
> +// Note this is used for inter-AsmJS calls and may pass arguments and results
> +// in floating point registers even if the system ABI does not.

This comment is standing alone? Where does it belong with?

@@ +49,5 @@
> +            break;
> +        }
> +        current_ = ABIArg(FloatRegister(floatRegIndex_,
> +                                        type == MIRType_Double ? FloatRegisters::Double : FloatRegisters::Single));
> +        floatRegIndex_++;

in arm-32 floatRegIndex is up-ed by 2 for MIRType_Double. Why do we only need to up by 1 for both double and float?

@@ +69,5 @@
> +namespace jit {
> +
> +void
> +Assembler::finish()
> +{

MOZ_ASSERT(!finished); ?

@@ +128,5 @@
> +        DebugOnly<size_t> postOffset = size_t(armbuffer_.nextOffset().getOffset());
> +
> +        MOZ_ASSERT(postOffset - preOffset == SizeOfJumpTableEntry);
> +    }
> +

I would like to have this double checked by somebody that knows arm.

@@ +164,5 @@
> +                entry->data = target;
> +            }
> +        } else {
> +            // Currently a two-instruction call, it should be possible to optimize this
> +            // into a single instruction call + nop in some instances, but this will work.

When do we have UnkownBranchType? And are we certain we don't need to do something here?

@@ +255,5 @@
> +    Instruction *inst = getInstructionAt(BufferOffset(branchOffset));
> +    inst->SetImmPCOffsetTarget(inst + nextOffset().getOffset() - branchOffset);
> +}
> +
> +// FIXME: Share with ARM?

We should definitely make something like arm-shared. There are a lot of functions here that can get deduplicated. Can you open follow-up for this?

@@ +297,5 @@
> +
> +size_t
> +Assembler::addPatchableJump(BufferOffset src, Relocation::Kind reloc)
> +{
> +    MOZ_CRASH("TODO: This is currently unused (and untested)");

In that case, shouldn't we just remove this?

@@ +315,5 @@
> +    // entry).
> +    Instruction *jump = (Instruction*)jump_.raw();
> +    //printInstruction(jump-12,3);
> +    //printf("*");
> +    //printInstruction(jump,3);

Remove the printfs

@@ +396,5 @@
> +    }
> +    printf("Toggle Call, load: ");
> +    printInstruction(load, 1);
> +    printf("call: ");
> +    printInstruction(call, 1);

Remove printfs

::: js/src/jit/arm64/Assembler-arm64.h
@@ +24,5 @@
> +// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
> +// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
> +// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
> +// OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +// OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Can't we use the normal 'Mozilla Public Licence'? Why not? And is this compatible with our current license? Do we need somebody to check that to be certain?

@@ +44,5 @@
> +static MOZ_CONSTEXPR_VAR Register ScratchReg = { Registers::ip0 };
> +static MOZ_CONSTEXPR_VAR ARMRegister ScratchReg64 = { ScratchReg, 64 };
> +static MOZ_CONSTEXPR_VAR ARMRegister ScratchReg32 = { ScratchReg, 32 };
> +
> +static MOZ_CONSTEXPR_VAR Register ScratchReg2 = { Registers::ip1 };

Two scratch registers?

@@ +46,5 @@
> +static MOZ_CONSTEXPR_VAR ARMRegister ScratchReg32 = { ScratchReg, 32 };
> +
> +static MOZ_CONSTEXPR_VAR Register ScratchReg2 = { Registers::ip1 };
> +static MOZ_CONSTEXPR_VAR ARMRegister ScratchReg2_64 = { ScratchReg2, 64 };
> +static MOZ_CONSTEXPR_VAR ARMRegister ScratchReg2_32 = { ScratchReg2, 32 };

In 'arm-32' you define these as FloatRegister (which is VFPRegister). Shouldn't we also use that here?
FloatRegister::Single / FloatRegister::Double can be used in that case instead of 64/32.

@@ +61,5 @@
> +static MOZ_CONSTEXPR_VAR ARMFPRegister ScratchDoubleReg_ = { ScratchDoubleReg, 64 };
> +static MOZ_CONSTEXPR_VAR ARMFPRegister ReturnDoubleReg_ = { ReturnDoubleReg, 64 };
> +
> +static MOZ_CONSTEXPR_VAR ARMFPRegister ReturnFloat32Reg_ = { ReturnFloat32Reg, 32 };
> +static MOZ_CONSTEXPR_VAR ARMFPRegister ScratchFloat32Reg_ = { ScratchFloat32Reg, 32 };

FloatRegister?

@@ +84,5 @@
> +static MOZ_CONSTEXPR_VAR Register JSReturnReg = { Registers::x2 };
> +static MOZ_CONSTEXPR_VAR Register FramePointer = { Registers::fp };
> +static MOZ_CONSTEXPR_VAR Register ZeroRegister = { Registers::sp };
> +static MOZ_CONSTEXPR_VAR ARMRegister ZeroRegister64 = { Registers::sp, 64 };
> +static MOZ_CONSTEXPR_VAR ARMRegister ZeroRegister32 = { Registers::sp, 32 };

Do we have something like VFPRegister but for gpr? So we can use ::Single/::Double too?

@@ +87,5 @@
> +static MOZ_CONSTEXPR_VAR ARMRegister ZeroRegister64 = { Registers::sp, 64 };
> +static MOZ_CONSTEXPR_VAR ARMRegister ZeroRegister32 = { Registers::sp, 32 };
> +
> +static MOZ_CONSTEXPR_VAR FloatRegister ReturnFloatReg = { FloatRegisters::d0 };
> +static MOZ_CONSTEXPR_VAR FloatRegister ScratchFloatReg = { FloatRegisters::d31 };

Isn't this ReturnDoubleReg/ScratchDoubleReg? This can get removed...

@@ +92,5 @@
> +
> +static MOZ_CONSTEXPR_VAR FloatRegister ReturnSimdReg = InvalidFloatReg;
> +static MOZ_CONSTEXPR_VAR FloatRegister ScratchSimdReg = InvalidFloatReg;
> +
> +static MOZ_CONSTEXPR_VAR Register RealStackPointer = { Registers::sp };

Since 'RealStackPointer' and 'StackPointer' are the same? Why distinguish between them?

@@ +228,5 @@
> +
> +    bool oom() const {
> +        // FIXME: Currently not possible to OOM.
> +        return false;
> +    }

Can you implement this? I cannot imagine a full arm-64 implementation that cannot OOM.

@@ +280,5 @@
> +        }
> +    }
> +
> +    void Bind(uint8_t *rawCode, AbsoluteLabel *label, const void *address) {
> +        uint32_t off = actualOffset(label->offset());

Can we try to use as much as uintptr_t as possible?

@@ +289,5 @@
> +        uint32_t nextLinkOffset = uint32_t(link->ImmPCRawOffset());
> +        if (nextLinkOffset == uint32_t(LabelBase::INVALID_OFFSET))
> +            return false;
> +        *next = BufferOffset(nextLinkOffset + cur.getOffset());
> +        return true;

Why the difference with "arm-32"?

@@ +308,5 @@
> +
> +    // The buffer is about to be linked. Ensure any constant pools or
> +    // excess bookkeeping has been flushed to the instruction stream.
> +    void flush() {
> +        // TODO: MOZ_ASSERT(!isFinished);

Assert this!

@@ +325,5 @@
> +    }
> +    static uint8_t *PatchableJumpAddress(JitCode *code, uint32_t index) {
> +        return code->raw() + index;
> +    }
> +    void setPrinter(Sprinter *sp) {

Create followup bug to support this in arm?

@@ +353,5 @@
> +        return 4;
> +    }
> +
> +    static void PatchWrite_NearCall(CodeLocationLabel start, CodeLocationLabel toCall) {
> +        Instruction *dest = (Instruction*)start.raw();

style nit: space between (Instruction*) and start.raw()

@@ +354,5 @@
> +    }
> +
> +    static void PatchWrite_NearCall(CodeLocationLabel start, CodeLocationLabel toCall) {
> +        Instruction *dest = (Instruction*)start.raw();
> +        //printf("patching %p with call to %p\n", start.raw(), toCall.raw());

Remove comment

@@ +355,5 @@
> +
> +    static void PatchWrite_NearCall(CodeLocationLabel start, CodeLocationLabel toCall) {
> +        Instruction *dest = (Instruction*)start.raw();
> +        //printf("patching %p with call to %p\n", start.raw(), toCall.raw());
> +        bl(dest, ((Instruction*)toCall.raw() - dest)>>2);

No need for AutoFlushICache anymore?

@@ +399,5 @@
> +                int poolSize = (next->InstructionBits() & 0x7fff);
> +                Instruction *ret = (next + (poolSize << 2));
> +                return ret;
> +            }
> +        }

Can somebody else review this code? I don't have the knowledge to know if that is correct?

@@ +546,5 @@
> +    ABIArg current_;
> +};
> +
> +// FIXME: ugh. why is this not a static member of Assembler?
> +void PatchJump(CodeLocationJump &jump_, CodeLocationLabel label);

Open follow-up bug and replace FIXME with TODO: BUGxxxx

@@ +596,5 @@
> +    return true;
> +
> +}
> +
> +// FIXME: Should be shared with ARM's Assembler.

Open follow-up bug and replace FIXME with TODO: BUGxxxx
Attachment #8585701 - Flags: review?(hv1989)
Attachment #8585716 - Flags: review?(hv1989) → review+
(In reply to Marty Rosenberg [:mjrosenb] from comment #31)
> > > +void AssemblerVIXL::ldaxp(const ARMRegister& rt, const ARMRegister& rt2, const MemOperand& src)
> > 
> > This one hasn't been reformatted.
> > 
> That one isn't more than 100 characters. (Unless you were talking about
> something other than the fact that it isn't line wrapped.)

Ah, I'm behind the times, sorry. (The style guide for JaegerMonkey was a little
different, if I remember correctly.)

> > @@ +1617,5 @@
> > > +    Emit(MSR | Rt(rt) | ImmSystemRegister(sysreg));
> > > +}
> > > +
> > > +BufferOffset
> > > +AssemblerVIXL::hint(SystemHint code)
> > 
> > This is a bit ugly; everything else emits into the instruction stream, but
> > this one returns the encoding. I would make `hint` emit into the stream,
> > then make something like `Encode_hint` to do a one-off.
> > 
> > (Yes, it's a shame that VIXL can't encode without a buffer to put the
> > results into, but the problem is pretty rare in practice and it would
> > require a lot of changes to allow it everywhere.)
> > 
> As far as I can tell, it does in fact emit the specified hint into the
> instruction stream.  It also happens to return
> the offset of the instruction that was emitted.

Ok, I think I misread it; sorry about that!

> > @@ +2729,5 @@
> > > +
> > > +    struct Header
> > > +    {
> > > +        // The size should take into account the pool header.
> > > +        // The size is in units of Instruction (4bytes), not byte.
> > 
> > VIXL's `Instruction` is an empty class (with a size of 1). Have you changed
> > that?
> > 
> > (We do have `Instr`, with a size of 4.)
> > 
> the comment was wrong, I was not attempting to refer to the class
> Instruction, but rather an actual instruction, which is 4 bytes.  This has
> been changed to 'Instr'.
> > ::: js/src/jit/arm64/vixl/Assembler-vixl.h
> > @@ +222,5 @@
> > > +
> > > +    static const ARMRegister& WRegFromCode(unsigned code);
> > > +    static const ARMRegister& XRegFromCode(unsigned code);
> > > +
> > > +    // V8 compatibility.
> > 
> > You don't need this.
> I think we actually need XRegFromCode, and DRegFromCode.  I'd be happy to
> replace it with something else that does the same thing.

I meant the code after the "V8 compatibility" comment. (I can't see a way to
highlight specific lines in the review system.) Specifically, `kNumRegisters`
and `kNumAllocatableRegisters` are just aliases that don't need to be there.

> > [...]
> > What's the `_64` suffix for? The 32-bit version is called `wip0`.
> Because we moved vixl out of its own namespace, ip0 conflicted with the
> global Register (not ARMRegister) definition of it.
> I've changed all of the ARMRegisters to have the actual names, and the
> Registers to have some other name.

Ok, fair enough. (I think leaving the namespace alone would have been somewhat
easier, but that's up to you.)

> > ::: js/src/jit/arm64/vixl/MacroAssembler-vixl.cpp
> > @@ +58,5 @@
> > > +
> > > +void
> > > +MacroAssemblerVIXL::And(const ARMRegister& rd, const ARMRegister& rn, const Operand& operand)
> > > +{
> > > +    LogicalMacro(rd, rn, operand, AND);
> > 
> > The `allow_macro_instructions_` checks have been removed from the
> > MacroAssembler. These were used to enforce InstructionAccurateScope. If you
> > don't use the scope, that's fine, but you should probably remove it so that
> > it doesn't _appear_ to work (but not actually do the checks it claims to do).
> > 
> > @@ +824,5 @@
> > > +    //   str sp, [sp, #-8]!
> > > +    // which pre-decrements the stack pointer, storing the decremented value.
> > > +    //
> > > +    // Instead, we must use a scratch register. This hijacks ip1.
> > > +    if (src0.Is(GetStackPointer64())) {
> > 
> > Perhaps this should be a separate function (`PushStackPointer`).
> That isn't really practical.  For the most part, the only time this will
> happen is when we are pushing all of the registers from PushRegsInMask. 
> That usually doesn't even know it is pushing the stack pointer.

I don't know what that function looks like. Is it in platform-independent code?
If so, fair enough, but note that this workaround is fragile; as the comments
say, it only works where there is exactly one register (`sp`) to push.

Still, this is up to you (Mozilla); the VIXL side of things looks functionally
correct given the usage you describe.
Comment on attachment 8585699 [details] [diff] [review]
add_baseline.patch

Review of attachment 8585699 [details] [diff] [review]:
-----------------------------------------------------------------

Would like a response to comments question, but otherwise looks sensible.

::: js/src/jit/arm64/BaselineHelpers-arm64.h
@@ +94,5 @@
> +
> +    // Push frame descriptor (minus the return address) and perform the tail call.
> +    MOZ_ASSERT(BaselineTailCallReg == lr);
> +    masm.makeFrameDescriptor(r0, JitFrame_BaselineJS);
> +    masm.MacroAssemblerVIXL::Push(x0);

Why are we using a special Push method here?  Does the regular push not use the right stack reg?

@@ +111,5 @@
> +    ARMRegister reg64(reg, 64);
> +
> +    // Compute stub frame size. We have to add two pointers: the stub reg and previous
> +    // frame pointer pushed by EmitEnterStubFrame.
> +    masm.Add(reg64, ARMRegister(BaselineFrameReg, 64), Operand(sizeof(void *) * 2));

It seems more "logical" to subtract 2 words from StackPointer, and then subtract that from BaselineFrameReg, as it mirrors the stack layout change logic.  The arithmetic works out the same both ways, but is there a particular reason we're adding 2 words to BaselineFrameReg instead of subbing 2 words from the sp?

@@ +136,5 @@
> +    MOZ_ASSERT(scratch != BaselineTailCallReg);
> +
> +    // Compute frame size.
> +    masm.movePtr(BaselineFrameReg, scratch);
> +    masm.addPtr(Imm32(BaselineFrame::FramePointerOffset), scratch);

These two should be combinable with ARM 2-operand instructions:  scratch <- BaselineFrameReg + Imm32(BaselineFrame::FramePoinerOffset)

@@ +171,5 @@
> +        masm.pop(ScratchReg);
> +        masm.Lsr(ScratchReg64, ScratchReg64, FRAMESIZE_SHIFT);
> +        masm.Add(masm.GetStackPointer64(), masm.GetStackPointer64(), ScratchReg64);
> +    } else {
> +        masm.Add(masm.GetStackPointer64(), ARMRegister(BaselineFrameReg, 64), Operand(0));

Why is this an Add with 0 operand instead of a Move?

::: js/src/jit/arm64/BaselineIC-arm64.cpp
@@ +176,5 @@
> +        masm.monoTagMove(R0.valueReg(), Rscratch);
> +        break;
> +      case JSOP_RSH:
> +        masm.Asr(Wscratch, W0, W1);
> +        masm.monoTagMove(R0.valueReg(), Rscratch);

|monoTagMove| is a weird name for a function.  It seems like it's basically "movePayload" for ValueOperands, so why not call it that?
Attachment #8585699 - Flags: review?(kvijayan) → review+
Comment on attachment 8585717 [details] [diff] [review]
integrate_baseline.patch

Review of attachment 8585717 [details] [diff] [review]:
-----------------------------------------------------------------

Leaving the ASMJS stuff for luke.  Cancelling review because I don't understand certain decisions (noted in comments).

::: js/src/jit/BaselineCompiler.cpp
@@ +792,5 @@
>  #endif
> +#ifdef JS_CODEGEN_ARM64
> +    // Insert a nop into the instruction stream to force the next instruction to not
> +    // be hidden behind a pool.
> +    uint32_t curOff = masm.nop().getOffset();

Not sure why this is necessary only for ARM64.  How does a nop prevent next instruction from being hidden behind a pool?  And why is this an issue in ARM64 but not ARM32?

@@ +3774,1 @@
>          masm.push(ImmWord(0));

Why is this disabled for ARM64?  Are we not doing returnaddr-on-stack for our ARM64 abi?

::: js/src/jit/BaselineIC.cpp
@@ +10546,5 @@
>      }
>  
> +    // TODO: Check alignment here?
> +    // Maybe alignment should actually be 8, given that we never use sp.
> +    //masm.checkStackAlignment();

Why is this commented out?

::: js/src/jit/BaselineIC.h
@@ +1154,1 @@
>              MOZ_CRASH("Invalid numInputs");

This might be better done as MOZ_ASSERT(numInputs < 3) above, and omitting the last conditional.
Attachment #8585717 - Flags: review?(kvijayan)

Comment 40

4 years ago
Comment on attachment 8585717 [details] [diff] [review]
integrate_baseline.patch

Review of attachment 8585717 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/asmjs/AsmJSModule.h
@@ +1354,5 @@
>      // offset codeBytes_) in the module's linear allocation. The global data
>      // are laid out in this order:
>      //   0. a pointer to the current AsmJSActivation
>      //   1. a pointer to the heap that was linked to the module
> +    //   2. the length of the heap

This word is no longer needed (and was redundant with module->heapLength()), so can you revert all the changes to AsmJSModule.h in this patch?

::: js/src/asmjs/AsmJSSignalHandlers.cpp
@@ +1170,5 @@
>          if (module.containsFunctionPC((void*)rt->simulator()->get_pc()))
>              rt->simulator()->set_resume_pc(int32_t(module.interruptExit()));
> +#elif defined(JS_ARM64_SIMULATOR)
> +        if (rt->mainThread.simulator() && module.containsFunctionPC((void*)rt->simulator()->get_pc()))
> +            rt->simulator()->set_resume_pc(int64_t(module.interruptExit()));

Can you fold this with the first #ifdef by including the rt->mainThread.simulator() test and using intptr_t?

::: js/src/asmjs/AsmJSValidate.cpp
@@ +9243,4 @@
>  
> +
> +    // SWEET BABY JESUS, AsyncInterruptExit /really/ wants to function without modifing any registers
> +    // trouble is: this is basically impossible, since we still need to /return/

Need to remove the unprofessional part of this comment before landing.

@@ +9270,5 @@
> +    masm.Mov(sp,x25);
> +
> +    masm.SetStackPointer64(fake_sp);
> +
> +

Remove extra \n
Attachment #8585717 - Flags: review?(luke) → review+
Comment on attachment 8585715 [details] [diff] [review]
fix_ret.patch

Review of attachment 8585715 [details] [diff] [review]:
-----------------------------------------------------------------

Assuming comment 33 gets answered and there actually is a difference between the twos, r=me.

::: js/src/asmjs/AsmJSFrameIterator.cpp
@@ +265,5 @@
>          MOZ_ASSERT(PostStorePrePopFP == 0);
>  #endif
>  
>          masm.bind(profilingReturn);
> +        popReturn(masm);

nit: masm.popReturn(); ?
Attachment #8585715 - Flags: review?(benj) → review+
Comment on attachment 8585688 [details] [diff] [review]
add_moveemitter.patch

Review of attachment 8585688 [details] [diff] [review]:
-----------------------------------------------------------------

Mostly good, I'd like to have a second look later though, as there are a lot of unused remains from the ARM copy-pasto.

::: js/src/jit/arm64/MoveEmitter-arm64.cpp
@@ +9,5 @@
> +using namespace js;
> +using namespace js::jit;
> +
> +void
> +MoveEmitterARM64::emit(const MoveResolver &moves)

nit: here and everywhere, code style has changed from "const A &sthg" to "const A& sthg"

@@ +16,5 @@
> +        masm.reserveStack(sizeof(void *));
> +        pushedAtCycle_ = masm.framePushed();
> +    }
> +
> +    // TODO: Optimize cycles.

please open a bug to mention in the comment (or feel free to fix it ;))

@@ +24,5 @@
> +
> +void
> +MoveEmitterARM64::finish()
> +{
> +    MOZ_ASSERT(!inCycle_);

nit: that's assertDone();

@@ +42,5 @@
> +        inCycle_ = false;
> +        return;
> +    }
> +
> +    // TODO: MoveEmitterX86 has logic to attempt to avoid using the stack.

yay for the xor trick! Please open a bug to mention in the comment here or feel free to fix it!

@@ +68,5 @@
> +    }
> +}
> +
> +void
> +MoveEmitterARM64::emitFloat32Move(const MoveOperand &from, const MoveOperand &to)

in this function and the three below, feel free to add more 'return;' statements to have fewer levels of indentation, as done in emitGeneralMove.

@@ +97,5 @@
> +    } else {
> +        if (to.isFloatReg()) {
> +            masm.Ldr(toFPReg(to, MoveOp::DOUBLE), toMemOperand(from));
> +        } else {
> +            masm.ldr(ScratchDoubleReg_, toMemOperand(from));

what's the difference between ldr and Ldr? Why isn't there such a distinction for stores? or is that a typo?

@@ +124,5 @@
> +
> +void
> +MoveEmitterARM64::emitGeneralMove(const MoveOperand &from, const MoveOperand &to)
> +{
> +

nit: blank line

@@ +177,5 @@
> +void
> +MoveEmitterARM64::breakCycle(const MoveOperand &from, const MoveOperand &to, MoveOp::Type type)
> +{
> +    // TODO: Uh, pretty sure cycle resolution should just use a temp register.
> +    // TODO: Stack seems pretty overkill.

Indeed, especially as we have both a temp general and a temp float register? (this optimization could be applied on x64 as well?)

@@ +213,5 @@
> +        } else {
> +            masm.Str(toARMReg64(to), cycleSlot());
> +        }
> +        break;
> +      case MoveOp::INT32X4: // TODO

you can include these two cases in the "default" one, at the moment.

::: js/src/jit/arm64/MoveEmitter-arm64.h
@@ +15,5 @@
> +namespace jit {
> +
> +class CodeGenerator;
> +
> +class MoveEmitterARM64

Any chance we could make a MoveEmitterARMBase? A lot of the code seems duplicated between the MoveEmitterARM class and this one.

@@ +32,5 @@
> +
> +    // These are registers that are available for temporary use. They may be
> +    // assigned InvalidReg. If no corresponding spill space has been assigned,
> +    // then these registers do not need to be spilled.
> +    Register spilledReg_;

unused

@@ +33,5 @@
> +    // These are registers that are available for temporary use. They may be
> +    // assigned InvalidReg. If no corresponding spill space has been assigned,
> +    // then these registers do not need to be spilled.
> +    Register spilledReg_;
> +    FloatRegister spilledFloatReg_;

ditto

@@ +42,5 @@
> +
> +    Register tempReg() {
> +        return ScratchReg;
> +    }
> +    FloatRegister tempFloatReg() {

nit: unused
Attachment #8585688 - Flags: review?(benj)
(Assignee)

Updated

4 years ago
Depends on: 1160672
(Assignee)

Updated

4 years ago
Attachment #8585686 - Flags: review?(sstangl)
(Assignee)

Updated

4 years ago
Attachment #8585693 - Flags: review?(sstangl)
(Assignee)

Updated

4 years ago
Attachment #8585718 - Flags: review?(sstangl)
(Assignee)

Updated

4 years ago
Attachment #8585719 - Flags: review?(sstangl)
(Assignee)

Updated

4 years ago
Attachment #8585698 - Flags: review?(jdemooij)
(Assignee)

Updated

4 years ago
Attachment #8585691 - Flags: review?(jdemooij)
(Assignee)

Comment 43

4 years ago
I'm going to break up the patches in this bug into many sub-bugs, as originally planned, to avoid generating a ton of noise an a singular location. Each bug is to be landed independently, then tied together with a build system patch at the very end.

I worked through most of the commentary here and cleaned up most of the Baseline code. The AsmJS and Ion code are still in very bad shape, and probably shouldn't be landed whatsoever without stringent review and proper testing. So I will just work toward getting baseline support landed.
(Assignee)

Updated

4 years ago
Depends on: 1166037
(Assignee)

Updated

4 years ago
Depends on: 1166527
(Assignee)

Updated

4 years ago
Depends on: 1166530
(Assignee)

Updated

3 years ago
Depends on: 1173642
(Assignee)

Updated

3 years ago
Depends on: 1173969
(Assignee)

Updated

3 years ago
Depends on: 1173992
(Assignee)

Updated

3 years ago
Depends on: 1175556
(Assignee)

Updated

3 years ago
Depends on: 1179502
(Assignee)

Updated

3 years ago
Blocks: 1179514
(Assignee)

Comment 44

3 years ago
This bug is a huge mess, but ARM64 support has landed in-tree, enabled via "--enable-simulator=arm64".

Work on passing the testsuite will be tracked by Bug 1179514.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
(Assignee)

Updated

3 years ago
No longer depends on: 1179502

Updated

3 years ago
Blocks: 1187826
You need to log in before you can comment on or make changes to this bug.