Open Bug 1860412 Opened 11 months ago Updated 4 months ago

ppc64le JIT

Categories

(Core :: JavaScript Engine: JIT, enhancement, P5)

Other
Unspecified
enhancement

Tracking

()

Tracking Status
firefox120 --- affected

People

(Reporter: spectre, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(3 files)

As requested in https://discourse.mozilla.org/t/spidermonkey-jit-for-power-isa-ppc64le/122930/3 .

This is the first draft of a JIT to support little-endian 64-bit Power ISA in Firefox. It is currently compatible with POWER9 OpenPOWER systems and passes the full test suite on Linux. The two pieces left are support for POWER8 and enabling the optimizing Wasm JIT (it uses Baseline right now), but it’s good enough to ship and testers are already evaluating it, and we can get those other pieces completed assuming it sticks. This is a descendant of the TenFourFox JIT for 32-bit PowerPC, upgraded to 64-bit and using the true little-endian mode supported on POWER8 and later.

Architecturally, ppc64le’s main differences from x64 are that it uses a link register, it has up to eight independent condition codes plus other condition codes in special purpose registers, it does not have strict store ordering, and it is a fixed-length instruction word (even Power10, which has “double” instructions, still aligns to a 32-bit word). Like MIPS, there is a limited displacement for branches and offsets.

ppc64le’s main differences from ARM64 is that it has up to eight independent condition codes plus other condition codes in special purpose registers, and that the link register is not a GPR (it’s a special purpose register), which requires it to be transferred and pushed separately.

Additionally, the standard ABI for Power ISA requires a linkage area of no less than 32 bytes on the top of the stack for the callee to store the link register, condition register, stack pointer and TOC register. This is dealt with in the port by pulling down dummy frames as necessary at ABI-Ion and ABI-Wasm edges (otherwise things like the Wasm Frame would get walloped, since it sits on top of the stack).

As requested, I'm posting the diffs against js/ for feedback and to determine if there is interest in accepting the port into the tree. If it is, I will pull up all changes to tip and formally request review.

Changes against wasm/. These are mostly the same as other platforms. There are a couple specific tweaks for the ABI.

Assignee: nobody → spectre
Status: NEW → ASSIGNED
Attachment #9359677 - Flags: feedback?(jdemooij)

Again, largely standard and the same as other ports, except for a couple ABI and architecture-specific tweaks.

Attachment #9359678 - Flags: feedback?(jdemooij)

These are the miscellaneous patches. The only vaguely notable one is that irregexp/ isn't really ABI compliant and needs a couple hacks due to its assumptions about the link register.

Attachment #9359679 - Flags: feedback?(jdemooij)
Blocks: sm-jits
Severity: -- → N/A
Priority: -- → P5

for the record

  • I am using the JIT-enabled Firefox 115 ESR daily on my workstation for over a month now and no issues
  • we (Fedora / Red Hat) are running a CI with daily rebuilds for ppc64le (and s390x, aarch64, ...) so any build issues are detected early
Attachment #9359677 - Flags: feedback?(jdemooij) → feedback?(rhunt)

Additionally, the standard ABI for Power ISA requires a linkage area of no less than 32 bytes on the top of the stack for the callee to store the link register, condition register, stack pointer and TOC register. This is dealt with in the port by pulling down dummy frames as necessary at ABI-Ion and ABI-Wasm edges (otherwise things like the Wasm Frame would get walloped, since it sits on top of the stack).

Could you elaborate on this a bit? Where are you inserting this dummy frame? When calling into wasm or exiting from wasm to native/JIT code? I looked through your wasm/ diff and didn't see anything here (but may have missed something obvious).

Flags: needinfo?(spectre)

(In reply to Ryan Hunt [:rhunt] from comment #5)

Could you elaborate on this a bit? Where are you inserting this dummy frame? When calling into wasm or exiting from wasm to native/JIT code? I looked through your wasm/ diff and didn't see anything here (but may have missed something obvious).

Most of that I've subsumed into the Power architecture-specific MacroAssembler and CodeGenerator. Calls to wasm::SymbolicAddress in particular need that stack adjustment but I didn't need to make any change within js/src/wasm to implement it. ResumeFromException in JitFrames.h does need a dummy linkage area baked into the struct since it's built on stack, but that's in js/src/jit.

Flags: needinfo?(spectre)

Okay, if the ABI changes can be limited to calls in/out of wasm in such a manner that the internal wasm abi doesn't need to change then this looks good to me.

Attachment #9359677 - Flags: feedback?(rhunt) → feedback+

Much obliged, thank you.

Attachment #9359679 - Flags: feedback?(jdemooij) → feedback+
Comment on attachment 9359678 [details] [diff] [review] Diff against js/src/jit Review of attachment 9359678 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/jit/JitFrames.cpp @@ +1776,5 @@ > +#if defined(JS_CODEGEN_PPC64) > + // There is no (simple) way from the ISA to determine if an arbitrary > + // FPR contains a float or a double since the ISA treats them largely > + // synonymously, so the MachineState will always contain a double even > + // if it's encoding a float. I'm not sure I understand this part. Isn't it possible to define float32 vs double registers like what we do on x86 for example?

Thanks for the f+ on the other one.

(In reply to Jan de Mooij [:jandem] from comment #9)

I'm not sure I understand this part. Isn't it possible to define float32 vs
double registers like what we do on x86 for example?

Do you mean determine from the ISA side if there's a float in the register, or do something like what x86's PushRegsInMask does by querying each FloatRegister's size before store*()-ing it? If you mean the latter, I suppose that could work, though we would run the risk of an unaligned stack on the other end.

As a counterargument, I note that RISC-V does basically the same as I'm doing, although for the record I don't know why it doesn't need the same change: without this only the lower half got properly loaded, which made the FPR gibberish.

(In reply to Cameron Kaiser [:spectre] from comment #10)

Do you mean determine from the ISA side if there's a float in the register, or do something like what x86's PushRegsInMask does by querying each FloatRegister's size before store*()-ing it? If you mean the latter, I suppose that could work, though we would run the risk of an unaligned stack on the other end.

As a counterargument, I note that RISC-V does basically the same as I'm doing, although for the record I don't know why it doesn't need the same change: without this only the lower half got properly loaded, which made the FPR gibberish.

Trying to understand how this works: the code we have in tree loads a (4-byte) float if we have a float32 register or an (8-byte) double if it's a double register. If we change this to always load a double on PPC64, where does the conversion from float32 (what's in the register at some point) to double happen?

I'm also asking because this code doesn't have great test coverage from what I remember. It's probably worth checking you hit this path on jit-tests.

Power ISA floating point registers are always doubles internally, so merely loading a 32-bit single precision value will autopromote it to double. You can always round to single precision either with an explicit rounding instruction or the set of single precision math instructions, but the register is still a double (just within single precision range). Actual conversion to a 32-bit single occurs when written to memory with specific instructions.

It does hit this path for us, because I had to make this change to get the test suite to pass (ISTR it was one of the Wasm tests).

Comment on attachment 9359678 [details] [diff] [review]
Diff against js/src/jit

Ah I see. This makes sense then :)

Attachment #9359678 - Flags: feedback?(jdemooij) → feedback+

Many thanks. I will pull these up to tip. Do you (both) want to review those first before the new files?

(In reply to Cameron Kaiser [:spectre] from comment #14)

Many thanks. I will pull these up to tip. Do you (both) want to review those first before the new files?

It would be best to post everything as a single patch stack, with different patches for the new files and the parts you have here.

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: spectre → nobody
Status: ASSIGNED → NEW
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: