Closed Bug 1835034 Opened 2 years ago Closed 1 year ago

Implement JIT Support for Float16Array

Tracking

()

Status:

RESOLVED FIXED

Milestone:

130 Branch

Tracking Flags:

Tracking

Status

firefox130

---

fixed

People

(Reporter: dminor, Assigned: anba)

References

Details

Attachments

(14 files)

Bug 1835034 - Part 1: Add float/int conversions to js::float16. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1835034 - Part 2: Support encoding vcvtph2ps and vcvtps2ph instructions. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1835034 - Part 3: Add missing to Float16 conversion when simulating FCVT_dh. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1835034 - Part 4: Support `float` results in storeCallFloatResult. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1835034 - Part 5: Add {Double,Float32,Int32}ToFloat16 conversion methods to MacroAssemblers. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1835034 - Part 6: Inline Math.f16round. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1835034 - Part 7: Inline loading from Float16Array. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1835034 - Part 8: Inline DataView.prototype.getFloat16. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1835034 - Part 9: Inline storing into Float16Array. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1835034 - Part 10: Inline DataView.prototype.setFloat16. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1835034 - Part 11: Fold ToFloat16 when the input is guaranteed to be Float16. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1835034 - Part 12: AsmJS codegen doesn't support Float16Array. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1835034 - Part 13: Fix indentation for codegen spew. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1835034 - Part 14: Enable float32 optimizations for MToDouble. r=jandem! 1 year ago André Bargull [:anba] 48 bytes, text/x-phabricator-request		Details \| Review

Dan Minor [:dminor]

Reporter

Description

•

2 years ago

From https://bugzilla.mozilla.org/show_bug.cgi?id=1833647#c12, a follow up to implement JIT support for Float16Array.

Matthew Gaudet (he/him) [:mgaudet]

Updated

•

2 years ago

Severity: -- → S3

Priority: -- → P3

Dan Minor [:dminor]

Reporter

Comment 1

•

1 year ago

There's some discussion of optimizations here: https://github.com/tc39/proposal-float16array/issues/12

André Bargull [:anba]

Assignee

Updated

•

1 year ago

Assignee: nobody → andrebargull

Status: NEW → ASSIGNED

Mathew Hodson

Updated

•

1 year ago

Depends on: 1905609

André Bargull [:anba]

Assignee

Comment 2

•

1 year ago

Attached file Bug 1835034 - Part 1: Add float/int conversions to js::float16. r=jandem! — Details

Add more conversion methods from upstream. Later patches will call the new
methods.

Add if-constexpr to ElementSpecific::valueToNative to avoid compiler errors,
because both js::float16::operator=(double) and js::float16::operator=(float)
are applicable assignment operators when assigning from int64_t.
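
The ambiguity can be reproduced outside the tree. The following is a minimal sketch, assuming a stand-in type `Half` with the same two assignment operators; `Half`, its `stored` member, and this free-standing `valueToNative` are hypothetical and only mirror the shape of the real code.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-in for js::float16; only the overload set matters here.
struct Half {
  double stored = 0;  // toy storage, not a real 16-bit encoding
  Half& operator=(double d) { stored = d; return *this; }
  Half& operator=(float f) { stored = f; return *this; }
};

// Hypothetical analogue of ElementSpecific::valueToNative.
template <typename Source>
Half valueToNative(Source value) {
  Half result;
  if constexpr (std::is_integral_v<Source>) {
    // `result = value;` would not compile for int64_t: int64_t -> double and
    // int64_t -> float are equally ranked conversions, so the call is ambiguous.
    result = static_cast<double>(value);
  } else {
    result = value;  // float and double each match one overload exactly
  }
  return result;
}

// Usage: integral sources compile because the discarded branch is never
// instantiated for them.
inline void example() {
  assert(valueToNative<int64_t>(42).stored == 42.0);
  assert(valueToNative(1.5f).stored == 1.5);
}
```

Casting integral sources to double is a choice made for this sketch; the patch itself may pick a different conversion.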

André Bargull [:anba]

Assignee

Comment 3

•

1 year ago

Attached file Bug 1835034 - Part 2: Support encoding vcvtph2ps and vcvtps2ph instructions. r=jandem! — Details

Support vcvtph2ps and vcvtps2ph instructions from the F16C instruction set.

F16C requires AVX to be enabled, per "Intel developer manual, Vol 1, §14.4.1
Detection of F16C Instructions".
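
As a rough illustration of that detection sequence, here is a sketch of the check as a pure function. The helper name `canUseF16C` is hypothetical, and the caller is assumed to have obtained the raw CPUID.01H:ECX value and, when OSXSAVE is set, the XCR0 value from XGETBV.

```cpp
#include <cassert>
#include <cstdint>

// Bit positions in CPUID.01H:ECX.
constexpr uint32_t kOsxsaveBit = 1u << 27;
constexpr uint32_t kAvxBit = 1u << 28;
constexpr uint32_t kF16cBit = 1u << 29;
// XCR0 bits 1 (SSE state) and 2 (AVX state) must both be enabled by the OS.
constexpr uint64_t kXcr0SseAvxMask = 0x6;

// Hypothetical helper: returns true only when F16C is reported AND AVX is
// both reported by the CPU and enabled by the operating system.
bool canUseF16C(uint32_t cpuid1Ecx, uint64_t xcr0) {
  if (!(cpuid1Ecx & kOsxsaveBit)) {
    return false;  // XGETBV is unavailable, so AVX state cannot be enabled
  }
  if ((xcr0 & kXcr0SseAvxMask) != kXcr0SseAvxMask) {
    return false;  // the OS does not save/restore the YMM state
  }
  return (cpuid1Ecx & kAvxBit) && (cpuid1Ecx & kF16cBit);
}

// Usage with made-up register values.
inline void example() {
  assert(canUseF16C(kOsxsaveBit | kAvxBit | kF16cBit, 0x7));
  assert(!canUseF16C(kOsxsaveBit | kF16cBit, 0x7));  // F16C bit without AVX
}
```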

Depends on D215762

André Bargull [:anba]

Assignee

Comment 4

•

1 year ago

Attached file Bug 1835034 - Part 3: Add missing to Float16 conversion when simulating FCVT_dh. r=jandem! — Details

Upstream already has this fixed.

Depends on D215763

André Bargull [:anba]

Assignee

Comment 5

•

1 year ago

Attached file Bug 1835034 - Part 4: Support `float` results in storeCallFloatResult. r=jandem! — Details

Support float in addition to double in storeCallFloatResult.

Depends on D215764

André Bargull [:anba]

Assignee

Comment 6

•

1 year ago

Attached file Bug 1835034 - Part 5: Add {Double,Float32,Int32}ToFloat16 conversion methods to MacroAssemblers. r=jandem! — Details

Hardware support for float16 conversions is limited:

- ARM supports float32<>float16 conversions with Neon. This is not implemented yet.
- ARM64 supports float32<>float16 and float64<>float16 conversions.
- x86/x64 supports float32<>float16 conversions when F16C instructions are supported.

We use the following approach for this initial implementation:

- Use supported conversions when available, otherwise fall back to an ABI call.
- Represent float16 as float32 throughout the JIT (so no MIRType::Float16 yet),
  because:
  1. float32 is supported for all targets, so we reduce cross-target differences
     when choosing the data type which is universally supported,
  2. float32<>float16 conversions are natively supported for the main target platforms,
  3. actual float16 math operations have even more limited hardware support, so we need
     to convert float16 to either float32 or float64 anyway at some point.
  And this also enables using the existing optimisations for MIRType::Float32.

The next part will start using the conversion methods from this patch.

Note 1: float64->float16 conversion can't be emulated through a float64->float32->float16
conversion sequence, because the sequence float64->float32 and float32->float16 can
round differently than the direct float64->float16 conversion.
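
A concrete value makes the double-rounding problem from Note 1 visible. The sketch below is not the in-tree conversion: `roundToHalfPrecision` is a hypothetical helper that rounds to float16's 11 significant bits with ties-to-even and deliberately ignores float16's exponent range, which is enough for values near 1.

```cpp
#include <cassert>
#include <cmath>

// Round `x` to 11 significant bits (float16 precision), ties-to-even.
// Simplification: no overflow or subnormal handling.
double roundToHalfPrecision(double x) {
  if (x == 0 || !std::isfinite(x)) {
    return x;
  }
  int exp;
  double m = std::frexp(x, &exp);  // x = m * 2^exp with 0.5 <= |m| < 1
  // m * 2^11 is exact; nearbyint rounds ties to even in the default mode.
  double scaled = std::nearbyint(std::ldexp(m, 11));
  return std::ldexp(scaled, exp - 11);
}

// Direct float64 -> float16 rounding.
double direct(double x) { return roundToHalfPrecision(x); }

// float64 -> float32 -> float16, i.e. two rounding steps.
double viaFloat32(double x) {
  return roundToHalfPrecision(static_cast<double>(static_cast<float>(x)));
}

// x lies just above the midpoint between 1 and the next float16, 1 + 2^-10.
// Rounding to float32 first drops the 2^-30 term and lands exactly on the
// midpoint, which then ties to even (1.0) instead of rounding up.
inline void example() {
  double x = 1.0 + std::ldexp(1.0, -11) + std::ldexp(1.0, -30);
  assert(direct(x) == 1.0009765625);
  assert(viaFloat32(x) == 1.0);
}
```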

Note 2: float f(int32_t) in "ABIFunctionType.yaml" requires an explicit General -> Float32
entry for the ARM simulator, just adding Int32 -> Float32 led to an error.

Depends on D215765

André Bargull [:anba]

Assignee

Comment 7

•

1 year ago

Attached file Bug 1835034 - Part 6: Inline Math.f16round. r=jandem! — Details

Inline Math.f16round similar to how Math.fround is inlined:

- CacheIRCompiler either calls the conversion methods from part 5
  or calls into the VM.
- Warp transpiles to MToFloat16, whose implementation is similar to
  that of MToFloat32.

Depends on D215766

André Bargull [:anba]

Assignee

Comment 8

•

1 year ago

Attached file Bug 1835034 - Part 7: Inline loading from Float16Array. r=jandem! — Details

Extend MacroAssembler::loadFromTypedArray to support loading from Float16Array.
This requires passing an additional temp-register and LiveRegisterSet when the
target doesn't natively support float32<>float16 conversions.

Codegen for LoadUnboxedScalar on x86/x64 looks like:

movzwl 0x0(%rdx,%rbx,2), %esi
vmovd %esi, %xmm0
vpmovzxwq %xmm0, %xmm0
vcvtph2ps %xmm0, %xmm0
vucomiss %xmm0, %xmm0
jnp .Lfrom120
movss .Lfrom128(%rip), %xmm0

And on ARM64:

ldr h0, [x2, x3, lsl #1]
fcvt s0, h0
fcmp s0, s0
b.vc -> 1015f
ldr s0, pc+24 (addr 0x70c2b0a96224) ; .const nan
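
Both sequences load 16 bits, widen to float32, and replace any NaN with a constant NaN. A portable sketch of the same steps follows; `halfBitsToFloat`, `loadFloat16Element`, and the choice of 0x7FC00000 as the canonical float32 NaN are assumptions of this example, not names or constants taken from the patch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <limits>

// Reinterpret a float's bits, used here only for inspection.
uint32_t floatBits(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return bits;
}

// Software equivalent of vcvtph2ps / fcvt s0, h0 for a single value.
float halfBitsToFloat(uint16_t bits) {
  uint32_t sign = (bits >> 15) & 1;
  uint32_t exponent = (bits >> 10) & 0x1F;
  uint32_t mantissa = bits & 0x3FF;
  float value;
  if (exponent == 0) {
    value = std::ldexp(static_cast<float>(mantissa), -24);  // zero/subnormal
  } else if (exponent == 31) {
    value = mantissa ? std::numeric_limits<float>::quiet_NaN()
                     : std::numeric_limits<float>::infinity();
  } else {
    // (1024 + mantissa) * 2^(exponent - 25) == 1.m * 2^(exponent - 15)
    value = std::ldexp(static_cast<float>(mantissa | 0x400),
                       static_cast<int>(exponent) - 25);
  }
  return sign ? -value : value;
}

// Load, widen, then canonicalize NaN (the vucomiss/jnp and fcmp/b.vc steps).
float loadFloat16Element(const uint16_t* elements, size_t index) {
  float value = halfBitsToFloat(elements[index]);
  if (value != value) {  // true only for NaN
    const uint32_t canonical = 0x7FC00000;  // assumed canonical NaN pattern
    std::memcpy(&value, &canonical, sizeof(value));
  }
  return value;
}
```

With this sketch, an element holding 0xFE01 (a negative NaN with a payload) loads as the single canonical NaN pattern, so the payload is not observable afterwards.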

Depends on D215767

André Bargull [:anba]

Assignee

Comment 9

•

1 year ago

Attached file Bug 1835034 - Part 8: Inline DataView.prototype.getFloat16. r=jandem! — Details

Extend the existing DataView code to also support Float16, using similar
changes as the previous part.

Depends on D215768

André Bargull [:anba]

Assignee

Comment 10

•

1 year ago

Attached file Bug 1835034 - Part 9: Inline storing into Float16Array. r=jandem! — Details

Slightly larger changes when compared to the previous two parts, because
MacroAssembler::storeToTypedFloatArray had to be changed to support
conversions instead of performing conversion in its caller:

CacheIRCompiler::emitStoreTypedArrayElement used ScratchFloat32Scope to
convert double -> float32, but using the same approach won't work for float16,
because ScratchFloat32Scope is also needed in MacroAssembler::storeFloat16
to convert float32 -> float16.
Therefore move the conversion double -> float32 into StoreToTypedFloatArray
and the conversion double -> float16 into MacroAssembler::storeFloat16.

Codegen for StoreUnboxedScalar on x64 looks like:

vcvtps2ph $0x4, %xmm0, %xmm15
vmovd %xmm15, %r11d
movw %r11w, 0x0(%rdx,%rbx,2)

And on ARM64:

fcvt h31, s0
str h31, [x2, x4, lsl #1]
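
For reference, the store path can be modelled in software as a single rounding step followed by a 16-bit store. This is a sketch only: `doubleToHalfBits` and `storeFloat16Element` are hypothetical names, the rounding is ties-to-even (the default mode the hardware instructions would normally run under), and NaN inputs are mapped to one quiet NaN pattern as a simplification.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>

// Convert to float16 bits with ONE rounding step, so float32 inputs (widened
// exactly) and float64 inputs are both correctly rounded.
uint16_t doubleToHalfBits(double d) {
  const int sign = std::signbit(d) ? 0x8000 : 0;
  if (std::isnan(d)) {
    return static_cast<uint16_t>(sign | 0x7E00);  // simplification: one quiet NaN
  }
  const double a = std::fabs(d);
  if (a == 0) {
    return static_cast<uint16_t>(sign);
  }
  if (a >= 65520.0) {
    // 65520 is the midpoint between the largest finite float16 (65504) and
    // 2^16; it and everything above it round to infinity.
    return static_cast<uint16_t>(sign | 0x7C00);
  }
  int exp;
  std::frexp(a, &exp);                   // a = m * 2^exp, 0.5 <= m < 1
  const int e = std::max(exp - 1, -14);  // clamp to the subnormal exponent
  // Significand in units of 2^(e-10); includes the implicit bit for normals.
  const int q = static_cast<int>(std::nearbyint(std::ldexp(a, 10 - e)));
  // Adding (rather than OR-ing) lets a carry out of the mantissa bump the
  // exponent field, and makes the subnormal case (e == -14) fall out as q.
  return static_cast<uint16_t>(sign | (((e + 14) << 10) + q));
}

void storeFloat16Element(uint16_t* elements, size_t index, double value) {
  elements[index] = doubleToHalfBits(value);
}
```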

Depends on D215769

André Bargull [:anba]

Assignee

Comment 11

•

1 year ago

Attached file Bug 1835034 - Part 10: Inline DataView.prototype.setFloat16. r=jandem! — Details

Depends on D215770

André Bargull [:anba]

Assignee

Comment 12

•

1 year ago

Attached file Bug 1835034 - Part 11: Fold ToFloat16 when the input is guaranteed to be Float16. r=jandem! — Details

Transpiler and type policies add the following instructions when reading and then
storing a value from a Float16Array:

value = MLoadUnboxedScalar(f16array)
guarded_value = MToDouble(value) <-- Inserted by WarpCacheIRTranspiler
typed_value = MToFloat16(guarded_value) <-- Inserted by StoreUnboxedScalarPolicy
MStoreUnboxedScalar(f16array, typed_value)

Neither MToDouble nor MToFloat16 is needed, so let MToFloat16::foldsTo remove them.

This extra folding is needed because we don't yet have a MIRType::Float16 which we
can handle in MToFloat16::foldsTo.

The WarpCacheIRTranspiler change is an optimisation to avoid generating the following
instructions during transpiling and applying the type policy:

value = MLoadUnboxedScalar(f16array)
double_value = MToDouble(value) <-- Inserted by js::jit::AlwaysBoxAt
boxed_value = MBox(double_value) <-- Inserted by BoxPolicy
unboxed_value = MUnbox(boxed_value, Double) <-- Inserted by WarpCacheIRTranspiler

GVN will remove the MBox->MUnbox sequence, but it seems preferable to avoid generating it
in the first place.
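
The fold described at the top of this comment can be illustrated with a toy IR. Everything below (`Node`, `Op`, `Scalar`, `foldToFloat16`) is hypothetical and far simpler than the real MIR classes; it only shows the pattern match of looking through the widening conversion and checking the storage type of the load.

```cpp
#include <cassert>

enum class Op { LoadUnboxedScalar, ToDouble, ToFloat16, Other };
enum class Scalar { Float16, Float32, Other };

// Toy instruction: one optional input plus the element type for loads.
struct Node {
  Op op;
  Node* input = nullptr;
  Scalar storage = Scalar::Other;  // only meaningful for LoadUnboxedScalar
};

// Returns the node that should replace `node`, or `node` itself when no
// fold applies.
Node* foldToFloat16(Node* node) {
  assert(node->op == Op::ToFloat16);
  Node* in = node->input;
  // ToDouble widens exactly, so it cannot change a value that is already
  // representable as float16.
  if (in && in->op == Op::ToDouble) {
    in = in->input;
  }
  // A value loaded from float16 storage is already a float16 value, so
  // rounding it to float16 again is a no-op.
  if (in && in->op == Op::LoadUnboxedScalar && in->storage == Scalar::Float16) {
    return in;
  }
  return node;
}
```

In this toy model the check on `storage` stands in for the missing MIRType::Float16: the type alone (Float32) does not say whether the value is float16-representable, so the fold has to look at where the value came from.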

Depends on D215771

André Bargull [:anba]

Assignee

Comment 13

•

1 year ago

Attached file Bug 1835034 - Part 12: AsmJS codegen doesn't support Float16Array. r=jandem! — Details

Remove the TODO note about adding Float16Array JIT support by renaming
OutOfLineLoadTypedArrayOutOfBounds to OutOfLineAsmJSLoadHeapOutOfBounds,
which makes it clearer that Float16 support isn't needed here.

Depends on D215772

André Bargull [:anba]

Assignee

Comment 14

•

1 year ago

Attached file Bug 1835034 - Part 13: Fix indentation for codegen spew. r=jandem! — Details

Before this change:

[Codegen] vucomiss   %xmm0, %xmm0
[Codegen] jnp        .Lfrom214
[Codegen] movss       .Lfrom222(%rip), %xmm0

After this change:

[Codegen] vucomiss   %xmm0, %xmm0
[Codegen] jnp        .Lfrom214
[Codegen] movss      .Lfrom222(%rip), %xmm0

Note how the label identifiers are now properly aligned.

Depends on D215773

André Bargull [:anba]

Assignee

Comment 15

•

1 year ago

Attached file Bug 1835034 - Part 14: Enable float32 optimizations for MToDouble. r=jandem! — Details

ToFloat32(ToDouble(float32)) is exactly equal to float32, so MToDouble can
produce Float32 when its input can produce Float32. This change is necessary to
enable Float32 optimizations for various instructions, for example MSqrt.

Without this change Float32 optimizations are always disabled, which makes it
hard to verify that Float16 operations correctly handle Float32 inputs and
outputs.
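
The round-trip identity is easy to check in isolation. The sketch below uses hypothetical helper names and assumes an IEEE 754 platform with correctly rounded square roots; the second helper illustrates why an operation like sqrt can be specialized to float32 (double carries enough extra precision that rounding the double result to float32 matches the direct float32 result).

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

uint32_t bitsOf(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return bits;
}

// ToFloat32(ToDouble(f)) == f: widening is exact, so narrowing restores f.
bool roundTripsThroughDouble(float f) {
  float back = static_cast<float>(static_cast<double>(f));
  return bitsOf(back) == bitsOf(f);
}

// Compare sqrt computed in double then narrowed against sqrt computed
// directly in float32.
bool sqrtAgrees(float f) {
  assert(f >= 0);
  float viaDouble = static_cast<float>(std::sqrt(static_cast<double>(f)));
  float direct = std::sqrt(f);  // float overload
  return bitsOf(viaDouble) == bitsOf(direct);
}
```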

Depends on D215774

Dan Minor [:dminor]

Reporter

Updated

•

1 year ago

Mentor: dminor

Pulsebot

Comment 16

•

1 year ago

Pushed by andre.bargull@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/410329e58599
Part 1: Add float/int conversions to js::float16. r=jandem
https://hg.mozilla.org/integration/autoland/rev/9d01c98d3d64
Part 2: Support encoding vcvtph2ps and vcvtps2ph instructions. r=jandem
https://hg.mozilla.org/integration/autoland/rev/ade51dbcc573
Part 3: Add missing to Float16 conversion when simulating FCVT_dh. r=jandem
https://hg.mozilla.org/integration/autoland/rev/3f2cb72c6348
Part 4: Support `float` results in storeCallFloatResult. r=jandem
https://hg.mozilla.org/integration/autoland/rev/c6ec3155a5f8
Part 5: Add {Double,Float32,Int32}ToFloat16 conversion methods to MacroAssemblers. r=jandem
https://hg.mozilla.org/integration/autoland/rev/f8cdec89d3cc
Part 6: Inline Math.f16round. r=jandem
https://hg.mozilla.org/integration/autoland/rev/d5ea08f74244
Part 7: Inline loading from Float16Array. r=jandem
https://hg.mozilla.org/integration/autoland/rev/816e1e8497d5
Part 8: Inline DataView.prototype.getFloat16. r=jandem
https://hg.mozilla.org/integration/autoland/rev/1543e18cfd43
Part 9: Inline storing into Float16Array. r=jandem
https://hg.mozilla.org/integration/autoland/rev/b39efb191c8e
Part 10: Inline DataView.prototype.setFloat16. r=jandem
https://hg.mozilla.org/integration/autoland/rev/10fb376db8cf
Part 11: Fold ToFloat16 when the input is guaranteed to be Float16. r=jandem
https://hg.mozilla.org/integration/autoland/rev/1006c87386c1
Part 12: AsmJS codegen doesn't support Float16Array. r=jandem
https://hg.mozilla.org/integration/autoland/rev/a67dc538eaee
Part 13: Fix indentation for codegen spew. r=jandem
https://hg.mozilla.org/integration/autoland/rev/f0bc3536dce7
Part 14: Enable float32 optimizations for MToDouble. r=jandem,nbp
https://hg.mozilla.org/integration/autoland/rev/e0c206023ab0
apply code formatting via Lando

Cristian Tuns

Comment 17

•

1 year ago

bugherder

Status: ASSIGNED → RESOLVED

Closed: 1 year ago

status-firefox130: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 130 Branch

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Updated

•

1 year ago

Regressions: 1909092

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Updated

•

1 year ago

Regressions: 1922104
