Closed Bug 1664397 Opened 4 years ago Closed 4 years ago

Enhance Movi masm function to generate better code for SIMD constant loads

Status:

RESOLVED FIXED

Milestone:

84 Branch

Tracking Flags:

Tracking

Status

firefox84

---

fixed

People

(Reporter: lth, Assigned: lth)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

Bug 1664397 - Improve simd constant loads. r?jseward 4 years ago Lars T Hansen [:lth] 47 bytes, text/x-phabricator-request
Bug 1664397 - remove spurious print statements. r?jseward 4 years ago Lars T Hansen [:lth] 47 bytes, text/x-phabricator-request

Lars T Hansen [:lth]

Assignee

Description

•

4 years ago

Movi is not very sophisticated for SIMD constant loads; it needs to be improved to take advantage of load-and-splat, compressed constants, etc.

Lars T Hansen [:lth]

Assignee

Updated

•

4 years ago

Severity: -- → N/A

Priority: -- → P3

Lars T Hansen [:lth]

Assignee

Comment 1

•

4 years ago

•

Edited

Julian on IM talking about bitmask as implemented by the baseline compiler: "In particular, for the 8x16 case, the upper and lower 64 bits of constant are the same, so it winds up spending 10 insns overall where it can be done with 5."

It's not quite that black and white but we should probably prioritize fixing Movi. Here's the constant setup for i8x16.bitmask, from Julian:

[Codegen] [q] d2804030        mov     x16, #0x201
[Codegen] [q] f2a10090        movk    x16, #0x804, lsl #16
[Codegen] [q] f2c40210        movk    x16, #0x2010, lsl #32
[Codegen] [q] f2f00810        movk    x16, #0x8040, lsl #48
[Codegen] [q] 4e080e01        dup     v1.2d, x16
[Codegen] [q] d2804030        mov     x16, #0x201
[Codegen] [q] f2a10090        movk    x16, #0x804, lsl #16
[Codegen] [q] f2c40210        movk    x16, #0x2010, lsl #32
[Codegen] [q] f2f00810        movk    x16, #0x8040, lsl #48
[Codegen] [q] 4e181e01        mov     v1.d[1], x16

Obviously, there's an easy optimization when the high 64 bits match the low 64 bits, but maybe we can do better than that still.
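To make the instruction counts concrete, here is a minimal sketch of the hi == lo shortcut. It is a Python model, not the actual Movi code: the helper names `mov_sequence` and `load_v128` are hypothetical, and the MOV/MOVK selection is deliberately naive (one MOV for the low 16 bits, one MOVK per nonzero higher chunk).

```python
MASK16 = 0xFFFF


def mov_sequence(value, reg="x16"):
    """Naive MOV/MOVK sequence that materializes a 64-bit value in `reg`."""
    # Always start with a MOV of the low 16 bits (this also clears the rest).
    insns = [f"mov     {reg}, #{value & MASK16:#x}"]
    # Patch in every nonzero higher 16-bit chunk with MOVK.
    for shift in (16, 32, 48):
        chunk = (value >> shift) & MASK16
        if chunk:
            insns.append(f"movk    {reg}, #{chunk:#x}, lsl #{shift}")
    return insns


def load_v128(lo, hi, vreg="v1", tmp="x16"):
    """Instruction list that loads the 128-bit constant (hi:lo) into `vreg`."""
    insns = mov_sequence(lo, tmp)
    # DUP writes the low value into both 64-bit lanes.
    insns.append(f"dup     {vreg}.2d, {tmp}")
    if hi != lo:
        # Only when the halves differ is a second constant load needed.
        insns += mov_sequence(hi, tmp)
        insns.append(f"mov     {vreg}.d[1], {tmp}")
    return insns


# The i8x16.bitmask constant from the listing above: both halves are equal.
BITMASK_8X16 = 0x8040201008040201
for insn in load_v128(BITMASK_8X16, BITMASK_8X16):
    print(insn)
```

With equal halves the model stops after the dup, which is the 5-instruction form Julian mentions; with different halves it falls back to the 10-instruction shape shown in the listing.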

Lars T Hansen [:lth]

Assignee

Updated

•

4 years ago

Priority: P3 → P2

Lars T Hansen [:lth]

Assignee

Comment 2

•

4 years ago

Attached file Bug 1664397 - Improve simd constant loads. r?jseward

Lars T Hansen [:lth]

Assignee

Updated

•

4 years ago

Assignee: nobody → lhansen

Status: NEW → ASSIGNED

Phabricator Automation

Updated

•

4 years ago

Attachment #9183444 - Attachment description: Bug 1664397 - Improve simd constant loads (WIP) → Bug 1664397 - Improve simd constant loads. r?jseward

Julian Seward [:jseward]

Comment 3

•

4 years ago

The only other improvable cases I know of are with i16x8.bitmask and i32x4.bitmask, in which hi == lo << 4 and hi == lo << 8 respectively. So they can be done with just one 64-bit constant load, the dup, a shift left of the integer reg, and the insn to copy that to the top of the vector. Not sure it's worth the effort. But I thought I should make a record of it somewhere.
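The shifted-half case can be sketched the same way. This is only an illustration under stated assumptions: the helper names (`pack_lanes`, `find_shift`, `high_half_tail`) are hypothetical, and the lane weights used for i16x8.bitmask (lane i holds 1 << i) are an assumption, not something taken from the baseline compiler.

```python
MASK64 = (1 << 64) - 1


def pack_lanes(lanes, lane_bits):
    """Pack little-endian lanes of `lane_bits` bits each into one integer."""
    value = 0
    for i, lane in enumerate(lanes):
        value |= (lane & ((1 << lane_bits) - 1)) << (i * lane_bits)
    return value


def find_shift(lo, hi):
    """Return k in 1..63 such that hi == (lo << k) mod 2**64, else None."""
    for k in range(1, 64):
        if (lo << k) & MASK64 == hi:
            return k
    return None


def high_half_tail(k, vreg="v1", tmp="x16"):
    """Instructions that derive the high lane from the low value still in `tmp`."""
    return [f"lsl     {tmp}, {tmp}, #{k}", f"mov     {vreg}.d[1], {tmp}"]


# Assumed i16x8.bitmask lane weights: lane i contributes bit i.
lanes = [1 << i for i in range(8)]
lo = pack_lanes(lanes[:4], 16)
hi = pack_lanes(lanes[4:], 16)
k = find_shift(lo, hi)
if k is not None:
    for insn in high_half_tail(k):
        print(insn)
```

Under these assumptions the second MOV/MOVK run is replaced by a single shift of the integer register, so the sequence is one constant load, the dup, the lsl, and the lane move.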

Lars T Hansen [:lth]

Assignee

Comment 4

•

4 years ago

Agreed. Did wonder about this. Decided that it was probably not worth the bother / required too much work to figure out if it was worth it. I would hope that a good implementation boils away MOV+MOVK* in some efficient way anyhow...

Pulsebot

Comment 5

•

4 years ago

Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/cc07e95f3b14
Improve simd constant loads. r=jseward

Cristina Coroiu [:ccoroiu]

Comment 6

•

4 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/cc07e95f3b14

Status: ASSIGNED → RESOLVED

Closed: 4 years ago

status-firefox84: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 84 Branch

Lars T Hansen [:lth]

Assignee

Comment 7

•

4 years ago

Test case is actually broken.

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Lars T Hansen [:lth]

Assignee

Comment 8

•

4 years ago

Attached file Bug 1664397 - remove spurious print statements. r?jseward

A test case was broken but this was not discovered because I mostly test on release builds, and
that's all we run on CI as well. On release-debug, with the disassembler available,
we don't quit early and hence run into the bug: the expected values for a case were wrong.
Fix this by changing the expected values and running the functionality test even when
the disassembler is not available.

BugBot [:suhaib / :marco/ :calixte]

Updated

•

4 years ago

status-firefox84: fixed → affected

Phabricator Automation

Updated

•

4 years ago

Attachment #9185151 - Attachment description: Bug 1664397 - fix test case. r?jseward → Bug 1664397 - remove spurious print statements. r?jseward

Pulsebot

Comment 9

•

4 years ago

Pushed by lhansen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f4c5176760c4
remove spurious print statements. r=jseward DONTBUILD

Sandor Molnar[:smolnar]

Comment 10

•

4 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/f4c5176760c4

Status: REOPENED → RESOLVED

Closed: 4 years ago → 4 years ago

status-firefox84: affected → fixed

Resolution: --- → FIXED

Categories

(Core :: JavaScript: WebAssembly, enhancement, P2)
