Optimization: Use SIMD for inlined bulk-memory operations
Categories
(Core :: JavaScript: WebAssembly, enhancement, P3)
Tracking

| | Tracking | Status |
|---|---|---|
| firefox93 | --- | fixed |
People
(Reporter: rhunt, Assigned: yury)
Attachments
(2 files)
This shouldn't block shipping bulk-memory.
Bug 1594204 added a basic OOL path for memory.copy/memory.fill that uses up to 4-byte loads/stores on 32-bit systems and up to 8-byte loads/stores on 64-bit systems. We can do better if we utilize the SIMD register bank.
I originally avoided this for expediency. By only using GPRs we could re-use our existing wasmLoad/wasmStore infrastructure for bounds checking and trap handling. SIMD will require more work.
I had a version of a patch to do this by adding a Scalar::Int128 and extending wasmLoad/Store to utilize it. It was a bit hacky and didn't fit at all with the baseline compiler (which uses ValueType for the stack). It's not clear that we should care about this optimization in the baseline compiler, so maybe that's fine.
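For illustration, here is a minimal C++ sketch of the idea (not SpiderMonkey code; the names and the intrinsic-based formulation are assumptions): the inlined path moves 16 bytes at a time through the SIMD register bank and falls back to narrower moves for the tail, assuming bounds checks and trap handling have already happened.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics, standing in for the emitted moves
#include <cstddef>
#include <cstdint>

// Hypothetical sketch of an inlined memory.copy with a SIMD fast path.
// Assumes `len` is a small compile-time-known constant (the inlined case)
// and that bounds checks/trap handling happened before this point.
static void inlineCopy(uint8_t* dst, const uint8_t* src, size_t len) {
  size_t i = 0;
  // 16-byte chunks through the SIMD register bank.
  for (; i + 16 <= len; i += 16) {
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
    _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i), v);
  }
  // Tail: byte-at-a-time here for brevity; the existing OOL path uses
  // 8/4/2/1-byte GPR moves instead.
  for (; i < len; i++) {
    dst[i] = src[i];
  }
}
```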
Comment 1 • 3 years ago (Assignee)
Comment 2 • 3 years ago (Assignee)
> I had a version of a patch to do this by adding a Scalar::Int128 and extending wasmLoad/Store to utilize it. It was a bit hacky and didn't fit at all with the baseline compiler (which uses ValueType for the stack).
Since we now have Scalar::Simd128, we could just partially enable the ENABLE_WASM_SIMD fragments, but that becomes too complicated and adds overhead just for copy/fill when ENABLE_WASM_SIMD is off. The patch therefore enables inlining with SIMD ops only when ENABLE_WASM_SIMD is on.
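As a rough sketch of that dispatch (illustrative names, not the actual patch), the lowering picks the widest access size for the inlined loop, only going through the SIMD register bank when SIMD support is available:

```cpp
#include <cstddef>

// Hypothetical helper: widest access size for the inlined copy/fill loop.
enum class AccessSize : size_t { I8 = 1, I32 = 4, I64 = 8, V128 = 16 };

static AccessSize widestAccess(bool simdEnabled, bool is64Bit) {
  // The real patch only emits V128 accesses when ENABLE_WASM_SIMD is
  // compiled in; that build-time condition is modeled as a runtime
  // flag here.
  if (simdEnabled) {
    return AccessSize::V128;  // 16-byte moves via the SIMD register bank
  }
  // GPR-only fallback: 8-byte moves on 64-bit targets, 4-byte on 32-bit.
  return is64Bit ? AccessSize::I64 : AccessSize::I32;
}
```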
I did some microbenchmarking on x64 (attaching test):
| | before | after | speedup |
|---|---|---|---|
| ion aligned | 1921 | 1315 | 46.06% |
| ion unaligned | 2086 | 1475 | 41.42% |
| baseline aligned | 2078 | 1504 | 38.16% |
| baseline unaligned | 2158 | 1634 | 32.06% |
Comment 3 • 3 years ago (Assignee)
Comment 5 • 3 years ago (Reporter)
Yury, IIUC your test is checking size=128/129 byte copies and fills. Would it be possible to test smaller sizes, like 16, 17, 32, 33 bytes? Just curious whether that changes the picture. I'd also be curious whether we could test this on ARM64 to ensure we get reasonable codegen there too.
Comment 6 • 3 years ago (Assignee)
(In reply to Ryan Hunt [:rhunt] from comment #5)
> Yury, IIUC your test is checking size=128/129 byte copies and fills. Would it be possible to test smaller sizes, like 16, 17, 32, 33 bytes? Just curious whether that changes the picture. I'd also be curious whether we could test this on ARM64 to ensure we get reasonable codegen there too.
The benchmark checked lengths from 1 to 65, I think. The 128/129 is the destination offset. ARM64 testing is pending.
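A standalone C++ analogue of that setup might look like the following (the actual attached test is a script run in the JS shell; the iteration count, buffer layout, and source offset here are made up for illustration):

```cpp
#include <chrono>
#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
  static uint8_t heap[64 * 1024] = {};

  // Destination offsets 128 (16-byte aligned) and 129 (unaligned),
  // copy lengths 1..65, as described in comments 5 and 6.
  for (size_t dstOff : {128, 129}) {
    auto start = std::chrono::steady_clock::now();
    for (int iter = 0; iter < 200000; iter++) {
      for (size_t len = 1; len <= 65; len++) {
        memcpy(heap + dstOff, heap + 4096, len);  // stand-in for memory.copy
      }
    }
    auto elapsed = std::chrono::steady_clock::now() - start;
    // Read the destination so the copies are not optimized away.
    volatile uint8_t sink = heap[dstOff];
    (void)sink;
    printf("dstOff=%zu: %lld ms\n", dstOff,
           static_cast<long long>(
               std::chrono::duration_cast<std::chrono::milliseconds>(elapsed)
                   .count()));
  }
  return 0;
}
```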
Comment 7 • 3 years ago

Pushed by ydelendik@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e8b9b45734fa
Use SIMD in inlined memory.{fill,copy} ops. r=rhunt
Comment 8 • 3 years ago (bugherder)