Closed Bug 1918970 Opened 2 months ago Closed 2 months ago

Use a vector instead of linked list for VirtualRegister ranges

Categories

(Core :: JavaScript Engine: JIT, task, P2)

task

Tracking


RESOLVED FIXED
132 Branch
Tracking Status
firefox132 --- fixed

People

(Reporter: jandem, Assigned: jandem)

References

Details

Attachments

(10 files)


A large part of the performance issue in bug 1916442 is quadratic behavior from updating the linked list of ranges in VirtualRegister. Adding a range does a slow insertion sort.

A better data structure for this is a vector that we sort lazily for operations that need it to be sorted. This lets the list be unsorted during the core register allocation loop. A Vector also has better cache locality and allows binary search when sorted. These changes improve performance a lot: 276 seconds to 14 seconds locally for the Wasm module in that bug after some other changes.
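The lazily-sorted vector described above can be sketched as follows. This is an illustrative model, not the actual SpiderMonkey code: the type and member names (`RangeVector`, `ensureSorted`, `sorted_`) are hypothetical. The key idea is that appends during the core allocation loop just set a dirty flag, and sorting by descending start position happens only when a consumer actually needs sorted order.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical sketch of the lazily-sorted approach. Ranges are appended
// unsorted during the core allocation loop; callers that need sorted order
// call ensureSorted() first, which sorts by descending start position.
struct Range {
  uint32_t from;  // start position
  uint32_t to;    // end position
};

class RangeVector {
  std::vector<Range> ranges_;
  bool sorted_ = true;

 public:
  void append(Range r) {
    ranges_.push_back(r);
    sorted_ = false;  // defer sorting until someone needs it
  }
  void ensureSorted() {
    if (sorted_) {
      return;
    }
    std::sort(ranges_.begin(), ranges_.end(),
              [](const Range& a, const Range& b) { return a.from > b.from; });
    sorted_ = true;
  }
  const Range& operator[](size_t i) const { return ranges_[i]; }
  size_t length() const { return ranges_.size(); }
};
```

Compared to a sorted linked list, each append is amortized O(1) instead of an O(n) insertion-sort step, which is where the quadratic behavior came from.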

FixedList doesn't call destructors for its items. When VirtualRegister gains a Vector in later patches, we need to call its destructor to avoid leaking memory.
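The destructor issue can be shown with a small generic sketch (the `Tracked` type here is illustrative, not a SpiderMonkey type): a container that hands out raw storage and never runs element destructors forces the owner to destroy elements explicitly, which is exactly the step needed once the element owns heap memory.

```cpp
#include <new>

// Illustrative sketch: FixedList-style containers construct elements in raw
// storage and skip element destructors, so elements with non-trivial
// destructors must be destroyed explicitly by the owner or their resources
// leak.
struct Tracked {
  static int liveCount;  // counts constructed-but-not-destroyed objects
  Tracked() { ++liveCount; }
  ~Tracked() { --liveCount; }
};
int Tracked::liveCount = 0;

// Construct in caller-provided raw storage, then destroy explicitly -- the
// step the patch adds for VirtualRegister once it holds a Vector member.
void constructAndDestroy(unsigned char* storage) {
  Tracked* t = new (storage) Tracked();
  t->~Tracked();  // omitting this would leave liveCount at 1 (a leak)
}
```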

First remove all dead ranges before doing other operations on the ranges. It's easier
to reason about the other loops when they don't change the list of ranges at the same time.

Later patches will change the implementation of this iterator without having to
change the consumers.

This lets the two places calling it share some code.

When we use a vector, this is a bit more efficient than removing a single range
at a time.
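Why bulk removal is cheaper with a vector can be sketched with the standard erase-remove idiom (field names here are hypothetical): erasing one element at a time is O(n) per erase because everything after it shifts, while one erase-remove pass drops all of a bundle's ranges in O(n) total.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Illustrative sketch of removing all ranges belonging to one bundle in a
// single pass over the vector, instead of one O(n) erase per range.
struct RangeEntry {
  uint32_t bundleId;
  uint32_t from;
  uint32_t to;
};

void removeRangesForBundle(std::vector<RangeEntry>& ranges, uint32_t bundleId) {
  ranges.erase(std::remove_if(ranges.begin(), ranges.end(),
                              [bundleId](const RangeEntry& r) {
                                return r.bundleId == bundleId;
                              }),
               ranges.end());
}
```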

The vector is lazily sorted in order of descending start position for operations that
need it to be sorted.

This lets the list be unsorted during the core register allocation loop. A Vector also
has better cache locality and allows binary search when sorted. These changes improve
performance a lot: 276 seconds to 16 seconds locally for the Wasm module in bug 1916442
after some other changes.

A later patch will update the SMDOC comment.

All ranges that start after the position that's passed in definitely can't overlap.
Use binary search to skip these ranges.
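A sketch of this lookup, under the assumption (stated in the earlier patches) that the vector is sorted by descending start position: every range with `from > pos` sits in a prefix of the vector, so one binary search skips the whole prefix and only the remaining ranges can possibly contain `pos`. The names are illustrative, not the real `rangeFor` signature.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

struct Range {
  uint32_t from;
  uint32_t to;  // half-open [from, to)
  bool covers(uint32_t pos) const { return from <= pos && pos < to; }
};

// Hypothetical sketch: find a range containing pos in a vector sorted by
// descending start position, or return nullptr.
const Range* rangeFor(const std::vector<Range>& sorted, uint32_t pos) {
  // Binary search past the prefix of ranges that start after pos; none of
  // them can overlap pos.
  auto it = std::partition_point(
      sorted.begin(), sorted.end(),
      [pos](const Range& r) { return r.from > pos; });
  for (; it != sorted.end(); ++it) {
    if (it->covers(pos)) {
      return &*it;
    }
  }
  return nullptr;
}
```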

This reduces the Ion compilation time for the Wasm module in bug 1916442 from 16 seconds to
14 seconds locally.

The time complexity for rangeFor and how it affects some of the callers still isn't great,
but optimizing that will require more work.

Now that LiveRange is part of a single linked list instead of two lists, we can
remove its bundleLink and make it a standard linked-list node.

Severity: -- → N/A
Priority: -- → P2

The virtual register ranges are guaranteed to be sorted after the initial live range
building and grouping. After this phase, they can become unsorted when allocating
bundles to registers.

This patch adds an explicit sortVirtualRegisterRanges step to the allocator
to ensure the ranges are sorted after allocating all bundles. This is now the
only place where we can call sortRanges.

This makes it easier to reason about the sorting behavior and helps avoid pathological
behavior from future changes.

A VirtualRegister::replaceLastRangeLinear helper function is added to maintain
sort order so that we don't have to sort explicitly there.

Because sorting is now explicit, we can remove the activeRangeIterators_ bookkeeping
because we no longer have this possible footgun from sorting while iterating.

std::sort is not a stable sorting function. To avoid relying on implementation-defined ordering, extend the comparator function to also compare the 'to' position and the bundle id when ranges start at the same position.

Bundles now store an 'id' in non-debug builds too. Each compilation has its own
counter that starts at 0 so that we don't need atomic operations to determine the next
id.

Pushed by jdemooij@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/79c79897921e
part 1 - Store virtual registers in a Vector instead of FixedList. r=jseward
https://hg.mozilla.org/integration/autoland/rev/311340102af3
part 2 - Remove dead ranges first in createMoveGroupsFromLiveRangeTransitions. r=jseward
https://hg.mozilla.org/integration/autoland/rev/eb5f3abe6d1f
part 3 - Add VirtualRegister::RangeIterator. r=jseward
https://hg.mozilla.org/integration/autoland/rev/af0ed3a86838
part 4 - Add LiveBundle::removeAllRangesFromVirtualRegisters. r=jseward
https://hg.mozilla.org/integration/autoland/rev/ed873a1e9156
part 5 - Add VirtualRegister::removeRangesForBundle. r=jseward
https://hg.mozilla.org/integration/autoland/rev/3ffe6980ef17
part 6 - Use a vector instead of linked list for VirtualRegister ranges. r=jseward
https://hg.mozilla.org/integration/autoland/rev/5658a571443c
part 7 - Use binary search to optimize VirtualRegister::rangeFor. r=jseward
https://hg.mozilla.org/integration/autoland/rev/0f2df5df3636
part 8 - Simplify LiveRange linked list code, update comments. r=jseward
https://hg.mozilla.org/integration/autoland/rev/56e5df0ee5bf
part 9 - Sort ranges explicitly instead of implicitly. r=jseward
https://hg.mozilla.org/integration/autoland/rev/4c0ff51c2ce6
part 10 - Define sort order for ranges with same start position. r=jseward

Improvement:
30% on AWFY-Jetstream2-HashSet-wasm-Runtime
24% on AWFY-Jetstream2-float-mm.c-Average (macOS only)
3%-6% on AWFY-webassembly-godot-*

Regression:
10% on wasm-misc-optimizing-fannkuch
1.6% on wasm-misc-optimizing-compile. But this is effectively a "return to baseline" after the 1.6% improvement from bug 1917817.

(In reply to Mayank Bansal from comment #13)

Improvement:
30% on AWFY-Jetstream2-HashSet-wasm-Runtime

For posterity: I looked into this and it's because HashSet.wasm has a function with a ton of MIR/LIR call instructions. The splitting for this results in many tiny live ranges for a single virtual register that's live for the whole function. The slow part is connecting these ranges by inserting moves.

That loop is currently quadratic. It was optimized somewhat in this bug by using binary search to find the start position, but bug 1920951 will make it linear, so there will be another speedup.

Local Ion compile times for HashSet.wasm:

before this bug: 2800 ms
after this bug: 690 ms
after other changes like bug 1920951: 179 ms

NI myself to look at wasm-misc-optimizing-fannkuch to see if something changed there.

Flags: needinfo?(jdemooij)

(In reply to Mayank Bansal from comment #13)

Regression:
10% on wasm-misc-optimizing-fannkuch

I looked into this. The code we generate for this test is very similar before/after. After later changes (probably bug 1920951) it's clearly better now because we have fewer moves.

AFAICT this only affects macOS and the test is bimodal if you zoom out, but we get the higher number more often now. The last two data points have been better (after bug 1920951 landed), but that could be a coincidence.

Flags: needinfo?(jdemooij)
Blocks: 1887312
