Open Bug 1992888 Opened 1 month ago Updated 5 days ago

Remove the data header from wasm GC inline array allocations

Tracking

()

Status:

ASSIGNED

People

(Reporter: bvisness, Assigned: bvisness)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 3 obsolete files)

WIP: Bug 1992888: Remove wasm GC array data headers. 1 month ago Ben Visness [:bvisness] 48 bytes, text/x-phabricator-request		Details \| Review
image.png 1 month ago Ben Visness [:bvisness] 49.34 KB, image/png		Details
WIP: Bug 1992888: Use padding space for the data header. 1 month ago Ben Visness [:bvisness] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1992888: Remove the data header for inline array allocs. r=rhunt 1 month ago Ben Visness [:bvisness] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1992888 - part 1: Remove the data header from wasm GC inline array allocations. r=rhunt,bvisness. 11 days ago Julian Seward [:jseward] 48 bytes, text/x-phabricator-request		Details \| Review
Bug 1992888 - part 2: create Nursery::{set,get}DirectForwardedBufferWord and use for WasmArrayObject forwarding. r=jonco. 5 days ago Julian Seward [:jseward] 48 bytes, text/x-phabricator-request		Details \| Review

Ben Visness [:bvisness]

Assignee

Description

•

1 month ago

We allocate a one-word data header at the beginning of each chunk of wasm GC data. This potentially 8-byte value only ever stores a single meaningful bit, meaning we waste anywhere from 31 to 63 bits in each array data area that could be used for elements. In particular, this means that we could have more arrays that are allocated with inline storage, since we already know that GC programs are heavy on small arrays.

We can remove the data header by simply setting a low bit in the array data pointer when the data is out of line, then masking this bit off on access. The cost for this should be quite small, hopefully negligible, since it is a single-cycle op with a small encoding.

Ben Visness [:bvisness]

Assignee

Comment 1

•

1 month ago

Attached file WIP: Bug 1992888: Remove wasm GC array data headers. (obsolete) — Details

Julien Pages [:jpages]

Updated

•

1 month ago

Severity: -- → N/A

Priority: -- → P3

Ben Visness [:bvisness]

Assignee

Comment 2

•

1 month ago

Attached image image.png — Details

My initial attempt at this was a mixed bag - Kotlin and Dart workloads got a small win, but j2cl got a substantial loss. To me this suggests that Kotlin and Dart benefited from more arrays using inline storage, but that the cost of accessing arrays was made a fair amount worse, particularly hitting j2cl because that workload is mostly just looping over arrays.

I'm not entirely sure what makes the masking so costly, but one candidate is register allocation - many operations need additional temps. We could tighten this up somewhat, but it seems dubious that we could get much of a win from e.g. some careful use of useRegisterAtStart.

If we can come up with other approaches to save space that don't involve masking the pointer on load, that might be the better choice.

Ben Visness [:bvisness]

Assignee

Comment 3

•

1 month ago

Attached file WIP: Bug 1992888: Use padding space for the data header. (obsolete) — Details

Phabricator Automation

Updated

•

1 month ago

Attachment #9518529 - Attachment is obsolete: true

Ben Visness [:bvisness]

Assignee

Comment 4

•

1 month ago

Attached file Bug 1992888: Remove the data header for inline array allocs. r=rhunt (obsolete) — Details

Phabricator Automation

Updated

•

1 month ago

Attachment #9518529 - Attachment is obsolete: false

Phabricator Automation

Updated

•

1 month ago

Attachment #9518829 - Attachment is obsolete: true

Phabricator Automation

Updated

•

1 month ago

Attachment #9518529 - Attachment is obsolete: true

Ben Visness [:bvisness]

Assignee

Comment 5

•

1 month ago

After some experimentation, the masking approach didn't seem to pay off, but Ryan had an idea that allows us to remove the data header from inline arrays regardless. It is simply:

Don't allocate the data header for inline arrays.
Change how we interpret the data header: null -> out of line, non-null -> inline.
Make the out-of-line data header null.

This works because inline elements always follow the array's data pointer, and arrays always have a non-null data pointer. This is nice and simple, simplifies our bookkeeping, and doesn't affect codegen except by slightly simplifying allocation.

This is implemented in D267984.

Phabricator Automation

Updated

•

1 month ago

Attachment #9519015 - Attachment description: WIP: Bug 1992888: Remove the data header for inline array allocs. → Bug 1992888: Remove the data header for inline array allocs. r=rhunt

Julian Seward [:jseward]

Comment 6

•

1 month ago

I think unfortunately, the comment 5 scheme is going to conflict with the
change in bug 1995834, which is in turn a fix on top of bug 1995106, which is
needed to make the stackmap machinery work properly with BufferAlloc'd OOL
pointers. I tried but failed to think of a way to reconcile the two sets of
changes.

We should revisit this in a month or so and see if there's a way to reconcile
the two.

Ben Visness [:bvisness]

Assignee

Comment 7

•

15 days ago

We have yet another scheme for this, which may be addressed in bug 1995106: just do "direct forwarding" for both inline and out-of-line arrays.

In short: whenever we move array data, either inline or out of line, write the new data pointer at old_data[-1]. Then, whenever we find an array data pointer on the stack, simply update it to data[-1].

In full:

Allocate inline array data with no data header.
Allocate out-of-line array data with a one-word data header.
When moving an inline WasmArrayObject, write new_array.data into old_array.data. Because the inline array data immediately follows the WAO's data pointer, this places the new data address at old_data[-1].
When moving out-of-line array data (allocated by the buffer allocator), write the new data pointer into the old trailer's data header, again placing the new data address at old_data[-1].
Finally, when updating pointers on the stack, just update them like so: data = data[-1].

The one thing I'm not clear on is whether it's ok in the OOL case to write the actual data address instead of the actual allocation address. With the data header in there, alloc[0] is the same as data[-1]. It seems like this should be workable though.

Julian Seward [:jseward]

Comment 8

•

13 days ago

•

Edited

Following further discussion on Matrix, it seems this won't quite work because
of the unclarity mentioned in the previous comment. The GC/BufferAlloc
forwarding machinery really wants to deal with start-of-block pointers. Here's
a revised proposal from Jon/Ben which avoids that problem, and means we don't
need to change the GC/BufferAlloc forwarding machinery at all.

Proposed layout -- outside of GC times. "#e+p" is #elems + 4 bytes padding.

| Shape* | STV* | #e+p | data* | ...payload... |           (IL case)
                          | |    |
                          | \-->-/
                          |
                          |    | header | ...payload... |  (OOL case)
                          |               |
                          \------->-------/

After obj_move -- OOL case

                                                oldRoot
                                                |
| Shape* | STV* | #e+p | data* |     | header | ...payload... |   (old)
                            |          |        |
                            \--------->---------/
                                       |
                                       v        newRoot
                                       |        |
| Shape* | STV* | #e+p | data* |     | header | ...payload... |   (new)
                            |                   |
                            \--------->---------/

In this case, the GC (BufferAllocator) will write the new address of the
payload block in the first word of the old payload block, as illustrated (the
arrow from old header to new header). This write will be done by
calling Nursery::setForwardingPointer in the object moved hook.

Hence, old payload data pointers that we find on the stack can be updated as
newRoot = oldRoot[-1] + 1 where -1/+1 move backwards/forwards one word.

So now the game is to kludge up the IL case to make it look as much as possible
like the OOL case. In particular, the trick is to update the old data* to
make it be the address of (not the contents of) the new data* field.

After obj_move -- IL case

                                 oldRoot
                                 |
| Shape* | STV* | #e+p | data* | ...payload... |   (old)
                         |
                         v       newRoot
                         |       |
| Shape* | STV* | #e+p | data* | ...payload... |   (new)
                            |    |
                            \-->-/

So we still have newRoot = oldRoot[-1] + 1. And so it is possible to update
payload pointers without having to know whether they are to IL or OOL blocks.

Julian Seward [:jseward]

Comment 9

•

11 days ago

Attached file Bug 1992888 - part 1: Remove the data header from wasm GC inline array allocations. r=rhunt,bvisness. — Details

Commit message pending.

Phabricator Automation

Updated

•

11 days ago

Attachment #9519015 - Attachment is obsolete: true

Julian Seward [:jseward]

Updated

•

6 days ago

Summary: Remove the data header from wasm GC array allocations → Remove the data header from wasm GC inline array allocations

Phabricator Automation

Updated

•

5 days ago

Attachment #9528299 - Attachment description: WIP: Bug 1992888 - Remove the data header from wasm GC array allocations. → Bug 1992888 - part 1: Remove the data header from wasm GC inline array allocations. r=rhunt,bvisness.

Julian Seward [:jseward]

Comment 10

•

5 days ago

Attached file Bug 1992888 - part 2: create Nursery::{set,get}DirectForwardedBufferWord and use for WasmArrayObject forwarding. r=jonco. — Details

This patch allows the Nursery's forwarded-buffer machinery (the
direct-forwarding variant only) to forward a machine word that it does not
interpret.

In practice this is trivially done by having two new routines to set and get
the word. Existing forwarding code and uses thereof are unchanged. The new
routines do retain the assertion that the buffer that has been forwarded is
within the Nursery.

The use cases, WasmArrayObject::obj_moved and Instance::updateFrameForMovingGC,
which previously did such forwarding "manually", have been updated to use the
new routines.

Julian Seward [:jseward]

Updated

•

5 days ago

Duplicate of this bug: 1995834

You need to log in before you can comment on or make changes to this bug.