[meta] Minimize typed object memory usage

RESOLVED FIXED

Status

Core
JavaScript Engine
4 years ago
2 years ago

People

(Reporter: bhackett, Unassigned)

Tracking

(Blocks: 2 bugs)

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [MemShrink:P2])

(Reporter)

Description

4 years ago
Typed objects should use less memory than plain objects: the layout of typed values in memory is more efficient than an array of plain values, and there is no need to support extensibility.  Currently, however, overhead means that small typed objects use more memory than a comparable plain object.  There are two sources of overhead: 1) typed objects always have an associated ArrayBuffer object holding their backing data, and 2) typed objects have four fixed slots (byteOffset, length, owner, next view) plus a private pointer for the data itself (which unfortunately gets padded, so that we use eight slots in total).

In many common use cases it should be possible to eliminate all of this overhead.  To see what the target layout should be and how it will work, here is a basic example:

var T = TypedObject;
var Pair = new T.StructType({y: T.int32, z: T.int32});
var Triple = new T.StructType({x: T.int32, f: Pair});

var v1 = new Triple();           // #1
var v2 = v1.f;                   // #2
var buf = T.storage(v1).buffer;  // #3
neuter(buf);                     // #4

I'm working under the assumption that operations like #1 and #2 are common, but that operations like #3 and #4 are rare.

So, after #1 we've created a typed object with no explicit associated array buffer.  As we do for typed arrays, the array buffer should be implicit, and to use as little storage as feasible we'd end up with the following layout for the object in memory:

            p0              p1
Object OA: [shape, type, x, f.y, f.z]

Some notes:

- The object's byte offset and length are common to all Triples and can be stored in its type information or descriptor (not sure yet how that stuff fits together).

- The object's owner is implicitly null, since there's no array buffer.  I think the view list should be removed entirely, which will have some nice complexity and memory benefits.

- The object's data follows the shape/type inline so the data pointer is stored implicitly.

- There are no slots/elements pointers, which we currently create for all objects.  Typed objects are not extensible so these pointers will never be used.  The same holds true for other non-native objects (i.e. proxies; after bug 966518 typed objects should be able to become proxies fwiw), so I think it would be a good idea to remove all notion of slots and elements in non-native objects entirely.

- The shape and type (i.e. the TI type) should stay unchanged; these are accessed all the time in hot jitcode.
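
To make the inline layout concrete, here is a rough model in plain JavaScript. It is not engine code: `int32`, `struct` and `layout` are hypothetical helpers invented for this sketch, and it assumes 4-byte int32 fields with no padding. The point it illustrates is that the byte offsets and total length depend only on the type, so they can live in the descriptor rather than in each instance.

```javascript
// Hypothetical model of a type descriptor; not the SpiderMonkey implementation.
const int32 = { kind: "scalar", size: 4 };

// Build a struct descriptor from an ordered list of [name, type] pairs.
function struct(fields) {
  let size = 0;
  const offsets = [];
  for (const [name, type] of fields) {
    offsets.push({ name, type, offset: size });
    size += type.size; // assumption: no padding between fields
  }
  return { kind: "struct", size, offsets };
}

// Flatten a descriptor into [path, byteOffset] entries.
function layout(type, base = 0, prefix = "") {
  if (type.kind === "scalar") return [[prefix, base]];
  return type.offsets.flatMap(f =>
    layout(f.type, base + f.offset, prefix ? prefix + "." + f.name : f.name));
}

const Pair = struct([["y", int32], ["z", int32]]);
const Triple = struct([["x", int32], ["f", Pair]]);

console.log(Triple.size);    // 12
console.log(layout(Triple)); // x at 0, f.y at 4, f.z at 8
```

Under these assumptions a Triple needs 12 bytes of data after the shape and type words, matching the [shape, type, x, f.y, f.z] picture above.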

After #2 we've created another typed object for the inner Pair structure.  This one can't use the same inline layout since it needs to alias the outer object.  The best layout for this object in memory is:

Object OB: [shape, type, owner:p0, data:p1]

Note that now there are two different layouts which can be used for a typed object, depending on how it is created or whether it aliases the contents of another typed object (or array buffer).  We'd need to distinguish which representation is in use, by varying either the shape or the type of the object.
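
The aliasing requirement can be seen with standard typed arrays, used here purely as a stand-in for typed objects (nothing below uses the TypedObject API). In this sketch `outer` plays the role of OA's data and `inner` the role of OB: a view that records an owner and an offset instead of holding its own storage.

```javascript
// Stand-in using standard typed arrays, not the TypedObject API.
const outer = new Int32Array(3);     // models the Triple data: [x, f.y, f.z]
const inner = outer.subarray(1, 3);  // models v1.f: aliases f.y and f.z

inner[0] = 7;  // write f.y through the inner view
outer[2] = 9;  // write f.z through the outer object

console.log(outer[1]);                      // 7
console.log(inner[1]);                      // 9
console.log(inner.byteOffset);              // 4
console.log(inner.buffer === outer.buffer); // true
```

Writes through either object are visible through the other, which is why OB cannot simply copy the data inline.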

After #3 we need to lazily create the array buffer for object OA.  There are two typed objects now with pointers to the data, and no way to obtain a pointer to OB from OA (since there's no view list).  So the array buffer needs to point to the data in OA, and OA needs to point to that array buffer.  The latter is somewhat tricky since there's no space in OA for a new pointer, but this could be done by reshaping OA to hold a pointer to the array buffer.  This would be expensive but not horribly so.

After #4 the array buffer has been neutered so no accesses can be performed on it or OA or OB.  Not having a view list means that instances are not explicitly neutered so accesses on them need to always check for neutered buffers, which is some additional VM cost but will normally be eliminated or minimized in JIT code.
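
A minimal sketch of the "check on every access" idea from the last paragraph of the description, in plain JavaScript. The `Storage` and `View` classes are hypothetical, and throwing a TypeError on access to neutered storage is an assumption of this model rather than a statement about what the engine does.

```javascript
// Hypothetical model: class and field names are invented for illustration.
class Storage {
  constructor(length) {
    this.data = new Int32Array(length);
    this.neutered = false;
  }
  neuter() {
    this.neutered = true;
    this.data = null; // the contents are no longer reachable
  }
}

class View {
  constructor(storage, offset) {
    this.storage = storage;
    this.offset = offset;
  }
  // Views are not neutered one by one, so every access checks the storage.
  get(index) {
    if (this.storage.neutered) throw new TypeError("storage has been neutered");
    return this.storage.data[this.offset + index];
  }
}

const storage = new Storage(3);
const oa = new View(storage, 0); // the whole Triple
const ob = new View(storage, 1); // the inner Pair
storage.data[1] = 5;
console.log(ob.get(0)); // 5
storage.neuter();
// oa.get(0) and ob.get(0) now both throw a TypeError.
```

The cost is one extra branch per access in this model, which is the VM overhead the description expects JIT code to hoist or remove.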

Comment 1

Excellent! This could help pdf.js quite a bit. I've encountered exactly this issue a few times, where I would like to convert a vanilla array to a Uint8Array but it's not a memory win unless the array gets to a certain length.
Whiteboard: [MemShrink]
(Reporter)

Comment 2

4 years ago
(In reply to Nicholas Nethercote [:njn] from comment #1)
> Excellent! This could help pdf.js quite a bit. I've encountered exactly this
> issue a few times, where I would like to convert a vanilla array to a
> Uint8Array but it's not a memory win unless the array gets to a certain
> length.

This bug is a bit different: it is about improving memory usage for typed objects rather than typed arrays (which are different things).  As part of this, though, we'll be able to improve typed array memory usage.  Right now typed arrays have 8 fixed slots but only use 6 of them.  By removing the view list we can cut that down to 5, and there is another slot that looks trivial to remove, which would bring things down to 4, allowing us to allocate in a smaller size class and avoid wasting any space.  So that should reduce the memory usage of a typed array by 32 bytes.

When this bug is fixed and typed objects are more mature (enabled in some way on all channels, not just nightly) we should be able to do a better job allocating fixed length arrays as typed objects rather than typed arrays (typed objects can themselves be arrays, which are still a different thing than typed arrays).  In that case we would have two words of header for the shape and type, which would be followed immediately by the typed object's data (assuming the whole thing fits in one of our object's size classes).  On 32 bit platforms that would save 40 bytes over the typed array layout described above, and 72 bytes over what we currently do.
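
The byte counts above can be reproduced with a little arithmetic. The figures below are one reading that matches the numbers in this comment, under assumptions not spelled out in it: a 32-bit build with 4-byte words, 8-byte fixed slots, and a four-word object header (shape, type, slots, elements) on typed arrays. Only per-object overhead is counted, not the array data itself.

```javascript
// Assumptions (32-bit build): 4-byte words, 8-byte slots, and a four-word
// object header (shape, type, slots, elements) on typed arrays.
const WORD = 4;
const SLOT = 8;

const typedArrayNow = 4 * WORD + 8 * SLOT;     // 8 fixed slots today
const typedArrayTrimmed = 4 * WORD + 4 * SLOT; // after cutting down to 4 slots
const inlineHeader = 2 * WORD;                 // shape + type only

console.log(typedArrayNow - typedArrayTrimmed); // 32
console.log(typedArrayTrimmed - inlineHeader);  // 40
console.log(typedArrayNow - inlineHeader);      // 72
```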

Comment 3

CC-ing Niko to make sure he's aware of this.
(Reporter)

Updated

4 years ago
Depends on: 1058761
(Reporter)

Updated

4 years ago
Depends on: 1060053
(Reporter)

Updated

4 years ago
Depends on: 1061288
(Reporter)

Updated

4 years ago
Depends on: 1061404
(Reporter)

Comment 4

4 years ago
(In reply to Brian Hackett (:bhackett) from comment #0)
> After #2 we've created another typed object for the inner Pair structure. 
> This one can't use the same inline layout since it needs to alias the outer
> object.  The best layout for this object in memory is:
> 
> Object OB: [shape, type, owner:p0, data:p1]
> 
> Note that now there are two different layouts which can be used for a typed
> object, depending on how it is created or whether it aliases the contents of
> another typed object (or array buffer).  We'd need to distinguish which
> representation is in use, by varying either the shape or the type of the
> object.
> 
> After #3 we need to lazily create the array buffer for object OA.  There are
> two typed objects now with pointers to the data, and no way to obtain a
> pointer to OB from OA (since there's no view list).  So the array buffer
> needs to point to the data in OA, and OA needs to point to that array
> buffer.  The latter is somewhat tricky since there's no space in OA for a
> new pointer, but this could be done by reshaping OA to hold a pointer to the
> array buffer.  This would be expensive but not horribly so.
> 
> After #4 the array buffer has been neutered so no accesses can be performed
> on it or OA or OB.  Not having a view list means that instances are not
> explicitly neutered so accesses on them need to always check for neutered
> buffers, which is some additional VM cost but will normally be eliminated or
> minimized in JIT code.

The approach here has changed somewhat.  Instead of not keeping track of the views for a given buffer, which has a lot of ramifications for complexity and performance in the engine, I'd like to track subobjects of array buffers and typed objects using per-compartment tables.  The patch for this is in bug 1061404.  Now, after #2 we create an entry in this table when making the subobject, which the array buffer neutering can then find and fix up.  If that subobject is a temporary, the table entry will disappear after the next minor GC, and hopefully with Ion compilation construction of the temporary can be avoided entirely.  The main weakness of this strategy shows up when lots of long-lived subobjects are created: many entries will be added to these tables, which hold weak references and need to be non-incrementally swept on GC.  That non-incrementalness should be fixable (though not easily) if this ends up being a problem.
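
For readers following along, here is a rough JavaScript model of the table-based approach. It is only a sketch: `SubobjectTable` and its methods are hypothetical names, the real patch lives in bug 1061404, and the real table holds weak references, whereas this model takes an explicit liveness predicate in `sweep`.

```javascript
// Hypothetical model of a per-compartment table; names are invented here.
class SubobjectTable {
  constructor() {
    this.entries = new Map(); // owner -> array of views into that owner
  }
  // Called when a subobject such as v1.f is created (step #2).
  add(owner, view) {
    if (!this.entries.has(owner)) this.entries.set(owner, []);
    this.entries.get(owner).push(view);
  }
  // Called when the owner's buffer is neutered (step #4): fix up each view.
  neuter(owner) {
    for (const view of this.entries.get(owner) || []) {
      view.data = null;
      view.length = 0;
    }
    this.entries.delete(owner);
  }
  // Models the GC sweep: drop entries whose view is no longer live.
  sweep(isLive) {
    for (const [owner, views] of this.entries) {
      const kept = views.filter(isLive);
      if (kept.length > 0) this.entries.set(owner, kept);
      else this.entries.delete(owner);
    }
  }
}

const table = new SubobjectTable();
const owner = { name: "OA" };
const temp = { data: [1, 2], length: 2 };      // short-lived subobject
const longLived = { data: [1, 2], length: 2 }; // survives the sweep
table.add(owner, temp);
table.add(owner, longLived);
table.sweep(view => view !== temp); // temp's entry disappears
table.neuter(owner);                // only longLived is found and fixed up
console.log(temp.length, longLived.length); // 2 0
```

The `sweep` loop visits every entry in one pass, which mirrors the non-incremental sweeping cost mentioned above when many long-lived subobjects exist.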
(Reporter)

Updated

4 years ago
Depends on: 1061741
Whiteboard: [MemShrink] → [MemShrink:P2]
(Reporter)

Updated

4 years ago
No longer depends on: 1061288
(Reporter)

Updated

4 years ago
Depends on: 1069680
(Reporter)

Updated

4 years ago
Depends on: 1069688
(Reporter)

Updated

4 years ago
Depends on: 1073836
(Reporter)

Updated

4 years ago
Depends on: 1073842
(Reporter)

Updated

4 years ago
Depends on: 1083600
(Reporter)

Updated

4 years ago
Depends on: 1085029
(Reporter)

Updated

4 years ago
Depends on: 1091010
(Reporter)

Updated

4 years ago
Depends on: 1091015
(Reporter)

Updated

4 years ago
Depends on: 1091329
(Reporter)

Updated

4 years ago
Depends on: 1091725
(Reporter)

Updated

4 years ago
Depends on: 1092238
(Reporter)

Updated

4 years ago
Depends on: 1092318
(Reporter)

Updated

4 years ago
Depends on: 1095952
(Reporter)

Updated

4 years ago
Depends on: 1096539
(Reporter)

Updated

4 years ago
Depends on: 1100168
(Reporter)

Updated

4 years ago
Depends on: 1100169
(Reporter)

Updated

4 years ago
Depends on: 1100170
(Reporter)

Updated

4 years ago
Depends on: 1100173
(Reporter)

Updated

4 years ago
Depends on: 1102510
(Reporter)

Comment 5

4 years ago
Typed object performance is now in a very good place.  The memory improvements from comment 0 are done, with inline typed objects having two words of header plus their data, and outline typed objects only needing four words.  This is a pretty big savings over normal objects and typed arrays.  Speed is also good.  On the awfy-assorted simple-struct test (testing simple reference and scalar accesses by swapping data between objects in an array) the typed object version takes 50ms for me (10.9 x86) vs. 70ms with normal objects; both versions generate excellent code, with the difference presumably due to better data cache performance with typed objects.  On a version of octane-splay modified to use typed objects (I'll add this to awfy-assorted pretty soon) the typed object version takes 680ms vs. 700ms with normal objects; this benchmark is mainly about GC performance but shows that using inline typed objects won't be a drag on GC tracing.

There's more that will need to be done in the future of course, but this bug is good to mark as fixed.
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED

Comment 6

Great work, bhackett! Thank you.
(Reporter)

Updated

4 years ago
Blocks: 1106828