Closed Bug 1853467 Opened 5 months ago Closed 5 months ago

Investigate if Object.assign can be optimized more on Charts-chartjs


(Core :: JavaScript Engine, enhancement)




120 Branch
Tracking Status
firefox120 --- fixed


(Reporter: mstange, Assigned: jandem)


(Blocks 1 open bug)


(Whiteboard: [sp3])


(4 files, 2 obsolete files)

Attached file reduced testcase for profiling (obsolete) —

6.6% of our time on the Charts-chartjs subtest is spent inside obj_assign.
(Once bug 1850775 is fixed, this percentage will rise to 7.6%.)

const cloneIfNotShared = (cached, shared) => shared ? cached : Object.assign({}, cached);

It would be nice if we could find ways to speed this Object.assign call some more.

Firefox: 3551 samples
Chrome: 3059 samples

We're hitting the medium-fast but not the super-fast path; in other words, TryAssignPlain succeeds but we don't hit the toWasEmpty path.

The call is of the form Object.assign({}, cached). The cached object was created here with Object.freeze.

We're not going down the toWasEmpty super-fast path because the properties on the frozen object don't have the default property flags.

The first thing that stands out from the profile is that the gathered array of properties might be resized multiple times.

Attached file reduced testcase (obsolete) —
Attachment #9353501 - Attachment is obsolete: true

Scores on the testcase on my Mac:
Firefox: 540ms
Chrome: 407ms
Safari: 885ms

Here are a few more variations: Different numbers of properties, and frozen / not frozen.

It looks like the super-fast path isn't hit with not-frozen objects, either.

Attachment #9353502 - Attachment is obsolete: true

Here are the numbers on my machine for Firefox / Chrome / Safari:

frozen-10: 541ms / 408ms / 893ms
frozen-8: 359ms / 325ms / 738ms
frozen-200: 990ms / 12.5s / 2820ms
regular-10: 539ms / 404ms / 506ms
regular-8: 359ms / 328ms / 312ms
regular-200: 980ms / 12.5s / 2226ms

(In reply to Markus Stange [:mstange] from comment #3)

It looks like the super-fast path isn't hit with not-frozen objects, either.

Probably because the objects have different sizes and this (the number of fixed slots) determines the shape.

If we can't use the fastest path, we could still build up the property map first and then get the final shape, instead of incrementally assigning intermediate shapes. I think I tried this a while ago but it was slower because shapes do have pretty fast caching of "next shapes".

Flags: needinfo?(jdemooij)

We already had an optimization to reuse the shape when assigning to an empty object.
This often doesn't work because the number of fixed slots is different, but in
this case we can still reuse the property map.

Assignee: nobody → jdemooij

We don't need to use the vector in this case because both objects
are plain objects with only data properties.

Depends on D188966

Speedometer 3's Charts-chartjs calls this with an object with 10 properties,
so this avoids malloc overhead.

Depends on D188967

I posted some patches that especially help the non-frozen part of the micro-benchmark. Part 3 also improves the frozen-10 and frozen-200 tests.

For the frozen variant, I tried a fast path based on building the property map and then looking up the shape, but this was actually slower. This is because shapes have a fast path for the next property's shape(s). As I mentioned in comment 5, I tried this before and got similar results.

We could make this faster if we added a 1-slot cache for toShape (empty) + fromShape => toShape (new). I added some logging and I think the hit rate would be very high on both Speedometer 2 and 3. If we get a cache hit, we could potentially avoid the props Vector and the first loop in TryAssignPlain too.

Pushed by
part 1 - Add fast path to Object.assign for when we can reuse the shape's PropMap. r=jonco
part 2 - Simplify loop for Object.assign fast path. r=jonco
part 3 - Bump PropertyInfoWithKeyVector inline storage from 8 to 16. r=jonco
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 120 Branch

Is this better now? We could optimize the empty-object case more with some caching, but I don't know if that's worth doing at this point.

Flags: needinfo?(jdemooij) → needinfo?(mstange.moz)

On the Windows machine we use for profiling, these patches have improved the score on Charts-chartjs by 2.6%. Nice!

Before: 3643 samples
After: 2252 samples
Chrome: 3141 samples

The function is now 4.6% of the overall time spent on this test. So improving it further can still yield valuable percents.

Flags: needinfo?(mstange.moz)

(In reply to Markus Stange [:mstange] from comment #13)

The function is now 4.6% of the overall time spent on this test. So improving it further can still yield valuable percents.

A one-entry cache for the empty-plain-object case has a hit rate > 90% on Speedometer 3 (and also Speedometer 2). For a shell version of your benchmark I get:

frozen-10 Took 702.12ms. => 291.76ms.
frozen-8 Took 573.66ms. => 243.06ms.
frozen-200 Took 1565.07ms. => 510.86ms.

regular-10 Took 480.85ms. => 275.58ms.
regular-8 Took 428.57ms. => 247.62ms.
regular-200 Took 1009.22ms. => 476.14ms.

Oooh, I like the sound of that!

Blocks: 1855705
Blocks: 1814711
You need to log in before you can comment on or make changes to this bug.