Closed Bug 1900466 Opened 5 months ago Closed 5 months ago

Vue's createReactiveObject is 2.5x slower in SM than V8

Categories

(Core :: JavaScript Engine: JIT, defect, P2)

defect

Tracking

()

RESOLVED FIXED
129 Branch
Tracking Status
firefox129 --- fixed

People

(Reporter: jrmuizel, Assigned: jonco)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [sp3])

Attachments

(1 file)

This shows up on Sp3-Vue-TodoMVC and Sp3-NewsSite-Nuxt. On both tests making Firefox as fast as Chrome on this function would reduce its overal time by 0.6%.

Chrome: https://share.firefox.dev/4ei4gFA
FF: https://share.firefox.dev/4e5Lpxi

One big part of the difference is creating new Proxy objects. In FF we spend 293 samples vs Chrome's 25, so more than 10x faster.

function createReactiveObject(target, isReadonly, baseHandlers, collectionHandlers, proxyMap) {
            if (!(0, _vue_shared__WEBPACK_IMPORTED_MODULE_1__.Kn)(target)) {
                if (false) {}
                return target;
            }
            if (target["__v_raw"] && !(isReadonly && target["__v_isReactive"])) {
                return target;
            }
            const existingProxy = proxyMap.get(target);
            if (existingProxy) {
                return existingProxy;
            }
            const targetType = getTargetType(target);
            if (targetType === 0) {
                return target;
            }
            const proxy = new Proxy(target, targetType === 2 ? collectionHandlers : baseHandlers);
            proxyMap.set(target, proxy);
            return proxy;
        }

This is roughly split between setting an item in the WeakMap and allocating the proxy object. Both of these operations are spending a lot of time doing post barrier stuff.

The proxy uses GCPtr<Value> slots. If we allocated these scripted proxies in the nursery, we could probably avoid these barriers with some additional work.

Severity: -- → S3
Priority: -- → P2

We should be able to do something about both of those issues.

Assignee: nobody → jcoppeard

Proxy handlers have a canNurseryAllocate() method which is used to determine
whether to allocate them in the nursery. Whether this is safe or not depends on
whether the finalize() method tolerates not being called for objects that die
in the nursery. The default finalize method for proxies is fine with this.

It turns out that most proxy handlers don't override finalize and so we should
be able to allocate them in the nursery. It looks like the reason we don't is
due to oversight rather than anything fundamental.

The patch adds a new class in the proxy handler heirarchy,
NurseryAllocableProxyHandler and uses it where possible. This helps make it
clearer that other proxies aren't allocated in the nursery by default. In
addition, cross compartment wrappers are manually opted in to nursery
allocation. DOM proxies do add a finalize method, so Wrapper cannot be nursery
allocated by default.

Doing this shows some significant improvements on some of the NewsSite-Nuxt
subtests of speedometer3. Annoyingly there is also a large regression on one
Ares6 subtest but looking at the profile this seems to be due to GC happening
in a different place in the test rather than anything getting slower.

Whiteboard: [sp3]

Looking at benchmark results, this patch produces significant improvements on some sp3 subtests but also regresses ares6.

The sp3 subtests high confidence changes are (Windows, 20 runs):

  • 30% improvement on NewsSite-Nuxt/NavigateToPolitics/Async
  • 20% improvement on NewsSite-Nuxt/NavigateToPolitics/total
  • 8% improvment on NewsSite-Nuxt/total
  • 3% regression on Editor-CodeMirror/Long/Sync

The ares6 changes are:

  • 3% overall regression, due to:
  • 40% regression on Basic_averageWorstCase
  • 3% improvement on Basic_firstIteration
  • 3% improvement on Babylon_averageWorstCase

I spent some time investigating the ares6 changes and determined that these were actually due to GC timing.

The way ares6 works is that it has four subtests which are run many times, and for each three measures are taken: first iteration (not important here), average time and average of worst 4 iterations. The average times are not affected by this patch (which is a good sign). The regression comes from the change to the average worst case score for one subtest.

Looking at profiles I discovered that the effect of this patch was to delay a GC that ran during the benchmark. This GC is triggered by allocation during the benchmark and previously always occurred during the first subtest (Air). With this patch (which reduces tenured allocations) it now sometimes occurs during the second subtest (Basic). The GC itself was a little over 10ms total. The effect on the benchmark results was to increase the worst case measurement of the Basic subtest by around 10ms. However the change didn't occur often enough to reduce the worst case measurement of the Air subtest, which remains ~10ms greater the the average for this subtest.

Based on the above I'm going to land this patch and intend to accept the regression to the ares6 results.

Pushed by jcoppeard@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/10768998f471 Allocate proxy objects in the nursery where possible r=jandem
Status: NEW → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 129 Branch
Regressions: 1904545

May also have improved AWFY-Jetstream2-chai-wtb-* subtests :
6%-13% on chai-wtb-first

Blocks: 1923996
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: