Open Bug 1952626 Opened 1 year ago Updated 11 days ago

Testcase generating N nested Weakmap/Weakset is 7x (Was:25x) slower in Nightly (N=50000000)

Categories

(Core :: JavaScript: GC, task, P3)

task

Tracking

()

People

(Reporter: mayankleoboy1, Assigned: jonco)

References

(Depends on 1 open bug, Blocks 3 open bugs)

Details

Attachments

(2 files)

Attached file Nested Weakmaps.HTML β€”
Summary: Testcase generating N nested Weakmaps is 30x slower in Nightly (N=50000000) → Testcase generating N nested Weakmap/Weakset is 30x slower in Nightly (N=50000000)
Attached file Nested Weaksets.HTML β€”

The WeakSet profile roughly breaks down as follows:

  • 28% waiting for a helper thread under js::GCParallelTask::joinNonIdleTask.
  • 24% under AddWeakSetEntryImpl, mostly jemalloc and barrier overhead.
  • 19% under GCRuntime::findSweepGroupEdges.
  • 17% under StoreBuffer::ValueEdge::trace as part of a minor GC.

The WeakMap one is similar.

(In reply to Jan de Mooij [:jandem] from comment #2)

  • 28% waiting for a helper thread under js::GCParallelTask::joinNonIdleTask.

Of the off-thread sweepWeakMaps time, 52% is for freeing the malloc buffer. Maybe we could move this out of sweepWeakMaps and free the memory using BackgroundFreeTask.

(In reply to Jan de Mooij [:jandem] from comment #2)

  • 19% under GCRuntime::findSweepGroupEdges.

To optimize this we could have a flag for "I'm a plain WeakMap without any CCW keys (or cross-zone CCWs)". That'd let us short-circuit under WeakMapBase::findSweepGroupEdgesForZone for the (hopefully) common case.

Thanks Jan, lots of good ideas.

Component: JavaScript Engine → JavaScript: GC
Assignee: nobody → jcoppeard

(In reply to Jan de Mooij [:jandem] from comment #4)

To optimize this we could have a flag for "I'm a plain WeakMap without any CCW keys (or cross-zone CCWs)".

One issue I found with this is that a key can start out as a plain object and then get swapped with a CCW.

Severity: -- → N/A
Priority: -- → P3

(In reply to Jon Coppeard (:jonco) from comment #6)

One issue I found with this is that a key can start out as a plain object and then get swapped with a CCW.

Maybe we could use ObjectMayBeSwapped? That returns true for proxies and DOM objects...

Depends on: 1953446

Results after bug 1953446 landed.

N=50000000
Weakmaps: https://share.firefox.dev/41DITJe (37s)
Weaksets: https://share.firefox.dev/4bOSQbs (37s)

So about 20% improvement from comment 0, which is inline with joncos comment at https://bugzilla.mozilla.org/show_bug.cgi?id=1953446#c0

Summary: Testcase generating N nested Weakmap/Weakset is 30x slower in Nightly (N=50000000) → Testcase generating N nested Weakmap/Weakset is 25x slower in Nightly (N=50000000)

Latest profile: https://share.firefox.dev/4oeCn6j (37s)

Blocks: 1980560
See Also: 1943230
Depends on: 1993183
Depends on: 1993212

Profile with latest Nightly: https://share.firefox.dev/46Wxk3N (44s)

We are doing more GC work since bug 1993183, but that is expected. Further improvements should result in a net improvement.

Depends on: 1995021
See Also: → 1994407

profile with latest nightly: https://share.firefox.dev/3X7hMEz (17s!)
So we are 2.5x faster than before!

Samply: https://share.firefox.dev/49tw0qG (17s)
Same timings with semispace

Summary: Testcase generating N nested Weakmap/Weakset is 25x slower in Nightly (N=50000000) → Testcase generating N nested Weakmap/Weakset is 10x (Was:25x) slower in Nightly (N=50000000)

Latest Nightly: https://share.firefox.dev/3ZQwwZJ (12.3s)

So we are now 30% faster compared to comment #12. Bug 1993212 is probably the biggest contributor here.

Depends on: 2015075

results with nightly containing the fix of bug 2015075

Weakmap: https://share.firefox.dev/4rhAPJM (10s)
Weakset: https://share.firefox.dev/4rfRVaZ (10s)

So we are 16% faster compared to comment 13, making us 7x slower than Chrome.

Summary: Testcase generating N nested Weakmap/Weakset is 10x (Was:25x) slower in Nightly (N=50000000) → Testcase generating N nested Weakmap/Weakset is 7x (Was:25x) slower in Nightly (N=50000000)
Depends on: 2016906
See Also: → 2017506

With the latest Nightly which contains the fix from bug 2016906 : https://share.firefox.dev/4carm2l (9.7s)
Not a big difference from the profile in comment #14

(In reply to Mayank Bansal from comment #15)
I'm surprised that didn't make much difference. According to the profiles it removed time spent in GCRuntime::findSweepGroupEdges which was 36% of major GC time. Now it just spends more time waiting on the helper thread though.

Depends on: 2020925

Latest Nightly: https://share.firefox.dev/4dfRE3x (8.5s)

(In reply to Mayank Bansal from comment #17)

Latest Nightly: https://share.firefox.dev/4dfRE3x (8.5s)

This is a 1.2 second or 12% reduction in total time but a reduction in major GC time from 1.865 seconds to 0.368 seconds or an 80% reduction. In addition there's 1.2 seconds of helper thread work that is removed by this change.

I actually get much better overall results testing on macOS. For me this patch halved the total time taken. I don't know why it's so different on Windows.

With a fresh firefox profile and rebooting the computer:
Weakmap: https://share.firefox.dev/3NiACqQ (9s)
Weakset: https://share.firefox.dev/47EUeMK (8s)

You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: