Testcase generating N nested Weakmap/Weakset is 7x (Was:25x) slower in Nightly (N=50000000)
Categories
(Core :: JavaScript: GC, task, P3)
Tracking
()
People
(Reporter: mayankleoboy1, Assigned: jonco)
References
(Depends on 1 open bug, Blocks 3 open bugs)
Details
Attachments
(2 files)
N=50000000
- Nightly: https://share.firefox.dev/3FdPOBq (46s)
- Chrome: https://share.firefox.dev/4kyLE77 (1.5s)
| Reporter | ||
Updated•1 year ago
|
| Reporter | ||
Comment 1•1 year ago
|
||
N=50000000
Nightly: https://share.firefox.dev/3QPy6qb (44s)
Chrome: https://share.firefox.dev/3Dq9YaD (1.5s)
Comment 2•1 year ago
•
|
||
The WeakSet profile roughly breaks down as follows:
- 28% waiting for a helper thread under
js::GCParallelTask::joinNonIdleTask. - 24% under
AddWeakSetEntryImpl, mostly jemalloc and barrier overhead. - 19% under
GCRuntime::findSweepGroupEdges. - 17% under
StoreBuffer::ValueEdge::traceas part of a minor GC.
The WeakMap one is similar.
Comment 3•1 year ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #2)
- 28% waiting for a helper thread under
js::GCParallelTask::joinNonIdleTask.
Of the off-thread sweepWeakMaps time, 52% is for freeing the malloc buffer. Maybe we could move this out of sweepWeakMaps and free the memory using BackgroundFreeTask.
Comment 4•1 year ago
|
||
(In reply to Jan de Mooij [:jandem] from comment #2)
- 19% under
GCRuntime::findSweepGroupEdges.
To optimize this we could have a flag for "I'm a plain WeakMap without any CCW keys (or cross-zone CCWs)". That'd let us short-circuit under WeakMapBase::findSweepGroupEdgesForZone for the (hopefully) common case.
| Assignee | ||
Comment 5•1 year ago
|
||
Thanks Jan, lots of good ideas.
| Reporter | ||
Updated•1 year ago
|
| Assignee | ||
Updated•1 year ago
|
| Assignee | ||
Comment 6•1 year ago
•
|
||
(In reply to Jan de Mooij [:jandem] from comment #4)
To optimize this we could have a flag for "I'm a plain WeakMap without any CCW keys (or cross-zone CCWs)".
One issue I found with this is that a key can start out as a plain object and then get swapped with a CCW.
Updated•1 year ago
|
Comment 7•1 year ago
|
||
(In reply to Jon Coppeard (:jonco) from comment #6)
One issue I found with this is that a key can start out as a plain object and then get swapped with a CCW.
Maybe we could use ObjectMayBeSwapped? That returns true for proxies and DOM objects...
| Reporter | ||
Comment 8•1 year ago
•
|
||
Results after bug 1953446 landed.
N=50000000
Weakmaps: https://share.firefox.dev/41DITJe (37s)
Weaksets: https://share.firefox.dev/4bOSQbs (37s)
So about 20% improvement from comment 0, which is inline with joncos comment at https://bugzilla.mozilla.org/show_bug.cgi?id=1953446#c0
| Reporter | ||
Updated•1 year ago
|
| Reporter | ||
Comment 9•8 months ago
|
||
Latest profile: https://share.firefox.dev/4oeCn6j (37s)
| Reporter | ||
Comment 10•5 months ago
|
||
Profile with latest Nightly: https://share.firefox.dev/46Wxk3N (44s)
| Assignee | ||
Comment 11•5 months ago
|
||
We are doing more GC work since bug 1993183, but that is expected. Further improvements should result in a net improvement.
| Reporter | ||
Comment 12•4 months ago
•
|
||
profile with latest nightly: https://share.firefox.dev/3X7hMEz (17s!)
So we are 2.5x faster than before!
Samply: https://share.firefox.dev/49tw0qG (17s)
Same timings with semispace
| Reporter | ||
Updated•4 months ago
|
| Reporter | ||
Comment 13•1 month ago
|
||
Latest Nightly: https://share.firefox.dev/3ZQwwZJ (12.3s)
So we are now 30% faster compared to comment #12. Bug 1993212 is probably the biggest contributor here.
| Reporter | ||
Comment 14•1 month ago
|
||
results with nightly containing the fix of bug 2015075
Weakmap: https://share.firefox.dev/4rhAPJM (10s)
Weakset: https://share.firefox.dev/4rfRVaZ (10s)
So we are 16% faster compared to comment 13, making us 7x slower than Chrome.
| Reporter | ||
Comment 15•1 month ago
|
||
With the latest Nightly which contains the fix from bug 2016906 : https://share.firefox.dev/4carm2l (9.7s)
Not a big difference from the profile in comment #14
| Assignee | ||
Comment 16•1 month ago
|
||
(In reply to Mayank Bansal from comment #15)
I'm surprised that didn't make much difference. According to the profiles it removed time spent in GCRuntime::findSweepGroupEdges which was 36% of major GC time. Now it just spends more time waiting on the helper thread though.
| Reporter | ||
Comment 17•12 days ago
|
||
Latest Nightly: https://share.firefox.dev/4dfRE3x (8.5s)
| Assignee | ||
Comment 18•12 days ago
|
||
(In reply to Mayank Bansal from comment #17)
Latest Nightly: https://share.firefox.dev/4dfRE3x (8.5s)
This is a 1.2 second or 12% reduction in total time but a reduction in major GC time from 1.865 seconds to 0.368 seconds or an 80% reduction. In addition there's 1.2 seconds of helper thread work that is removed by this change.
I actually get much better overall results testing on macOS. For me this patch halved the total time taken. I don't know why it's so different on Windows.
| Reporter | ||
Comment 19•11 days ago
|
||
With a fresh firefox profile and rebooting the computer:
Weakmap: https://share.firefox.dev/3NiACqQ (9s)
Weakset: https://share.firefox.dev/47EUeMK (8s)
Description
•