Closed Bug 1911537 Opened 1 year ago Closed 1 year ago

Write a memory allocator for JSObject slots/elements buffers

Categories

(Core :: JavaScript: GC, enhancement, P3)

enhancement

Tracking

()

RESOLVED FIXED
136 Branch
Tracking Status
firefox136 --- fixed

People

(Reporter: jonco, Assigned: jonco)

References

(Blocks 5 open bugs)

Details

(Keywords: perf-alert)

Attachments

(12 files, 1 obsolete file)

48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review

Currently these are allocated with jemalloc. Using our own allocator would have a number of advantages. It would:

  • eliminate lock contention between foreground allocation and background sweeping
  • reduce the work necessary in the JSObject finalizer
  • optimise the nursery promotion check for whether something is in the tenured heap

This is also a step towards the future goal of allocating more internal data in the GC heap.

Assignee: nobody → jcoppeard
Severity: -- → N/A
Priority: -- → P3
Depends on: 1912068
Depends on: 1916758
Depends on: 1786451
Depends on: 1927417
Depends on: 1927795
Depends on: 1933205
See Also: → 1933642
Depends on: 1934856

The allocator uses a lot of lists because it has arrays of free lists per
supported allocation size. mozilla::LinkedList uses three words for the list
itself which would make this an unnecessarily large memory overhead. This
implementation only uses a single word for the list by storing a pointer to the
first list element rather than having the list be a special node in the list
itself.

Like mozilla::LinkedList it is a circular doubly linked list. Another
difference however is that it uses pointer tagging to remove the need for a
sentinel flag in each node, which ends up taking up a whole word due to
alignement constraints.

The eventual plan is to replace these with equivalent calls that use the new allcoator.

The allocator supports three size classes: small, medium and large. This adds
support for the small class for allocations of up to 128 bytes. This is backed
by our existing cell allocator.

This adds a new trace kind 'SmallBuffer' and new alloc kinds in powers of two.

This memory is accounted to the malloc heap size so as to disturb GC scheduling
as little as possible. Eventually the distiction between GC heap and malloc
heap will be removed.

This adds large buffers using the OS allocator. Buffers of this size are
relatively rare in the JS engine and so are not particularly optimised.

See Also: 1933642

(In reply to Jon Coppeard (:jonco) from comment #0)

  • reduce the work necessary in the JSObject finalizer
  • optimise the nursery promotion check for whether something is in the tenured heap

How does this help with those things? Does this come from knowing something about the addresses?

Flags: needinfo?(jcoppeard)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #13)
With the new allocator we can sweep to discard unmarked allocations in 1MB chunks, rather than having to run a finalizer to call free for each one individually. This is more efficient because each free operations updates allocator state and with the new allocator we can do these updates in bulk instead.

Managing the memory ourselves means we can use tricks to work out what kind of memory it is (e.g. for the nursery vs tenured check) such looking at the chunk kind stored at the start of each 1MB chunk. We can't do this for malloced memory because we don't want to make assumptions about the layout of their data structures.

Flags: needinfo?(jcoppeard)
Blocks: 1939445
Blocks: 1939949
Depends on: 1940065
Depends on: 1940066

Comment on attachment 9445569 [details]
Bug 1911537 - Part 13: Add magic check values for various data structures r?sfink

Revision D233106 was moved to bug 1940065. Setting attachment 9445569 [details] to obsolete.

Attachment #9445569 - Attachment is obsolete: true
Pushed by jcoppeard@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/20791092320d Part 1: Tidy some inline nursery methods into Nursery-inl.h r=sfink https://hg.mozilla.org/integration/autoland/rev/ba658f0d1824 Part 2: Make mark bitmap methods take void* so they don't have to be used with Cells r=sfink https://hg.mozilla.org/integration/autoland/rev/a15f54d57882 Part 3: Add a check for data protected by a specific mutex r=sfink https://hg.mozilla.org/integration/autoland/rev/6c3ba1086317 Part 4: Add a dobuly linked list implementation that only uses a single word for the list r=sfink https://hg.mozilla.org/integration/autoland/rev/543c9847cc28 Part 5: Rename nursery methods that can allocate malloc buffers to make that clear r=sfink https://hg.mozilla.org/integration/autoland/rev/ae0acd4fa39c Part 6: Add buffer allocator class and support for small allocations r=sfink https://hg.mozilla.org/integration/autoland/rev/af074fe74bd9 Part 7: Support allocating medium size allocations r=sfink https://hg.mozilla.org/integration/autoland/rev/a1bd5de92497 Part 8: Support freeing medium size allocations r=sfink https://hg.mozilla.org/integration/autoland/rev/88009dbfaf57 Part 9: Support resizing medium allocations in place where possible r=sfink https://hg.mozilla.org/integration/autoland/rev/f63df5aa379b Part 10: Add a way to allocate buffers of 1MB or greater r=sfink https://hg.mozilla.org/integration/autoland/rev/45ae405157e2 Part 11: Add memory reporting for the buffer allocator and a way to dump statistics r=sfink https://hg.mozilla.org/integration/autoland/rev/ab05c4f378df Part 12: Add jsapi tests for the buffer allocator r=sfink
Regressions: 1940167
Regressions: 1940176
Regressions: 1940438
Blocks: 1909812
Blocks: 1809387

(In reply to Cristina Horotan [:chorotan] from comment #18)

https://hg.mozilla.org/mozilla-central/rev/20791092320d
https://hg.mozilla.org/mozilla-central/rev/ba658f0d1824
https://hg.mozilla.org/mozilla-central/rev/a15f54d57882
https://hg.mozilla.org/mozilla-central/rev/6c3ba1086317
https://hg.mozilla.org/mozilla-central/rev/543c9847cc28
https://hg.mozilla.org/mozilla-central/rev/ae0acd4fa39c
https://hg.mozilla.org/mozilla-central/rev/af074fe74bd9
https://hg.mozilla.org/mozilla-central/rev/a1bd5de92497
https://hg.mozilla.org/mozilla-central/rev/88009dbfaf57
https://hg.mozilla.org/mozilla-central/rev/f63df5aa379b
https://hg.mozilla.org/mozilla-central/rev/45ae405157e2
https://hg.mozilla.org/mozilla-central/rev/ab05c4f378df

Perfherder has detected a awsy performance change from push ab05c4f378df2225e26149a3fd8fe1e08886482e.

Improvements:

Ratio Test Platform Options Absolute values (old vs new)
4% JS windows11-64-2009-shippable-qr fission tp6 170,187,023.94 -> 162,785,355.84
2% JS windows11-64-2009-shippable-qr fission tp6 166,389,041.85 -> 162,804,663.42

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask a sheriff to do that for you.

You can run these tests on try with ./mach try perf --alert 43303

For more information on performance sheriffing please see our FAQ.

Keywords: perf-alert
Depends on: 1941602
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: