Open Bug 1903758 Opened 4 months ago Updated 12 days ago

Experiment with deferring and/or avoiding arena_t::Purge where possible

Categories

(Core :: Memory Allocator, enhancement)

Tracking

ASSIGNED

People

(Reporter: jstutte, Assigned: jstutte)

References

(Depends on 2 open bugs, Blocks 2 open bugs)

Details

(Whiteboard: [sp3])

Attachments

(2 files, 3 obsolete files)

arena_t::Purge checks whether we have exceeded our maximum of unused but still allocated memory and, if so, is responsible for updating our internal book-keeping and giving memory chunks back to the OS. It does so synchronously whenever a free or realloc happens. Both our internal book-keeping and returning memory to the OS can be quite time consuming. We already have ways to influence how much dirty memory we allow before checking, but taking this too far comes at the price of memory overhead.
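To make the cost concrete, a minimal sketch of that trigger (placeholder names like MaybePurge, mNumDirty and mMaxDirty; not the actual mozjemalloc code):

    #include <cstddef>

    // Simplified, hypothetical sketch of the synchronous purge trigger.
    struct arena_t {
      std::size_t mNumDirty = 0;   // pages freed but not yet returned to the OS
      std::size_t mMaxDirty = 128; // allowed dirty pages before we purge

      void Purge() {
        // Update internal book-keeping and give dirty pages back to the OS
        // (e.g. via madvise()/VirtualFree()); elided here.
        mNumDirty = 0;
      }

      // Called from the free()/realloc() paths after pages become dirty.
      void MaybePurge(std::size_t aNewDirtyPages) {
        mNumDirty += aNewDirtyPages;
        // The check and the purge happen synchronously on the freeing
        // thread, which is exactly the cost this bug wants to defer or avoid.
        if (mNumDirty > mMaxDirty) {
          Purge();
        }
      }
    };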

There are situations where we know, at higher levels of abstraction, that during certain operations a purge is not needed at all or can be done later, which would reduce or avoid this overhead. For example:

  • process shutdown. If a process knows that it is on a straight path to shutdown, any purge operation can be avoided entirely.
  • memory drop and raise. When we destroy a larger tree of objects just to re-create a new tree of other objects, any purge that happens in between could be deferred until after the end of that cycle.

This bug wants to explore the potential of making moz_jemalloc aware of such conditions, or of finding other ways to avoid the impact of these purges, if possible. A big part of this is probably to set up good ways of measuring the impact of what we are trying to do here.

See Also: → 1902728, 1809115

:smaug said we already looked at this from various angles in the past, so we know there might be potential, but we do not really know how much and whether it is worth it. Having some PoC patches might help to find out.

When a process is about to shut down, there is no point in spending effort on purging our memory; once it exits, the memory is returned to the system anyway. Rather than adding explicit different code paths everywhere, we can signal jemalloc via a global flag that it can stop bothering. Once the flag is flipped, jemalloc will not execute any purge, and realloc will only have an effect if the new size is bigger than the old one.

For content processes, we flip the flag when shutdown is first notified. This notification also interrupts content JS execution, such that there should be no more significant allocations until shutdown.

For the parent process, this patch takes the highest-risk approach by associating the flip with entering the AppShutdownConfirmed shutdown phase, just to verify the potential impact. FWIW, given that we only block purge, we do not expect to see unnecessary allocations of system memory.
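For illustration, a minimal sketch of the flag idea (hypothetical names, e.g. sStopPurging and jemalloc_stop_purging(); the actual patch integrates with jemalloc differently):

    #include <atomic>
    #include <cstddef>

    // Hypothetical "stop purging once shutdown started" flag.
    static std::atomic<bool> sStopPurging{false};

    // Flipped by the embedder: content processes on the shutdown
    // notification, the parent process when entering AppShutdownConfirmed.
    void jemalloc_stop_purging() {
      sStopPurging.store(true, std::memory_order_relaxed);
    }

    // Inside the allocator, every purge decision checks the flag first.
    bool ShouldPurge(std::size_t aNumDirty, std::size_t aMaxDirty) {
      if (sStopPurging.load(std::memory_order_relaxed)) {
        return false;  // The OS reclaims everything at process exit anyway.
      }
      return aNumDirty > aMaxDirty;
    }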

There might be room for doing more on huge object free, which currently is unaltered.

Note that in leak checking builds we do not skip anything, of course.

This is a PoC patch and might want some adjustment to integrate better with jemalloc. There might be relevant telemetry already in place to monitor the effect.
Note that the profiler currently gets fooled about memory consumption, as we bail out at lower levels, and it would probably need some adjustment if we want to go this route.

See Also: → 1903764

memory drop and raise. When we destroy a larger tree of objects just to re-create a new tree of other objects, any purge that happens in between could be deferred until after the end of that cycle.

As far as this case is concerned, there could be some mechanism that allows us to give moz_jemalloc a callback to be invoked on purge, something like:

  • The consuming process sets a callback std::function<...> cbRequestPurge for moz_jemalloc purge
    ...
  • Some free happens
  • moz_jemalloc notices it should purge, does setPurgePending() and calls cbRequestPurge([arena = this]() { arena->doPendingPurge(); })
  • the cbRequestPurge implementation does an idle dispatch of the passed callback to the current serial event target
  • when that event is executed, the real purge happens.

The basic idea is to hook into jemalloc on purge and defer it to when we are less busy.
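For illustration, a rough sketch of how that plumbing could look (using the hypothetical names from the list above; how exactly this would hook into jemalloc is the hairy part mentioned below):

    #include <functional>

    // Hypothetical deferral hook set by the consuming process.
    static std::function<void(std::function<void()>)> cbRequestPurge;

    struct arena_t {
      bool mPurgePending = false;

      void Purge() { /* book-keeping + give pages back to the OS */ }

      void setPurgePending() { mPurgePending = true; }

      void doPendingPurge() {
        mPurgePending = false;
        Purge();
      }

      // Called where the synchronous purge would happen today.
      void MaybePurge() {
        if (cbRequestPurge && !mPurgePending) {
          // Defer: remember that a purge is owed and let the embedder
          // schedule it, e.g. as an idle dispatch to the current serial
          // event target.
          setPurgePending();
          cbRequestPurge([arena = this]() { arena->doPendingPurge(); });
          return;
        }
        Purge();  // Fallback: purge synchronously as before.
      }
    };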

This patch is meant to just show that this may move some needles. It also shows that it is a bit hairy to inject something like this into jemalloc, and this might not be the best way to do it at all.

Attachment #9411253 - Attachment description: WIP: Bug 1903758 - PoC: Do a lazy purge for arenas on nsThread threads. → WIP: Bug 1903758 - PoC: Do a lazy purge for all arenas via idle processing on the main thread.

The lazy purge seems to have some positive effect on (not only) sp3 numbers, so smaug suggested also linking this bug there to keep it on the radar for further investigation.

Blocks: speedometer3
Whiteboard: [sp3]
Attachment #9411253 - Attachment description: WIP: Bug 1903758 - PoC: Do a lazy purge for all arenas via idle processing on the main thread. → WIP: Bug 1903758 - PoC: Do a lazy purge for arenas on nsThread threads.
Attachment #9411253 - Attachment description: WIP: Bug 1903758 - PoC: Do a lazy purge for arenas on nsThread threads. → WIP: Bug 1903758 - PoC: Do a lazy purge for all arenas via idle processing on the main thread.

Bug 1488780 will be an interesting, complementary improvement: Once we can purge without holding an arena's lock, the impact of purging on the main thread while other threads do some work that uses such an arena should be minimal.

See Also: → 1488780

there could be some mechanism that allows us to give moz_jemalloc a callback to be invoked on purge

We do not need a callback; the updated patch avoids that.

Assignee: nobody → jstutte
Attachment #9421656 - Attachment description: WIP: Bug 1903758 - Make mozjemalloc have a lazy purge for all arenas via idle processing on the main thread. → Bug 1903758 - Make mozjemalloc have a lazy purge for all arenas via idle processing on the main thread. r?glandium,smaug
Status: NEW → ASSIGNED
Attachment #9422232 - Attachment description: WIP: Bug 1903758 - Disable dom.memory.foreground_content_processes_have_larger_page_cache but introduce enabled dom.memory.background_content_processes_have_smaller_page_cache. → Bug 1903758 - Disable dom.memory.foreground_content_processes_have_larger_page_cache but introduce enabled dom.memory.background_content_processes_have_smaller_page_cache. r?smaug
Attachment #9422232 - Attachment is obsolete: true
Attachment #9424062 - Attachment description: WIP: Bug 1903758 - Let chunk_dealloc do a lazy pages_unmap. → Bug 1903758 - Let chunk_dealloc do a lazy pages_unmap. r?glandium,pbone,smaug
Attachment #9421656 - Attachment description: Bug 1903758 - Make mozjemalloc have a lazy purge for all arenas via idle processing on the main thread. r?glandium,smaug → WIP: Bug 1903758 - Make mozjemalloc have a lazy purge for all arenas via idle processing on the main thread. r?glandium,smaug
Attachment #9421656 - Attachment description: WIP: Bug 1903758 - Make mozjemalloc have a lazy purge for all arenas via idle processing on the main thread. r?glandium,smaug → Bug 1903758 - Make mozjemalloc have a lazy purge for all arenas via idle processing on the main thread. r?glandium,smaug

Does this bug depend on Bug 1913753?

(In reply to Paul Bone [:pbone] from comment #11)

Does this bug depend on Bug 1913753?

I assume only for the order in which the patches land, to avoid conflicts, but it's probably good to make it explicit. It definitely depends on bug 1488780, which is already blocking Bug 1913753.

Depends on: 1913753
Attachment #9411253 - Attachment is obsolete: true
Blocks: 1920450

Comment on attachment 9408621 [details]
WIP: Bug 1903758 - PoC: Avoid purge in our memory allocator after shutdown started.

Revision D214381 was moved to bug 1920450. Setting attachment 9408621 [details] to obsolete.

Attachment #9408621 - Attachment is obsolete: true
Summary: Experiment with avoiding and/or deferring arena_t::Purge where possible → Experiment with deferring and/or avoiding arena_t::Purge where possible
Depends on: 1920451