MallocCounter GC requests can get ignored, not releasing memory in a timely fashion

RESOLVED FIXED in Firefox 66

Status

()

defect
P1
normal
RESOLVED FIXED
6 months ago
6 months ago

People

(Reporter: anba, Assigned: jonco)

Tracking

unspecified
mozilla66
Points:
---

Firefox Tracking Flags

(firefox66 fixed)

Details

(Whiteboard: [MemShrink])

Attachments

(1 attachment)

When running the following test case, SpiderMonkey holds 8GB at the end, even though it could have released the memory much earlier.

Here's what happens:

  • Inlined TypedArray construction creates nursery objects which hold a pointer to malloc allocated memory.
  • This malloc allocated memory is managed by the nursery and accounted against the current zone's MallocCounter.
  • During the test more and more memory is allocated, which leads to Zone::gcMallocCounter requesting a GC via Zone::maybeTriggerGCForTooMuchMalloc. First a IncrementalTrigger and later a NonIncrementalTrigger.
  • But despite its name, NonIncrementalTrigger doesn't lead to a non-incremental GC, but instead only changes the GC budget time to unlimited, cf. GCRuntime::budgetIncrementalGC.
  • So when the unlimited, but still incremental, GC happens, the currently active GC state, State::Decommit in the test case, simply finishes to the end of the GC cycle.
  • That means there won't be any Mark/Sweep phase, which is needed to reset the MallocCounter, cf. GCRuntime::performSweepActions and GCRuntime::endSweepingSweepGroup. Which also means Zone::updateAllGCMallocCountersOnGCEnd is never called to reset the zone's MallocCounter.
  • And without resetting the malloc counter, MemoryCounter::triggered_ will get stuck at its current state preventing any further calls to Zone::maybeTriggerGCForTooMuchMalloc, cf. Zone::updateMemoryCounter.

Test case:

function f() {
    var zone = performance.mozMemory.gc.zone;
    print(`Start: mallocBytesRemaining=${String(zone.mallocBytesRemaining).padStart(10)}, maxMalloc=${zone.maxMalloc}`);
    var ta = new Int8Array(8*1024*1024);
    var r = 0
    var t = dateNow()
    for (var i = 0; i < 200; ++i) {
        r += ta.slice(0).length;
    }
    print(`End:   mallocBytesRemaining=${String(zone.mallocBytesRemaining).padStart(10)}, maxMalloc=${zone.maxMalloc}`);
    print("");
}

for (var i = 0; i < 5; ++i) f();

Output:

Start: mallocBytesRemaining=     64620, maxMalloc=134217728
End:   mallocBytesRemaining=1560281088, maxMalloc=201326592

Start: mallocBytesRemaining=1560285333, maxMalloc=201326592
End:   mallocBytesRemaining=3246395977, maxMalloc=201326592

Start: mallocBytesRemaining=3246397382, maxMalloc=201326592
End:   mallocBytesRemaining=4932508330, maxMalloc=201326592

Start: mallocBytesRemaining=4932508455, maxMalloc=201326592
End:   mallocBytesRemaining=6618618787, maxMalloc=201326592

Start: mallocBytesRemaining=6618618912, maxMalloc=201326592
End:   mallocBytesRemaining=8304732796, maxMalloc=201326592

Debug output of the GCRuntime::budgetIncrementalGC return value and the current state in GCRuntime::incrementalSlice:

Gc-Cycle: incremental=1, reason=5, budget=time
incrementalSlice: not-active
incrementalSlice: mark-roots
incrementalSlice: mark
incrementalSlice: sweep
incrementalSlice: finalize
incrementalSlice: --done--
Gc-Cycle: incremental=1, reason=5, budget=time
incrementalSlice: finalize
incrementalSlice: compact
incrementalSlice: decommit
incrementalSlice: --done--
Gc-Cycle: incremental=1, reason=5, budget=unlimited
incrementalSlice: decommit
incrementalSlice: finish
incrementalSlice: --done--
Gc-Cycle: incremental=1, reason=2, budget=unlimited
incrementalSlice: not-active
incrementalSlice: mark-roots
incrementalSlice: mark
incrementalSlice: sweep
incrementalSlice: finalize
incrementalSlice: compact
incrementalSlice: decommit
incrementalSlice: finish
incrementalSlice: --done--

This is a good catch. Thanks also for the detailed report.

Assignee: nobody → jcoppeard
Priority: -- → P1

This is a problem with the way we handle non-incremental triggers that happen during an ongoing incremental GC. For example it's possible that we get a trigger for a zone that is part of the current collection after we've finished sweeping that zone. At the moment we will go non-incremental but won't collect the zone again.

The patch makes us reset in this situation so we finish the current GC and then start a new GC to address the trigger. This fixes problem with the testcase. I also added assertions that we don't finish a GC with any of the malloc triggers outstanding.

Attachment #9036370 - Flags: review?(sphink)
Whiteboard: [MemShrink]
Comment on attachment 9036370 [details] [diff] [review]
bug1519794-malloc-trigger

Review of attachment 9036370 [details] [diff] [review]:
-----------------------------------------------------------------

Wow, subtle.
Attachment #9036370 - Flags: review?(sphink) → review+
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/43022e5e3757
Reset incremental GC on allocation triggers that happen late in an incremental collection r=sfink
Status: NEW → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
You need to log in before you can comment on or make changes to this bug.