Convert the GCHelperThread to a GCParallelTask so we can fix the reserve chunk heuristics




4 years ago
27 days ago


(Reporter: terrence, Unassigned)


(Blocks: 2 bugs)


Firefox Tracking Flags

(Not tracked)




4 years ago
There is no straightforward way to do either of these things right now because of the way we handle ShrinkBuffers: it sets a (non-atomic) bool, shrinkFlag, in the class as a side channel. The flag is read both before and after the expire call, and expire is re-run with shrinking if shrinkFlag got set in between. Moreover, we also call expire from the main thread, again in both shrinking and non-shrinking modes. Telling what's going on here at all, much less telling a sensible story about the intended behavior, is basically impossible.

What I intend to do is first figure out what a sensible behavior would be, then make the semantics do that. Then we'll burn the existing code to the ground, salt it, hold purifying rites, etc.

Here are my current thoughts, fairly stream-of-consciousness:
  * Malloc OOM should flush out all reserve chunks and decommit everything immediately (duh!). As a followup, I'd also like it to flush out the compartment and runtime caches. We can't kill the cached GC things, obviously, but we can free all of the table memory, which might help.

  * GC_SHRINKING GCs happen on Gecko's memory pressure events -- they also flush empty chunks and decommit in the foreground. We need to verify that these GCs actually flush the runtime and compartment caches as well, instead of treating them like a normal GC would.

  * Decommitting can happen unobtrusively in the background, so a normal GC should do this after sweeping, but it should not block us from GCing again.

  * Our current expiration policy is: no more than 30MiB, keep chunks for 3 GCs. During startup we easily hit this cap. Left alone we'd keep these chunks for ~3 minutes. But the browser wants nice low memory numbers on startup (and in general), so 4 seconds after every GC it calls ShrinkBuffers, throwing away /all/ of our old chunks down to our minimum (1 chunk). Chunks basically never reach expiration because of this. Moreover, the 30MiB cap causes massive churn when using a site like the real-time raytracer that allocates hundreds of MiB/sec.
  Our options here are:
    (1) Set a max of 1MiB and kill off ShrinkBuffers. We will still churn under heavy allocation, but not significantly more than we do right now. Capping at 1 means we'll just do what ShrinkBuffers is going to do, but sooner, so we can kill off ShrinkBuffers. This would be simple, but is not really satisfying.
    (2) Alternatively, at the end of GC, during ShrinkBuffers, and in allocTask, make use of our current allocation rate when deciding how many chunks to keep or expand. Additionally, set the max empty chunks value to something actually reflecting the physical RAM available in the current environment so that machines that are capable will not choke on heavy allocation, but will still eagerly clip themselves down to 1MiB when idle.

  * The allocTask only allocates up to the minimum (1 chunk), but it gets fired off unconditionally every time we have to allocate a new chunk. I think in order for this to actually do anything, we have to be allocating so fast that we fill a full chunk before it gets a chance to run, and we have to interlock such that we're just between the last allocation of the prior chunk and the first allocation of the next chunk. Even so, I've actually seen this happen a couple of times during startup. Still, that's a ton of overhead for something that's useless 99% of the time.
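
Option (2) above could look roughly like the following. This is only a sketch, not SpiderMonkey code: the names `maxEmptyChunksFor` and `targetEmptyChunks`, the one-second look-ahead, and the 1/64-of-RAM cap are all assumptions picked for illustration, and the 1 MiB chunk size is inferred from the "1 chunk" / "1MiB" equivalence used above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Assumption: one GC chunk is 1 MiB, matching the "1 chunk" / "1MiB" minimum above.
constexpr std::uint64_t kChunkBytes = 1024 * 1024;
// The idle floor described above: always keep at least one empty chunk.
constexpr std::size_t kMinEmptyChunks = 1;

// Hypothetical helper: derive the empty-chunk cap from physical RAM.
// The 1/64 fraction is an arbitrary illustrative choice, not a tuned value.
std::size_t maxEmptyChunksFor(std::uint64_t physicalRamBytes) {
    std::uint64_t budgetBytes = physicalRamBytes / 64;
    std::size_t chunks = static_cast<std::size_t>(budgetBytes / kChunkBytes);
    return std::max<std::size_t>(kMinEmptyChunks, chunks);
}

// Hypothetical helper: how many empty chunks to keep, given the recent
// allocation rate. Keeps roughly one second of allocation (rounded up to
// whole chunks), clamped between the idle floor and the RAM-derived cap.
std::size_t targetEmptyChunks(std::uint64_t allocBytesPerSec,
                              std::uint64_t physicalRamBytes) {
    std::uint64_t wanted = (allocBytesPerSec + kChunkBytes - 1) / kChunkBytes;
    std::size_t cap = maxEmptyChunksFor(physicalRamBytes);
    std::size_t floored =
        std::max<std::size_t>(kMinEmptyChunks, static_cast<std::size_t>(wanted));
    return std::min<std::size_t>(cap, floored);
}
```

With these made-up constants, an idle runtime (0 bytes/sec) clips down to 1 chunk, while something allocating 200 MiB/sec on an 8 GiB machine would be allowed to keep 128 chunks instead of churning against a fixed 30MiB cap.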

In terms of actual code, this is:
  * GC_SHRINKING onEndSweep does flushEmptyChunks and decommitWithLockHeld.
  * Normal GC onEndSweep starts the background sweeping task.
  * sweepTask does sweeping without observing cancel_, but MaybeGC will avoid GCing while sweeping; after sweeping it will unconditionally start decommitArenas, which does observe cancel_ and does not block MaybeGC.
  * onOutOfMallocMemory waits on background sweeping without cancelling decommit, then calls flushEmptyChunks and decommitWithLockHeld to be sure.
  * sweepTask does not expire empty chunks directly, rather it spawns the expire task, concurrent with decommitArenas.
  * ShrinkGCBuffers spawns expireTask if neither the sweepTask nor expireTask is currently running. This will give us the opportunity to clobber the empty chunks pool down to nothing if the system has quiesced since the last GC (e.g. after startup).
  * expireTask will take the prior allocation rate and figure out how many chunks it can kill from the background pool while obeying some reasonable bounds on sanity.
  * More speculatively, the allocTask should also observe the allocation rate and try to fill up to some percentage of the needed chunks. This is a bit more dangerous though, so should probably have some pretty tight controls.
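
To make the interlocking in the list above easier to follow, here is a toy single-threaded model of which event starts which task. It is a sketch under assumptions, not the real implementation: `TaskModel` and its members are hypothetical, real tasks would run on helper threads under a lock, and modelling "observes cancel_" as MaybeGC cancelling a running decommit is my reading of the plan rather than something stated outright.

```cpp
// Hypothetical model of the proposed task interactions; all names are
// illustrative and do not correspond to real SpiderMonkey classes.
struct TaskModel {
    bool sweepRunning = false;
    bool decommitRunning = false;
    bool expireRunning = false;
    bool decommitCancelled = false;

    // Normal GC, end of sweep: start the background sweeping task.
    void onEndSweepNormal() { sweepRunning = true; }

    // Background sweeping finished: unconditionally start decommit and
    // spawn the expire task concurrently with it.
    void onSweepFinished() {
        sweepRunning = false;
        decommitRunning = true;
        decommitCancelled = false;
        expireRunning = true;
    }

    // MaybeGC: refuses to GC while sweeping is in flight, but a running
    // decommit does not block it (assumed here: it gets cancelled instead).
    bool maybeGC() {
        if (sweepRunning) {
            return false;
        }
        if (decommitRunning) {
            decommitCancelled = true;
            decommitRunning = false;
        }
        return true;
    }

    // ShrinkGCBuffers: spawn expire only if neither the sweep task nor the
    // expire task is currently running. Returns whether a task was spawned.
    bool shrinkGCBuffers() {
        if (sweepRunning || expireRunning) {
            return false;
        }
        expireRunning = true;
        return true;
    }

    void onExpireFinished() { expireRunning = false; }
};
```

The point of writing it this way is that the shrinking decision becomes an explicit event rather than a flag polled before and after expire, so each transition can be checked in isolation.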

The first part of this happened in bug 1469640. I'm not clear on what the problem was with the reserve chunk heuristics that needed fixing.