Closed Bug 1249367 Opened 4 years ago Closed 4 years ago
Racy GC fails to destroy compartments and zones
I have roughly 5000 zones floating around in explicit/js-non-window/zones. They all have one compartment, and no arenas, but they're still floating around. Their URIs are from pages I've visited over the last day or so. There's a race in the GC. <khuey> so <khuey> the sequence of events on one zone here <khuey> in GCRuntime::endSweepingZoneGroup we add this zone to a list <khuey> then we Zone::sweepCompartments on the main thread <khuey> which refuses to do anything because a background sweep is pending <khuey> and then at some point GCRuntime::sweepBackgroundThings removes us from its list <khuey> terrence: sfink: ^ <khuey> this is racy ...
5000 empty zones have to add up to some kind of memory usage.
I'm working on this. Thanks for giving me an excuse!
Assignee: nobody → terrence
It has always struck me as more than a little squick that the GC kicks off a background finalization thread and then "finishes" the GC, resetting the HeapState to IDLE even though this is demonstrably untrue. Besides, there are places where the GC needs to take back over after background finalization is done: buffer shrinking, compacting, and apparently zone sweeping. The buffer-shrinking case does this by having Gecko call us back after "awhile" and the zone case we handle by waiting for the next GC. Unless this is the last GC, where we happen to do the correct thing, probably by accident. And naturally it's totally undocumented. In any case, the preferable solution is the one that Jon implemented for COMPACTING: keep the GC active, but use 0-sized slices until until background finalization is done. I've got a patch that does this, but unfortunately, there are a ton of places in tests where we assume that gcslice(<large number>) will do a complete GC. This has not been true for awhile except that we still have to manually schedule compacting GCs. My feeling is that we should just fix the tests here, and I'm currently working on that. Other than tests against fiddly GC semantics, this approach appears to work fine in the shell.
I got the SM tests all green before I had to take off on Friday, but I really didn't expect that to be adequate to make the browser tests pass too.
Attachment #8722024 - Flags: review?(jcoppeard)
Comment on attachment 8722024 [details] [diff] [review] make_bg_finalization_an_explicit_gc_phase-v0.diff Review of attachment 8722024 [details] [diff] [review]: ----------------------------------------------------------------- Looks good! ::: js/src/jsgc.cpp @@ +6192,5 @@ > + gcstats::AutoPhase ap1(stats, gcstats::PHASE_SWEEP); > + gcstats::AutoPhase ap2(stats, gcstats::PHASE_DESTROY); > + AutoSetThreadIsSweeping threadIsSweeping; > + FreeOp fop(rt); > + sweepZones(&fop, destroyingRuntime); Can we get rid of the call to sweepZones() in endSweepPhase() and just do it once here?
Attachment #8722024 - Flags: review?(jcoppeard) → review+
I /think/ so? I left it alone because the predicates of the two sweepZones calls there are not clearly symmetrical. However, given that this is green already and I expected it to take a week to stabilize, I'll go ahead and try it.
https://hg.mozilla.org/integration/mozilla-inbound/rev/d8da9bc56468f58be505e61e72e78bff1d2b4b61 Bug 1249367 - Make background finalization a GC phase (and clean up Zones properly); r=jonco
Some really nice wins appearing in Talos from this: https://treeherder.allizom.org/perf.html#/alerts?id=268
and one small regression with tps (talos page switch) of about 2.5% on linux64 e10s: https://treeherder.allizom.org/perf.html#/compare?originalProject=mozilla-inbound&originalRevision=5a1a5726b179&newProject=mozilla-inbound&newRevision=d8da9bc56468&framework=1
You need to log in before you can comment on or make changes to this bug.