Open Bug 1232802 (ConcurrentGC) Opened 8 years ago Updated 2 years ago

[meta] Concurrent GC

Categories

(Core :: JavaScript: GC, defect)

Tracking Status
firefox46 --- affected

People

(Reporter: terrence, Unassigned)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Keywords: meta)

Although we've had great success with incremental GC, we're rapidly approaching the point at which further incrementalization is very difficult. Moreover, as we push MMU up, we increase our per-slice fixed overhead, which limits how far we can push even the easily incrementalized parts of the GC. If we want to get down to guaranteed <1ms pauses, we need a new approach.
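The per-slice overhead argument above can be illustrated with a toy model. This is only a sketch under stated assumptions: `SliceModel`, `slicesNeeded`, and `totalGcTimeMs` are hypothetical names, the numbers are invented, and nothing here reflects how SpiderMonkey actually schedules slices. It assumes every slice pays the same fixed cost before doing any useful work.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Toy model (an assumption, not SpiderMonkey's scheduler): each slice has a
// pause budget and pays a fixed entry/exit cost before doing useful work.
struct SliceModel {
    double budgetMs;         // pause budget per slice
    double fixedOverheadMs;  // hypothetical fixed cost paid by every slice
};

// Number of slices needed to finish `workMs` of useful GC work.
// Returns infinity when the overhead consumes the whole budget.
double slicesNeeded(const SliceModel& m, double workMs) {
    const double usefulMs = m.budgetMs - m.fixedOverheadMs;
    if (usefulMs <= 0.0)
        return std::numeric_limits<double>::infinity();
    return std::ceil(workMs / usefulMs);
}

// Total time spent inside the collector: useful work plus overhead per slice.
double totalGcTimeMs(const SliceModel& m, double workMs) {
    const double slices = slicesNeeded(m, workMs);
    if (std::isinf(slices))
        return slices;
    return workMs + slices * m.fixedOverheadMs;
}
```

Under these made-up numbers, 95 ms of work with a 10 ms budget and 0.5 ms overhead takes 10 slices and 100 ms in total. Shrinking the budget to 1 ms takes 190 slices and 190 ms, and a budget equal to the overhead never finishes.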

Processor performance growth is also happening largely through more cores rather than single-thread performance, and this will probably continue for the foreseeable future. Incremental GC approaches therefore have a fixed latency floor, while the cost of concurrent GC will keep dropping as chip makers push hard on memory bandwidth to sate the inevitable throughput demands of massive core counts.
My ideal, I think, would be to have a thread per Zone, since that is how allocations are grouped. In my fairy tale world, this would extend to all of SpiderMonkey, making it concurrent across tabs - but I don't know how often Zones have to talk to each other in practice, or how often we write to structures shared by the whole Runtime. I know a lot of the more senior SM hackers remember the days that Runtimes were multi-threaded with dread, but I feel like Zones and Compartments split things rather naturally. I don't know what would be involved in such an undertaking though - I assume the DOM might not be split up quite as nicely, and we still have multiple Contexts (but I always forget what they're for).
Depends on: ConcurrentMarking
Depends on: 1232814
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #1)
> My ideal, I think, would be to have a thread per Zone,

This is similar in concept to bug 616927, I think?

> since that is how
> allocations are grouped. In my fairy tale world, this would extend to all of
> SpiderMonkey, making it concurrent across tabs

We already do this for parsing: see ExclusiveContext.

> - but I don't know how often
> Zones have to talk to each other in practice,

I don't think it's huge.

> or how often we write to
> structures shared by the whole Runtime.

All the time. Literally. All. The. Time. XPConnect, for one, has basically no idea what a compartment is and for many "reasons" adding such an understanding is very, very hard. Gecko knows enough to ask, at least, but generally it's just being polite. I wouldn't call what it does an invariant.

> I know a lot of the more senior SM
> hackers remember the days that Runtimes were multi-threaded with dread,

I'm not one of them, thankfully. However...

> but
> I feel like Zones and Compartments split things rather naturally.

They do at that! And we're already taking advantage of it to a small degree, as mentioned above. Unfortunately, Zones are extremely new and Compartments not much older. There's just a ton of stuff on JSRuntime that deserves to be on Zone, but isn't because Zone didn't exist when said stuff was added. 

> I don't
> know what would be involved in such an undertaking though - I assume the DOM
> might not be split up quite as nicely, and we still have multiple Contexts
> (but I always forget what they're for).

Contexts are very close to going away. We'll be able to replace most of them with JSRuntime directly and replace ExclusiveContext (really just a wrapper around a JSCompartment, but looking like a JSContext so we can pass it around) with JSCompartment. Then we'll want to work to convert the JSRuntime methods over to taking Zone*. Then mprotect(&rt) and hope for the best.

The problem is that there are just millions of lines of (useful and working) code that obliviously prod global structures, and C++'s memory model makes updating such things... difficult. Better than C, but still, quite horrible. If only we had a better language for guaranteeing memory safety and the organizational will to commit to a piecemeal rewrite of... oh wait.

The challenge for the GC in this context is to (1) build a water-tight (maybe thread-tight?) interface that both rust and C++ can link against and use effortlessly and that (2) doesn't hog servo's main thread.

I may not have emphasized it enough in comment 0, but there are *serious* (research level) problems to solve here and we're going to have to step up big time in 2016 if we want to keep up with servo. We're going to though. It's on the schedule.
(In reply to Terrence Cole [:terrence] from comment #2)

Servo uses, roughly, a JSRuntime instance per zone right now. That gets around all these issues, but obviously is far from ideal from a memory footprint point of view.

> All the time. Literally. All. The. Time. XPConnect, for one, has basically
> no idea what a compartment is and for many "reasons" adding such an
> understanding is very, very hard. Gecko knows enough to ask, at least, but
> generally it's just being polite. I wouldn't call what it does an invariant.

I wonder if SM could at least share more stuff between runtimes if the embedding actually did uphold some strong invariants.

> > I don't
> > know what would be involved in such an undertaking though - I assume the DOM
> > might not be split up quite as nicely, and we still have multiple Contexts
> > (but I always forget what they're for).

I can't comment on the state of things in Gecko (where addons probably complicate things), but ISTM the DOM at least conceptually is split up in exactly the right way: there's no way for two documents from separate origins (or even the same origin but without a common creation history via iframe embedding or window.open) to synchronously communicate with each other. I don't see why making such communication impossible on the implementation level shouldn't work. In principle at least. And in Servo.

> The challenge for the GC in this context is to (1) build a water-tight
> (maybe thread-tight?) interface that both rust and C++ can link against and
> use effortlessly and that (2) doesn't hog servo's main thread.

Servo already doesn't really have a main thread as far as JS is concerned. It'd be very nice to pay less of a steep price for it, though.

> I may not have emphasized it enough in comment 0, but there are *serious*
> (research level) problems to solve here and we're going to have to step up
> big time in 2016 if we want to keep up with servo. We're going to though.
> It's on the schedule.

As said above, memory usage is a huge concern, and I think in a way it'll only become more so: we really should be doing more to reduce memory usage *inside* a JSRuntime, by being smarter about sharing stuff (such as source buffers and bytecodes, both of which we share in many, but not all cases). Improving there will only make the multiple runtimes case look worse in comparison.

However, it'd be even better to be able to solve parts of this in an even harder scenario: multi-process. Both Gecko and Servo are aggressively moving to multi- or even many-process setups. Right now, that imposes pretty huge overheads, like not being able to share the self-hosting compartment or any bytecode whatsoever.

What does all of this have to do with concurrent GC? Not too much, except for making it less important, because multiple JSRuntimes would get it for free, basically. Well, at least for scenarios other than "the current thread constantly creates substantial amounts of garbage, but also wants very short pause times".
(In reply to Till Schneidereit [:till] from comment #3)
> (In reply to Terrence Cole [:terrence] from comment #2)
> 
> there's no way for two documents from separate
> origins (or even the same origin but without a common creation history via
> iframe embedding or window.open) to synchronously communicate with each
> other. I don't see why making such communication impossible on the
> implementation level shouldn't work.

/me muses about SharedWorker or ServiceWorker coupled with either SharedArrayBuffer (which can be used to implement synchronous messaging) or some new synchronous messaging mechanism, which always seems to be on the brink of being proposed because async messaging is so slow.

(Not meaning to scare anyone; the plan is to disallow shared memory with SharedWorker and ServiceWorker for the time being, and of course sync messaging is going to meet with a wall of resistance.)
(In reply to Till Schneidereit [:till] from comment #3)
> (In reply to Terrence Cole [:terrence] from comment #2)
> 
> Servo uses, roughly, a JSRuntime instance per zone right now. That gets
> around all these issues, but obviously is far from ideal from a memory
> footprint point of view.
> 
> > All the time. Literally. All. The. Time. XPConnect, for one, has basically
> > no idea what a compartment is and for many "reasons" adding such an
> > understanding is very, very hard. Gecko knows enough to ask, at least, but
> > generally it's just being polite. I wouldn't call what it does an invariant.
> 
> I wonder if SM could at least share more stuff between runtimes if the
> embedding actually did uphold some strong invariants.

Speaking generally at an architectural level, I think we're all talking about a two-structure approach: one shared structure containing read-only (or at least read-mostly) common data, and one per-thread structure containing caches, the execution state, and other high-traffic read-write data.

In Servo we are using JSRuntime as the per-thread structure, because that is the best fit that we currently have. We also have a [parent] JSRuntime that can common up some of the shared read-only data, but I guess Servo is not using that yet? In this world, Zones are basically useless.

In Gecko we have one JSRuntime (ignoring web workers). We've jammed some concurrency in at the JSCompartment level in an ad-hoc way. Really we'd like to move all the high-traffic read-write state to Zone and have only read-only data on the JSRuntime.

So I think we're all pointed in the same general direction, but we need to pick one of these paths to pursue. I'd lean towards the first, with the caveat that we'd rework the interface a bit before Servo plays with it. Having optional fields in a structure that are dependent on the value of |parent| has always seemed un-ideal to me. That's not really this bug though.
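The two-structure split described in this comment can be sketched in a few lines. This is a minimal illustration, not SpiderMonkey code: `SharedData`, `PerThreadState`, and their methods are hypothetical names, and a string map stands in for whatever read-only data (sources, bytecode) would really be shared.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical names throughout; none of these are SpiderMonkey types.

// Shared structure: built once, then only read, so any thread may read it
// without locking.
class SharedData {
  public:
    explicit SharedData(std::unordered_map<std::string, std::string> sources)
      : sources_(std::move(sources)) {}

    // Returns nullptr when the name is unknown.
    const std::string* lookup(const std::string& name) const {
        auto it = sources_.find(name);
        return it == sources_.end() ? nullptr : &it->second;
    }

  private:
    const std::unordered_map<std::string, std::string> sources_;
};

// Per-thread structure: caches and other read-write state owned by exactly
// one thread, so it needs no synchronization either.
class PerThreadState {
  public:
    explicit PerThreadState(std::shared_ptr<const SharedData> shared)
      : shared_(std::move(shared)) {}

    // Returns the length of the named source (0 if unknown), caching the
    // answer in this thread's private cache.
    std::size_t sourceLength(const std::string& name) {
        auto hit = lengthCache_.find(name);
        if (hit != lengthCache_.end()) {
            ++cacheHits_;
            return hit->second;
        }
        const std::string* src = shared_->lookup(name);
        const std::size_t len = src ? src->size() : 0;
        lengthCache_.emplace(name, len);
        return len;
    }

    std::size_t cacheHits() const { return cacheHits_; }

  private:
    std::shared_ptr<const SharedData> shared_;
    std::unordered_map<std::string, std::size_t> lengthCache_;
    std::size_t cacheHits_ = 0;
};
```

Each thread would construct its own `PerThreadState` from the same `std::shared_ptr<const SharedData>`. All mutation stays in the per-thread object, which is the property the comment is after.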

> > > I don't
> > > know what would be involved in such an undertaking though - I assume the DOM
> > > might not be split up quite as nicely, and we still have multiple Contexts
> > > (but I always forget what they're for).
> 
> I can't comment on the state of things in Gecko (where addons probably
> complicate things), but ISTM the DOM at least conceptually is split up in
> exactly the right way: there's no way for two documents from separate
> origins (or even the same origin but without a common creation history via
> iframe embedding or window.open) to synchronously communicate with each
> other. I don't see why making such communication impossible on the
> implementation level shouldn't work. In principle at least. And in Servo.

Yeah, I meant under the covers in places like XPConnect, the Cycle Collector, the Wrapper Map, etc. that just see a big pile of objects and are not tied to the top-level document in any way.

> > The challenge for the GC in this context is to (1) build a water-tight
> > (maybe thread-tight?) interface that both rust and C++ can link against and
> > use effortlessly and that (2) doesn't hog servo's main thread.
> 
> Servo already doesn't really have a main thread as far as JS is concerned.
> It'd be very nice to pay less of a steep price for it, though.

Okay, not the "main" thread, the "JS" thread. Sorry for my vulgarity. ;-)

> > I may not have emphasized it enough in comment 0, but there are *serious*
> > (research level) problems to solve here and we're going to have to step up
> > big time in 2016 if we want to keep up with servo. We're going to though.
> > It's on the schedule.
> 
> As said above, memory usage is a huge concern, and I think in a way it'll
> only become more so: we really should be doing more to reduce memory usage
> *inside* a JSRuntime, by being smarter about sharing stuff (such as source
> buffers and bytecodes, both of which we share in many, but not all cases).
> Improving there will only make the multiple runtimes case look worse in
> comparison.
>
> However, it'd be even better to be able to solve parts of this in an even
> harder scenario: multi-process. Both Gecko and Servo are aggressively moving
> to multi- or even many-process setups. Right now, that imposes pretty huge
> overheads, like not being able to share the self-hosting compartment or any
> bytecode whatsoever.

In theory we should be able to allocate the shared bits out of shmem, but doing so elegantly and reliably may be extremely difficult.

> What does all of this have to do with concurrent GC? Not too much, except
> for making it less important, because multiple JSRuntimes would get it for
> free, basically. Well, at least for scenarios other than "the current thread
> constantly creates substantial amounts of garbage, but also wants very short
> pause times".

Even ignoring games, Gmail frequently eats 100% CPU and lags horribly as I use it. I do not see the use case of "large application on the web" as unimportant or niche, or becoming less important in the future.
I think one thing I was forgetting when I commented was the meaning of 'concurrent'. This doesn't mean the GC would do its marking on multiple threads at the same time (although it could) - it means it would run on a separate thread from the mutator, so they could both run at the same time. Even if you had a separate process for each zone, with the way things work right now the GC would still interrupt processing to do its thing.
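The distinction drawn above (collector work on a separate thread while the mutator keeps running) can be shown with a deliberately simplified sketch. It assumes the collector marks an immutable snapshot of the graph, which sidesteps the write barriers a real concurrent marker would need. `Graph`, `markFrom`, `RunResult`, and `markConcurrently` are hypothetical names, not SpiderMonkey APIs.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <utility>
#include <vector>

// Toy heap: node i points at the nodes listed in graph[i].
using Graph = std::vector<std::vector<std::size_t>>;

// Marks everything reachable from `root` in an immutable snapshot.
std::vector<bool> markFrom(const Graph& snapshot, std::size_t root) {
    std::vector<bool> marked(snapshot.size(), false);
    if (root >= snapshot.size())
        return marked;
    std::vector<std::size_t> stack{root};
    while (!stack.empty()) {
        const std::size_t n = stack.back();
        stack.pop_back();
        if (marked[n])
            continue;
        marked[n] = true;
        for (std::size_t next : snapshot[n])
            stack.push_back(next);
    }
    return marked;
}

struct RunResult {
    std::vector<bool> marked;
    std::size_t mutatorSteps;  // work the mutator did while marking ran
};

// Runs marking on a collector thread while the calling (mutator) thread
// keeps working instead of pausing. Simplification: the mutator never
// touches the snapshot, so no barriers are required.
RunResult markConcurrently(const Graph& snapshot, std::size_t root) {
    std::atomic<bool> done{false};
    std::vector<bool> marked;
    std::thread collector([&] {
        marked = markFrom(snapshot, root);
        done.store(true, std::memory_order_release);
    });
    std::size_t steps = 0;
    do {
        ++steps;  // stand-in for real mutator work
    } while (!done.load(std::memory_order_acquire));
    collector.join();
    return {std::move(marked), steps};
}
```

A single collector thread is enough to make this concurrent. Splitting `markFrom` across several threads would make it parallel as well, which is a separate question.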
Yup, this entire conversation is pretty much orthogonal to this bug. :-) Interesting though, at least.
Keywords: meta
Blocks: 1507448
Assignee: terrence.d.cole → nobody
Status: ASSIGNED → NEW
Severity: normal → S3