Open Bug 721236 (ConcurrentMarking) Opened 13 years ago Updated 2 years ago

Concurrent GC marking

Categories

(Core :: JavaScript: GC, defect)

defect

Tracking

()

People

(Reporter: igor, Assigned: jonco)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Whiteboard: [Snappy:P1])

With the incremental GC infrastructure in place doing marking from another thread becomes rather feasible. We should investigate this.
Whiteboard: [Snappy]
Whiteboard: [Snappy] → [Snappy:P1]
Could you explain what you mean a little more, Igor?

I'm guessing this would mean that a GC thread would do marking at the same time JS code is running. This would be really cool to have, but extremely difficult, I think. One problem is that, on 32-bit architectures, atomically writing to a 64-bit Value is expensive, and we would need all writes to be atomic.

Another problem is that we have more sophisticated data structure invariants than a typical Java VM (which is where this sort of GC is usually done). I think there are probably a fair number of places where we would have to atomically update multiple words, which is something that JVMs usually don't have to do.

It's possible that there are some ways that we could overcome these obstacles, but it would be pretty researchy. I'm not aware of any concurrent GCs for languages as dynamic as JavaScript (although I'm sure some Lisp or Smalltalk person could correct me).
(In reply to Bill McCloskey (:billm) from comment #1)
> I'm guessing this would mean that a GC thread would do marking at the same
> time JS code is running. This would be really cool to have, but extremely
> difficult, I think. One problem is that, on 32-bit architectures, atomically
> writing to a 64-bit Value is expensive, and we would need all writes to be
> atomic.

One possible solution to deal with this: a write barrier can copy the value to a thread-local buffer. When the buffer becomes full, it is sent under the lock to the GC thread. Also, if the GC thread consumes all its marking stack, it will ask the main thread to send it the incomplete buffer. Clearly, there are a lot details that has to be taken care of, but it does not look like it is unsolvable problem.

IMO more difficult problem is what to do about realloc of the object slot array or similar data that also can be accessed by the the marking thread. Again, that local buffer can be helpful here again. Under the concurrent GC the realloc is replaces with malloc and the old memory is sent via the local buffer to the GC thread so it can do the free when it knows it would not access the array again.

> It's possible that there are some ways that we could overcome these
> obstacles, but it would be pretty researchy.

Surely it is reasearchy and it may well turned out that, for example, due to bad CPU cache interactions a concurrent GC would not be able to bring any improvement compared with tuned incremental GC. Still it is worth to look into it as it may help non-concurrent GC by exploring ideas that would not otherwise surface.


 I'm not aware of any concurrent
> GCs for languages as dynamic as JavaScript (although I'm sure some Lisp or
> Smalltalk person could correct me).
Regarding Value updates on 32 bit CPU: if we ensure that Value never cross the cache boundary (true with jemalloc and our GC allocator) and that the concurrent marker always runs from a different CPU core using OS-specific machinery, then the value updates will be atomic.
Summary: GC marking from another thread → Concurrent GC marking
(In reply to Igor Bukanov from comment #3)
> Regarding Value updates on 32 bit CPU: if we ensure that Value never cross
> the cache boundary (true with jemalloc and our GC allocator) and that the
> concurrent marker always runs from a different CPU core using OS-specific
> machinery, then the value updates will be atomic.

That sounds just crazy enough to work :-).
Terrence: is concurrent marking still relevant for GGC?
Component: JavaScript Engine → JavaScript: GC
Flags: needinfo?(terrence)
Yes, although I'm not sure yet what the priority should be.
Flags: needinfo?(terrence)
This might make more sense once we get 64-bit builds on Windows. I'm told that's in the works. Even still, though, it would be pretty difficult and maybe wouldn't help too much.
(In reply to Bill McCloskey (:billm) from comment #7)
> This might make more sense once we get 64-bit builds on Windows. I'm told
> that's in the works. Even still, though, it would be pretty difficult and
> maybe wouldn't help too much.

Does it make a difference on all the other platforms that already are 64bit? Then we shouldn't need to wait for Windows 64bit to fix this …
Assignee: general → nobody
(In reply to Bill McCloskey (:billm) from comment #1)
> It's possible that there are some ways that we could overcome these
> obstacles, but it would be pretty researchy. I'm not aware of any concurrent
> GCs for languages as dynamic as JavaScript (although I'm sure some Lisp or
> Smalltalk person could correct me).

It seems IE has been doing this for a few versions now.

http://blogs.msdn.com/b/ie/archive/2014/10/09/announcing-key-advances-to-javascript-performance-in-windows-10-technical-preview.aspx

It seems that we spend a pretty large chunk of time marking in Splay: ~7-10%, on my machine. Also, it's worth considering for latency reasons, even if the throughput is lower.
Parallel marking (bug 638660) would have a bigger impact on throughput (i.e., splay) than concurrent marking. It's also much easier to implement. I could even imagine a concurrent GC having lower throughput than a non-concurrent one since there will be a ton of write barrier traffic during the benchmark. Parallel marking can also be split across many threads; concurrent only gives us one extra thread (unless you do parallel as well).

Latency also seems like a red herring to me. Long pause times pretty much never happen in the part of the mark phase that could happen concurrently. The pauses are always during root marking or sweeping, which concurrent marking doesn't help with.
No longer blocks: GC.60fps
Blocks: ConcurrentGC
No longer blocks: GC.performance
Blocks: 1507448
Depends on: 1424934
Assignee: nobody → jcoppeard
Alias: ConcurrentMarking
Depends on: 1693542
Severity: normal → S3

The severity field for this bug is relatively low, S3. However, the bug has 61 CCs.
:jonco, could you consider increasing the bug severity?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jcoppeard)

The last needinfo from me was triggered in error by recent activity on the bug. I'm clearing the needinfo since this is a very old bug and I don't know if it's still relevant.

Flags: needinfo?(jcoppeard)
You need to log in before you can comment on or make changes to this bug.