Open Bug 1693542 Opened 3 years ago Updated 2 years ago

Prototype concurrent marking implementation

Categories

(Core :: JavaScript: GC, enhancement, P3)

People

(Reporter: jonco, Unassigned)

References

(Blocks 1 open bug)

Attachments

(1 obsolete file)

This bug aims to deliver a prototype implementation of concurrent GC marking on Intel 64 bit platforms. This will be used as a base for further testing and performance evaluation.

High level design:

  • For the sake of simplicity only one thread will mark at a time (either the main thread or a helper thread)
  • Write barriers on the main thread will add unmarked cells to a buffer to mark later in case concurrent marking is happening
  • Not everything can be traced concurrently; when such things are encountered off the main thread they are added to a buffer to mark later on the main thread
  • There is occasional synchronisation between the main thread and the marking thread to exchange these buffers
  • Concurrent marking may observe stale data but the snapshot-at-the-beginning invariant saves us by ensuring that any missed marking happens eventually
  • Every pointer the marker follows must be atomic to avoid undefined behaviour and prevent problematic compiler optimizations
  • Ditto for all metadata about cell layout necessary for tracing
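To make the design above concrete, here is a minimal sketch of the write barrier and buffer exchange. It is not the prototype's code: `Cell`, `ExchangeBuffer`, `writeChild`, `markFrom` and `drainBarrierBuffer` are hypothetical names, each cell has a single child edge, and a mutex stands in for whatever synchronisation the real micro-slices use.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical cell type; not SpiderMonkey's real cell layout.
struct Cell {
    std::atomic<bool> marked{false};
    // Every edge the marker follows is atomic, as the design requires.
    std::atomic<Cell*> child{nullptr};
};

// Buffer filled by one thread and drained by the other at a sync point.
class ExchangeBuffer {
    std::mutex lock_;
    std::vector<Cell*> cells_;

  public:
    void push(Cell* c) {
        assert(c != nullptr);
        std::lock_guard<std::mutex> guard(lock_);
        cells_.push_back(c);
    }

    // Hand over everything buffered so far and leave the buffer empty.
    std::vector<Cell*> take() {
        std::lock_guard<std::mutex> guard(lock_);
        std::vector<Cell*> out;
        out.swap(cells_);
        return out;
    }
};

// Snapshot-at-the-beginning pre-write barrier: before an edge is overwritten,
// remember the old target if it has not been marked yet.
inline void writeChild(Cell& owner, Cell* newChild, ExchangeBuffer& barrierBuffer,
                       bool markingActive) {
    Cell* old = owner.child.load(std::memory_order_relaxed);
    if (markingActive && old && !old->marked.load(std::memory_order_relaxed)) {
        barrierBuffer.push(old);
    }
    owner.child.store(newChild, std::memory_order_release);
}

// Mark a chain of cells, stopping at the first one that is already marked.
// Returns the number of cells newly marked.
inline std::size_t markFrom(Cell* c) {
    std::size_t count = 0;
    while (c && !c->marked.exchange(true, std::memory_order_acq_rel)) {
        ++count;
        c = c->child.load(std::memory_order_acquire);
    }
    return count;
}

// Run at a synchronisation point: mark everything the barrier buffered.
inline std::size_t drainBarrierBuffer(ExchangeBuffer& buffer) {
    std::size_t count = 0;
    for (Cell* c : buffer.take()) {
        count += markFrom(c);
    }
    return count;
}
```

If the main thread overwrites an edge before the marker reaches it, the marker sees only the new target, but the old target sits in the buffer and is marked at the next exchange. This is the sense in which missed marking "happens eventually".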

Limitations:

  • Some things will still get marked on main thread
  • We need occasional ‘micro-slices’ to synchronize between main/marker threads
  • The array shift optimization will be disabled during concurrent marking
  • We must pre-initialize unused slot/element storage, which causes a small perf regression
  • This requires 64 bit platforms because we need atomic JS::Value access
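The last two limitations can be illustrated with a small sketch, assuming a 64-bit boxed value held in a `std::atomic`. The tag constants and the `SlotStorage` class are invented for illustration and do not reflect the actual JS::Value encoding or slot layout.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical boxed-value encoding; these tag values are invented for
// illustration and are not SpiderMonkey's actual JS::Value layout.
constexpr std::uint64_t kTagShift = 48;
constexpr std::uint64_t kTagMask = std::uint64_t{0xFFFF} << kTagShift;
constexpr std::uint64_t kUndefinedBits = std::uint64_t{0xFFF9} << kTagShift;
constexpr std::uint64_t kObjectTag = std::uint64_t{0xFFFE} << kTagShift;

// The sketch assumes a whole value can be read or written in one lock-free
// atomic access.
static_assert(ATOMIC_LLONG_LOCK_FREE == 2,
              "sketch assumes lock-free 64-bit atomics");

using AtomicValue = std::atomic<std::uint64_t>;

inline bool isObject(std::uint64_t bits) {
    return (bits & kTagMask) == kObjectTag;
}

// Slot storage whose unused capacity is pre-initialized, so a concurrent
// marker scanning the whole allocation never interprets garbage as a pointer.
class SlotStorage {
    std::size_t capacity_;
    std::unique_ptr<AtomicValue[]> slots_;

  public:
    explicit SlotStorage(std::size_t capacity)
        : capacity_(capacity), slots_(new AtomicValue[capacity]) {
        // This loop is the extra work behind the small perf regression.
        for (std::size_t i = 0; i < capacity_; ++i) {
            slots_[i].store(kUndefinedBits, std::memory_order_relaxed);
        }
    }

    std::size_t capacity() const { return capacity_; }

    void set(std::size_t i, std::uint64_t bits) {
        assert(i < capacity_);
        slots_[i].store(bits, std::memory_order_release);
    }

    std::uint64_t get(std::size_t i) const {
        assert(i < capacity_);
        return slots_[i].load(std::memory_order_acquire);
    }
};

// What a marker might do: scan every slot and count those holding objects.
inline std::size_t countObjectSlots(const SlotStorage& storage) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < storage.capacity(); ++i) {
        if (isObject(storage.get(i))) {
            ++count;
        }
    }
    return count;
}
```

Without the constructor loop, a marker that reads slots beyond those the main thread has written so far could observe arbitrary bits.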
Depends on: 1693590
Depends on: 1694209
Depends on: 1749329
Depends on: 1750792
Depends on: 1750794

The shell allocation metadata builder creates a bunch of objects for every
allocation, including a dummy representation of the stack at which the allocation
happened. This puts the stack frames into a JS array object. This would usually
use dense elements to store the indexed properties, but the code uses property
attribute flags of zero, which creates non-enumerable properties. This leads to
the object going into dictionary mode instead, which is much slower.

This causes a problem with the test
jit-test/tests/errors/overrecursed-double-fault-1.js which enables allocation
metadata and then enters an infinite loop. The test can recurse ~40000 times
before throwing an overrecursion error. The error also contains a
representation of the stack, including a SavedFrame JS object for every frame.
The allocation metadata builder then tries to create metadata for each one of
these objects, and so this is quadratic in the number of frames. All of this
means that this test can easily time out.

The simplest solution for all this is to use the default attributes when
creating these properties which results in the faster dense element storage
being used.

Comment on attachment 9259553 [details]
Bug 1693542 - Part 1: Make the shell allocation metadata hook create properties with default attributes r?sfink

Revision D136248 was moved to bug 1750794. Setting attachment 9259553 [details] to obsolete.

Attachment #9259553 - Attachment is obsolete: true
Depends on: 1750964

This turned out to be difficult for two main reasons:

  1. It requires making everything traced by GC atomic, which is a large and invasive change. This goes well beyond wrapped pointers like GCPtr: for example, the JIT has many malloc-allocated data structures that are traversed on GC. It also causes a lot of API churn wherever we currently return a reference to something that can be traced by GC.

  2. It interferes with several optimisations, which must now be synchronised. For example, rope flattening would potentially require synchronisation both on the main thread and every time we mark a string off thread.
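As a rough illustration of the rope flattening problem, the sketch below guards both flattening and string marking with one lock. `Str`, `gFlattenLock`, `flatten` and `markString` are hypothetical names and this is not how the prototype worked; it only shows why both sides end up paying for synchronisation.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical string cell: either linear (chars) or a rope (left + right).
struct Str {
    bool marked = false;
    std::string chars;     // contents when linear
    Str* left = nullptr;   // children when a rope
    Str* right = nullptr;
    bool isRope() const { return left != nullptr; }
};

// A single lock shared by flattening and marking. The fields above are plain
// (non-atomic) because every access in this sketch happens under the lock.
std::mutex gFlattenLock;

static void appendChars(const Str& s, std::string& out) {
    if (s.isRope()) {
        assert(s.right != nullptr);
        appendChars(*s.left, out);
        appendChars(*s.right, out);
    } else {
        out += s.chars;
    }
}

// Main thread: turn a rope into a linear string in place. This sketch omits
// any barriers on the dropped child edges.
void flatten(Str& s) {
    std::lock_guard<std::mutex> guard(gFlattenLock);
    if (!s.isRope()) {
        return;
    }
    std::string out;
    appendChars(s, out);
    s.chars = std::move(out);
    s.left = nullptr;
    s.right = nullptr;
}

static std::size_t markLocked(Str& s) {
    if (s.marked) {
        return 0;
    }
    s.marked = true;
    std::size_t count = 1;
    if (s.isRope()) {
        assert(s.right != nullptr);
        count += markLocked(*s.left);
        count += markLocked(*s.right);
    }
    return count;
}

// Marker thread: takes the same lock so it never observes a half-flattened
// rope. Returns the number of strings newly marked.
std::size_t markString(Str& s) {
    std::lock_guard<std::mutex> guard(gFlattenLock);
    return markLocked(s);
}
```

Taking a lock for every string marked off thread, and for every flatten on the main thread, is the kind of cost that would need careful performance evaluation.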

I didn't get to the point of stabilising this in the browser and there are sure to be other issues beyond these.

Problem 1 could be addressed by making more things into GC things and this is desirable in the long term, but is a large change.

Problem 2 is solvable but would potentially affect performance and would have to be evaluated carefully.

I think that this is viable eventually, but that in the short to medium term we can get more return on our efforts by working on parallel marking instead.

Assignee: jcoppeard → nobody