Incrementalize gray marking

RESOLVED FIXED in Firefox 66

Status

()

enhancement
P3
normal
RESOLVED FIXED
a year ago
5 months ago

People

(Reporter: jonco, Assigned: jonco)

Tracking

(Depends on 1 bug, Blocks 1 bug)

unspecified
mozilla66
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox66 fixed)

Details

(Whiteboard: [qf:p3])

Attachments

(10 attachments, 5 obsolete attachments)

13.25 KB, patch
sfink
: review+
Details | Diff | Splinter Review
14.50 KB, patch
sfink
: review+
Details | Diff | Splinter Review
10.45 KB, patch
sfink
: review+
Details | Diff | Splinter Review
16.65 KB, patch
sfink
: review+
Details | Diff | Splinter Review
1.39 KB, patch
sfink
: review+
Details | Diff | Splinter Review
11.99 KB, patch
sfink
: review+
Details | Diff | Splinter Review
34.19 KB, patch
sfink
: review+
Details | Diff | Splinter Review
7.83 KB, patch
sfink
: review+
Details | Diff | Splinter Review
4.16 KB, patch
sfink
: review+
Details | Diff | Splinter Review
1021 bytes, patch
sfink
: review+
Details | Diff | Splinter Review
Assignee

Description

a year ago
Splitting this off from bug 1323083.

Gray marking seems to be taking a lot of time according to telemetry.  We should incrementalize it to reduce jank from long GC pauses.
Assignee

Updated

a year ago
Priority: -- → P3
Whiteboard: [qf:p3]
Assignee

Updated

7 months ago
Assignee: nobody → jcoppeard
Assignee

Comment 1

7 months ago
Some minor refactoring patches first.

This moves the check for success of gray root buffering to directly after the call to beginMarkPhase() where it happens.
Attachment #9021818 - Flags: review?(sphink)
Assignee

Comment 2

7 months ago
ShouldTraceCrossCompartment always confuses me when I haven't looked at it in a while.  This main renames things associated with the edge's destination to have 'dst' in their name in contrast with the 'src' argument.
Attachment #9021819 - Flags: review?(sphink)
Assignee

Comment 3

7 months ago
Add an RAII class AutoSetMarkColor to handle changing into and out of gray marking mode.  This sets the mark color and also updates the state of zones in the current sweep group, if any.
Attachment #9021864 - Flags: review?(sphink)
Attachment #9021818 - Flags: review?(sphink) → review+
Comment on attachment 9021819 [details] [diff] [review]
bug1463462-1-refactor-should-trace

Review of attachment 9021819 [details] [diff] [review]:
-----------------------------------------------------------------

Yeah, that's easier to follow.
Attachment #9021819 - Flags: review?(sphink) → review+
Comment on attachment 9021864 [details] [diff] [review]
bug1463462-2-auto-set-mark-color

Review of attachment 9021864 [details] [diff] [review]:
-----------------------------------------------------------------

The only weird thing about this is that it's sometimes used to change the zone gcState_ Zone::Mark <--> Zone::MarkGray and sometimes not. And in fact, I'm not entirely sure what it does in the nonincremental case -- does it redundantly set some of the zones' state and not others? What will be in the current sweep group in that case? I mean, it doesn't really matter, since we've already changed the gcState_ on every zone in the runtime, but it's a little confusing.

Would it be better to take a parameter (regular or template) to say whether to toggle gcState_ for all zones or just the current sweep group? If that complicates what you're planning on using it for, then never mind.

(Alternatively, why does something called "nonIncrementalMark" necessarily do all-zones stuff? I guess it's just for the validator, so maybe the fact that you're validating implies all zones?)

::: js/src/gc/GC.cpp
@@ +5039,5 @@
>          for (GCZonesIter zone(runtime); !zone.done(); zone.next()) {
>              zone->changeGCState(Zone::Mark, Zone::MarkGray);
>          }
> +
> +        {

This separate scope doesn't seem necessary, unless there's some assertion that I'm missing. But maybe it's  easier to understand this way?
Attachment #9021864 - Flags: review?(sphink) → review+
Assignee

Comment 6

7 months ago
That's a good point about setting the zone state.  I figured out that we don't actually need to change this when we change the mark color so I took those parts out.
Attachment #9021864 - Attachment is obsolete: true
Attachment #9022160 - Flags: review?(sphink)
Assignee

Comment 7

7 months ago
Currently we have zone states Mark (for black marking) and MarkGray (for gray marking).

This patch changes these to MarkBlackOnly and MarkBlackAndGray, which allows both black and gray marking.  The idea is that we will do as much black marking as possible first, transition to the second state and then do gray marking incrementally (and subsequent black and gray weak marking).  Black marking will also happen in the latter state because of barriers.

Mostly this patch it just renames the states because black marking already works in what was previously the MarkGray state.

I also leave the Zones in the MarkBlackAndGray state at the end of GCRuntime::endMarkingSweepGroup() and expect them to be in this state in GCRuntime::beginSweepingSweepGroup().  Currently we can't yield in between these two points, but eventually we will.
Attachment #9022170 - Flags: review?(sphink)
Assignee

Comment 8

7 months ago
We maintain a markOverflow flag on arenas, but it's only used for assertions that anything in the delayed marking list has had this set.  There used to be an actual use but this got removed.  I don't think this is really adding much so this patch removes it.
Attachment #9022172 - Flags: review?(sphink)
Assignee

Comment 9

7 months ago
This partitions the mark stack between entries to be marked black and entries to be marked gray.  This adds a field which is the index of the first black entry, or SIZE_MAX if we are in gray marking mode.

GCMarker::markUntilBudgetExhaused() now has to deal with transitioning between black and gray marking as appropriate.

Delayed marking was fiddly to get right.  We have a list of arenas for which we didn't have stack space to mark the children of some cell we marked.  We don't know whether this happened during black or gray marking so we need to mark all children of gray cells gray and all children of black cells black.  But, all gray marking must be done first so that gray entries end up on the bottom of the mark stack.

This requires doing the delayed marking in two passes over the list.  But, marking can cause arenas to be pushed back onto this list and we need to catch this and mark such arenas again.  So we first process the list for gray marking and transfer its contents to a second list, and then we restore the original list and process it again for black marking.
Attachment #9022175 - Flags: review?(sphink)
Assignee

Comment 10

7 months ago
Add a gray marking sweep phase.  This marks incoming gray CCWs and gray roots.  It's non-incremental for now.
Attachment #9022177 - Flags: review?(sphink)
Assignee

Comment 11

7 months ago
When marking weakmaps we now have the situation where the map and its keys may be both gray or black (previously, we marked all black things first and then all gray things).

In general the way I handled this was to make gray things count as unmarked during black marking.  I added IsMarkdBlack() functions and a GCMarker methods to test mark state using either IsMarked or IsMarkedBlack depending on the current mark color.

I had to add a couple of explicit tests for the weakmap itself as well as recording its color, but the logic is the same.

Hopefully this doesn't conflict with your changes too much, or is at least reconcilable.
Attachment #9022179 - Flags: review?(sphink)
Attachment #9022160 - Flags: review?(sphink) → review+
Comment on attachment 9022170 [details] [diff] [review]
bug1463462-3-black-and-gray-mark-state

Review of attachment 9022170 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, that clears up a likely source of confusion.
Attachment #9022170 - Flags: review?(sphink) → review+
Attachment #9022172 - Flags: review?(sphink) → review+
Comment on attachment 9022175 [details] [diff] [review]
bug1463462-5-partition-mark-stack

Review of attachment 9022175 [details] [diff] [review]:
-----------------------------------------------------------------

This makes sense, but it makes me wonder if having separate gray and black mark stacks would be easier -- and possibly even faster, both by keeping things single-pass and by removing the ordering constraint so that you could do black marking first and avoid cases where you first mark a bunch of stuff gray and then later re-mark it black. But it would mean that you would have to check whether the black stack is empty before every pop (assuming you wanted to be heavily black-biased; I suppose you could only check every so often to have a more predictable branch, if you were ok allowing a limited amount of gray->black remarking. Or drain black, then drain gray, then repeat? But gray marking should rarely cause black to be pushed, I guess, so it probably doesn't matter what you do.)

Anyway, I'll review this patch as-is now.

::: js/src/gc/Marking.cpp
@@ +1645,5 @@
> +    AutoSetMarkColor autoSetBlack(*this, MarkColor::Black);
> +
> +    // Change representation of value arrays on the stack while the mutator
> +    // runs.
> +    auto svr = mozilla::MakeScopeExit([&] {saveValueRanges();});

Ooh, I like this. Depending on all exits to call this seems error-prone.

@@ +2737,5 @@
> +            markLaterArenas--;
> +        }
> +#endif
> +
> +        if (color == MarkColor::Black || ArenaCanHaveGrayThings(arena)) {

This should have a comment, at least mentioning the two-pass thing.

This code also makes me wonder if rather than doing this outputList thing and saving stuff away, if the arenas should have separate delayedGray and delayedBlack bits. When exceeding the native stack, you'd set the black bit, and additionally the gray bit if ArenaCanHaveGrayThings. Hmm... but that requires more bookkeeping, since you'd need a queue instead of a stack... yeah, never mind.

If anything, it would be easier to maintain a gray stack top and a black stack top and toss popped things from the gray stack to the black stack... but it all comes out to about the same thing, I guess. I just found this conditional kind of hard to follow and understand until I translated it mentally to `if (color == Black || (color == Gray && ArenaCanHaveGrayThings))`.

@@ +2768,5 @@
> +    // onto the list so we mark children of marked things both colors, if the
> +    // arena is of the appropriate kind. Gray marking must be done first as
> +    // gray entries always sit before black entries on the mark stack.
> +    //
> +    // In order to guarantee progress, gray marking is done non-incrementally.

Can you expand this comment? I don't understand. You must be talking about the case where we do some delayed gray marking, then return to the mutator and it unmarks gray?

Bleck. IIUC, this wouldn't be a problem with separate mark stacks. You'd be marking black first, and you would only be unmarking reachable stuff, so I think it would end up blackened at some point?
Attachment #9022175 - Flags: review?(sphink) → review+
Comment on attachment 9022177 [details] [diff] [review]
bug1463462-6-add-gray-marking-phase

Review of attachment 9022177 [details] [diff] [review]:
-----------------------------------------------------------------

Whoo, this is a scary one but it looks good.
Attachment #9022177 - Flags: review?(sphink) → review+
Comment on attachment 9022179 [details] [diff] [review]
bug1463462-7-fix-weakmap-marking

Review of attachment 9022179 [details] [diff] [review]:
-----------------------------------------------------------------

Sorry Jon, I've talked myself in circles and am now where I think there's a problem with this patch, but it's a pre-existing problem. Maybe I'm wrong somewhere here and you can talk me out of it?

By itself, this patch makes sense to me and I feel like it's doing the right thing. (It also seems pretty harmless with respect to the other weak marking work I've been doing, thankfully.)

You know, I'm going to mark it r+ because I don't think it makes the current situation any worse, and you can decide whether it makes sense to land.
Attachment #9022179 - Flags: review?(sphink) → review+
Assignee

Comment 17

7 months ago
(In reply to Steve Fink [:sfink] [:s:] (PTO Nov 5-16) from comment #13)
> This makes sense, but it makes me wonder if having separate gray and black
> mark stacks would be easier -- and possibly even faster, both by keeping
> things single-pass and by removing the ordering constraint so that you could
> do black marking first and avoid cases where you first mark a bunch of stuff
> gray and then later re-mark it black.

I probably didn't explain it too well, but that's what I'm aiming for here.  Gray entries go at the bottom of the mark stack and black ones at the top, and we mark from the top towards the bottom.  So black things get marked first to minimise the amount of re-marking.  Also, barriers need to be able to push black entries onto the stack when the mutator is running without violating the ordering constraint.

> This code also makes me wonder if rather than doing this outputList thing
> and saving stuff away, if the arenas should have separate delayedGray and
> delayedBlack bits. When exceeding the native stack, you'd set the black bit,
> and additionally the gray bit if ArenaCanHaveGrayThings. Hmm... but that
> requires more bookkeeping, since you'd need a queue instead of a stack...
> yeah, never mind.

Yeah, I experimented with doing this but it makes it tricky to keep track of the case when the arena is re-pushed on the delayed list while doing delayed marking.

> I just found this
> conditional kind of hard to follow and understand until I translated it
> mentally to `if (color == Black || (color == Gray &&
> ArenaCanHaveGrayThings))`.

I'll just write that and then it will be clearer.

> @@ +2768,5 @@
> > +    // In order to guarantee progress, gray marking is done non-incrementally.
> 
> Can you expand this comment? I don't understand. You must be talking about
> the case where we do some delayed gray marking, then return to the mutator
> and it unmarks gray?

OK that comment is really terse.  I'm talking about progress with delayed marking.  Since we need to scan the list twice (for gray and then black marking) we can't remove anything from it until the second pass.  So if we yield during the first pass we will have to start again and process all the arenas.  If there are enough arenas we may never finish during our timeslice.  Disallowing yield during the first pass ensures that the list will at least shrink by one arena every time.  I'll add a comment about this.

This delayed marking is not great in terms of efficiency, but it shouldn't happen much during normal operation because it's only used if the mark stack grows too large.
Assignee

Updated

6 months ago
Blocks: 1507440
Assignee

Updated

6 months ago
Depends on: 1509824
Assignee

Comment 18

6 months ago
Comment on attachment 9021818 [details] [diff] [review]
bug1463462-0-check-gray-root-buffer

Moving preparatory patches to bug 1509824 for landing.
Attachment #9021818 - Attachment is obsolete: true
Assignee

Updated

6 months ago
Attachment #9021819 - Attachment is obsolete: true
Assignee

Updated

6 months ago
Attachment #9022160 - Attachment is obsolete: true
Assignee

Updated

6 months ago
Attachment #9022172 - Attachment is obsolete: true
Assignee

Updated

6 months ago
Depends on: 1509923
Assignee

Comment 19

6 months ago
Patch to improve weak map marking checks using the WeakMap's state to say whether the map is marked black or gray in case we don't have an object.  This makes the checks tighter for internal weak maps.
Attachment #9030805 - Flags: review?(sphink)
Assignee

Comment 20

6 months ago
Patch to make gray marking incremental.  Mainly this removes the final drainMarkStack() from markGrayReferencesInCurrentGroup() and replaces it with marker.markUntilBudgetExhaused(). 

This also deals with gray unmarking: when we try to unmark gray in a zone where gray marking can happen, we may hit something that has not yet been marked gray.  To handle this we push cells in zones that are marking onto the mark stack and let the GC deal with marking them black.  Conversely, when the GC marker hits a black to gray edge where the target is in an zone that is not marking it calls UnmarkGrayGCThing().
Attachment #9030814 - Flags: review?(sphink)
Assignee

Comment 21

6 months ago
The next part is a bit annoying: because gray marking happens incrementally, our assertions that things are not marked gray can fail if that thing is in a zone that is currently marking gray (it could have been marked gray already but also be on the black mark stack).

To deal with this this patch and the next buffer the things we want to assert are not gray and check them after we finish gray marking.

The first part (this patch) is to convert code like MOZ_ASSERT(ObjectIsNotGray()) into AssertObjectIsNotGray().
Attachment #9030820 - Flags: review?(sphink)
Assignee

Comment 22

6 months ago
Patch to buffer grayness checks when necessary and perform them when appropriate.
Attachment #9030821 - Flags: review?(sphink)
Assignee

Comment 23

6 months ago
Add a zeal mode to yield during gray marking.
Attachment #9030822 - Flags: review?(sphink)
Assignee

Comment 24

6 months ago
Patch to dump more info if gray marking checks fail.
Attachment #9030824 - Flags: review?(sphink)
Attachment #9030805 - Flags: review?(sphink) → review+
Comment on attachment 9030814 [details] [diff] [review]
bug1463462-8-incremental-gray-marking

Review of attachment 9030814 [details] [diff] [review]:
-----------------------------------------------------------------

Wait... so previously if we unmarked gray, we would immediately recurse. Now we just push onto the mark stack and potentially yield back to the mutator? Does that mean that we now trust our read barriers to catch everything that might be inserted into the graph, and our post-barriers to enforce snapshot at the beginning? I'm thinking of the scenario:

  obj = grabSomeRandomObject(); // read barrier does ExposeToActiveJS(obj);
  prop = JS_GetProperty(obj, "grayprop");
  JS_SetProperty(someRoot, "someprop", prop);

As far as I can tell, JS_GetProperty does not read-barrier. Currently this is fine because obj.grayprop gets un-grayed before being stored in 'prop'. After this change, this works because of normal snapshot-at-beginning marking rules. Is that correct? If so, why wasn't this change valid before? (Or was it?)

I didn't find any problems with the code, so r=me. I'm just unsure that I'm truly understanding what's going on.

::: js/src/gc/GC.cpp
@@ +5282,5 @@
>  
>    markGrayRoots<SweepGroupZonesIter>(gcstats::PhaseKind::SWEEP_MARK_GRAY);
>  
> +  hasMarkedGrayRoots = true;
> +  return marker.markUntilBudgetExhaused(budget) ? Finished : NotFinished;

Somday, we shood probaly fiks the speling of markUntilBudgetExhaused. But I've avoided doing that in my patches so far too.

::: js/src/gc/Marking.cpp
@@ +3504,5 @@
>  
>  void UnmarkGrayTracer::onChild(const JS::GCCellPtr& thing) {
>    Cell* cell = thing.asCell();
>  
> +  // Cells in the nursery cannot be gray, and not can certain kinds of tenured

s/not/neither/
Attachment #9030814 - Flags: review?(sphink) → review+
Comment on attachment 9030820 [details] [diff] [review]
bug1463462-9-make-gray-checks-into-assertions

Review of attachment 9030820 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/Barrier.h
@@ +342,5 @@
>  
>  template <typename T>
> +static inline void AssertTargetIsNotGray(const T& v) {
> +#ifdef DEBUG
> +  // TODO: move the check into the assert function?

Hm, I don't like leaving a TODO here. But I also don't know that it necessarily needs to be done.

Either replace with a bug number, or if you're not sure you want to do this, maybe

  // Consider moving the check into the assert function.
Attachment #9030820 - Flags: review?(sphink) → review+
Assignee

Comment 27

5 months ago
(In reply to Steve Fink [:sfink] [:s:] from comment #25)
> Comment on attachment 9030814 [details] [diff] [review]
> bug1463462-8-incremental-gray-marking
> 
> Review of attachment 9030814 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Wait... so previously if we unmarked gray, we would immediately recurse. Now
> we just push onto the mark stack and potentially yield back to the mutator?

If the gray thing is in a zone that we're marking, or we reach such a thing while unmarking, then this patch pushes the thing onto the mark stack and lets the normal incremental marking mark it black.

> Does that mean that we now trust our read barriers to catch everything that
> might be inserted into the graph, and our post-barriers to enforce snapshot
> at the beginning? I'm thinking of the scenario:
> 
>   obj = grabSomeRandomObject(); // read barrier does ExposeToActiveJS(obj);
>   prop = JS_GetProperty(obj, "grayprop");
>   JS_SetProperty(someRoot, "someprop", prop);
> 
> As far as I can tell, JS_GetProperty does not read-barrier. Currently this
> is fine because obj.grayprop gets un-grayed before being stored in 'prop'.
> After this change, this works because of normal snapshot-at-beginning
> marking rules. Is that correct? If so, why wasn't this change valid before?
> (Or was it?)

Yes, the unmarking may not happen immediately but it does eventually happen.  This change would have been valid before but it was only necessary because now gray marking may not have been finished by the time we try to unmark something.

Having said all this, I'm investigating whether we can do this a different way and recurse into unmarked children of gray things in zones that are currently being marked.  Maybe we could get away without the delayed gray marking checks in that case...

> Somday, we shood probaly fiks the speling of markUntilBudgetExhaused. But
> I've avoided doing that in my patches so far too.

Oh, duh.  Filed bug 1513842.
Assignee

Comment 28

5 months ago
(In reply to Jon Coppeard (:jonco) from comment #27)
> Having said all this, I'm investigating whether we can do this a different
> way and recurse into unmarked children of gray things in zones that are
> currently being marked.  Maybe we could get away without the delayed gray
> marking checks in that case...

OK, that didn't work.  The thing is that currently our read barrier doesn't call unmark gray if the zone is currently marking.  I can change this and see what happens.  However I kind of like pushing more work onto the mark stack as it a) happens incrementally b) uses less stack c) has fallback for stack exhaustion.  I don't like the delayed checks though.
Attachment #9030821 - Flags: review?(sphink) → review+
Comment on attachment 9030822 [details] [diff] [review]
bug1463462-11-yield-during-gray-marking-zeal

Review of attachment 9030822 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/GC.cpp
@@ +5777,5 @@
> +  }
> +#endif
> +
> +  auto r = marker.markUntilBudgetExhaused(budget) ? Finished : NotFinished;
> +  return r;

Was there some debugging here? The 'r' temporary seems unnecessary.
Attachment #9030822 - Flags: review?(sphink) → review+
Attachment #9030824 - Flags: review?(sphink) → review+

Comment 30

5 months ago
Pushed by jcoppeard@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/37a9521cad27
Change Zone marking states to MarkBlackOnly and MarkBlackAndGray r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/3e7a4e085ead
Parition the mark stack into black and gray entries r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/6ae14f44b4af
Add a non-incremental gray marking phase before weakmap marking that marks from gray roots in the current sweep group r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/1aefa97ae116
Make weak map marking take account of the fact that black and gray marking can now be interleaved r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/e3fc6ddd9a53
Make gray marking incremental r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/1544326ba29a
Make gray marking assertions call a JSAPI function r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/42f073dedf5f
Delay gray marking assertions when we are doing incremental gray marking r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/d5542a14b416
Add zeal mode to test incremental gray marking r=sfink
https://hg.mozilla.org/integration/mozilla-inbound/rev/fd4d12eb1b97
Dump more information when gray marking checks fail r=sfink
Assignee

Updated

5 months ago
Depends on: 1514704
Assignee

Updated

5 months ago
Depends on: 1514421
Assignee

Updated

5 months ago
Depends on: 1514520
Depends on: 1514927
Perf improvement noticed! \0/

== Change summary for alert #18264 (as of Fri, 14 Dec 2018 10:37:21 GMT) ==

Improvements:

  6%  raptor-tp6-youtube-firefox windows7-32 pgo      350.29 -> 327.62

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=18264
Depends on: 1515993
Depends on: 1514850
You need to log in before you can comment on or make changes to this bug.