Open Bug 741417 Opened 12 years ago Updated 1 year ago

[meta] Reduce CC pauses when there are cycles to collect

Categories

(Core :: Cycle Collector, defect)

defect

Tracking

()

blocking-kilimanjaro -

People

(Reporter: mccr8, Unassigned)

References

(Depends on 1 open bug, Blocks 1 open bug)

Details

(Keywords: meta)

Attachments

(2 files, 1 obsolete file)

In the wake of the ongoing work in bug 716598, CC times are getting closer to being proportional to the amount of garbage being collected (modulo some fairly constant overhead being looked at in bug 722715).  Therefore, it would be good to reduce CC pauses when there is actual garbage to collect.
As an example, I opened Gmail, a MDN wiki page, and 5 or so TechCrunch pages, then closed them all using "close other tabs".  This caused a 522ms CC pause.  416ms of this (79%) was MarkRoots(), 28ms (5%) for Unlink, 27ms for ScanRoots, 13ms for ClearGraph, 10ms for Root, 10ms for Unroot.  So it seems like we need to focus on MarkRoots.

The CC visited 45826 ref counted and 256681 GCed objects, and freed 45340 ref counted and 256174 GCed objects.  Kyle's patch to merge together DOM nodes in the CC graph will help with reducing the number of ref counted nodes, but as you can see JS can take up a large chunk of the graph sometimes, so I think we need a strategy for speeding up JS, too.

I'll try sharking CCs to see where we are spending time.
Attached patch run Shark during CCs (obsolete) — Splinter Review
Attachment #611508 - Attachment is obsolete: true
Attached file perf report
This is what perf tells me is hot during cc (using the patch in bug 741652).  I browsed to cnn, slashdot, nytimes, then cc'ed from about:memory.
Thanks.  For the purposes of this bug, I'm interested in what happens when the CC does a lot of work, for instance, after you close a tab.
This may not be particularly meaningful, but perf annotate says that the hottest part of the PL_DHashTableOperate(PL_DHASH_ADD) is this if statement in SearchTable:

    /* Miss: return space for a new entry. */
    if (PL_DHASH_ENTRY_IS_FREE(entry)) {

Presumably this is a difficult-to-predict branch.
(In reply to Andrew McCreight [:mccr8] from comment #5)
> Thanks.  For the purposes of this bug, I'm interested in what happens when
> the CC does a lot of work, for instance, after you close a tab.

I closed these tabs as soon as they loaded, so I think I was exercising the CC.  Hard to say for sure, of course...

> Presumably this is a difficult-to-predict branch.

One thing you might be able to do is re-order your calls into the hashtable so that the result of that branch is more predictable.  That is, if you have a bunch of nodes and you know some likely aren't in the hashtable, you could add them first (or last).

cachegrind might be able to give you more insight into the branch mispredicts, and there's code to make it work with JS_EnableProfiling (I dunno if that code actually works... :).
(In reply to Justin Lebar [:jlebar] from comment #7)
> I closed these tabs as soon as they loaded, so I think I was exercising the
> CC.  Hard to say for sure, of course...

Oh, okay, I didn't realize that.

> > Presumably this is a difficult-to-predict branch.
> 
> One thing you might be able to do is re-order your calls into the hashtable
> so that the result of that branch is more predictable.  That is, if you have
> a bunch of nodes and you know some likely aren't in the hashtable, you could
> add them first (or last).

That's not really possible.  We are traversing the graph in a breadth-first fashion, and looking up any children we find.

> cachegrind might be able to give you more insight into the branch
> mispredicts, and there's code to make it work with JS_EnableProfiling (I
> dunno if that code actually works... :).

Okay, thanks.
> That's not really possible.  We are traversing the graph in a breadth-first fashion, and looking 
> up any children we find.

Could you go depth-first?  I wouldn't bet that this would help, because I expect you'd have worse memory locality.

Hm...  Could you store metadata (e.g. the visited bit) in the objects themselves, rather than in a hashtable?
> Hm...  Could you store metadata (e.g. the visited bit) in the objects themselves, rather than in a 
> hashtable?

So...what if we moved struct PtrInfo into CC participants?  That would bloat every object by three words?

Sorry, I'm probably not helping anymore.  :)
You only need PRUint32 mColor : 2; PRUint32 mInternalRefs : 30; EdgePool::Iterator mFirstChild.  The rest is already present (ignoring the DEBUG_CC gunk).  Then you'd have to add virtual accessors for those fields.  We could also skip the iterator if we just use Traverse multiple times.  The whole copy of the graph was initially done to avoid traversal overhead.  It has succeeded in its goal, in that building the graph is 80% of the CC, but maybe it would be worth looking into the tradeoff again.
(In reply to Andrew McCreight [:mccr8] from comment #11)
> Then you'd have to add virtual accessors for those fields.

That would be lame.  We can't do this with inheritance?
Cycle collected classes would have to inherit from some new nsICCSupports class, which I guess wouldn't be impossible.
(In reply to Andrew McCreight [:mccr8] from comment #13)
> Cycle collected classes would have to inherit from some new nsICCSupports
> class, which I guess wouldn't be impossible.

Yes.  And two extra words of storage per object doesn't seem bad at all, in my completely uninformed opinion.  :)
noming for Kilimanjaro as per conversation with Olli.
Blocks: k9o-perf
blocking-kilimanjaro: --- → ?
Depends on: 749784
leaving this meta bug a nom for k9o in case other high-value dependencies get attached in the next week or so. marked the current depends as k9o blocker.
Depends on: 754495
Depends on: 756675
No longer depends on: 756675
Depends on: 756675
No longer depends on: 754495
Depends on: 754495
No longer blocks: 698919
Blocks: 698919
All depends marked as k9o+. Please nom any new dependent bugs that are applicable for k9o.
blocking-kilimanjaro: ? → -
Depends on: IncrementalCC
Depends on: 653191
Depends on: 1005500
Severity: normal → S3
Component: XPCOM → Cycle Collector
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: