Last Comment Bug 641910 - non-grey nodes are added to the cycle collector model graph
: non-grey nodes are added to the cycle collector model graph
Status: RESOLVED FIXED
fixed-in-tracemonkey
: footprint, perf
Product: Core
Classification: Components
Component: XPCOM (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla5
Assigned To: Andrew McCreight (PTO-ish through 6-29) [:mccr8]
:
Mentors:
Depends on: 650519
Blocks: 641243
  Show dependency treegraph
 
Reported: 2011-03-15 11:27 PDT by Andrew McCreight (PTO-ish through 6-29) [:mccr8]
Modified: 2011-06-13 09:27 PDT (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
don't add JS children who haven't been marked grey by JS GC (1.34 KB, patch)
2011-03-15 11:31 PDT, Andrew McCreight (PTO-ish through 6-29) [:mccr8]
gal: review+
Details | Diff | Review
don't add non-grey JS children, don't check for non-grey in traverse (2.25 KB, patch)
2011-03-21 13:17 PDT, Andrew McCreight (PTO-ish through 6-29) [:mccr8]
gal: review+
Details | Diff | Review

Description Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-03-15 11:27:00 PDT
User-Agent:       Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0) Gecko/20100101 Firefox/4.0
Build Identifier: 

The cycle collector stops tracing at nodes that aren't marked grey by the GC, because these nodes (and any nodes reachable from them) will never be a part of non-garbage cycles, but it still adds them to the graph.  These nodes seem to make up around 2-10% of nodes allocated.  Not adding these nodes should improve the speed and space usage of the collector, and provide further opportunities for reducing the size of the CC graph.

Reproducible: Always
Comment 1 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-03-15 11:31:23 PDT
Created attachment 519467 [details] [diff] [review]
don't add JS children who haven't been marked grey by JS GC
Comment 2 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-03-15 14:53:16 PDT
Andreas wants me to investigate the marked JS nodes that remain after this patch.  If we can eliminate them, then we can remove a "&& xpc_IsGrayGCThing(p)" from nsXPConnect::Traverse, because marked JS nodes will only be encountered as children (and we can add xpc_IsGracyGCThing(p) as an assertion).

I looked at 3 of them, and they were all roots pushed by the JS runtime.  Out of the 7000-ish roots that the JS runtime pushes, 17 of them are JS objects.  Probably-not-coincidentally, 17 is also the number of marked JS nodes that are in the graph.  In other words, it seems like any time the JS runtime pushes a JS root at BeginCollection() (which is rare), it is a GC-marked root.  I'll have to investigate the JS runtime's BeginCollection to see why that is.
Comment 3 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-03-15 16:40:18 PDT
Poking around the code a bit, it looks like every time a JS node is added to the graph, it is checked first to ensure it was marked Gray by the JS GC.  Some incomplete logging I did suggests this is in fact true.

Why do we have some marked nodes then?  There's actually another condition in nsXPConnect::Traverse that can hold that will cause a node to be set as marked (something about a wrapper class that isn't main thread only and has external references?), and it looks like that's just always set when we have a JS root object.

In summary, it looks like it should be safe to change the line
    type = !markJSObject && xpc_IsGrayGCThing(p) ? GCUnmarked : GCMarked;
to
    type = markJSObject ? GCMarked : GCUnmarked;
as xpc_IsGrayGCThing(p) is always true here with my patch.
Comment 4 Andreas Gal :gal 2011-03-15 16:47:24 PDT
Andrew, add an assert for that condition and send your change to the try server. It will run 100 machine hours of tests against it. You need your level 1 access for that.
Comment 5 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-03-18 11:33:03 PDT
I've pushed my change to the try servers.  I wasn't sure if I should include the Talos stuff or not so I went ahead and just included it (it doesn't seem to be included by default).

http://tbpl.mozilla.org/?tree=MozillaTry&rev=d8bb6ef35ee6
Comment 6 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-03-20 11:49:03 PDT
8 tests failed.  No real rhyme or reason to them as far as I can see.  No test failed on more than 2 configurations, and only one test failed on both the debug and optimized version of the test.  Though looking over the suggested bugs, these are all errors that have happened recently, so maybe these aren't (all?) the results of my change.
Comment 7 Josh Matthews [:jdm] 2011-03-20 13:29:07 PDT
Not worth investigating, IMO:
- the Moth failures
- the jsreftest failures
- the reftest failure
- the xpcshell failure

Most of the failure in M5 is known orange, dunno about the two xul test timeouts.  The M1 failures all look like known media oranges, so it's probably safe to ignore them as well.
Comment 8 Andreas Gal :gal 2011-03-20 13:52:32 PDT
Thanks jdm.
Comment 9 Andreas Gal :gal 2011-03-20 13:53:59 PDT
With your level1 access you should be able to push this to the tracemonkey tree (or ask Patrick for help). Nice job. Thanks!
Comment 10 Peter Van der Beken [:peterv] 2011-03-21 09:38:51 PDT
Is there a reason to not do the check in nsXPConnect::ToParticipant instead?
Comment 11 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-03-21 11:56:49 PDT
Yeah, that looks like it could also work.  My only concern is that it looks like that change has the potential to affect behavior in more places, and I don't entirely understand ToParticipant.  Doing an MXR search, it seems like there are two other places it could be potentially called.  It is called in UnmarkRemainingPurple, but non-gray nodes should never be purpled, so that should be okay.  Though if a non-gray JS node gets purpled, it will cause a segfault.  The result of nsCycleCollector_isScanSafe() will also change, but that should only affect Suspect, and (I think) JS objects are never Suspected?
Comment 12 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-03-21 13:17:59 PDT
Created attachment 520729 [details] [diff] [review]
don't add non-grey JS children, don't check for non-grey in traverse

For completeness, this is the actual full patch I would add.  It includes the check in NoteScriptChild in the previous patch, and removes the check of xpc_IsGrayGCThing (adding an assertion to check it) from nsXPConnect::Traverse.
Comment 13 Andreas Gal :gal 2011-03-21 13:21:56 PDT
Comment on attachment 520729 [details] [diff] [review]
don't add non-grey JS children, don't check for non-grey in traverse

Cool. Thanks.
Comment 14 Peter Van der Beken [:peterv] 2011-03-21 13:30:43 PDT
(In reply to comment #11)
> It is called in
> UnmarkRemainingPurple, but non-gray nodes should never be purpled, so that
> should be okay.  Though if a non-gray JS node gets purpled, it will cause a
> segfault.  The result of nsCycleCollector_isScanSafe() will also change, but
> that should only affect Suspect, and (I think) JS objects are never Suspected?

That's correct, JS objects are never suspected (== purple), so this shouldn't be an issue. It has the benefit of keeping the grey/black logic inside XPConnect, though I guess with a slight performance penalty (ToParticipant is a virtual call).
Comment 15 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-03-21 13:49:41 PDT
Yes, I suppose that does manifest as the need to #include "xpcpublic.h", which would be avoided if the check was put inside ToParticipant.  Either way is fine with me, though I'd want to test some more if I move it into ToParticipant.  I was going to wait a little bit until the Tracemonkey Tinderbox isn't quite so orange to push my patch anyways.
Comment 16 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-03-23 15:00:57 PDT
I went to look at moving the non-grey blocking into nsXPConnect::ToParticipant, as peterv suggested, as it would be nice to keep xpcpublic out of the cycle collector.  Unfortunately, there is a problem with doing that: we only want to skip non-grey nodes if !WantAllTraces(), which is from the CC's graph builder.  nsXPConnect::ToParticipant doesn't have any access to this method as far as I can see.
Comment 17 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-03-24 18:38:47 PDT
Andreas agrees that there's probably no avoiding putting this in nsCycleCollector, so I'll push this to TM first thing tomorrow.
Comment 18 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-03-28 15:03:16 PDT
http://hg.mozilla.org/tracemonkey/rev/e73e3a34c194
Comment 19 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-04-14 18:39:24 PDT
The assertion I added is not correct when WANT_ALL_TRACES is on.  I need to add some kind of call to WantAllTraces in there.
Comment 20 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-04-17 11:41:49 PDT
This seems to be causing a fair number of crashes on Aurora.  I think there's an general problem with js_GCThingIsMarked being invoked incorrectly or something, because I've found some crashes basically any place js_GCThingIsMarked is used to prune nodes.  See bug 650519.
Comment 21 Andrew McCreight (PTO-ish through 6-29) [:mccr8] 2011-06-13 09:27:32 PDT
A patch for this landed in moz-central a while ago, so I'm marking this as fixed.  Further patches to address other non-grey nodes being added to the CC graph can be addressed in a separate bug.

Note You need to log in before you can comment on or make changes to this bug.