Closed Bug 641910 Opened 13 years ago Closed 13 years ago

non-grey nodes are added to the cycle collector model graph

Categories

(Core :: XPCOM, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla5

People

(Reporter: mccr8, Assigned: mccr8)

References

Details

(Keywords: memory-footprint, perf, Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:2.0) Gecko/20100101 Firefox/4.0
Build Identifier: 

The cycle collector stops tracing at nodes that aren't marked grey by the GC, because these nodes (and any nodes reachable from them) will never be a part of non-garbage cycles, but it still adds them to the graph.  These nodes seem to make up around 2-10% of nodes allocated.  Not adding these nodes should improve the speed and space usage of the collector, and provide further opportunities for reducing the size of the CC graph.

Reproducible: Always
Blocks: 641243
Andreas wants me to investigate the marked JS nodes that remain after this patch.  If we can eliminate them, then we can remove a "&& xpc_IsGrayGCThing(p)" from nsXPConnect::Traverse, because marked JS nodes will only be encountered as children (and we can add xpc_IsGracyGCThing(p) as an assertion).

I looked at 3 of them, and they were all roots pushed by the JS runtime.  Out of the 7000-ish roots that the JS runtime pushes, 17 of them are JS objects.  Probably-not-coincidentally, 17 is also the number of marked JS nodes that are in the graph.  In other words, it seems like any time the JS runtime pushes a JS root at BeginCollection() (which is rare), it is a GC-marked root.  I'll have to investigate the JS runtime's BeginCollection to see why that is.
Poking around the code a bit, it looks like every time a JS node is added to the graph, it is checked first to ensure it was marked Gray by the JS GC.  Some incomplete logging I did suggests this is in fact true.

Why do we have some marked nodes then?  There's actually another condition in nsXPConnect::Traverse that can hold that will cause a node to be set as marked (something about a wrapper class that isn't main thread only and has external references?), and it looks like that's just always set when we have a JS root object.

In summary, it looks like it should be safe to change the line
    type = !markJSObject && xpc_IsGrayGCThing(p) ? GCUnmarked : GCMarked;
to
    type = markJSObject ? GCMarked : GCUnmarked;
as xpc_IsGrayGCThing(p) is always true here with my patch.
Andrew, add an assert for that condition and send your change to the try server. It will run 100 machine hours of tests against it. You need your level 1 access for that.
I've pushed my change to the try servers.  I wasn't sure if I should include the Talos stuff or not so I went ahead and just included it (it doesn't seem to be included by default).

http://tbpl.mozilla.org/?tree=MozillaTry&rev=d8bb6ef35ee6
8 tests failed.  No real rhyme or reason to them as far as I can see.  No test failed on more than 2 configurations, and only one test failed on both the debug and optimized version of the test.  Though looking over the suggested bugs, these are all errors that have happened recently, so maybe these aren't (all?) the results of my change.
Not worth investigating, IMO:
- the Moth failures
- the jsreftest failures
- the reftest failure
- the xpcshell failure

Most of the failure in M5 is known orange, dunno about the two xul test timeouts.  The M1 failures all look like known media oranges, so it's probably safe to ignore them as well.
Thanks jdm.
Attachment #519467 - Flags: review?(gal) → review+
With your level1 access you should be able to push this to the tracemonkey tree (or ask Patrick for help). Nice job. Thanks!
Is there a reason to not do the check in nsXPConnect::ToParticipant instead?
Yeah, that looks like it could also work.  My only concern is that it looks like that change has the potential to affect behavior in more places, and I don't entirely understand ToParticipant.  Doing an MXR search, it seems like there are two other places it could be potentially called.  It is called in UnmarkRemainingPurple, but non-gray nodes should never be purpled, so that should be okay.  Though if a non-gray JS node gets purpled, it will cause a segfault.  The result of nsCycleCollector_isScanSafe() will also change, but that should only affect Suspect, and (I think) JS objects are never Suspected?
For completeness, this is the actual full patch I would add.  It includes the check in NoteScriptChild in the previous patch, and removes the check of xpc_IsGrayGCThing (adding an assertion to check it) from nsXPConnect::Traverse.
Attachment #519467 - Attachment is obsolete: true
Attachment #520729 - Flags: review?(gal)
Comment on attachment 520729 [details] [diff] [review]
don't add non-grey JS children, don't check for non-grey in traverse

Cool. Thanks.
Attachment #520729 - Flags: review?(gal) → review+
(In reply to comment #11)
> It is called in
> UnmarkRemainingPurple, but non-gray nodes should never be purpled, so that
> should be okay.  Though if a non-gray JS node gets purpled, it will cause a
> segfault.  The result of nsCycleCollector_isScanSafe() will also change, but
> that should only affect Suspect, and (I think) JS objects are never Suspected?

That's correct, JS objects are never suspected (== purple), so this shouldn't be an issue. It has the benefit of keeping the grey/black logic inside XPConnect, though I guess with a slight performance penalty (ToParticipant is a virtual call).
Yes, I suppose that does manifest as the need to #include "xpcpublic.h", which would be avoided if the check was put inside ToParticipant.  Either way is fine with me, though I'd want to test some more if I move it into ToParticipant.  I was going to wait a little bit until the Tracemonkey Tinderbox isn't quite so orange to push my patch anyways.
I went to look at moving the non-grey blocking into nsXPConnect::ToParticipant, as peterv suggested, as it would be nice to keep xpcpublic out of the cycle collector.  Unfortunately, there is a problem with doing that: we only want to skip non-grey nodes if !WantAllTraces(), which is from the CC's graph builder.  nsXPConnect::ToParticipant doesn't have any access to this method as far as I can see.
Andreas agrees that there's probably no avoiding putting this in nsCycleCollector, so I'll push this to TM first thing tomorrow.
Whiteboard: fixed-in-tracemonkey
Assignee: nobody → continuation
Version: unspecified → Trunk
Keywords: footprint, perf
The assertion I added is not correct when WANT_ALL_TRACES is on.  I need to add some kind of call to WantAllTraces in there.
This seems to be causing a fair number of crashes on Aurora.  I think there's an general problem with js_GCThingIsMarked being invoked incorrectly or something, because I've found some crashes basically any place js_GCThingIsMarked is used to prune nodes.  See bug 650519.
Depends on: 650519
A patch for this landed in moz-central a while ago, so I'm marking this as fixed.  Further patches to address other non-grey nodes being added to the CC graph can be addressed in a separate bug.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla5
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: