Closed Bug 333078 (cycle-collector) Opened 18 years ago Closed 16 years ago

XPCOM Cycle Collector

Categories

(Core :: XPCOM, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
Future

People

(Reporter: graydon, Assigned: graydon)

References

Details

(Keywords: dev-doc-complete)

Attachments

(5 files, 26 obsolete files)

238.13 KB, patch
Details | Diff | Splinter Review
3.07 KB, text/plain
Details
3.43 KB, patch
graydon
: review+
sicking
: superreview+
Details | Diff | Splinter Review
5.56 KB, patch
graydon
: review+
Details | Diff | Splinter Review
1.15 KB, patch
dbaron
: review+
Details | Diff | Splinter Review
Experiment with implementing a general-purpose garbage-cycle collector for XPCOM.
Attached patch first cut at working code (obsolete) — — Splinter Review
Very interesting work. Graydon, how to let an existing class participate cycle collection?
(In reply to comment #2)
> Very interesting work. Graydon, how to let an existing class participate cycle
> collection?

I included an example class, nsDOMAttribute, which I've modified to participate. Briefly: 

  - Declare the existence of a cycle collection helper class within your class. 
    This is the NS_DECL_CYCLE_COLLECTION_CLASS(nsDOMAttribute) call in 
    nsDOMAttribute.h

  - Implement the singleton helper class. This is the 

    NS_IMPL_CYCLE_COLLECTION_1_AMBIGUOUS(nsDOMAttribute, 
                                         nsIDOMAttr, 
                                         mChild)

    call in nsDOMAttribute.cpp, though you might have to implement it 
    manually if your class doesn't follow simple structure amenable to
    the macro. I'll document this step further.

  - Implement QI for the interface, returning the singleton. This is the 
    NS_INTERFACE_MAP_ENTRY_CYCLE_COLLECTION(nsDOMAttribute)
    call in nsDOMAttribute.cpp.

  - Implement the modified addref / release methods which interface with 
    the cycle collector. These are the 

    NS_IMPL_CYCLE_COLLECTING_ADDREF_WITH_BASECAST(nsDOMAttribute, nsIDOMAttr)
    NS_IMPL_CYCLE_COLLECTING_RELEASE_WITH_BASECAST(nsDOMAttribute, nsIDOMAttr)

    calls in nsDOMAttribute.cpp

That is about as minimal as I could make it. Feel free to suggest improvements, to make it more automatic or whatever. I'll be playing around with modifying a few classes to participate in it, in the next few days.
Status: NEW → ASSIGNED
(In reply to comment #3)
> (In reply to comment #2)
> > Very interesting work. Graydon, how to let an existing class participate cycle
> > collection?
> 
> I included an example class, nsDOMAttribute, which I've modified to
> participate. Briefly: 
> 
>   - Declare the existence of a cycle collection helper class within your class. 
>     This is the NS_DECL_CYCLE_COLLECTION_CLASS(nsDOMAttribute) call in 
>     nsDOMAttribute.h
> 
>   - Implement the singleton helper class. This is the 
> 
>     NS_IMPL_CYCLE_COLLECTION_1_AMBIGUOUS(nsDOMAttribute, 
>                                          nsIDOMAttr, 
>                                          mChild)
> 

So it is programmer's resposibility to add fields to be traced. I think, ideally, we want to use the declaration of mChild so that s/he is less likely to make mistake. It seems very hard to do it by using only C++ language constructors :(

>     call in nsDOMAttribute.cpp, though you might have to implement it 
>     manually if your class doesn't follow simple structure amenable to
>     the macro. I'll document this step further.
> 
>   - Implement QI for the interface, returning the singleton. This is the 
>     NS_INTERFACE_MAP_ENTRY_CYCLE_COLLECTION(nsDOMAttribute)
>     call in nsDOMAttribute.cpp.
> 
>   - Implement the modified addref / release methods which interface with 
>     the cycle collector. These are the 
> 
>     NS_IMPL_CYCLE_COLLECTING_ADDREF_WITH_BASECAST(nsDOMAttribute, nsIDOMAttr)
>     NS_IMPL_CYCLE_COLLECTING_RELEASE_WITH_BASECAST(nsDOMAttribute, nsIDOMAttr)
> 
>     calls in nsDOMAttribute.cpp
> 
> That is about as minimal as I could make it. Feel free to suggest improvements,
> to make it more automatic or whatever. I'll be playing around with modifying a
> few classes to participate in it, in the next few days.
> 
Graydon: cool, I may blog about this!  Global nit: typenames are generally StudlyCaps style in Mozilla (I'm an old _t kernel hacker, so don't mind local table_t typenames; anyway, my 2 cents).

Feng: right, we talked about automating via some tracing infrastructure (trace operator new, trace nested nsCOMPtr constructions and the like).  This seems likely to be too costly at runtime.

Long ago at Kaleida (IIRC), Wade Hennessy instrumented lcc to compile metadata for GC scanning.  That's not so tempting for our cross-platform builds, although if we could pull it off with GCC, perhaps the metadata could be shared.

At this point, proceeding by hand and learning as we go seems best, but for sure, we should be on the lookout for a less intrusive and by-hand method.

/be
Attached patch updated cycle collector patch (obsolete) — — Splinter Review
This version says that it's actually finding and collecting some cycles on normal workloads, and doesn't seem to crash immediately. It's wired into everything (or most things) which inherit from nsGenericElement, which seems to be a fair amount of the HTML and XUL content graph. Not touching the JS graph yet.
Attachment #217515 - Attachment is obsolete: true
Some common cycles that we've had in the past involve nsScriptLoader, nsCSSLoader, htmlparser, nsHTMLContentSink and ns*Document. It'd be great to get these hooked into this too.

Could we start relying on the cycle collector for things that we have previously added manual code for? For examples in the nsHTMLContentSink we have a specialized class to make the nsScriptLoader not hold a strong reference to the nsHTMLContentSink.

Or is there an advantage to collecting cycles by hand still?
(In reply to comment #7)
> Some common cycles that we've had in the past involve nsScriptLoader,
> nsCSSLoader, htmlparser, nsHTMLContentSink and ns*Document. It'd be great to
> get these hooked into this too.

Ok. It's still not completely clear to me that it's working properly; it might have a bug which mis-colors nodes in its graph, thus mis-identifying cycles, or it might be violating someone else's assumptions about liveness. Note that I had to modify |nsXBLPrototypeBinding::LocateInstance| to accept that |templParent| might be null right off the bat. I'm not sure if by doing that I was fixing a buggy lifecycle assumption, or papering over one. 

> Could we start relying on the cycle collector for things that we have
> previously added manual code for? For examples in the nsHTMLContentSink we have
> a specialized class to make the nsScriptLoader not hold a strong reference to
> the nsHTMLContentSink.
> 
> Or is there an advantage to collecting cycles by hand still?

In theory I think we ought to be able to rely on this algorithm for finding all types of XPCOM-visible cycles, if it turns out to be functioning properly. That's the goal anyways. Whether the goal is achievable depends on a lot of things: for example whether the components involved in a cycle demand that they are disconnected from one another in a particular order, or at a particular time.

At some point after *I* think it's well and truly working, I will want several sets of extra eyes giving it all a very careful line-by-line examination. It has the potential to upset a lot of code.
Comment on attachment 218234 [details] [diff] [review]
updated cycle collector patch

A few things I saw at the top of the patch:

>+NS_CYCLE_COLLECTION_CLASSNAME(nsGenericElement)::Unlink(nsISupports *s)
>+{
>+  // Standard prelude.
>+  nsGenericElement *tmp = NS_STATIC_CAST(nsGenericElement*, NS_STATIC_CAST(nsIContent*, s));
>+
>+  // Custom unlink for this class.
>+  {
>+    tmp->UnbindFromTree(PR_TRUE, PR_TRUE);
>+    PRUint32 i;
>+    PRUint32 kids = tmp->mAttrsAndChildren.ChildCount();
>+    for (i = kids; i > 0; i--)
>+      tmp->mAttrsAndChildren.RemoveChildAt(i-1);
>+  }
>+  return NS_OK;
>+}


This should probably be something like this:

  // Custom unlink for this class.
  {
    PRUint32 i;
    PRUint32 kids = tmp->mAttrsAndChildren.ChildCount();
    for (i = kids; i > 0; i--) {
      tmp->mAttrsAndChildren.GetSafeChildAt(i-1)->UnbindFromTree(PR_FALSE);
      tmp->mAttrsAndChildren.RemoveChildAt(i-1);
    }
  }

Or possibly even

  // Custom unlink for this class.
  {
    PRUint32 i;
    PRUint32 kids = tmp->GetChildCount();
    for (i = kids; i > 0; i--) {
      tmp->RemoveChildAt(i-1, PR_FALSE);
    }
  }

Actually, i think the last one is the safest for XUL since it doesn't always put all its children in its mAttrsAndChildArray
Come to think of it. It is the first of those two that you want, even for XUL, since all you're interested in is releasing the references of the nodes you're holding on to.
Alias: XPCOMGC
Attached patch updated cycle collector patch (obsolete) — — Splinter Review
Some further work on the patch: 

 - Updated to recent trunk
 - Rewote the auto refcount type to be djsoint from nsAutoRefCnt 
   (revealing bug in nsHTMLIFrameElement), and fixed some bugs in it.
 - Tuned the scanner implementation a bit.
 - Sketched out a preliminary scanner for the JS heap, to find linked 
   XPCWrappedNatives from JSObjects. 

Overall now seems stable enough to surf the normal web with collection occurring periodically in the background. Collects some objects, though not a huge number: something like ~100 nodes out of 10k scanned?

Summary of the interdiff:

$ interdiff.exe -w cycle-collection-2006-04-12.patch cycle-collection-2006-04-27.patch  | diffstat
 content/base/src/nsGenericElement.cpp            |    4
 content/base/src/nsGenericElement.h              |    4
 content/html/content/src/nsHTMLIFrameElement.cpp |    4
 js/src/xpconnect/src/xpcprivate.h                |    4
 js/src/xpconnect/src/xpcwrappedjs.cpp            |  100 +++++++++++++++++++++++
 xpcom/base/nsCycleCollector.cpp                  |   55 ++++++++++--
 xpcom/glue/nsISupportsImpl.cpp                   |   33 -------
 xpcom/glue/nsISupportsImpl.h                     |   80 +++++++++++++++++-
 xpcom/glue/objs.mk                               |    1
 xpcom/threads/nsEventQueue.cpp                   |    1
 10 files changed, 227 insertions(+), 59 deletions(-)
Attachment #218234 - Attachment is obsolete: true
Attached patch updated cycle collector patch (obsolete) — — Splinter Review
- Make nsEventListenerManager provide an nsCycleCollectionParticipant
 - Extend graph from nsGenericElement to nsEventListenerManager
 - Fix re-entrancy problems in collector
 
$ interdiff.exe cycle-collection-2006-04-27.patch cycle-collection-2006-05-01.patch  | diffstat
 content/base/src/nsGenericElement.cpp         |   32 ++++++-
 content/events/src/nsEventListenerManager.cpp |   73 +++++++++++++++-
 content/events/src/nsEventListenerManager.h   |    5 -
 xpcom/base/nsCycleCollector.cpp               |  117 ++++++++++++++------------
 xpcom/glue/nsISupportsImpl.h                  |    6 -
 5 files changed, 170 insertions(+), 63 deletions(-)
Attachment #220085 - Attachment is obsolete: true
Attached patch update cycle collection patch (obsolete) — — Splinter Review
Update to cycle collection patch.

  - Update patch to today's trunk.
  - Roll back all DOM GC and wrapper preservation infrastructure 
    created with bugs 283129 and 241518.
  - Reinstitute old wrapper preservation system on nsIDocument.
  - Extend cycle collection to walk through preserved wrappers.
  - Disable cycle collection on nsJSEventListeners.
  - Make a number of conditions fatal in cycle collector for purposes of 
    debugging.
  - Fix more cases of re-entering cycle collector during scanning.
  - Fix logic errors in autorefcnt, especially wrt. stabilization.
  - Fix write-back errors in cycle collector.

I'm keeping this work in monotone databases now; it's easier to track trunk, work offline, and mirror between machines this way. I'll continue to post patches, of course.
Attachment #220474 - Attachment is obsolete: true
Attached patch minor update (obsolete) — — Splinter Review
Small update to limit the set of js objects crawled; it improves the stability of the patch significantly.
Attachment #224976 - Attachment is obsolete: true
Attached patch update cycle collection patch (obsolete) — — Splinter Review
This is mostly a stability and correctness update: many crashes and misidentified cycles are fixed. It seems "harmless" now, insofar as it doesn't interfere with normal browser operation. Unfortunately by correcting the misbehavior, it has ceased "finding" any cycles on a normal workload (those it was "finding" before were misidentified). Clearly, I need to add more participant classes.

Changes:

  - Update to today's trunk.
  - Add some optional debug printing machinery.
  - Remove incorrect traversal of two classes of weak reference in nsGenericElement.
  - Fix a couple more bugs in JS heap crawling.
  - Add new invariants to collector.
  - Flesh out nsDocument traverse and unlink.
  - Promote nsDocument's preserved wrapper table to use nsInterfaceHashtable.
  - Traverse assocoiated preserved wrappers directly from nsGenericElement.
Attachment #224979 - Attachment is obsolete: true
Is it worth getting review and committing this "harmless" milestone, so that it's easier to analyze cases in which it becomes unsafe later?  (And to front-load the review/integration process to earlier in the 1.9 release cycle, no pun intended.)
(In reply to comment #16)
> Is it worth getting review and committing this "harmless" milestone, so that
> it's easier to analyze cases in which it becomes unsafe later?  (And to
> front-load the review/integration process to earlier in the 1.9 release cycle,
> no pun intended.)

I wouldn't mind starting up on review. There are at least three individually reviewable aspects to the patch now: backing out dbaron's wrapper preservation and DOMGC changes, adding the primary cycle collection mechanism, and hooking individual classes up to the collector. 

I would not worry particularly about committing: the patch is generated from my safely committed-and-multiply-mirrored private mozilla monotone repository, and I can do safe incremental work and inter-version debugging, per-file reverting, etc. if it starts misbehaving again. It's not like this is one fragile CVS working copy in a precious state.
Would landing this regress the cycles that dbarons wrapper preservation fixed? Or does your code collect the same cycles?

If nothing is regressed I think we should get this landed. That way you could get help with adding more classes to the collector.
(In reply to comment #18)
> Would landing this regress the cycles that dbarons wrapper preservation fixed?
> Or does your code collect the same cycles?
> 
> If nothing is regressed I think we should get this landed. That way you could
> get help with adding more classes to the collector.

At the moment, it would regress. I'm hoping to get to the state where it covers the same cycles, at which point of course I would not object to landing it and working directly on the trunk. 
(In reply to comment #19)
> At the moment, it would regress. I'm hoping to get to the state where it covers
> the same cycles, at which point of course I would not object to landing it and
> working directly on the trunk. 

That's one of the fun parts:  making the JS engine's reachability participate in the algorithm.  Figuring out how to incorporate that doesn't seem trivial.
(In reply to comment #20)
> (In reply to comment #19)
> > At the moment, it would regress. I'm hoping to get to the state where it covers
> > the same cycles, at which point of course I would not object to landing it and
> > working directly on the trunk. 
> 
> That's one of the fun parts:  making the JS engine's reachability participate
> in the algorithm.  Figuring out how to incorporate that doesn't seem trivial.

No, I have JS participating: when we enter a wrapped native or wrapped JS, we trace through the JS heap to the native wrappers reachable and continue cycle collecting from the other side of the heap. This strategy will consume time proportional to the product of the number of wrappers and the size of the JS heap reachable from them; if that turns out to be bad I'll buffer the entering-JS-heap frontier until the native side is exhausted and then trace them all in a single pass, then it'll be proportional to the reachable JS heap alone. Either way, I think it should work.

I'm not hitting the same cycles your stuff did yet because I don't have enough natives hooked into the scheme. Unfortunately this entire operation requires manually identifying classes that participate in cycles and augmenting them with a singleton cycle collection helper. I'm working on adding nsGlobalWindow at the moment.

(It also doesn't help that the ownership model is far from obvious; lots of pointers don't make it clear whether they're strong or weak, so I just have to try both and see which crashes...)
How do you tell if something else that's not part of the cycle is referencing the wrapped native?  Doesn't the cycle collection algorithm rely on reference counts for that?
(In reply to comment #22)
> How do you tell if something else that's not part of the cycle is referencing
> the wrapped native?  Doesn't the cycle collection algorithm rely on reference
> counts for that?

I'm confused. My understanding is that wrapped natives are XPCOM objects, so they have refcounts, and those refcounts accurately reflect how many things reference them (including references living in the JS heap). Is this wrong?
XPCWrappedNative objects are XPCOM objects, but they have a single reference (you might consider it two, but what they treat as a reference count of 1 is really a weird state that I like to think of as a reference count of one half, so everything else is off by one) for their reachability from the JS GC.  The number of wrapped JS objects that a wrapped native is reachable from can vary due to changes in the JS object graph that don't go anywhere near that wrapped native.

(I think nsXPCWrappedJS objects also play reference count games to work around churn, but I don't think that's a theoretically difficult issue; they are really reference-counted objects.)

(At least I hope that's right; I haven't looked at this for a while.)
(In reply to comment #24)
> XPCWrappedNative objects are XPCOM objects, but they have a single reference
> (you might consider it two, but what they treat as a reference count of 1 is
> really a weird state that I like to think of as a reference count of one half,
> so everything else is off by one) for their reachability from the JS GC.  The
> number of wrapped JS objects that a wrapped native is reachable from can vary
> due to changes in the JS object graph that don't go anywhere near that wrapped
> native.
> 
> (I think nsXPCWrappedJS objects also play reference count games to work around
> churn, but I don't think that's a theoretically difficult issue; they are
> really reference-counted objects.)
> 
> (At least I hope that's right; I haven't looked at this for a while.)

Huh. Well, if this is the case then the strategy I am using will be defeated. Would it be possible to derive the actual number of references to a wrapped native?
(In reply to comment #25)
> Huh. Well, if this is the case then the strategy I am using will be defeated.
> Would it be possible to derive the actual number of references to a wrapped
> native?

References from JS properties and roots?  That would require a JS GC execution, but after it you would know the answer.  Is that helpful?

/be
(In reply to comment #26)

> References from JS properties and roots?  That would require a JS GC execution,
> but after it you would know the answer.  Is that helpful?

Eh, only sorta. David is right on this one; we've discussed it extensively in IRC and come to the conclusion that you and I misunderstood the interaction between the GCs when first formulating this system. Essentially we thought all we needed was to project entering-JS pointers to exiting-JS pointers; what we actually need is to calculate the refcount (or equivalently, the color) of the exiting pointers too. Otherwise the following case:

      XPCOM_a -> JS_a -> JS_b -> XPCOM_b

            JS_root_c -> JS_b -> XPCOM_b

Will cause XPCOM_b to be colored white if XPCOM_a is white; in fact XPCOM_b should be black, due to an external reference. But that information is lost if we consider a single virtual edge from "input pointers" to "output pointers".

To acquire the information we need will take more work. We can only see one decent solution, and it's not super fun:

  - The first entry to the JS heap triggers calculation of *all* the true 
    refcounts of all JS objects. Refcounts get stored in a hashtable. Maybe
    we can reuse the JS GC for calculating this. I'd love help with that, 
    every time I touch sub-jsapi spidermonkey code I gain several grey hairs.

  - The cycle collector cranks through the JS graph doing its external color 
    book-keeping on the reachable subgraph. To differentiate JS and XPCOM 
    pointers, it keeps a "language flavour" for each pointer it's working with, 
    or alternatively we (lazily) manufacture pseudo-XPCOM objects for each JS 
    object, for the purpose of traversing. 

It's a bit gross, but I think it's necessary.
(In reply to comment #27)
> To acquire the information we need will take more work. We can only see one
> decent solution, and it's not super fun:
> 
>   - The first entry to the JS heap triggers calculation of *all* the true 
>     refcounts of all JS objects. Refcounts get stored in a hashtable. Maybe
>     we can reuse the JS GC for calculating this. I'd love help with that, 
>     every time I touch sub-jsapi spidermonkey code I gain several grey hairs.

Cc'ing Igor for help with JS GC hacking.  Igor, the paper to read is http://citeseer.ist.psu.edu/paz03fly.html.  Hacking  JS to use ref-counting with background cycle collection is a non-starter (for 1.9 at any rate), but if the JS GC can compute a few bits that help, great.

Do you really need the full count, or something more like the Bacon/Paz color?

>   - The cycle collector cranks through the JS graph doing its external color 
>     book-keeping on the reachable subgraph. To differentiate JS and XPCOM 
>     pointers, it keeps a "language flavour" for each pointer it's working with,

Ok.

>     or alternatively we (lazily) manufacture pseudo-XPCOM objects for each JS 
>     object, for the purpose of traversing. 

Let's not do that.

/be
i think now's a good time to remind people that the number of languages particupating in cycles has just increased :).
(In reply to comment #27)
> (In reply to comment #26)
> 
> > References from JS properties and roots?  That would require a JS GC execution,
> > but after it you would know the answer.  Is that helpful?
> 
> Eh, only sorta. David is right on this one; we've discussed it extensively in
> IRC and come to the conclusion that you and I misunderstood the interaction
> between the GCs when first formulating this system. Essentially we thought all
> we needed was to project entering-JS pointers to exiting-JS pointers; what we
> actually need is to calculate the refcount (or equivalently, the color) of the
> exiting pointers too. Otherwise the following case:
> 
>       XPCOM_a -> JS_a -> JS_b -> XPCOM_b
> 
>             JS_root_c -> JS_b -> XPCOM_b
> 
> Will cause XPCOM_b to be colored white if XPCOM_a is white; in fact XPCOM_b
> should be black, due to an external reference. But that information is lost if
> we consider a single virtual edge from "input pointers" to "output pointers".
> 
> To acquire the information we need will take more work. We can only see one
> decent solution, and it's not super fun:
> 
>   - The first entry to the JS heap triggers calculation of *all* the true 
>     refcounts of all JS objects. Refcounts get stored in a hashtable. Maybe
>     we can reuse the JS GC for calculating this. I'd love help with that, 
>     every time I touch sub-jsapi spidermonkey code I gain several grey hairs.
> 
>   - The cycle collector cranks through the JS graph doing its external color 
>     book-keeping on the reachable subgraph. To differentiate JS and XPCOM 
>     pointers, it keeps a "language flavour" for each pointer it's working with, 
>     or alternatively we (lazily) manufacture pseudo-XPCOM objects for each JS 
>     object, for the purpose of traversing. 
> 
> It's a bit gross, but I think it's necessary.

Graydon, a simpler scheme might work in this case. 

Assume that we know all pairs of (WrappedJS wjs, WrappedNative wj) in which wj is reachable from wjs, and whether a wjs is reachable from a normal JS root object (excluding all WrappedJS), the cycle collector will work properly. Here is the stretch of my thoughts:

A WrappedNative object maintains a refcnt of the number of WrappedJS objects that can reach it, plus 1 if it is reachable from a normal JS root object.

When scanning and marking an WrappedJS object, just treat reachable WrappedNative objects as its direct children.

So the key part is to to get all pairs of reachable (WrappedJS, WrappedNative), and determine whether a WrappedNative reachable from a normal JS root. The mark phase code of jsgc.c can re-used. I am not familar how SpiderMonkey handles WrappedJS objects during GC. Igor and Brenden for sure know it.

> 

(In reply to comment #27)
> 
>       XPCOM_a -> JS_a -> JS_b -> XPCOM_b
> 
>             JS_root_c -> JS_b -> XPCOM_b

(In reply to comment #30)
> Assume that we know all pairs of (WrappedJS wjs, WrappedNative wj) in which wj
> is reachable from wjs, and whether a wjs is reachable from a normal JS root
> object (excluding all WrappedJS), the cycle collector will work properly. 

Would it work in the following case:

>       XPCOM_a->JS_root ->  JS_obj1 -> ... -> JS_objN -> XPCOM_b -> XPCOM_a

Here JS engine was told that one of the fields of XPCOM_a is a part of the root set. In this case GC marking only tells that XPCOM_b is reachable from a root but it does not tell that it was reachable from the single root which is a part of the circle.
(In reply to comment #30)

> So the key part is to to get all pairs of reachable (WrappedJS, WrappedNative),
> and determine whether a WrappedNative reachable from a normal JS root. The mark
> phase code of jsgc.c can re-used. I am not familar how SpiderMonkey handles
> WrappedJS objects during GC. Igor and Brenden for sure know it.

I'll admit there are a few alternative schemes that look like they might work, but I am hesitant to go down this road. I proposed some to David as well, but ultimately talked myself out of the idea. In particular:

  - XPCOM objects keep JS roots that are not nsXPCWrappedJS, so the 
    traversal interface has to be somewhat modified to understand JS pointers
    anyways. Still, your proposal could work if we maintain some way to 
    differentiate roots held by objects in the current cycle from roots held
    outside the current cycle. However...

  - Calculating the set-of-pairs will be expensive. You can trade time for 
    space, but one way or another you have an MxN cost during calculation.
    Extending the unseen frontier of a graph is comparatively cheaper. This
    cost is currently borne by the (bad) overlapping-search algorithm present
    in the existing patch (nsXPConnect::FindReachableWrappedNatives) but 
    would be eliminated in the scheme I outlined above.

  - Co-ordinating phases is slightly trickier: we have to find all the edges
    we're going to see entering the JS graph first, then subtract that full 
    set from the JS root set and run a JS GC, then proceed with coloring. 
    That extra phase -- collecting all the entering-JS pointers -- will be 
    hard to do right, and probably add expense since the full XPCOM graph is
    allowed to re-enter JS multiple times. 

So overall, I think traversing the JS graph will be slightly cheaper and easier than the alternatives. Plus, in the odd case where we hit a script runtime that does refcounting itself -- python say -- the refcount calculation for nodes within their graph is already done for us.
(In reply to comment #31)
> (In reply to comment #27)
> > 
> >       XPCOM_a -> JS_a -> JS_b -> XPCOM_b
> > 
> >             JS_root_c -> JS_b -> XPCOM_b
> 
> (In reply to comment #30)
> > Assume that we know all pairs of (WrappedJS wjs, WrappedNative wj) in which wj
> > is reachable from wjs, and whether a wjs is reachable from a normal JS root
> > object (excluding all WrappedJS), the cycle collector will work properly. 
> 
> Would it work in the following case:
> 
> >       XPCOM_a->JS_root ->  JS_obj1 -> ... -> JS_objN -> XPCOM_b -> XPCOM_a
> 
> Here JS engine was told that one of the fields of XPCOM_a is a part of the root
> set. In this case GC marking only tells that XPCOM_b is reachable from a root
> but it does not tell that it was reachable from the single root which is a part
> of the circle.

This is the assumption I made, I assume JS_root is a WrappedJS object, JS GC needs to tell that JS_objN is reachable from a WrappedJS, and no other roots. So JS_objN has a ref count of 1. If it is reachable another WrappedJS, it's ref count will be 2, and if it is reachanble from any other JS roots, its ref count will be 3. Cycle collector will break it by first decreasing refcount of JS_root to 0, then decrease JS_objN's ref count.

> 
I wanted to drop this old chestnut from roc, when he was still at CMU: http://www.cs.cmu.edu/~roc/HetGC.html -- we aspire to have something like this to collect cycles between GC'd spaces.  IBM did this for .NET and a JVM (Parley, no link handy).  When dbaron and I last discussed roc's writeup, we thought that it should be possible to handle cycles including XPCOM ref-counted objects.  Were we too optimistic?

/be
We never got around to implementing that at IBM, well, not before I left.

That is basically a global mark and sweep protocol. I haven't thought through how it might work with this cycle-collecting stuff.
Attached patch update cycle collection patch (obsolete) — — Splinter Review
This extends the cycle collection interface to walk through multiple object graph types, differentiated by language code. It walks the JS graph now using the same mechanism as it walks the XPCOM graph, and should be easily extended to walk the python graph too, when the time comes.

Since I do not have a function finished yet to calculate the true refcounts of JS objects, the interface is currently hardcoded to return maxint for the refcount, to prevent accidentally culling live edges. This should be changed once the appropriate helper function is written.
Attachment #225665 - Attachment is obsolete: true
Attached patch update cycle collection patch (obsolete) — — Splinter Review
This update moves the patch forward to today's trunk, and hooks into the JS GC, to properly calculate refcounts on JS objects. It includes some other minor cleanups in the collector, and a change to the event loop to eliminate a deadlock that arose during GC callbacks.

The patch still runs without disturbing the browser, but is not presently complete or correct. No cycles are presently collected on my test workloads, and some JS objects appear to have a zero refcount calculated for them.
Attachment #226030 - Attachment is obsolete: true
Comment on attachment 226254 [details] [diff] [review]
update cycle collection patch

Just wondering why the JS_FRIEND_API decorations weren't sufficient to link, and you needed to change at least js_GetSlotThreadSafe's to JS_PUBLIC_API?  Also why drop the extern?  In the ancient days, it was necessary in front of JS_*_API(returnType) on some platforms.

Haven't had time to study the whole patch, but it looks good.  The gc-thing callback looks good, and I bet it wasn't that hard to add (how many gray hairs did you get? :-P).

/be
Comment on attachment 226254 [details] [diff] [review]
update cycle collection patch


>--- js/src/jsapi.c	8f9b7ce27e588a933b7e64877b8710b9ae9ff537
>+++ js/src/jsapi.c	9d00b7a490553b76ce00f7122fe5952b8e9772fe
>@@ -1990,6 +1990,13 @@
>     return oldcb;
> }
> 
>+JS_PUBLIC_API(void)
>+JS_SetGCThingCallback(JSContext *cx, JSGCThingCallback cb, void* closure)

Nit: void *closure is prevailing style.

>+{
>+    cx->runtime->gcThingCallback = cb;
>+    cx->runtime->gcThingCallbackClosure = closure;

Someone please tell me that GCC is smart enough to know that cx->runtime->gcThingCallback does not alias cx->runtime, so it doesn't reload cx->runtime but commons it.

>+extern JS_PUBLIC_API(void)
>+JS_SetGCThingCallback(JSContext *cx, JSGCThingCallback cb, void* closure);

Same style nit above.

>+++ js/src/jscntxt.h	bbd487705ebdd9a7f148582ac0e6ab23e53f7dd1
>@@ -136,6 +136,8 @@
>     uint8               gcPadding;
> 
>     JSGCCallback        gcCallback;
>+    JSGCThingCallback   gcThingCallback;
>+    void*               gcThingCallbackClosure;

Prevailing style puts declarator modes with name, not type -- see two lines below.

>     uint32              gcMallocBytes;
>     JSGCArena           *gcUnscannedArenaStackTop;
> /* Exported to js.c, which calls it via OBJ_GET_* and JSVAL_IS_* macros. */
>-JS_FRIEND_API(jsval)
>+JS_PUBLIC_API(jsval)
> js_GetSlotThreadSafe(JSContext *cx, JSObject *obj, uint32 slot)

This is exported already (still wondering why FRIEND wasn't enough) but is it necessary?  I'd rather make a new JS API that XPConnect could use to map a function over an object's slots.

>+nsXPConnect::BeginCycleCollection()
>+{
>+    if (!mObjRefcounts)
>+        mObjRefcounts = new JSObjectRefcounts;
>+    mObjRefcounts->Clear();
>+    XPCCallContext cx(NATIVE_CALLER);
>+    JS_SetGCThingCallback(cx, XPCMarkNotification, mObjRefcounts);
>+    js_ForceGC(cx, 0);

JS_GC forces a GC, could you use that public API instead of exported js_ForceGC?

>+    for (int i = JSSLOT_START(OBJ_GET_CLASS(cx, obj)); 
>+         i < JSVAL_TO_INT(obj->slots[-1]); ++i) {
>+        jsval val = OBJ_GET_SLOT(cx, obj, i);
>+        if (!JSVAL_IS_NULL(val) 
>+            && JSVAL_IS_OBJECT(val)) {
>+            JSObject *child = JSVAL_TO_OBJECT(val);
>+            if (child) {
>+                PRUint32 refcount = 0;
>+                mObjRefcounts->Get(child, refcount);
>+                // FIXME: there are zero-refcount children showing up here.
>+                // Figure out why. It might indicate racing with another GC
>+                // thread, or it might indicate something darker.
>+                if (refcount)
>+                    cb.NoteScriptChild(nsIProgrammingLanguage::JAVASCRIPT, child);
>+            }
>+        }
>+    }

This is the only js_GetSlotThreadSafe call outside of js/src/*, right?

>@@ -504,6 +514,7 @@
>     nsIXPCSecurityManager*   mDefaultSecurityManager;
>     PRUint16                 mDefaultSecurityManagerFlags;
>     JSBool                   mShuttingDown;
>+    JSObjectRefcounts        *mObjRefcounts;

Prevailing style is subsidiary in Mozilla code.  Here, it looks like declarator mode goes with type, at least for simple modes such as "pointer" (*) -- see three lines up.

Matching prevailing style ("When in Rome") is still a good rule of thumb to keep the code more readable by everyone accessing it.

/be
(In reply to comment #37)
> The patch still runs without disturbing the browser, but is not presently
> complete or correct. No cycles are presently collected on my test workloads,
> and some JS objects appear to have a zero refcount calculated for them.

This is at least due to MarkGCThingChildren bypassing calls to MarkGCThing. You will need to call the callback there in 2 cases:

1. After the lines:
         for (; vp != end; ++vp) {
            v = *vp;
            if (!JSVAL_IS_GCTHING(v) || v == JSVAL_NULL)
                continue;
            next_thing = JSVAL_TO_GCTHING(v);

you need to call the callback for next_thing.

2. After the lines:
      case GCX_MUTABLE_STRING:
        str = (JSString *)thing;
        if (!JSSTRING_IS_DEPENDENT(str))
            break;
        thing = JSSTRDEP_BASE(str);

you need to call the callback for thing.

BTW, I think there is a good way to optimize the refcounts calculations for GC things. JSGCArena can be extended with a pointer like uint32*. Then when refcounts are required GC would allocate a chunk of corresponding memory in JSGCArena and put the counters there using already known thing offset. In fact this would allow to calculate refcounts without running GC since in refcount mode GC can use the counters as a test when the thing is already marked and not the flags. But this is for the future. For now it would be interesting if with the above fixes you would get accurate counters.
Attached patch update cycle collection patch (obsolete) — — Splinter Review
- Fixed stylistic slipups Brendan noted.
 - Reverted visibility changes. Artifact of chasing C++ extern "C" name mangling 
   issue.
 - Added extra callbacks as Igor suggested, fixing zero refcounts. Thanks!
 - Not sure whether Brendan's comment about gcc coalescing two cx->runtime
   references was important; it's certainly not in performance-critical code,
   and I don't think there's any harm in referencing it twice, is there?
Attachment #226254 - Attachment is obsolete: true
(In reply to comment #41)
>  - Not sure whether Brendan's comment about gcc coalescing two cx->runtime
>    references was important; it's certainly not in performance-critical code,
>    and I don't think there's any harm in referencing it twice, is there?

It's not a big deal, but the SpiderMonkey way would be to explicitly common into a JSRuntime *rt local.  Portable assembler, you know.

/be
Attached patch update cycle collection patch (obsolete) — — Splinter Review
This version finally "works": it collects some cycles. They're not bogus cycles, and it doesn't crash. Hooray!

The majority -- perhaps all? -- of the cycles it's collecting now were previously dealt with by weak pointers between nsJSEventListener and its mTarget. I converted them to use nsCOMPtr, both because it's less accident-prone and because it was an easy target. 

I also added a walker to the cycle collector that prints graphs to dotty on popen(). If you install dotty and run firefox with XPCOM_DRAW_CYCLE_GRAPHS=1 in your environment, it'll pause and show you graphs every time it finds some stuff to collect. <a href="http://venge.net/graydon/moz-cycle-collection-jun-22-2006.png">Here's an example graph</a>. The white parts are the fully accounted cyclic subgraphs that it's about to release. The grey parts are those it can't find all the references for, so is leaving alone.
Attachment #226379 - Attachment is obsolete: true
Blocks: 324145
Attached patch update cycle collection patch (obsolete) — — Splinter Review
Updates to work on win32 and osx.
Attachment #226618 - Attachment is obsolete: true
Attached patch update cycle collection patch (obsolete) — — Splinter Review
This update carries a number of corrections to the logic involving xpconnect and JS objects. The result is now able to find and collect C++ <-> JS cycles. It can collect, for example, a synthetic testcase David constructed involving a cycle formed between DOM nodes by a preserved native wrapper: http://people.mozilla.com/~graydon/cross-language-cycles.png
Attachment #228099 - Attachment is obsolete: true
Attached patch update cycle collection patch (obsolete) — — Splinter Review
Some bug fixes including a fatal race on nsGlobalWindow, extension to cover XMLHttpRequest, an update to today's trunk, and many cosmetic cleanups. I'd like people to begin reviewing this now (or as soon as the ffx2 stuff dies down) as I think it's approaching "done".
Attachment #228546 - Attachment is obsolete: true
Comment on attachment 229875 [details] [diff] [review]
update cycle collection patch

>+NS_IMETHODIMP nsCycleCollectionParticipant::Traverse(nsISupports *n, 
>+						     nsCycleCollectionTraversalCallback &cb)

Nit: underhanging second param is underindented by two spaces.  To avoid really long lines, some places break after NS_IMETHODIMP.  Style, yay.

>+#define NS_CYCLECOLLECTIONPARTICIPANT_IID          \
>+{ /* 407bbd61-13b2-4f85-bdc6-23a59d881f80 */            \

Nit: first line's backslash is underindented.

>+#define NS_IMPL_QUERY_CYCLE_COLLECTION(_class)                          \
>+  if ( aIID.Equals(NS_GET_IID(nsCycleCollectionParticipant)) ) {        \
>+    foundInterface = & NS_CYCLE_COLLECTION_NAME(_class);                \
>+  } else
>+
>+#define NS_INTERFACE_MAP_ENTRY_CYCLE_COLLECTION(_class)            \
>+  NS_IMPL_QUERY_CYCLE_COLLECTION(_class)
>+
>+
>+///////////////////////////////////////////////////////////////////////////////
>+// Helpers for implementing nsCycleCollectionParticipant::Unlink
>+///////////////////////////////////////////////////////////////////////////////
>+
>+#define NS_IMPL_CYCLE_COLLECTION_UNLINK_BEGIN(_class, _base)           \
>+  NS_IMETHODIMP                                                        \
>+  NS_CYCLE_COLLECTION_CLASSNAME(_class)::Unlink(nsISupports *s)        \
>+  {                                                                    \
>+    _class *tmp = NS_STATIC_CAST(_class*, NS_STATIC_CAST(_base*, s));
>+
>+#define NS_IMPL_CYCLE_COLLECTION_UNLINK_NSCOMPTR(_field)          \
>+    tmp->_field = NULL;    
>+
>+#define NS_IMPL_CYCLE_COLLECTION_UNLINK_NSCOMARRAY(_field)        \
>+    tmp->_field.Clear();
>+
>+#define NS_IMPL_CYCLE_COLLECTION_UNLINK_END                       \
>+    return NS_OK;                                                 \
>+  }

Uber-nit: the different macros have nicely aligned backslashes, but they don't agree on what column to put those backslashes in -- avoiding ragged right is worth something in readability, but maybe not your fingers.  I have a script to add backslashes in column 79, or had (on my old thinkpad), which works well with vim commands to shell over a region.

>+// This file implements a garbage-cycle collector based on the paper
>+// 
>+//   Concurrent Cycle Collection in Reference Counted Systems
>+//   Bacon & Rajan (2001), ECOOP 2001 / Springer LNCS vol 2072
>+//
>+// We are not using the concurrent or acyclic cases of that paper; so
>+// the green, red and orange colors are not used.
>+//
>+// The collector is based on tracking pointers of four types:

Nit: "objects of four colors" seems more accurate.

>+// Grey nodes are being scanned. Nodes which turn grey will turn

Nit: "that" for defining clauses, not "which".

>+// We do not |AddRef| or |Release| any objects during scanning. We
>+// rely on purple-safety of the roots which call |suspect| and
>+// |forget| to hold, such that we will forget about a purple pointer
>+// before it is destroyed.  The pointers which are merely scan-safe,

Ibid.

>+static void
>+Fault(const char *msg, const void *ptr=nsnull)
>+{
>+    // This should be nearly impossible, but just in case.
>+    if (!sCollectorConstructed)
>+        return;
>+
>+    if (sCollector.mParams.mFaultIsFatal) {
>+
>+        if (ptr)
>+            printf("Fatal fault in cycle collector: %s (ptr: %p)\n", msg, ptr);
>+        else
>+            printf("Fatal fault in cycle collector: %s\n", msg);
>+
>+        exit(1);
>+
>+    } else {

Else after exit or other path killers is a non-sequitur -- lose the else { and closing }, clap your hands three times, and save indentation fairies ;-).

>+        if (mCurrPi.mLang == nsIProgrammingLanguage::CPLUSPLUS) {
>+
>+            // For "CPLUSPLUS", we use XPCOM to find a per-object helper.

Nit, or de-snark :-) -- use 'C++' instead of '"CPLUSPLUS"'.

>+static PR_CALLBACK PLDHashOperator
>+markCallback(const void*  ptr,
>+	     PRUint32&    generation,
>+	     void*        userArg)

Underhang underindent nit again.

>+static PR_CALLBACK PLDHashOperator
>+NoGreyCallback(const void*  ptr,
>+	       PtrInfo&     pinfo,
>+	       void*        userArg)

Ibid.

>+    // NB: this routine does not call the script runtime unlink
>+    // methods at the moment, as it is not clear whether the script
>+    // pointers can be stabilized during unlinking. Possibly we'll
>+    // extend the interface to include such functionality.

File a separate bug cited by a FIXME comment here?

>+    // Step 2: Unlink all the whites.
>+
>+    for (i = 0; i < mBuf.GetSize(); ++i) {

Nit: lose the blank line between comment and for-head to match steps 1 and 3.

Question: is GetSize() inline and loop-invariant, or do we need to call it each time the loop condition is evaluated?

>+static PR_CALLBACK PLDHashOperator
>+ForgetAllCallback(const void*  ptr,
>+		  PtrInfo&     pinfo,
>+		  void*        userArg)

Underhang-indent nit.

>+static PR_CALLBACK PLDHashOperator
>+zeroGenerationCallback(const void*  ptr,
>+		       PRUint32&    generation,
>+		       void*        userArg)

Ibid.

>+void 
>+nsCycleCollector::MaybeDrawGraphs()
>+{
>+    if (mParams.mDrawGraphs) {
>+        nsDeque roots(nsnull);
>+        while (mBuf.GetSize() > 0)
>+            roots.Push(mBuf.Pop());
>+
>+        mBuf.Empty();
>+        mGraph.Enumerate(FindWhiteCallback, this);
>+
>+        // We only draw graphs if there were any white nodes.

Nit: s/only draw graphs/draw graphs only/

>+void 
>+nsCycleCollector::Freed(void *n)
>+{
>+    PRUint32 gen;
>+    PRBool hit = PR_FALSE;
>+    mStats.mFreeCalls++;
>+
>+    if (mPurpleBuf.Get(n, &gen))
>+        {
>+            hit = PR_TRUE;

Argh, 4-space-indented stallmacs style invasion! ;-)

>+++ content/xbl/src/nsXBLPrototypeBinding.cpp	b1dc8118bebdcb5eca83b279cd1ef2c061155c17
>@@ -753,6 +753,9 @@
> 
>   nsCOMPtr<nsIContent> templParent = aTemplChild->GetParent();
>   nsCOMPtr<nsIContent> childPoint;
>+
>+  if (!templParent)
>+    return nsnull;

Sorry, I missed why this was necessary. Comment?

>+++ js/src/jsgc.c	2b2ad8602852a0f3af2a3e8370f399e6e38a7c8d
>@@ -1788,9 +1788,12 @@
>             if (!JSVAL_IS_GCTHING(v) || v == JSVAL_NULL)
>                 continue;
>             next_thing = JSVAL_TO_GCTHING(v);
>+            next_flagp = js_GetGCThingFlags(next_thing);
>+            if (rt->gcThingCallback)
>+                rt->gcThingCallback(next_thing, *next_flagp, 
>+                                    rt->gcThingCallbackClosure);

Nit: house style in SpiderMonkey braces then and else clauses if any of the if condition, the then statement, or the else statement takes more than one line.

>+        if (rt->gcThingCallback)
>+            rt->gcThingCallback(thing, *flagp, 
>+                                rt->gcThingCallbackClosure);

Ditto.

>@@ -2153,6 +2159,10 @@
> 
>     flagp = js_GetGCThingFlags(thing);
>     JS_ASSERT(*flagp != GCF_FINAL);
>+
>+    if (cx->runtime->gcThingCallback)
>+        cx->runtime->gcThingCallback(thing, *flagp, cx->runtime->gcThingCallbackClosure);

12345678901234567890123456789012345678901234567890123456789012345678901234567890

Woogah! 80th column violation alert! Please wrap and brace the callback invocation.

>+++ js/src/xpconnect/src/xpcwrappedjs.cpp	ca9082c1c9f4d86378ca73297dd789fbf61792d6
>@@ -45,7 +45,54 @@
> 
> // NOTE: much of the fancy footwork is done in xpcstubs.cpp
> 
>+NS_IMPL_CYCLE_COLLECTION_CLASS(nsXPCWrappedJS)
>+
> NS_IMETHODIMP
>+NS_CYCLE_COLLECTION_CLASSNAME(nsXPCWrappedJS)::Traverse(nsISupports *s,
>+                                                        nsCycleCollectionTraversalCallback &cb)
>+{
>+    nsXPCWrappedJS *tmp = NS_STATIC_CAST(nsXPCWrappedJS*, 
>+                                         NS_STATIC_CAST(nsXPTCStubBase*, s));
>+
>+    // REVIEW ME PLEASE:
>+    // 
>+    // I am not sure when this represents the true refcount.
>+
>+    cb.DescribeNode(tmp->mRefCnt.get(), sizeof(nsXPCWrappedJS), "nsXPCWrappedJS");
>+
>+    // REVIEW ME PLEASE: 
>+    //
>+    // If we do not move to the Root Wrapper, but traverse the tmp
>+    // wrapper we have, we get a segfault shortly as we try to fetch
>+    // the class of its associated JSObject.
>+    //
>+    // Weirder still, if we move to the Root Wrapper *before*
>+    // extracting the refcount, we will be returning a *zero refcount*
>+    // to our caller. Why does a non-root wrapper have a zero
>+    // refcount? No idea. This is set as it is because it's the magic
>+    // combination that works. At the moment.
>+
>+    tmp = tmp->GetRootWrapper();
>+
>+    if (tmp->IsValid()) {
>+        cb.NoteScriptChild(nsIProgrammingLanguage::JAVASCRIPT, tmp->GetJSObject());
>+    }

I'm going to duck the REVIEW ME PLEASE comments and let jst and bz have fun (dbradley too if he's around and has time and remembers; heck, try jband_mozilla@rattlebrain.com too).

Nit: we uphold local style, and in XPConnect that means

    if(tmp->IsValid())
        one_liner();

or

    if(foo)
    {
        bar;
        baz;
    }

Same goes for xpcwrappednative* changes.

Rome vs. Carthage, I know ;-).

Oops, forgot this nic-picking comment and excerpt (recovered from another edit-attachment view):

>+void
>+nsCycleCollector::Collect()
>+{
>+
>+    if (mParams.mDoNothing)
>+        return;
>+
>+    if (mParams.mHookMalloc)
>+        InitMemHook();
>+    
>+    if (mTick++ < mParams.mEventDivisor) 
>+        return;
>+    else
>+        mTick = 0;
>+
>+    for (PRUint32 i = 0; i <= nsIProgrammingLanguage::MAX; ++i) {
>+        if (mRuntimes[i])
>+            mRuntimes[i]->BeginCycleCollection();
>+    }

Lose the else after return non-sequitur and unindent the arbitrarily indented mTick = 0.

Looks great, nice to see the ad-hoc code yield to the general solution!

/be
Attached patch update cycle collection patch (obsolete) — — Splinter Review
- Address Brendan's comments.
- Expand the "language runtime" interface to enable sensible unlinking in JS.
- Implement unlinking in JS.
- Factor C++/XPCOM support out into a separate language runtime class.
Attachment #229875 - Attachment is obsolete: true
Comment on attachment 230069 [details] [diff] [review]
update cycle collection patch

Please diff with -P since that makes it easier to see what function you're in.

Comments on the content/ part of this patch:

We should hook up DOM userdata to the cyclecollector. I.e. when traversing a node we should traverse any userdata and userdatahandlers as well. However feel free to leave that to a separate patch since we don't handle that yet anyway.

You have to add code to nsDOMAttributeMap that traverses all the attributes that live in its hash.

>+++ content/base/src/nsDOMAttribute.cpp	adc2211a267b817c195b4bb524db67e8deea52cd

You should traverse mChild from this class. Also, attributes can have a listenermanager too so you need to traverse that as well. And I think you need to deal with preserved wrappers here too.

>+++ content/base/src/nsDocument.cpp	8c718008cc264caee5241c58781476107a1ec9ae

>+NS_CYCLE_COLLECTION_CLASSNAME(nsDocument)::Unlink(nsISupports *s)
>+  // nsDocument has a pretty complex destructor, so we're going to
>+  // assume that *most* cycles you actually want to break somewhere
>+  // else, and not unlink an awful lot here.
>+  //
>+  // In rare cases where you think an unlink will help here, add one
>+  // manually.

I wonder if it might be worth unlinking all the stuff in the mContentWrapperHash as well? Or will that be taken care of otherwise?


>+NS_CYCLE_COLLECTION_CLASSNAME(nsDocument)::Traverse(nsISupports *s,
>+  // Traverse all nsINode nsCOMPtrs.
>+  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mNodeInfo)

Technically you don't need to traverse mNodeInfo since there is no way that that can take part in any cycles. It's just a refcounted struct holding a few strings. I don't feel strongly either way.

>+  // Traverse all nsIDocument nsCOMPtrs.
>+  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mDocumentURI)
>+  NS_IMPL_CYCLE_COLLECTION_TRAVERSE_NSCOMPTR(mDocumentBaseURI)

Same thing with the URIs.

>+void
>+nsDocument::GetReference(void *aKey, nsISupports **aReference)
>+{
>+  // NB: This method is part of content cycle collection,
>+  // and must *not* Addref its return value.

Please make this function return an |nsISupports*|. Convention for all our functions (XPCOM or not) that return values using out-parameters is that they are refcounted.

>+++ content/base/src/nsDocumentFragment.cpp	80dbe8a7b98eba971b8cfde6577719482d63dd7d
>@@ -209,11 +209,11 @@
>   NS_INTERFACE_MAP_ENTRY(nsINode)
>   NS_INTERFACE_MAP_ENTRY_AMBIGUOUS(nsISupports, nsIContent)
>   NS_INTERFACE_MAP_ENTRY_CONTENT_CLASSINFO(DocumentFragment)
>-NS_INTERFACE_MAP_END
>+NS_INTERFACE_MAP_END_INHERITING(nsGenericElement)

Remove the entries for nsIContent, nsINode and nsISupports here since that will be taken care of by nsGenericElements QI implementation. Also, could you file a bug on making nsDocumentFragment use the nsNode3Tearoff that lives in nsGenericElement (to avoid checking for nsIDOM3Node twice during QI and to save some code).

>+++ content/base/src/nsGenericDOMDataNode.cpp	0563d62688d6117be8ab85f54a98bc804870ca91

These guys can have listenermanagers and preserved wrappers too.


>+NS_CYCLE_COLLECTION_CLASSNAME(nsGenericElement)::Unlink(nsISupports *s)
...
>+  // Unlink child content (and unbind our subtree).
>+  {
>+    tmp->UnbindFromTree(PR_TRUE, PR_TRUE);

This line is wrong, it unbinds the element from its parent, whereas you want to unbind all the children from this node. Remove it and...

>+    PRUint32 i;
>+    PRUint32 kids = tmp->mAttrsAndChildren.ChildCount();
>+    for (i = kids; i > 0; i--)
>+      tmp->mAttrsAndChildren.RemoveChildAt(i-1);    

... add a call to |tmp->mAttrsAndChildren.ChildAt(i-1)->UnbindFromTree()| here instead.

>+  // ADDME: Unlink DOMSlots.

See http://lxr.mozilla.org/mozilla/source/content/base/src/nsNodeUtils.cpp#217 for an easy way to do this.

>+  // ADDME: Unlink RangeList.

That shouldn't be needed as nodes hold weak references to ranges.

>+NS_CYCLE_COLLECTION_CLASSNAME(nsGenericElement)::Traverse(nsISupports *s,
...
>+  // ADDME: Traverse DOMSlots.

At least traverse mAttributeMap if it exists since that's the one i can think of can cause cycles.

>+  // ADDME: Traverse RangeList.

Shouldn't be needed since the ref is weak.


>+++ content/base/src/nsImageLoadingContent.cpp	6afb5fd3d9e7b5e5b644e417c0f9a636bd62b636

What code is keeping these guys alive now?

>+++ content/base/src/nsTreeWalker.cpp	d431ce27b3dfa68fc0242ea91aaeaa744a31bccc

Don't you need to add code to break this cycle?


Note to self, I'm on content/base/src/nsXMLHttpRequest.cpp
(In reply to comment #49)
> Please diff with -P since that makes it easier to see what function you're in.

That's -p.
(In reply to comment #49)

> >+void
> >+nsDocument::GetReference(void *aKey, nsISupports **aReference)
> >+{
> >+  // NB: This method is part of content cycle collection,
> >+  // and must *not* Addref its return value.
> 
> Please make this function return an |nsISupports*|. Convention for all our
> functions (XPCOM or not) that return values using out-parameters is that they
> are refcounted.

We have these conventions, AFAIK, possibly even documented by roc somewhere in the Gecko part of wiki.mozilla.org:

- XPCOM ref-counting Foo getter: GetFoo(nsIFoo **aResult)

- De-COMtaminated non-ref-counting fallible (nullable) getter: nsIFoo *GetFoo()

- De-COMtaminated non-ref-counting infallible getter: nsIFoo *Foo()

I hope we're using these conventions consistently.

/be
(In reply to comment #49)
> >+++ content/base/src/nsImageLoadingContent.cpp	6afb5fd3d9e7b5e5b644e417c0f9a636bd62b636
> 
> What code is keeping these guys alive now?

It's not needed, since the cycle collection code is conservative:  it only collects if it can find all parts of the cycle, whereas the GC participant code would collect the preserved wrappers if it didn't know about their reachability.

(In reply to comment #51)
> We have these conventions, AFAIK, possibly even documented by roc somewhere in
> the Gecko part of wiki.mozilla.org:
> 
> - XPCOM ref-counting Foo getter: GetFoo(nsIFoo **aResult)
> 
> - De-COMtaminated non-ref-counting fallible (nullable) getter: nsIFoo *GetFoo()
> 
> - De-COMtaminated non-ref-counting infallible getter: nsIFoo *Foo()

- De-COMtaminated ref-counting getter: already_AddRefed<nsIFoo> GetFoo()
   (or Foo(), if infallible)
Comment on attachment 230069 [details] [diff] [review]
update cycle collection patch

I wonder if it would be worth finding a smaller name for the NS_IMPL_CYCLE_COLLECTION_UNLINK_* macros. Something like CC_UNLINK_* or CYCLE_UNLINK_* maybe. Up to you, I just don't want people to feel it's too much work to add cycle collection for their classes, but the macro name is probably not going to make a big difference either way.

I'm not doing the classinfo files, i'll leave that to dbaron or jst that knows that code better.

You should change the iid of all interfaces that you make not inherit nsIDOMGCParticipant (i.e. the interfaces where you change the vtable)

>+++ content/base/src/nsXMLHttpRequest.cpp	3c42781ed007684fbc2a36fa217064db05084425
>@@ -397,7 +463,8 @@
> {
>   NS_ENSURE_ARG_POINTER(aOnreadystatechange);
> 
>-  mOnReadystatechangeListener.Get(aOnreadystatechange);
>+  *aOnreadystatechange = mOnReadystatechangeListener;
>+  NS_IF_ADDREF(*aOnreadystatechange);

You can do
NS_IF_ADDREF(*aOnreadystatechange = mOnReadystatechangeListener);

This applies in a few other places in the patch (in this file in particular).

That's it for mozilla/content. Let me know if you want me to review something more.

General question:

>+++ dom/src/base/nsGlobalWindow.cpp	fa355c71c39bab170d444c1e047c3a8f53364228
>+  // Do *not* unlink mOpener; it seems you can race against other windows.
>+  // Do *not* traverse mOpener; it seems you can race against other windows.

"race" how?
Attached patch update cycle collection patch (obsolete) — — Splinter Review
Just an update against trunk. Still awaiting further review and testing, plus some empirical valgrind analysis on my end (which I've now mostly completed the infrastructure for).
Attachment #230069 - Attachment is obsolete: true
Attached patch update cycle collection patch (obsolete) — — Splinter Review
This update should address (most of?) sicking's review comments. It also updates the patch to a trunk snapshot from a couple days ago, fixes a variety of bugs discovered over the past month of examining the patch performing under valgrind, extends cycle collection to several new classes, and factors some of the annotations a bit.

It's still not completely clear to me whether the patch is neutral wrt. memory performance. It appears to be slightly better than trunk on a test page set of 200 or so pages. I have a plausible explanation for why I was seeing increased retained-set sizes -- someone leaks a reference to XPConnect, even on trunk, so leaked wrappers appear as "reachable" to a pointer-scanner -- but I can't confirm this hypothesis yet.

As always, review and testing would be appreciated.
Attachment #237535 - Attachment is obsolete: true
Graydon, are you measuring based on malloc/free pairs or are you measuring based on total size of the process?  Just asking what you meant by "retained-set size."
(In reply to comment #56)
> Graydon, are you measuring based on malloc/free pairs or are you measuring
> based on total size of the process?  Just asking what you meant by
> "retained-set size."

Oh sorry, I meant to say "reachable". Worse yet, it seems I forgot to chronicle in this bug my evaluation of the patch under valgrind, which I've been engaged in this month. 

Valgrind includes a conservative pointer-scanning leak checker: it examines the process' memory maps and, connects together all the allocations that point to one another. It tracks allocations (using both function hooking and annotations), so it knows which bits are allocated. When I run this patch under valgrind, it tells me that there is a significant increase in the reachable set when the process exits. Reachable, not leaked.

That would usually indicate an unexpected bug. I would expect my collector to leak nodes, at worst, but not keep a bunch of new nodes alive. I was perplexed by this for quite a while and spent some time trying to work out what might be keeping nodes alive. I finally found that the XPConnect singleton itself was leaking -- not being fully destroyed -- and since it is a static object, valgrind considers it globally reachable. This means that anything leaked that passes through XPConnect -- anything with a wrapper -- will be misclassified as retained, by valgrind. So there's a reasonable chance that what I was seeing was actually an increase in leaked nodes, not retained.

To confirm this would require tracking down the XPConnect leak on trunk. I do not know where it is, but it's easy to observe: #define XPC_DUMP_AT_SHUTDOWN in js/src/xpconnect/src/nsXPConnect.cpp.

Note that increased leaks under this patch are not necessarily indicative of a problem: it's a conservative collector with a defined delay period while it waits for garbage cycles to stabilize. During the stabilization period, the garbage cycles buffered and reachable by the cycle collector, but it clears its delay buffer during shutdown. This means that valgrind will see buffered garbage cycles as leaks. 

One way to compensate for this affect is to run the collector with a flag controlling its delay-buffer clearing. If you get L leaks and R reachable with buffer clearing turned on, and L-K leaks and R+K reachable for some K with buffer clearing turned off, you can be pretty sure that K leaks were just buffered cycles, not bugs introduced by the collector. I haven't run this experiment yet.
Attached patch update cycle collection patch (obsolete) — — Splinter Review
Fixes to two new bugs (and one regression due to code that I thought I observed magically fixing itself, but which is in fact still buggy so should remain commented-out).
Attachment #242373 - Attachment is obsolete: true
Blocks: 335896
Comment on attachment 244177 [details] [diff] [review]
update cycle collection patch

In particular, if you could take a look at the stuff in xpcom/base and how it interfaces with the rest of xpcom / gecko.
Attachment #244177 - Flags: review?(benjamin)
Comment on attachment 244177 [details] [diff] [review]
update cycle collection patch

The XPCOM bits for a nonfrozen API look fine.
Attachment #244177 - Flags: review?(benjamin) → review+
Attached patch update to cycle collector patch (obsolete) — — Splinter Review
barring last minute glitches and embarrassing portability typos, this is the final "signing off" version I'd like someone to land for the 1.9 alpha (vlad has suggested he'll take it). Brendan, sicking and bsmedberg have reviewed various parts, I believe I've addressed most/all comments.
Attachment #244177 - Attachment is obsolete: true
Doesn't that change to nsDocumentFragment's QI make it QI to nsIDOMElement?  Is that desirable?
Er, nevermind on nsIDOMElement.  So this might be fine if we're ok with document fragments implementing all the event stuff.  That's the main difference between the QI methods.
It still seems like a bad idea, mostly for future-proofing. I filed bug 361460
This fix disables a small and optional bit of diagnostic machinery on OSX, as well as providing a wall-clock throttle for the frequency of aging pointers in the collector (defaults to 1 second).
Depends on: 361590
Depends on: 361585
Depends on: 361596
No longer depends on: 361596
Depends on: 361596
Attached patch a further fix to NODE_SCRIPTABLE_FLAGS (obsolete) — — Splinter Review
This doesn't solve all the speed concerns, but it puts a device in place to help ameliorate the largest observed new hotspot the cycle collector introduces. Observed improvement is unfortunately only a few percent; but then the observed overhead my machine shows for the cycle collector is only about 20% in the first place, not the 100% overhead seen in the tinderboxes when we landed. So I have no idea what will represent "fast enough", or whether this patch does any good. Ultimately we *do* have to keep a transient set of suspicious pointers, and we can only squeeze so much performance out of a hashtable.
I have no idea why the profilers on linux didn't find this. After enough wrestling, vtune found it on win32 (where the cost is particularly bad). The penalty now looks like about 1.1x - 1.2x rather than 2.0x - 3.0x.
Attachment #247803 - Attachment description: embarassing simple fix responsible for almost all of the performance penalty → embarrassingly simple fix responsible for almost all of the performance penalty
The committed patch for bug 362668 added STOBJ_ macros that should be used to access slots is JSObject instead of referencing JSObject.slot directly. Here STOBJ means Single-Threaded-Object. This affected the cycle collector patch at least in one place and here is a fix for that.

The patch also removes unnecessary != JSVAL_NULL check in the slot traversal loop body as this redundant due to child!=NULL check.

Which raises the question: the cycle GC suspends all other threads that can mutate the object graph, right? If so, OBJ_GET_SLOT macros in the loop this patch affects can be replaced by STOBJ_GET_SLOT. Otherwise the code is not thread safe.
The previous patch includes all the cycle collector changes just to show how to change single line to STOBJ macros.
Attachment #248088 - Attachment is obsolete: true
(In reply to comment #69)
> Which raises the question: the cycle GC suspends all other threads that can
> mutate the object graph, right?

The cycle collector operates only on main-thread-only objects.

/be
This should fold in all the changes made to the patch since the last attempted landing, as well as update it against recent activity on trunk.
Attachment #246238 - Attachment is obsolete: true
Attachment #246332 - Attachment is obsolete: true
Attachment #246880 - Attachment is obsolete: true
Attachment #246881 - Attachment is obsolete: true
Attachment #247803 - Attachment is obsolete: true
Attachment #248093 - Attachment is obsolete: true
Depends on: 365983
graydon, could this checkin be the cause of the crasher in bug #366063?
No longer blocks: 366063
Attached patch 64-bit fixes (diff -w) — — Splinter Review
This makes the cycle collector code compile and work on 64-bit systems. And this also cleans up indentation issues in nsISupportsImpl.h (mixed space/tab indentation).
Attachment #250648 - Flags: superreview?(bugmail)
Attachment #250648 - Flags: review?
Attachment #250648 - Flags: review? → review?(graydon)
Attachment #250648 - Flags: superreview?(bugmail)
Attachment #250648 - Flags: superreview+
Attachment #250648 - Flags: review?(graydon)
Attachment #250648 - Flags: review?
Attachment #250648 - Flags: review? → review?(graydon)
64-bit fix looks correct to me. Other reviewers required?
Attachment #250650 - Flags: review+
Attachment #250648 - Flags: review?(graydon)
Attachment #250648 - Flags: review+
All good. I landed the 64-bit fixes on the trunk.
Blocks: 366119
No longer blocks: 366119
Depends on: 366119
Depends on: 366127
Depends on: 366063
No longer depends on: 366127
This patch might address bugs 366063 and 366148. The crash in question is triggered by the cycle collector traversing an edge from an nsEventListenerManager to an nsImageDocument, that then QIs to the nsDocument::cycleCollection helper, that then performs an incorrect static ambiguous base cast (the nsIDocument subobject, rather than the nsIDOMEventListener one), and crashes.
Attachment #250709 - Flags: review?(dbaron)
Depends on: 366166
Comment on attachment 250709 [details] [diff] [review]
correction for MI cycle collection crasher in nsImageDocument

I think this is the wrong approach.  It sounds like the cycle collector currently only supports a cycle collection participant being referenced through a single interface, which seems like a pretty major bug.  We need to have a QueryInterface somewhere before doing a downcast so that it can work from multiple interfaces.
Attachment #250709 - Flags: review?(dbaron) → review-
Comment on attachment 250709 [details] [diff] [review]
correction for MI cycle collection crasher in nsImageDocument

OK, so r=dbaron for **temporarily** doing this to disable cycle collection through nsImageDocument, but we should get the real fix sooner rather than later.
Attachment #250709 - Flags: review- → review+
So this still had somewhat of a hit on perf.  Order of 3% on Tdhtml, 4% on Tp, 5% on Txul, 5-6% on Ts.

We might be willing to pay that, I guess...  and perhaps with more deCOM (reducing number of addrefs/releases) it'll get better...
Probably, bug 366148 is a regression of this.
Depends on: 366148
(In reply to comment #82)
> So this still had somewhat of a hit on perf.  Order of 3% on Tdhtml, 4% on Tp,
> 5% on Txul, 5-6% on Ts.
> 
> We might be willing to pay that, I guess...  and perhaps with more deCOM
> (reducing number of addrefs/releases) it'll get better...

First, jst has a patch in progress to get GC (JS and now CC) off the Tp critical path. We always had too much non-deterministic GC noise in Tp, it seems, and CC amplified it.

(In reply to comment #83)
> Probably, bug 366148 is a regression of this.

See comment 79 -- does the followup patch not address that bug?

/be
(In reply to comment #84)
> (In reply to comment #82)
> > So this still had somewhat of a hit on perf.  Order of 3% on Tdhtml, 4% on Tp,
> > 5% on Txul, 5-6% on Ts.
> > 
> > We might be willing to pay that, I guess...  and perhaps with more deCOM
> > (reducing number of addrefs/releases) it'll get better...
> 
> First, jst has a patch in progress to get GC (JS and now CC) off the Tp
> critical path. We always had too much non-deterministic GC noise in Tp, it
> seems, and CC amplified it.

I meant to say:

Second, after jst's patch, we should see what's left to fix in T* regressions, and also get rid of ad-hoc cycle-breaking or -avoiding code where it is not needed any longer, and where it costs too much.  This will win codesize more than runtime, but the runtime effects may be minimal for all Tx once Tp is no longer perturbed by collection.

DeCOMtamination could win back time on all benchmarks, but I don't want to count on it.  I know Boris has done lots of useful profiling; we're gearing up to gather fresh profile data and institute ongoing perf work for 1.9 and Mozilla 2.  We have more folks to work on performance; with the right balance of work for 1.9 and 2 I believe we can go beyond holding the line, playing defense for the current figures recorded by tinderboxen.  More in a bit.

/be
Depends on: 366440
Depends on: 366578
So can the XML_HTTP_REQUEST_ROOTED stuff in nsXMLHttpRequest.cpp be removed now?
Per comment #73
BeOS defines from malloc.h all lack last void ponter argument:
/* Hooks for debugging versions.  */
extern void (*__malloc_initialize_hook)(void);
extern void (*__free_hook)(void *ptr);
extern void *(*__malloc_hook)(size_t size);
extern void *(*__realloc_hook)(void *ptr, size_t size);
extern void *(*__memalign_hook)(size_t size, size_t alignment);

thus bustage at line 1160 and further
per comment 87:
Inspite malloc hooks are declared in malloc.h system header for BeOS, I couldn't find it in any lib.
Attached patch patch (obsolete) — — Splinter Review
allows XPCOM to be built under BeOS - adjust some defines
Per comment 89:
Still cannot build under BeOS:
/mozbuild/home/mozbone/2007-01-11/mozilla/js/src/xpconnect/src/xpcprivate.h: In method `nsresult nsXPCWrappedJS::cycleCollection::Traverse(nsISupports *, nsCycleCollectionTraversalCallback &)':
/mozbuild/home/mozbone/2007-01-11/mozilla/js/src/xpconnect/src/xpcprivate.h:2249: `class nsAutoRefCnt nsXPCWrappedJS::mRefCnt' is protected
/mozbuild/home/mozbone/2007-01-11/mozilla/js/src/xpconnect/src/xpcwrappedjs.cpp:77: within this context 
The cycleCollection class probably needs to be declared as friend class.

(Similar bug for BeOS in bug 365118.) 
(In reply to comment #91)
Along with the BeOS-patch,
http://lxr.mozilla.org/seamonkey/source/xpcom/base/nsCycleCollectionParticipant.h#163
needs to have 'friend class' instead of class to build Firefox.
 
One question, should this 'friend' be ifdef'ed for older gcc's or is it ok to have it always?

We should open a bug for BeOS.
Blocks: 367678
Depends on: 368159
Depends on: 366241
Depends on: 368523
Depends on: 368528
Depends on: 367779
Depends on: 368530
Depends on: 368549
Depends on: 368869
Depends on: 369625
Depends on: 371321
Depends on: 372713
Depends on: 375225
Depends on: 376530
Depends on: 373693
Depends on: 373694
Depends on: 377606
Depends on: 377730
Depends on: 377787
Comment on attachment 251372 [details] [diff] [review]
patch

this patch is included in overall bustage fix in bug 377730
Attachment #251372 - Attachment is obsolete: true
Blocks: 339504
Depends on: 387725
Depends on: 397435
Depends on: 407034
No longer depends on: 397435
I'm going to mark this fixed, cycle collector landed a long time ago and should be mostly working fine now.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Alias: XPCOMGC → cycle-collector
Depends on: 447764
You need to log in before you can comment on or make changes to this bug.