Closed Bug 335998 (strongparent) Opened 18 years ago Closed 13 years ago

parentNode of element gets lost

Categories

(Core :: DOM: Core & HTML, defect, P3)

x86
Windows XP
defect

Tracking


VERIFIED FIXED
mozilla9
Tracking Status
status2.0 --- wanted

People

(Reporter: martin.honnen, Assigned: smaug)

References

(Depends on 1 open bug)

Details

(Keywords: testcase)

Attachments

(3 files, 15 obsolete files)

55.13 KB, patch
Details | Diff | Splinter Review
14.60 KB, patch
peterv
: review+
Details | Diff | Splinter Review
14.57 KB, patch
Details | Diff | Splinter Review
The test case at
<http://home.arcor.de/martin.honnen/mozillaBugs/domLevel2/parentNodeProblem1.html>
has three div elements, |div id="div1"| contains |div id="div2"| as a child node which itself contains |div id="div3"| as a child node.
The script code in the document in the onload handler first stores a reference to |div id="div3"| in a global variable, then removes all child nodes of |div id="div1"| without storing a reference to them.
Then the parentNode of the stored |div id="div3"| is checked and initially is the |div id="div2"| element in Mozilla and other implementations (IE, Opera).
However with Mozilla if you then use the button to check the parentNode again and again Mozilla loses the parentNode and reports it as null. Other browsers don't do that, the parentNode of the stored |div id="div3"| element remains the |div id="div2"| element.

I see the problem with a recent trunk nightly Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20060429 Minefield/3.0a1 as well as with Firefox 1.5 release as well as with Mozilla 1.7 builds.
So nodes only hold weak refs to parent nodes....  

Could we maybe have nsGenericElement::GetSCCIndex ensure the parent is wrapped?  Not sure whether creating JSObjects at that point is ok or whether the effect on performance would be ok.
Flags: blocking1.9a2?
I think this used to work because we used to make the scope chain follow the DOM parent chain, but we changed that for IE compat (see bug 147058).
You don't want to hack this by creating wrappers that could just go away -- you'll just fix it most of the time, except when GC runs.  Unless you want to use the wrapper preservation mechanism on all wrappers that are the root of a fragment.

I think what you really want is a mechanism where the document manages all of the fragments that it owns; enforcing WRONG_DOCUMENT_ERR and implementing importNode and adoptNode would be a nice first step there.  (This might even be required if you want to do it using wrapper preservation.)
Having every document manage all its disconnected trees sounds expensive, since during normal DOM manipulation a lot of small trees are created and destroyed (think single-element trees).
A hashtable insertion or removal for operations like that shouldn't be too bad.  How many of those do JS and XPConnect do in the normal execution of such an operation?
We would have to sprinkle hash-adding functions all over the place, at least in all the document.create* implementations and all the UnbindFromTree implementations.
So here's another fundamental question.  This bug is about keeping any node in a subtree alive keeping the whole subtree alive, right?  For JS I think we clearly want that.  Do we want it for C++ or python or whatever as well?  If so, we should consider non-JS-bound solutions (eg actually hold strong refs both ways and use graydon's cycle collector thing or something).
I don't necessarily care about the C++ case. Python would be nice though the same way python would be nice to have similar behaviour where nodes keep their document around.
I just feel that if we're going to try to solve this for multiple languages it may possibly be better to change the actual DOM tree impl as needed instead of doing per-language things.
Yeah, I agree. What I'm worried about, though, is small leaks blowing up into huge ones. We do occasionally leak elements, but we don't want that to cause entire documents to leak.
Actually, generally speaking leaking an element already leaks the document.  Certainly if we leak the element's JSObject too.
I'd think it's more rare that we leak the JSObject since we have GC goodness there.
Keywords: testcase
> enforcing WRONG_DOCUMENT_ERR and implementing importNode and adoptNode

Yikes, that would break a lot of my code, and probably other people's code too.
Please see the relevant bug.  Don't start that discussion in this bug.
I guess that's bug 47903.  Does this bug depend on it?
Flags: blocking1.9a2? → blocking1.9-
Whiteboard: [wanted-1.9]
Blocks: acid3
Now that the cycle collector works pretty well, document::destroy is cleaned up,
etc., making child nodes own their parent might work quite well.
Assignee: general → nobody
QA Contact: ian → general
Flags: wanted1.9+
Whiteboard: [wanted-1.9]
Yeah, so now we actually *do* enforce importNode and adoptNode, and at least for ChatZilla this is a major issue (bug 413125) on trunk. Ironically it's an issue because we can't find the ownerDocument of a node anymore so as to be able to adoptNode new child nodes... :-(
Blocks: 413125
I've never seen us leak just elements anyway, so in reality we're likely leaking big on shutdown if we leak an element.
I would be very worried about making such a change at this point in the release cycle. But for mozilla2 we should definitely do that. Though things will be very different then since we'll be using mmGC.
(In reply to comment #19)
> I've never seen us leak just elements anyway so in reality we're likely
> leaking big on shutdown if we leak an element.

Oops, sorry, this was meant for another bug.
(In reply to comment #21)
> (In reply to comment #19)
> > I've never seen us leak just elements anyway so in reality we're likely
> > leaking big on shutdown if we leak an element.
> 
> Oops, sorry, this was meant for another bug.
> 

But comment #20 *was* meant for this bug? :-(
Attached patch Strong parent pointer (obsolete) — Splinter Review
This fixes the testcase.
Running the chrome tests doesn't show leaks; the browser tests leak even without the patch. Running mochitest and waiting for results from TryServer/Talos.
Comment on attachment 321973 [details] [diff] [review]
Strong parent pointer

This seems to cause one nsDocument leak with browser test and 2 nsDocument leaks with mochitest. Not too bad, I'd say. Going to look at what causes those leaks.
Wouldn't you also want to make the ownerDocument pointer (i.e. nsNodeInfoManager::mDocument) strong, to ensure that things like comment 18 really work?

Though I still think we should wait until mmgc for this...
marking wanted1.9.1? to get this in the triage queue.  
Flags: wanted1.9.1?
Priority: -- → P3
Ok, after discussing this with Jonas, we don't think this has to be in the 1.9.1 release, but Smaug, you are welcome to take this.
Flags: wanted1.9.1? → wanted1.9.1-
Component: DOM: Core → DOM: Core & HTML
Blocks: 448993
Attachment #321973 - Attachment is obsolete: true
Depends on: 454570
So given that this is part of Acid3, I guess we shouldn't wait for mmgc (or bome or whatever it'll be).

We definitely need to track down the leaks that appear if you enable owning parent/ownerDocument references though.
> (or bome or whatever it'll be).

Do you know what the time line is for experimenting with and possibly integrating Boehm GC?
(In reply to comment #32)
> > (or bome or whatever it'll be).
> 
> Do you know what the time line is for experimenting with and possibly
> integrating Boehm GC?

Benjamin started, and I am presently continuing conducting the experiment. The browser executes for quite a while with Boehm managing the heap and treating malloc/free memory as uncollectable (poorly but functionally). In this mode, however, it buys us nothing over jemalloc, and in fact performs worse. In the "real" mode -- treating such memory as collectable -- it executes for only a brief period before crashing. 

I am investigating this, but I would not hold your breath at the moment. Any estimate I made about how long it will take to fix "all" the things that might be causing it to crash would be complete fiction. Adjustments to global memory behavior are very tricky to get right, as you know.
Making nsRange cycle-collectable drops mochitest leaks to 1 document (+ other stuff).
...now trying to find which test causes that leak.
test_bug382990.xul causes leaks
Attached patch WIP, doesn't leak (obsolete) — Splinter Review
This doesn't leak any refcounted objects in mochitest or chrome, and nothing new
in browser-chrome (and those very few leaks are image cache related).
I'm not happy with NODE_OWNS_OWNER_DOCUMENT. Traverse/unlink should go via
nodeinfo/nodeinfomanager, but the more important thing was to track down leaks.
Next is to figure out how to modify nodeinfo handling (without the crashes
I got when trying the most obvious changes).
Peter, want to look at this patch and give comments?
I was trying to reduce traversing: only the root of a content subtree owns the
document, and each content object owns its parent content, so traversing doesn't go
via nodeinfo. (Attr nodes aren't handled.)
Passes tests without leaks. Running now in try server - an older version of the
patch didn't seem to cause bad perf regressions, though try server isn't
too reliable.
Attachment #341824 - Flags: review?(peterv)
Don't we already skip traversing DOM nodes whose GetCurrentDoc is currently being rendered?  If so, and if doing more traversing would make the patch cleaner, we should perhaps do that.
(In reply to comment #39)
> Don't we already skip traversing DOM nodes whose GetCurrentDoc is currently
> being rendered?
That is sort of a different thing, and yes, we don't traverse in that case.
But there are documents which are never 'rendered'.
If we can easily skip traversing element->ownerDocument in almost every case,
it might be worth trying.

>  If so, and if doing more traversing would make the patch
> cleaner, we should perhaps do that.
Most of the patch is just adding new stuff to CC. AddRefing/Releasing
parent/ownerDoc is pretty simple, IMO.
Perhaps I should try to add some code to see how much we eventually traverse when a document is being CC'd.

Anyway, all comments welcome.
(In reply to comment #38)
> Created an attachment (id=341824) [details]
> root of the content subtree owns document

This patch no longer applies cleanly.
Depends on: 463410
If and when the patch for bug 457022 lands and actually sticks, that severely bitrots portions of this patch to the point where I am not certain how to correctly modify and apply it.
There's nothing incompatible between the two patches, it's just a case of merging. This patch will probably get updated once it has been reviewed. I'm currently reviewing (also by splitting off some chunks in separate bug). There's no need to keep commenting about the fact that the patch doesn't apply, it happens.
Depends on: 463424
Blocks: 465767
Attachment #341824 - Flags: review?(peterv)
Comment on attachment 341824 [details] [diff] [review]
root of the content subtree owns document

This needs a new patch.
Part of the patch has already landed (nsRange at least)
Flags: blocking1.9.2?
Attached patch up-to-date (obsolete) — Splinter Review
...but something is now leaking. Try server shows that one document and its elements are leaking.
Attachment #368031 - Attachment is obsolete: true
Attachment #368031 - Flags: review?(peterv)
Not blocking 1.9.2 on this.
Flags: blocking1.9.2? → blocking1.9.2-
(In reply to comment #47)
> Not blocking 1.9.2 on this.

I thought 1.9.2 was supposed to pass ACID3.
(In reply to comment #48)
> (In reply to comment #47)
> > Not blocking 1.9.2 on this.
> 
> I thought 1.9.2 was supposed to pass ACID3.

That's what I thought, too (and I read somewhere, I think?). Re-nom'ing.
Flags: blocking1.9.2- → blocking1.9.2?
If the only reason to fix this is that someone said somewhere that 1.9.2 would pass ACID3, I almost hope we _don't_ fix it.  We should fix bugs that affect users and web developers before we start implementing things just to get another point on ACID3.

I certainly wouldn't hold the release of 1.9.2 if this were the last bug remaining, speaking only for myself.
(In reply to comment #48)
> (In reply to comment #47)
> > Not blocking 1.9.2 on this.
> 
> I thought 1.9.2 was supposed to pass ACID3.

I haven't heard anyone make that decision, and I certainly don't think it'd be made now given the short schedule. I'd still gladly consider a patch for this, but I'm with shaver here, I wouldn't hold the release for a point on ACID3.
Flags: wanted1.9.2+
Flags: blocking1.9.2?
Flags: blocking1.9.2-
(In reply to comment #51)
> > I thought 1.9.2 was supposed to pass ACID3.
> 
> I haven't heard anyone make that decision, and I certainly don't think it'd be
> made now given the short schedule. I'd still gladly consider a patch for this,
> but I'm with shaver here, I wouldn't hold the release for a point on ACID3.

Well, that is kind of what was said to answer why Firefox 3.5 was going to be released not passing ACID3. I agree that the pass level we have now is about all that is needed; the tests that we fail are for features that probably never belonged in the test in the first place.

However, besides technical issues there are also political issues.

Given the close schedule after 3.5 for this next branching, I can see that there is no way that it will include the SVG font support, so blocking for ACID3 issues probably makes no sense. However, having another release without announcing a commitment to make the next version pass ACID3, with a sort of slushy date associated with that, would probably be a political mistake.
I would rather make a political mistake than have someone work on this instead of a bug or feature that will actually affect people.  In fact, I would describe *that* as a political mistake, since it would be making a poor choice due to (theoretical) political pressure.

FF3.6 will be released not passing ACID3 for the same reason that FF3.5 was released without passing ACID3, and FF3, and FF2 for that matter -- because there were other things that we felt were more important than the items required to pass ACID3.  (Since all the things in ACID3 were standards before FF2 was released, IIRC, we could have chosen to implement all of those features, on their own merits.  We didn't, though dbaron's post on the topic points to some interesting analysis on such stuff.)

I would consider it a gift if people didn't use ACID3 to justify a blocking nomination again!
I would also strongly oppose a "commitment to make the next version pass ACID3", if that matters.  Or any version.  If we pass it, it'll be because the things it tests stand on their merits (or someone whose goal is actually passing ACID3 decides to do something instead of just complaining), not because someone chose to test them.
Attached patch v14 (obsolete) — Splinter Review
Still one leak when tryserver runs mochitest.
Can't reproduce on my linux system, testing elsewhere....
Attached patch v16 (obsolete) — Splinter Review
Passes tryserver tests at least on Windows.

This includes two hacks.
- unbound native anon doesn't own the document. This is because of the bug in
  generated content handling. There is occasionally a loop
  content->a_property_which_owns_the_content->content
  Fixing this could be a followup, IMO, because it needs some changes to
  generated content handling.
- Running mochitest (or even a small part of it) using --close-when-done and
  --autorun leaks without the change to quit.js.
  Note, there is no leak without --close-when-done.
  So far I haven't found what causes the extra reference to an element.
  DEBUG_CC isn't too helpful here.
  I think something somewhere assumes normal shutdown, not a special case like
  quit.js where a content page calls Quit().
Attachment #390890 - Attachment is obsolete: true
Depends on: 508373
Attached patch v18 (obsolete) — Splinter Review
The iframe related leak was the nsContentList cache after all.

Passed tryserver tests without leaks.

Unbind has a small hack for native anonymous content (followup bug 508373).

If we want to try to make the nodeinfomanager own the doc etc., that could
hopefully be done in a followup bug. The node-owns-doc part of this patch is pretty small to back out. (Or if really wanted, I could do that in this bug.)

This is something which should land to 1.9.3 pretty early, IMO.
Attachment #332444 - Attachment is obsolete: true
Attachment #340645 - Attachment is obsolete: true
Attachment #341824 - Attachment is obsolete: true
Attachment #389534 - Attachment is obsolete: true
Attachment #392569 - Attachment is obsolete: true
Attachment #392950 - Flags: review?(peterv)
Is there a reason to addref/release our GetParent() instead of our GetNodeParent()?
Why all the magic with NS_MAYBEADDREFOWNERDOC? Why not simply make nsNodeInfoManager hold a strong reference to its document?
(In reply to comment #59)
> Why all the magic with NS_MAYBEADDREFOWNERDOC? Why not simply make
> nsNodeInfoManager hold a strong reference to its document?
Because at least originally that just didn't work as easily as
the ownerdoc approach. Way more leaks.
But the nodeinfomanager approach is perhaps something to do.

(In reply to comment #58)
> Is there a reason to addref/release our GetParent() instead of our
> GetNodeParent()?
That doesn't really matter here.
GetParent() is one extra branch, no?  I agree it's probably not a big deal if we're getting rid of the MAYBEADDREF stuff.  Otherwise, it can avoid that stuff for kids of the document...
(In reply to comment #60)
> (In reply to comment #59)
> > Why all the magic with NS_MAYBEADDREFOWNERDOC? Why not simply make
> > nsNodeInfoManager hold a strong reference to its document?
> Because at least originally that just didn't work as easily as
> the ownerdoc approach. Way more leaks.

Any idea why? It seems like a much cleaner solution.
It isn't really that clean a solution. Currently there is the assumption that
an nsINode always has a nodeinfo and that the nodeinfo belongs to a nodeinfomanager,
so unlinking should unlink only mDocument in the nodeinfomanager.
I could try that once bug 508373 is fixed (hoping someone hacking layout would do it :p), because the hack for that bug
isn't possible with the nodeinfo approach.
Attached patch v19 (obsolete) — Splinter Review
Without the hack for bug 508373.

I started the nodeinfo-approach, but it is a bit tricky because of all the
bindingmanager and nodeinfomanager ownership issues.
Attachment #392950 - Attachment is obsolete: true
Attachment #393225 - Flags: review?(peterv)
Attachment #392950 - Flags: review?(peterv)
(In reply to comment #64)
> I started the nodeinfo-approach
But IMHO, that could be a followup bug.

Note, the changes to cached content list should be enough.
No need to change nsContentList::NodeWillBeDestroyed in any way.
Depends on: 461640
I split off the boxobject CC stuff to bug 461640 (it has a reviewed piece of attachment 393225 [details] [diff] [review]). I'll also split off the cached content list stuff to a separate bug.

I have to say I also really dislike the NS_MAYBEADDREFOWNERDOC business.
(In reply to comment #66)
> I have to say I also really dislike the NS_MAYBEADDREFOWNERDOC business.

Is it the whole idea or the name you are objecting to?  Changing MAYBE to CONDITIONALLY would certainly describe it better, although adding 8 characters to the function name.
I discussed this with Jonas last week, here's what we would like to see: nsNode holds strong reference to its nsNodeInfo, nsNodeInfo holds strong reference to its nsNodeInfoManager, nsNodeInfoManager holds strong reference to the document, nsDocument holds strong reference to its nsBindingManager, nsNodeInfoManager doesn't hold the nsBindingManager anymore (partly backing out the patch from bug 415192). We unlink from CC if possible. Does that setup work? If it doesn't because it leaks, we should find out what those leaks are and fix them (I can help). Do we still need NS_MAYBEADDREFOWNERDOC if we do that?
(In reply to comment #68)
> We unlink from CC if possible.

Specifically, unlinking unlinks the strong reference from the nodeinfomanager to nsDocument. The strong links node->nodeinfo->nodeinfomanager aren't touched during unlinking.
So we need to fix bug 415192 in some other way. Maybe the new setup just fixes it.

The document->bindingmanager link must never be unlinked, and the document always needs to have a
nodeinfomanager.
But if nodeinfomanager->document is unlinked, then since the bindingmanager doesn't
own the document, I think that should work.
Bug 415192 was caused by people holding on to an element and then having the element's owner doc get garbage collected, which meant that an actively used element lost its bindingmanager. However, with the proposed setup, an actively used element will hold a strong reference to its owner document (through element->nodeinfo->nodeinfomanager->document), which means that the document won't get garbage collected and thus we won't lose the binding manager.
So the idea is to have node->nodeinfo->nodeinfomanager strong, nodeinfomanager->document strong but breakable via cc?  And nodes always holding strong refs to parent nodes?  We're sure that nodes don't need the binding manager after the unlink phase of cc?  If so, that sounds doable...
(In reply to comment #72)
> We're sure that nodes don't need the
> binding manager after the unlink phase of cc?  If so, that sounds doable...

Technically speaking nothing should need anything after unlink since unlink should kill it. Of course that isn't always true.

All code we could find got the bindingmanager by doing:

node->GetOwnerDoc()->BindingManager()

(with appropriate nullchecks). If this is a situation where the document is getting unlinked along with the element in question, the above code will already stop producing a bindingmanager as soon as the document is destroyed.


We talked about making unlinking instead of clearing the nodeinfomanager->nsdocument link, change it to be a weak link (appropriately cleared when the document is destroyed as it is now). I don't remember if we decided that this was needed or not.
The thing is, I'd really like to not have to null-check GetOwnerDoc() or BindingManager()....
Absolutely. The way to do that is, IMHO, to make sure that node destructors never call out to code that attempts to get ownerdoc/bindingmanager. We should audit for that. Though probably after this bug is fixed. Then make it so that when the document goes away it clears out all the stuff that the bindingmanager holds on to.

Then the only place where we'd need to null-check is in the dtor for the element. If we can't get the ownerdoc there, we know that the bindingmanager is empty. Otherwise we do what we currently do.
(In reply to comment #75)
> Absolutely. The way to do that is, IMHO, to make sure that node destructors
> never call out to code that attempts to get ownerdoc/bindingmanager.
The node destructor, or actually LastRelease, always calls GetOwnerDoc().
But AFAICT in every case where it does so, it does it in order to clear out things from hashes managed by the document. If the document, upon destruction, cleared out those hashes then it would be fine for LastRelease to do:

nsIDocument* owner = aNode->OwnerDoc();
if (owner) {
  owner->ClearXFor(aNode);
}

We'd have to audit all NodeWillBeDestroyed implementations to make sure they don't try to get the ownerdoc.

Anyhow, I think all this is a secondary project that should happen after this bug is fixed.
(In reply to comment #77)
> If the document, upon destruction,
> cleared out those hashes then it would be fine for LastRelease to do:
> 
> nsIDocument* owner = aNode->OwnerDoc();
> if (owner) {
>   owner->ClearXFor(aNode);
> }

And probably do the same when unlinking a node (can't rely on unlink order, so nodeinfomanager->document might already be broken at that point).
Actually, when unlinking we shouldn't need to do anything since either

1) The document is getting unlinked as well
2) The document is not getting unlinked as well

If 1 then the document getting destroyed will clear the hashes. If 2 then the destructor should be able to get to the ownerdocument and remove appropriate items from hashes.
(In reply to comment #79)
> 1) The document is getting unlinked as well
> 2) The document is not getting unlinked as well
> 
> If 1 then the document getting destroyed will clear the hashes. If 2 then the
> destructor should be able to get to the ownerdocument and remove appropriate
> items from hashes.

Given that unlinking is generally incomplete we should still try to unlink them in case we don't unlink enough to get the destructor to run.
Depends on: 523080
Depends on: 523081
Depends on: 523082
Depends on: 523083
Attachment #393225 - Attachment is obsolete: true
Attachment #393225 - Flags: review?(peterv)
FYI, Acid3 Test 27 (tied to this Bug) suddenly fails with 
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.3a5pre) Gecko/20100413 Minefield/3.7a5pre
All ok on nightly before that.
Hold on, something went (reproducibly) wrong on my side, everything is back to normal now (verified by others).
Mea culpa
Is this still relevant for the Acid3 test?
Ok. How does it affect the test?
It causes Acid3 numeric tests #26 and/or #27 to fail intermittently a small percentage of the time (depending on garbage collection timing).  See bug 410460 comment 51, etc.
blocking2.0: --- → ?
Smaug, I think we should get this finalized for 1.9.3, and I think you had planned to do so either way. If you don't agree, please let me know.
Assignee: nobody → Olli.Pettay
blocking2.0: ? → final+
status2.0: --- → wanted
Depends on: 588681
This feels risky enough that we want it in beta. Please let me know if you disagree.
blocking2.0: final+ → betaN+
I'm afraid this might be too risky for FF4.

(Sorry that I've been slow with this. The old patch did work without leaks,
but people didn't quite like it.
*So far* I haven't found another way which doesn't leak.)
(In reply to comment #89)
> I'm afraid this might be too risky for FF4.

Is this now only about the parentNode getting lost per bug summary or does this mean that GetOwnerDoc() can return null in Firefox 4?
This is both.
And GetOwnerDoc() may return null even after this is fixed, at least after
unlink or during dtor.
Any news for this bug?
Yes, smaug is close to having a landable fix for this, but we've decided it's too risky to take this for Firefox 4, so this will need to wait until the following release.
blocking2.0: betaN+ → -
Blocks: 615034
We are at the beginning of a new cycle. Would this be the right time for smaug's patch?
This is a highly regression-risky change. I'd rather do this for Fx6,
especially since there are only a few weeks left of Fx5 development.
Matti, the right way to think about it is that we're halfway through the Fx5 cycle right now: week 3-4 of a 6-week cycle.
(In reply to comment #68)
> I discussed this with Jonas last week, here's what we would like to see: nsNode
> holds strong reference to its nsNodeInfo, nsNodeInfo holds strong reference to
> its nsNodeInfoManager, nsNodeInfoManager holds strong reference to the
> document, nsDocument holds strong reference to its nsBindingManager,
> nsNodeInfoManager doesn't hold the nsBindingManager anymore (partly backing out
> the patch from bug 415192).

I wonder if this setup causes too many cycles. As we saw during AllHands,
we do create lots of cycles which could be broken without the
cycle collector. NS_MAYBEADDREFOWNERDOC is one such way to break cycles.
...but we may want to have something a bit different anyway.
Depends on: 652970
Attached patch WIP (obsolete) — Splinter Review
This doesn't leak all the time when browsing :)
The idea is to keep the nodeinfomanager alive as long as the document via the
document's nodeinfo. But the document's nodeinfo doesn't cause the
nodeinfomanager's mDocument to be strong. Other kinds of
nodeinfos keep mDocument alive.
This way GetOwnerDoc() always returns a valid value.
(I wish we could get rid of special casing DocumentType)

Uploaded to tryserver to find new problematic cases.
Attached patch WIP2 (obsolete) — Splinter Review
Attachment #528650 - Attachment is obsolete: true
Attached patch WIP3, yet more unlinking (obsolete) — Splinter Review
Attachment #528688 - Attachment is obsolete: true
Depends on: 653420
Depends on: 653497
Depends on: 653716
Depends on: 654182
Attached patch WIP4, for backup (obsolete) — Splinter Review
Attachment #528707 - Attachment is obsolete: true
Attached patch WIP5 (obsolete) — Splinter Review
Waiting for tryserver results. nsSessionStore or something close
to it seems to do something hacky with timers, so not sure if all the tests
pass yet.
Attachment #530134 - Attachment is obsolete: true
Depends on: 662186
Attached patch WIP6Splinter Review
This doesn't seem to leak when running tests.

I need to split the patch and file separate bugs for certain parts of the patch.
Attachment #538035 - Attachment is obsolete: true
Depends on: 664430
Depends on: 664434
Depends on: 664438
Depends on: 664444
Depends on: 664447
Depends on: 664452
Depends on: 664455
Depends on: 664463
Depends on: 664464
Depends on: 664467
Depends on: 664470
Depends on: 664472
Attached patch patchSplinter Review
This is the patch for this bug. WIP6 also contains patches for the bugs which
this bug depends on.
Comment on attachment 539573 [details] [diff] [review]
patch

This certainly can't land before all the other related patches have landed.

The idea is that all the other nodeinfos, except the document nodeinfo, keep the
document alive. That way GetOwnerDoc() returns the right value even
in dtors, and yet there isn't a document->ni->nim->document cycle.
Attachment #539573 - Flags: review?(peterv)
Depends on: 664506
Depends on: 665317
I fuzzed WIP6 (along with the followup in bug 664463), and all I found was bug 665317. So I'd say this looks good :)
I pushed the patch to try, and there are some new leaks.
Probably some regression.
Comment on attachment 539573 [details] [diff] [review]
patch

Review of attachment 539573 [details] [diff] [review]:
-----------------------------------------------------------------

I'll review this part already, since any leaks are probably due to problems outside of this patch. It's likely that this patch will lead to more leaks in the future though :-(.

::: content/base/src/nsNodeInfoManager.cpp
@@ +396,5 @@
>  nsNodeInfoManager::RemoveNodeInfo(nsNodeInfo *aNodeInfo)
>  {
>    NS_PRECONDITION(aNodeInfo, "Trying to remove null nodeinfo from manager!");
>  
> +  if (aNodeInfo == mDocumentNodeInfo) {

I'd make this null out mDocument, presumably this can only happen after we released mDocument? We should be able to assert mNonDocumentNodeInfos == 0?

@@ +406,5 @@
> +        // even if mDocument gets deleted.
> +        mDocument->Release();
> +        if (!mDocumentNodeInfo) {
> +          mDocument = nsnull;
> +        }

And then you can remove this.
Btw, I'm not really sure it's correct anyway: what if the document and the nodeinfo manager are also held by non-nodeinfo objects? It seems like we'd never null out mDocument, potentially leading to a dangling pointer once the document is destroyed.
Attachment #539573 - Flags: review?(peterv) → review+
nsNodeInfoManager::mDocument is set to null when the document is deleted.
And nsDocument::mNodeInfo keeps the nodeinfomanager alive.
blocking2.0: - → ---
Depends on: 673806
Attached patch patchSplinter Review
Got all green on try with the previous patch + fix for animation controller unlink.
http://hg.mozilla.org/mozilla-central/rev/c53ee19aeffa

Marking this fixed now, but I may need to back this out if
bad leaks are found.
Also, atm I intend to back this out from the next Aurora, so the target
could be FF9.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Do we need a bug tracking that backout?
Blocks: 674276
Alias: strongparent
Target Milestone: --- → mozilla9
Olli, I assume things are looking good here and we don't need to do anything for aurora 9, right?
I'm not aware of any problems why we should back this out from A9.
Did this get backed out of 9?
Yes.
Depends on: 708572
Target Milestone: mozilla9 → mozilla10
And then the backout got backed out. See bug 711794.
Target Milestone: mozilla10 → mozilla9
Verified fixed in current builds.
Status: RESOLVED → VERIFIED