Closed Bug 53901 Opened 24 years ago Closed 17 years ago

nsXULElement::CloneNode sets IS_IN_DOCUMENT flag

Categories

(Core :: XUL, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla1.9alpha5

People

(Reporter: dbaron, Assigned: sicking)

References

Details

(Keywords: embed, memory-leak, Whiteboard: [rtm-])

Attachments

(4 files, 3 obsolete files)

nsXULElement::CloneNode (unlike all the other CloneNode methods, I think), sets
the |mDocument| of the cloned XUL element to non-null in some cases.  This means
that if a JS object for the cloned node is created and it is never added to the
document tree, it leaks, possibly taking the whole document with it.  See bug
52726.

The possible solutions I can think of are:
 * not set mDocument to non-null (maybe override later).  Will this cause other
problems?
 * fix bug 52732 instead
Above stack trace was *with* the patch to fix bug 52726.
I confirmed that the call to nsXULElement::CloneNode immediately before the call
to JS_AddNamedRoot took the branch where mPrototype is nonzero (the one with the
problem).
This, along with bug 52726, have a huge potential for leakage. Adding
"embedding" keyword because I'm sure jud'll be interested. We should fix for RTM.
Status: NEW → ASSIGNED
Keywords: embed, rtm
Whiteboard: [rtm+]
I think the above patch is the fix: I'm testing it now.

nsXULElement::Create() should allow aDocument to be null; if it is null, then
the event handlers and popup stuff'll be set up upon addition to the document
(see nsXULDocument::AddSubtreeToDocument). With that done, we can now safely
pass nsnull as the aDocument argument when calling nsXULElement::CreateElement()
from nsXULElement::Clone().
Priority: P3 → P2
Hrmph. That patch ain't ready for primetime. We're now dropping JS objects.
I suspected as much.  Is it because of GetScriptObject calls and consequent data 
dependencies on the one true script object for a content node being made without 
a JS reference, and then JS_GC runs?  Or are we "dropping" script objects even 
with JS references?

/be
hyatt, I'm pretty sure my fix is right, but I'm having trouble getting XBL to
co-operate. After calling cloneNode(), nsXBLBinding::GenerateAnonymousContent()
calls SetDocument(), which it should. One problem that I've noticed is that
AllowScripts() is always returning false, even in chrome. But even if I
hard-code SetDocument(..., PR_TRUE) to force scripts to be allowed, I'm ending
up with properties being undefined sometimes.
Ok, tried making nsXBLElement::AllowScripts() return PR_TRUE *all* the time.
With this set, some of the auto-complete stuff works in the URL bar and the
addressing widget, but the binding for the return key is busted in the
addressing widget. (Without AllowScripts() hardwired to true, *none* of the
auto-complete stuff works.)
brendan: to answer your question, AFAICT, XBL is trying to make anonymous
content "be in the document". So I suspect that this is what is failing.
(from mail to reviewers@mozilla.org...)

Vidur Apparao wrote:

> I don't believe it's a serious issue for HTML and generic XML elements. Most
> calls to GetScriptObject from non-idlc-generated code are from call points
> that are themselves directly called from idlc-generated glue code
> (nsIJSScriptObject methods, for example).

Ugh, it is an issue with HTML, but I'm not sure how serious. For example, the
HTML below will lose the property set on the "br" element that's supposed to be
hanging off of Node.

XBL is potentially quite screwed. If we fix the leak, then anyone who holds on
to XBL-generated content risks losing properties when an element is removed from
the document (addressing widget breaks mightily, for example). If we leave the
leak in, then we've got a time bomb waiting to leak megabytes of memory.

Thoughts?

chris

<html>
<head>
<script language="JavaScript">
var Node;

function setProperty()
{
  // create a <span> element that we'll never put it in
  // the document.
  Node = document.createElement("span");

  // The only JS ref will be from this local var, which'll
  // die after the function leaves scope.
  var child = document.createElement("br");
  child.randomProperty = "random property!";

  Node.appendChild(child);

  // Kill JS's newborn slot
  document.createElement("br");

  // Now Node.firstChild's JS object is dangling, and
  // will get scooped up at the next GC point!
}

function getProperty()
{
  // try running this before, and then after, a GC has run!
  document.getElementById("propertyValue").value =
Node.firstChild.randomProperty;
}
</script>
</head>
<body onload="setProperty();">
<form>
<input id="propertyValue" type="text"></input>
</form>
<button onclick="getProperty();">Get the property!</button>
</body>
</html>
Marking "needinfo." Will reconsider for inclusion once there is a reviewed and
super reviewed patch.
Whiteboard: [rtm+] → [rtm+ needinfo]
Don't screw XBL!  What did it ever do to you? :)
Whiteboard: [rtm+ needinfo] → [rtm-]
Denominating this bug because of the mess it entails. We'll fix this later. In
the meantime, be careful with Clone'd elements!
I think we want to have this for mozilla1.0, though I'll believe that it's going
to miss the NS6.0 RTM train.
Keywords: mozilla1.0
I suspect that bug 71483 will make fixing this bug irrelevant. I'll re-evaluate 
once the DOM is XPConnected.
Depends on: 71483
Target Milestone: --- → mozilla0.9.1
mDocument must be non-null here.  In fact, I deliberately patched CloneNode so
that it is non-null along both the lightweight and heavyweight paths.  You will
bust mailcompose if you don't ensure a non-null mDocument immediately.
We need to fix it so that nodes never have a null mDocument when created using
createElement or cloneNode.
Allright, let me re-state that. Fixing bug 71483 will hopefully untangle the 
``should this script object be rooted or not'' issue from the mDocument pointer. 
I'll re-examine once jst lands.
Target Milestone: mozilla0.9.1 → mozilla0.9.2
Target Milestone: mozilla0.9.2 → mozilla1.0
Blocks: 92580
No longer blocks: 92580
Just emphasizing that many things will break if this bug summary is taken
literally (and fixed by making the new cloned subtree have a null mDocument ptr).
IMO mDocument should not be null for both created elements and cloned elements.
Target Milestone: mozilla1.0 → Future
OK, so my short take on this if we don't plan to change nsIContent is:

1)  We should absolutely fix this bug as described in the summary
2)  We should fix XBL to stop making some rather incorrect assumptions about how
    document pointers work in gecko.  That will prevent the mailnews bustage
    hyatt speaks of.

That will allow me to sorta work on fixing bug 211128 (creating BindToTree()).

See bug 52732 and its dependencies, however.  It may be worth it to consider
changing nsIContent to expose two methods:

1)  GetOwnerDocument()
2)  IsInDocument()

and convert callers to use these explicitly.  It's very rare that someone
actually wants to get the "document this node is attached to, but only if it's
in a document", I would think....
Blocks: 211128
Depends on: 52732
Just a small note: my patch in bug 27382 (attachment 146411 [details] [diff] [review]) adds InDocument()
to nsGenericElement.
Right.  That's what got me thinking about the api nsIContent should expose...
note that GetOwnerDocument (returning nsIDocument) is also already on
nsGenericElement.
> 1)  GetOwnerDocument()

We might want some other name since otherwise we'll get hide-warnings (and
possibly compile errors) from collisions with the nsIDOMNode::GetOnwerDocument
function.

Maybe GetOwnerDoc() like GetAttr vs. GetAttribute
Depends on: 265086
Blocks: 226620
No longer blocks: 226620
Depends on: 226620
Depends on: 265087
Blocks: 326881
Once this is fixed, we should back out the code in nsGenericElement::doReplaceOrInsertBefore that this bug necessitates.
Blocks: 357453
Blocks: 360078
Assignee: waterson → nobody
Status: ASSIGNED → NEW
QA Contact: jrgmorrison → xptoolkit.xul
Working on this
Assignee: nobody → jonas
Attached patch Patch to fixSplinter Review
This should do it. I'm a bit nervous that the change to make XUL elements not call GetChildCount() during BindToTree will cause badness somehow, but I assert without it, and it seems like the right thing to do.
Attachment #15452 - Attachment is obsolete: true
Attachment #162584 - Attachment is obsolete: true
Attached patch Patch to fix -wSplinter Review
Same as above, diffed with -w
Attachment #264558 - Flags: superreview?(bzbarsky)
Attachment #264558 - Flags: review?(bzbarsky)
It'll might take me a week or so to get to this; I'm pretty swamped right now, and have a lot of reviews on my plate.  :(
Attachment #264558 - Attachment description: Patch to fix -2 → Patch to fix -w
bz, any input would be great, i.e. things to look out for or things that failed when you last attempted this (i tried tabbed browsing and that worked fine).

In the meantime I can ask someone else to review it.
also, do you know the answer to the question in nsGenericElement::UnbindFromTree in the patch?
All the problems I ran into when I last touched this code stemmed from applying XBL bindings to nodes created via createElement() before they were inserted into the DOM.  Since this patch doesn't do it, we should be ok.  ;)

I was sure we had discussion somewhere about the GetChildCount() call in BindToTree, but I guess not...  Taking that out does make sense.

As for UnbindFromTree, I think we should try making that assert fire for XUL as well.  If we hit it, we should figure out why.  I do think it was mostly, if not always, firing because of this bug.
Oh, why check the namespace ID in UnbindFromTree instead of using the FromContent() returning nsXULElement or just IsContentOfType directly?
Comment on attachment 264558 [details] [diff] [review]
Patch to fix -w

Jst, would be great if I could get this reviewed for alpha5.

The answer to the question at the beginning of nsGenericElement::UnbindFromTree seems to be "no" so I'll change that before checkin
Attachment #264558 - Flags: superreview?(jst)
Attachment #264558 - Flags: superreview?(bzbarsky)
Attachment #264558 - Flags: review?(jst)
Attachment #264558 - Flags: review?(bzbarsky)
Oh, and I'll use the FromContent trick too.
Summary: nsXULElement::CloneNode sets non-null mDocument → nsXULElement::CloneNode sets IS_IN_DOCUMENT flag
Comment on attachment 264558 [details] [diff] [review]
Patch to fix -w

Looks good to me, r+sr=jst
Attachment #264558 - Flags: superreview?(jst)
Attachment #264558 - Flags: superreview+
Attachment #264558 - Flags: review?(jst)
Attachment #264558 - Flags: review+
Depends on: 380872
Depends on: 380990
(In reply to comment #35)
> All the problems I ran into when I last touched this code stemmed from applying
> XBL bindings to nodes created via createElement() before they were inserted
> into the DOM.  Since this patch doesn't do it, we should be ok.  ;)

See bug 380990.
Comment on attachment 264558 [details] [diff] [review]
Patch to fix -w

>Index: content/base/public/nsINode.h
>+// Forces the XBL code to treat this node as if it was

s/was/were/

>+// in the document and therefor should get bindings attached.

"therefore"

>Index: content/xul/content/src/nsXULElement.cpp
>-    // mControllers can own objects that are implemented
>-    // in JavaScript (such as some implementations of
>-    // nsIControllers.

I don't see where we're nulling out mControllers now.  That would introduce leaks through the controllers, per above.  Please restore this code, one way or another (e.g. we could make nsXULElement cycle-collect through mControllers and teach all nsIControllers implementations and everything they hold refs to about the cycle collector, or we could restore this nulling-out code in nsGenericElement::UnbindFromTree.

>Index: content/xul/content/src/nsXULElement.h
>+    // The function exists soly because XUL elements store the binding parent

s/soly/solely/

>+    // differently than nsGenericElement does.

"as a member instead of in the slots, as nsGenericElement does"

Rest looks good, modulo comments coming up in bug 380990.
We should write us some righteous regression tests (testing that the cloneNode thing attaches bindings and createElement does not, or something).
Flags: in-testsuite?
Depends on: 381078
Attached patch Fix bzs commentsSplinter Review
Attachment #265328 - Flags: superreview?(bzbarsky)
Attachment #265328 - Flags: review?(bzbarsky)
Comment on attachment 265328 [details] [diff] [review]
Fix bzs comments

Looks good.
Attachment #265328 - Flags: superreview?(bzbarsky)
Attachment #265328 - Flags: superreview+
Attachment #265328 - Flags: review?(bzbarsky)
Attachment #265328 - Flags: review+
Did this cause bug 380913? XBL anonymous nodes bound to cloned created elements don't seem to be getting rendered. Note that cloning an element that's in the document doesn't help and indeed additionally triggers an assertion.
Depends on: 380913
Depends on: 381553
Attached patch MOZ_XUL missed define (obsolete) — Splinter Review
Missed XUL ifdefs.
Attachment #265905 - Flags: superreview?(bzbarsky)
Attachment #265905 - Flags: review?(bzbarsky)
Attachment #265905 - Flags: superreview?(bzbarsky)
Attachment #265905 - Flags: superreview+
Attachment #265905 - Flags: review?(bzbarsky)
Attachment #265905 - Flags: review+
Target Milestone: Future → mozilla1.9alpha5
This was landed a while ago
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Yep:

2007-05-15 18:13
2007-05-21 15:22
Depends on: 386854
Comment on attachment 265905 [details] [diff] [review]
MOZ_XUL missed define

Near as I can puzzle out from bug 386854, this patch is obsoleted by the one there.
Attachment #265905 - Attachment is obsolete: true
Component: XP Toolkit/Widgets: XUL → XUL
QA Contact: xptoolkit.xul → xptoolkit.widgets
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: