Closed Bug 258832 Opened 15 years ago Closed 15 years ago

Leaks introduced by bug 230816

Categories

(Core :: XBL, defect)

x86
Linux
defect
Not set

Tracking

()

VERIFIED FIXED

People

(Reporter: bzbarsky, Assigned: bzbarsky)

References

Details

(Keywords: memory-leak)

Attachments

(3 files, 2 obsolete files)

In short, we're leaking the world (docshell/window/document and all that entails).

We don't seem to be leaking the clone function objects, but we _are_ leaking the
JSObjects that wrap the native nsIContent when we're compiling/executing our
method (nsXBLProtoImplAnonymousMethod::Execute).

I'm really not quite sure what to look for here.  :(
Summary: Leaks introduced by bug 230816 → Leaks introduced by bug 230816
So http://www.mozilla.org/performance/leak-brownbag.html is still the definitive
guide.  I'm busy this weekend, in and out, but may be able to help if we can
hook up via IRC.  Dbaron's help would be swell.

/be
I've been poking at it a little, but I need more RAM to run leaksoup on the
shutdown leak file (which is 238 MiB).
I'm sort of looking at the comment at
http://lxr.mozilla.org/seamonkey/source/dom/src/base/nsJSEnvironment.cpp#1259
and wondering whether I need to be doing something similar here at some point....
bz: that's it!  I didn't realize that nsXBLProtoImplMethod::CompileMember calls
nsIScriptContext::CompileFunction with aShared == PR_FALSE.  You need PR_TRUE
for the constructor and destructor, at least, now that you are calling clones of
the "brutally shared" anonymous functions.

I don't have time to check whether you want aShared == PR_TRUE always.  What are
the other cases that use nsXBLProtoImplMethod::CompileMember?

/be
The other case is a <method> in XBL.  What that does is store the result of
CompileFunction in mJSMethodObject (like constructor/destructor) and then in
InstallMember() clones it and sets the clone as a property using
JS_DefineUCProperty.

Note that nsXBLProtoImplProperty::CompileMember also passes PR_FALSE for aShared
(and then clones things in InstallMember() in the same way).
Attached patch Patch to pass aShared == PR_TRUE (obsolete) — Splinter Review
This patch does _not_ fix the leak, unfortunately....
How is passing aShared == PR_TRUE supposed to help things like:

09123010 string searchEngUrl via
nsXBLProtoImplMethod::mJSMethodObject(Function).__proto__(Function).__proto__(Object).__parent__(nsXBLPrototypeScript
compilation
scope).__proto__(chrome://navigator/content/urlbarBindings.xml#urlbar
9122f31).__proto__(chrome://navigator/content/urlbarBindings.xml#autocomplete-search-box
9122c91)
Also, these GC traces seem a little odd to me:

...(ChromeWindow).chrome://global/content/binding(chrome://global/content/bindings/scrollbar.xml#scrollbar
8def6c1).initScrollbar(Function).__proto__(Function).__proto__(Function).__proto__(Object).__parent__(nsXBLPrototypeScript
compilation scope)...
dbaron's right, aShared as currently implemented by
dom/src/base/nsJSEnvironment.cpp can't work.  This patch might cause it to
work, but I have some doubts.  It does, at least, make aShared be an
independent parameter from the (scope object == null) parameter-condition.

It turns out the XUL content code calls CompileEventHandler with a null scope
object, as well as with aShared == PR_TRUE.  But XBL was passing PR_FALSE, and
even with bz's patch in this bug to pass PR_TRUE, XBL passes a non-null
aClassObject as the scope object.  That entrains the class prototype.

With this patch, only the brutally-shared function compiled for the XBL method
or ctor/dtor; that function object's prototype, Function.prototype from the
shared compilation scope; and the Object.prototype from the same scope, will be
entrained by clones created at InstallMember time when calling methods, and at
Execute time when calling ctors/dtors.

Whether that's enough to fix the leak, bz will have to say.  At least it makes
things more consistent, both between compiling event handlers and functions,
and within the nsJSEnvironment::Compile{EventHandler,Function} methods
regarding use of independent scope-object and aShared parameters.

/be
Comment 7 cites this reference path:

09123010 string searchEngUrl via
nsXBLProtoImplMethod::mJSMethodObject(Function).__proto__(Function).__proto__(Object).__parent__(nsXBLPrototypeScript
compilation
scope).__proto__(chrome://navigator/content/urlbarBindings.xml#urlbar
9122f31).__proto__(chrome://navigator/content/urlbarBindings.xml#autocomplete-search-box
9122c91)

which looks very wrong to me.  Why is the XBLPrototypeScript compilation scope
prototyped by chrome://navigator/content/urlbarBindings.xml#urlbar (the
XBL-created class prototype for that URI's binding), which is prototyped by
chrome://navigator/content/urlbarBindings.xml#autocomplete-search-box (a totally
different binding!)?

It looks like the prototype chain of the compile-time scope object is getting
messed up.  Is that really happening, or did this trace get garbled (via the JS
GC, or via bugzilla copy/paste, or some such)?

/be
Duh, urlbar does inherit from autocomplete search box.  But the
nsXBLPrototypeScript compilation scope should not be prototyped by the urlbar
binding's prototype.
So nsXBLProtoImpl::CompilePrototypeMembers(nsXBLPrototypeBinding* aBinding) does
this:

  JSObject* scopeObject = globalObject->GetGlobalJSObject();
  nsresult rv = aBinding->InitClass(mClassName, context, scopeObject, &classObject);

And nsXBLPrototypeBinding::InitClass just calls like so:

  return nsXBLBinding::DoInitJSClass(cx, ::JS_GetGlobalObject(cx),
                                     scriptObject, aClassName, aClassObject);

nsXBLBinding::DoInitJSClass(JSContext *cx, JSObject *global, JSObject *obj, ...)
in turn does this just before returning:
                                                                                
  // Set the prototype of our object to be the new class.
  if (!::JS_SetPrototype(cx, obj, proto)) {
    return NS_ERROR_FAILURE;
  }

This looks to me like a sure-fire way to connect an nsXBLPrototypeScript
compilation scope object to a particular binding's class prototype.  That must
be a bug, but isn't it old as the hills?  I can't believe it's a feature.

Why does nsXBLProtoImpl::CompilePrototypeMembers pass a global object indirectly
as the |obj| parameter of DoInitJSClass?  The |obj| parameter is supposed to be
a bound element's wrapper JSObject.

/be
Applying Brendan's patch for JSEnvironment and the aShared patch still leaks. 
It's pretty easy to notice the leak in any current debug build, btw -- the
webshell count and DOMWindow count don't end up at 0.

As for comment 12, there are two callers of InitClass() --
nsXBLProtoImpl::InitTargetObjects (which passes the wrapper for the bound
element) and nsXBLProtoImpl::CompilePrototypeMembers (which as you noticed is
doing something weird).  For example, why is it calling InitClass() at all? 
IT's doing this to get a "obj" param for JS_CompileUCFunctionForPrincipals (in
the end), but it's not clear to me what that "obj" should be -- it's not really
documented at
http://www.mozilla.org/js/spidermonkey/apidoc/gen/api-JS_CompileUCFunctionForPrincipals.html
Blocks: 258921
As implied in my comments here, the |obj| param to JS_Compile*Function* is the
scope chain head for the function object being compiled, and the object in which
a property names that function, if the |name| param is non-null.  If |obj| is
null, |name| may still be non-null to give the function an intrinsic name
(useful for diagnostics, and for later query to give clones the same name, e.g.).

Brutally shared function objects are often named, so as to be shared via
property references in multiple global objects, or otherwise (method names in
bound element prototypes created via XBL).  But brutal sharing should pass null
for |obj|, to avoid creating prototype compilation scope objects (we still make
'em, possibly we could eliminate them now that JS_Compile*Function* allows null
|obj|), or at least to avoid naming proto-methods in compilation scopes.

So clearly, passing the global object from
nsXBLProtoImpl::CompilePrototypeMembers to InitClass is wrong.  I think the
calling code here should pass null, and DoInitJSClass be tweaked to cope with a
null |obj| param.  Comments?

/be
I'm advocating fixing nsXBLProtoImpl::CompilePrototypeMembers now, even if it
doesn't help fix the leak, just to clear the muddy waters a bit.  With the
pointer stew created via XBL and JS, it's going to be tough to find the cycle
created by bz's patch (I still don't see it, but I'll keep staring at that
patch).  Leaksoup would be good, served nice and hot on the MF's 1GB machine....

/be
Attached patch Fix CompilePrototypeMembers (obsolete) — Splinter Review
I agree that it doesn't make much sense to pass in a non-null scope object to
functions we just plan to clone into a different scope.

Which makes me wonder why we bother passing a class (return value of
JS_InitClass) to all the CompileMember functions in the first place... I
suspect it's not really needed.

In any case, this patch handles the null-obj case by just skipping the
proto-mangling for null obj (while still calling JS_InitClass for that case so
the rest of XBL doesn't freak.. but perhaps the right fix is to unfreak all of
XBL?  Or do we really need the JS_InitClass here?).

Applying this together with the other two patches in this bug still doesn't fix
the leak.  :(
Comment on attachment 158840 [details] [diff] [review]
Fix CompilePrototypeMembers

Chatted with bz, wanted to record the answer to this question here:

> Which makes me wonder why we bother passing a class (return value of
> JS_InitClass) to all the CompileMember functions in the first place... I
> suspect it's not really needed.

JS_InitClass returns the class prototype object.  It's needed when applying a
binding to give each bound doc (actually, each disconnected prototype chain in
a bound doc that runs through the binding being applied) its own prototype
holding the binding members.  With the advent of brutal sharing, mscott reused
InitClass to make a compile-time home for the precompiled binding methods. 
That was cool, but passing the global object in was not.

Pre-emptive r+sr=me.

/be
Attachment #158840 - Flags: superreview+
Attachment #158840 - Flags: review+
*** Bug 259276 has been marked as a duplicate of this bug. ***
Comment on attachment 158840 [details] [diff] [review]
Fix CompilePrototypeMembers

Checked in  this part.
Attachment #158840 - Attachment is obsolete: true
So I ran leaksoup.  The leak roots contained a lot of things that I'd expect to
be leaked from global variables or from things that shut down without releasing
them (JS GC roots holding XPCWrappedNative, nsGenericElement's hash holding
nsEventListenerManager objects, nsXBLPrototypeBinding's pools).  I didn't see
anything particularly unusual.  So I'd say it seems likely that the problem is
related to a cycle rooted by an nsXPCWrappedNative.  I'm not sure I'd learn very
much useful if I tried not destroying the JS runtime, but maybe I'll try it...
This is just the leak roots section of the leaksoup output.
08ad7ef8 object 0x89df478 XULDocument via
nsXPCWrappedJS::mJSObj[nsIDOMEventListener,0x8f37ac8,0x8ecb6b0](Object).tabbox(XULElement).__parent__(XULElement).__parent__(XULElement).__parent__(XULElement).__parent__(XULElement).__parent__(XULElement).__parent__(XULDocument).

Probably has something to do with:
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/xpfe/global/resources/content/bindings/tabbox.xml&rev=1.33&mark=113-150#113

Could this be new, somehow?
Commenting out the constructor and destructor in tabbox.xml fixes almost all of
the leaks.
(And once I do that the few that are left are related to

XPCWrappedNativeProto::mJSProtoObject(XPC_WN_ModsAllowed_Proto_JSClass).controllers
getter(Function).__proto__(Function).__proto__(Function).__proto__(Object).__parent__(ChromeWindow)


.)
The weird thing about tabbox.xml is that the destructor is never called (unless
|dump| doesn't work in it).
(The destructor isn't even called if I comment out the content of the constructor.)
And it was called in a nightly before your patch (and wasn't in a nightly after).
When we calll nsXBLProtoImplAnonymousMethod::Execute to run the destructor,
mJSMethodObject is null.
Hmm..  there is definitely a problem with XBL destructors.. (see 258921).  I'll
look into it tonight.
... which is because  mUncompiledMethod->mBodyText.GetText() returned null in
CompileMethod, so CompileMethod didn't do anything.
Attached patch patchSplinter Review
checked in with verbal r+sr=brendan
One tiny line lost, and we didn't see it reading and re-reading the diff.  - has
low visual weight, and the colored side-by-side diff tends to pull one's focus
to the right.  Sorry I missed this, major kudos to dbaron for finding it!

/be
*** Bug 259277 has been marked as a duplicate of this bug. ***
Comment on attachment 158594 [details] [diff] [review]
Patch to pass aShared == PR_TRUE

Well, dbaron's patch has fixed this bug (leaks are back to pre-bug 230816
levels).

I still think this is worth landing, though.
Attachment #158594 - Flags: superreview?(brendan)
Attachment #158594 - Flags: review?(brendan)
Comment on attachment 158806 [details] [diff] [review]
try this, it might fix the leak

And what about this one?
Comment on attachment 158594 [details] [diff] [review]
Patch to pass aShared == PR_TRUE

This is fine.

Let me whip up a better nsJSEnvironment.cpp patch, if I have time right now.

/be
Attachment #158594 - Flags: superreview?(brendan)
Attachment #158594 - Flags: superreview+
Attachment #158594 - Flags: review?(brendan)
Attachment #158594 - Flags: review+
Comment on attachment 158594 [details] [diff] [review]
Patch to pass aShared == PR_TRUE

Checked in.
Attachment #158594 - Attachment is obsolete: true
Resolving fixed (by dbaron's patch).  Brendan, it'd be good to see that
nsJSEnvironment patch updated; should I file a separate bug for that?
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
bz: please do, it's too much to fit into this bug, and into my brain right now.

VERIFYing.

/be
Status: RESOLVED → VERIFIED
Bug 262948 filed on that.
You need to log in before you can comment on or make changes to this bug.