Closed Bug 258832 Opened 20 years ago Closed 20 years ago

Leaks introduced by bug 230816

Categories

(Core :: XBL, defect)

x86
Linux
defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: bzbarsky, Assigned: bzbarsky)

References

Details

(Keywords: memory-leak)

Attachments

(3 files, 2 obsolete files)

In short, we're leaking the world (docshell/window/document and all that entails). We don't seem to be leaking the clone function objects, but we _are_ leaking the JSObjects that wrap the native nsIContent when we're compiling/executing our method (nsXBLProtoImplAnonymousMethod::Execute). I'm really not quite sure what to look for here. :(
Summary: Leaks introduced by bug 230816 → Leaks introduced by bug 230816
So http://www.mozilla.org/performance/leak-brownbag.html is still the definitive guide. I'm busy this weekend, in and out, but may be able to help if we can hook up via IRC. Dbaron's help would be swell. /be
I've been poking at it a little, but I need more RAM to run leaksoup on the shutdown leak file (which is 238 MiB).
I'm sort of looking at the comment at http://lxr.mozilla.org/seamonkey/source/dom/src/base/nsJSEnvironment.cpp#1259 and wondering whether I need to be doing something similar here at some point....
bz: that's it! I didn't realize that nsXBLProtoImplMethod::CompileMember calls nsIScriptContext::CompileFunction with aShared == PR_FALSE. You need PR_TRUE for the constructor and destructor, at least, now that you are calling clones of the "brutally shared" anonymous functions. I don't have time to check whether you want aShared == PR_TRUE always. What are the other cases that use nsXBLProtoImplMethod::CompileMember? /be
The other case is a <method> in XBL. What that does is store the result of CompileFunction in mJSMethodObject (like constructor/destructor) and then in InstallMember() clones it and sets the clone as a property using JS_DefineUCProperty. Note that nsXBLProtoImplProperty::CompileMember also passes PR_FALSE for aShared (and then clones things in InstallMember() in the same way).
Attached patch Patch to pass aShared == PR_TRUE (obsolete) — Splinter Review
This patch does _not_ fix the leak, unfortunately....
How is passing aShared == PR_TRUE supposed to help things like: 09123010 string searchEngUrl via nsXBLProtoImplMethod::mJSMethodObject(Function).__proto__(Function).__proto__(Object).__parent__(nsXBLPrototypeScript compilation scope).__proto__(chrome://navigator/content/urlbarBindings.xml#urlbar 9122f31).__proto__(chrome://navigator/content/urlbarBindings.xml#autocomplete-search-box 9122c91)
Also, these GC traces seem a little odd to me: ...(ChromeWindow).chrome://global/content/binding(chrome://global/content/bindings/scrollbar.xml#scrollbar 8def6c1).initScrollbar(Function).__proto__(Function).__proto__(Function).__proto__(Object).__parent__(nsXBLPrototypeScript compilation scope)...
dbaron's right, aShared as currently implemented by dom/src/base/nsJSEnvironment.cpp can't work. This patch might cause it to work, but I have some doubts. It does, at least, make aShared be an independent parameter from the (scope object == null) parameter-condition. It turns out the XUL content code calls CompileEventHandler with a null scope object, as well as with aShared == PR_TRUE. But XBL was passing PR_FALSE, and even with bz's patch in this bug to pass PR_TRUE, XBL passes a non-null aClassObject as the scope object. That entrains the class prototype. With this patch, only the brutally-shared function compiled for the XBL method or ctor/dtor; that function object's prototype, Function.prototype from the shared compilation scope; and the Object.prototype from the same scope, will be entrained by clones created at InstallMember time when calling methods, and at Execute time when calling ctors/dtors. Whether that's enough to fix the leak, bz will have to say. At least it makes things more consistent, both between compiling event handlers and functions, and within the nsJSEnvironment::Compile{EventHandler,Function} methods regarding use of independent scope-object and aShared parameters. /be
Comment 7 cites this reference path: 09123010 string searchEngUrl via nsXBLProtoImplMethod::mJSMethodObject(Function).__proto__(Function).__proto__(Object).__parent__(nsXBLPrototypeScript compilation scope).__proto__(chrome://navigator/content/urlbarBindings.xml#urlbar 9122f31).__proto__(chrome://navigator/content/urlbarBindings.xml#autocomplete-search-box 9122c91) which looks very wrong to me. Why is the XBLPrototypeScript compilation scope prototyped by chrome://navigator/content/urlbarBindings.xml#urlbar (the XBL-created class prototype for that URI's binding), which is prototyped by chrome://navigator/content/urlbarBindings.xml#autocomplete-search-box (a totally different binding!)? It looks like the prototype chain of the compile-time scope object is getting messed up. Is that really happening, or did this trace get garbled (via the JS GC, or via bugzilla copy/paste, or some such)? /be
Duh, urlbar does inherit from autocomplete search box. But the nsXBLPrototypeScript compilation scope should not be prototyped by the urlbar binding's prototype.
So nsXBLProtoImpl::CompilePrototypeMembers(nsXBLPrototypeBinding* aBinding) does this: JSObject* scopeObject = globalObject->GetGlobalJSObject(); nsresult rv = aBinding->InitClass(mClassName, context, scopeObject, &classObject); And nsXBLPrototypeBinding::InitClass just calls like so: return nsXBLBinding::DoInitJSClass(cx, ::JS_GetGlobalObject(cx), scriptObject, aClassName, aClassObject); nsXBLBinding::DoInitJSClass(JSContext *cx, JSObject *global, JSObject *obj, ...) in turn does this just before returning: // Set the prototype of our object to be the new class. if (!::JS_SetPrototype(cx, obj, proto)) { return NS_ERROR_FAILURE; } This looks to me like a sure-fire way to connect an nsXBLPrototypeScript compilation scope object to a particular binding's class prototype. That must be a bug, but isn't it old as the hills? I can't believe it's a feature. Why does nsXBLProtoImpl::CompilePrototypeMembers pass a global object indirectly as the |obj| parameter of DoInitJSClass? The |obj| parameter is supposed to be a bound element's wrapper JSObject. /be
Applying Brendan's patch for JSEnvironment and the aShared patch still leaks. It's pretty easy to notice the leak in any current debug build, btw -- the webshell count and DOMWindow count don't end up at 0. As for comment 12, there are two callers of InitClass() -- nsXBLProtoImpl::InitTargetObjects (which passes the wrapper for the bound element) and nsXBLProtoImpl::CompilePrototypeMembers (which as you noticed is doing something weird). For example, why is it calling InitClass() at all? IT's doing this to get a "obj" param for JS_CompileUCFunctionForPrincipals (in the end), but it's not clear to me what that "obj" should be -- it's not really documented at http://www.mozilla.org/js/spidermonkey/apidoc/gen/api-JS_CompileUCFunctionForPrincipals.html
Blocks: 258921
As implied in my comments here, the |obj| param to JS_Compile*Function* is the scope chain head for the function object being compiled, and the object in which a property names that function, if the |name| param is non-null. If |obj| is null, |name| may still be non-null to give the function an intrinsic name (useful for diagnostics, and for later query to give clones the same name, e.g.). Brutally shared function objects are often named, so as to be shared via property references in multiple global objects, or otherwise (method names in bound element prototypes created via XBL). But brutal sharing should pass null for |obj|, to avoid creating prototype compilation scope objects (we still make 'em, possibly we could eliminate them now that JS_Compile*Function* allows null |obj|), or at least to avoid naming proto-methods in compilation scopes. So clearly, passing the global object from nsXBLProtoImpl::CompilePrototypeMembers to InitClass is wrong. I think the calling code here should pass null, and DoInitJSClass be tweaked to cope with a null |obj| param. Comments? /be
I'm advocating fixing nsXBLProtoImpl::CompilePrototypeMembers now, even if it doesn't help fix the leak, just to clear the muddy waters a bit. With the pointer stew created via XBL and JS, it's going to be tough to find the cycle created by bz's patch (I still don't see it, but I'll keep staring at that patch). Leaksoup would be good, served nice and hot on the MF's 1GB machine.... /be
Attached patch Fix CompilePrototypeMembers (obsolete) — Splinter Review
I agree that it doesn't make much sense to pass in a non-null scope object to functions we just plan to clone into a different scope. Which makes me wonder why we bother passing a class (return value of JS_InitClass) to all the CompileMember functions in the first place... I suspect it's not really needed. In any case, this patch handles the null-obj case by just skipping the proto-mangling for null obj (while still calling JS_InitClass for that case so the rest of XBL doesn't freak.. but perhaps the right fix is to unfreak all of XBL? Or do we really need the JS_InitClass here?). Applying this together with the other two patches in this bug still doesn't fix the leak. :(
Comment on attachment 158840 [details] [diff] [review] Fix CompilePrototypeMembers Chatted with bz, wanted to record the answer to this question here: > Which makes me wonder why we bother passing a class (return value of > JS_InitClass) to all the CompileMember functions in the first place... I > suspect it's not really needed. JS_InitClass returns the class prototype object. It's needed when applying a binding to give each bound doc (actually, each disconnected prototype chain in a bound doc that runs through the binding being applied) its own prototype holding the binding members. With the advent of brutal sharing, mscott reused InitClass to make a compile-time home for the precompiled binding methods. That was cool, but passing the global object in was not. Pre-emptive r+sr=me. /be
Attachment #158840 - Flags: superreview+
Attachment #158840 - Flags: review+
*** Bug 259276 has been marked as a duplicate of this bug. ***
Comment on attachment 158840 [details] [diff] [review] Fix CompilePrototypeMembers Checked in this part.
Attachment #158840 - Attachment is obsolete: true
So I ran leaksoup. The leak roots contained a lot of things that I'd expect to be leaked from global variables or from things that shut down without releasing them (JS GC roots holding XPCWrappedNative, nsGenericElement's hash holding nsEventListenerManager objects, nsXBLPrototypeBinding's pools). I didn't see anything particularly unusual. So I'd say it seems likely that the problem is related to a cycle rooted by an nsXPCWrappedNative. I'm not sure I'd learn very much useful if I tried not destroying the JS runtime, but maybe I'll try it...
This is just the leak roots section of the leaksoup output.
08ad7ef8 object 0x89df478 XULDocument via nsXPCWrappedJS::mJSObj[nsIDOMEventListener,0x8f37ac8,0x8ecb6b0](Object).tabbox(XULElement).__parent__(XULElement).__parent__(XULElement).__parent__(XULElement).__parent__(XULElement).__parent__(XULElement).__parent__(XULDocument). Probably has something to do with: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/xpfe/global/resources/content/bindings/tabbox.xml&rev=1.33&mark=113-150#113 Could this be new, somehow?
Commenting out the constructor and destructor in tabbox.xml fixes almost all of the leaks.
(And once I do that the few that are left are related to XPCWrappedNativeProto::mJSProtoObject(XPC_WN_ModsAllowed_Proto_JSClass).controllers getter(Function).__proto__(Function).__proto__(Function).__proto__(Object).__parent__(ChromeWindow) .)
The weird thing about tabbox.xml is that the destructor is never called (unless |dump| doesn't work in it).
(The destructor isn't even called if I comment out the content of the constructor.)
And it was called in a nightly before your patch (and wasn't in a nightly after).
When we calll nsXBLProtoImplAnonymousMethod::Execute to run the destructor, mJSMethodObject is null.
Hmm.. there is definitely a problem with XBL destructors.. (see 258921). I'll look into it tonight.
... which is because mUncompiledMethod->mBodyText.GetText() returned null in CompileMethod, so CompileMethod didn't do anything.
Attached patch patchSplinter Review
checked in with verbal r+sr=brendan
One tiny line lost, and we didn't see it reading and re-reading the diff. - has low visual weight, and the colored side-by-side diff tends to pull one's focus to the right. Sorry I missed this, major kudos to dbaron for finding it! /be
*** Bug 259277 has been marked as a duplicate of this bug. ***
Comment on attachment 158594 [details] [diff] [review] Patch to pass aShared == PR_TRUE Well, dbaron's patch has fixed this bug (leaks are back to pre-bug 230816 levels). I still think this is worth landing, though.
Attachment #158594 - Flags: superreview?(brendan)
Attachment #158594 - Flags: review?(brendan)
Comment on attachment 158806 [details] [diff] [review] try this, it might fix the leak And what about this one?
Comment on attachment 158594 [details] [diff] [review] Patch to pass aShared == PR_TRUE This is fine. Let me whip up a better nsJSEnvironment.cpp patch, if I have time right now. /be
Attachment #158594 - Flags: superreview?(brendan)
Attachment #158594 - Flags: superreview+
Attachment #158594 - Flags: review?(brendan)
Attachment #158594 - Flags: review+
Comment on attachment 158594 [details] [diff] [review] Patch to pass aShared == PR_TRUE Checked in.
Attachment #158594 - Attachment is obsolete: true
Resolving fixed (by dbaron's patch). Brendan, it'd be good to see that nsJSEnvironment patch updated; should I file a separate bug for that?
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
bz: please do, it's too much to fit into this bug, and into my brain right now. VERIFYing. /be
Status: RESOLVED → VERIFIED
Bug 262948 filed on that.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: