Closed Bug 231384 Opened 21 years ago Closed 21 years ago

global objects don't go away when closing window

Categories

(SeaMonkey :: General, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.7alpha

People

(Reporter: dbaron, Assigned: dbaron)

Details

(Keywords: memory-leak, Whiteboard: [patch])

Attachments

(7 files, 3 obsolete files)

4.77 KB, patch
brendan
: review+
bryner
: superreview+
Details | Diff | Splinter Review
1.86 KB, patch
bryner
: review+
Details | Diff | Splinter Review
1.30 KB, patch
bryner
: review+
Details | Diff | Splinter Review
4.18 KB, patch
mvl
: review+
bryner
: superreview+
Details | Diff | Splinter Review
4.06 KB, patch
brendan
: review+
bryner
: superreview+
Details | Diff | Splinter Review
2.84 KB, patch
glazou
: review+
bryner
: superreview+
Details | Diff | Splinter Review
3.05 KB, patch
bryner
: review+
brendan
: superreview+
Details | Diff | Splinter Review
We're leaking in cases where global objects (for chrome, not for the content window inside) don't go away because JS objects implementing observers weren't removed. This causes problems in two ways: * if the observers are held by something global (e.g., the pref service), they never go away * if the observers are held by something in the chrome's content, it takes an extra cycle of GC for them to go away, since they're still rooted until after the first cycle of GC, when the content is destroyed
This is useful, and it may as well be checked in. There are too many nsXPCWrappedJS objects for GC_MARK_DEBUG to be helpful without this. (The interface name is a big clue, and the pointer can be correlated with an XPCOM_MEM_ALLOC_LOG that shows where the wrapper was destroyed.)
This is far more important than "WEBSHELL ...". I think it's worth a printf for everybody.
Status: NEW → ASSIGNED
Keywords: mlk
Priority: -- → P1
Target Milestone: --- → mozilla1.7alpha
Oops, the + and - should be on the GLOBAL, not the = (like for WEBSHELL).
Attachment #139366 - Attachment is obsolete: true
OS: Linux → All
Hardware: PC → All
Whiteboard: [patch]
Comment on attachment 139368 [details] [diff] [review] patch for firebird At least for the simplest case of opening and closing a window, this patch makes Firebird destroy the chrome global object promptly after closing a window (the printf is right next to that for the content window's global object).
This is partial. I still need to figure out: 1. an appropriate place to put a removeObserver call for the addObserver call at: http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/extensions/cookie/resources/content/cookieTasksOverlay.xul&rev=1.22&mark=95-97#81 (and perhaps a better place for the CookieTasksStartup call). 2. What's causing: 0889cef0 object 0x8b5c710 ChromeWindow via nsXULPrototypeAttribute::mEventHandler(Function).__proto__(Function).__proto__(Object).__parent__(ChromeWindow). 3. Whatever's left after fixing (2).
Comment on attachment 139365 [details] [diff] [review] debugging code to deal with nsXPCWrappedJS objects better in GC_MARK_DEBUG Doesn't %p include the leading 0x by default on some platforms? I'm going on possibly-years-old porting paranoia. Looks good otherwise. /be
Attachment #139365 - Flags: review?(brendan) → review+
It gives the leading 0x on some platforms if you're using the standard library, but this is the NSPR/JS code. :-)
So I'm wondering whether (2) is the result of changes in revision 1.458 of nsXULElement.cpp -- specifically the very first bit of the changes in attachment 120151 [details] [diff] [review] (on bug 195010). The change to |scopeObject| in that patch seems intentional, but could removing the |context| variable and always using |aContext| be the cause of this leak? Restoring the old code to set |context|, however, leads to more leaks rather than fewer. However, it's possible that the new leaks are the result of a cycle that was previously just something living longer than it should have.
OK, this actually does fix comment 6's issue (2), and gets the suite's browser in good shape (global objects destroyed when they ought to be). Ignore the end of my previous comment -- the extra leaks I was seeing were because I mistakenly substituted |context| for |aContext| for the call to |BindCompiledEventHandler|. This essentially reverts a small part of the patch mentioned in my previous comment, but with assertions and the early returns converted to NS_ENSURE_TRUE / NS_ENSURE_SUCCESS so they don't take up too much room. (This makes some cases assert that didn't before, but any failure here seems quite serious.)
Attachment #139398 - Attachment description: patch to (again) use prototype's nsIScriptContext for event compilation → patch to (again) use prototype's nsIScriptContext for event handler compilation
Comment on attachment 139394 [details] [diff] [review] patch for cookie overlay (needed for suite browser) I don't think there should be any harm in running this code later (onload of the XUL document rather than when the script is first executed), should there?
Attachment #139394 - Flags: review?(dwitte)
Comment on attachment 139398 [details] [diff] [review] patch to (again) use prototype's nsIScriptContext for event handler compilation >+ NS_ENSURE_TRUE(xuldoc, NS_ERROR_UNEXPECTED); Waterson was pretty dead-set against NS_ENSURE* and its hidden returns, but maybe since he's passed the torch, XUL content code should follow other content code? I don't care enough to fight. >+ nsCOMPtr<nsIXULPrototypeDocument> protodoc; >+ rv = xuldoc->GetMasterPrototype(getter_AddRefs(protodoc)); >+ NS_ENSURE_SUCCESS(rv, rv); >+ NS_ENSURE_TRUE(protodoc, NS_ERROR_UNEXPECTED); >+ >+ nsCOMPtr<nsIScriptGlobalObjectOwner> globalOwner = >+ do_QueryInterface(protodoc); >+ nsCOMPtr<nsIScriptGlobalObject> global; >+ globalOwner->GetScriptGlobalObject(getter_AddRefs(global)); >+ NS_ENSURE_TRUE(global, NS_ERROR_UNEXPECTED); >+ >+ rv = global->GetContext(getter_AddRefs(context)); >+ NS_ENSURE_SUCCESS(rv, rv); So with aContext instead of this context, you get a leaked chrome window due to a __proto__.__proto__.__parent__ reference path? And with the special master prototype context, you get no such leak? What XPCOM reference completes the cycle? Sorry if I'm not keeping up as well as I should be. > // Compile the event handler >- rv = aContext->CompileEventHandler(scopeObject, aName, aBody, >+ rv = context->CompileEventHandler(scopeObject, aName, aBody, > aURL, aLineNo, !scopeObject, > aHandler); Indentation off-by-one on argument overflow lines, now. /be
I don't actually see a leak, strictly speaking. It's just the global object for the window doesn't get destroyed until shutdown (presumably the first GC after the prototype document is destroyed). Since the code at the end of this function is storing the compiled event handler in the prototype document and rooting it, we want that root not to entrain things in real (non-prototype) documents that we want to completely go away. I already fixed the indentation in my tree.
dbaron: thanks, this all makes sense now that I've slept a few hours. The null scopeObject optimization I did relieves the prototype document stuff from having to make a XULPDGlobalObject for every XULPrototypeDocument. But, if we use the wrong (delegating, XULDocument) context to compile with a null scope, the JS engine has no other source of Function.__proto__ than to look in cx->globalObjectfor Function.prototype. That ties the shared handler function to the specific chrome window, via the chrome window's cx. So my change was overaggressive without thinking through the null scope case in detail. Thanks again for fixing this, /be
Comment on attachment 139398 [details] [diff] [review] patch to (again) use prototype's nsIScriptContext for event handler compilation r=me, but maybe a comment along the lines of my last one in this bug would help future maintainers? /be
Attachment #139398 - Flags: review?(brendan) → review+
My mock JS expressions had an error, using Function for the new shared event handler function -- and for verbatim 'Function', the function object class constructor: "the JS engine has no other source of Function.__proto__ than to look in cx->globalObjectfor Function.prototype." should have read: "the JS engine has no other source of <the-new-shared-event-handler>.__proto__ than to look in cx->globalObjectfor Function.prototype." I hope this is clearer. /be
Comment on attachment 139426 [details] [diff] [review] patch to (again) use prototype's nsIScriptContext for event handler compilation >+ nsCOMPtr<nsIScriptContext> context; > if (mPrototype) { > // It'll be shared among the instances of the prototype. > // Use null for the scope object when precompiling shared > // prototype scripts. > scopeObject = nsnull; >+ >+ // Use the prototype document's special context. Add "Because scopeObject is null, " before "The JS engine has no" etc. here. r=me again. /be
Attachment #139426 - Flags: review+
Comment on attachment 139434 [details] [diff] [review] patch for suite editor It looks like the same patch will apply to standalone composer.
Attachment #139434 - Flags: review?(daniel)
FWIW, the message compose window seems to be OK, and the main mail window seems to be being held, at least for a bit (if not permanently) by an nsXPCWrappedJS implementing nsISupportsWeakReference, which makes it a bit harder to track down (since it's destroyed by the GC, not by an XPCOM release).
The mailnews main window issue is only that it takes one extra cycle of GC, and the variable causing the problem is http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/mailnews/extensions/smime/resources/content/msgHdrViewSMIMEOverlay.js&rev=1.12&mark=36-119,160#36 The strange thing is that something holds a strong reference to the wrapper (refcount of 2 or more, so it still holds a GC root) after all the mail window webshells have been destroyed. I suspect something like the second case in comment 0 -- a strong pointer held by something that's not destroyed until the first GC happens.
And never mind the stuff about nsISupportsWeakReference -- it's released from JS GC because the thing that owns it is an XPCWrappedNative -- it gets double-wrapped for going through an XPCOM setter/getter. But even that doesn't explain why I'm not seeing any reference counting of the object from native code (since the setter should AddRef) -- unless XPConnect is tricking the stack walking code into skipping a frame.
Attachment #139459 - Flags: superreview?(brendan)
Attachment #139459 - Flags: review?(bryner)
Attachment #139459 - Flags: superreview?(brendan) → superreview+
Attachment #139459 - Flags: review?(bryner) → review+
Attachment #139434 - Flags: superreview?(bryner) → superreview+
Attachment #139426 - Flags: superreview?(bryner) → superreview+
Attachment #139393 - Flags: review?(bryner) → review+
Attachment #139368 - Flags: review?(bryner) → review+
Attachment #139365 - Flags: superreview?(bryner) → superreview+
Attachment #139368 - Flags: superreview?(bugs) → superreview+
Comment on attachment 139393 [details] [diff] [review] partial patch for suite browser sr=ben@mozilla.org
Attachment #139393 - Flags: superreview?(bugs) → superreview+
I've checked in everything except for the cookie overlay fix (waiting for review).
Are these important enough to make it into FB 0.8 branch, too?
Attachment #139394 - Flags: review?(dwitte) → review?(mvl)
Attachment #139394 - Flags: review?(mvl) → review+
Attachment #139394 - Flags: superreview?(bryner) → superreview+
All patches now checked in.
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Product: Browser → Seamonkey
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: