Last Comment Bug 231384 - global objects don't go away when closing window
: global objects don't go away when closing window
Status: RESOLVED FIXED
[patch]
: mlk
Product: SeaMonkey
Classification: Client Software
Component: General (show other bugs)
: Trunk
: All All
: P1 normal (vote)
: mozilla1.7alpha
Assigned To: David Baron :dbaron: ⌚️UTC-10
:
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2004-01-18 13:01 PST by David Baron :dbaron: ⌚️UTC-10
Modified: 2005-02-12 21:34 PST (History)
8 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
debugging code to deal with nsXPCWrappedJS objects better in GC_MARK_DEBUG (4.77 KB, patch)
2004-01-18 13:05 PST, David Baron :dbaron: ⌚️UTC-10
brendan: review+
bryner: superreview+
Details | Diff | Splinter Review
print GLOBAL += N and GLOBAL -= N (1.51 KB, patch)
2004-01-18 13:05 PST, David Baron :dbaron: ⌚️UTC-10
no flags Details | Diff | Splinter Review
print GLOBAL+ = N and GLOBAL- = N (1.51 KB, patch)
2004-01-18 13:08 PST, David Baron :dbaron: ⌚️UTC-10
no flags Details | Diff | Splinter Review
patch for firebird (1.86 KB, patch)
2004-01-18 13:09 PST, David Baron :dbaron: ⌚️UTC-10
bryner: review+
bugs: superreview+
Details | Diff | Splinter Review
partial patch for suite browser (1.30 KB, patch)
2004-01-18 18:04 PST, David Baron :dbaron: ⌚️UTC-10
bryner: review+
bugs: superreview+
Details | Diff | Splinter Review
patch for cookie overlay (needed for suite browser) (4.18 KB, patch)
2004-01-18 18:21 PST, David Baron :dbaron: ⌚️UTC-10
mvl: review+
bryner: superreview+
Details | Diff | Splinter Review
patch to (again) use prototype's nsIScriptContext for event handler compilation (3.57 KB, patch)
2004-01-18 20:12 PST, David Baron :dbaron: ⌚️UTC-10
brendan: review+
Details | Diff | Splinter Review
patch to (again) use prototype's nsIScriptContext for event handler compilation (4.06 KB, patch)
2004-01-19 10:22 PST, David Baron :dbaron: ⌚️UTC-10
brendan: review+
bryner: superreview+
Details | Diff | Splinter Review
patch for suite editor (2.84 KB, patch)
2004-01-19 11:52 PST, David Baron :dbaron: ⌚️UTC-10
daniel: review+
bryner: superreview+
Details | Diff | Splinter Review
print ++GLOBAL, etc. (3.05 KB, patch)
2004-01-19 16:45 PST, David Baron :dbaron: ⌚️UTC-10
bryner: review+
brendan: superreview+
Details | Diff | Splinter Review

Description David Baron :dbaron: ⌚️UTC-10 2004-01-18 13:01:48 PST
We're leaking in cases where global objects (for chrome, not for the content
window inside) don't go away because JS objects implementing observers weren't
removed.  This causes problems in two ways:

 * if the observers are held by something global (e.g., the pref service), they
never go away

 * if the observers are held by something in the chrome's content, it takes an
extra cycle of GC for them to go away, since they're still rooted until after
the first cycle of GC, when the content is destroyed
Comment 1 David Baron :dbaron: ⌚️UTC-10 2004-01-18 13:05:12 PST
Created attachment 139365 [details] [diff] [review]
debugging code to deal with nsXPCWrappedJS objects better in GC_MARK_DEBUG

This is useful, and it may as well be checked in.  There are too many
nsXPCWrappedJS objects for GC_MARK_DEBUG to be helpful without this.  (The
interface name is a big clue, and the pointer can be correlated with an
XPCOM_MEM_ALLOC_LOG that shows where the wrapper was destroyed.)
Comment 2 David Baron :dbaron: ⌚️UTC-10 2004-01-18 13:05:48 PST
Created attachment 139366 [details] [diff] [review]
print GLOBAL += N and GLOBAL -= N

This is far more important than "WEBSHELL ...".  I think it's worth a printf
for everybody.
Comment 3 David Baron :dbaron: ⌚️UTC-10 2004-01-18 13:08:28 PST
Created attachment 139367 [details] [diff] [review]
print GLOBAL+ = N and GLOBAL- = N

Oops, the + and - should be on the GLOBAL, not the = (like for WEBSHELL).
Comment 4 David Baron :dbaron: ⌚️UTC-10 2004-01-18 13:09:03 PST
Created attachment 139368 [details] [diff] [review]
patch for firebird
Comment 5 David Baron :dbaron: ⌚️UTC-10 2004-01-18 13:44:52 PST
Comment on attachment 139368 [details] [diff] [review]
patch for firebird

At least for the simplest case of opening and closing a window, this patch
makes Firebird destroy the chrome global object promptly after closing a window
(the printf is right next to that for the content window's global object).
Comment 6 David Baron :dbaron: ⌚️UTC-10 2004-01-18 18:04:36 PST
Created attachment 139393 [details] [diff] [review]
partial patch for suite browser

This is partial.  I still need to figure out:

 1. an appropriate place to put a removeObserver call for the addObserver call
at:
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/extensions/cookie/resources/content/cookieTasksOverlay.xul&rev=1.22&mark=95-97#81

(and perhaps a better place for the CookieTasksStartup call).

 2. What's causing:
0889cef0 object 0x8b5c710 ChromeWindow via
nsXULPrototypeAttribute::mEventHandler(Function).__proto__(Function).__proto__(Object).__parent__(ChromeWindow).


 3. Whatever's left after fixing (2).
Comment 7 David Baron :dbaron: ⌚️UTC-10 2004-01-18 18:21:01 PST
Created attachment 139394 [details] [diff] [review]
patch for cookie overlay (needed for suite browser)

This solves comment 6's point (1).
Comment 8 Brendan Eich [:brendan] 2004-01-18 18:36:02 PST
Comment on attachment 139365 [details] [diff] [review]
debugging code to deal with nsXPCWrappedJS objects better in GC_MARK_DEBUG

Doesn't %p include the leading 0x by default on some platforms?  I'm going on
possibly-years-old porting paranoia.

Looks good otherwise.

/be
Comment 9 David Baron :dbaron: ⌚️UTC-10 2004-01-18 19:20:43 PST
It gives the leading 0x on some platforms if you're using the standard library,
but this is the NSPR/JS code. :-)
Comment 10 David Baron :dbaron: ⌚️UTC-10 2004-01-18 20:03:23 PST
So I'm wondering whether (2) is the result of changes in revision 1.458 of
nsXULElement.cpp -- specifically the very first bit of the changes in attachment
120151 [details] [diff] [review] (on bug 195010).  The change to |scopeObject| in that patch seems
intentional, but could removing the |context| variable and always using
|aContext| be the cause of this leak?

Restoring the old code to set |context|, however, leads to more leaks rather
than fewer.  However, it's possible that the new leaks are the result of a cycle
that was previously just something living longer than it should have.
Comment 11 David Baron :dbaron: ⌚️UTC-10 2004-01-18 20:12:59 PST
Created attachment 139398 [details] [diff] [review]
patch to (again) use prototype's nsIScriptContext for event handler compilation

OK, this actually does fix comment 6's issue (2), and gets the suite's browser
in good shape (global objects destroyed when they ought to be).

Ignore the end of my previous comment -- the extra leaks I was seeing were
because I mistakenly substituted |context| for |aContext| for the call to
|BindCompiledEventHandler|.

This essentially reverts a small part of the patch mentioned in my previous
comment, but with assertions and the early returns converted to NS_ENSURE_TRUE
/ NS_ENSURE_SUCCESS so they don't take up too much room.  (This makes some
cases assert that didn't before, but any failure here seems quite serious.)
Comment 12 David Baron :dbaron: ⌚️UTC-10 2004-01-18 20:24:54 PST
Comment on attachment 139394 [details] [diff] [review]
patch for cookie overlay (needed for suite browser)

I don't think there should be any harm in running this code later (onload of
the XUL document rather than when the script is first executed), should there?
Comment 13 Brendan Eich [:brendan] 2004-01-18 21:16:41 PST
Comment on attachment 139398 [details] [diff] [review]
patch to (again) use prototype's nsIScriptContext for event handler compilation


>+        NS_ENSURE_TRUE(xuldoc, NS_ERROR_UNEXPECTED);

Waterson was pretty dead-set against NS_ENSURE* and its hidden returns, but
maybe since he's passed the torch, XUL content code should follow other content
code?  I don't care enough to fight.

>+        nsCOMPtr<nsIXULPrototypeDocument> protodoc;
>+        rv = xuldoc->GetMasterPrototype(getter_AddRefs(protodoc));
>+        NS_ENSURE_SUCCESS(rv, rv);
>+        NS_ENSURE_TRUE(protodoc, NS_ERROR_UNEXPECTED);
>+
>+        nsCOMPtr<nsIScriptGlobalObjectOwner> globalOwner =
>+            do_QueryInterface(protodoc);
>+        nsCOMPtr<nsIScriptGlobalObject> global;
>+        globalOwner->GetScriptGlobalObject(getter_AddRefs(global));
>+        NS_ENSURE_TRUE(global, NS_ERROR_UNEXPECTED);
>+
>+        rv = global->GetContext(getter_AddRefs(context));
>+        NS_ENSURE_SUCCESS(rv, rv);

So with aContext instead of this context, you get a leaked chrome window due to
a __proto__.__proto__.__parent__ reference path?  And with the special master
prototype context, you get no such leak?  What XPCOM reference completes the
cycle?	Sorry if I'm not keeping up as well as I should be.

>     // Compile the event handler
>-    rv = aContext->CompileEventHandler(scopeObject, aName, aBody,
>+    rv = context->CompileEventHandler(scopeObject, aName, aBody,
>                                        aURL, aLineNo, !scopeObject,
>                                        aHandler);

Indentation off-by-one on argument overflow lines, now.

/be
Comment 14 David Baron :dbaron: ⌚️UTC-10 2004-01-18 21:32:15 PST
I don't actually see a leak, strictly speaking.  It's just the global object for
the window doesn't get destroyed until shutdown (presumably the first GC after
the prototype document is destroyed).  Since the code at the end of this
function is storing the compiled event handler in the prototype document and
rooting it, we want that root not to entrain things in real (non-prototype)
documents that we want to completely go away.

I already fixed the indentation in my tree.
Comment 15 Brendan Eich [:brendan] 2004-01-19 10:02:21 PST
dbaron: thanks, this all makes sense now that I've slept a few hours.

The null scopeObject optimization I did relieves the prototype document stuff
from having to make a XULPDGlobalObject for every XULPrototypeDocument.  But, if
we use the wrong (delegating, XULDocument) context to compile with a null scope,
the JS engine has no other source of Function.__proto__ than to look in
cx->globalObjectfor Function.prototype.

That ties the shared handler function to the specific chrome window, via the
chrome window's cx.  So my change was overaggressive without thinking through
the null scope case in detail.  Thanks again for fixing this,

/be
Comment 16 Brendan Eich [:brendan] 2004-01-19 10:03:08 PST
Comment on attachment 139398 [details] [diff] [review]
patch to (again) use prototype's nsIScriptContext for event handler compilation

r=me, but maybe a comment along the lines of my last one in this bug would help
future maintainers?

/be
Comment 17 Brendan Eich [:brendan] 2004-01-19 10:05:32 PST
My mock JS expressions had an error, using Function for the new shared event
handler function -- and for verbatim 'Function', the function object class
constructor:

"the JS engine has no other source of Function.__proto__ than to look in
cx->globalObjectfor Function.prototype."

should have read:

"the JS engine has no other source of <the-new-shared-event-handler>.__proto__
than to look in cx->globalObjectfor Function.prototype."

I hope this is clearer.

/be



Comment 18 David Baron :dbaron: ⌚️UTC-10 2004-01-19 10:22:50 PST
Created attachment 139426 [details] [diff] [review]
patch to (again) use prototype's nsIScriptContext for event handler compilation

With added comment and fixed indentation.
Comment 19 Brendan Eich [:brendan] 2004-01-19 11:48:08 PST
Comment on attachment 139426 [details] [diff] [review]
patch to (again) use prototype's nsIScriptContext for event handler compilation

>+    nsCOMPtr<nsIScriptContext> context;
>     if (mPrototype) {
>         // It'll be shared among the instances of the prototype.
>         // Use null for the scope object when precompiling shared
>         // prototype scripts.
>         scopeObject = nsnull;
>+
>+        // Use the prototype document's special context.  

Add "Because scopeObject is null, " before

"The JS engine has no" etc. here.

r=me again.

/be
Comment 20 David Baron :dbaron: ⌚️UTC-10 2004-01-19 11:52:38 PST
Created attachment 139434 [details] [diff] [review]
patch for suite editor
Comment 21 David Baron :dbaron: ⌚️UTC-10 2004-01-19 11:59:49 PST
Comment on attachment 139434 [details] [diff] [review]
patch for suite editor

It looks like the same patch will apply to standalone composer.
Comment 22 David Baron :dbaron: ⌚️UTC-10 2004-01-19 12:01:32 PST
FWIW, the message compose window seems to be OK, and the main mail window seems
to be being held, at least for a bit (if not permanently) by an nsXPCWrappedJS
implementing nsISupportsWeakReference, which makes it a bit harder to track down
(since it's destroyed by the GC, not by an XPCOM release).
Comment 23 Daniel Glazman (:glazou) 2004-01-19 12:19:52 PST
Comment on attachment 139434 [details] [diff] [review]
patch for suite editor

r=daniel@glazman.org
Comment 24 David Baron :dbaron: ⌚️UTC-10 2004-01-19 12:35:05 PST
The mailnews main window issue is only that it takes one extra cycle of GC, and
the variable causing the problem is
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/mailnews/extensions/smime/resources/content/msgHdrViewSMIMEOverlay.js&rev=1.12&mark=36-119,160#36

The strange thing is that something holds a strong reference to the wrapper
(refcount of 2 or more, so it still holds a GC root) after all the mail window
webshells have been destroyed.  I suspect something like the second case in
comment 0 -- a strong pointer held by something that's not destroyed until the
first GC happens.
Comment 25 David Baron :dbaron: ⌚️UTC-10 2004-01-19 12:52:14 PST
And never mind the stuff about nsISupportsWeakReference -- it's released from JS
GC because the thing that owns it is an XPCWrappedNative -- it gets
double-wrapped for going through an XPCOM setter/getter.  But even that doesn't
explain why I'm not seeing any reference counting of the object from native code
(since the setter should AddRef) -- unless XPConnect is tricking the stack
walking code into skipping a frame.
Comment 26 David Baron :dbaron: ⌚️UTC-10 2004-01-19 16:45:38 PST
Created attachment 139459 [details] [diff] [review]
print ++GLOBAL, etc.
Comment 27 Ben Goodger (use ben at mozilla dot org for email) 2004-01-21 17:34:44 PST
Comment on attachment 139368 [details] [diff] [review]
patch for firebird

sr=ben@mozilla.org
Comment 28 Ben Goodger (use ben at mozilla dot org for email) 2004-01-21 17:35:25 PST
Comment on attachment 139393 [details] [diff] [review]
partial patch for suite browser

sr=ben@mozilla.org
Comment 29 David Baron :dbaron: ⌚️UTC-10 2004-01-21 20:57:49 PST
I've checked in everything except for the cookie overlay fix (waiting for review).
Comment 30 alanjstr 2004-01-22 11:14:09 PST
Are these important enough to make it into FB 0.8 branch, too?
Comment 31 David Baron :dbaron: ⌚️UTC-10 2004-01-29 17:43:22 PST
All patches now checked in.

Note You need to log in before you can comment on or make changes to this bug.