Closed Bug 523885 Opened 15 years ago Closed 14 years ago

Tabbrowser handling of window.close possibly leak-prone

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
major

Tracking

()

RESOLVED FIXED
mozilla1.9.3a3
Tracking Status
blocking2.0 --- beta1+
status1.9.2 --- ?
status1.9.1 --- ?

People

(Reporter: bzbarsky, Assigned: mozilla+ben)

References

(Blocks 1 open bug)

Details

(Keywords: memory-leak)

Attachments

(3 files, 12 obsolete files)

22.73 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
11.32 KB, patch
Details | Diff | Splinter Review
12.22 KB, patch
jst
: review+
Details | Diff | Splinter Review
dbaron and I were debugging some mochitest failures today, and found that they were due to us running up against our "max open spam popups" limit.  And the reason we were getting there is that while the test was calling window.close() on its popups, tabbrowser cancels the DOMWindowClose event and destroys the relevant tab.  But the result is that we don't call ReallyCloseWindow from nsGlobalWindow::Close, and in particular do not call CleanUp.  It ends up being called from the destructor, if we ever get there.  Whether and when we do depends on the cycle collector.

Ideally, in this case we'd detect that tabbrowser really is closing us and at least set a termination func or post a runnable to call CleanUp.  Not sure whether it's best to do this entirely in the nsGlobalWindow code or with some tabbrowser cooperation.  The former might be nicer in terms of making more consumers work....
Severity: normal → major
Keywords: mlk
OS: Mac OS X → All
Hardware: x86 → All
I think we should probably look into fixing this for 1.9.2.
Flags: blocking1.9.2?
bnewman's work in bug 260264 may be relevant here.
To be clear, the popup thing just pointed out the lack of CleanUp calls.  That lack is the real problem here.
Don't think this blocks release, would take a safe patch but doesn't feel (from my understanding of comment 0) like it's something we absolutely must do in order to release.
Flags: blocking1.9.2? → blocking1.9.2-
My patch for bug 260264 moves the reclamation of gOpenPopupSpamCount from nsGlobalWindow::CleanUp to nsGlobalWindow::SetDocShell. Specifically, it decrements the count when the docshell is set to null, which happens whenever a tab appears to close, whether the user clicked the x button or script called window.close.

This patch contains just that change, without the other popup blocker modifications.  This will safely mitigate the mochitest failures due to non-deterministic popup limiting.

Larger question: could we simply call CleanUp when SetDocShell is called with null?
Blocks: 260264
(In reply to comment #5)
> This patch contains just that change, without the other popup blocker
> modifications.  This will safely mitigate the mochitest failures due to
> non-deterministic popup limiting.

We've already mitigated that by setting the popup limiting pref in the test profile.

The bigger issue here is that failure to call CleanUp can lead to leaks.  We should call CleanUp at close time, not from GC.
I think this bug has one of the highest risks of user-perceivable leaks of any Gecko bugs I've seen for quite a while, and should be a high priority.
blocking2.0: --- → ?
Comment on attachment 409153 [details] [diff] [review]
Immediately reclaim popup spam count.

Obsolete due to
https://bugzilla.mozilla.org/show_bug.cgi?id=260264#c76
Attachment #409153 - Attachment is obsolete: true
Ben, wanna give that a shot (calling CleanUp() from SetDocShell())?
Assignee: nobody → bnewman
blocking2.0: ? → beta1
I went through each of nsGlobalWindow's data members (including those inherited), traced their lifetimes, paying close attention to how they needed to be reclaimed, and rewrote the CleanUp() method to meticulously handle each member properly.  I then eliminated code from various other CleanUp()ish methods by having them call CleanUp() instead.  CleanUp() is idempotent, so calling it more than once is not a problem.

Waiting for another round of tryserver results (http://hg.mozilla.org/try/rev/34177a9cf92d).  I'll request review if all the tests pass.
Attachment #425091 - Attachment is obsolete: true
Attached patch Right patch this time. (obsolete) — Splinter Review
Correct file path (~/dev/debug/.hg/patches/bug-523885.diff), wrong computer!
This patch reflects a different approach.  Instead of trying to determine the best way of handling each data member, I've merely consolidated the main cleanup routines, and the result appears more reliable.

I'm still failing the assertion on mCleanedUp just before the CleanUpInternal() call in the destructor, so I won't request review just yet.
Attachment #425092 - Attachment is obsolete: true
Blocks: 543528
Guaranteeing that the gOpenPopupSpamCount gets reclaimed as soon as a window is closed (browser windows and tabs alike) has the side-effect of fixing bug 543528.

Removed the offending assertion I mentioned in comment 12, since nsGlobalWindow can't really enforce the guarantee that SetDocShell(nsnull) will be called before ~nsGlobalWindow() or ReallyCloseWindow().

If tryserver results are good, I'll ask for review.
Attachment #425598 - Attachment is obsolete: true
Attachment #425610 - Flags: superreview?(jst)
Attachment #425610 - Flags: review?(bzbarsky)
Comment on attachment 425610 [details] [diff] [review]
Reclaim popup spam count synchronously when posting nsCloseEvents.

I'm satisfied that tryserver mochitest failures are not my fault.

bz, jst, feel free to bump review to someone else if you're swamped.
So... one of those failures was my fault, after all; specifically

  content/html/document/test/test_bug391777.html

which relies on a modal dialog setting its window.returnValue.  Killing inner windows too early interferes with access to this value just after the dialog has closed.
Attachment #425610 - Attachment is obsolete: true
Attachment #426105 - Flags: superreview?(jst)
Attachment #426105 - Flags: review?(bzbarsky)
Attachment #425610 - Flags: superreview?(jst)
Attachment #425610 - Flags: review?(bzbarsky)
Comment on attachment 426105 [details] [diff] [review]
Don't call CleanUp() on inner windows just because an outer window is dying.

1)  The new FreeInnerObjects callsite on the inners is after we've nulled out
    mContext.  That seems wrong; FreeInnerObjects uses the context.  r- due to
    this.
2)  Why do we no longer set mOuterWindow to null on the inner windows?
3)  Since we're not calling CleanUp on the inners anymore... are we sure the
    inners won't end up with dangling pointers?
4)  We have some redundant nulling out of mArguments*, I think.
Attachment #426105 - Flags: review?(bzbarsky) → review-
Releasing mContext later, nulling out w->mOuterWindow, and avoiding cleaning up outer windows whose inner window is a modal dialog.  Next thought: have inner modal dialogs call CleanUp() against their outer windows?
Attachment #426105 - Attachment is obsolete: true
Attachment #426105 - Flags: superreview?(jst)
Can one not navigate in a ShowModalDialog window?
CleanUp may be invoked against a modal dialog from SetDocShell or ReallyCloseWindow before the dialog's returnValue has been examined, because the event loop spins while the dialog is open.

With this patch, CleanUp checks whether it has been called on a modal dialog, and, if so, it sets mCallCleanUpAfterModalDialogCloses to PR_TRUE so that ShowModalDialog will know that it should call CleanUp after examining the dialog's returnValue.

Calling CleanUp on an outer window now calls CleanUp on the current inner window, as used to happen before any of my patches, and all mOuterWindow and mInnerWindow pointers are now nulled out as they used to be.  I've also moved the release of mContext to the point corresponding to where it used to be called in the old version of CleanUp, i.e. after FreeInnerObjects.

These reversions make the unification of CleanUp code more nearly a matter of copying and pasting, which makes it easier to argue that the only change in functionality is how early CleanUp is called.
Attachment #426780 - Attachment is obsolete: true
Attachment #427211 - Flags: review?(bzbarsky)
Forgot to initialize a member.  Doesn't seem worthy of a new patch.

diff --git a/dom/base/nsGlobalWindow.cpp b/dom/base/nsGlobalWindow.cpp
--- a/dom/base/nsGlobalWindow.cpp
+++ b/dom/base/nsGlobalWindow.cpp
@@ -666,8 +666,9 @@ nsGlobalWindow::nsGlobalWindow(nsGlobalW
 #ifdef DEBUG
     , mSetOpenerWindowCalled(PR_FALSE)
 #endif
     , mCleanedUp(PR_FALSE)
+    , mCallCleanUpAfterModalDialogCloses(PR_FALSE)
 {
   memset(mScriptGlobals, 0, sizeof(mScriptGlobals));
   nsLayoutStatics::AddRef();
Ben, do you think you can put up an interdiff from what I reviewed before to what the patch looks like now?
(In reply to comment #21)
> Ben, do you think you can put up an interdiff from what I reviewed before to
> what the patch looks like now?

Certainly.
Comment on attachment 427211 [details] [diff] [review]
Patch that defers CleanUp of modal dialogs.

>-      mOuterWindow = nsnull;

Might be a good idea to keep that, even if we didn't use to have it before.  Just to be safe.

r=bzbarsky with that.  Thank you for the interdiff!
Attachment #427211 - Flags: review?(bzbarsky) → review+
Comment on attachment 427211 [details] [diff] [review]
Patch that defers CleanUp of modal dialogs.

Thanks for your time and your advice, bz!
Attachment #427211 - Flags: superreview?(jst)
Somehow I get a crash when using this patch and have DEBUG_CC defined
(in nsCycleCollectionParticipant.h and nsCycleCollection.h)
s/nsCycleCollection/nsCycleCollector.h/
(In reply to comment #25)
> Somehow I get a crash when using this patch and have DEBUG_CC defined
Oh, and the page for this is gmail. (I was testing bug 526897)
(In reply to comment #25)
> Somehow I get a crash when using this patch and have DEBUG_CC defined
> (in nsCycleCollectionParticipant.h and nsCycleCollection.h)

So, I don't pretend to know enough about DEBUG_CC to know whether this fix is well-advised, but I can reliably prevent the crash by making the following changes:

diff --git a/dom/base/nsGlobalWindow.cpp b/dom/base/nsGlobalWindow.cpp
--- a/dom/base/nsGlobalWindow.cpp
+++ b/dom/base/nsGlobalWindow.cpp
@@ -1017,12 +1017,8 @@ nsGlobalWindow::CleanUp(PRBool aIgnoreMo
     mArgumentsLast = nsnull;
     mArgumentsOrigin = nsnull;
 
     CleanupCachedXBLHandlers(this);
-
-    #ifdef DEBUG
-      nsCycleCollector_DEBUG_shouldBeFreed(static_cast<nsIScriptGlobalObject*>(this));
-    #endif
   }
   // </CleanUp()>
 
   // <~nsGlobalWindow()>
@@ -1171,12 +1167,8 @@ nsGlobalWindow::FreeInnerObjects(PRBool 
     mDummyJavaPluginOwner = nsnull;
   }
 
   CleanupCachedXBLHandlers(this);
-
-#ifdef DEBUG
-  nsCycleCollector_DEBUG_shouldBeFreed(static_cast<nsIScriptGlobalObject*>(this));
-#endif
 }
 
 //*****************************************************************************
 // nsGlobalWindow::nsISupports
After a short and very convincing conversation with peterv.

smaug, r?ing you just to find out if this version of the patch fixes your crash (no need to review the patch itself).
Attachment #427670 - Attachment is obsolete: true
Attachment #428305 - Flags: superreview?(jst)
Attachment #428305 - Flags: review?(Olli.Pettay)
Attachment #428305 - Attachment description: Fixing smaug's DEBUG_CC the right way. → Fixing smaug's DEBUG_CC crash the right way.
Attachment #427211 - Flags: superreview?(jst)
Comment on attachment 428305 [details] [diff] [review]
Fixing smaug's DEBUG_CC crash the right way.

Looks good.
Attachment #428305 - Flags: superreview?(jst) → superreview+
Comment on attachment 428305 [details] [diff] [review]
Fixing smaug's DEBUG_CC crash the right way.

At least it doesn't crash
Attachment #428305 - Flags: review?(Olli.Pettay) → review+
Pushed to mozilla-central:
http://hg.mozilla.org/mozilla-central/rev/c3b327ad051a

Thanks again for the reviews, bz, jst, and smaug!
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Attachment #427211 - Attachment is obsolete: true
Target Milestone: --- → mozilla1.9.3a2
Had to back this patch out this morning because of a mochitest failure:
http://hg.mozilla.org/mozilla-central/rev/edf7a7be7ad8

Sending a more robust patch to the tryserver as we speak.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch Patch fixing failing mochitests. (obsolete) — Splinter Review
I'll request review once this patch clears the tryserver:
http://hg.mozilla.org/try/rev/f0885e2c31b7
Attachment #428305 - Attachment is obsolete: true
For ease of incremental reviewing.
Did it pass on Try?
Status: REOPENED → ASSIGNED
Can we get this landed? Bug 526897 is very critical.
Blocks: 526897
(In reply to comment #37)
> Did it pass on Try?

Apparently not; automation.py still seems to be crashing, which leaks the world, but no individual mochitests are failing any more.

(In reply to comment #38)
> Can we get this landed? Bug 526897 is very critical.

I'm going to request review anyway, and possibly land on mozilla-central in spite of the tryserver not being completely happy with it.  I'll back it out right away if it seems to be contributing to test failures.

Is that an even remotely sane plan?
You can't reproduce the automation.py crash, presumably? What information can we get about that crash?

Requesting review definitely makes sense. Landing on m-c, I'm not sure...
This patch represents my best efforts to make the tryserver happy. bz, feel free to reassign review.
Attachment #429187 - Attachment is obsolete: true
Attachment #430822 - Flags: review?(bzbarsky)
Changes since the last reviewed version of the patch.
Attachment #429189 - Attachment is obsolete: true
Ben, can you describe what that last patch is really changing?
Comment on attachment 430823 [details] [diff] [review]
Latest interdiff.

(In reply to comment #43)
> Ben, can you describe what that last patch is really changing?

Glad to.

In one mochitest a security check was failing because it tried to access an inner window of a window that had already been cleaned up, so I added this check earlier to make sure the window had not yet been cleaned up (or otherwise closed):

>@@ -4336,23 +4336,27 @@ nsDOMClassInfo::PostCreatePrototype(JSCo
...
>   nsGlobalWindow *win = nsGlobalWindow::FromSupports(globalNative);
>+  if (win->IsClosedOrClosing()) {
>+    return NS_OK;
>+  }
>   if (win->IsOuterWindow()) {
>     // XXXjst: Do security checks here when we remove the security
>     // checks on the inner window.
> 
>     win = win->GetCurrentInnerWindowInternal();
> 
>-    if (!win || !(global = win->GetGlobalJSObject())) {
>+    if (!win || !(global = win->GetGlobalJSObject()) ||
>+        win->IsClosedOrClosing()) {
>       return NS_OK;
>     }
>   }

I moved some code out of CleanUp that used to be in SetDocShell (and that executed whenever SetDocShell was called with aDocShell == null):

>-  // <SetDocShell(nsnull)>
>-  {
>-    NS_ASSERTION(!mDocShell, "Should already have nulled out mDocShell.");
>-
>-    MaybeForgiveSpamCount();
...
>-  // </SetDocShell(nsnull)>

and decomposed it into a method called CleanUpAfterDocShell:

>+  // Just in case we didn't do this before.
>+  CleanUpAfterDocShell();

I thought we might be calling nsCycleCollector_DEBUG_shouldBeFreed(mContext) when we weren't sure yet that the mContext needed to be released, so I took out that call.  This change kept some mochitests from crashing locally, but didn't seem to have much effect on the tryserver, unfortunately.
 
>-    if (mContext) {
>-#ifdef DEBUG
>-      nsCycleCollector_DEBUG_shouldBeFreed(mContext);
>-#endif
>-      mContext = nsnull;          // Forces Release
>-    }
>+    mContext = nsnull;

The decomposed method I mentioned earlier:

> void
>+nsGlobalWindow::CleanUpAfterDocShell()
>+{
>+  if (mCleanedUpAfterDocShell)
>+    return;
>+  mCleanedUpAfterDocShell = PR_TRUE;
...
>+}

In another attempt to be more conservative about when CleanUp gets called, I decided to call ReallyCloseWindow (which calls CleanUp) from SetDocShell only when !aDocShell and IsOuterWindow().  Since SetDocShell originally performed the CleanUpAfterDocShell operations whenever aDocShell was null, I call that method here as well, even if !IsOuterWindow().

>+void
> nsGlobalWindow::SetDocShell(nsIDocShell* aDocShell)
> {
...
>   if (!mDocShell) {
>+    CleanUpAfterDocShell();
>     NS_ASSERTION(!mCleanedUp, "Should be calling CleanUp for the first time.");
>-    CleanUp(PR_FALSE);
>+    if (IsOuterWindow())
>+      ReallyCloseWindow();

In order to make some failing postMessage mochitests pass, I now decline to post messages when the target has already been cleaned up:

> PostMessageEvent::Run()
> {
...
>-  nsRefPtr<nsGlobalWindow> targetWindow =
>-    mTargetWindow->GetCurrentInnerWindowInternal();
>-  if (!targetWindow)
>+  nsRefPtr<nsGlobalWindow> targetWindow;
>+  if (mTargetWindow->IsClosedOrClosing() ||
>+      !(targetWindow = mTargetWindow->GetCurrentInnerWindowInternal()) ||
>+      targetWindow->IsClosedOrClosing())
>     return NS_OK;

I'm now calling CleanUp unconditionally from ReallyCloseWindow, so that calling ReallyCloseWindow in SetDocShell guarantees that the window will be cleaned up:

>     }
>-
>-    CleanUp(PR_FALSE);
>-  }
>+  }
>+
>+  CleanUp(PR_FALSE);

This is the method that allows me to check whether a window has been cleaned up already:

>+  PRBool IsClosedOrClosing() {
>+    return (mIsClosed ||
>+            mInClose ||
>+            mHavePendingClose ||
>+            mCleanedUp);
>+  }
I'm out of town until Friday, and I won't have access to a development environment before then, but my next experiment will be to focus on the memory leaks and save the more ambitious code consolidation for later.  To that end, I will

1. Make CleanUp idempotent using the mCleanedUp mechanism.
2. Add the modal dialog check to CleanUp, so that we don't clean up until the modal dialog closes.
3. Call the orginal version of CleanUp from SetDocShell when aDocShell == null.
4. Keep the IsClosedOrClosing checks to avoid posting messages to cleaned up windows and failing the security check I mentioned in my previous comment.

This more conservative approach should strictly improve memory usage and avoid the mochitest failures that I've seen so far.  I think the tryserver failures I´m seeing have more to do with the code consolidation that I've been attempting than they do with calling CleanUp earlier.
Attachment #430822 - Flags: review?(bzbarsky) → review+
Comment on attachment 430822 [details] [diff] [review]
Current version of the patch.

r=me, fwiw
The changes in this bug are really a subset of the changes in previously reviewed patches, the difference being that I'm no longer trying to consolidate code from SetDocShell, ~nsGlobalWindow, and CleanUp into a single method.

As I hoped, the tryserver unit tests went all green for these changes (http://hg.mozilla.org/try/rev/65bb9c586882), so it looks like the problems were indeed introduced by the refactoring.

I'll file a followup bug to revisit the consolidation effort, but I'd like to get this safer patch landed asap, to see if it helps with bug 526897.
Attachment #432147 - Flags: review?(jst)
Comment on attachment 432147 [details] [diff] [review]
Patch implementing the simpler approach described in comment #45
[Checkin: Comment 49]

Looks good, one nit below. r=jst

+  static nsresult
+  PostCloseEvent(nsGlobalWindow* aWindow) {
+    nsCOMPtr<nsIRunnable> ev = new nsCloseEvent(aWindow);
+    nsresult rv = NS_DispatchToCurrentThread(ev);
+    if (NS_SUCCEEDED(rv))
+      aWindow->MaybeForgiveSpamCount();
+    return rv;
+  }
+
+   NS_IMETHOD Run() {
+     if (mWindow)
+       mWindow->ReallyCloseWindow();
+     return NS_OK;
+   }

Fix the 3-space indentation.
Attachment #432147 - Flags: review?(jst) → review+
Pushed to mozilla-central:
http://hg.mozilla.org/mozilla-central/rev/db88a703350f

At this point, all mozilla-central tinderbox tests have passed, so I'm going to mark this bug resolved.  Please file followup bugs if memory is still not being reclaimed eagerly enough.

Filed bug 552136 to revisit the CleanUp consolidation effort, per comment #47.
Blocks: 552136
Status: ASSIGNED → RESOLVED
Closed: 14 years ago14 years ago
Resolution: --- → FIXED
status1.9.1: --- → ?
status1.9.2: --- → ?
Attachment #432147 - Attachment description: Patch implementing the simpler approach described in comment #45. → Patch implementing the simpler approach described in comment #45 [Checkin: Comment 49]
Target Milestone: mozilla1.9.3a2 → mozilla1.9.3a3
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: