Closed Bug 353090 Opened 14 years ago Closed 14 years ago

Memory leak on google personalized home page

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set

Tracking

()

RESOLVED FIXED

People

(Reporter: klassphere, Assigned: dbaron)

References

()

Details

(Keywords: fixed1.8.1.1, memory-leak, Whiteboard: [patch])

Attachments

(3 files, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20060916 Minefield/3.0a1
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20060916 Minefield/3.0a1

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a1) Gecko/20060916 Minefield/3.0a1 ID:2006091617 [cairo]

Reproducible: Always

Steps to Reproduce:
1. Go to http://www.google.com/ig?hl=en
2. Put "test" in the keyword box then hit the Google Search button
3. When the query finishes returning results, press the Back button or the backspace key to go back to the Google home
4. Exit the browser

Actual Results:  
Leaked inner window 28c3c40 (outer 28ce380) at address 28c3c40.
 ... with URI "about:blank".
Leaked outer window 25e0008 at address 25e0008.
Leaked inner window 285c178 (outer 25e0008) at address 285c178.
 ... with URI "http://www.google.com/ig?hl=en".
Leaked inner window 27f9fa8 (outer 25e0008) at address 27f9fa8.
 ... with URI "http://www.google.com/search?hl=en&q=test&btnG=Google+Search".
Leaked inner window 27510c8 (outer 25e0008) at address 27510c8.
 ... with URI "http://www.google.com/ig?hl=en".
Leaked outer window 28ce380 at address 28ce380.
Leaked document at address 2a51088.
 ... with URI "http://www.google.com/ig?hl=en".
Leaked document at address 12efaa0.
 ... with URI "http://www.google.com/ig?hl=en".
Leaked document at address 29084f8.
 ... with URI "http://www.google.com/search?hl=en&q=test&btnG=Google+Search".
Summary:
Leaked 6 out of 13 DOM Windows
Leaked 3 out of 44 documents
Leaked 0 out of 5 docshells
Keywords: mlk
Might be worth testing again with a build that has dbaron's patch for bug 352911 in it.
Just tested and this /isn't/ fixed by dbaron's patch in bug 352911. Also, this is NOT a regression since the 1.8 branch.
OS: Windows XP → All
Hardware: PC → All
Not fixed by the patch in bug 353022 either, although it might reduce the number of windows leaked (since I'm only seeing 3 windows leaked).
I'm seeing two documents leaked, via their wrappers:

0933a758 object 0x9791640 HTMLDocument via locked object(Window @ 0x092d79c8).XPCWrappedNativeProto::mJSProtoObject(XPC_WN_ModsAllowed_Proto_JSClass @ 0x092d79d0).XPCWrappedNativeScope::mPrototypeJSObject(Object @ 0x092d7968).constructor(Function @ 0x092d7958).__proto__(Function @ 0x092d79a0).bind(Function @ 0x09b082f0).__parent__(Call @ 0x09b07ad8).gd(Function @ 0x09b08328).__parent__(Call @ 0x09b08300).a(Object @ 0x09b08320).3_keydown_4_false(Object @ 0x098214b8).listener(Function @ 0x09821450).__parent__(Call @ 0x098213c0).b(HTMLDocument @ 0x0933a758).

09573340 object 0x959e4c0 HTMLDocument via locked object(Window @ 0x09572e20).XPCWrappedNativeProto::mJSProtoObject(XPC_WN_ModsAllowed_Proto_JSClass @ 0x09572e18).XPCWrappedNativeScope::mPrototypeJSObject(Object @ 0x09572e60).constructor(Function @ 0x09572e68).__proto__(Function @ 0x09572e30).bind(Function @ 0x09931360).__parent__(Call @ 0x09903b48).gd(Function @ 0x09931398).__parent__(Call @ 0x09931370).a(Object @ 0x09931390).3_keydown_4_false(Object @ 0x099fb150).listener(Function @ 0x099fb0e8).__parent__(Call @ 0x099fb058).b(HTMLDocument @ 0x09573340).

seems to indicate yet another missing ClearScope, perhaps?
Given that the two docs I'm seeing leaked are both the same page, perhaps it's a problem related to not being cacheable in the fastback cache...
I don't see anything that calls FreeInnerObjects for windows for documents that we can't put in the fastback cache (I'm assuming we never bother creating a WindowStateHolder, although maybe I'm wrong).  What should do this?
Blocks: 353114
Well, we do a ClearScope, but not a FreeInnerObjects -- and perhaps it's the disconnecting of mListenerManager in FreeInnerObjects that's crucial here, although adding just disconnecting mListenerManager doesn't fix the leak.
It's worth noting that all I need to do to see the leak is load the page http://www.google.com/ig?hl=en (from the command line) and close the browser window.  No back/forward needed.
Summary: Memory leak after submitting a Google query and going back → Memory leak on google personalized home page
(Note also that what I see *with* the two above-mentioned patches on the trunk matches what I see on the 1.8 branch -- just the www.google.com/ig pages leaked, and not the search results page.)
I'm thinking the problem here is just that JS_ClearScope on the global object doesn't clean up what's in comment 4.
Flags: blocking1.9?
Flags: blocking1.8.1.1?
Attached patch possible patch (obsolete) — Splinter Review
This fixes the leak.  Presumably only the stuff in the JS_GetPrototype chain is stuff that Web pages could actually attach things to.  (All of the JS_GetParent of these objects should either be null or point back to the global object, right?)

Would I also want to do JS_ClearWatchPointsForObject for all of these as well?  I'm guessing maybe we do, but I don't know why it is that we do that...
Attachment #239091 - Flags: superreview?(bzbarsky)
Attachment #239091 - Flags: review?(jst)
Assignee: general → dbaron
Whiteboard: [patch]
KL -- by the way, thanks for filing this bug.  Having more people looking at leak-gauge output, and having clear steps to reproduce for a leak like this are both really useful.
Comment on attachment 239091 [details] [diff] [review]
possible patch

This will unfortunately potentially break things.

The way out prototype story goes wrt to inner/outer global objects is that the outer window's prototype is also used by the inner as its prototype. That means if we use this code and call ClearScope() on the outer window we'll also remove properties from the inner window's prototype, and if this inner is going into the bfcache then any properties on it's prototype will be lost by the time it's restored from the bfcache.

This could easily be fixed by adding another argument to ClearScope() and conditionally clear the prototypes scopes only when we're really done with a window (i.e. it's falling out of bfcache or we're leaving a page that won't be cached.
Attachment #239091 - Flags: superreview?(bzbarsky)
Attachment #239091 - Flags: review?(jst)
Attachment #239091 - Flags: review-
I believe the JS_ClearWatchPointsForObject calls are there to fix bug 38959.  There _used_ to be an excellent comment at http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/dom/src/base/nsGlobalWindow.cpp&rev=1.321&mark=254-263#253  right before clearing watchpoints, but it looks like in bug 296639 the comment and code got separated (with an "XXXjst: Update above comment" tossed in) and at this point we're left with the comment at http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/dom/src/base/nsGlobalWindow.cpp&rev=1.882&mark=1108-1119#11 and the watchpoint removal happening in a totally different file.  We should probably move the comment over to the new JS_ClearWatchPointsForObject callsite.
(In reply to comment #13)
> This could easily be fixed by adding another argument to ClearScope() and
> conditionally clear the prototypes scopes only when we're really done with a
> window (i.e. it's falling out of bfcache or we're leaving a page that won't be
> cached.

Would that argument match the existing argument for whether to invalidate the global scope polluter?
Attached patch patchSplinter Review
So here's a rather more complicated patch that cleans some things up, although it's still pretty small given the length of this explanation.  Let me explain the logic here:

In comment 15, I asked whether the parameter to add would match the parameter for invalidating the global scope polluter.  It turns out that this is the case, *if* I clean up a bit of a historic oddity, which is that there's effectively one call to FreeInnerObjects (the case of clearing an inner window  not being put into fastback that puzzled me in comment 6 and comment 7) that, rather that being an explicit call, is split between two separate parts of SetNewDocument:
http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/dom/src/base/nsGlobalWindow.cpp&rev=1.884&mark=1218-1231,1374-1378,1385-1395#1056
This adds up to being the same as FreeInnerObjects, plus an extra InvalidateGlobalScopePolluter in the reUseInnerWindow && aDocument != oldDoc case, with the following caveats:
 * [bug] it skips the multi-language calls to ClearScope in the case where no termination function was set
 * [mental note] the fact that the boolean for invalidate the global scope polluter differs is compensated for by the separate call to InvalidateGlobalScopePolluter
Once this difference in the boolean is fixed by merging the above back into a call to FreeInnerObjects, then I think the booleans do match as I suggested in comment 15.

So, what this patch does is:
 * merge the spread-out code for the oldest case of clearing stuff (and one that's now pretty rarely used) into a call to FreeInnerObjects, by removing the first of the two parts, moving the little extra piece into the reUseInnerWindow case, and switching the boolean to PR_TRUE in ClearWindowScope, and replacing the second chunk with an actual call or calls to FreeInnerObjects (varying depending on whether termFuncSet is true).
 * changes the parameter name and definition to ClearScope to make it clear that it's about whether it's safe to clobber stuff on the prototype chain -- in other words, it's false when we're doing ClearScope for an outer window that's not being torn down
 * moves the watchpoints comment mentioned in comment 14
 * removes extraneous JS_BeginRequest/JS_EndRequest from ClearWindowScope, since the JS implementation of scx->ClearScope has a JSAutoRequest in it -- mhammond added comments that the BeginRequest/EndRequest weren't needed at all the other callers because it was inside; I don't see why it would still be needed here, and I'm assuming that's an omission
 * adds an assertion to nsEventListenerManager that I should have added when I added Disconnect
 * adds an assertion to SetTerminationFunction that I *think* documents when it's supposed to be used, although confirmation of that would be appreciated
 * adds the actual clearing of the prototype chain to fix this leak, in the cases where we were previously clearing the global scope polluter (except for the dangling one in the reUseInnerWindow case)
 * adds an XXXldb comment explaining what I would have done in this patch except it would have made the patch completely unreviewable.  I'll try to remember to do it as a followup -- that is, make FreeInnerObjects contain the loop over languages for ClearScope calls rather than make all its callers do the loop (the part other than the ClearScope is pretty much idempotent -- except for an assertion -- but there's no point doing it multiple times when once is enough).
Attachment #239091 - Attachment is obsolete: true
Attachment #239273 - Flags: superreview?(bzbarsky)
Attachment #239273 - Flags: review?(jst)
(In reply to comment #16)
>  * adds an assertion to SetTerminationFunction that I *think* documents when
> it's supposed to be used, although confirmation of that would be appreciated

Yes, that seems right to me.
Comment on attachment 239273 [details] [diff] [review]
patch

Thanks for moving the comment regarding watch point clearing!

r=jst
Attachment #239273 - Flags: review?(jst) → review+
Comment on attachment 239273 [details] [diff] [review]
patch

>Index: dom/src/base/nsGlobalWindow.cpp
>   if (mDocument) {
>+    // XXXldb We'll hit this when we loop over multiple language script
>+    // contexts.
...
>     NS_ASSERTION(mDoc, "Why is mDoc null?");

If we loop, both mDoc and mDocument will be null, so the assert won't be reached.

sr=bzbarsky in either case, since we're just going to fix the looping thing anyway.

Thanks for doing this!
Attachment #239273 - Flags: superreview?(bzbarsky) → superreview+
If you could answer the "XXXldb" comment at the end, that would be great, too. :-)
Attachment #239293 - Flags: superreview?(jst)
Attachment #239293 - Flags: review?(jst)
*** Bug 353646 has been marked as a duplicate of this bug. ***
Flags: blocking1.8.0.9?
*** Bug 323372 has been marked as a duplicate of this bug. ***
A little more data about that XXXldb comment.  There used to be a comment there saying:
          // An outer window shouldn't have a global scope polluter, but
          // in case code on a webpage took one and put it in an outer
          // object's prototype, we need to invalidate it nonetheless.
This is just the minimal leak fix for the 1.8 branch, without any of the cleanup (some of which was post-DOM-Agnostic cleanup that wouldn't apply anyway, and some of which was post-splitwindow cleanup that would).
Attachment #239587 - Flags: superreview?(bzbarsky)
Attachment #239587 - Flags: review?(jst)
Comment on attachment 239293 [details] [diff] [review]
followup patch to push loop into FreeInnerObjects

       // clear all scopes
+      // XXXldb Do we need to do this even if currentInner is null,
+      // because this is the only one passing PR_TRUE?  Or is the
+      // PR_TRUE unneeded?  Or does currentInner being null mean we've
+      // never had an inner?

It means we've either never had an inner, or SetNewDocument() was called but failed somewhere between setting mInnerWindow to null and setting it to the new inner (which I'm not sure our code actually can deal with). In either case it seems like we'd be better off doing this unconditionally here.

r+sr=jst
Attachment #239293 - Flags: superreview?(jst)
Attachment #239293 - Flags: superreview+
Attachment #239293 - Flags: review?(jst)
Attachment #239293 - Flags: review+
attachment 239273 [details] [diff] [review] checked in to trunk, 2006-09-19 18:13 -0700, with the incorrect comment about what I was going to do in attachment 239293 [details] [diff] [review] removed, per comment 19.

attachment 239293 [details] [diff] [review] checked in to trunk, 2006-09-21 18:01 -0700, with that last ClearScope loop moved out one level of bracing, per comment 25.

Fixed on trunk (really was two days ago, with the first checkin).
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Comment on attachment 239587 [details] [diff] [review]
patch for 1.8 branch

r=jst
Attachment #239587 - Flags: review?(jst) → review+
Attachment #239587 - Flags: superreview?(bzbarsky) → superreview+
No longer blocks: 353114
Attachment #239587 - Flags: approval1.8.0.9?
*** Bug 353842 has been marked as a duplicate of this bug. ***
Depends on: 353704
Comment on attachment 239587 [details] [diff] [review]
patch for 1.8 branch

This is a pretty major leak fix, and I think we want to take it.  The regression fix already got in the previous cycle.
Attachment #239587 - Flags: approval1.8.1.1?
Comment on attachment 239587 [details] [diff] [review]
patch for 1.8 branch

approved for 1.8/1.8.0 branches, a=dveditz for drivers
Attachment #239587 - Flags: approval1.8.1.1?
Attachment #239587 - Flags: approval1.8.1.1+
Attachment #239587 - Flags: approval1.8.0.9?
Attachment #239587 - Flags: approval1.8.0.9+
Flags: blocking1.8.1.1?
Flags: blocking1.8.1.1+
Flags: blocking1.8.0.9?
Flags: blocking1.8.0.9+
Checked in to MOZILLA_1_8_BRANCH.

(I'm a little less sure about the 1.8.0 approval request now -- I think I'd at least like to test it more there, although given all the other unfixed leaks on that branch I'm not even sure it's worth the bother.)
Keywords: fixed1.8.1.1
Comment on attachment 239587 [details] [diff] [review]
patch for 1.8 branch

I'm not sure why I requested 1.8.0.9 approval here, so clearing approval flag.
Attachment #239587 - Flags: approval1.8.0.9+
...and resetting previously-plussed blocking flag to ? with the suggestion that it be changed to minus.

(If somebody else wants this in on 1.8.0.9, they need to do appropriate testing -- I don't have the time.)
Flags: blocking1.8.0.9+ → blocking1.8.0.9?
Not going to take this for 1.8.0.9.  If someone feels strongly about taking this on that branch, please renominate for the next release (when flags are ready).
Flags: blocking1.8.0.9? → blocking1.8.0.9-
Depends on: 358661
I'm still seeing a leak with Firefox 2.0.0.1:

Leaked outer window 8be36a0 at address 8be36a0.
Leaked inner window 93251b8 (outer 8be36a0) at address 93251b8.
 ... with URI "http://www.google.com/ig".
Leaked document at address 9551a48.
 ... with URI "http://www.google.com/ig".
Summary:
Leaked 2 out of 20 DOM Windows
Leaked 1 out of 49 documents
Leaked 0 out of 8 docshells
I can reproduce the leak with Firefox 2.0.0.1 on Linux but not on Mac OS X.
Do you see it in a clean profile, or is it related to a specific customization that you made?  (After all, it's the personalized home page... it varies.)
Yes I see it with a clean profile without logging in
http://www.google.com/ig?hl=en
I see the same as Alexander.

Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.8.1.2pre) Gecko/20070119 BonEcho/2.0.0.2pre ID:2007011903

1. New profile, start firefox
2. Visit http://www.google.com/ig?hl=en (no need to logon or customize anything.)
3. Close firefox, analyse leak log:

Leaked 2 out of 10 DOM Windows
Leaked 1 out of 39 documents
Leaked 0 out of 4 docshells
Depends on: 466826
No longer depends on: 466826
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.