Closed Bug 324586 Opened 18 years ago Closed 18 years ago

browser.xul leaks (with FlashGot & Adblock Filterset.G Updater installed)

Categories

(Firefox :: General, defect)

x86
Windows XP
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: RyanVM, Unassigned)

References

()

Details

(Keywords: memory-leak)

This took forever for me to narrow down, but I've finally managed to get this into a set of reproduceable steps.

1. Install the most recent version of the following three extensions:
  -Flashgot 0.5.9.99
  -Adblock Plus 0.6.0.4
  -Adblock Filterset.G Updater 0.2.9
2. Load Firefox, then immediately close it.
3. Run leak-gauge.pl and notice the following output:
Leaked inner window 1d7b1d0 (outer 1d9e4c8) at address 1d7b1d0.
 ... with URI "chrome://browser/content/browser.xul".
Leaked outer window 1d9e4c8 at address 1d9e4c8.
Summary:
Leaked 2 out of 8 DOM Windows
Leaked 0 out of 24 documents
Leaked 0 out of 4 docshells

I've confirmed the leaks with two different builds and two different profiles. I created a clean profile and followed these steps exactly and got the same leak.
Do you need all three extensions to reproduce the bug?
Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9a1) Gecko/20060124 Firefox/1.6a1 ID:2006012405
I can confirm.
New profile, about:blank for startup page.

Installed Flashgot 0.5.9.99 (no leakage detected)
Installed Adblock Plus 0.6.0.4 (no leakage detected)
Installed Adblock Filterset.G Updater 0.2.9 (leakage now detected)

Leaked inner window 1e824f8 (outer 1e4e870) at address 1e824f8.
 ... with URI "chrome://browser/content/browser.xul".
Leaked outer window 1e4e870 at address 1e4e870.
Summary:
Leaked 2 out of 6 DOM Windows
Leaked 0 out of 25 documents
Leaked 0 out of 3 docshells

Leakage tests done by starting browser, then closing it. ie: no surfing.
Actually, only Flashgot 0.5.9.99 & Adblock Filterset.G Updater 0.2.9 are needed to show the leakage. (Something to do with flashgot's download routines and the Adblock Filterset G's auto-updating routines mayhap?)
Summary: Memory leak in browser.xul → browser.xul leaks (with Flashgot & Adblock Filterset.G Updater installed)
What are the links to the extensions involved?  I'm willing to try debugging this, but not to waste my time searching for the "right" extensions....
Steve, I'm sorry.  I missed comment 4.  But yes, comment 6 is a lot more helpful.  ;)

So I tried debugging this.  In a debug build with the two extensions I can't even start the browser.

First I get an assert because Flashgot tries to register a content listener that's not an nsISupportsWeakReference.  Of course the URILoader API doesn't document this....  and should.

Then I get a warning because Flashgot accesses non-threadsafe objects (in this case the Components object, I think) on multiple threads.

Then I get a fatal assert in the JS debugger:

Program received signal SIGTRAP, Trace/breakpoint trap.
JS_Assert (
    s=0xb78bf600 "(! lock->count && ! lock->owner) || (lock->count && lock->owner)", 
    file=0xb78bf5bc "../../../mozilla/js/jsd/jsd_lock.c", ln=106)
    at ../../../mozilla/js/src/jsutil.c:62
62          abort();

Given the threadsafety stuff earlier, this is not exactly surprising.  Further, accessing non-threadsafe objects on multiple threads is very very likely to leak as refcounts can get screwed up.

So I think it's really useless to even try debugging here until the threadsafety stuff in Flashgot is fixed.  Can someone file a bug on them to do that, please?
CC'ng Giorgio Maone, author of FlashGot.
(In reply to comment #7)

> First I get an assert because Flashgot tries to register a content listener
> that's not an nsISupportsWeakReference.  Of course the URILoader API doesn't
> document this....  and should.

Thank you for this pointer, I'm fixing this right now.

> accessing non-threadsafe objects on multiple threads is very very likely to
> leak as refcounts can get screwed up.

Yet it doesn't explain why leaks does appear only when Filterset.G Updater is installed (even assuming it is a FlashGot leak)
 
> So I think it's really useless to even try debugging here until the
> threadsafety stuff in Flashgot is fixed.

I was already planning an huge refactoring to remove explicit multithreading before Firefox 2.0, but I guess I'll  accelerate it.
Anyway, right now FlashGot has an option for synchronous processing in the UI thread, even if asynchronous is set to default. 
So far the background processing has given no problem at all, but if you feel it 's a danger, I can swap the defaults until I apply the refactoring above.
> Yet it doesn't explain why leaks does appear only when Filterset.G Updater is
> installed (even assuming it is a FlashGot leak)

Sure.  All I'm saying is that as things stand I can't debug the leak because the multi-threading stuff triggers fatal assertions before I can even start the browser.

> Anyway, right now FlashGot has an option for synchronous processing in the UI
> thread, even if asynchronous is set to default. 

So I can set this option locally to test with?  How do I do that?

Thanks for your help with this!
Summary: browser.xul leaks (with Flashgot & Adblock Filterset.G Updater installed) → browser.xul leaks (with FlashGot & Adblock Filterset.G Updater installed)
(In reply to comment #10)

> 
> Thanks for your help with this!
>

Ooops, I completely forgot about updating this bug...
FlashGot 0.5.9.993 (current addons.mozilla.org shipped version) has no multithreaded code anymore, hence this should help with debugging.

On the other hand, I did not mind to wrap my URIContentListener in a weak reference, because its lifespan is the same as the browser: it's instantiated/registered on startup and disposed/unregistered on shutdown.
Again, shame on me, I forgot there you were complaining about an assertion, rather than a memory leak. Hope it's not too much annoying, being just one on startup.

Does the leak occur with only Adblock Filterset.G Updater installed, or must both it and Flashgot be present? Also note that we released a new version of AFGU, 0.3.0.1, does the leak still occur with it?
URI to updated adblock filterset G updater:

http://releases.mozilla.org/pub/mozilla.org/extensions/adblock_filterset.g_updater/adblock_filterset.g_updater-0.3.0.1-fx+fl+mz+ns.xpi

URI to updated flashgot:

http://releases.mozilla.org/pub/mozilla.org/extensions/flashgot/flashgot-0.5.9.993-fx+fl+mz+ns+tb.xpi

With those installed, I no longer crash on startup in a debug build.  I do still leak nsGlobalWindow objects.  Looking into that.
And I've confirmed that both extensions need to be enabled to leak; either one by itself doesn't do it.
I'm really not equipped to debug this, but I will (seeing as I wrote the AFGU extension) if someone can get me going. For example, what is leak-gauge.pl anyway? I assume it comes with a FF nightly or something like that, and/or there is a handy page on mozillazine all about what I'm looking for, but I really am new to hacking FF on such a basic level. I had to learn XUL, RDF, and how the extension manager system worked to write AFGU, and that's about the extenst of my FF knowledge, although I did get into some IDL working on Sunbird.

Anyway, could someone give me a shove in the right direction?

Thanks in advance,
Reid
Reid, debugging leaks with a nightly build is pretty hard...  To really make it work, you want a debug build, and even then to really debug you have to twiddle some C++ defines to enable logging that's usually off in debug builds.

I should have some more information here in a few minutes; I think I know what's going on and I'm just testing.
So I think the relevant ownership paths are (before XPCOM shutdown starts):

  FlashGotService has a reference to HttpInterceptor (via this.interceptor)
  HttpInterceptor has a reference to the observer service (via closure's osrv)
  Observer service has a reference to FlashGotService (via nsIObserver that was
     added).
  Observer service has references to various Filterset.G Updater stuff via
     nsIObservers that were added.

And the other relevant setup is:

  FlashGotService has an ownership cycle with the observer service; it breaks
  this cycle at shutdown by calling removeObserver().

  Updater doesn't have a cycle, and doesn't observe shutdown.  In fact, it never
  calls its deregister() method as far as I can see.  By itself, this is sorta
  ok.

Now we start XPCOM shutdown.  We send the XPCOM shutdown notification.  This cases the FlashGotService to remove itself as an observer from the observer service.  FlashGot still has a reference to the observer service, however, so the observer service is still alive and still holding a reference to the updater stuff.

We do our final GC.  This finalizes the JSObject for the FlashGotService (since that's no longer owned by anything), and hence finalizes the JSObject for the observer service.  This does NOT immediately release the observer service, since we have the deferred releases XPConnect option toggled (see http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/js/src/xpconnect/src/xpcwrappednative.cpp&rev=1.106&mark=919-931#918 and http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/dom/src/base/nsJSEnvironment.cpp&rev=1.275&mark=2329#2326).  Since the observer service holds a strong ref to the updater stuff, the updater stuff is NOT finalized at this point (the JSObject is locked during this GC).  Then we do our deferred release of the observer service, which releases the nsXPCWrappedJS for the updater stuff.  So now the updater stuff is no longer a GC root... but there are no more GCs coming along -- we're done shutting down.  So we leak the updater stuff and whatever it kept alive during the last GC; in this case that includes the window it ran in, of course.  Or at least something along these lines happens -- we end up running our last GCs after XPConnect shutdown, etc, etc.  So the exact details of what happens in XPConnect-land here are a little fuzzy.

This is actually very similar to the situation we saw in bug 296987.

To fix this, there are several possibilities:

1)  Flashgot could drop its reference to the HttpInterceptor during the shutdown
    notification (this should allow the observer service to die earlier).
2)  Updater could remove itself as an observer.
3)  Both (or either) could use weak references with the observer service.
4)  We could make shutdown cleaner (bug 316414).

I recommend doing 4 and also some combination of 1, 2, and 3.  ;)
Depends on: 316414
How many browser.xul documents leak?  One per window?  If it's anything other than just the hidden window, XPCOM shutdown seems way too late to be fixing document leaks.
mcm_ham, codeveloper of the AFGU extension has done some debugging, and sent me this message:

--snip--
I only just saw this bug and after looking at out code figured out immediately what fgupdater was doing wrong (not de-registering on shutdown you'll see a warning about it here: http://kb.mozillazine.org/Using_observers) and was just about to post the problem when Boris gave his big summary of the issue.

Anyway I have made that simple change and set it to use a weak reference instead. You can find the build attached if you want to comment in the bug. Might want to ask Boris whether in the example given on this page: http://kb.mozillazine.org/Using_observers whether it should use weak reference instead of strong.
--snip--

I trust he will not mind me quoting him here. I am testing the dev-build now, and will post it to Mozilla Update once I know it works properly.
If this is a shutdown leak I think it will be fixed by bug 326491, which will force the observer service to release its strong references even if it is being "leaked".
Depends on: 326491
No longer depends on: 316414
David, with the extension versions in comment 18, I see no document leaks at all.  I do see two ChromeWindow leaks (inner and outer), because the Window JSObject is entrained by the updater stuff (which runs in the window context).

Reid, it'd be great if the Mozillazine article gave examples of both strong and weak refs with an explanation of the difference.
> To fix this, there are several possibilities:
> 
> 1)  Flashgot could drop its reference to the HttpInterceptor during the
> shutdown
>     notification (this should allow the observer service to die earlier).
> 2)  Updater could remove itself as an observer.
> 3)  Both (or either) could use weak references with the observer service.
> 4)  We could make shutdown cleaner (bug 316414).
> 
> I recommend doing 4 and also some combination of 1, 2, and 3.  ;)
> 

FlashGot 0.5.9.995 (current AMO version) implements 1 and 3.
Can someone close and verify this bug?
Thanks!

Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.