Open Bug 412207 Opened 12 years ago Updated 4 years ago

Firefox leaks when Chatzilla Extension is connected to irc.mozilla.org

Categories

(Other Applications :: ChatZilla, defect, P2)

defect

Tracking

(Not tracked)

Future

People

(Reporter: cbook, Assigned: rginda)

References

(Depends on 1 open bug)

Details

(Keywords: memory-leak)

Attachments

(4 files, 1 obsolete file)

Attached file leak log
Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9b3pre) Gecko/2008011212 Firefox/3.0b3pre

Steps to reproduce:
Create a new Firefox Profile
Install Chatzilla as only Extension (beside the default DomI)
Connect with Chatzilla to moznet (irc.mozilla.org) (i clicked only on the moznet link, i did not any pref changes and used the default settings) 
Close Firefox with Chatzilla open
-> Leak

nsStringStats
 => mAllocCount:          23763
 => mReallocCount:         3932
 => mFreeCount:           23752  --  LEAKED 11 !!!
 => mShareCount:          19205
 => mAdoptCount:           1776
 => mAdoptFreeCount:       1776
Flags: blocking1.9?
Attached file new leak log
this still leaks with the steps to reproduce and Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9b3pre) Gecko/2008012105 Firefox/3.0b3pre

   0 TOTAL                                          24     2512   970906       12 ( 2779.95 +/-  2348.66)  2551340       15 ( 1944.56 +/-  3420.01)
Flags: blocking1.9? → blocking1.9+
Priority: -- → P2
This leak doesn't look too bad since it doesn't look like it's building up over time. Would still be really good to have it fixed of course. Especially since it's so easy to trigger, simply use chatzilla.

Silver, could you have a look at this. Let me know if you need any help.
Assignee: nobody → silver
Flags: wanted1.9+
Flags: blocking1.9-
Flags: blocking1.9+
Priority: P2 → --
note: does not leak when you close the chatzilla window first and then firefox.
Two questions to reporter/anyone able to reproduce the leak logs:

1. How exactly are you quitting Firefox? Does using different methods after how much or even if anything is leaked?
2. Must you be connected at the time to leak? What about if you disconnect from irc.mozilla.org before closing?

My bet is that either we (ChatZilla) are missing an/the opportunity to tell things to shut down, or that Firefox is forcing the issue on us by not giving us a chance. That's why question 1 is so important to us. :)

I've CCed Gijs as he's the only one of us who has access to a Mac, and this currently sounds like it is Mac-only. Anyone able to confirm/disprove that, please do.

We currently have two bugs on record ourselves for exit-related leaking:
- Bug 400156 - Windows/Linux (may well be Mac also) leak when exiting while
  connected.
- Bug 382522 - Mostly Mac, but seems to be have variable reproducibility
  between systems even of the same OS.

This bug is marked as blocking bug 402335, which suggests there is a litmus test involved; if so, could someone mention which test it is? Litmus is currently sucking like usual (500 Internal Server Error) so I can't find it.
I'm really sorry, but I'm incredibly busy with university stuff at the moment, so I don't have time to look into this deeply. If this needs to be fixed or looked into before Feb. 8 or so, you should probably ask someone else. Sorry!

(to answer some of comment #5, however, it sounds like fixing bug 406947 might fix this, too - keep in mind that we don't, for example, pop up a "we're connected, are you sure you want to quit" msg when you use File > Exit in Fx, which is what that bug is about)
just got a build set up on mac to help addon, leak testing.  hope things are set up correctly.

>
>Two questions to reporter/anyone able to reproduce the leak logs:
>
>1. How exactly are you quitting Firefox? Does using different methods after how
much or even if anything is leaked?

the leak log in comment 7 came after 

1- start firefox with mac trunk debug build
2- start chatzilla and had it autoconnect to #l10n-drivers #amo #sumo #"a secure passwd protected channel" #mozilla.in
3- hit [x] to exit chatzilla window, and confirm that I wanted to exit the session.
4- exit firefox and get the following in the terminal window...
 
nsStringStats
 => mAllocCount:          61323
 => mReallocCount:         4091
 => mFreeCount:           60685  --  LEAKED 638 !!!
 => mShareCount:          52996
 => mAdoptCount:           3505
 => mAdoptFreeCount:       3503  --  LEAKED 2 !!!

> 2. Must you be connected at the time to leak? What about if you disconnect from
irc.mozilla.org before closing?

are we leaking more if we actually connect to a few channels?
Giving this a P3 for now given this larger leak (leaks 4 nsGlobalWindows). Looks very similar to bug 415122 too, though there we do leak even if chatzilla is shut down first.
Blocks: 415122
Flags: wanted1.9+
Flags: blocking1.9-
Flags: blocking1.9+
Priority: -- → P3
don't know if this helps but...

on the start up terminal there are a couple of these errors when chatzilla starts, then seemingly one for each channel joined.  I see a total of eight, each time I start chatzilla.

************************************************************
* Call to xpconnect wrapped JSObject produced this error:  *
[Exception... "'Permission denied to get property XULElement.accessKey' when calling method: [nsIDOMXULLabelElement::accessKey]"  nsresult: "0x8057001e (NS_ERROR_XPC_JS_THREW_STRING)"  location: "JS frame :: XPCSafeJSObjectWrapper.cpp :: anonymous :: line 446"  data: no]
************************************************************
that code might have landed as part of 

https://bugzilla.mozilla.org/show_bug.cgi?id=355766

jst, does this help to answer the question in https://bugzilla.mozilla.org/show_bug.cgi?id=355766#43 ?
The 'Permission denied to get property XULElement.accessKey' error has been occurring for many months with ChatZilla, but we have been unable to pin down a cause or anything that doesn't work because of it. The oldest record I can find in my logs in Gijs mentioning it occurring on trunk builds on 2007-08-31, if that's any help.
This is getting way too messy. Can we keep one issue a bug, or clearly state why and how the issues are the same? I know these are all leaks, but right now this is just confusing. I've now heard three stories, for this bug, bug 400156 and bug 382522.

Bug 400156 is a problem on branch as far as we can tell (given there's no leak monitor on trunk).

Bug 412207 (this one), comment #0 describes a problem when opening ChatZilla, connecting *manually*, and then quitting, leaking a socket and related arch.

Bug 382522 describes an issue which so far has been mac-only, on trunk, and only an issue when you auto-connect to channels (which hides the *client* tab), leaking DOMWindows.

So, I'm not sure if any of these are related or the same, but right now the testing done by chofmann on this bug seems to me to in fact relate more to bug 382522 than to this one. Can someone clear this up?
So, first off, I am going to ignore the leak log in comment 7 because it is completely different from the original reported leak. I can reproduce the original leak locally reliably. All references below are to the leak logs in comment 0 and comment 1 (which are essentially the same).

Also, to answer my own question 2 from comment 5, the answer is yes and no, respectively. Disconnecting stops the leak.

It looks like things are just shutting down too fast, and bits of the connection to the server are still alive at the point everything else wants to go home. As bugs go, this is a non-event because it only happens when you close the entire of Firefox and not when you just close windows.

I think that bug 406947 (point 1 in particular) basically covers this; we need to intercept the appropriate Firefox notification and force-close our connections. Except this will be very bad, unless they get flushed first, etc..
OS: Mac OS X → All
Hardware: PC → All
Assignee: silver → rginda
Component: General → ChatZilla
Depends on: 406947
Priority: P3 → --
Product: Core → Other Applications
QA Contact: general → chatzilla
Assignee: rginda → silver
> I think that bug 406947 (point 1 in particular) basically covers this; we need
> to intercept the appropriate Firefox notification and force-close our
> connections. Except this will be very bad, unless they get flushed first, etc..

Won't they be force-closed anyway during the shut-down process?

Can this force-closing happen more centrally in necko? It seems bad if everyone that does network stuff has to monitor shutdown and force-close manually.
(In reply to comment #15)
> > I think that bug 406947 (point 1 in particular) basically covers this; we need
> > to intercept the appropriate Firefox notification and force-close our
> > connections. Except this will be very bad, unless they get flushed first, etc..
> 
> Won't they be force-closed anyway during the shut-down process?
> 
> Can this force-closing happen more centrally in necko? It seems bad if everyone
> that does network stuff has to monitor shutdown and force-close manually.
> 

That's a good question. Let's ask biesi :-)
Necko does force-close all the connections when shutting down (when the IO service goes offline).
http://lxr.mozilla.org/seamonkey/source/toolkit/xre/nsXREDirProvider.cpp#858

The net-teardown notifications causes Necko to go offline and to close sockets.

We call that before NS_ShutdownXPCOM.
Hmm, but nsSocketTransport::OnSocketDetached should maybe null out mCallbacks and mEventSink...
Attached patch patch (obsolete) — Splinter Review
does this patch help?
Tomcat, would you be able to test the attached patch?
(In reply to comment #21)
> Tomcat, would you be able to test the attached patch?
> 

yes will do the test. results soon :)
Sigh. FFS, sigh.
(In reply to comment #21)
> Tomcat, would you be able to test the attached patch?
> 

Unfortunately i can reproduce both leaks from comment #1 and #7 with a debug build where this patch is applied.
Comment on attachment 301575 [details] [diff] [review]
patch

James, any reason not to land this, just in case? Especially if there are multiple issues that are causing this leak.
Attachment #301575 - Flags: superreview+
Attachment #301575 - Flags: review?(silver)
Comment on attachment 301575 [details] [diff] [review]
patch

I don't see any reason not to land it - more deterministic reference-breaking is good.
Attachment #301575 - Flags: review?(silver) → review+
Assignee: silver → nobody
Component: ChatZilla → Networking
Product: Other Applications → Core
QA Contact: chatzilla → networking
Assignee: nobody → cbiesinger
So, I just noticed comment #24 after moving this. Why don't we resolve this bug with biesi's fix (once it's been committed), and then open a new bug for the remaining leaks? Sound good?
Status: NEW → ASSIGNED
No, nothing about this bug has changed with the patch going in. If anything the patch should have been moved elsewhere but I see no need for that.
Assignee: cbiesinger → rginda
Status: ASSIGNED → NEW
Component: Networking → ChatZilla
Product: Core → Other Applications
QA Contact: networking → chatzilla
Target Milestone: --- → mozilla1.9beta4
Assignee: rginda → silver
Checking in netwerk/base/src/nsSocketTransport2.cpp;
/cvsroot/mozilla/netwerk/base/src/nsSocketTransport2.cpp,v  <--  nsSocketTransport2.cpp
new revision: 1.52; previous revision: 1.51
done
Keywords: checkin-needed
Priority: -- → P2
Target Milestone: mozilla1.9beta4 → Future
Flags: tracking1.9+
Attachment #301575 - Attachment is obsolete: true
Assignee: silver → rginda
You need to log in before you can comment on or make changes to this bug.