Closed Bug 191835 Opened 22 years ago Closed 21 years ago

FTP leaks control socket

Categories

(Core Graveyard :: Networking: FTP, defect, P2)

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.4alpha

People

(Reporter: keeda, Assigned: darin.moz)

References

()

Details

(Keywords: memory-leak)

Attachments

(2 files, 2 obsolete files)

FTP leaks the nsSocketTransport for the control connection. This seems to have 
started happening since the async streams stuff landed.
*** Bug 191834 has been marked as a duplicate of this bug. ***
Attached patch fix (obsolete) — Splinter Review
This seems to fix it at least for the case when the control connection is
allowed to timeout. (timeout defaults to 5 mins.)
Attachment #113475 - Flags: review?(darin)
Comment on attachment 113475 [details] [diff] [review]
fix

hmm.. this should not be necessary.  something else must be holding a reference
to mCPipe or else the final release of the socket transport would close the
connection (or is that not working?)

does this patch not fix the leak if the browser is shutdown before the 5 minute
timeout?
-> me (this should be fixed for 1.3 final)
Assignee: dougt → darin
Severity: normal → major
Priority: -- → P2
Target Milestone: --- → mozilla1.3beta
Well, I was just copying what nsHttpConnectionMgr does to purge the connections
it doesn't need anymore.

Yes, the socket transport itself is leaking. You can see it on the brad
tinderbox. In my debug build tracerefcnt leak log says (amongst other things) 

 383 nsPipe                            132      132       83        1 
 445 nsSocketTransport                 180      180        3        1 
 446 nsSocketTransportService         1304     1304        1        1 
 453 nsStreamCopierIB                   32       32        1        1

... unbroken cycle between the pipe, nsStreamCopierIB and nsSocketTransport??

>does this patch not fix the leak if the browser is shutdown before the 5 minute
>timeout?

This patch doesn't help if the browser is shutdown early. What is happening in
that case is not clear to me, but the problem seems to be that the
sockettransport thread is already dead when we hit this dtor, and things cant
get cleaned up. So, this patch will probably not make the tbox leaks go away. 

Perhaps nsFtpProtocolHandler should be observing xpcom shutdown ....
Status: NEW → ASSIGNED
Target Milestone: mozilla1.3beta → mozilla1.3final
Flags: blocking1.3?
Flags: blocking1.3? → blocking1.3+
hmm.. i cannot reproduce this leak if i let the 5 minute FTP idle timer fire. 
if i just close the browser, after ftp://ftp.mozilla.org/ loads, then the leak
occurs.  the socket transport is not destroyed.  investigating further...

if the timeout function always breaks the cycle correctly, and if the only leak
is a shutdown leak, then i think this bug shouldn't need to be blocking 1.3 final.
ok, i did some investigation, and the problem is that my attempt to break the
cycle between the nsFtpControlConnection and the nsInputStreamPump it allocates
is thwarted by the fact the nsInputStreamPump is waiting for the
nsSocketInputStream to become readable or closed. 
nsFtpControlConnection::Disconnect simply releases its reference to mReadRequest
(which is the nsInputStreamPump), but that's not good enough to cause the
nsInputStreamPump to disconnect from the socket stream and hence release its
mListener, which is the nsFtpControlConnection.

I think the right solution is to keep mReadRequest around instead of trying to
release it.  Then, inside Connect, we would only need to call AsyncRead if
creating the socket transport for the first time.  To make this work properly,
the destructor for timerStruct will have to call nsFtpControlConnection::
Disconnect(some_failure_code).

Prior to my changes for bug 176919, calling Cancel(NS_OK) on the equivalent of
the input stream pump would have torn down the cycle, but I never liked that
convention of calling Cancel with a success code.  I'd like not to reintroduce
that if possible and assuming the alternative is not overly complicated or too
awkward, which i don't think it is.
You can't do that, I think See bug 129811, bug 99283 and other bugs which I
can't find ATM, mostly dupes of those.

Some servers reuse the PASV connection from the LIST for the RETR, so we can't
close it until after the following PORT command. When we get that, then we can
know if th port was the same, and if so we reuse that connection rather than
start up a new one.

Ain't ftp fun :)
Flags: blocking1.3+ → blocking1.3-
but bradley, i'm just talking about the control connection... you're talking
about the data connection right?  ftp state calls
nsFtpControlConnection::Disconnect, which only really disconnects the control
connection if an error code is passed.  otherwise, the control connection
remains idle.  i'm just talking about keeping the nsFtpControlConnection's
mReadRequest alive, instead of nulling it out.  maybe a patch would clarify what
i'm talking about ;-)

/me works on a patch...
Target Milestone: mozilla1.3final → mozilla1.4alpha
Attachment #113475 - Attachment is obsolete: true
Attachment #113475 - Flags: review?(darin)
Attached patch v1 patch (obsolete) — Splinter Review
this fixes the shutdown leak, but we still don't shutdown cleanly.  the FTP
handler's destructor tries to disconnect all the idle sockets, but that causes
attempts to access the socket transport service.  however, by this time the
socket thread no longer exists and attempts to run code on the socket thread
fails.	i don't think this is a big problem, because the socket transport
service closes all open sockets before exiting the thread.  however, it is
still a bit messy.
Attachment #116287 - Flags: review?(bbaetz)
Comment on attachment 116287 [details] [diff] [review]
v1 patch

Get tever to test this on his ftp regression set of servers, and r=bbaetz

This is only a shutdown leak, right?
Attachment #116287 - Flags: review?(bbaetz) → review+
bbaetz: right, only a shutdown leak.  if we reuse the control connection, the
cycle is broken .. or rather replaced by a new cycle ;)
Attachment #116287 - Flags: superreview?(bzbarsky)
Comment on attachment 116287 [details] [diff] [review]
v1 patch

>         nsCOMPtr<nsITimer> timer;
>-        nsCOMPtr<nsISupports> conn;
>+        nsFtpControlConnection *conn;

You could use |nsCOMPtr<nsFTPControlConnection> conn;|, since this has
unambiguous inheritance from nsISupports, no?  That would keep you from needing
those pesky manual addrefs and releases...
bz: sure, i could...  though, i'll still have to |if (conn)
conn->Disconnect(...)| in ~timerStruct, and this code is pretty simple as is. 
maybe less visual code is worth the extra branch in the dtor :-/
Attachment #116287 - Attachment is obsolete: true
Attachment #116402 - Flags: superreview?(bzbarsky)
Comment on attachment 116402 [details] [diff] [review]
v1.1 patch - with nsCOMPtr<nsFtpControlConnection>

> +        timerStruct() : conn(nsnull), key(nsnull) {}

This shouldn't be needed -- nsCOMPtr self-inits to null.

sr=bzbarsky with that
Attachment #116402 - Flags: superreview?(bzbarsky) → superreview+
bz: thx for the sharp eyes.  /me went too fast :-/
fixed-on-trunk
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
bz: for future reference... OS/2 wasn't happy with your nsCOMPtr suggestion :(

..\..\..\..\dist\include\xpcom\nsCOMPtr.h(664:7) : error EDC3222:
"nsDerivedSafe<nsFtpControlConnection>" needs a constructor because base class
"nsFtpControlConnection" needs a constructor initializer.

so, i reverted to the original patch.
post-this patch my build now stops on Linux with:

In file included from nsFtpProtocolHandler.cpp:36:
../../../../dist/include/xpcom/nsCOMPtr.h: In instantiation of
`nsDerivedSafe<nsFtpControlConnection>':
nsFtpProtocolHandler.h:94:   instantiated from here
../../../../dist/include/xpcom/nsCOMPtr.h:189: base `nsFtpControlConnection'
with only non-default constructor in class without a constructor

I am not a coder and I'm mentioning this here based on what I saw change in the
cvs log and that the previous comment looks similar- ignore me if I'm full of it.
Sun didn't like the nsCOMPtr either - don't blame us :)
Yech.  Looks like you can't use nsCOMPtr on a class that only has a constructor
taking arguments (that is, does not have a className::className() constructor).
Cc'ing scc, who has confirmed that
nsCOMPtr<concrete_class_unambiguously_derived_from_nsISupports> foo is legal
(I've used it in good health, but I must've had default constructors for all
concrete classes so used) -- do we need some docs updated?

/be
ok, so good news: nsFtpControlConnection is no longer leaking, but the bad news
is that the corresponding nsSocketTransport is still leaking.  i suspect this
has to do with what i mentioned in comment #10.  i suspect that it is not a good
idea to defer closing the idle control connections post xpcom-shutdown ;-)

REOPENING
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
OK, just tried again- works just great on Linux with patch reverted as in
Comment #20.  Thanks Darin!  Sorry 'bout the report- looks like when I did my
checkout I had just missed you reverting- and didn't see that you had already
fixed this till when I took another look at Bugzilla today.
Attachment #116287 - Flags: superreview?(bzbarsky)
This stack shows where the remaining reference to the nsStreamCopierIB was
added.  It's the nsStreamCopierIB and nsStreamCopierOB that are keeping the
nsSocketTransport alive.  (I'm assuming the nsStreamCopierOB case is similar but
I haven't checked)

This seems a bit odd.  Should ~nsPipeEvents really be doing what it does here
when it's used in OnPipeException?

nsStreamCopierIB::AddRef
nsSocketOutputStream::AsyncWait
nsStreamCopierIB::OnInputStreamReady
nsPipeEvents::~nsPipeEvents
nsPipe::OnPipeException
nsPipeOutputStream::CloseEx
nsPipeOutputStream::Close
nsPipeOutputStream::Release
nsCOMPtr<nsIOutputStream>::assign_assuming_AddRef
nsCOMPtr<nsIOutputStream>::assign_with_AddRef
nsCOMPtr<nsIOutputStream>::operator=
nsFtpControlConnection::Disconnect
nsFtpProtocolHandler::timerStruct::~timerStruct
nsFtpProtocolHandler::~nsFtpProtocolHandler
nsFtpProtocolHandler::Release
nsCOMPtr_base::assign_assuming_AddRef
nsCOMPtr_base::assign_with_AddRef
nsCOMPtr<nsISupports>::operator=
UNKNOWN
PL_DHashTableEnumerate
nsComponentManagerImpl::FreeServices
NS_ShutdownXPCOM
Attached patch possible patchSplinter Review
This makes the nsFtpProtocolHandler observe about-to-go-offline rather than
doing the shutdown in its destructor.  (It needs to observe about-to-go-offline
rather than XPCOM-shutdown because if it observes XPCOM shutdown it will end up
being notified after the IOService, which is what causes the
about-to-go-offline notification right before shutting down the socket
transport.)

This fixes the leak, and makes sense to me.  (Right now the only other listener
of that topic is in nsMsgAccountManager.cpp, but it seems like exactly what's
wanted.)

Thoughts on this patch?
Comment on attachment 118253 [details] [diff] [review]
possible patch

Of course, this fixed the leak a few minutes ago, but it doesn't seem to
anymore.  I don't know why.
Comment on attachment 118253 [details] [diff] [review]
possible patch

It seems to work again...
Attachment #118253 - Flags: superreview?(darin)
dbaron: yeah, i was going to try almost exactly the same patch... hmm... very
curious that it wouldn't do the job 100% of the time.
It always fixes the nsStreamCopierOB leak, but sometimes the nsStreamCopierIB,
nsSocketTransport, nsPipe, and nsSocketTransportService still leak.
Comment on attachment 118253 [details] [diff] [review]
possible patch

>Index: nsFtpProtocolHandler.cpp

>+            NS_DELETEXPCOM(mRootConnectionList);

while you're touching this file, could you use |delete| here instead.
mRootConnectionList is allocated using a naked call to |new|.

sr=darin
Attachment #118253 - Flags: superreview?(darin) → superreview+
do you see any interesting assertions when the nsStreamCopierIB leak occurs?
Comment on attachment 118253 [details] [diff] [review]
possible patch

I was actually thinking of attaching a whole new patch just to get rid of that
NS_DELETEXPCOM (and also to put the |i| in the for loop initializer and count
backwards to avoid too many calls to count), but decided it wasn't worth it.

I don't see any interesting assertions.  At least, I didn't notice any...
Attachment #118253 - Flags: review?(bbaetz) → review+
Fix checked in to trunk, 2003-03-25 06:48 PST.

'brad' tinderbox thinks this is fixed, and the case where I was still seeing the
leak may be something different from the original problem, and it's also
somewhat difficult to reproduce, so marking fixed.
Status: REOPENED → RESOLVED
Closed: 21 years ago21 years ago
Resolution: --- → FIXED
(Actually, 'brad' tinderbox seems to be oscillating.  Maybe this should be
reopened, or maybe there should be a new bug?)
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: