43567 - FTP - keeping connections alive forever

Reporter

Description

25 years ago

I may have prematurely closed one of the other socket transport bugs. Oh well, here's another one. Turns out we still leak
- a socket transport (three refs, two from pipes, below)
- the socket transport service (one ref)
- two nsPipes (two refs each)
I think that there's a circularity here. I'll report on what I find...

Chris Waterson

Reporter

Updated

25 years ago

Keywords: mlk

Chris Waterson

Reporter

Comment 1

25 years ago

Attached file: refcount balancer tree for the first nsPipe object

Chris Waterson

Reporter

Comment 2

25 years ago

From the refcount balancer tree of the first nsPipe object, it's pretty clear that the two references are being dropped by nsSocketTransport::OpenOutputStream(), presumably because the socket transport is taking ownership of the pipe.

Chris Waterson

Reporter

Comment 3

25 years ago

Attached file: refcount balancer tree for nsSocketTransport object

Chris Waterson

Reporter

Comment 4

25 years ago

...so where's that last reference being lost from? The nsSocketTransportService maybe?

Chris Waterson

Reporter

Comment 5

25 years ago

Attached file: refcount tree for nsSocketTransportService

Chris Waterson

Reporter

Comment 6

25 years ago

Aha. It looks like maybe the nsFtpConnectionThread is bound up in a cyclic reference with the nsSocketTransportService? Do we need some kind of out-of-band notification to break the cycle here?
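
To make the suspected cycle concrete, here is a minimal sketch. It uses std::shared_ptr as a stand-in for XPCOM reference counting, and every name in it (Service, ConnectionThread, Shutdown, LeakedObjects) is hypothetical rather than taken from the Necko sources. It only illustrates the general idea that two objects holding strong references to each other stay alive until some out-of-band notification drops one side.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-ins; these are NOT the real Necko classes.
static int gAlive = 0;  // number of live Service/ConnectionThread objects

struct Service;

struct ConnectionThread {
    std::shared_ptr<Service> service;  // strong ref: thread -> service
    ConnectionThread() { ++gAlive; }
    ~ConnectionThread() { --gAlive; }
};

struct Service {
    std::vector<std::shared_ptr<ConnectionThread>> clients;  // strong ref: service -> thread
    Service() { ++gAlive; }
    ~Service() { --gAlive; }
    // Out-of-band notification: drop the back references to break the cycle.
    void Shutdown() { clients.clear(); }
};

// Builds the cycle, drops the external references, and returns how many
// objects created by this call are still alive afterwards.
int LeakedObjects(bool notifyShutdown) {
    const int before = gAlive;
    {
        auto service = std::make_shared<Service>();
        auto thread = std::make_shared<ConnectionThread>();
        thread->service = service;
        service->clients.push_back(thread);
        assert(service.use_count() == 2);  // local variable + thread->service
        if (notifyShutdown) service->Shutdown();
    }
    return gAlive - before;
}
```

Without the notification both objects survive the scope; with it, neither does. Whether the real leak has exactly this shape is what the refcount trees are meant to establish.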

Chris Waterson

Reporter

Comment 7

25 years ago

I added a second FTP URL to the "bloaturls.txt" file, and lo, a second socket transport was leaked, along with two more pipes.

Chris Waterson

Reporter

Comment 8

25 years ago

Jud: This fix seems to eliminate one more addref for the nsSocketTransport object that's leaking. We believe that it holds on to the 2 pipes and nsSocketTransportService. Canceling the "pipe" channels is necessary because they return to the work queue in the socket transport service because they just return WOULD_BLOCK. However, we don't think that the nsFTPReleaseEvents are actually releasing what they think they're releasing, particularly mConnCache. You should check into this. We added some MOZ_COUNT_CTOR/DTOR macros for the ftp events and nsConnectionCacheObj, and determined that mConn is also leaking, but that may be due to the mConnCache problem.

Index: nsFtpConnectionThread.cpp
===================================================================
RCS file: /cvsroot/mozilla/netwerk/protocol/ftp/src/nsFtpConnectionThread.cpp,v
retrieving revision 1.121
diff -c -r1.121 nsFtpConnectionThread.cpp
*** nsFtpConnectionThread.cpp 2000/06/17 01:38:36 1.121
--- nsFtpConnectionThread.cpp 2000/06/23 06:32:58
***************
*** 1809,1816 ****
      }
      // Release the transports
!     mCPipe = 0;
!     mDPipe = 0;
      mIPv6Checked = PR_FALSE;
      if (mIPv6ServerAddress) {
          nsMemory::Free(mIPv6ServerAddress);
--- 1809,1822 ----
      }
      // Release the transports
!     if (mCPipe) {
!         mCPipe->Cancel(NS_BINDING_ABORTED);
!         mCPipe = 0;
!     }
!     if (mDPipe) {
!         mDPipe->Cancel(NS_BINDING_ABORTED);
!         mDPipe = 0;
!     }
      mIPv6Checked = PR_FALSE;
      if (mIPv6ServerAddress) {
          nsMemory::Free(mIPv6ServerAddress);

Judson Valeski

Comment 9

25 years ago

potts and I went back and forth on the pipe cancellation stuff (note the pipe cancels just above that which are commented out). I'm not sure why returning to the workQ is helping? Sounds like the socket transport has some cleanup problems. I'm worried about cancelling because a cancel will cause an OnStop() to fire w/ an abort code. This *might* cancel the data channel before all the data has been pumped. Yea, this is a bug in the socket transport.

Chris Waterson

Reporter

Comment 10

25 years ago

We were seeing 19 addref's but only 18 release's through nsSocketTransport::ProcessWorkQ(). http://lxr.mozilla.org/seamonkey/source/netwerk/base/src/nsSocketTransportService.cpp#206 This led us to believe that the last trip through was getting stuck here: http://lxr.mozilla.org/seamonkey/source/netwerk/base/src/nsSocketTransportService.cpp#250 In other words, the transport was returning EWOULDBLOCK rather than closing.
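
The mismatch described above can be modeled roughly as follows. This is a simplified sketch under stated assumptions, not the actual ProcessWorkQ() logic: WorkQueue, Transport, Status and RefsAfterOwnerRelease are hypothetical names, and std::shared_ptr stands in for AddRef/Release. A transport that keeps answering "would block" is put back on the queue, so the queue's reference outlives the owner's; canceling first lets the next pass drop it.

```cpp
#include <cassert>
#include <deque>
#include <memory>
#include <utility>

enum class Status { Done, WouldBlock };

// Hypothetical transport: reports WouldBlock until it is canceled.
struct Transport {
    bool canceled = false;
    void Cancel() { canceled = true; }
    Status Process() const { return canceled ? Status::Done : Status::WouldBlock; }
};

// Hypothetical work queue: every queued entry is one owning reference.
struct WorkQueue {
    std::deque<std::shared_ptr<Transport>> pending;

    void Add(std::shared_ptr<Transport> t) {
        assert(t);
        pending.push_back(std::move(t));
    }

    // One pass over the queue. A transport that would block is requeued,
    // so the queue keeps its reference; a finished one is dropped.
    void ProcessOnce() {
        std::deque<std::shared_ptr<Transport>> next;
        for (auto& t : pending) {
            if (t->Process() == Status::WouldBlock) next.push_back(std::move(t));
        }
        pending.swap(next);
    }
};

// Returns how many references remain once the owner has let go.
long RefsAfterOwnerRelease(bool cancelFirst) {
    WorkQueue queue;
    auto transport = std::make_shared<Transport>();
    std::weak_ptr<Transport> observer = transport;
    queue.Add(transport);
    if (cancelFirst) transport->Cancel();
    transport.reset();    // analogous to "mCPipe = 0"
    queue.ProcessOnce();  // last trip through the queue
    return observer.use_count();
}
```

In this model the uncanceled transport is still referenced by the queue after the owner releases it, which matches the one missing release seen in the counts.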

Warren Harris

Comment 11

25 years ago

Oops... we had another diff in the socket transport service for cleaning out the work queue on shutdown. Chris, can you post that one too so Rick can review it?

Chris Waterson

Reporter

Comment 12

25 years ago

rpotts: below is the diff. it's a bit sloppy (because we'll still leak the PR_CLIST stuff), but you get the idea....

Index: nsSocketTransportService.cpp
===================================================================
RCS file: /cvsroot/mozilla/netwerk/base/src/nsSocketTransportService.cpp,v
retrieving revision 1.42
diff -u -r1.42 nsSocketTransportService.cpp
--- nsSocketTransportService.cpp 2000/06/23 02:02:03 1.42
+++ nsSocketTransportService.cpp 2000/06/23 21:29:09
@@ -619,7 +619,10 @@
   }
   else {
     rv = NS_ERROR_FAILURE;
   }
-
+
+  for (int i = 0; i < mSelectFDSetCount; i++) {
+    NS_IF_RELEASE(mActiveTransportList[i]);
+  }
   return rv;
 }
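
For readers unfamiliar with the pattern, the release loop in the diff can be sketched in isolation like this. The classes are hypothetical: Transport is a toy intrusive refcount loosely modeled on AddRef/Release, ReleaseIfSet only approximates the release-and-null behavior of NS_IF_RELEASE, and Service is not the real nsSocketTransportService.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical intrusive refcount, loosely modeled on AddRef/Release.
class Transport {
public:
    explicit Transport(int* liveCounter) : mLive(liveCounter) { ++*mLive; }
    void AddRef() { ++mRefCnt; }
    void Release() { if (--mRefCnt == 0) delete this; }
private:
    ~Transport() { --*mLive; }  // only Release() may destroy the object
    int mRefCnt = 0;
    int* mLive;
};

// Release the pointer if it is set, then null it out.
inline void ReleaseIfSet(Transport*& p) {
    if (p) { p->Release(); p = nullptr; }
}

class Service {
public:
    static constexpr std::size_t kMax = 8;

    // The list takes its own reference; returns false when it is full.
    bool Add(Transport* t) {
        assert(t != nullptr);
        if (mCount == kMax) return false;
        t->AddRef();
        mActive[mCount++] = t;
        return true;
    }

    // Drop every reference the list still holds.
    void Shutdown() {
        for (std::size_t i = 0; i < mCount; ++i) ReleaseIfSet(mActive[i]);
        mCount = 0;
    }
private:
    Transport* mActive[kMax] = {};
    std::size_t mCount = 0;
};

// Creates `transports` objects, hands them to the service, shuts down,
// and returns how many are still alive.
int LiveAfterShutdown(int transports) {
    int live = 0;
    Service service;
    for (int i = 0; i < transports; ++i) {
        Transport* t = new Transport(&live);
        t->AddRef();     // caller's reference
        service.Add(t);  // list's reference, if there is room
        t->Release();    // caller is done with it
    }
    service.Shutdown();
    return live;
}
```

If Shutdown() were skipped here, every transport still on the list would keep a refcount of one, which is the leak this loop is meant to close.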

rpotts (gone)

Comment 13

25 years ago

hey chris, the patch looks fine to me... Do you think that before we release each transport, we should cancel it? -- rick

Chris Waterson

Reporter

Comment 14

25 years ago

Well, I don't know...one thing that puzzled warren and me last night was "how come HTTP doesn't leak"? Is there some shutdown sequence that FTP is just not doing right? Does FTP keep a connection open to the server forever? When is that connection supposed to go away?

Warren Harris

Comment 15

25 years ago

Yes, maybe the destructors should always cancel, just to be safe (like closing a file object in its destructor).
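
A minimal sketch of that idea, assuming a hypothetical Channel whose Cancel() is idempotent and a hypothetical ChannelOwner wrapper (neither is a real Necko class). It does not address the earlier worry that a cancel may fire OnStop() with an abort code before all data has been pumped.

```cpp
#include <cassert>

// Hypothetical channel that counts how often it was really canceled.
class Channel {
public:
    explicit Channel(int* cancelCounter) : mCancels(cancelCounter) {
        assert(cancelCounter != nullptr);
    }
    void Cancel() {
        if (!mOpen) return;  // a second Cancel is a no-op
        mOpen = false;
        ++*mCancels;
    }
private:
    int* mCancels;
    bool mOpen = true;
};

// Owner whose destructor always cancels, like a file object closing itself.
class ChannelOwner {
public:
    explicit ChannelOwner(int* cancelCounter) : mChannel(cancelCounter) {}
    ~ChannelOwner() { mChannel.Cancel(); }
    ChannelOwner(const ChannelOwner&) = delete;
    ChannelOwner& operator=(const ChannelOwner&) = delete;
    void Finish() { mChannel.Cancel(); }  // explicit early shutdown
private:
    Channel mChannel;
};

// Returns how many effective cancels happened over the owner's lifetime.
int CancelCount(bool finishExplicitly) {
    int cancels = 0;
    {
        ChannelOwner owner(&cancels);
        if (finishExplicitly) owner.Finish();
    }
    return cancels;
}
```

Either way the channel is canceled exactly once, so callers that forget the explicit shutdown still leave nothing on the queue.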

Judson Valeski

Comment 16

25 years ago

ftp keeps a transport cached, per server, in the ftpprotocolhandler, indefinitely.

Chris Waterson

Reporter

Comment 17

25 years ago

So the nsFTPProtocolHandler *is* going away: maybe it's not properly shutting down the socket transports?

rpotts (gone)

Comment 18

25 years ago

hey jud, if the ftpprotocolhandler keeps a transport cached per server, won't we eventually run out of socket transports if we visit ~8 different FTP sites? -- rick

Judson Valeski

Comment 19

25 years ago

Good question. I just visited a dozen different sites and everything was fine. There are no cache cleanup smarts in FTP, so I suspect we just kept building socket transports and caching them (maybe the limit is higher, or we grow it). How does HTTP do cleanup? ftp server timeouts are typically longer than HTTP (for the command channel).

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Updated

25 years ago

Whiteboard: [tind-mlk]

Warren Harris

Comment 20

25 years ago

We need to fix the thread-pool bug 36750 before we can up the limit on transport threads. Cc'ing dougt.

David Baron :dbaron: (⌚️UTC-4, no longer working on Mozilla)

Comment 21

25 years ago

The leaked nsSocketTransportService leaks an nsStringBundle too.

Warren Harris

Updated

25 years ago

Keywords: nsbeta3

Gagan

Comment 22

25 years ago

approving for nsbeta3

Assignee: gagan → ruslan

Gagan

Comment 23

25 years ago

approving this time (didn't make it last time!) sorry.

Whiteboard: [tind-mlk] → [tind-mlk][nsbeta3+]

ruslan

Comment 24

25 years ago

nsSocketTransport/nsResChannel-related leak is gone. As to FTP - it does seem to be keeping connections alive forever. Maybe we should set up a timeout like http does? I'm also not sure that recovery logic works correctly in case the remote server drops the connection.
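
One way such a timeout could look, as a rough sketch only: a per-server cache that remembers when each connection was last used and prunes entries that have been idle too long. ConnectionCache, Touch and Prune are hypothetical names, the timeout value is arbitrary, and this is not the patch attached to this bug nor a description of how HTTP does it.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <string>

using Clock = std::chrono::steady_clock;

// Hypothetical per-server connection cache with an idle timeout.
class ConnectionCache {
public:
    explicit ConnectionCache(std::chrono::seconds idleTimeout)
        : mIdleTimeout(idleTimeout) {
        assert(idleTimeout.count() > 0);
    }

    // Record that the connection for `host` was used at time `now`.
    void Touch(const std::string& host, Clock::time_point now) {
        mLastUsed[host] = now;
    }

    // Drop every entry that has been idle for at least the timeout.
    // Returns how many entries were dropped.
    std::size_t Prune(Clock::time_point now) {
        std::size_t dropped = 0;
        for (auto it = mLastUsed.begin(); it != mLastUsed.end();) {
            if (now - it->second >= mIdleTimeout) {
                it = mLastUsed.erase(it);
                ++dropped;
            } else {
                ++it;
            }
        }
        return dropped;
    }

    std::size_t Size() const { return mLastUsed.size(); }

private:
    std::chrono::seconds mIdleTimeout;
    std::map<std::string, Clock::time_point> mLastUsed;
};
```

Passing the current time in explicitly keeps the sketch deterministic; a real cache would call Prune() from a timer and close the underlying transport for each dropped entry.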

Status: NEW → ASSIGNED

Gagan

Comment 25

25 years ago

robert-- welcome to necko!

Assignee: ruslan → rjc

Status: ASSIGNED → NEW

Robert John Churchill

Comment 26

25 years ago

Changing summary. As the leaks are detailed in bug # 51937, this bug can be strictly about FTP keeping connections alive. In terms of recovery logic, I used Mozilla to FTP into one of my Linux machines, then telnet'ed to the same machine (out-of-band) and killed off the FTP process. Finally, back in Mozilla I reloaded the FTP URL in question and Mozilla recovered fine.

Status: NEW → ASSIGNED

Summary: FTP causes nsSocketTransport and two nsPipe objects to leak → FTP - keeping connections alive forever

Daniel Veditz [:dveditz]

Comment 27

25 years ago

Per PDT rules P3 bugs are now nsbeta3-

Whiteboard: [tind-mlk][nsbeta3+] → [tind-mlk][nsbeta3-]

Robert John Churchill

Updated

25 years ago

Target Milestone: --- → Future

Gagan

Comment 28

25 years ago

Hi dougt, welcome to necko :)

Assignee: rjc → dougt

Status: ASSIGNED → NEW

Doug Turner (:dougt)

Updated

25 years ago

Blocks: 62356

benc

Comment 29

24 years ago

mass move, v2. qa to me.

QA Contact: tever → benc

benc

Updated

24 years ago

Keywords: mozilla1.0

Doug Turner (:dougt)

Comment 30

24 years ago

We need to fix this soon.

Target Milestone: Future → mozilla0.9.6

Patrick

Updated

24 years ago

Blocks: 92580

Doug Turner (:dougt)

Comment 31

24 years ago

to bradley

Assignee: dougt → bbaetz

Component: Networking → Networking: FTP

Cathleen

Updated

24 years ago

No longer blocks: 92580

refcount balancer tree for the first nsPipe object | 25 years ago | Chris Waterson | 15.06 KB, text/plain
refcount balancer tree for nsSocketTransport object | 25 years ago | Chris Waterson | 19.92 KB, text/plain
refcount tree for nsSocketTransportService | 25 years ago | Chris Waterson | 24.66 KB, text/plain
patch | 24 years ago | Bradley Baetz (:bbaetz) | 10.52 KB, patch
new patch, using an array instead of a hashtable | 24 years ago | Bradley Baetz (:bbaetz) | 10.95 KB, patch
new patch | 24 years ago | Bradley Baetz (:bbaetz) | 11.53 KB, patch
updated patch | 24 years ago | Bradley Baetz (:bbaetz) | 13.15 KB, patch | darin.moz: superreview+
works | 24 years ago | Bradley Baetz (:bbaetz) | 12.93 KB, patch | dougt: review+, darin.moz: superreview+