Closed Bug 65220 Opened 24 years ago Closed 24 years ago

FTP clobbers CPU

Categories

(Core Graveyard :: Networking: FTP, defect)

x86
All
defect
Not set
critical

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9

People

(Reporter: kelson, Assigned: dougt)

Details

(Keywords: perf)

Accessing an FTP URL maxes out the CPU on my system (AMD K6-2 400 MHz).  I first
noticed this on an HTML page with FTP links using "Save Link As..."
(http://rpmfind.net/linux/rpm2html/search.php?query=amaya)
I tried several different links (the first one I tried had an erroneous "//" in
the path, but that turned out not to be the issue).  In each case, once it
started connecting, CPU usage jumped to 100% and did not drop to normal levels
until I closed Mozilla, even if I cancelled the download.

I then tried going to ftp://ftp.mozilla.org, and watched the CPU usage jump to
100% before it even started displaying anything.

This seems to be new, because the problem doesn't show up in Mozilla 0.7.
This is a dupe of bug 65177

*** This bug has been marked as a duplicate of 65177 ***
Status: UNCONFIRMED → RESOLVED
Closed: 24 years ago
Resolution: --- → DUPLICATE
Whoops, misread stuff. This isn't a dupe, reopening.
Status: RESOLVED → UNCONFIRMED
Resolution: DUPLICATE → ---
*** Bug 65340 has been marked as a duplicate of this bug. ***
Confirming bug, sev. major. linux 2001011306.
Severity: normal → major
Status: UNCONFIRMED → NEW
Ever confirmed: true
seen on Linux/NT/Windows98 - OS: All.
OS: Windows NT → All
I would like to fix this but I cannot reproduce it.  kelson (or anyone else who 
can reproduce this), please send me details on your net configuration and what 
else you were doing in mozilla (e.g. did you have mail/news open, did you 
have any web pages open).

FTP is working most of the time.  I believe this is true since the tinderboxes 
are running it and I don't see pagecycler timeouts (except on linux, but I 
understand that this is another problem which is 'known').

Keywords: qawanted
As described in bug 65340 , all I need to do to reproduce the problem is go to:
ftp://ftp.slackware.com/pub/slackware/slackware-current/README72.TXT
This was on a Windows NT system (service pack 6a, totally current) with a 10 MB
Ethernet connection to the office LAN.  The internal network is connected to a
Linux box running IP masquerading, which connects to our external network, which
is connected via T-something to Sprint.  Let's just say "bandwidth to spare."

I was using build 2001011106 (the one linked to by Mozillazine) with two or
three WWW windows open, one of which was a page of FTP links.  I tried to
download one of them, the download progress box popped up, usage skyrocketed and
my computer started acting like molasses.  I closed Mozilla, checked in the Task
Manager to make sure the process was gone, ran it again, going straight to that
page, and had the same thing happen.  I tried the same page in Mozilla 0.7 and
was able to download without it tying everything up.  Then I tried just bringing
up a window and going straight to ftp.mozilla.org.  The nightly did eventually
bring up the listing, but it was very slow and it tied up the CPU while it did.

I haven't had a chance to download any later builds at home, but I have no
problems with build 2001010512 on Linux.
kelson, could you try a newer build?  I just marked a similar bug as a dup
because of a bug in htmlparser which was fixed on the 12th.
i still see this on linux 2001011306

after having visited ftp.mozilla.org and then gone to a html URL: even after
having closed the browser window and only having mailnews open, CPU keeps
reporting from 75% CPU to 99% used by mozilla-bin, without anything seemingly
happening in the browser. (Meaning not loading pages, getting mail etc.)

I did a netstat for the fun of it, and found that TWO ftp sockets remained open
to ftp.mozilla.org (or rather h-207-200-81-212.netscape.com ?)

These sockets don't time out, even if other sockets to URL's i visited later
(and remained on for control) - timed out as they should.

So something is wrong, and ftp triggers it.
*** Bug 65490 has been marked as a duplicate of this bug. ***
Note: There was no network traffic on the open sockets.
Also: dup 65490 is reported on a build from the 14th.
Reporter there wonders if it is chrome related.
Still present in build 2001011420 on Windows NT.
does anyone have access to a debug build that can reproduce this problem?  :-(


was able to reproduce after I updated my tree!  Sounds like a regression.  I
will start digging.
Status: NEW → ASSIGNED
hmm.  well, now I cannot reproduce it anymore.

I did however get a peek at what is going on.  The SocketTransport of the
control connection is being called over and over after the LIST command.  It is
being passed POLL_WRITE and its operation is eSocketState_WaitReadWrite.  It
tries to do a Read (?).  We fail with a NS_BASE_STREAM_WOULD_BLOCK error.  

More digging continues....
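
For readers unfamiliar with the state described above, here is a minimal sketch of the same situation using plain Python sockets rather than Mozilla's SocketTransport. It assumes a local socket pair as a hypothetical stand-in for the FTP control connection; `BlockingIOError` plays the role of NS_BASE_STREAM_WOULD_BLOCK. It is an illustration only, not the necko code path.

```python
import select
import socket

# Hypothetical stand-in for the FTP control connection: a connected local
# socket pair with no pending data in either direction.
control, peer = socket.socketpair()
control.setblocking(False)

# An idle socket normally has free send-buffer space, so polling it for
# write readiness (the POLL_WRITE analogue) reports it ready right away.
_, writable, _ = select.select([], [control], [], 0)
ready_for_write = control in writable

# Attempting a read at that point finds nothing to read. On a non-blocking
# socket Python raises BlockingIOError instead of waiting.
try:
    control.recv(16)
    read_would_block = False
except BlockingIOError:
    read_would_block = True

control.close()
peer.close()
print(ready_for_write, read_would_block)
```

The socket is reported as ready, yet the read has nothing to do, which is the combination that lets a poll loop spin without making progress.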
Okay, well, I do not have to peg the cpu(s) at 100% to see the problem.  Here is 
a tail of the socket transport log:


756[18ee350]: +++ Entering nsSocketTransport::Process() [ftp.netscape.com:21 5078910].  aSelectFlags = 2.  CurrentState = 5
756[18ee350]: +++ Entering nsSocketTransport::doRead() [ftp.netscape.com:21 5078910].  aSelectFlags = 2.
756[18ee350]: nsReadFromSocket [fd=4766340].  rv = 80470007. Buffer space = 16.  Bytes read =0
756[18ee350]: WriteSegments [fd=4766340].  rv = 80470007. Bytes read =0
756[18ee350]: --- Leaving nsSocketTransport::doRead() [ftp.netscape.com:21 5078910]. rv = 80470007.  Total bytes read: 0

756[18ee350]: +++ Entering nsSocketTransport::doWrite() [ftp.netscape.com:21 5078910].  aSelectFlags = 2.  mWriteCount = 0
756[18ee350]: --- Leaving nsSocketTransport::doWrite() [ftp.netscape.com:21 5078910]. rv = 80470007.  Total bytes written: 0

756[18ee350]: --- Leaving nsSocketTransport::Process() [ftp.netscape.com:21 5078910]. mStatus = 80470007.  CurrentState = 5


This loops repeatedly, unbounded.  

more digging...
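
The rv value 80470007 in the log above can be taken apart with a few lines of Python. This is a sketch that assumes the usual nsresult layout (failure flag in the top bit, module number plus an offset of 0x45 in the next bits, error code in the low 16 bits); the helper name `decode_nsresult` is made up for illustration. Under that assumption the value decodes to a failure in module 2 with code 7, which is consistent with the NS_BASE_STREAM_WOULD_BLOCK error mentioned earlier in this bug.

```python
# Assumed nsresult layout: bit 31 = failure flag, bits 16-28 = module number
# plus an offset of 0x45, bits 0-15 = error code.
MODULE_BASE_OFFSET = 0x45


def decode_nsresult(rv):
    """Split a 32-bit nsresult into (failed, module, code)."""
    failed = bool(rv & 0x80000000)
    module = ((rv >> 16) - MODULE_BASE_OFFSET) & 0x1FFF
    code = rv & 0xFFFF
    return failed, module, code


# The value logged by doRead(), doWrite() and Process() above.
failed, module, code = decode_nsresult(0x80470007)
print(failed, module, code)
```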
failing with BASE_STREAM_WOULD_BLOCK should put the socket back on the select
list...

was it an AsyncRead or AsyncWrite that started this sequence?
the control socket transport (write) is never removed from the transport select 
list.  This causes the socket transport service to loop over all active 
transports even though there is no data for any of the transports.  A classic 
busy waiting bug.

Marking this bug dependent on 65272.  I will see if I can find a simple 
workaround for the time being, otherwise I will try fixing 65272 in the next 
week or so.
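
The busy-waiting pattern described in the comment above can be reproduced in miniature with Python's `select` module. This is a sketch under stated assumptions, not the socket transport service itself: `count_wakeups` is a hypothetical helper, and a local socket pair stands in for the idle control connection. It counts how often the poll call reports the idle socket as ready, with and without write interest left registered.

```python
import select
import socket


def count_wakeups(watch_for_write, rounds=20, timeout=0.01):
    """Count how often select() reports an idle socket as ready."""
    a, b = socket.socketpair()
    try:
        wakeups = 0
        for _ in range(rounds):
            # Leaving the socket on the write list mimics a transport that
            # is never removed from the select list after its write is done.
            wlist = [a] if watch_for_write else []
            readable, writable, _ = select.select([a], wlist, [], timeout)
            if readable or writable:
                wakeups += 1
        return wakeups
    finally:
        a.close()
        b.close()


spinning = count_wakeups(watch_for_write=True)
sleeping = count_wakeups(watch_for_write=False)
print(spinning, sleeping)
```

With write interest registered, every round returns immediately because an idle socket is always writable, so the loop never sleeps. With read interest only, each round waits out its timeout and the CPU stays idle.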
Depends on: 65272
Keywords: qawanted
No longer depends on: 65272
A quick hack around this problem might be to put your call to AsyncWrite in
your pipe's nsIOutputStream OnWrite observer method.  You could then set the
transferCount on the socket, corresponding to the amount passed into OnWrite,
before calling AsyncWrite.  Fortunately, the socket transport only uses
transferCount for AsyncWrite/OpenOutputStream, so I think this should work.
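
The general idea behind this workaround, asking for write notifications only once there is data to send, can be sketched with Python's `selectors` module. The sketch does not use the necko interfaces named above (AsyncWrite, OnWrite, transferCount); the `Connection` class and `spin` helper are hypothetical names, and a local socket pair stands in for the control connection.

```python
import selectors
import socket


class Connection:
    """Hypothetical wrapper that wants write events only while data is queued."""

    def __init__(self, sock, selector):
        self.sock = sock
        self.selector = selector
        self.pending = b""
        selector.register(sock, selectors.EVENT_READ, self)

    def queue(self, data):
        # Request write readiness only now that there is something to send.
        self.pending += data
        both = selectors.EVENT_READ | selectors.EVENT_WRITE
        self.selector.modify(self.sock, both, self)

    def on_writable(self):
        sent = self.sock.send(self.pending)
        self.pending = self.pending[sent:]
        if not self.pending:
            # Nothing left to send: drop write interest so the loop can sleep.
            self.selector.modify(self.sock, selectors.EVENT_READ, self)


def spin(selector, rounds=20, timeout=0.01):
    """Run a few poll rounds and count how many events were delivered."""
    wakeups = 0
    for _ in range(rounds):
        for key, events in selector.select(timeout):
            wakeups += 1
            if events & selectors.EVENT_WRITE:
                key.data.on_writable()
    return wakeups


sel = selectors.DefaultSelector()
local, remote = socket.socketpair()
local.setblocking(False)
conn = Connection(local, sel)

idle_wakeups = spin(sel)       # nothing queued, so the loop only sleeps
conn.queue(b"LIST\r\n")
busy_wakeups = spin(sel)       # a single wakeup flushes the command
received = remote.recv(64)

sel.close()
local.close()
remote.close()
print(idle_wakeups, busy_wakeups, received)
```

Tying write interest to the amount of queued data means the socket leaves the write set as soon as the buffer drains, so an idle control connection no longer wakes the loop.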
Definitely smoking my CPU when visiting FTP URLs. 20010119
Using Build 20010119 on Linux kernel 2.2.16.
*** Bug 66395 has been marked as a duplicate of this bug. ***
My changes for bug 62566 mask this bug (in a hackish way), so please do not
mark this as invalid.  Thanks!
*** Bug 66214 has been marked as a duplicate of this bug. ***
This problem also shows up on SPARC/Solaris for build 2001012321.
Darin checked in a hack fix for this problem when he landed his AsyncWrite
changes.

Does anyone have access to that sunos box so that we can get a nspr log from it?
Blocks: 67358
If I use Mozilla for the windows installer download, it takes about 3 times
longer than when I use WSFTP. WSFTP downloads the file in about 25 seconds, and
Mozilla in 1 min 18 seconds.
H-J, on which build are you noticing this?  thx.
I have used build 2001012204 to 2001012920 for example. This is because of the
changes that caused many problems with more recent builds (PSM no longer working
(after a fresh install), dialog windows without buttons, bookmark menu flips
under taskbar). 
With 2/5/2001 build, I am not able to use FTP protocol
at all. If I go anywhere with the ftp protocol, the CPU
goes up to 100%. This is true even if the ftp site I am
visiting has only one directory.

This needs to be fixed ASAP.
Severity: major → critical
Katsuhiko, It will be fixed when we land the necko design changes as discussed 
in the public netlib newsgroup.
*** Bug 67833 has been marked as a duplicate of this bug. ***
Ummm...what's the time frame on this?
I think that this is not going to happen until dougt lands his branch.  Do we
want to do that right before a milestone?  It sounds pretty big.  Doug, what's
the feedback like on your testbuild?  I dropped the ball on recruiting help
after the problem with the initial builds. I suck but I can probably get a few
folks on it today and tomorrow if you think it's worth it (safe enough) trying
to push it up to get included with 0.8.
hey asa, test build went well (for the one person that actually sent me
feedback).  I am going to land these changes when the tree opens for 0.9.
*** Bug 68272 has been marked as a duplicate of this bug. ***
cc: sairuh, shrir
adding perf kw, and nominating for beta1. am assuming that it's too late to fix
this for mozilla0.8 --how bout 0.9? [correct me if am wrong there, tho'.]
Keywords: nsbeta1, perf
This will be fixed once dougt's branch lands, which should be very soon.

Setting 0.9 as target milestone.
Target Milestone: --- → mozilla0.9
This bug should be fixed by my recently checked in necko changes.  
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
This problem is not fixed for the 2001-02-21-09-Mtrunk Win32 build.
Is your checkin later than this build?

Unless you've pulled a tree or updated, you won't see it until the respins 
tomorrow.  He just checked in, today.
verified
Status: RESOLVED → VERIFIED
Product: Core → Core Graveyard