Closed Bug 110241 Opened 23 years ago Closed 23 years ago

After visiting the ftp site, and going on to other urls Mozilla will automatically refresh the ftp connection

Categories

(Core Graveyard :: Networking: FTP, defect, P1)

x86
Linux
defect

Tracking

(Not tracked)

VERIFIED FIXED
mozilla0.9.9

People

(Reporter: mozilla-bugzilla, Assigned: bbaetz)

Details

(Keywords: testcase)

Attachments

(6 files)

From Bugzilla Helper:
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:0.9.5+) Gecko/20011018 Netscape6/6.5
BuildID:    2001101419

After visiting an ftp site and then going on to a web site, Mozilla will, after
a set time, reload the ftp site and take you back to it.

Reproducible: Sometimes
Steps to Reproduce:
1. Go to ftp://ftp.rpmfind.net/linux/rawhide/1.0/i386/RedHat/RPMS/
2. Go to http://www.mozilla.org/
3. Wait a minute and Mozilla will automatically take you back to the ftp site

Actual Results:  Display resets back to ftp site

Expected Results:  Display stays on the page you navigated to

Seen still in recent nightly builds
Linux problem only?

WFM (and I waited several minutes). Build ID: 2001 11 14 03. Windows 2000.

do you have any other network, ftp, proxy or other stuff configured in linux
that would be helpful to verify?

WFM on W2K also 2001-11-12-03
As I marked, the problem only happens sometimes. But for the last week I have
been seeing it at least 2-3 times a day, and earlier I had a tab that would reset
to the ftp site after about 60 seconds; I'm not sure of the exact time.
I can't reproduce this, at all. Is it only this ftp site, or others too?
I have seen the bug since I reported it. I haven't found its exact cause.
Things that will probably make it likely to show. Open 3-4 tabs, go to ftp
servers in each, go to 2-3 websites in each after the ftp site. Go back 1-2 in
each. Wait, and see if any reset themselves to the ftp site. You may also want
to go to a website before you go to the ftp site in half of them.
Hmm. I still can't reproduce this.

benc?
Keywords: qawanted
I am also seeing this, Mozilla 0.9.7, Alpha Linux (RedHat 7.1).
I have been seeing it for a while, but (like the reporter) my attempts to
reproduce it have not been very fruitful.
Is this only reproducible with tabbed browsing enabled?
*** Bug 119733 has been marked as a duplicate of this bug. ***
To tell you the truth, I didn't know tabbed browsing existed till you asked
this question. So no, I'm not using tabbed browsing. (I don't know if it is
"enabled", but I use separate windows only)

A few other things: It takes exactly 5 minutes to happen for me, judging by
my ftp server logs after connecting with mozilla. It registers as going forward
a page: hitting "back" will get you to the page you were last at, not the one
before the initial connection to the ftp server.

Running linux 2.4.17, slightly hacked-with slackware 8(glibc223).
tcpdump log

tcpdump -i eth0 host susan.che.ncsu.edu and port ftp
I did a tcpdump and looked for ftp packets (see attachment).  It looks like
Mozilla waits to close the ftp connection for a few minutes.  Sometimes the
closing works ok, but sometimes it also redisplays the ftp in Mozilla.  From the
tcpdump log, it first closes the connection, and then makes a new connection. 
Very strange.  When the connection is closed properly (Mozilla does not go back
to the ftp page), the connection is just closed, with no reconnection attempt. 
This does seem to happen at exactly 5 minutes.  Note that many ftp servers use a
5 minute timeout on connections.  Perhaps Mozilla is confused that it can't
close a connection because it's already dead on the other end.

Most recently, I was on a page with a Javascript thing to open an ad window when
I left their page, and that Javascript was activated when Mozilla went back to
the ftp page.

Note to tcpdump newbies: the time is on the far left, my hostname is
susan.che.ncsu.edu (ftp site is jungle.metalab.unc.edu).  S is for making a new
connection and F is for closing one.
I still can't reproduce this. 5 minutes is the timeout to remove cached ftp
connections. Can you try shutting down mozilla, then adding to prefs.js in your
profile's folder:

pref("network.ftp.idleConnectionTimeout", 30);

then try this again? See if it happens after 30 seconds instead of 5 minutes. I
looked through that code, and I can't see why this would be happening.

If you're on windows, can you try this by starting mozilla from the command
line, after setting the environment variables

NSPR_LOG_MODULES=nsFTPProtocol:5
NSPR_LOG_FILE=ftp.110241.log

and then attach the file ftp.110241.log to this bug? That debugging code is only
enabled on windows in opt build - if you have a debug build on another platform
you could use that to create the log file.
*** Bug 119801 has been marked as a duplicate of this bug. ***
OK, I've managed to reproduce this, sort of.

The one time I managed to get a log, we were getting a timeout message from the
server on a connection, and so we resumed the load.

I haven't managed to get this to happen when I have both a packet dump and a
debug log, though, so I'm not sure which connection this is happening on.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: qawanted
Attached file NSPR_LOG
This is the NSPR_LOG.  I made two connections. The first was ok, and the
second exhibited the bug behavior.
Attached file tcp log
this tcp log corresponds to the NSPR_LOG
setting the idleConnectionTimeout changed the behavior, but I was unable to
reproduce the bug while it was on.  when set to 400, the remote site closed 1
(of 2) connection at 5 minutes and mozilla attempted to close the other at 400,
but got no response from the remote site.  However, Mozilla didn't have any
problem with that.  when set to 30, it closed the connection ok at 30 seconds.

Anyway, I disabled the pref and got it to go with a debug build.  The first one
in the logs (ftp.redhat.com) worked ok, but the second (jungle.metalab.unc.edu,
AGAIN) exhibited the bug.
OK, got it:

1024[8099b90]: (89b7a98) reading 78 bytes: "150 Opening ASCII mode data
connection for file list
226 Transfer complete."

We're not handling reading more than one ftp response in the same packet - the
logic assumes that since we only send out one request at a time, we only receive
one response at a time, which isn't strictly true.
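
To illustrate the point above, here is a minimal sketch (not the Mozilla code) of splitting one control-channel read into its CRLF-terminated lines. The helper name SplitReplies and the use of std::string are assumptions made for the example; the real code works on a raw char buffer. The idea it shows is that the loop keeps going until the buffer is exhausted instead of stopping after the first reply.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, for illustration only: split the bytes from a single
// read of the FTP control connection into CRLF-terminated lines.
// One read may carry more than one reply (e.g. "150 ..." and "226 ...").
std::vector<std::string> SplitReplies(const std::string& data) {
    std::vector<std::string> lines;
    std::string::size_type pos = 0;
    while (pos < data.size()) {
        std::string::size_type eol = data.find("\r\n", pos);
        if (eol == std::string::npos) {
            // Incomplete trailing line. This sketch drops it; a real parser
            // would have to keep it for the next read.
            break;
        }
        assert(eol >= pos);
        lines.push_back(data.substr(pos, eol - pos));
        pos = eol + 2;  // step past the CRLF so the loop makes progress
    }
    return lines;
}
```

Fed the 150/226 pair from the log above as a single buffer, this returns two lines, which is the case the old logic did not expect.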
Status: NEW → ASSIGNED
Priority: -- → P1
Target Milestone: --- → mozilla0.9.9
Attached patch patch
OK, here we go. I've rewritten this code, basically from scratch. It's probably
easier to apply the patch and then read through the routine, rather than look at
the diff directly. I think I've taken care of several cases which were handled
incorrectly before (eg the partial buffer only having the first digit of the
response code) as well as the case in this bug. I can't get this to reproduce,
though, on my machine. If someone who was seeing this before could try it and
test, that would be great.
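
The partial-buffer case mentioned above can be sketched as follows. This is an illustration under stated assumptions, not the patch itself: the class name ControlLineBuffer and its Feed method are hypothetical, and only the member name mCarryOver loosely echoes the carry-over buffer in the patch. Each read is appended to the carry-over buffer, complete lines are handed back, and any unfinished tail (even a single digit of the response code, or a lone CR) waits for the next read.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical class, for illustration only: accumulates control-connection
// reads and returns only the complete CRLF-terminated lines.
class ControlLineBuffer {
public:
    // Append one read's worth of data and return every complete line.
    std::vector<std::string> Feed(const std::string& chunk) {
        mCarryOver += chunk;
        std::vector<std::string> lines;
        std::string::size_type pos = 0;
        for (;;) {
            std::string::size_type eol = mCarryOver.find("\r\n", pos);
            if (eol == std::string::npos) {
                break;  // the rest is a partial line; keep it for later
            }
            lines.push_back(mCarryOver.substr(pos, eol - pos));
            pos = eol + 2;
        }
        assert(pos <= mCarryOver.size());
        mCarryOver.erase(0, pos);  // drop what was consumed, keep the tail
        return lines;
    }

private:
    std::string mCarryOver;  // bytes of a line that has not ended yet
};
```

Because the search always runs over the accumulated buffer, a CRLF that is split across two reads is still found once the second half arrives.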
NSPR_LOG showing the patch working properly
I can get this bug to occur every time by visiting an uncached URL on
jungle.metalab.unc.edu.

Works fine every time with the patch.
Comment on attachment 67236
patch

>Index: nsFtpConnectionThread.cpp

>+        char* eol = PL_strstr(currLine, CRLF);

you should use strstr... it's available XP :-)


>+            // I can't use Substring + PromiseFlatCString - see bug 122727
>+            //mResponseCode = atoi(PromiseFlatCString(Substring(mResponseReadCarryOverBuf,0,3)).get());
>+            
>+            // This relies on the whitespace or '-' after the response code
>+            mResponseCode = atoi(mControlReadCarryOverBuf.get());

yuck!  how about this instead:

  char buf[4] = {0};
  memcpy(buf, mControlReadCarryOverBuf.get(), 3);
  mResponseCode = atoi(buf);

sr=darin with these changes.
Attachment #67236 - Flags: superreview+
Comment on attachment 67236
patch

In general, I like the changes.  However, I want to see a complete regression
suite run when this lands.  tever can help here.


+    char* currLine = buffer;
+    while (currLine < (buffer+aCount)) {
+        char* eol = PL_strstr(currLine, CRLF);
+        if (!eol) {
+            mControlReadBrokenLine = PR_TRUE;
+            mControlReadCarryOverBuf += currLine;
+            break;
+        }

Unless I am missing something, this code will loop forever if the input ends
with a CRLF.
dougt: You're missing the currLine = eol + 2 at the end of the while loop

I'll make the other changes. (was that an r=dougt?)
Attachment #67236 - Flags: review+
Checked in.

tever: Can you please run that regression test suite?
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
FWIW: current cvs linux has a problem in that trying to reach ftp links now
freezes the app.

ftp://sunsite.dk/mirrors/lokigames/updates/
ftp://ftp.planetmirror.com/pub/lokigames/updates/

Both cause a hang. The latter also spawns a cryptic alert saying:
"n also access these via a web interface reachable from:"

Clicked OK and froze there too.
/me mutters under his breath.
I'll need to reread the ftp spec, I think. I think that what they are doing is
invalid. Anyway, I'll cook up a patch later today.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attached patch fix
OK, I misparsed the ftp spec. The last line of a response starts with the
response code + space, but the lines before it don't have to start with the
response code.

I've also got rid of two write-only booleans. (Well, one write-only, and the
other was read only after it was written to in the same routine)
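
For reference, the multi-line rule from RFC 959 (section 4.2) can be sketched like this. It is a rough illustration, not the checked-in fix: the function names StartsWithCode and FindReplyEnd are hypothetical, and it assumes the lines have already been split on CRLF. The first line of a multi-line reply is the code followed by "-", the last line is the same code followed by a space, and the lines in between are free-form.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true if the line is three digits followed by `sep`.
static bool StartsWithCode(const std::string& line, char sep) {
    return line.size() >= 4 &&
           std::isdigit(static_cast<unsigned char>(line[0])) &&
           std::isdigit(static_cast<unsigned char>(line[1])) &&
           std::isdigit(static_cast<unsigned char>(line[2])) &&
           line[3] == sep;
}

// Hypothetical helper: index of the line that ends the first reply in
// `lines`, or -1 if the reply is malformed or not yet complete.
int FindReplyEnd(const std::vector<std::string>& lines) {
    if (lines.empty()) {
        return -1;
    }
    if (StartsWithCode(lines[0], ' ')) {
        return 0;  // single-line reply: "226 Transfer complete."
    }
    if (!StartsWithCode(lines[0], '-')) {
        return -1;  // a reply must open with "NNN " or "NNN-"
    }
    const std::string code = lines[0].substr(0, 3);
    assert(code.size() == 3);
    for (std::size_t i = 1; i < lines.size(); ++i) {
        // Only "same code + space" ends the reply. Lines in between may
        // contain anything, including text with no code at all.
        if (StartsWithCode(lines[i], ' ') && lines[i].compare(0, 3, code) == 0) {
            return static_cast<int>(i);
        }
    }
    return -1;  // the closing line has not arrived yet
}
```

A server banner whose middle lines are plain text is therefore valid, which is the case the first patch rejected.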
Built with the patch: No more hang, directories show as expected, all is well.
(The alert I mentioned earlier doesn't attempt to appear anymore. Should it?)

Anyway: the patch is good.
Comment on attachment 67622
fix

Looks fine.  Nice that you could remove the two bools. Again, let's make sure
that tom runs a full regression suite.
Attachment #67622 - Flags: review+
Comment on attachment 67622
fix

sr=darin
Attachment #67622 - Flags: superreview+
The alert was a side effect of stuff left in the buffer after an alert.

Checked in.
Status: REOPENED → RESOLVED
Closed: 23 years ago → 23 years ago
Resolution: --- → FIXED
*** Bug 124040 has been marked as a duplicate of this bug. ***
VERIFIED. Please reopen if this recurs in 1.0.

I hadn't seen this in any recent Linux testing.
Status: RESOLVED → VERIFIED
+testcase, to catch regressions.
Keywords: testcase
Product: Core → Core Graveyard