Installer Downloader stops frequently

VERIFIED FIXED in mozilla0.9.3

Status

VERIFIED FIXED
18 years ago
14 years ago

People

(Reporter: jlarsen, Assigned: ssu0262)

Tracking

Trunk
mozilla0.9.3
x86
Windows 98

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: PDT+ verified on branch)

Attachments

(5 attachments)

(Reporter)

Description

18 years ago
My network connection is sometimes buggy, and downloads will freeze for a few 
seconds while I'm downloading. Mozilla's network installer recently responded to 
this by making me press the resume button, even though it only looses the stream 
for a second or two. I've never seen another downloader do this, and intaller 
used to not do this either. I don't remember when this showed up unfortunatly. 
Is anyone else seeing this problem?
(Assignee)

Comment 1

18 years ago
I have seen this from home via a 56k line.  I redownloaded the stub installer 
and also hung up the line and reconnected.  This had fixed my problems with the 
installer throwing me into pause mode frequently.

One possibility is that the installer could have been corrupted during the 
download.

Comment 2

18 years ago
Using the network installer for commercial builds from 7/17 and 7/18 I receive
network error messages before each different component is downloaded. This
occurs on a 28.8 dialup connection through sera. 

The installer is forced into pause mode and can be resumed.

I have not yet tested with the mozilla installer (it takes me hours to
completely perform an install). I will test using the mozilla installer later today.
(Assignee)

Updated

18 years ago
Status: NEW → ASSIGNED
Keywords: nsbeta1+, nsBranch
Target Milestone: --- → mozilla0.9.3
(Assignee)

Comment 3

18 years ago
Created attachment 42978 [details] [diff] [review]
patch #1
(Assignee)

Comment 4

18 years ago
This is essentialy a race condition problem.  Normally at the end of the first 
file downloaded, we close the data connection (but leave the control connection 
open).  By closing the data connection, the server will respond (at least ours 
will) with:

  226 Transfer completed.

We do have a flush function to try to catch this response.  The problem occurs 
when the user's connection to the server is either really slow or the machine is 
performing some network operations that causes a bottle neck for incoming 
traffic.

When this happens, the flush function has nothing to flush in the receive buffer 
because the 226 response has not yet arrived due to the bottle neck.  The code 
simply continues and attempts to make another data connection to the server for 
the next file.

On every new data connection, we wait until a response is received because we 
are expecting one.  At this point, we pick up the 226 response from when the 
data connection was closed.  We are instead expecting a 227 response, and so we 
fail and put the user in Pause mode.

The fix is to have the flush function wait for the known 226 response to arrive 
before continuing with the next file.  Not all servers might send a 226 response 
when the data connection is closed, so a maximun # of loops is performed before 
it puts the user in the Pause state.

In my tests (56k while performing a 700k ftp upload via SERA to a nfs file 
system), I have not seen more than 15 iterations while waiting for the 226 
response.  I have preset the max # of loops to be 50.  Hopefully this should be 
long enough even on 14.4k connections.

I really can't imagine why anyone would be using anything slower than a 14.4 
modem, let alone a 14.4 modem (syd doesn't count :)
Whiteboard: seeking r=, sr=
(Assignee)

Comment 5

18 years ago
I forgot to elaborate on what happens if a server does not send the expected 226 
response.  In this case, the installer will simply time out and continue with 
the next file.  It is only when there's an expected 226 response which 
interferes with the next file's data connection that this problem occurs.

Comment 6

18 years ago
Is there any real reason to wait for the 226?  Do we take any action in response
to it?  (I'm wondering if we can just ignore the ones we get rather than waiting
just to throw it away.)
(Assignee)

Comment 7

18 years ago
we have to wait for the 226 response.  We can't try to ignore it later when 
we're expecting the 227.  The reason being is that it might not be a 226 that's 
in the buffer (when waiting for the 227 response).  It could be something else 
altogether.

It is safer to wait for a 226 when we know there might be one, than to try to 
guess later when we're not sure if there might be one.

The whole process of keeping the connection open is not really correct right 
now.  The patch is a good way to fix the current problem for now.

Comment 8

18 years ago
Created attachment 43004 [details] [diff] [review]
modification of patch that also builds on Mac

Comment 9

18 years ago
tested my patch on the mac, downloads all xpi files.
(Assignee)

Comment 10

18 years ago
patch #2 coming up.  fixed the logic of checking for E_TIMEOUT within 
FlushCntl().

This one also builds on the mac For Me(TM) without Syd's changes from his patch.  
I'm not sure why it didn't build for Syd.

Syd, can you try patch #2 to see if it builds?
(Assignee)

Comment 11

18 years ago
Created attachment 43016 [details] [diff] [review]
patch #2 (changed per talk with dveditz).
nearly r= for patch #2 with two concerns:

1. not sure what syd's changes were about and you skip them. I want his OK on
this, or otherwise see a working Mac installer with these changes (not just a
compile)

2. the inner if() you added has a redundant term making it harder to read.
instead of checking for the conditions on which you bail ( x || ( y && z )) it's
clearer if you reverse the sense to the condition on which you *don't* bail:

  if (err==nsSocket::E_TIMEOUT && bailOnTimeOutCount < bailOnTimeOut)
    err == nsSocket::E_READ_MORE; // wait a little longer for response
  else
    goto BAIL;

Address those and r=dveditz.

Comment 13

18 years ago
PDT+, please check in if you get fully comfortable and fully reviewed.
Whiteboard: seeking r=, sr= → PDT+ seeking r=, sr=
(Assignee)

Comment 14

18 years ago
Created attachment 43025 [details] [diff] [review]
patch #3 with dveditz's suggestions.  compiles on mac and linux.
(Assignee)

Comment 15

18 years ago
Created attachment 43039 [details] [diff] [review]
patch #4 contains dan's suggestion and compiles on mac and linux
r=dveditz for patch #4

Comment 17

18 years ago
sr=mscott
(Assignee)

Comment 18

18 years ago
patch checked into the trunk.  working on the branch now.
Whiteboard: PDT+ seeking r=, sr= → PDT+

Comment 19

18 years ago
branch works fine for me with patch 4 applied.
(Assignee)

Comment 20

18 years ago
patch checked into the branch.  marking this bug fixed.

how to verify:
on a slow machine via a slow connection (56k should suffice, LAN might even 
work), do the following:

1) copy about a 2mb file over the net (if on LAN try a 20mb file or larger).  
Either to or from somewhere is okay.  At the same time this copy is happening, 
do 2) and 3) below.

2) using the build _without_ the fix, run the win32 stub installer (any setup 
type).  It should toss you into Pause mode automatically after warning you.  
This should happen at the beginning of each file (after the first file).

3) using the build _with_ the fix, run the win32 stub installer the same way as 
2.  The installer should not put you in Pause mode.

If you can't reproduce 2), you will proabably have to keep trying the installer.  
Either that or bog down the network on the test machine some more :).
Status: ASSIGNED → RESOLVED
Last Resolved: 18 years ago
QA Contact: gemal → gbush
Resolution: --- → FIXED

Comment 21

18 years ago
Verified now works on a RedHat 6.2 JP system both using en LANG and LC_ALL and 
ja_JP LANG and LC_ALL environments. 

Comment 22

18 years ago
This works great on my slow machine(win95) at home dl rate  <56k. Not one 
interruption compared to many(!) on dl 7/17 ( and previous downloads) 
need to find 20mb file to download to go through test cases.

Comment 23

18 years ago
verified on branch build 2001072406 Win 95, steps 1(dl 2mb file),2,3 as noted in 
comments above.  Network error rec'd using build without fix, build with fix did 
not put me in Pause mode
Bravo!

adding vtrunk reminder
Keywords: vtrunk
Whiteboard: PDT+ → PDT+ verified on branch

Comment 24

17 years ago
verified on trunk build 2001081305
over 56k modem- no problems/errors
Status: RESOLVED → VERIFIED
Keywords: vtrunk
Product: Browser → Seamonkey
You need to log in before you can comment on or make changes to this bug.