Closed
Bug 85514
Opened 24 years ago
Closed 24 years ago
downloading files on Mac sometimes fail [hang] midway
Categories
(NSPR :: NSPR, defect, P2)
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: bugzilla, Assigned: sfraser_bugs)
References
()
Details
(Keywords: platform-parity)
Attachments
(2 files)
16.73 KB, image/jpeg
2.66 KB, patch
couldn't find an existing bug, but pls do dup as needed.
summary: when i do an ftp download of a file on the mac, the download sometimes
fails [download progress meter hangs] midway.
to repro:
1. go to an ftp site, such as the one at sweetlou above [go into a build folder,
then].
2. single-click a file --i have been choosing files that are at least 10 Mbytes
in size. this should bring up the downloading/helper app dialog.
3. in the downloading dialog, make sure the "save to disk" option is selected,
then click OK.
4. in the resulting file picker ["enter name of file to save to..."], choose a
location --fwiw, here i don't save on the desktop; rather, i've been selecting a
subfolder that's located on the non-startup disk.
5. click Save button, and the download progress ["saving file"] dialog will
appear.
6. wait for the download to complete.
result: about half-way to three-quarters of the way thru, the progress meter in
the progress dialog stops moving. after about 3-5min, i give up and cancel the
download, and retry. it seems that roughly 2 out of 3 attempts are resulting in
this transfer failure.
tested using 2001.06.07.11-branch comm bits on Mac OS 9.0x G3.
grace et al., have you seen or heard of this?
Assignee
Comment 1•24 years ago
Reporter
Comment 2•24 years ago
hmmm...i read bug 71204, so yeah it's possible this is a dup. fwiw [in contrast
to pchen's 2001-03-07 11:58 comments], i've been using installer bits when i
encounter this. so, i don't think that might matter.
also, i haven't really seen failures of this sort on win32 or linux, at least
not for some time --so adding 'pp'.
need to see if this is limited to ftp transfers...or if it also happens with
http downloads as well...
Comment 4•24 years ago
i have seen this recently (the 6/5/01 trunk bits), trying to d/l a 15+MB file.
Severity: major → critical
Assignee
Comment 5•24 years ago
Snippet from a protocol log:
3[d852398]: nsSocketReadRequest: [this=dfd1d58] inside OnRead.
3[d852398]: nsSocketReadRequest: [this=dfd1d58] calling listener [offset=8528440, count=8192]
3[d852398]: nsSocketIS: PR_Read(count=8192) returned 2920
3[d852398]: nsSocketIS: PR_Read(count=5272) returned -1
3[d852398]: nsSocketIS: PR_Read() failed with PR_WOULD_BLOCK_ERROR
3[d852398]: nsSocketReadRequest: listener returned [rv=0]
3[d852398]: nsSocketReadRequest: [this=dfd1d58] read 2920 bytes [offset=8531360]
3[d852398]: nsSocketTransport: doReadWrite [readstatus=80470007 writestatus=0 readsuspend=0 writesuspend=0 mSelectFlags=5]
3[d852398]: nsSocketTransport: Leaving Process() [host=208.12.36.227:23593 this=e1e9f5c], mStatus = 80470007, CurrentState=5, mSelectFlags=5
3[d852398]: nsSocketTransport: Entering Process() [host=208.12.36.227:23593 this=e1e9f5c], aSelectFlags=1, CurrentState=5.
3[d852398]: nsSocketTransport: Transport [host=208.12.36.227:23593 this=e1e9f5c] is in WaitReadWrite state [readtype=1 writetype=0 status=80470007].
3[d852398]: nsSocketTransport: doReadWrite [this=e1e9f5c, aSelectFlags=1, mReadRequest=dfd1d58, mWriteRequest=0
This sounds similar to bug 70408
Reporter
Comment 6•24 years ago
Assignee
Comment 7•24 years ago
This is a Mac NSPR bug.
To explain the protocol log stuff I pasted above (PR_Read returning
PR_WOULD_BLOCK_ERROR) -- darin says that PR_Read can be called repeatedly until
the necko buffer is full (without intervening PR_Poll calls), until the PR_Read
returns 0 (to indicate EOF), or PR_WOULD_BLOCK_ERROR. It happens that on Mac, we
can only detect an EOF by virtue of receiving an orderly release request from the
server, and this can happen some time after we've read all available data from
the stream. So there is a time window in which a second read (after the first has
read all available data) will return PR_WOULD_BLOCK_ERROR, because OTRcv gives us
a kOTNoDataErr, but we have not yet received the orderly release request.
This explains why protocol logs can look different between platforms, and why
seeing PR_WOULD_BLOCK_ERROR in Mac logs is benign.
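A minimal sketch of the read loop described above, as a self-contained simulation. Nothing here is NSPR code: SimSocket, sim_read(), drain() and the SIM_/DRAIN_ constants are hypothetical stand-ins (sim_read plays the role of PR_Read, and release_received models the orderly release request arriving late). It only illustrates why a second read can report "would block" after all available data has been consumed but before the release is seen.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for this sketch; the real code calls PR_Read()
 * and checks PR_GetError() for PR_WOULD_BLOCK_ERROR. */
enum { SIM_WOULD_BLOCK = -1 };

typedef struct {
    int  available;         /* bytes currently readable on the stream */
    bool release_received;  /* orderly release from the server seen yet? */
} SimSocket;

/* Returns bytes read (> 0), 0 for EOF, or SIM_WOULD_BLOCK. */
static int sim_read(SimSocket *s, int want)
{
    if (s->available > 0) {
        int n = want < s->available ? want : s->available;
        s->available -= n;
        return n;
    }
    /* No data left: EOF can only be reported once the release arrived. */
    return s->release_received ? 0 : SIM_WOULD_BLOCK;
}

typedef enum { DRAIN_BUFFER_FULL, DRAIN_EOF, DRAIN_WOULD_BLOCK } DrainStatus;

/* Read repeatedly, with no poll in between, until the buffer is full,
 * EOF is reported, or the read would block. */
static DrainStatus drain(SimSocket *s, int capacity, int *total)
{
    *total = 0;
    while (*total < capacity) {
        int n = sim_read(s, capacity - *total);
        if (n == 0) return DRAIN_EOF;
        if (n < 0)  return DRAIN_WOULD_BLOCK;
        *total += n;
    }
    return DRAIN_BUFFER_FULL;
}
```

With 2920 bytes available and no release yet, drain() with an 8192-byte buffer reads 2920 bytes and then stops with DRAIN_WOULD_BLOCK, which mirrors the PR_Read(count=8192) / PR_Read(count=5272) pair in the log above.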
So the real problem in this bug is a race condition in Mac NSPR, I think. Some
instrumentation shows that we stall when the OT notifier fires while we're inside
of SendReceiveStream(). I think we're clobbering the value of me->io_pending,
which needs to be protected by a lock.
Assignee: dougt → gordon
Assignee
Comment 8•24 years ago
Throwing some _PR_INTSOFF/PR_Lock / PR_Unlock/_PR_FAST_INTSON around did not
help. Looking some more, what I believe is happening is this:
We're in SendReceiveStream(), calling OTRcv(). While we are in OTRcv(), the
notifier strikes. The OTRcv() returns kOTNoDataErr, so we set
fd->secret->md.readReady to FALSE (thus clobbering the value that the notifier
put in there).
Messing with interrupts and locks therefore doesn't help.
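To make the sequence concrete, here is a small single-threaded simulation of the clobber. It is only a sketch: SimFd, sim_notifier() and sim_receive_racy() are hypothetical names, not the Mac NSPR code, and the "notifier" is simply called in the middle of the receive function to stand in for it striking while we are inside OTRcv().

```c
#include <stdbool.h>

/* Hypothetical model of the per-fd state, for illustration only. */
typedef struct {
    bool read_ready;  /* stands in for fd->secret->md.readReady */
    int  available;   /* bytes queued on the endpoint */
} SimFd;

/* Stand-in for the OT notifier: data has arrived, mark the fd readable. */
static void sim_notifier(SimFd *fd, int bytes)
{
    fd->available += bytes;
    fd->read_ready = true;
}

/* Stand-in for the receive path. 'arriving' > 0 simulates the notifier
 * striking after the receive call has sampled the (empty) queue but
 * before it returns. Returns bytes read, or -1 for "no data". */
static int sim_receive_racy(SimFd *fd, int arriving)
{
    int n = fd->available;   /* the receive call sees what is queued now */
    fd->available -= n;

    if (arriving > 0)
        sim_notifier(fd, arriving);   /* notifier strikes mid-call */

    if (n == 0) {
        fd->read_ready = false;       /* clobbers the notifier's TRUE */
        return -1;
    }
    return n;
}
```

Starting from an empty fd, sim_receive_racy(&fd, 1460) returns -1 and leaves 1460 bytes queued with read_ready false, so nothing will wake the reader: the stall seen in the download. No lock around the final store changes that outcome, because the store itself is what discards the notifier's update.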
Assignee
Comment 9•24 years ago
Assignee
Comment 10•24 years ago
Using OTEnterNotifier/OTLeaveNotifier in SendReceiveStream() fixes this. These
calls prevent the notifier from firing while we are in the read/write loop, and
so prevent clobbering of fd->secret->md.readReady/fd->secret->md.writeReady and
me->io_pending.
I have *not* tested the blocking version of the code. We may also need a similar
fix in SendReceiveDgram() (what uses this?).
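A rough simulation of why bracketing the loop helps, under the assumption that a notification held off during the bracketed region is delivered once the region is left. All names here (SimEndpoint, sim_enter_notifier(), sim_leave_notifier(), sim_data_arrives(), sim_receive_guarded()) are hypothetical; this is not the patch itself and does not call Open Transport.

```c
#include <stdbool.h>

/* Hypothetical endpoint model, for illustration only. */
typedef struct {
    bool read_ready;      /* stands in for fd->secret->md.readReady */
    int  available;       /* bytes queued on the endpoint */
    bool notifier_held;   /* inside the enter/leave bracket? */
    bool notify_pending;  /* a notification was held off */
} SimEndpoint;

static void sim_enter_notifier(SimEndpoint *ep) { ep->notifier_held = true; }

static void sim_leave_notifier(SimEndpoint *ep)
{
    ep->notifier_held = false;
    if (ep->notify_pending) {     /* assumed: held-off event runs now */
        ep->notify_pending = false;
        ep->read_ready = true;
    }
}

/* Data arrives; the notifier runs immediately unless it is held off. */
static void sim_data_arrives(SimEndpoint *ep, int bytes)
{
    ep->available += bytes;
    if (ep->notifier_held)
        ep->notify_pending = true;
    else
        ep->read_ready = true;
}

/* Guarded receive: same steps as the racy path, but bracketed.
 * Returns bytes read, or -1 for "no data". */
static int sim_receive_guarded(SimEndpoint *ep, int arriving)
{
    sim_enter_notifier(ep);

    int n = ep->available;        /* the receive call samples the queue */
    ep->available -= n;

    if (arriving > 0)
        sim_data_arrives(ep, arriving);  /* event occurs mid-call */

    if (n == 0)
        ep->read_ready = false;   /* safe: the notifier has not run yet */

    sim_leave_notifier(ep);       /* held-off notifier sets the flag last */
    return n == 0 ? -1 : n;
}
```

Here sim_receive_guarded(&ep, 1460) on an empty endpoint still returns -1, but read_ready ends up true because the notification is applied after the flag was cleared, so the next poll picks up the queued data.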
Comment 11•24 years ago
r=wtc. By the way, please restore the original while (...) {
indentation style. I would not worry about the blocking version
of the code as long as your patch is a strict improvement over
the current code.
Assignee
Updated•24 years ago
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.2
Assignee
Comment 13•24 years ago
0.9.2
Assignee
Comment 14•24 years ago
I tested https, FTP, IMAP, and the page load tests. Everything checks out. Since
I can check in to NSPR with just an r=, I'm ready to go.
Priority: -- → P2
Assignee
Comment 15•24 years ago
*** Bug 84826 has been marked as a duplicate of this bug. ***
Comment 16•24 years ago
Wrong. You don't need an sr=, but you need an a=.
Remember to check in the same fix on the trunk of NSPR.
Assignee
Comment 17•24 years ago
Who has to give a=? An NSPR module owner, or drivers?
Comment 18•24 years ago
drivers@mozilla.org. Treat the NSPRPUB_CLIENT_BRANCH as if
it were the trunk of Mozilla client.
As for the trunk of NSPR, there is no sr= or a= requirement.
Comment 19•24 years ago
a=blizzard on behalf of drivers for the trunk
Assignee
Comment 20•24 years ago
Checked into NSPRPUB_CLIENT_BRANCH, and the NSPR tip.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 21•24 years ago
*** Bug 76899 has been marked as a duplicate of this bug. ***
Component: Networking: FTP → NSPR
Product: Browser → NSPR
Target Milestone: mozilla0.9.2 → ---