Closed Bug 85514 Opened 24 years ago Closed 24 years ago

downloading files on Mac sometimes fail [hang] midway

Tracking

(Not tracked)

Status:

VERIFIED FIXED

People

(Reporter: bugzilla, Assigned: sfraser_bugs)

References

(
URL
)

Details

(Keywords: platform-parity)

Attachments

(2 files)

fwiw, a screenshot when the dialog hangs (16.73 KB, image/jpeg), 24 years ago, sairuh (rarely reading bugmail)
Use OTEnter/LeaveNotifier in macsockotpt.c (2.66 KB, patch), 24 years ago, Simon Fraser [no longer active]

sairuh (rarely reading bugmail)

Reporter

Description

•

24 years ago

couldn't find an existing bug, but pls do dup as needed.

summary: when i do an ftp download of a file on the mac, the download sometimes fails [download progress meter hangs] midway.

to repro:
1. go to an ftp site, such as the one at sweetlou above [go into a build folder, then].
2. single-click a file --i have been choosing files that are at least 10 Mbytes in size. this should bring up the downloading/helper app dialog.
3. in the downloading dialog, make sure the "save to disk" option is selected, then click OK.
4. in the resulting file picker ["enter name of file to save to..."], choose a location --fwiw, here i don't save on the desktop; rather, i've been selecting a subfolder that's located on the non-startup disk.
5. click the Save button, and the download progress ["saving file"] dialog will appear.
6. wait for the download to complete.

result: about half-way to three-quarters of the way thru, the progress meter in the progress dialog stops moving. after about 3-5min, i give up, cancel the download, and retry. roughly 2 out of 3 attempts result in this transfer failure.

tested using 2001.06.07.11-branch comm bits on Mac OS 9.0x G3.

grace et al., have you seen or heard of this?

Simon Fraser [no longer active]

Assignee

Comment 1

•

24 years ago

Might be a dup of bug 71204, probably related to bug 53463

sairuh (rarely reading bugmail)

Reporter

Comment 2

•

24 years ago

hmmm...i read bug 71204, so yeah it's possible this is a dup. fwiw [in contrast to pchen's 2001-03-07 11:58 comments], i've been using installer bits when i encounter this. so, i don't think that might matter. also, i haven't really seen failures of this sort on win32 or linux, at least not for some time --so adding 'pp'. need to see if this is limited to ftp transfers...or if it also happens with http downloads as well...

sairuh (rarely reading bugmail)

Reporter

Comment 3

•

24 years ago

oops, really adding those kw's...

Keywords: mozilla0.9.2, nsbeta1, nsdogfood, pp

Mike Pinkerton (not reading bugmail)

Comment 4

•

24 years ago

i have seen this recently (the 6/5/01 trunk bits), trying to d/l a 15+MB file.

Severity: major → critical

Simon Fraser [no longer active]

Assignee

Comment 5

•

24 years ago

Snippet from a protocol log:

3[d852398]: nsSocketReadRequest: [this=dfd1d58] inside OnRead.
3[d852398]: nsSocketReadRequest: [this=dfd1d58] calling listener [offset=8528440, count=8192]
3[d852398]: nsSocketIS: PR_Read(count=8192) returned 2920
3[d852398]: nsSocketIS: PR_Read(count=5272) returned -1
3[d852398]: nsSocketIS: PR_Read() failed with PR_WOULD_BLOCK_ERROR
3[d852398]: nsSocketReadRequest: listener returned [rv=0]
3[d852398]: nsSocketReadRequest: [this=dfd1d58] read 2920 bytes [offset=8531360]
3[d852398]: nsSocketTransport: doReadWrite [readstatus=80470007 writestatus=0 readsuspend=0 writesuspend=0 mSelectFlags=5]
3[d852398]: nsSocketTransport: Leaving Process() [host=208.12.36.227:23593 this= e1e9f5c], mStatus = 80470007, CurrentState=5, mSelectFlags=5
3[d852398]: nsSocketTransport: Entering Process() [host=208.12.36.227:23593 this= e1e9f5c], aSelectFlags=1, CurrentState=5.
3[d852398]: nsSocketTransport: Transport [host=208.12.36.227:23593 this=e1e9f5c] is in WaitReadWrite state [readtype=1 writetype=0 status=80470007].
3[d852398]: nsSocketTransport: doReadWrite [this=e1e9f5c, aSelectFlags=1, mReadRequest=dfd1d58, mWriteRequest=0

This sounds similar to bug 70408

sairuh (rarely reading bugmail)

Reporter

Comment 6

•

24 years ago

Attached image: fwiw, a screenshot when the dialog hangs

Simon Fraser [no longer active]

Assignee

Comment 7

•

24 years ago

This is a Mac NSPR bug.

To explain the protocol log stuff I pasted above (PR_Read returning PR_WOULD_BLOCK_ERROR): darin says that PR_Read can be called repeatedly, without intervening PR_Poll calls, until the necko buffer is full, until PR_Read returns 0 (to indicate EOF), or until it fails with PR_WOULD_BLOCK_ERROR. It happens that on Mac, we can only detect an EOF by virtue of receiving an orderly release request from the server, and this can happen some time after we've read all available data from the stream. So there is a time window in which a second read (after the first has read all available data) will return PR_WOULD_BLOCK_ERROR, because OTRcv gives us a kOTNoDataErr, but we have not yet received the orderly release request. This explains why protocol logs can look different between platforms, and why seeing PR_WOULD_BLOCK_ERROR in Mac logs is benign.

So the real problem in this bug is a race condition in Mac NSPR, I think. Some instrumentation shows that we stall when the OT notifier fires while we're inside of SendReceiveStream(). I think we're clobbering the value of me->io_pending, which needs to be protected by a lock.
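The read pattern described above can be illustrated with a small stand-alone simulation. This is only a sketch under stated assumptions: `SimStream`, `sim_read`, and `drain` are hypothetical names, not NSPR or necko code, and the "stream" is just a byte counter plus a flag for whether the orderly release has been seen.

```c
#include <stddef.h>

/* Hypothetical result codes standing in for the PR_Read outcomes
   discussed above (0 = EOF, -1 = would block). */
enum { SIM_EOF = 0, SIM_WOULD_BLOCK = -1 };

/* Hypothetical stream: `avail` bytes can be read right now; `released`
   is set once the peer's orderly release has arrived. */
typedef struct {
    size_t avail;
    int released;
} SimStream;

/* Stand-in for one non-blocking read of up to `count` bytes. */
static long sim_read(SimStream *s, size_t count)
{
    if (s->avail > 0) {
        size_t n = count < s->avail ? count : s->avail;
        s->avail -= n;
        return (long)n;
    }
    /* No data left: EOF is only visible after the orderly release. */
    return s->released ? SIM_EOF : SIM_WOULD_BLOCK;
}

/* Read repeatedly until the buffer of `want` bytes is full, EOF is
   reported, or the read would block. Returns the bytes read and stores
   the status of the final read in *last. */
static size_t drain(SimStream *s, size_t want, long *last)
{
    size_t total = 0;
    *last = 1;
    while (total < want) {
        long n = sim_read(s, want - total);
        *last = n;
        if (n <= 0)
            break;
        total += (size_t)n;
    }
    return total;
}
```

With 2920 bytes available and no release seen yet, `drain` with an 8192-byte buffer reads 2920 bytes and then stops on the would-block status, which matches the shape of the log (a read of 8192 returning 2920, then a read of 5272 returning -1). Once `released` is set, the next call reports EOF instead.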

Assignee: dougt → gordon

Simon Fraser [no longer active]

Assignee

Comment 8

•

24 years ago

Throwing some _PR_INTSOFF/PR_Lock and PR_Unlock/_PR_FAST_INTSON around the code did not help. Looking some more, what I believe is happening is this: we're in SendReceiveStream(), calling OTRcv(). While we are in OTRcv(), the notifier fires. OTRcv() then returns kOTNoDataErr, so we set fd->secret->md.readReady to FALSE, thus clobbering the value that the notifier just put in there. Messing with interrupts and locks therefore doesn't help.
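The sequence above can be reproduced deterministically in a few lines. This is a hedged simulation, not code from macsockotpt.c: `SimSocket`, `sim_notifier`, `sim_rcv_no_data`, and `read_step_unprotected` are hypothetical stand-ins, and the "notifier" is simply called from inside the fake receive call to model it firing before that call returns.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-socket state discussed above. */
typedef struct {
    bool read_ready;    /* plays the role of md.readReady */
    bool notifier_due;  /* a data-arrival event is waiting to fire */
} SimSocket;

/* Plays the role of the notifier: marks the socket readable. */
static void sim_notifier(SimSocket *s)
{
    s->read_ready = true;
}

/* Plays the role of a receive call that finds no data. If an event is
   pending, the notifier runs before the call returns. Returns -1 for
   "no data". */
static int sim_rcv_no_data(SimSocket *s)
{
    if (s->notifier_due) {
        s->notifier_due = false;
        sim_notifier(s);
    }
    return -1;
}

/* Unprotected read step: on "no data" the flag is cleared, which
   overwrites whatever the notifier stored during the call. */
static bool read_step_unprotected(SimSocket *s)
{
    if (sim_rcv_no_data(s) < 0)
        s->read_ready = false;
    return s->read_ready;
}
```

Starting from a socket with a pending event, `read_step_unprotected` returns false and the event has already been consumed, so nothing is left to mark the socket readable again. That is the stall in miniature.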

Simon Fraser [no longer active]

Assignee

Comment 9

•

24 years ago

Attached patch: Use OTEnter/LeaveNotifier in macsockotpt.c

Simon Fraser [no longer active]

Assignee

Comment 10

•

24 years ago

Using OTEnterNotifier/OTLeaveNotifier in SendReceiveStream() fixes this. These calls prevent the notifier from firing while we are in the read/write loop, and so prevent clobbering of fd->secret->md.readReady, fd->secret->md.writeReady, and me->io_pending. I have *not* tested the blocking version of the code. We may also need a similar fix in SendReceiveDgram() (what uses this?).
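The effect of bracketing the read step can be sketched with a small simulation. It is not the attached patch and does not call Open Transport: `sim_enter_notifier` and `sim_leave_notifier` are hypothetical stand-ins for OTEnterNotifier/OTLeaveNotifier, and the sketch assumes that an event held back while the guard is active is delivered when the guard is released.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-socket state discussed above. */
typedef struct {
    bool read_ready;    /* plays the role of md.readReady */
    bool notifier_due;  /* a data-arrival event is waiting to fire */
    bool guarded;       /* true between enter and leave */
    bool deferred;      /* an event was held back by the guard */
} SimSocket;

/* Plays the role of the notifier: marks the socket readable. */
static void sim_notifier(SimSocket *s)
{
    s->read_ready = true;
}

/* Deliver a pending event now, or hold it while the guard is active. */
static void sim_maybe_fire(SimSocket *s)
{
    if (!s->notifier_due)
        return;
    s->notifier_due = false;
    if (s->guarded)
        s->deferred = true;
    else
        sim_notifier(s);
}

static void sim_enter_notifier(SimSocket *s)
{
    s->guarded = true;
}

/* Release the guard and deliver any event that was held back. */
static void sim_leave_notifier(SimSocket *s)
{
    s->guarded = false;
    if (s->deferred) {
        s->deferred = false;
        sim_notifier(s);
    }
}

/* Receive call that finds no data; an event may arrive during it. */
static int sim_rcv_no_data(SimSocket *s)
{
    sim_maybe_fire(s);
    return -1;
}

/* Guarded read step: the flag is cleared first, and the held-back
   event is applied afterwards, so it is no longer overwritten. */
static bool read_step_guarded(SimSocket *s)
{
    sim_enter_notifier(s);
    if (sim_rcv_no_data(s) < 0)
        s->read_ready = false;
    sim_leave_notifier(s);
    return s->read_ready;
}
```

With an event pending, `read_step_guarded` ends with the socket marked readable, because the notifier's update lands after the flag is cleared rather than before. With no event pending, the flag stays cleared, as expected.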

Wan-Teh Chang

Comment 11

•

24 years ago

r=wtc. By the way, please restore the original while (...) { indentation style. I would not worry about the blocking version of the code as long as your patch is a strict improvement over the current code.

gordon

Comment 12

•

24 years ago

over to Simon. Thanks.

Assignee: gordon → sfraser

Simon Fraser [no longer active]

Assignee

Updated

•

24 years ago

Status: NEW → ASSIGNED

Target Milestone: --- → mozilla0.9.2

Simon Fraser [no longer active]

Assignee

Comment 13

•

24 years ago

0.9.2

Simon Fraser [no longer active]

Assignee

Comment 14

•

24 years ago

I tested https, FTP, IMAP, and the page load tests. Everything checks out. Since I can check in to NSPR with just an r=, I'm ready to go.

Priority: -- → P2

Simon Fraser [no longer active]

Assignee

Comment 15

•

24 years ago

*** Bug 84826 has been marked as a duplicate of this bug. ***

Wan-Teh Chang

Comment 16

•

24 years ago

Wrong. You don't need an sr=, but you need an a=. Remember to check in the same fix on the trunk of NSPR.

Simon Fraser [no longer active]

Assignee

Comment 17

•

24 years ago

Who has to give a=? An NSPR module owner, or drivers?

Wan-Teh Chang

Comment 18

•

24 years ago

drivers@mozilla.org. Treat the NSPRPUB_CLIENT_BRANCH as if it were the trunk of Mozilla client. As for the trunk of NSPR, there is no sr= or a= requirement.

Christopher Blizzard (:blizzard)

Comment 19

•

24 years ago

a=blizzard on behalf of drivers for the trunk

Asa Dotzler [:asa]

Updated

•

24 years ago

Blocks: 83989

Simon Fraser [no longer active]

Assignee

Comment 20

•

24 years ago

Checked into NSPRPUB_CLIENT_BRANCH, and the NSPR tip.

Status: ASSIGNED → RESOLVED

Closed: 24 years ago

Resolution: --- → FIXED

Steve Dagley

Comment 21

•

24 years ago

*** Bug 76899 has been marked as a duplicate of this bug. ***

Tom Everingham

Comment 22

•

24 years ago

verified: mac os9 7/24/01 branch and trunk

Status: RESOLVED → VERIFIED

benc

Updated

•

23 years ago

Component: Networking: FTP → NSPR

Product: Browser → NSPR

Target Milestone: mozilla0.9.2 → ---
