Closed
Bug 32407
Opened 25 years ago
Closed 25 years ago
large files corrupted on download
Categories
(Core Graveyard :: Tracking, defect, P1)
VERIFIED
FIXED
M15
People
(Reporter: omerk, Assigned: law)
References
Details
(Whiteboard: [PDT+] fix in hand)
Any ZIP file saved with Mozilla reports a bad CRC when tested with WinZip. Tried 2
skins at Chromezone and a few other zip downloads; the files were OK with Nav 4.7
and IE, corrupt with Mozilla.
Comment 2•25 years ago
Bill, maybe we're missing the initial or final OnData*() data for writing? I
tested this (Mozilla nightly build) with an HTTP download of a zip file and
the downloaded zip was corrupt.
Assignee: valeski → law
Keywords: beta1
I can't spot any pattern here. I've downloaded .exe files (via http) OK. I've
downloaded .zip files (via ftp) OK. I've downloaded .zip files via ftp: and
have had it fail.
The file lengths always look OK. I compared M14 downloads from mozilla.org, one
using mozilla (as of this morning), one using 4.x. The files differ at offset
0x10000 which looks like a good spot for somebody's code to break down.
I'll look further.
I applied the patch below to test whether the data is getting corrupted coming
in vs. going out. I dump the raw data to a file. When I compare this file to
the resulting .zip that the download produces, they do not match. When I
compare the test file to what communicator downloaded, they match.
So the data is getting corrupted on output. My guess is either a bug in the
file transport logic or in nsIFile. I'm cc:ing Doug Turner because he might
know more about the latter.
The code I resurrected last week to fix 31804 was code older than nsIFile. Is
that code not correct for use with nsIFile underlying nsIFileTransport? I'm
guessing here (I don't even know for sure that file transport uses nsIFile).
Anyway, here is the patch. You'll have to change the DEBUG_law string and the
hard-coded file name to make it work for you.
Index: nsStreamXferOp.cpp
===================================================================
RCS file: /cvsroot/mozilla/xpfe/components/xfer/src/nsStreamXferOp.cpp,v
retrieving revision 1.24
diff -r1.24 nsStreamXferOp.cpp
320a321,328
> #ifdef DEBUG_law
> // Write raw data to file for test.
> static FILE *test = 0;
> if ( !test ) {
> test = fopen( "c:\\WinNT\\Profiles\\law\\Desktop\\test", "wb" );
> }
> fwrite( buffer, 1, bytesRead, test );
> #endif
Status: UNCONFIRMED → NEW
Ever confirmed: true
Do we want to apply this patch for beta 1 at this point?
Priority: P3 → P1
Target Milestone: M15
Comment 8•25 years ago
Changing summary to state more explicitly that *all* large files (larger than
64K, apparently) are corrupted when downloaded.
Please re-evaluate PDT status. Doug and I (and Warren) are looking for a fix.
Summary: .zip files save with bad CRC → large files corrupted on download
Whiteboard: [PDT-]
Comment 10•25 years ago
*** Bug 32575 has been marked as a duplicate of this bug. ***
Assignee
Comment 11•25 years ago
I built a file with known content so we could look for a pattern in how the data
was being scrambled. You can get that file at
ftp://law.mcom.com/d/download/bytes.zip.
I generated the file with this program:
#include <stdio.h>

int main(void) {
    FILE *f = fopen("bytes.zip", "wb");
    unsigned int p = 0;
    // Fill 64K plus a few bytes.
    while (p < 0x11000) {
        // Write the data 4 times.
        fwrite(&p, 1, 4, f);
        fwrite(&p, 1, 4, f);
        fwrite(&p, 1, 4, f);
        fwrite(&p, 1, 4, f);
        // Advance the value.
        p += 0x10;
    }
    fclose(f);
    return 0;
}
First funny thing is that I can download this file OK via ftp. Via http: it
gets scrambled, though. So this means the bug is likely not a straightforward
buffering algorithm thing.
Looking at the scrambled output, things are broken at offset 0x10000 which
happens to be the buffer (segment) size. Instead of 00000100, I see 0030FF00.
If we shift this left 1 byte (so it aligns), then we're off by 0xD0 bytes. This
is confirmed by examining the data at offset 0x100D0, which is 00000001 (if we
shift that left 1 byte, we get the 00000100 we expected at offset 0x10000).
I'm going to try the same thing with the beta1 branch version of Mozilla and
will report the results in a moment.
Assignee
Comment 12•25 years ago
I get the same output downloading (via http:, ftp: works OK) using a Mozilla
trunk build as I do using a beta1 branch build.
You can access this test file via http: at http://law.mcom.com/bytes.zip.
Comment 13•25 years ago
I spoke with Phil, and he indicated that the PDT did not understand this bug
when they marked it as PDT- (I was out of the room). I'm upping it to PDT+, and
would REALLY like both info on the problem, as well as the potential for a solution.
We need a lot of focus on this RSN, as this seems like a VERY dangerous bug :-(
Thanks,
Jim
Whiteboard: [PDT+]
Assignee
Comment 14•25 years ago
I added this code to nsBufferedStreams.cpp to catch when/how my
"bytes.zip" file was getting corrupted. When I hit this breakpoint, it
was called by way of
nsStreamXferOp::OnStopRequest -> nsOutputFileStream::Close -> nsBufferedStreams::Flush
(I don't know exactly what was in the middle, though).
Basically, the http channel got to the end, called OnStopRequest. That
triggered the output stream close, which of course flushed the file.
The problem is that (apparently; I'm verifying) the OnStopRequest was
occurring on the wrong thread (*not* the UI thread). There are still
OnDataAvailable notifications queued up on the UI thread. The
close/flush that occurs on the second thread resets things like mCursor,
which swallows data that a pending OnDataAvailable has just written (or
is just about to write).
Anyway, that's the theory. I'm going to investigate this issue of what
thread OnStopRequest is called on more closely. If it isn't the UI
thread (regardless of whether it should be), I'm going to try proxying
it over to the UI thread inside my OnStopRequest code. That way, we
have a fix that doesn't affect any code aside from the file download
code.
Of course, any of you should feel free to jump in.
Bill
x2296
Index: nsBufferedStreams.cpp
===================================================================
RCS file: /cvsroot/mozilla/netwerk/base/src/nsBufferedStreams.cpp,v
retrieving revision 1.5
diff -r1.5 nsBufferedStreams.cpp
300a301,306
> #ifdef DEBUG_law
> // Ensure data at file offset matches the offset.
> // This means you can't download anything but bytes.zip!
> PRUint32 *p = (PRUint32*)mBuffer;
> NS_ASSERTION( p && *p == mBufferStartOffset, "Buffer corrupted!" );
> #endif
313c319
< nsCRT::memcpy(mBuffer, mBuffer + amt, rem);
---
> nsCRT::memmove(mBuffer, mBuffer + amt, rem);
314a321,324
> #ifdef DEBUG_law
> // force notification if we hit this code!
> NS_ENSURE_TRUE( rem == 0, NS_ERROR_FAILURE );
> #endif
Comment 15•25 years ago
Adding myself to cc list.
(sorry about spam)
Comment 16•25 years ago
Added some folks to the CC list, as this seems to be the most perilous bug
facing beta 1 at this point.
In fact... it is currently the *only* PDT+ bug that requires a checkin!!!
Assignee
Comment 17•25 years ago
OK, it turns out I was mistaken about the OnStopRequest; that seems to be a red
herring.
If I increase my test file to 2*64K bytes, then the corrupted buffer appears on
an OnDataAvailable (*it* forces the flush at that point).
I'm going to add some code in nsBufferedStreams::Write() to try to catch where
things are going wrong.
Comment 18•25 years ago
Also, you can reduce NS_OUTPUT_STREAM_BUFFER_SIZE to make this assert fire more
regularly.
Comment 19•25 years ago
It does not look like |buf| is being advanced after a flush in
nsBufferedOutputStream::Write()... Could this be it? Still looking...
Comment 20•25 years ago
I think I found the fix. Please apply and verify:
Index: nsBufferedStreams.cpp
===================================================================
RCS file: /cvsroot/mozilla/netwerk/base/src/nsBufferedStreams.cpp,v
retrieving revision 1.5
diff -r1.5 nsBufferedStreams.cpp
282c282
< nsCRT::memcpy(mBuffer + mCursor, buf, amt);
---
> nsCRT::memcpy(mBuffer + mCursor, buf + written, amt);
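To see why `buf + written` is the fix: once a flush happens inside the write loop, the next copy must resume from the point in the caller's data that has already been consumed. A toy sketch of such a loop (a minimal illustration, not Mozilla's actual nsBufferedOutputStream; class and member names are hypothetical):

```cpp
#include <algorithm>
#include <cstring>
#include <string>

// Toy buffered writer. The real bug was the moral equivalent of
// copying from `buf` instead of `buf + written` below: restarting at
// `buf` replays the first bytes of the caller's data after every
// flush, corrupting everything past the first buffer boundary
// (0x10000 in the reports above).
class BufferedWriter {
public:
    explicit BufferedWriter(size_t cap)
        : mCap(cap), mCursor(0), mBuffer(new char[cap]) {}
    ~BufferedWriter() { delete[] mBuffer; }

    void Write(const char* buf, size_t count) {
        size_t written = 0;
        while (written < count) {
            size_t amt = std::min(mCap - mCursor, count - written);
            // Resume from the already-consumed position in the input.
            std::memcpy(mBuffer + mCursor, buf + written, amt);
            mCursor += amt;
            written += amt;
            if (mCursor == mCap)
                Flush();
        }
    }

    void Flush() { mOut.append(mBuffer, mCursor); mCursor = 0; }
    const std::string& output() const { return mOut; }

private:
    size_t mCap;
    size_t mCursor;
    char*  mBuffer;
    std::string mOut;   // stands in for the underlying file stream
};
```

With a 64K buffer, a write that crosses the buffer boundary triggers a mid-loop flush, which is exactly when the wrong source pointer shows itself, and why only files larger than 64K were affected.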
Comment 21•25 years ago
You are going to have to remove law's debug patch 2000-03-20 21:01.
Assignee
Comment 22•25 years ago
I think that's it, Doug. I independently arrived at the same fix. I can't
believe I looked at this for so long and didn't see it.
The nasty thing was that the error didn't occur when the bogus data was being
written, it occurred when the previous segment was flushed. When my assertion
was hit, the problem was way in the past.
I'm going to check in that fix on the trunk. We'll let the PDT decide how to
deal with the branch tomorrow.
Comment 23•25 years ago
Updated status whiteboard.
Whiteboard: [PDT+] → [PDT+] Investigation in progress ...
Assignee
Comment 24•25 years ago
The investigation is done, I think. Doug and I found a fix that tests out. I
ran checkin tests with no problem and checked the code into the trunk last
night. I'm just not sure what's the deal with beta1.
Comment 25•25 years ago
updating whiteboard.
Whiteboard: [PDT+] Investigation in progress ... → [PDT+] fix in hand
Comment 26•25 years ago
Changing Platform/OS to 'All' --- this occurs with files downloaded on Mac OS &
Linux beta branch builds, as well.
OS: Windows NT → All
Hardware: PC → All
Assignee
Comment 27•25 years ago
Fix checked in on beta1 branch. This better work now, or else.
Status: NEW → RESOLVED
Closed: 25 years ago
Resolution: --- → FIXED
Comment 29•25 years ago
verified fixed w/ today's beta1 branch bits on linux, winNT and mac (opt comm,
2000.03.22.06-nb1b).
Status: RESOLVED → VERIFIED
Updated•9 years ago
Product: Core → Core Graveyard