Closed
Bug 281222
Opened 20 years ago
Closed 17 years ago
The file copy under unix can fail and produce zero sized file in some circumstances
Categories
(Core :: XPCOM, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: wbonnet, Unassigned)
Details
(Whiteboard: thunderbird only)
Attachments
(4 obsolete files)
User-Agent: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.5) Gecko/20041108 Firefox/1.0
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; fr-FR; rv:1.7.5) Gecko/20041108 Firefox/1.0
Function CrudeFileCopy in nsFileSpecUnix.cpp show some bugs. In some cases it
can fail and generate a zero sized file.
Analysis will be provided with patch proposal coming with this bug report
Reproducible: Always
Actual Results:
Output file is zero sized
Expected Results:
File should have some data
Reporter | ||
Comment 1•20 years ago
|
||
The CrudeFileCopy was buggy in some case the copied data can be lost and a zero
sized file is produced
Here is the actual code, comments are added inline
430 stat_result = stat (in, &in_stat);
returned value should be tested, no need to continue and try to read the file
if we cannot stat it
432 ifp = fopen (in, "r");
433 if (!ifp)
434 {
435 return -1;
436 }
437
438 ofp = fopen (out, "w");
439 if (!ofp)
440 {
441 fclose (ifp);
442 return -1;
443 }
444
The following loop is buggy
445 while ((rbytes = fread (buf, 1, sizeof(buf), ifp)) > 0)
fread returns zero for FEOF and errors. In this code feof is not tested, thus
all errors are treated as eof and loop is skipped without error handling. Next
instruction after loop is fclose, thus a zero sized file is created in case of
error
read can fail on NFS mount, and it will fail from time to time...
446 {
447 while (rbytes > 0)
this while loop is useless since fwrite will buffer the data and write all the
1024 bytes without loop.
448 {
449 if ( (wbytes = fwrite (buf, 1, rbytes, ofp)) < 0 )
buf has to be replaced by an index pointer if we loop on remaining rbytes ! or
the first data are always written, index is not moving
450 {
451 fclose (ofp);
452 fclose (ifp);
453 unlink(out);
454 return -1;
455 }
456 rbytes -= wbytes;
buf pointer should be incremented
457 }
458 }
459 fclose (ofp);
fclose can fail, data are actually written after a fsync call was successful
this code works in 99.9% of the time. But if the path is a NFS mount then it
will fail from time to time and buffer won't be flushed
460 fclose (ifp);
461
462 if (stat_result == 0)
463 chmod (out, in_stat.st_mode & 0777);
This algorithm was buggy and it can explain why from time to time we have zero
sized filespec (.msf) in mailbox stored on /home mount by NFS. But maybe it is
not the only reason why...
Reporter | ||
Updated•20 years ago
|
Attachment #173498 -
Flags: review?(mscott)
Summary: The file copy under unix can fail and produce zero sized file in some circomstances → The file copy under unix can fail and produce zero sized file in some circumstances
Comment 2•20 years ago
|
||
Comment on attachment 173498 [details] [diff] [review]
This patch fix some bug in CrudeCopyFile on unix platform
changing to Darin for his linux expertise :-)
Attachment #173498 -
Flags: review?(mscott) → review?(darin)
Comment 3•20 years ago
|
||
Confirming, we should consider for tb1.0.1.
/be
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: blocking-aviary1.0.1+
Comment 4•20 years ago
|
||
darin, brendan, who should this bug be assign to. i will have cycles next week.
Comment 5•20 years ago
|
||
This is needed for tbird 1.0.1 as I understand it, and I'm not sure what the
schedule looks like for that. If you have cycles to spend on this next week,
then that sounds great to me :)
Comment 6•20 years ago
|
||
william, did you have any test cases that show the failure(s) or did you
discover these problems through code inspection?
Comment 7•20 years ago
|
||
why don't we just reuse the nsLocalFile code? for example, just create two
nsLocalFile's and copy one to the other. Something like (ignoring error checking):
nsCOMPtr <nsILocalFile> inFile;
NS_NewLocalFile(in, PR_TRUE, getter_AddRefs(inFile));
nsCOMPtr <nsILocalFile> outFile;
NS_NewLocalFile(out, PR_TRUE, getter_AddRefs(outFile));
nsCOMPtr <nsIFile> source = do_QueryInterface(inFile);
source.CopyTo(outFile, EmptyString());
Comment 8•20 years ago
|
||
I agree with dougt. Let's leverage nsIFile/nsILocalFile as much as possible
here. nsFileSpec is an obsolete deprecated API. Its implementation doesn't
have to be performant or anything like that.
Reporter | ||
Comment 9•20 years ago
|
||
I discovered it trough code inspection after i fixed the bug on drag and drop
loosing folders.
The problem was that this function was failing on zero sized files, so i had a
close look to it.
I think you can reproduce it easily if you make a simple test program which use
this function to copy files on a nfs mount. Maybe you will have to had some
extra load on your network depending on bandwith and so on. File access through
NFS has to be "difficult" some time.
I do agree that if there is already a function to copy files under unix, it's
better to use it since this code is in obsolete part. But i am a newbie to
mozilla code and i don't know yet all the function available :) that's why i
choosed to fix it ;)
Reporter | ||
Comment 10•20 years ago
|
||
I just had a quick look to the nsLocalFile i am almost sure it has some bugs
also... so be careful if you just replace old code by a call to the new function
CopyTo.
The code will not work properly on NFS drive for the some of the reasons why
CrudeFileCopy was failing, syncing data on disk.
In the copy loop :
while ((bytesRead = PR_Read(oldFD, buf, BUFSIZ)) > 0) {
there is no test for errors and eof, thus if an error occur at first read, it
will be handled as an eof, and an empty file will be created.
After the files are closed, an error code is returned (NS_ERROR_OUT_OF_MEMORY
which not the real error reason). In this implementation, the caller has to
handle the error code and destroy the new file. I think it would be better if
the method would "unlink" the new file since the copy failed. Otherwise some
file may stay on the file system without ever been deleted.
PRInt32 bytesWritten = PR_Write(newFD, buf, bytesRead);
if (bytesWritten < 0) {
bytesRead = -1;
break;
}
Here we test if bytesWritten is < 0 to detect errors. That's ok according to the
man, since PR_Write is mapped to a write call, it will return -1 if it fails.
But in case of success it returns the number of written bytes. If for some
reasons it does not write all data we asked to be written, we have to handle it
as an error. Thus the test should be :
PRInt32 bytesWritten = PR_Write(newFD, buf, bytesRead);
if (bytesWritten < bytesRead) {
The following lines :
// close the files
PR_Close(newFD);
PR_Close(oldFD);
has the same problem as in CrudeFileCopy. On an NFS mount sometimes close can
fail, return code should be tested. We can be really sure that the data are
stored on disk only after calling successfully PR_Sync before PR_Close. Thode
should be something like :
int l_nRet;
// close the files
PR_Close(oldFD);
l_nRet = PR_Sync(newFD);
if (l_nRet != 0)
{
return NS_ERROR_OUT_OF_MEMORY; // or some more accurate error code
}
PR_Close(newFD);
Updated•20 years ago
|
Whiteboard: thunderbird only
Comment 11•20 years ago
|
||
If you want to write a clean UNIX copy routine on top of NSPR and call it from
both places that would be good. It is important that we make
nsLocalFileUnix.cpp work well since it is the preferred class now. We could put
the implementation in there and call it from nsFileSpec.
Comment 12•20 years ago
|
||
Comment on attachment 173498 [details] [diff] [review]
This patch fix some bug in CrudeCopyFile on unix platform
removing review request in anticipation of new and improved patch :)
btw, please use NSPR methods instead of stdio methods, thanks! (we don't need
the buffering provided by fread & fwrite.)
Attachment #173498 -
Flags: review?(darin)
Reporter | ||
Comment 13•20 years ago
|
||
Ok i'll submit a new patch based on nspr this week end
What is the expected release date of thunderbird 1.0.1 ?
Reporter | ||
Comment 14•20 years ago
|
||
This patch is a rewrite of the previous on with the removal of the f* function
from stdio. Patch is now using PR_* functions.
Attachment #173498 -
Attachment is obsolete: true
Attachment #176394 -
Flags: review?(mscott)
Comment 15•20 years ago
|
||
Comment on attachment 176394 [details] [diff] [review]
This patch fix some bug in CrudeCopyFile on unix platform using NSPR
>+ int stat_result = -1, l_nRet;
>+ PRFileDesc * l_pFileDescIn, * l_pFileDescOut; // Src and dest filespec pointers
the "l_" prefix is non-standard. please try to stick with
existing code style in the modified file. that means no
"hungarian" notation.
>+ while(l_rbytes > 0) // While there is data to copy
>+ {
>+ // Data buffer size is only 1K ! fwrite function is able to write it in
>+ // one call. According to the man fwrite returns a number of elements
>+ // written if an error occured. Thus this is no need to handle this case
>+ // as a normal case. Written data cannot be smaller than rbytes
>+ // than input buffer.
>+ l_wbytes = PR_Write(l_pFileDescOut, l_cBuf, l_rbytes); // Copy data to output file
>+ if (l_wbytes != l_rbytes) // Test if all data were written
>+ { // No the is an error
>+ PR_Close(l_pFileDescIn); // Close input file
>+ PR_Close(l_pFileDescOut); // Close output file
>+ unlink(out); // Destroy output file
>+ return -1; // Return an error
>+ }
>+
>+ // Write is done, we try to get more data
>+ l_rbytes = PR_Read(l_pFileDescIn, l_cBuf, sizeof(l_cBuf));
>+ if (l_rbytes < 0) // Test if read was unsuccessfull
>+ { // Error case
>+ PR_Close(l_pFileDescOut); // Close input file
>+ PR_Close(l_pFileDescIn); // Close input file
>+ unlink(out); // Destroy output file
>+ return -1; // Return an error
given all the different places where you need to return and
cleanup resources, i think it would make sense to move the
bulk of this code (the loop) into a subroutine. then you
can write close like this:
// open files
do_copy();
// close files
// unlink out if necessary
that way you end up with much smaller and simpler code.
Reporter | ||
Comment 16•20 years ago
|
||
This patch includes Darin's comments :
. removal of l_ hungarian notation,
. add a DoCopy method to get a cleaner code)
Attachment #176394 -
Attachment is obsolete: true
Attachment #176454 -
Flags: review?(mscott)
Comment 17•20 years ago
|
||
is there an enduser test we could run to verify this as fixed (once it is
fixed)? or is this more of a backend fix?
Comment 18•20 years ago
|
||
moving to 1.1 for investigation.
Flags: blocking-aviary1.0.1+ → blocking-aviary1.1+
Attachment #176394 -
Flags: review?(mscott)
Comment 19•20 years ago
|
||
Comment on attachment 176454 [details] [diff] [review]
Fix the in CrudeCopyFile on unix platform using NSPR including Darin's comments
>+static int CrudeFileCopy_DoCopy(PRFileDesc * pFileDescIn, PRFileDesc * pFileDescOut, char * pBuf, long bufferSize)
>+ // If EOF is reached and no data available, PR_Read returns 0. A negative return
no data _is_ ...
>+ // value means error occured
means _an_ ...
>+ rbytes = PR_Read(pFileDescIn, pBuf, bufferSize);
>+ if (rbytes < 0) // Test if read was unsuccessfull
>+ { // Error case
>+ return -1; // Return an error
>+ }
drop the else (after return) [and unindent the codeblock]:
>+ else // Success case
also, don't bother commenting like this ^
we try to stick to 80 cols, if you're going to stick comments, i'd rather they
be above the lines of code instead of to their right.
>+ {
^ please don't include trailing whitespace
>+ while(rbytes > 0) // While there is data to copy
^ please make sure you're following house style for spaces before
()s, personally i try to be consistent between if/for/while
>+ {
>+ // Data buffer size is only 1K ! fwrite function is able to write it in
>+ // one call. According to the man fwrite returns a number of elements
the man _page,_ ...
>+ // written if an error occured. Thus this is no need to handle this case
this => there ?
>+ // as a normal case. Written data cannot be smaller than rbytes
>+ // than input buffer.
than => i.e., the size of the (?)
>+ wbytes = PR_Write(pFileDescOut, pBuf, rbytes); // Copy data to output file
>+ if (wbytes != rbytes) // Test if all data were written
were => was
>+ { // No the is an error
the is => this (?)
>+ return -1; // Return an error
>+ }
>+
>+ // Write is done, we try to get more data
>+ rbytes = PR_Read(pFileDescIn, pBuf, bufferSize);
>+ if (rbytes < 0) // Test if read was unsuccessfull
unsuccessful
>+ { // Error case
>+ return -1; // Return an error
>+ }
>+ }
>+ }
>- while ((rbytes = fread (buf, 1, sizeof(buf), ifp)) > 0)
>- while (rbytes > 0)
style clearly likes a space before condition parens. [let's ignore the spaces
after fread ...]
>+static int CrudeFileCopy(const char* in, const char* out)
>+//----------------------------------------------------------------------------------------
>+{
>+ char buf[1024]; // Used as buffer for the copy loop
>+ PRInt32 rbytes, wbytes; // Number of bytes read and written in copy loop
>+ struct stat in_stat; // Stores informations from the stat syscall
information
>+ int stat_result = -1, ret;
>+ PRFileDesc * pFileDescIn, * pFileDescOut; // Src and dest filespec pointers
'filespec' smells like nsFileSpec, can you avoid reminding us of that? :)
>+ // Check if the pointers to filenames are valid
NS_ASSERTION(in && out, "CrudeFileCopy should be called with pointers to
filenames...");
>+ if (!in || !out)
>+ return -1;
how about:
NS_ENSURE_TRUE(in && out, -1);
>+ // Check if the pointers are the sames, if so, no need to copy A to A
same (please fix all occurrences)
>+ if (in == out)
>+ return 0;
>+
>+ // Check if content of the pointers are the sames, if so, no need to copy A to A
>+ if (strcmp(in,out) == 0)
people tend to complain about strcmp :)
>+ return 0;
>+
>+ // Retrieve the 'in' file attributes
>+ stat_result = stat (in, &in_stat);
i believe this is the first time i've seen you use a space after a function
name
>+ if(stat_result != 0)
>+ {
>+ // test if stat failed, it can happens if file doesnt exist, is not
either "can happen" or "happens"
"the file"
"does not" (or doesn't if you use isn't, which sounds bad...)
>+ // readable or not available ( can happen with NFS mounts )
"is not"
>+ return(-1); // Return an error
and the first time you parenthesized a return value (please don't)
>+ }
>+
>+ // Open the input file for binary read
>+ pFileDescIn = PR_Open(in, PR_RDONLY, 0444);
>+ if (pFileDescIn == 0)
>+ {
>+ return -1; // Open failed, return an error
>+ }
>+
>+ // Open the output file for binary write
>+ pFileDescOut = PR_Open(out, PR_WRONLY | PR_CREATE_FILE, 0666);
please put a comment explaining why you're using 0666. or better, don't :).
i just wrote a patch to remove some 0666's, i don't feel like writing another.
most likely you should be getting the value from the original file. since
you're setting it later, how about using 0600 now, this should prevent someone
from poisoning it.
>+ // Copy the data
>+ if (CrudeFileCopy_DoCopy(pFileDescIn, pFileDescOut, buf, sizeof(buf)) != 0)
>+ { // Error case
>+ PR_Close(pFileDescOut); // Close input file
>+ PR_Close(pFileDescIn); // Close input file
>+ unlink(out); // Destroy output file
>+ return -1; // Return an error
>+ }
>+
>+ // There is no need for error handling here. This should have worked, but even
>+ // if it fails, the data are now copied to destination, thus there is no need
>+ // for the function to fail on the input file error
>+ PR_Close(pFileDescIn);
>+
>+ // It is better to call fsync and test return code before exiting to be sure
>+ // data are actually written on disk. Data loss can happen with NFS mounts
>+ if (PR_Sync(pFileDescOut) != PR_SUCCESS)
>+ { // Test if the fsync function succeeded
>+ PR_Close(pFileDescOut); // Close output file
>+ unlink(out); // Try to destroy the file
not PR_Delete?
>+ return(-1); // Return an error
>+ }
>+
>+ // Copy is done, close both file
>+ if (PR_Close(pFileDescOut) != PR_SUCCESS) // Close output file
>+ { // Output file was not closed :(
>+ unlink(out); // Try to destroy the file
>+ return(-1); // Return an error
>+ }
>+
>+ ret = chmod(out, in_stat.st_mode & 0777); // Set the new file file mode
>+ if (ret != 0) // Test if the chmod function succeeded
>+ {
>+ unlink(out); // Try to destroy the file
>+ return(-1); // Return an error
>+ }
>
>- if (stat_result == 0)
>- chmod (out, in_stat.st_mode & 0777);
>+ return 0; // Still here ? ok so it worked :)
>+} // nsFileSpec::CrudeFileCopy
>
>@@ -497,24 +575,24 @@ nsresult nsFileSpec::MoveToDir(const nsF
>- ((nsFileSpec*)this)->Delete(PR_FALSE);
>+ ((nsFileSpec*)this)->Delete(PR_FALSE);
thank you for killing these tabs
Attachment #176454 -
Flags: review?(mscott) → review-
Reporter | ||
Comment 20•19 years ago
|
||
This patche fixes the issues identified by Timeless (including my English
mistakes ;) )
Attachment #176454 -
Attachment is obsolete: true
Attachment #183076 -
Flags: review?(mscott)
Updated•19 years ago
|
Flags: blocking-aviary1.5+
Reporter | ||
Comment 21•19 years ago
|
||
Hi
There was no activity on this bug for several months from now.
Is it possible to know if it is planned to change its status and accept the patch ?
Thx in advance
Comment 22•19 years ago
|
||
Whenever it gets review.
Scott, you seem to be the one the review was requested of, presumably because
mailnews is the only consumer of this code...
Reporter | ||
Comment 23•19 years ago
|
||
Please Scott do you when it will be possible to have a review for this bug ?
Reporter | ||
Comment 24•19 years ago
|
||
Hi
it seems i had no answer yet :(
would it be possible to have a review please ?
Kind regards,
William
Updated•18 years ago
|
Assignee: dougt → nobody
QA Contact: xpcom
Reporter | ||
Comment 25•18 years ago
|
||
Hi
This bug is waiting for a review since May 2005. That's more than a year now that patch has been submitted, and faulty code documented and bug explanation provided.
It's a long time for a critical bug.
Would it be possible please to have a review of the patch ?
Regards,
William
Comment 26•18 years ago
|
||
Comment on attachment 183076 [details] [diff] [review]
Fix the previous patch, including Timeless comments
Maybe timeless or darin can take another look at this...
Attachment #183076 -
Attachment description: Fix the previous patche, including Timeless comments → Fix the previous patch, including Timeless comments
Attachment #183076 -
Flags: superreview?(darin)
Attachment #183076 -
Flags: review?(timeless)
Comment 27•18 years ago
|
||
Comment on attachment 183076 [details] [diff] [review]
Fix the previous patch, including Timeless comments
// Check if content of the pointers are the sames, if so, no need to copy A to A
sames => same
> return(-1);
please be consistent w/ parens or lack for a statement, in this function that's return w/o ()s. (which is also the general mozilla style)
Attachment #183076 -
Flags: review?(timeless)
Attachment #183076 -
Flags: review?(dougt)
Attachment #183076 -
Flags: review+
Updated•18 years ago
|
Attachment #183076 -
Flags: superreview?(darin) → superreview+
Comment 28•18 years ago
|
||
Comment on attachment 183076 [details] [diff] [review]
Fix the previous patch, including Timeless comments
mozilla/xpcom/obsolete/nsFileSpecUnix.cpp 1.10
Attachment #183076 -
Attachment is obsolete: true
Attachment #183076 -
Flags: review?(mscott)
Attachment #183076 -
Flags: review?(dougt)
You need to log in
before you can comment on or make changes to this bug.
Description
•