Closed Bug 1134529 Opened 10 years ago Closed 8 years ago

C-C Thunderbird - Cleaning of incorrect Close, unchecked Flush, Write etc. in nsPop3Sink.cpp and friends. (Step-3)

Categories

(MailNews Core :: Networking: POP, defect)

defect
Not set
normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1242030

People

(Reporter: ishikawa, Assigned: ishikawa)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

Attachments

(1 file, 3 obsolete files)

+++ This bug was initially created as a clone of Bug #1134527 +++ +++ This bug was initially created as a clone of Bug #1122698 +++ This is a branch of meta Bug 1121842 which depends on this entry. This is Step-3. --- begin quote from Bug 1121842 After Close() and Flush() are taken care of, we check the error return of Write() and the mismatch of the # of written bytes and requested bytes. Step-3. Add error value checking of Write(); Check the return value of Write and if the requested # of bytes matches the # of really written bytes. First step. Simply add NS_ERROR(). Better than nothing. At least, we will see the error printed during testing of DEBUG version of TB. (For better error handling, we will wait for Step 4.) The patch for this will be posted as a separate bugzilla entry. --- end quote This is the work-in-progress patch for step-3. I did several tests and fixed incorrect handling and added a few band-aid error processing code snipets and comments. But this will be needed to work on more. (Maybe step-1 in Bug 1122698 can be checked in, and also step-2 in Bug 1134527 after clean up.) However, the error processing here needs more testing since I encountered a crash of TB due to the memory allocation area corruption while doing error recovery. It seems already freed area is freed again or something like this. We now handle error conditions better (more strictly: before the errors were ignored and silently corrupted data in some cases), but error handling path seems to release already released data: There are problems of similar pattern in the code. Case in point: Close against already closed stream. In step-1 (bug 1122698), I fixed this issue. The original code superficially tried to avoid |Close()| against already closed stream by setting the stream variable to nullptr hoping that the next caller of Close() will not call it when the stream is nullptr. However, in the original call hierarchy, the higher-level code passed the global variable that contains the stream value to a leaf function, which copies it to a local copy and the leaf function closed the stream and set the local copy of the stream to nullptr. HOWEVER, it DOES NOT MODIFY the global stream variable. The result is that the higher-level code does not realize the stream is closed when it tries to call |Close()| since the global stream variable is not nullptr. I suspect that the failure to check the return value of |Close()| was that there were so many bogus |Close()| calls to already closed streams and checking the return code would have revealed the series of bogus |Close()| on already closed streams, and simply the checking was dropped to the detriment of data integrity :-( Now, I think I see a pattern here. It seems that some error handling path seems to release listener/observer during error recovery, but I think there are places where the data already released during error recovery is again released higher-level code which does not realize that the data is already is released. I need to investigate more here. So the attached patch is definitely a work-in-progress: at least, it does not introduce new bugs (although it uncovered a few issues in the original error handling), and print messages where some error conditions that were ignored before are encountered. TIA PS: The following is a comment from Bug 1134527 (Step-2) This is step-2: --- begin quote from bug 1121842 After the removal of unnecessary calls of |Close|, we go to Step-2. Step-2. Add error value checking of Close() and Flush(); First step. Simply add NS_ERROR(). Better than nothing. At least, we will see the error printed during testing of DEBUG version of TB. (For better error handling, we will wait for Step 4.) The patch for this will be posted as a separate bugzilla entry. After Close() and Flush() are taken care of, we check the error return of Write() and the mismatch of the # of written bytes and requested bytes. --- end quote This is the work-in-progress patch for step-2. Below is the initial comment from bug 1122698. This is the step-1 (from bug 1121842) <=== typo fixed when cloned. --- quote --- Step 1. Remove Extra ->Close() calls (and Flush() before Close().). Let us remove extra / unnecessary / bogus |Close| calls. The removal is based on the two proposals below. PROPOSAL-1: DiscardNewMessage should not close the file stream passed as the 1st argument. It is caller's responsibility to close it.. PROPOSAL-2: FinishNewMessage should not close the file stream passed as the first argument. It is caller's responsibility to close it. The reasoning is given in a crude write-up I created from checking the code at the end. (This will be posted as the next comment.) I checked the usage of two functions above, and with a few patches, the proposals ought to work. This patch will be filed as a different bugzilla on which this meta bugzilla entry depends. --- end quote Actually, not many |Close| are removed statically, but their calls during execution will be decreased by - setting stream value to nullptr after |Close|, and - the checking of stream value before calling |Close|. (Well, that is the theory. We may find out more bogus calls once we check the return value of Close(). But this is a start.) Local |make mozmill| test passed, and |make xpcshell-tests| did not introduce any new bugs with the attached patch. I will still mark this as Work-in-Progress in the series so that people in the know can figure out where any modifications should be made, etc. by looking at the whole picture. (Hmm. I wonder why there is component for Mail Transfer. This is a mail client.)
Here is an updated patch to match the current tree (as of March 16).
Attachment #8566392 - Attachment is obsolete: true
> > Here is an updated patch to match the current tree (as of March 16). Actually, this applies to the current tree.
This is an update for the current C-C tree and a patch for checking the file pointer position in Bug 1116055 - Performance issue: Failure to use buffered write (comm-central thunderbird)
Assignee: nobody → ishikawa
Attachment #8589759 - Attachment is obsolete: true
Blocks: 1174500
This is an updated patch to match the current tree. This is going to be in a series of patches. ======================================== PLANNED ORDER ======================================== * 1 A check_fileptr_change.patch: Bug 1116055: Check the unexpected change of file seek position and ask the user to report it bug 1116055 This was considered a necessary step so that we can know that if any user experiences a strange seek position change. * 2 A fix-close-step-1.patch: Bug 1122698 Cleaning of incorrect Close, unchecked Flush, Write etc. in nsPop3Sink.cpp and friends (part 1 of a series) bug 1122698 * 3 A fix-close-part-2.patch: Bug 1134527 Add error value checking of Close() and Flush() (part-2 of a series) bug 1134527 * 4 A fix-close-part-3-write-error-check.patch: Bug 1134529: Check Write failures (part-3 of a series) bug 1134529 (This entry) 5 A removing-a-Seek.patch: C-C Thunderbird - Cleaning of incorrect Close, unchecked Flush, Write etc. in nsPop3Sink.cpp and friends. (step 4) bug 1117450 6 A 1116055-acelist.patch: One liner patch posted in Bug 1116055 to return a buffered stream from mbox::getNewOutputStream by aceman Bug 1176857 - C-C TB Enabling buffering for pop3 download (and possibly part of imap file operation) I have created a separate bug for this since it will be necessary to apply this patch only after other patches landed. (This is because enabling the buffering requires proper error checking of |Close()| and |Flush()|. These are only taken care of AFTER 1-5 patches above are applied.) Maybe about a month after the first patch (bug 1116055) above is applied, and other patches 2-5 have been applied, I can enable this patch. (Assuming there is NO ONE who reports strange seek position change on their computer.) 7 A add-short-read-handling.patch: Bug 1170606 - C-C T-B fixes to take care of short read Bug 1170606 This is theoretically independent of 1-6, but my local patch assumes the changes in 1-6 to be in place. (The patch may not apply cleanly without 1-6 in advance.) There are some additional shor-read issues UNDER M-C portion of the source tree. They can be applied indepdent of this series. Please feel free to take up the review or move it to somebody else based on the workload of the potential reviewers. TIA
Attachment #8600193 - Attachment is obsolete: true
Attachment #8625465 - Flags: review?(Pidgeot18)
Sorry there was a typo of a bugzilla #. >5 A removing-a-Seek.patch: C-C Thunderbird - Cleaning of incorrect Close, unchecked > Flush, Write etc. in nsPop3Sink.cpp and friends. (step 4) > > bug 1117450 This should have read "bug 117450".
(In reply to ISHIKAWA, Chiaki from comment #5) > Sorry there was a typo of a bugzilla #. > >5 A removing-a-Seek.patch: C-C Thunderbird - Cleaning of incorrect Close, unchecked > > Flush, Write etc. in nsPop3Sink.cpp and friends. (step 4) > > > > bug 1117450 > > This should have read "bug 117450". Ouch. I hate an uncooperating keyboard and a mouse. It *is* bug 1174500.
Blocks: 1176857
Depends on: 1116055
As a serious user of TB, I appreciate that TB has been released to the public for free. At the same time, I noticed serious I/O issues, and thus am posting series of patches. I hope TB will be more robust, and does not eat my message even in the face of I/O errors :-) [Well, it did before :-( ] Summary of the patch in this bugzilla. Main purpose of this patch is to add the checks to error return of Write() and the mismatch of the # of written bytes and requested bytes. First step. Simply add NS_ERROR(). Better than nothing. At least, we will see the error printed during testing of DEBUG version of TB. (For better error handling, we will wait for Step 4, but see my comment regarding placing the error message into error console using macros proposed in bug 1116055.) - a few random debug message printing in nsMsgDBFolder.cpp to print error value encountered in simulated network outage. - XXX comment on line 437 of nsImapOfflineSync.cpp and line 3752 of nsImapProtocol.cpp suggests imap testing is not quite complete yet. (This has to wait. I want to clear the current local patch queue first.) - in light of short-read issues that I became aware later, xxx comment on line 9232-9235 in nsImapMockChannel() needs handling and it is indeed handled in the later patch in Bug 1170606. xxx comment on line 324 of nsMsgMailboxParse.cpp was analyzed in Bug 1170606 and it was determined that we don't need handle the short-read issue here Printing serious I/O error message in error console: Before the whole patch series was approved and applied, I may want to output the error messages added in my patch here in this bugzilla entry into error console using the macros introduced in bug 1116055. This is because of the observation below. Seeing the error in real-life usage will help the developers at analyzing bug reports. Learning what type of I/O errors the user may have encountered, developers can co-relate this to the buggy behavior the user reports, and come up with better fixes, etc. AND MORE TO THE POINT: it was not utterly obvious what error recovery ought to be taken when these write errors occur. This is because the current code simply ignored such I/O errors in many places :-( Thus I could not put in sensible error recovery unless there is already premature "return" with error code or NS_ENSURE_SUCCESS() in place. So I am afraid that seeing the error message in release-build using error console is the only way to learn the errors happened (that is, the type of I/O errors detected my patch). And probably we failed to take recovery action soonish until something went wrong later and the user noticed it then :-( In this sense, my patch is not complete yet, but at least IT IS MUCH BETTER THAN BEFORE. It handles the major error modes in Pop3 (and some in imap code). Perfect is the enemy of good as the saying goes. PS: I have noticed that return values of |Seek()| and |Tell()| are not checked very well. I will add the check in a planned patch to pick up these left-over failures to check low-level I/O function return values (mainly Seek and Tell) after the patch [7] add-short-read-handling.patch: Bug 1170606 - C-C T-B fixes to take care of short read so that existing patches do not have to be modified heavily. PPS: It might be a good idea to merge the patches of - bug 1122698 Cleaning of incorrect Close, unchecked Flush, Write etc. in nsPop3Sink.cpp and friends (part 1 of a series) - bug 1134527 Add error value checking of Close() and Flush() (part-2 of a series) - bug 1134529 Check Write failures (part-3 of a series) - bug 1174500 C-C Thunderbird - Cleaning of incorrect Close, unchecked Flush, Write etc. in nsPop3Sink.cpp and friends. (step 4) into one big patch. (Yet, still keeping the following patches separate. - bug 1116055 Performance issue: Failure to use buffered write (comm-central thunderbird) - bug 1176857 - C-C TB Enabling buffering for pop3 download (and possibly part of imap file operation) This may make it easier to check what I/O error is being checked, what action is taken, etc. in one pass.
Depends on: 1242030
In bug 1242030, I have uploaded merged patches from the bug 1122698, bug 1134527, bug 1134529, bug 1174500 The reason for consolidation is that the patch as a whole is much easier to read when consolidated. However, at the same time, I have split the resulting gigantic patch into a few smaller chunks that address files under only certain directories each. I will upload the latest road map that explains the patches and application order shortly. TIA
As per comment 8, the patch discussed here is now merged and split per directories in bug 1242030. So I am duping this bugzilla entry to the new bug.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → DUPLICATE
Depends on: 1306914
No longer depends on: 1306914
Comment on attachment 8625465 [details] [diff] [review] ( step-3) work-in-progress patch to add error checks to |Write| (in addtition to step-1 and step 2) [take 3, update to match current tree] No chance that Joshua will review this. The patch most likely has rot anyway.
Attachment #8625465 - Flags: review?(Pidgeot18)
(In reply to Jorg K (GMT+2) from comment #10) > Comment on attachment 8625465 [details] [diff] [review] > ( step-3) work-in-progress patch to add error checks to |Write| (in > addtition to step-1 and step 2) [take 3, update to match current tree] > > No chance that Joshua will review this. The patch most likely has rot anyway. You are wright. I am working on updating the patch. It now compiles and succeeds with |make mozmill| test with April source tree. I need to re-upgrade my local source tree to make sure the patch works with the current tree.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: