Enabling buffering for file stream to write message for C-C TB

NEW
Assigned to

Status

3 years ago
2 years ago

People

(Reporter: ishikawa, Assigned: ishikawa)

Tracking

(Depends on: 1 bug, Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 3 obsolete attachments)

(Assignee)

Description

3 years ago
Created attachment 8711180 [details] [diff] [review]
enable-buffering-1st-step.patch

This patch should be applied after the error checks of |Close|, |Flush|, etc. are in place (See bug 1242030 that consolidate the patches from bug 1122698, bug 1134527, bug 1134529, and bug 1174500).

I will post the roadmap of the patches shortly.

TIA
(Assignee)

Updated

3 years ago
Assignee: nobody → ishikawa
(Assignee)

Comment 1

3 years ago
Created attachment 8721173 [details] [diff] [review]
enable-buffering-1st-step.patch (Feb 18 version)

A full explanation of the updated patches and road map is in bug 116055 comment 107
https://bugzilla.mozilla.org/show_bug.cgi?id=1116055#c107

I am excerpting a relevant part for this bugzilla entry and a couple of other very related patches.



========================================
PLANNED ORDER of landing patches
========================================

bug 1116055
  -> bug 1242030 (consolidation of 1122698, 1134527, 1134529, 1174500)
    -> bug 1242042 
       -> bug 1176857
         -> bug 1242046
           -> bug 1242050
             -> bug 1170606

========================================
An overview of patches in bugzilla entries.
----------------------------------------

[In February 2016, I fixed the errors observed ONLY UNDER WINDOWS when
we use Maildir format for storage. Thus I needed to tweak my patch set
somewhat and added a patch to take care of this issue and so I updated
this write-up.]

......

[7] (New in 2016 ) Bug 1242042 - Enabling buffering for file stream to write message for C-C TB
A enable-buffering-1st-step.patch: Enabling buffering first step.

 I have extracted the code to enable buffering to be applied after
 the additional error checks in the preceding patches [1]-[6].

 After a couple of months of testing of the final patches, I realize
 a certain buffering operations that were in the patch are actually
 unnecessary, and so commented them out. (We can enable them back
 again if my understanding of the code flow is incorrect. There are
 many combination of execution paths, and I may misunderstood the
 nature of certain paths. My gripe about slow downloading of
 messages and saving of large attachment *ARE* handled properly by
 buffering now.)

 I have created the separate patch to enable buffering since it will
 be necessary to apply this patch ONLY AFTER previous patches to
 check for primitive I/O errors have landed. (This is because
 enabling the buffering requires proper error checking of |Close()|
 and |Flush()|. |Close| was never checked properly in the existing
 code.   These are only taken care of AFTER patches [1]-[6] above are
 applied.

 BTW, I have not had the time to fix this issue of missing error
 checks of |Close()| in files under mailnews/import directory
 completely yet. If you don't believe me, read the source!
 Fixing this issue would be a good student project.

[8] Bug 1176857
1116055-acelist.patch: One liner patch posted in Bug 1116055 to return a buffered stream from mbox::getNewOutputStream by aceman

 The patch has been posted to the following bugzilla entry.
 Bug 1176857 - C-C TB Enabling buffering for pop3 download (and possibly part of imap file operation)

 After the latest review of my local patches and testing, I realized
 that some buffering operations in my patch may be unnecessary for some
 operations. But I leave them as such.
 This patch from aceman could fall under that category, too.
 We may want to review the operation of the program more carefully
 to see if it is indeed unnecessary for some operations, but it does
 not hurt to have this buffering function for now.
 Lack of internal documentation of low-level protocol handling,
 etc. hurts.

[9] (NEW in 2016)
Bug 1242046:  	Removing unnecessary |Seek| that caused the C-C TB to operate slowly in terms of I/O 
  removing-a-Seek-rev02.patch: removing a Seek that is performance
  pig!  This was extracted into a new patch from the patches above [5]-[8].

  Even if buffering is properly enabled, one offending |Seek()|
  needs to be disabled so that the buffering is effective for
  writing messages to a local file. Otherwise, this |Seek()|
  interferes with buffering, and essentially we call system |write| 
  for EACH LINE of message, thus lowering output performance
  significantly.

  Killing this |Seek| did the trick. We now call ONE |Seek| per
  message, NOT one |Seek| per ONE LINE of message!

  This patch requires the landing of [1]-[8] before it.

...
Attachment #8711180 - Attachment is obsolete: true
(Assignee)

Updated

2 years ago
Blocks: 1176857
(Assignee)

Updated

2 years ago
Depends on: 1242030
(Assignee)

Comment 2

2 years ago
Created attachment 8798563 [details] [diff] [review]
enable-buffering-1st-step.patch

Bitrot, rebase, etc.
Attachment #8721173 - Attachment is obsolete: true

Comment 3

2 years ago
Comment on attachment 8798563 [details] [diff] [review]
enable-buffering-1st-step.patch

Review of attachment 8798563 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/local/src/nsLocalMailFolder.cpp
@@ +2178,5 @@
> +  // sanity check
> +  NS_ASSERTION(mCopyState->m_fileStream, "mCopyState->m_fileStream should not be nullptr.");
> +// let us buffer the m_fileStream
> +  mCopyState->m_fileStream->Flush(); // just in case
> +  mCopyState->m_fileStream = NS_BufferOutputStream(mCopyState->m_fileStream, 64 * 1024 );

This is the same change that was attempted in bug 769346:
https://hg.mozilla.org/comm-central/rev/a9f658106adb#l1.28

This was later backed out since it caused bug 815012. Have we ever investigated why this has caused the corruption described in bug 815012?

I guess all the other bugs in this series are there to ensure proper error handling so we would see problems early, but somehow I still don't see the big picture.
(Assignee)

Comment 4

2 years ago
(In reply to Jorg K (GMT+2) from comment #3)
> Comment on attachment 8798563 [details] [diff] [review]
> enable-buffering-1st-step.patch
> 
> Review of attachment 8798563 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: mailnews/local/src/nsLocalMailFolder.cpp
> @@ +2178,5 @@
> > +  // sanity check
> > +  NS_ASSERTION(mCopyState->m_fileStream, "mCopyState->m_fileStream should not be nullptr.");
> > +// let us buffer the m_fileStream
> > +  mCopyState->m_fileStream->Flush(); // just in case
> > +  mCopyState->m_fileStream = NS_BufferOutputStream(mCopyState->m_fileStream, 64 * 1024 );
> 
> This is the same change that was attempted in bug 769346:
> https://hg.mozilla.org/comm-central/rev/a9f658106adb#l1.28
> 
> This was later backed out since it caused bug 815012. Have we ever
> investigated why this has caused the corruption described in bug 815012?

I think the rewinding of file pointer due to the function(s) that enables buffered outupt
and not taking care of that effect was to blame.

At least, after the proposed changes including the |Seek()| being necessary per message copy, 
the intense testing in the 1Q of 2015, I did not see any corruption.
(Filtering, downloading/copying/moving between
folders of intra and inter accounts copies [pop3 and imap accounts].

> I guess all the other bugs in this series are there to ensure proper error
> handling so we would see problems early, but somehow I still don't see the
> big picture.

To be honest, nether do I, especially the error handling [or the lack of it]
in imap subdirectory remains mostly
a mystery to me: this is because I have not used imap server extensively and I don't have the gut feeling of its operation and I have not read up on the latest RFC regarding IMAP protocol. So I can't say
for sure how imap server and client interaction should behave.
(In the early 2000's, the imap server implementations available then was so buggy, I stayed away from imap server for a long time when I use TB. I think people today have started to use gmail in IMAP mode fairly casually but GMAIL seems to have its own quirks regarding the extesion of IMAP protocol or lack of some features, but I digress. 
Until I am FORCED to use an IMAP server, I don't think I will read every lines of imap subdirectory.)
(Assignee)

Comment 5

2 years ago
Created attachment 8807624 [details] [diff] [review]
enable-buffering-1st-step.patch

Fixed bitrot and modified some comments so that they are more "appropriate" in C-C source tree.
Attachment #8798563 - Attachment is obsolete: true
So your patch(es) addresses the concerns of bug 815012 comment 45 and  bug 815012 comment 49?
Flags: needinfo?(ishikawa)

Comment 7

2 years ago
I asked the same in comment #3 and the answer was in comment #4. My understanding is that Chiaki reckons that a seek was missing from the initial attempt in bug 769346. This seek was now added in bug 1116055. Also, missing error handling didn't detect that things went wrong in bug 769346. After Chiaki's other patches have landed, we will be in a better position to assess the side-effects of what is re-proposed here. Maybe we'll have it for Christmas ;-)
(Assignee)

Comment 8

2 years ago
(In reply to Jorg K (GMT+1) from comment #7)
> I asked the same in comment #3 and the answer was in comment #4. My
> understanding is that Chiaki reckons that a seek was missing from the
> initial attempt in bug 769346. This seek was now added in bug 1116055. Also,
> missing error handling didn't detect that things went wrong in bug 769346.
> After Chiaki's other patches have landed, we will be in a better position to
> assess the side-effects of what is re-proposed here. Maybe we'll have it for
> Christmas ;-)

Yes, I still think my analysis is correct at least for mbox (Berkeley mailbox) case bacin in 2012 (and I suspect Maildir was not an issue back then yet).

bug 815012 comment 51 singles out the following patch as the  culprit for the corruption.

https://hg.mozilla.org/comm-central/rev/a9f658106adb

In it the change was this:

  rv = mCopyState->m_msgStore->
          GetNewMsgOutputStream(this, getter_AddRefs(mCopyState->m_newHdr),
                                &reusable,
                                getter_AddRefs(mCopyState->m_fileStream));
   NS_ENSURE_SUCCESS(rv, rv);
+
+  /* Buffer the output if possible */
+  /* BIG performance improvement on Windows with some network file systems */
+  mCopyState->m_fileStream = NS_BufferOutputStream(mCopyState->m_fileStream, EIGHT_K);
+
   if (mCopyState->m_parseMsgState)
     mCopyState->m_parseMsgState->SetNewMsgHdr(mCopyState->m_newHdr);

Basically, a newly created filestream mCopyState->m_fileStream is turned into a buffered one.

Note: 
GetNewMsgOutputStream() for mbox positioned the file seek pointer to the end of the file.
https://dxr.mozilla.org/comm-central/source/mailnews/local/src/nsMsgBrkMBoxStore.cpp#610
see line 646 and line 662.

However, it does not seem to me that 
NS_BufferOutputStream() [or other variants] honors this movement of file seek pointer to the end all the time.
No, I think I should rephrase.
Well, for a newly created file. seek position is at 0.
But for  a file that already contains data, seek position is not at zero.
I am not sure if the buffered output routine can cope with the situation when the buffer is empty and the position of
file seek pointer is not at zero after Init() very well.


https://dxr.mozilla.org/comm-central/source/mozilla/netwerk/base/nsBufferedStreams.cpp?q=%2Bfunction%3A%22nsBufferedStream%3A%3AInit%28nsISupports+*%2C+uint32_t%29%22&redirect_type=single#68

I suspect it can't and get confused, and that is the reason for file data corruption.
I think we better start with file position 0 OR before writing anything
move the file seek position to the desired position by explicit seek to set the internal data structure straightened out, and my patch is taking the latter fix. Call seek once before each message copy/download begins. And it seems to work well.
Come to think of it, we may want to make it robust in the future after this comment.

When I think about it, IN A PERFECT WORLD where no disk error or network error occurs, we can simply
add Seek() after the call to NS_BufferOutputStream() in the backed-out patch.
Unfortunately, in the REAL WORLD where such errors are rampant, we have to check the errors that are now deferred
until Close() time because of the buffering. Unfortunately, TB code in many places never bothered to check the error for Close().
That is why I had to add error checks in many places simply to make TB more robust when we enable buffering.

> we will be in a better position

Definitely.
Flags: needinfo?(ishikawa)
You need to log in before you can comment on or make changes to this bug.