Open Bug 1242046 Opened 4 years ago Updated 16 days ago

Removing unnecessary |Seek| that caused the C-C TB to operate slowly in terms of I/O

Categories

(MailNews Core :: Networking: POP, defect)

defect
Not set

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: ishikawa, Assigned: ishikawa)

References

(Depends on 1 open bug)

Details

(Keywords: perf)

Attachments

(1 file, 5 obsolete files)

Attached patch removing-a-Seek-rev02.patch (obsolete) — Splinter Review
This patch removes a |Seek| that offends the buffering output and caused the writing of message calling system |write| for EACH LINE of a message.
Calling |write| for each line of a message hurts for large message with, say, JPEG photo attachment, and more so with remote file system such as CIFS, NFS, etc.
This patch has to land after other patches to add proper checks for I/O errors of low-level routines and buffering is properly enabled for some output streams.
(See bug 1242030 that consolidate the patches from bug 1122698, bug 1134527, bug 1134529, and bug 1174500). 

I will post the roadmap of the patches shortly.

TIA
Assignee: nobody → ishikawa
Keywords: perf
Created attachment 8721175 [details] [diff] [review] [diff] [review]
Debug output with One liner patch posted in Bug 1116055 to return a buffered stream from mbox::getNewOutputStream by aceman buffering-1116055-acelist.patch

A full explanation of the updated patches and road map is in bug 116055 comment 107
https://bugzilla.mozilla.org/show_bug.cgi?id=1116055#c107

I am excerpting a relevant part for this bugzilla entry and a couple of other very related patches (they must come before this patch.).



========================================
PLANNED ORDER of landing patches
========================================

bug 1116055
  -> bug 1242030 (consolidation of 1122698, 1134527, 1134529, 1174500)
    -> bug 1242042 
       -> bug 1176857
         -> bug 1242046
           -> bug 1242050
             -> bug 1170606

========================================
An overview of patches in bugzilla entries.
----------------------------------------

[In February 2016, I fixed the errors observed ONLY UNDER WINDOWS when
we use Maildir format for storage. Thus I needed to tweak my patch set
somewhat and added a patch to take care of this issue and so I updated
this write-up.]

......

[7] (New in 2016 ) Bug 1242042 - Enabling buffering for file stream to write message for C-C TB
A enable-buffering-1st-step.patch: Enabling buffering first step.

 I have extracted the code to enable buffering to be applied after
 the additional error checks in the preceding patches [1]-[6].

 After a couple of months of testing of the final patches, I realize
 a certain buffering operations that were in the patch are actually
 unnecessary, and so commented them out. (We can enable them back
 again if my understanding of the code flow is incorrect. There are
 many combination of execution paths, and I may misunderstood the
 nature of certain paths. My gripe about slow downloading of
 messages and saving of large attachment *ARE* handled properly by
 buffering now.)

 I have created the separate patch to enable buffering since it will
 be necessary to apply this patch ONLY AFTER previous patches to
 check for primitive I/O errors have landed. (This is because
 enabling the buffering requires proper error checking of |Close()|
 and |Flush()|. |Close| was never checked properly in the existing
 code.   These are only taken care of AFTER patches [1]-[6] above are
 applied.

 BTW, I have not had the time to fix this issue of missing error
 checks of |Close()| in files under mailnews/import directory
 completely yet. If you don't believe me, read the source!
 Fixing this issue would be a good student project.

[8] Bug 1176857
1116055-acelist.patch: One liner patch posted in Bug 1116055 to return a buffered stream from mbox::getNewOutputStream by aceman

 The patch has been posted to the following bugzilla entry.
 Bug 1176857 - C-C TB Enabling buffering for pop3 download (and possibly part of imap file operation)

 After the latest review of my local patches and testing, I realized
 that some buffering operations in my patch may be unnecessary for some
 operations. But I leave them as such.
 This patch from aceman could fall under that category, too.
 We may want to review the operation of the program more carefully
 to see if it is indeed unnecessary for some operations, but it does
 not hurt to have this buffering function for now.
 Lack of internal documentation of low-level protocol handling,
 etc. hurts.

[9] (NEW in 2016)
Bug 1242046:  	Removing unnecessary |Seek| that caused the C-C TB to operate slowly in terms of I/O 
  removing-a-Seek-rev02.patch: removing a Seek that is performance
  pig!  This was extracted into a new patch from the patches above [5]-[8].

  Even if buffering is properly enabled, one offending |Seek()|
  needs to be disabled so that the buffering is effective for
  writing messages to a local file. Otherwise, this |Seek()|
  interferes with buffering, and essentially we call system |write| 
  for EACH LINE of message, thus lowering output performance
  significantly.

  Killing this |Seek| did the trick. We now call ONE |Seek| per
  message, NOT one |Seek| per ONE LINE of message!

  This patch requires the landing of [1]-[8] before it.
Attachment #8711182 - Attachment is obsolete: true
Attachment #8721178 - Attachment description: removing-a-Seek-rev02.patch → removing-a-Seek-rev02.patch (Feb 18 version)
Component: OS Integration → Networking: POP
Depends on: 1116055
Product: Thunderbird → MailNews Core
Version: unspecified → Trunk
Depends on: 1176857
Depends on: 1306914
No longer depends on: 1306914
Blocks: 1242050
Is this dependent on bug 1306914 or a dupe of it now?

If we collect user reports we can decide whether to remove the seeks and their checks. In that case we can remove most of the code of bug 1116055, not just the single define as in the patch.

Is this patch meant for temporary testing the perf increase of the removal of the seek?
Status: NEW → ASSIGNED
Depends on: 1306914
(In reply to :aceman from comment #2)
> Is this dependent on bug 1306914 or a dupe of it now?
> 
> If we collect user reports we can decide whether to remove the seeks and
> their checks. In that case we can remove most of the code of bug 1116055,
> not just the single define as in the patch.
> 
> Is this patch meant for temporary testing the perf increase of the removal
> of the seek?

I am afraid that 1306914 may have been created by mistake somehow while I was sorting out the latest patches. [I am not sure. I don't think I meant to create it: this is because I did not put any patch to that bug. Hmm...]

In the series of patchs for enabling buffering,
this patch is meant to be the final step of removing the unnecessary |Seek()|.
Without this patch, the buffering is basically nullified.


bug 1116055 - Failure to use buffered write (c-c) (1patch approved neil, not landed) polishing by aceman, should land soon
  => bug 939548 - Check return Close()  nsImapOfflineSync::ProcessAppendMsgOperation (1 patch review?=aceman)
    -> bug 1242030 "consolidated" (imap+local+base no review requested, +2import patches one neil=r+)
      -> bug 1242042 Enabling buffering file stream to write message (1patch, no review requested***)
         -> bug 1176857 Enable buffer pop3DL + imap (1patch no review requested)
 ***>       -> bug 1242046 remove Seek=>slow I/O (1 triv patch no review requested*)  <***
               -> bug 1242050 output stream buffer 4>16KB (aceman review+landed)
              
              -> bug 1170606 deal with short read (8 patches, 3 trivial?, 5 not trivial no reviews requested) <- This can wait.)


|Seek()| before each |Write()| of a line nullifies the buffering and so the |Seek()| needs to be removed.
The patches that lead to this is
 - for checking the internal consistency of |Seek()| position,
 - enabling of buffering,
 - now that |Close()| can return errors due to buffering, stricter checking of |Close()|,
 - etc.
The details are in 
https://bugzilla.mozilla.org/show_bug.cgi?id=1116055#c107
and there have been a few more cleanups, but the patches need be read for these.
> I am afraid that 1306914 may have been created by mistake somehow while I was sorting out the latest patches. [I am not sure. I don't think I meant to create it: this is because I did not put any patch to that bug. Hmm...]

I checked my Activity log and really not sure how that bug 130694 was created. My mouse must have hit "Clone this bug" near the right bottom corner my mistake (when I tried to reach my cup of coffee or something).

Please ignore it for a moment and let me check this over the next couple of days. I think I am going to close it if it were made by mistake.
I know, I asked you to create a new bug to collect the user reports (in bug 1116055). So that is bug 1306914. In that bug we may need to fix some problems found by the users.

Only after that we can remove the seeks in the bug here.

So all is fine now and my question was incorrect.
The dependency is set correctly now and I'll also put it the to etherpad.
(In reply to :aceman from comment #5)
> I know, I asked you to create a new bug to collect the user reports (in bug
> 1116055). So that is bug 1306914. In that bug we may need to fix some
> problems found by the users.
> 
> Only after that we can remove the seeks in the bug here.
> 
> So all is fine now and my question was incorrect.
> The dependency is set correctly now and I'll also put it the to etherpad.

You were right.

I was totally confused myself.
Bug 1306914 has been created to let the users of beta/nightly to report the inconsistent |Seek()| position message. 
It was based on your suggestion that  a new bugzilla entry was made to collect such user report since bug 1116055 was too crowded now.

Quote of the comment in https://bugzilla.mozilla.org/show_bug.cgi?id=1306914#c3
---
When the user sees a seek position inconsistency during the beta testing period of patches to enable write buffering (see bug 1116055), C-C TB nightly would print the following message:

WriteLineToMailbox detected an unexpected file position change. If you can reliably reproduce this message, please report the steps you used to bug 1306914.

Please report the type of OS, 32-bit vs 64-bit, and CPU type and report the exact line you see in the error log or in the console.

Thank you!
---
Rebase, bitrot, etc.
Attachment #8721178 - Attachment is obsolete: true
Attached patch removing-a-Seek-rev02.patch (obsolete) — Splinter Review
Fixed bitrot and modified some comments so that they are more "appropriate" in C-C source tree.
Attachment #8798565 - Attachment is obsolete: true
Attached patch removing-a-Seek-rev02.patch (obsolete) — Splinter Review
bitrot.
Attachment #8807627 - Attachment is obsolete: true

In bug 1242030,
the following patches will be uploaded: it contains all the changes that have taken place for the last couple of years including the modification to match the clang-formatted source tree.

1242030-DONT-USE-REUSABLE-new-part-1.patch: bug 1242030: DONT-USE-REUSABLE new part-1 new splitting of (consolidation of 1122698, 1134527, 1134529, 1174500)
1242030-DONT-USE-REUSABLE-new-part-2-imap-JK-v1-rev02.patch: bug 1242030: DONT-USE-REUSABLE Improve error handling in file/stream I/O. Part 2. r=jorgk.
1242030-DONT-USE-REUSABLE-new-part-3-import-JK-v1-rev03.patch: bug 1242030: DONT-USE-REUSABLE Improve error handling in file/stream I/O. Part 3 (was 2). r=jorgk. revision 2
1242030-DONT-USE-REUSABLE-new-part-4-local-less-pop3.patch: bug 1242030: DONT-USE-REUSABLE new part-4 new splitting of (consolidation of 1122698, 1134527, 1134529, 1174500)
1242030-DONT-USE-REUSABLE-new-part-5-pop3.patch: bug 1242030: DONT-USE-REUSABLE new part-5 new splitting of (consolidation of 1122698, 1134527, 1134529, 1174500)

The following two patches, including one here, will complete the buffered output finally (!).

The other one:
enable-buffering-1st-step.patch: Bug 1242042: Enabling buffering for file stream to write message for C-C TB (Enabling buffering first step.)

The current one is this.
removing-a-Seek-rev02.patch: Bug 1242046 - Removing unnecessary |Seek| that caused the C-C TB to operate slowly in terms of I/O

clang-formatted.

Attachment #8834484 - Attachment is obsolete: true
You need to log in before you can comment on or make changes to this bug.