Open Bug 1890253 Opened 5 months ago Updated 3 months ago

Compact does not work for C-C TB POP3. Inbox does not get compacted although some messages were moved during download.

Categories

(MailNews Core :: Networking: POP, defect)

Thunderbird 126
x86_64
Linux
defect

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: ishikawa, Assigned: ishikawa)

References

Details

Attachments

(5 files)

I am cloning this bugzilla from bug 1890135.
In THAT bug, compact fails to be performed when all the messages were deleted inside Inbox. (I experience this under Linux, but it could be on all the architectures.)

This is different.

REPEAT-BY:
I am using mbox storage format and use pop3.

Delete all the messages in Inbox.
Compact Inbox and see its file size becomes zero.  (This did not happen before the patch in Bug 1890135)

Now send e-mails to Inbox some of which are moved to other folders by message filter.

(They are moved to folders after downloaded.)

This may be crucial, but I don't know.
The order of e-mails received is such that the first 19 messages are NOT moved to other folders, but remain 
in the Inbox since there is no message filter to move them.

There follow messages that are filtered to folder-D, folder-E and folder -F in that order, 19 messages in each folder.
Then comes the 19 groups of messages (one that remains in Inbox, followed by messages filetered to folder-D, folder-E , folder-F).

152  messages in Total.

 When I select Inbox and choose compact from the right-button context menu, compaction does not take place at all.

What should happen.
Compact should take place to remove the unused space from mbox file.

Well, if I delete the first message in Inbox, compaction happens.

Ben thinks this could be due to the code to move a message by filtering may have forgotten to update the expunge count.

From https://bugzilla.mozilla.org/show_bug.cgi?id=
1890135#c6
Note to track down the issue, the number of messages mentioned here is about 1/4 of the case discussed in
the following.
(In reply to ISHIKAWA, Chiaki from comment #5)
--- begin quote ---
Hmm...
I find another case where C-C TB won't compact Inbox after a successful download of 600+ messages
and about 3/4 of them were moved to other folders.
I tried hitting compact afew times, but nothing happened.
And after I deleted the first message, it compacted the file.
It reported about 20KB saving (of the first message), but actually the file shrunk much more than that (about 450 messages had been deleted already). So there is something wrong in the counting and that it did not offer to compact the file until I deleted a single message. Hmm...
Maybe I have to file another bug on this.

Hmm... I'd say it's a separate problem. Sounds like it's not even getting as far as starting the compaction. My guess would be that the fault is in the message-moving code, forgetting to update the expunge count for the folder as messages are deleted from it.
The compaction doesn't run if the expunge count is zero.
By manually deleting a message the expunge count gets set to something non-zero, and so compaction starts working again.

--- end quote ---

Bug 260698 deal with incorrect estimate of saved file size by compaction.
Compact folders - shows wrong estimated and compacted size. Compacts too often. totalExpungedBytes number is wrong.

Maybe I should trace the mExpungedBytes data carefully, and if the message database is properly available to report the mExpungedBytes, etc.

See Also: → 1890135, 1260698, 492344

Strange. I see the invocation of DiscardNewMessage() in otherwise perfect download.
I wonder if there is some optimization to remove the last message if that was moved to other folder from Inbox by calling DiscardNewMessage() (Is there some magic expected of DidscardNewMessage() to truncate the Inbox folder?)

I sent a series of messages in the following manner:
test data for messages have been created.

/var/tmp/3C is sent  ... will stay in Inbox
/var/tmp/2C is sent  ... will stay in Inbox
/var/tmp/1C is sent  ... will stay in Inbox
/var/tmp/3D is sent  ... moved to folder-D by a message filter.
/var/tmp/2D is sent      ditto
/var/tmp/1D is sent       ditto
/var/tmp/3E is sent  ... moved to folder-E by a message filter.
/var/tmp/2E is sent        ditto
/var/tmp/1E is sent        ditto
/var/tmp/3F is sent  ... moved to folder-F by a message filter.
/var/tmp/2F is sent      ditto
/var/tmp/1F is sent      dottp

I counted the number of times DiscardNewMessage() are called and I think they are 9 times and you can see
they are the number of messages that are moved from the end of Inbox mbox file by the filter.
Oh, well, that could be due to the locally added patch to try to
enable the compaction for moved message.
But I am not sure if that is the correct approach.
In the job,:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=dc2f2003fb46537698936c69d20c096a6e9040d2
the patch is
https://hg.mozilla.org/try-comm-central/rev/6b255fadebd5af1fda4c86331997e5c935f31f4d
In the file, comm/mailnews/search/src/nsMsgFilterService.cpp,
filter move shares the code to copy operation. BUT it forgets to set the removal information for the messages copied successfully.
The logic of the code, esp. flow is a bit hard to understand.
Here is my comment in the patch:

+
+          // mark messages as expunged if fall through from MOVE
+          // operation Copied verbatim from DELETE
+
+          // XXX BIG QUESTION.
+          // Not sure if the copy operation just done has successfully
+          // ended or not. We need to mark the messages as expunged
+          // ONLY WHEN the COPY succeeds. Anyone care to check the
+          // logic/flow here?  If there are two async operations
+          // issued in tandem, does the paramters in the following
+          // line properly refers to the original value as intended?
+          // The logical flow ought to be
+          // initiate copy
+          //   -> copy ends successfully
+          //     -> ONLY THEN we mark the original messages as expunged.
+
+          // The following seems insufficient for now?
+          if (actionType == nsMsgFilterAction::MoveToFolder) {
+            rv = curFolder->DeleteMessages(m_searchHitHdrs, m_msgWindow, false,
+                                           false, nullptr, false /*allow Undo*/);
+            BREAK_ACTION_IF_FAILURE(rv, "Deleting messages after move failed");
+         

So this may be invoking DiscardNewMessage() for the just downloaded but moved message.
I am not sure if it is working or not at the moment (probably not compaction does not seem to work after the download.
Oh, wait, if the message is deleted and not added because of DiscardNewMessage(), then compaction won't be necessary.
Neat. But is this what is expected of my patch here to set the messages as expunged by calling DeleteMessages() ?
They seem to only mark the messages as expunged.
As a matter of fact, the file is still big. I found the messages moved to the other folders are still inside Inbox.
So it is buggy, but there seems to be a pattern. )

Is my attempted patch the correct one?
Obviously not quite. I will show the incorrect behavior.

I wonder if anyone can shed light on this behavior.

Edit: Typo about folder-F fixed.

I am attaching the e-mails sent in this order.
The number is rather large. 152 in total.
The number of messages stored (not expunged) in Inbox is shown.
Problem is that the messages that have been moved to folder-D, folder-E, and folder-F
and still remain. But the numbers are so uniform and I was surprised.
The sending pattern may have something to do.

Inbox ought to contain messages that correspond to
prepared data 1C, 2C, 3C, ..., 19C.
Messages that contain data for 1D, 2D, 3D, ..., 1E, 2E, 3E, ..., 1F, 2F, 3F, ... are moved to
folder-D, folder-E and folder-F and should not remain.

And yet with work-in-progress patch,
messages that are moved to other folders are still remain (the meta database think there are
non-existent and only show messages that contain data for 1C, 2C, 3C, ...
And since the space occupied by messages for 1D, 2D, 3D, ..., 1F, 2F, 3F ..., are not marked as to be expunged, compaction does not occur.

Like I said, somehow DiscardNewMessage() is called (maybe due to WIP patch), but
something that marks the occupied space by the just downloaded message as expungible is not executed with my WIP patch.

Hmm...

Because in the test in preceding test sends messages that contain data for 1C, 2C, 3C, ..., 1D, 2D, 3D, ...1E, 2E, 3E, ... 1F, 2F, 3F, ...
twice for each message, Inbox receives all of them and since folder-{D,E,F} receive one fourth of the messages each,
and Inbox somehow is not truncated automagically and compaction does not happen,
the listing of mbox files show the size of folder mbox files accordingly.
Inbox is approximately 4 times the size of folder-{D,E,F}.

-rw------- 1 ishikawa root     703 Apr  6 07:19 Drafts
-rw-r--r-- 1 ishikawa root    1190 Apr  7 00:51 Drafts.msf
-rw------- 1 ishikawa root 1825480 Apr  9 05:43 Inbox
-rw-r--r-- 1 ishikawa root   20932 Apr  9 05:43 Inbox.msf
-rw------- 1 ishikawa root 1825480 Apr  9 05:41 Trash
-rw-r--r-- 1 ishikawa root   64955 Apr  9 05:47 Trash.msf
-rw------- 1 ishikawa root       0 Apr  8 06:31 folder-C
-rw-r--r-- 1 ishikawa root    1594 Apr  9 05:07 folder-C.msf
-rw------- 1 ishikawa root  456374 Apr  9 05:43 folder-D
-rw-r--r-- 1 ishikawa root   17053 Apr  9 05:49 folder-D.msf
-rw------- 1 ishikawa root  456374 Apr  9 05:43 folder-E
-rw-r--r-- 1 ishikawa root   17705 Apr  9 05:47 folder-E.msf
-rw------- 1 ishikawa root  456376 Apr  9 05:43 folder-F
-rw-r--r-- 1 ishikawa root   18181 Apr  9 05:43 folder-F.msf
-rw-r--r-- 1 ishikawa root     505 Apr  8 06:56 msgFilterRules.dat
-rw-r--r-- 1 ishikawa root      80 Apr  9 05:43 popstate.dat

Until I fix this issue, automatic testing of error behavior is very difficult.
My idea was to send a known number of e-mails of different contents,
and after an error recovery from a very simple I/O error,
download the rest of the messages,
and after compacting Inbox,
I count the number of each messages in the said folders by scanning the files
by an external program, and see if the numbers of each messages match what I sent, and the contents are intact.

Compacting does not happen without deleting a message in Inbox as of now due to this bug.

I have an idea.
MoveToFolder filter operation can be decomposed into
CopyToFolder filter operation
Delete filter operation

Problem is that copy is async and ApplyFilter seems to be called in a tail-recursion like
fashion to make sure async copy is finished before proceeding to the next filter (well this is what I read from the
comments, not the real code).
So I am not entirely sure where to modify the list of actions.: I am not sure who owns it and when we are allowed to change.
Also, we see lines like

    // 'm_curFolder' can be reset asynchronously by the copy service
    // calling OnStopCopy(). So take a local copy here and use it throughout the
    // function.
    nsCOMPtr<nsIMsgFolder> curFolder = m_curFolder;
    nsCOMPtr<nsIMsgFilter> curFilter = m_curFilter;

So we have to be sure that the m_searchHitHdrs, m_msgWindow passed to the next addtional DELETE that follows copy operation are the correct ones. But I am not sure how I can enforce it.

One thing is for sure.: MoveToFilter operation forgets to mark the messages for deletion in the current C-C TB source.

Oops.
For Pop3 case, maybe the heart of the operation
is in
https://searchfox.org/comm-central/source/mailnews/local/src/nsParseMailbox.cpp#1727
If so, either MoveNewlyDownloadedMessage() or MoveIncorporatedMessage() fails
to mark the message in Inbox as being expunged when the move succeeds, OR
we are expected to mark it when the move succeeded at (*) in the code

              nsCOMPtr<nsIMsgPluggableStore> msgStore;
              rv = m_downloadFolder->GetMsgStore(getter_AddRefs(msgStore));
              if (NS_SUCCEEDED(rv))
                rv = msgStore->MoveNewlyDownloadedMessage(msgHdr, destIFolder,
                                                          &msgMoved);
              if (NS_SUCCEEDED(rv) && !msgMoved)
                rv = MoveIncorporatedMessage(msgHdr, m_mailDB, destIFolder,
                                             filter, msgWindow);
              m_msgMovedByFilter = NS_SUCCEEDED(rv);
              if (!m_msgMovedByFilter /* == NS_FAILED(err) */) {
                // XXX: Invoke MSG_LOG_TO_CONSOLE once bug 1135265 lands.
                if (loggingEnabled) {
                  (void)filter->LogRuleHitFail(filterAction, msgHdr, rv,
                                               "filterFailureMoveFailed"_ns);
                }
              }
            //   *** <--- Here *\/
            // else  
            // mark the message to be deleted.
       

Initially I thought I found a smoking gun.

But I need an opinion from Ben.

What I did to check where the problem lies.

I started with an old version of TB and updated to the latest ESR.
And then C-C TB without my buffered write patch (meaing daily).

  1. Old TB 68.11.0 (64-bit) under linux does not show the symptom. The Inbox file properly removes the final message that was moved to other subfolder and thus the Inbox did not grow (or leave the room of moved message in the folder file.)
    I happened to have this old version in my local disk.
    I am restarting to update Thunderbird to see if later ESR works OK. (IO am on the release update channel.)

  2. Now, it was moved into 68.12.1 (suggested to update to 78.14.0).
    I tested the same procedure. I first removed all the messages in Inbox and compact it.
    And then send the messages, some of which are moved to other subfolders by message filter.
    ( I did not delete messages in Inbox and compacted it if it matters in step above.)
    I think 68.12.1 worked OK. Inbox does not leave the room for the moved messages.

  3. Now I update ESR TB to 78.14.0
    Does it work as expected.
    First I emptied Inbox to make the check easy.
    Yes, it seems to work. Inbox does not have any extra room and compact does not run.

  4. I update to 91.13.1 as suggested.
    empty and compact Inbox.
    Send the messages and see the file size.
    It seems to work. Inbox does not have extra space of moved messages.

  5. Now I update to 102.15.1
    Same procedure.
    It seems to work. Inbox does not have extra space of moved messages. So compact does not run.

  6. Updated 115.9.0 (This is the latest update) on release channel.
    Same procedure.
    It seems to work. Inbox does not have extra space of moved messages. So compact does not run.

  7. I downloaded a daily 64 bit for linux.
    https://www.thunderbird.net/ja/thunderbird/releases/
    Unfortunately, this version crashed when I try to empty Trash. Hmm...
    So I do not have a great confidence with this test.
    But this seems to fail.
    Inbox is not compacted. I mean, there are unused places in the Inbox folder after the download.
    Unused because the message previously there was moved to other subfolders.
    Inbox is 138486 bytes
    folder D is 34628 bytes.

My scenario now sends 72 messages and 18 messages are moved to folder D, 18 messages to folder-E, 18 messages to folder F. Messages 1-18 are created with different sizes, but the same numbered message for different folders have the same size.
So in the end, the folders, say Inbox and folder-D should have the similar size.
But Inbox is about four times as large as folder-D. This means the space left over by moved messages are in it.

Since as I mention, I cannot compact Inbox in this state of affair, I have to delete a message, the first one, and then compact. Now compact begins and since there are unused messsage spaces, the size shrunk. (Of course, the 1st message was also removed during this operation. How do I list the size of each message in this supernova GUI?
It was easy to add the size column in the old GUI, but not anymore?
OK, I have to choose the table view from the hamburger menu in the upper right corner.
(I have other misgivings like not seeing a menu item for deleting a message or selected messages in context menu until I realize there is a very smallish dust bin icon. The menu also should have "delete the chosen message(s) since
using the wide menu is much easier to click the smallish dust bin icon. But I digress.)

The first message is only about 3.1 KB. So it won't affect the result match.
Anyway, I dleted the first message in Inbox and compact.

Now Inbox becomes 31424 bytes, and folder-D is 34626. Considering the 3.1 message that was removed from C.
the Inbox size is as it should be.

Ben, I have found out the old code had a code section in DiscardNewMessage() that
seems to tweak the size as intended.

However, you have removed it in Bug 1857450.
Was there a reason for the removal, or was it just a single oversight?

I think what I found below is the cause of the problem.

---cf. In the old days, something like this was done.
See Bug 715924
pop3 move mail filters not truncating inbox correctly

That bugzilla entry is closed.
HOWEVER, it contains in the patch
a few lines of code that seems to adjust the folder size in DiscardNewMessage()
HOWEVER, my reading of the current code seems to miss this code block.
Hmm, have I removed it accidentally? I had an uneasy feeling. But, eventually I found it was not me. :-)

--- a/mailnews/local/src/nsMsgBrkMBoxStore.cpp
+++ b/mailnews/local/src/nsMsgBrkMBoxStore.cpp
@@ -764,27 +764,22 @@ nsMsgBrkMBoxStore::DiscardNewMessage(nsI
   NS_ENSURE_ARG_POINTER(aOutputStream);
   NS_ENSURE_ARG_POINTER(aNewHdr);
 #ifdef _DEBUG
   m_streamOutstandingFolder = nsnull;
 #endif
   PRUint64 hdrOffset;
   aNewHdr->GetMessageOffset(&hdrOffset);
   aOutputStream->Close();
-  if (hdrOffset != 0)
-  {
-    nsCOMPtr<nsILocalFile> mboxFile;
-    nsCOMPtr<nsIMsgFolder> folder;
-    nsresult rv = aNewHdr->GetFolder(getter_AddRefs(folder));
-    NS_ENSURE_SUCCESS(rv, rv);
-    folder->GetFilePath(getter_AddRefs(mboxFile));
-    rv = mboxFile->SetFileSize(hdrOffset);
-    NS_ENSURE_SUCCESS(rv, rv);
-  }
-  return NS_OK;
+  nsCOMPtr<nsILocalFile> mboxFile;
+  nsCOMPtr<nsIMsgFolder> folder;
+  nsresult rv = aNewHdr->GetFolder(getter_AddRefs(folder));
+  NS_ENSURE_SUCCESS(rv, rv);
+  folder->GetFilePath(getter_AddRefs(mboxFile));
+  return mboxFile->SetFileSize(hdrOffset);
 }

Other two diff blocks are still in MoveIncorporatedMessage(), CheckPartialMessage.I think.

(In reply to ISHIKAWA, Chiaki from comment #1)

Strange. I see the invocation of DiscardNewMessage() in otherwise perfect download.
I wonder if there is some optimization to remove the last message if that was moved to other folder from Inbox by calling DiscardNewMessage() (Is there some magic expected of DidscardNewMessage() to truncate the Inbox folder?)

It seems that it used to be done indeed for POP3 code.
But unfortunately, because the code to adjust the size by truncating Inbox has been removed it does not work anymore

Let me see if I can find out which patch removed the block...
Ok, it was removed in Bug 1857450
https://phabricator.services.mozilla.com/D200748
It was Ben's patch.

EDIT: Since daily shows the problem, I did not check the local version with my buffered patches in it.

Flags: needinfo?(benc)

That was a long analysis and I really just skimmed it, so forgive me if I picked up on the wrong parts of it!

(In reply to ISHIKAWA, Chiaki from comment #8)

Ben, I have found out the old code had a code section in DiscardNewMessage() that
seems to tweak the size as intended.

However, you have removed it in Bug 1857450.
Was there a reason for the removal, or was it just a single oversight?

It was intentional. That patch changes the output stream to be a nsISafeOutputStream, which means that you have to call it's .finish() method in order to commit the write. If the stream object is closed or deleted without .finish(), then it is rolled back.
So nsIMsgPluggableStore.FinishNewMessage calls .finish() upon the stream, while nsIMsgPluggableStore.DiscardNewMessage() just calls .close().

The truncation is done in the .close() implementation for the mbox-aware outputstream, using a seek followed by nsIFile.SetEOF():

https://searchfox.org/comm-central/source/mailnews/base/src/MboxMsgOutputStream.cpp#283

(In reply to ISHIKAWA, Chiaki from comment #7)

For Pop3 case, maybe the heart of the operation is in
https://searchfox.org/comm-central/source/mailnews/local/src/nsParseMailbox.cpp#1727
If so, either MoveNewlyDownloadedMessage() or MoveIncorporatedMessage() fails
to mark the message in Inbox as being expunged when the move succeeds, OR
we are expected to mark it when the move succeeded at (*) in the code

nsIMsgPluggableStore.MoveNewlyDownloadedMessage() is a no-op on mbox, my guess would be that MoveIncorporatedMessage() isn't quite right...

Flags: needinfo?(benc)

OK, I will investigate more.
In the meantime, I included the truncate code from the old patch in a somewhat modified manner in my local patch and
saw the Inbox shrunk as intended locally.
It does not seem to cause problems as far as message receiving side on the treeherder.

HOWEVER, your explanation suggests bad things may happen.

Yes I noticed MoveNewlyDownloadedMessage() was a noop and was a bit perplexed about it.
I will focus on MoveIncorporatedMessage() then.

It is really a maze of complex operations interacting in a subtle way.

Thank you for the comment.

I THINK if I can fix this, my test involving I/O error simulation can be pretty much automated (but requires a real e-mail server to receive a few hundred e-mails with mixed sizes for testing purposes. So treeherder is not target. The test is done locally.) since manual intervention to delete a message and then compact will no longer be necessary.

OK, I get the gist of flow.

It was intentional. That patch changes the output stream to be a nsISafeOutputStream, which means that you have to call it's .finish() method in order to commit the write. If the stream object is closed or deleted without .finish(), then it is rolled back.
So nsIMsgPluggableStore.FinishNewMessage calls .finish() upon the stream, while nsIMsgPluggableStore.DiscardNewMessage() just calls .close().

The truncation is done in the .close() implementation for the mbox-aware outputstream, using a seek followed by nsIFile.SetEOF():

https://searchfox.org/comm-central/source/mailnews/base/src/MboxMsgOutputStream.cpp#283

Before proceeding further, I have to ask a questin.
Ben, are you sure this
moving the file pointer to somewhere by seek and then close the file truncates the file?
I vaguely recall that this was how truncation done on MS-DOS and later MS OSes including windows,.

But I doubt if it works on Linux, or for that matter POSIX. I think you need to call
truncate() function explicitly. I assumed SetFileSize() call in the old patch calls truncate in lower layer.
Correct me if I am wrong.

I have to leave for a chore until the evening. Once I return, I will check again if truncation by seeking and closing works under linux and posix (MacOS is derived from posix-based FreeBSD IIRC).

Thank you again.

Flags: needinfo?(benc)

Searching SetFileSize,
https://searchfox.org/comm-central/search?q=setfilesize&path=&case=false&regexp=false
I found two implementations for linux and Windows.

	mozilla/xpcom/io/nsLocalFileUnix.cpp
1443 nsLocalFile::SetFileSize(int64_t aFileSize) {
	mozilla/xpcom/io/nsLocalFileWin.cpp
2712	nsLocalFile::SetFileSize(int64_t aFileSize) {

Windows implementation in nsLocalFileWin.cpp does exactly what you
described (that has been the practice since the first MSDOS days in the 1980s.
)https://searchfox.org/comm-central/source/mozilla/xpcom/io/nsLocalFileWin.cpp#2712

but linux version nsLocalFileUnix.cpp calles truncate or its friends based on the platform.
https://searchfox.org/comm-central/source/mozilla/xpcom/io/nsLocalFileUnix.cpp#1443

So the code in https://searchfox.org/comm-central/source/mailnews/base/src/MboxMsgOutputStream.cpp#283
needs a linux implementation (or call the SetFileSize() function maybe).

Now I am totally confused.
nsFileStreamBase::SetEOF() at the following line
https://searchfox.org/comm-central/source/mozilla/netwerk/base/nsFileStreams.cpp#89
calls ftruncate under linux and calls a win system call under windows.

So I suspect that

  • either
    Case A: somewhere in the class hierarchy for files, an overriding SetEOF() implementation does not handle linux properly?
    This is based on the observation that similar SetEOF() usages are abundant and
    all of them cannot be wrong.
    So a particular usage here of SetEOF() may not be working due to incorrect SetEOF() implemenation, OR
    Case B: There is a subtle interaction of safe output stream with what we try to achieve and the truncation as we intend does not happen?

--- My misguided investigation in early hours is like this. ---
As I look around, the code in the following place requires handling under linux, too IFF
setEOF for seekable stream fails to handle linux file appropriately.

https://searchfox.org/comm-central/source/mailnews/base/src/MboxMsgOutputStream.cpp#283

I am surprised. Looking at the use of SetEOF(), I think the following code also requires
modification (it does not work under linux).(??? probably false alarm in hindsight)
https://searchfox.org/comm-central/source/mozilla/dom/fs/child/FileSystemThreadSafeStreamOwner.cpp#21

The use of setEOF() looks suspicious at https://searchfox.org/comm-central/source/mozilla/modules/libjar/zipwriter/nsZipWriter.cpp#599
...
... ad infinitum
...
This is when I started to think a file class somewhere fails to define setEOF() properly that overrides the nsFileStreamBase::SetEOF() and causing a problem for us here.

I better check if a seekable stream's setEOF() calls truncate?

I think I found the root cause.

Each function is implemented correctly.
However, there are unforeseen sequence of function calls that are not handled by the state transition in the current implementation.

Ben, to my surpise, the file size change code in .Close() called by DiscardNewMessage() IS NEVER EXECUTED when a move by filter occurs.
(I don't know what happens with a filter rule to delete the downloaded message, but I suspect it won't work also because of the same reason the truncation does not occur propery with C-C TB when a newly downloaded message is moved to a subfolder by a filter.)

Here is what I found.

Please compare the log for a simple download case (no filter), and the log when a message is moved to a subfolder by a filter.

When move by filter is NOT involved.

Summary:
Very straightforward.
FinishNewMessage() is called.
It invokes .Finish() in turn, and all is well.

    Incorporate message begin:
    uidl string: 0001dbd0533eeb2c
    NS_NewBufferedOutputStream: outputStream (= std::move(aOutputputStream)) =0x7fea10069a48
    (debug) Creating buffered output stream to mboxFile=<<mailbox://ishikawa@localhost/Inbox>> in nsMsgBrkMBoxStore::GetNewMsgOutputStream in nsMsgBrkMBoxStore.cpp;
    [Parent 930186, Main Thread] WARNING: (debug) We are enabling buffering for m_outFileStream in nsPop3Sink::IncorporateBegin in nsPop3Sink.cpp.: file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/local/src/nsPop3Sink.cpp:484
    (debug) 497: m_outFileStream->Flush() returned 0x0
    {debug} creating seekable stream failed: /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/local/src/nsPop3Sink.cpp:514, rv = 80004002
    FinishNewMessage(): before safe->Finish() is called
    MboxMsgOutputStream::Finish()  called. this=0x7fe9e0007d90, mState=1
    {debug} MboxMsgOutputStream::Finish(): Before calling mInner->Close()
    {debug} Entering msMsgFilterList::ApplyFiltersToHdr...
    [Parent 930186, Main Thread] WARNING: (debug): ApplyForwardAndReplyFilter getting called.: file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/local/src/nsPop3Sink.cpp:967
    Incorporate message complete.

When move by filter is involved.

A message is moved from Inbox to folder-D

Short Summary:

Note in the current code, it seems that we call FinishNewMessage() on Inbox first (and calls .Finish()), AND THEN
MoveIncorporatedMessage() is called.

After that DiscardNewMessage() is called to reclaim the space and revert the meta information about the message. You will probably immediately see the .Close() call from DiscardNewMessage() is a NOOP since the invocation of .Finish() by FinishNewMessage() essentially closes the file (mState == eClosed).

Long description of the dump log below:

Note that we have a call on FinishNewMessage on the successful download marked as [0] in the following dump log. This is for Inbox.

FinishNewMessage() calls .Finish() and changed the internal state mState to eClosed.

But we will later call DiscardNewMessage() marked [2] in the hope of shrinking Inbox.

However because mState == eClosed, the call of .Close() in DiscardNewMessage() returns early without doing the bulk of its body including the file size change (truncation)

I think we need a slightly more complex state change mechanismconsidering this state transition involving move by filter (and delete also).

    Incorporate message begin:
    uidl string: 0001dbd9533eeb2c
    NS_NewBufferedOutputStream: outputStream (= std::move(aOutputputStream)) =0x7fea10010298
    (debug) Creating buffered output stream to mboxFile=<<mailbox://ishikawa@localhost/Inbox>> in nsMsgBrkMBoxStore::GetNewMsgOutputStream in nsMsgBrkMBoxStore.cpp;

            ... we are writing to Inbox using a buffered stream ...

    [Parent 930186, Main Thread] WARNING: (debug) We are enabling buffering for m_outFileStream in nsPop3Sink::IncorporateBegin in nsPop3Sink.cpp.: file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/local/src/nsPop3Sink.cpp:484
    (debug) 497: m_outFileStream->Flush() returned 0x0

    {morkdebug} morkStdioFile::Length:file->_fileno=88, this=0x7fe9b40275a0, ev=0x55580deab660
    {morkdebug} this->mFile_Name=/home/ishikawa/REMOTE-NFS/Maildir/Inbox.msf
    ... more mork dumps ...
                ... so we are handling Inbox here ...

[0] FinishNewMessage(): before safe->Finish() is called
                    <- the above is for Inbox, I think 

[1] MboxMsgOutputStream::Finish()  called. this=0x7fe9e00087d0, mState=1
       ... Finish() sets mState to eClosed (=5) on successful return.

    {debug} MboxMsgOutputStream::Finish(): Before calling mInner->Close()
    {debug} Entering msMsgFilterList::ApplyFiltersToHdr...

            ... here the match to a filter rule to move the message to
            folder-D is found ...

*   nsParseNewMailState::MoveIncorporatedMessage is called

      ... now we are moving the downloaded message to folder-D folder.

    GetDiskSpaceAvailable returned: 9871892283392 bytes
    (debug) nsMsgBrkMBoxStore::GetMsgInputStream: NS_NewLocalFileInputStream() returned mboxFile.get()=mailbox://ishikawa@localhost/Inbox
               ... we are reading from Inbox ...

    {deubg} LINE=1078, seekable->Seek(PR_SEEK_SET, offset) =  0x0
    NS_NewBufferedOutputStream: outputStream (= std::move(aOutputputStream)) =0x55580dc5b3d8
    (debug) Creating buffered output stream to mboxFile=<<mailbox://ishikawa@localhost/folder-D>> in nsMsgBrkMBoxStore::GetNewMsgOutputStream in nsMsgBrkMBoxStore.cpp;

            ... we are writing a message to folder-D ...

    FinishNewMessage(): before safe->Finish() is called
                        ... this is on filter-D probably?

    MboxMsgOutputStream::Finish()  called. this=0x55580971dbb0, mState=1
    {debug} MboxMsgOutputStream::Finish(): Before calling mInner->Close()
    MboxMsgOutputStream::Close() called, this=0x55580971dbb0, mState=5
    MboxMsgOutputStream::Close() returning early. mState = 5 == eClosed

    ... Now we are calling DiscardNewMessage, which *SHOULD* be on the
    Inbox. ... 

[2] {debug} DiscardNewMessage is called.
    {debug} DiscardNewMessage(): Before calling aOutputStream->Close()
    MboxMsgOutputStream::Close() called, this=0x7fe9e00087d0, mState=5

           ... You can find the value of this=0x7fe9e00087d0 on the
           ... above line on earlier Finish() call on line marked with [1].
           ... So this is for Inbox.

    MboxMsgOutputStream::Close() returning early. mState = 5 == eClosed
             ... SEE WE ARE RETURNING EARLY 
             ... because we have called Finish() already on the line
             ... marked [1]. That has set mState to 5 (eClosed).
                 So we never execute the file size change there in .Close().(!!!)

    DiscardNewMessage: hdrOffset=15557 <--- this would have been the
                       truncated file position (I picked up the number
                       from the old code to obtain it from aNewHdr).
                       I wanted to compare it to the seek position in
                       .Close(), but that statement never got executed to my
                       surprise.
                       The reason it was not executed is explained above.

                       ... I am not sure where the next Close is called. ...
                       ... But the value of this=0x7fe9e00087d0
                       ... suggests this is for Inbox.

    MboxMsgOutputStream::Close() called, this=0x7fe9e00087d0, mState=5
    MboxMsgOutputStream::Close() returning early. mState = 5 == eClosed

    {morkdebug} morkStdioFile::Length:file->_fileno=90, this=0x7fe9b40be250, ev=0x55580ed132f0
    {morkdebug} this->mFile_Name=/home/ishikawa/REMOTE-NFS/Maildir/folder-D.msf
    ... more mork debug output ...

    [Parent 930186, Main Thread] WARNING: (debug): ApplyForwardAndReplyFilter getting called.: file /NEW-SSD/NREF-COMM-CENTRAL/mozilla/comm/mailnews/local/src/nsPop3Sink.cpp:967
    Incorporate message complete.

I think the problem has been identified.
We need to figure out how to solve this issue.

The attached log shows the VERY BEGINNING of the long sequences of mail download.
The folders were emptied and compacted before I started the download.
Note that the Semaphore for Inbox (puresumably) was acquired before Begin mail message delivery line. Please recall this is POP3 download. (Later, the code tried to obtain this lock for each message download and fails, and I wondered what was going on. But now I realize the lock is acquired at the very top level.)

FinishNewMessage() is called to store the downloaded message permanently.

At the end there is a listing of mail directory where the folders are located.

This log is for the case when the downloaded message is moved into another folder (folder-D).

Note that there is a call to DiscardNewMessage() on Inbox after the message is moved to folder-D.
But when the message was downloaded FinishNewMessage() was already called, and the internal state flags are such that the truncation code is never executed.
***CAUTION: In my local patch, I have resurrected old truncation code, and it truncates Inbox. This is for ease of checking the content of Inbox by an external program without deciphering .msf file. I wanted to count the messages in the folders and compare the content and number of messages against what I have sent. ****

So the problem boils down to this.

Inbox .................................................. Folder-D

download happens.
FinishNewMessage() called.

Filter check and hit.
MoveIncorporatedMessage() called.

                            --------------------------> message is copied.
                                                       FinishNewMessage() called on Folder-D.

DiscardNewMessage() on Inbox.
Unfortunately, due to the fact FinishNewMessage() has been called
already, the truncation code in Ben's new code
never gets executed in DiscardNewMessage() because
it returns early.

Haven't forgotten about this - it's just a messy one and I've not had the mental bandwidth to dig into it yet.
The core issue is that message parsing in general and nsParseNewMailState in particular is a tangled mess and really needs refactoring (Bug 1876407).
But I'll try and have a sift through the pop3 incoming filtering code sometime over the next few days and see if I can fix up some of the worse atrocities...

See Also: → 1876407

(In reply to Ben Campbell from comment #18)

Haven't forgotten about this - it's just a messy one and I've not had the mental bandwidth to dig into it yet.
The core issue is that message parsing in general and nsParseNewMailState in particular is a tangled mess and really needs refactoring (Bug 1876407).
But I'll try and have a sift through the pop3 incoming filtering code sometime over the next few days and see if I can fix up some of the worse atrocities...

Thank you, Ben.

The crux of the matter is that I have not expected the following calling sequence of FinishNewMessage() and DiscardNewMessage() on Inbox to occur during the processing of the SAME message to occur when a move by filter takes place.

FinishNewMessage() ... when the download finishes.
... copy to other folder ...
DiscardNewMessage() ... since the message is moved, TB tries to remove the message from Inbox.

I always thought that a download of a message via POP3 ends with a single call of
either FinishNewMessage() or DiscardNewMessage().
I was wrong.

Version: unspecified → Thunderbird 126

We have resurrected the old code. This seems to work with my patch for Bug 1242030.

Or we can probably modify Close() to take the second argument, an
enum, to indicate which function is calling this function. If Close()
is called by DiscardNewMessage from the code to remove a message (from
Inbox), then Close() should execute the truncation code
even if mState = 5 == eClosed (which had been set by FinishNewMessage() for the same
message earlier).

Assignee: nobody → ishikawa
Status: NEW → ASSIGNED
Attachment #9407130 - Attachment description: try: -b do -p all -u all -t none Bug 1890253, Truncate Inbox after the downloaded message is moved by a filter to other folders → Truncate Inbox after the downloaded message is moved by a filter to other folders, r=BenC

The patch I posted essentailly resurrected the old code to truncate the folder.
It works as far as I can tell.

However, as I mentioned in my patch, the code can be elaborated as in comment 20.

I don't think this addresses the real problem, which is that the move-new-message-because-of-filter code expects to be able to call DiscardNewMessage() after FinishNewMessage() has already "committed" it.

The problem is nsParseNewMailState::MoveIncorporatedMessage(), which is a custom message-move routine.
But probably only for local folders - I just bet it'd screw up in all kinds of interesting ways if your filter target was an IMAP folder :-)

I've only had a brief look at it, but based on that, here's what my approach would be:

  1. Kill nsParseNewMailState::MoveIncorporatedMessage() (and nsParseNewMailState::AppendMsgFromStream())
  2. Kill nsIMsgPlugableStore.moveNewlyDownloadedMessage() - it would no longer be required.
  3. Replace the two uses of MoveIncorporatedMessage() with a call to nsIMsgCopyService.CopyMessages() (with the isMove param set) to perform the actual move. This is what is done when the filter action is a copy rather than a move. See nsParseNewMailState::ApplyFilterHit() for an example.

This would mean that moved messages would remain in the mbox file until the next compaction. But really I don't think that's such a bad thing.
It would also separate the filtering code from downloading/parsing and massively simplifies things (just look at how much code could be removed!)

Flags: needinfo?(benc)

(In reply to Ben Campbell from comment #22)

I don't think this addresses the real problem, which is that the move-new-message-because-of-filter code expects to be able to call DiscardNewMessage() after FinishNewMessage() has already "committed" it.

Yes, I was surprised that
DiscardNewMessage() was called after FinishNewMessage() had already been called. A total surprise.

The problem is nsParseNewMailState::MoveIncorporatedMessage(), which is a custom message-move routine.
But probably only for local folders - I just bet it'd screw up in all kinds of interesting ways if your filter target was an IMAP folder :-)

I have not used IMAP. I wonder if there have been any real-world issues.

I've only had a brief look at it, but based on that, here's what my approach would be:

  1. Kill nsParseNewMailState::MoveIncorporatedMessage() (and nsParseNewMailState::AppendMsgFromStream())
  2. Kill nsIMsgPlugableStore.moveNewlyDownloadedMessage() - it would no longer be required.
  3. Replace the two uses of MoveIncorporatedMessage() with a call to nsIMsgCopyService.CopyMessages() (with the isMove param set) to perform the actual move. This is what is done when the filter action is a copy rather than a move. See nsParseNewMailState::ApplyFilterHit() for an example.

Hmm... I will take a look at these pieces of codes.
But at least we have to kill the message as being deleted in the Inbox folder. That was done in the DiscardNewMessage().

This would mean that moved messages would remain in the mbox file until the next compaction. But really I don't think that's such a bad thing.

Actually, it is a bad thing for a user with MANY mailing list subscriptions.
I am afraid if such message remains Inbox, I have to run compaction more frequently.
As of now, I think I only compact Inbox one or twice a year.
(I forgot. I move about half a year's worth of e-mails to a newly created folder to hold a particular half a year period.
So in that sense, I do compact at least twice a year.)

It would also separate the filtering code from downloading/parsing and massively simplifies things (just look at how much code could be removed!)

Sounds nice.
I will take a look at this while I am cleaning up the buffered write patch.

Thank you for your comment.

For, now, though, the patch works.

I need this compaction now so that an external checker can count the received messages inside Inbox before and after a simulated lightweight I/O error to figure out if there is a serious discrepancy of received messages when an I/O error occurs.
The counting can be done without parsing the MSF db file. My test framework counts the messages by parsing mbox file directly.
Before the few patches accepted this year, somehow TB had deteriorated very much with regard to the handling of message delivery in the face of a lightweight I/O error: crash, or loss of e-mails.
Now, though, TB does not crash and does not lose e-mails. It re-downloads the messages.
cf. Lightweight I/O error: In my error simulator for testing, a simulated network outage causes a transient network file system error, which will succeed if it is retried later (after the simulated outage is restored).
TB no longer crashes after a mork patch this year.

(In reply to ISHIKAWA, Chiaki from comment #23)

But at least we have to kill the message as being deleted in the Inbox folder. That was done in the DiscardNewMessage().

This would mean that moved messages would remain in the mbox file until the next compaction. But really I don't think that's such a bad thing.

Actually, it is a bad thing for a user with MANY mailing list subscriptions.
I am afraid if such message remains Inbox, I have to run compaction more frequently.

I don't think it'd be too bad. I've got a lot more confidence in the compaction code after the recent rewrite, and it can be optimised for speed - there's no reason compaction shouldn't run at the maximum read/write speed of the underlying filesystem.

Anyway, I had a better idea.
It's only an mbox issue (maildir isn't affected). There's a deleteMessages() method on the nsIPluggableStore API.
The thing to do is to have a special case in the mbox store deleteMessages(): if you're deleting the last message in an mbox, it just resizes the file. Nobody outside the mbox code ever needs to know about it.

The current nsParseNewMailState::MoveIncorporatedMessage() implements it's own local-message-move. But really it should just be calling the nsMsgCopyService to move messages, and that does eventually call nsIPluggableStore.deleteMessages() to remove the local copy.

If I get some time later this week I'll try and have a go at this and everything I said up in Comment #22.

(In reply to Ben Campbell from comment #24)

(In reply to ISHIKAWA, Chiaki from comment #23)

But at least we have to kill the message as being deleted in the Inbox folder. That was done in the DiscardNewMessage().

This would mean that moved messages would remain in the mbox file until the next compaction. But really I don't think that's such a bad thing.

Actually, it is a bad thing for a user with MANY mailing list subscriptions.
I am afraid if such message remains Inbox, I have to run compaction more frequently.

I don't think it'd be too bad. I've got a lot more confidence in the compaction code after the recent rewrite, and it can be optimised for speed - there's no reason compaction shouldn't run at the maximum read/write speed of the underlying filesystem.

Anyway, I had a better idea.
It's only an mbox issue (maildir isn't affected). There's a deleteMessages() method on the nsIPluggableStore API.
The thing to do is to have a special case in the mbox store deleteMessages(): if you're deleting the last message in an mbox, it just resizes the file. Nobody outside the mbox code ever needs to know about it.

I was thinking how to go about the change of meta data of the message that is being removed.
deleteMessages() may be the way to go.

The current nsParseNewMailState::MoveIncorporatedMessage() implements it's own local-message-move. But really it should just be calling the nsMsgCopyService to move messages, and that does eventually call nsIPluggableStore.deleteMessages() to remove the local copy.

If I get some time later this week I'll try and have a go at this and everything I said up in Comment #22.

That will be great.
I will focus on my buffered write patch.
I could finally enable the use of environment variable on try-comm-central so that I can see output from MOZ_LOG().

Attachment #9407130 - Attachment description: Truncate Inbox after the downloaded message is moved by a filter to other folders, r=BenC → Bug 1890253 - Truncate Inbox after the downloaded message is moved by a filter to other folders. r=BenC
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: