Closed Bug 1756520 Opened 2 years ago Closed 2 years ago

Simplify nsIMsgFolderCompactor implementations.

Categories

(MailNews Core :: Backend, task)

Tracking

(thunderbird_esr91 wontfix)

RESOLVED FIXED
99 Branch
Tracking Status
thunderbird_esr91 --- wontfix

People

(Reporter: benc, Assigned: benc)

References

Details

Attachments

(9 files)

There are two implementations of nsIMsgFolderCompactor:
nsFolderCompactState (for local folders) and the derived class nsOfflineStoreCompactState (for IMAP/offline folders).

The nsIMsgFolderCompactor interface defines compact() and compactFolders(), both of which have a shared implementation in both the classes. Each class then diverges to handle compacting a single folder in different ways.

  • nsFolderCompactState calls nsIMsgMessageService.CopyMessages() to iterate through all the messages
  • nsOfflineStoreCompactState steps through non-deleted messages manually, calling nsIMsgMessageService.StreamMessage() on each message in turn.

This mingles together iteration over multiple folders with iteration over messages within each folder.
It gets really complicated really quickly. Loads of ill-defined listener callbacks. The only way I've been able to make any sense of it so far is to add printf()s on every function, then compare compaction runs on both local and IMAP folder to see which bits of code are shared and which are not. Ugh.

I think the way forward is to separate the nsIMsgFolderCompactor front end away from the code which actually compacts a single folder.
So there'd be a nsMsgFolderCompactor which orchestrates compacting any number/type of folders, delegating compaction of individual folders to the right backend (one class for compacting a local folder, one class for an IMAP/offline folder).

background: As part of Bug 1744461, I'm trying to rip out the ad-hoc header parsing in nsFolderCompactState and it's reliance on nsIMsgDBHdr.statusOffset. But it's all rather intertwined with nsOfflineStoreCompactState at the moment and hard to modify with any confidence.

Keywords: leave-open

In nsIMsgFolder.compactAll(), the aCompactOfflineAlso param was always true
for IMAP folders, and always false for local folders. The calling code
shouldn't have to deal with that check.

Depends on D139327

Just a couple of minor cleanups to begin with. Try build running here:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=713132a496069d8ffa0ed17ea353b12a21416e1d

Status: NEW → ASSIGNED
Target Milestone: --- → 99 Branch

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/aae95b69a058
Remove nsMsgDBFolder::CompactOfflineStore(). r=mkmelin
https://hg.mozilla.org/comm-central/rev/e92784413741
Remove redundant arg in nsIMsgFolder.compactAll(). r=mkmelin

I've been separating out the code for queuing multiple compaction operations, leaving the two related compaction classes (one for local folders, the other for imap offline folders), so they only have to deal with a single folder at a time.
(I plan to try to unify the two compaction classes at some point - I don't think they are both needed)

It's been getting a little mindbending, so I figured it'd be good to step back a little and ask a few questions:

  1. What does it mean (from a UI point of view) to "Compact" a folder? I'd say that it means:
    a) if it's an IMAP folder, kick off a server-side "expunge" AND...
    b) if there is a local mbox holding messages for the folder, rebuild that mbox to remove deleted messages (as denoted by the "deleted" flag in the msg db).
    Does this sound about right?
  2. For local folders, the compaction code currently checks for out-of-date/missing msgdb files and calls folder->ParseFolder() to rebuild it before starting the compaction. I can see that this is potentially useful... but it's an extra layer of complexity. Should it do this?
  3. The compactor uses folder->SetMsgDatabase(nullptr); to attempt to close the DB file (for Bug 249754 - the aim is to avoid running out of filehandles if you compact a huge number of folders). Is this the right way to do it? Are there any modern guidelines regarding msgDBs and filehandles?

I don't need specific answers, but I just figured someone might have a better feel for the intentions of the compacting code than I, so any thoughts welcome!

Flags: needinfo?(mkmelin+mozilla)
Flags: needinfo?(geoff)

I decided that mail header blocks are generally small enough that it's
reasonable to have them fully loaded into memory before parsing.
This simplifies the code, lets us skip some data copying, and avoids the
ugly and error-prone requirement of a Flush() call to mop up leftover data.
It also makes it easier to continue processing the message body.
(This is preparation for using it in the folder compaction code, which
currently has it's own ad-hoc header-parsing for dealing with X-Mozilla-*
headers).

(In reply to Ben Campbell from comment #5)

I've been separating out the code for queuing multiple compaction operations, leaving the two related compaction classes (one for local folders, the other for imap offline folders), so they only have to deal with a single folder at a time.
(I plan to try to unify the two compaction classes at some point - I don't think they are both needed)

It's been getting a little mindbending, so I figured it'd be good to step back a little and ask a few questions:

  1. What does it mean (from a UI point of view) to "Compact" a folder? I'd say that it means:
    a) if it's an IMAP folder, kick off a server-side "expunge" AND...
    b) if there is a local mbox holding messages for the folder, rebuild that mbox to remove deleted messages (as denoted by the "deleted" flag in the msg db).
    Does this sound about right?

rings right to me.

  1. For local folders, the compaction code currently checks for out-of-date/missing msgdb files and calls folder->ParseFolder() to rebuild it before starting the compaction. I can see that this is potentially useful... but it's an extra layer of complexity. Should it do this?

Not familiar with this code.

  1. The compactor uses folder->SetMsgDatabase(nullptr); to attempt to close the DB file (for Bug 249754 - the aim is to avoid running out of filehandles if you compact a huge number of folders). Is this the right way to do it? Are there any modern guidelines regarding msgDBs and filehandles?

I'm not aware that methods have changed in this area. However, I'm not sure this question has been raised in the past, i.e. is there better method.

(In reply to Ben Campbell from comment #5)

Does this sound about right?

I think so, but I never looked at that code in too much detail.

  1. For local folders, the compaction code currently checks for out-of-date/missing msgdb files and calls folder->ParseFolder() to rebuild it before starting the compaction. I can see that this is potentially useful... but it's an extra layer of complexity. Should it do this?

It's hard to say, but I'd assume it was added for a reason at some point.

  1. The compactor uses folder->SetMsgDatabase(nullptr); to attempt to close the DB file (for Bug 249754 - the aim is to avoid running out of filehandles if you compact a huge number of folders). Is this the right way to do it? Are there any modern guidelines regarding msgDBs and filehandles?

I don't know of a better way of doing it.

Flags: needinfo?(mkmelin+mozilla)

We discussed this elsewhere and I have nothing to add.

Flags: needinfo?(geoff)

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/b082b54edff5
Simplifed HeaderReader for parsing message headers in C++. r=mkmelin

Looking good.
I'm curious, do the tests force any unexpected failure conditions? (I confess I didn't slog through the code)

(In reply to Wayne Mery (:wsmwk) from comment #12)

Looking good.

That patch (comment #11) is really just to detangle some of the trickier code paths.
Over time, it looks like the multiple-folder handling had been wedged into the single-folder-compacting classes, along with a derived class to handle IMAP folders... and there were some eye-watering contortions required to keep it all working.

I've tried to leave the actual single-folder-compaction code as untouched as possible, just extracting out all the batch handling into a separate object. So should be waaay easier to work with in future. I've got a couple more things to tidy up, but I'll leave off doing any more major work on it for now - this part of the yak has had quite enough shaving for now!

I'm curious, do the tests force any unexpected failure conditions? (I confess I didn't slog through the code)

Yes, I was looking at adding some and was pleasantly surprised to find test_compactFailure.js was already there!
Not totally comprehensive, but it covers cases of files being locked or missing, and it'll do for now.

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/d721be96a806
Separate multifolder compaction management from folder compaction classes. r=mkmelin

nsMsgFolderCompactor implemented nsIUrlListener in order to hear when individual folders complete compacting.
This patch replaces that mechanism with a simple callback.
It makes the cause and effect much clearer - the completion code is defined close to where the compaction operation is kicked off.
It also hides away what is really an implementation detail - the outside world never had any interest in knowing that nsMsgFolderCompactor implemented nsIUrlListener!

Attachment #9268609 - Attachment description: Bug 1756520 - Remove redundant nsIUrlListener inheritance in nsMsgFolderCompactor with simple callback. r=mkmelin → Bug 1756520 - Remove redundant nsIUrlListener inheritance in nsMsgFolderCompactor. r=mkmelin

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/e1af0576f14e
Remove redundant nsIUrlListener inheritance in nsMsgFolderCompactor. r=mkmelin

Adding checkin-needed-tb so I remember to back out for daily

Sighs. Well, that's embarrassing. Looking into it now.

(Of course the breakage showed up in both of the try runs I did with the original patch... just that I was looking at another try run I'd kicked off for something completely unrelated! Doh!)

In some cases the lambda function was allowing the release of the object that held it, thus destroying itself and invalidating the captured variables. Easy fix - added a refptr to hold the owning object until the function completes.

And seems to do the trick:
https://treeherder.mozilla.org/jobs?repo=try-comm-central&revision=9638b876adc666e27f0b92d14f72ab89b2d29082

D141922 doesn't apply cleanly.

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/32ba6f666329
Remove redundant nsIUrlListener inheritance in nsMsgFolderCompactor. r=mkmelin

Sorry about that - it's logically independent of the other patch (D141567), but getting them out of order seems to cause massive mergetool confusion.
Now that D141567 has landed (comment 24) it should apply just fine. And I rebased it just to make sure.

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/6b107a42b4c4
Move nsMsgFolderCompactor implementation details from header file to cpp file. r=mkmelin

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/22a8a6232b4e
Comment changes only. Add some notes to the folder compaction classes. r=mkmelin

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/b93c1d190606
Remove redundant oneliner nsFolderCompactState::GetMessage(). r=mkmelin

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/74fc0e7895c1
Replace bespoke header-rewriting code in folder compaction. r=mkmelin

Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
See Also: → 1776727
Blocks: 1777766
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: