Closed Bug 1472524 Opened Last year Closed Last year

mbox to maildir conversion does not work if "Allow Windows Search to search messages" is enabled, chokes on mozmsgs folders

Categories

(MailNews Core :: General, enhancement)

x86
Windows 7
enhancement
Not set

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 63.0

People

(Reporter: jorgk, Assigned: benc)

References

(Blocks 1 open bug)

Details

Attachments

(1 file, 2 obsolete files)

+++ This bug was initially created as a clone of Bug #1470729 +++

In bug 1470729 we discovered that the conversion fails if "Allow Windows Search to search messages" is enabled. It chokes on the mozmsgs folders.

Removing those folders makes the conversion pass. I have sample data from bug 1470729 comment #18.

Magnus, which resource would you like to allocate to fix the mbox<->maildir experimental conversion. Looks like no one ever tried this case :-(
Flags: needinfo?(mkmelin+mozilla)
OK, I looked a little into the Windows search integration.

It all happens in https://dxr.mozilla.org/comm-central/source/mail/components/search/searchCommon.js. For example, there is a listener:
https://dxr.mozilla.org/comm-central/rev/51678c5b0b3edd827c57a70f1c5baf02d44acead/mail/components/search/searchCommon.js#594
The file extension is used here (and elsewhere):
https://dxr.mozilla.org/comm-central/rev/51678c5b0b3edd827c57a70f1c5baf02d44acead/mail/components/search/searchCommon.js#760

Other resources:
comm-central/mail/components/search/nsMailWinSearchHelper.cpp
comm-central/mail/components/search/wsenable - WSEnable executable
https://dxr.mozilla.org/comm-central/search?q=wdseml&redirect=false to see where the extension is used, for example inL
https://dxr.mozilla.org/comm-central/rev/51678c5b0b3edd827c57a70f1c5baf02d44acead/mail/components/search/WinSearchIntegration.js#79

And here is where we read mail.winsearch.enable:
https://dxr.mozilla.org/comm-central/rev/51678c5b0b3edd827c57a70f1c5baf02d44acead/mail/components/search/searchCommon.js#98

So that all appears to be working, we only need to teach the mbox<->maildir converter to deal with those files and folders.

And we need to be careful since there's also a Spotlight integration:
https://dxr.mozilla.org/comm-central/rev/51678c5b0b3edd827c57a70f1c5baf02d44acead/mail/components/search/SpotlightIntegration.js#15
Assignee: nobody → benc
Flags: needinfo?(mkmelin+mozilla)
Just to confirm: the presence of any directory in there other than the structural .sbd ones will screw up the mbox->maildir conversion.

I'm pretty sure the fix will be pretty simple (in https://dxr.mozilla.org/comm-central/source/mailnews/base/util/mailstoreConverter.jsm). Just slowly picking my way through it to make sure I understand it properly and don't introduce any unintended side effects.
Yes, if the conversion finds those mozmsgs folders, it somehow tries to interpret the content, single messages, as mbox files. That's on Windows. I can't speak to situation on Mac, all I know is that they have a Spotlight search.
OK, what I think is happening:
The conversion assumes that any non-.sbd directories are maildir directories (remember that the conversion can go either way).
It assumes this even if it's converting from mbox.
So, it's easy enough to fix by either:

  a) performing a further check for a "cur" subdirectory (and maybe "tmp" too), to confirm that it is, in fact, a maildir..
or
  b) if we're converting _from_ mbox, don't treat any source dirs as maildirs! 

(I tried a, and it worked fine, but b seems like the more 'correct' approach)

The next questions are:
- should the .mozmsgs dirs be copied over verbatim or just ignored? (I haven't yet looked at the windows search/spotlight code)
- are there any other directories or files which might legitimately be encountered there during a conversion?
Yeah checking for "cur" is a bit fragile. Maybe there's such a dir in mbox too.

For the .mozmsgs, I don't think they should be copied over, as with maildir you already have one message per file, which is what the .mozmsgs are. The search integration code may need some adjustment so it can use them directly. Or maybe the OS handles search without any integration when the mails are in their own files? (also see bug 1144478 and bug 1259040)
Personally I don't understand why those maildir files don't have a file extension. That way we could tell Windows to index them. With no file extension, they are anonymous files for which you can't even configure a program to open them although they could be opened in a text editor.
Attached patch skip_mozmsgs.patch (obsolete) — Splinter Review
Patch to skip ".mozmsgs" dirs during conversion.
Also fixes up a possible scope clash ('count' var) and adds in a couple of notes on clarifying the implementation a bit more.

This means the resulting maildir _won't_ support Windows search. Bug 1144478 would address that.

Maybe the best pre-1144478 solution is to just copy the .mozmsgs dir verbatim after all? This works in both directions.
I can rework the patch to do this.
Comment on attachment 9004428 [details] [diff] [review]
skip_mozmsgs.patch

Review of attachment 9004428 [details] [diff] [review]:
-----------------------------------------------------------------

Ready for feedback or review?

::: mailnews/base/util/converterWorker.js
@@ +269,5 @@
>          // mbox and the no. of msgs in the account is 0.
>          self.postMessage(["file", sourceFile, textNew.length]);
>        }
> +    } else {
> +      // UHOH - unhandled dir or file... shouldn't be here!

Not so useful unless you add some sort of output. dump(), throw new Error(), Cu.reportError().
Attachment #9004428 - Flags: feedback?(mkmelin+mozilla)
Comment on attachment 9004428 [details] [diff] [review]
skip_mozmsgs.patch

Review of attachment 9004428 [details] [diff] [review]:
-----------------------------------------------------------------

Please address the comments, and when done, request review by setting the review? flag on the attachment and enter my email in the box that appears

(About the commit message, do put a " - " after the bug number

::: mailnews/base/util/converterWorker.js
@@ +17,5 @@
>      var serverType = e.data[5];
>      var stat = OS.File.stat(sourceFile);
> +
> +    // TODO: have mailstoreConverter.jsm decide what operation we want
> +    // (copy file, create subfolder, mbox->maildir etc etc) and pass that

Well we pass in ailstoreContractId already.
Not sure how much of the logic we want to move into the caller. Note that you can't access xpcom in the worker

@@ +19,5 @@
> +
> +    // TODO: have mailstoreConverter.jsm decide what operation we want
> +    // (copy file, create subfolder, mbox->maildir etc etc) and pass that
> +    // in as a parameter. Keep all the filename voodoo in once place.
> +

nit: remove double empty line

@@ +22,5 @@
> +    // in as a parameter. Keep all the filename voodoo in once place.
> +
> +
> +    if (stat.isDir && sourceFile.substr(-8) == ".mozmsgs") {
> +      // it's an OS search integration dir.

nit: We try to capitalize and use final dot when commenting with whole sentences. (Applicable elsewhere in the patch too)

@@ +269,5 @@
>          // mbox and the no. of msgs in the account is 0.
>          self.postMessage(["file", sourceFile, textNew.length]);
>        }
> +    } else {
> +      // UHOH - unhandled dir or file... shouldn't be here!

You can just do throw new Error("Unhandled source file: " + sourceFile);

::: mailnews/base/util/mailstoreConverter.jsm
@@ +107,5 @@
>        var content = contents.getNext()
>                              .QueryInterface(Ci.nsIFile);
>        if (content.isDirectory()) {
> +        if (content.leafName.substr(-8) == ".mozmsgs") {
> +          // don't count OS integration dir

Having emtpy if statements is a bit unusual.
I think it's better to negate the statement and adjust code to that

@@ +135,5 @@
> +      if (content.leafName.substr(-4) == ".sbd") {
> +        // it's a subfolder. Recurse down...
> +        count = count + countMaildirMsgs(content);
> +      } else if (content.leafName.substr(-8) == ".mozmsgs") {
> +        // dir for OS search integration - don't count

I'd move this up the isDirectory check and do an early continue

@@ +167,3 @@
>          count = count + 1 + countMaildirMsgs(content);
> +      } else if (content.leafName.substr(-8) == ".mozmsgs") {
> +        // dir for OS search integration - don't count

same here
Attachment #9004428 - Flags: feedback?(mkmelin+mozilla) → feedback+
Attached patch skip_mozmsgs2.patch (obsolete) — Splinter Review
Attachment #9004428 - Attachment is obsolete: true
Attachment #9004758 - Flags: review?(mkmelin+mozilla)
OK, so my current approach is to just skip the .mozmsgs directories during the conversion, for now.

If the maildir messages gain their own .eml extension, then maildir won't need .mozmsgs - windows search can index the maildir itself.
However, we'd still need to make sure new .mozmsgs content was generated when converting from maildir to mbox. Either:
- the conversion process does it itself (ugh - paths and metadata file formats depend on OS and settings)
- the conversion process notifies the search integration that messages have been moved about
- the search integration automagically notices that .mozmsgs is missing, and regenerates it
Comment on attachment 9004758 [details] [diff] [review]
skip_mozmsgs2.patch

Review of attachment 9004758 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good to me, just a couple of nits. Once you made the adjustments you can tag "r=mkmelin" onto the commit message, upload a new patch, and put add the checkin-needed keyword. Somone will the land the patch for you

::: mailnews/base/util/converterWorker.js
@@ +23,5 @@
> +//  - xpcom interface of the source mailstore (maildir or mbox).
> +// Since mailstoreConverter.jsm is already scanning the store
> +// and making these decisions, it would probably make sense to have it
> +// specify the operation type explicitly rather than repeating the
> +// logic here so the worker can decide.

Great that you're adding documentation. 
Would you mind making it use 

/**
 *  This worker performs one of a set of operations requested by
 *  mailstoreConverter.jsm. ....
 */

 style, since that is generally documentation.

@@ +45,5 @@
> +      // it's an OS search integration dir.
> +      // A no-op for now. Maildir/OS search integration is still
> +      // a little undecided (see bug 1144478).
> +      return;
> +    } else if (stat.isDir && sourceFile.substr(-4) == ".sbd") {

our coding style guidelines say no else after return. bail as early as you can
Attachment #9004758 - Flags: review?(mkmelin+mozilla) → review+
(In reply to Ben Campbell from comment #12)
> OK, so my current approach is to just skip the .mozmsgs directories during
> the conversion, for now.
> 
> If the maildir messages gain their own .eml extension, then maildir won't
> need .mozmsgs - windows search can index the maildir itself.

We might have to tell it to index them still. (I'm not sure).
I'm also not sure to what extent it is used/working anymore. Bug 574525 comment 7.

> - the search integration automagically notices that .mozmsgs is missing, and
> regenerates it

This seems to be the case: https://searchfox.org/comm-central/rev/936022fa3ff720477a181fbf6f162048c617a6b9/mail/components/search/searchCommon.js#299
Status: NEW → ASSIGNED
Sorry about the big extra whitespace change. When fixing up the return/else issue I figured the early-out made more sense if I made the boundaries of code for the different worker operations more distinct.

I'll leave off checkin-needed until you've had another peek at it.
Attachment #9004758 - Attachment is obsolete: true
Attachment #9004790 - Flags: review?(mkmelin+mozilla)
Comment on attachment 9004790 [details] [diff] [review]
skip_mozmsgs3.patch

Review of attachment 9004790 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, thx! r=mkmelin
Attachment #9004790 - Flags: review?(mkmelin+mozilla) → review+
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/da6807db5974
fix mbox to maildir conversion when mozmsgs folders are present (Windows search support). r=mkmelin
Status: ASSIGNED → RESOLVED
Closed: Last year
Resolution: --- → FIXED
First patch landed, thank you ;-) - In the future, please provide a succinct commit message. Copying the bug summary is usually not the way to go.
Target Milestone: --- → Thunderbird 63.0
Another remark. Bug 856087 created three tests to cover the conversion. I ran all three of them and they still pass. (Should have done that before landing the bug.)

FYI, you run them like this:
mach xpcshell-test comm/mailnews/imap/test/unit/test_converterImap.js
mach xpcshell-test comm/mailnews/base/test/unit/test_mailstoreConverter.js
mach xpcshell-test comm/mailnews/base/test/unit/test_converterDeferredAccount.js
You need to log in before you can comment on or make changes to this bug.