Closed Bug 1719121 Opened 3 years ago Closed 5 months ago

Confine "From " line handling to mbox-specific code

Categories

(MailNews Core :: Backend, task)

Tracking

(thunderbird_esr115 wontfix)

RESOLVED FIXED
122 Branch
Tracking Status
thunderbird_esr115 --- wontfix

People

(Reporter: benc, Assigned: benc)

References

(Depends on 7 open bugs, Blocks 1 open bug, Regressed 1 open bug)

Details

Attachments

(2 files, 2 obsolete files)

Most of the codebase assumes that it has to deal with "From " lines. So, a "From " line at the beginning of a message, and escaping/unescaping such lines in the message body.

Now that mbox isn't the only game in town, we should hide all this "From " handling down inside the mbox-specific code (nsMsgBrkMBoxStore).

https://searchfox.org/comm-central/search?q=%22From+&path=mail*&case=true&regexp=false

Depends on: 121946
See Also: → 110389
Depends on: 1733849

Edit: This comment probably doesn't really belong in this bug so I've moved it here: bug 1741748.

Ben, I don't know if the problem I see with Zimbra imap server is related to this. For testing of copying a huge folder between servers, a few years ago a tb user sent me a very large mbox file saved from tb that I imported into Local Folders and then copied to gmail imap server. Everything looked ok on gmail at that point. I've copied this email set to several imap servers from gmail and, once I got a successful copy, the messages all looked correct at the destination. Now recently I tried copying the same messages from gmail to a locally running Zimbra imap server. The messages all seem to arrive OK and are all present but some the Date values in the summary list for the Zimbra account show only the time of the copy and no day/month/year of the actual message.

I think the lack of complete Date has something to do with the "From " and "X-Mozilla-Keys" headers that are somehow present in the gmail source messages derived from Local Folders. I only see these on gmail messages that I copied from Local Folders and not on gmail messages that come in normally.

After copying 200 gmail messages from this set to the local zimbra server, to obtain the information for the email summary, tb asks Zimbra for a set of message headers that includes the DATE header and zimbra responds like this for each of the 200 message with multiple lines of data:

Zimbra: * 43 FETCH (UID 33384 RFC822.SIZE 11410 BODY[HEADER.FIELDS (FROM TO CC BCC SUBJECT DATE MESSAGE-ID PRIORITY X-PRIORITY REFERENCES NEWSGROUPS IN-REPLY-TO CONTENT-TYPE REPLY-TO)] {404}
Zimbra: From 
Zimbra: Date: Thu, 16 Jun 2016 17:50:13 -0400
Zimbra: (Other requested headers sent...)
:

Note that zimbra always sends the not-requested "From " line first. The 2nd line returned above is the correct "Date" header but for other messages the order of the returned headers varies and the 2nd one is usually not "Date" but something else with "Date" header later on. It appears that the date displayed in the summary is only shown as the current time when Zimbra returns the "Date" header right after the bogus "From ".

For example this works OK and doesn't corrupt the summary:

Zimbra: * 3 FETCH (UID 33344 RFC822.SIZE 16163 BODY[HEADER.FIELDS (FROM TO CC BCC SUBJECT DATE MESSAGE-ID PRIORITY X-PRIORITY REFERENCES NEWSGROUPS IN-REPLY-TO CONTENT-TYPE REPLY-TO)] {415}
Zimbra: From 
Zimbra: From: "Guy Lombardo" <guy@louisville.edu>
Zimbra To: "surferdude@nmr.mgh.harvard.edu" <surferdude@nmr.mgh.harvard.edu>
Zimbra: Date: Wed, 17 Apr 2019 16:20:59 +0000
:

In this case, the bogus "From " is still transmitted by Zimbra but the "Date" header is not right after the "From " and the message summary displays OK with the correct date. The "From" in the summary is also OK even though it was transmitted right after the bogus "From ".

At this time I'm not sure if the problem is due the bogus "From " or the zimbra server or a tb parsing problem. So maybe this should be a new bug.

I went ahead and created a new bug for comment 1 with a proposed fix here: bug 1741748.

See Also: → 1741748
See Also: → 1735557
See Also: → 472000
Assignee: nobody → benc
Status: NEW → ASSIGNED
Depends on: 1793313
Depends on: 1797664
See Also: → 1797616
Depends on: 1800997
Depends on: 1801003
Depends on: 1801005

A bit of a monster patch, I'm afraid. There are mbox assumptions all over the codebase.
I'm sure there will be some more refinement required, but I'd like to get some testing and scrutiny onto these changes now.

The main gist of this patch is to add mbox-aware nsIInputStream/nsIOutputStream classes. These are used to read/write single messages, with all the gory mbox details hidden away, so outside code never has to deal with it. It means that mbox and maildir messages (and any other future msgstore systems) can be handled exactly the same way.

There are a whole bunch of mbox variants out there. We're aiming to support mboxrd (which uses '>' chars to reversibly escape "From " lines).
So our mbox writing is very strictly mboxrd.
But the reading is much looser - there are a load of mbox folders out there in the wild, with all kinds of different rules applied. Thunderbird itself has written a variety of mbox variants over the years, and in different parts of the codebase!
So we aim to be as accommodating as reasonably possible during reading, and very strict during writing.

There are a bunch of unit tests for the new stream classes, which can be run like this:
./mach gtest "TestMboxMsgInputStream.*"
./mach gtest "TestMboxMsgOutputStream.*"
Not sure if the build server runs these automatically or not (but it should!).

Blocks: 1848476
Attachment #9329378 - Attachment is obsolete: true

Depends on D190226

Attachment #9356925 - Attachment description: WIP: Bug 1719121 - Part 1/2, Confine "From " line handling to mbox-specific code. → Bug 1719121 - Part 1/2, Confine "From " line handling to mbox-specific code.
Attachment #9356926 - Attachment description: WIP: Bug 1719121 - Part 2/2, Update tests for mbox refactor. → Bug 1719121 - Part 2/2, Update tests for mbox refactor.
Attachment #9366002 - Attachment is obsolete: true
Target Milestone: --- → 122 Branch

Unfortunately part2 has bitrotted. Please rebase.

Sigh. It's toMboxString(), yet again. That function is my nemesis, but this patch will destroy it once and for all!

Pushed by benc@thunderbird.net:
https://hg.mozilla.org/comm-central/rev/222268ebb87b
Part 1/2, Confine "From " line handling to mbox-specific code. r=babolivier,leftmostcat,mkmelin
https://hg.mozilla.org/comm-central/rev/f46382a0d7a9
Part 2/2, Update tests for mbox refactor. r=darktrojan

Status: ASSIGNED → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
See Also: → 1868122
Regressions: 1868504
Regressions: 1872232
See Also: → 1872849
Regressions: 1872849
Regressions: 1890230
See Also: → 1716651
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: