Closed Bug 1696104 Opened 4 years ago Closed 4 years ago

Archive function (IMAP to local folder, ArGoSoft Mail Server) adds UTF-8 BOM (visible as ï»¿) to line after X-Mozilla-Keys header

Categories

(Thunderbird :: Untriaged, defect)

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: scott, Unassigned)

Details

Attachments

(1 file)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.149 Safari/537.36

Steps to reproduce:

Archive Inbox messages to Local Folder with TB 78.8.0 (64-bit) for Windows.

Actual results:

Many (not all) messages get some ASCII garbage (ï»¿) added to the beginning of the line following the X-Mozilla-Keys header, which TB adds when the message is archived to a Local Folder.

When this ASCII garbage (ï»¿) lands at the beginning of the Subject header, TB displays an empty subject in the message list.

Expected results:

No ASCII garbage characters (ï»¿) should be added to the beginning of the line following the added X-Mozilla-Keys header.

"ï»¿" (EF BB BF) is the UTF-8 BOM.
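A small Python sketch of why the BOM appears as those three characters. It assumes the viewer interprets the bytes as Windows-1252; the thread does not say which code page was actually in effect.

```python
import codecs

# The UTF-8 byte order mark is U+FEFF encoded as three bytes.
bom = "\ufeff".encode("utf-8")
assert bom == codecs.BOM_UTF8 == b"\xef\xbb\xbf"

# Assumption: the bytes are displayed as Windows-1252, so each byte
# becomes one visible character instead of an invisible mark.
visible = bom.decode("cp1252")
print(visible)
```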

I don't see that. Does that always happen?

(In reply to Klaus B. from comment #2)
> I don't see that. Does that always happen?

It does not always happen.

Did you try archiving a message from your IMAP mailbox to a Local Folder, then checking the headers of the message in the Local Folders? That's when I notice that it happens... when it happens, about 30% of the time.

It's entirely possible that the UTF-8 BOM is inserted at the beginning of the message header by the mail server when it's created on the mail server but is not visible or is ignored by TB if there is no ASCII before it.

I do not see the UTF-8 BOM characters in the headers of the message on the mail server, only after the message is archived by TB to Local Folders. But when the message is moved to the Local Folders, Thunderbird inserts a time stamp and X-Mozilla stuff at the very beginning of the message header. That's when I see the UTF-8 BOM characters about 30% of the time.

The mail server I'm using is ArGoSoft Mail Server .NET for Windows 1.3.0.2

I was lazy, I used local folders. Right, the first few lines (X-Mozilla-*) are added by TB, so you're seeing the BOM in front of the first "real" message line.

Gene, ever seen something like this?

Flags: needinfo?(gds)
Summary: Archive function adds ASCII garbage ï»¿ to line after X-Mozilla-Keys header → Archive function (IMAP to local folder, ArGoSoft Mail Server) adds UTF-8 BOM (visible as ï»¿) to line after X-Mozilla-Keys header

> Gene, ever seen something like this?

No. I just tested by setting archiving to "Local Folders/test" (just the flat-folder option) and don't see the problem. I did it many times, so if it happened 30% of the time I should have seen it, and I didn't.

I haven't run into a user of the ArGo mail server. I wonder if it is UTF8=ACCEPT capable? Looking at their website, it didn't mention UTF8 so not sure. I guess maybe an IMAP log might be needed to tell: https://wiki.mozilla.org/MailNews:Logging. But then again, I guess the UTF8=ACCEPT is still in beta, so probably not an issue.

Also, is the message "show source" OK before archiving? Might also look in the IMAP mbox file where the messages are stored by TB to see if there are BOM-like strings in there on some messages.

Flags: needinfo?(gds)

I did a View Source of the message before I archived or moved it to the Local Folders in TB and did not see the ï»¿ characters in the source.

So lemme remote to the actual ArGo mail server and open some .eml files and check. One sec...

Nope. I opened 12 .eml files on the server with Notepad and Notepad++ and none of the emails showed any ï»¿ characters at the beginning. The same is true when I View Source the message in TB. The ï»¿ characters only show up AFTER I archive the message to the Local Folders.

Better test:
I opened 3 .eml files on the mail server with Notepad and Notepad++.
None of the 3 .eml files had the ï»¿ characters.
I Archived these 3 emails in TB to my Local Folders.
ALL 3 emails had the ï»¿ characters when I View Source the message in the Local Folders! 100%!

Even if this is a problem originating with ArGo, maybe it's a good idea for TB to strip out these kinds of characters when a message is archived or copied, if they exist in the header where they shouldn't, to prevent message header corruption.

I suspect (I don't know, just a WAG) that the way a message is "moved" with TB to Local Folders from an IMAP server is that it streams the message into memory and inserts the "X-Mozilla-" stuff at the beginning of the message, then writes from memory to the Local Folders data store. Then TB sends a Delete command to the IMAP server. If this is the case, it seems that TB could strip out the ï»¿ characters when the message is being read into memory. It's unlikely that the first line of a message would ever start with high-ASCII characters. The first characters of a message are probably "Subject" or "X-Spam" or something in the low ASCII range.
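The stripping idea above could look roughly like the following Python sketch. This is not Thunderbird's code: `strip_leading_bom` and `prepend_local_headers` are hypothetical helpers, and the header block is a placeholder used only for illustration.

```python
import codecs

UTF8_BOM = codecs.BOM_UTF8  # b"\xef\xbb\xbf"

def strip_leading_bom(raw: bytes) -> bytes:
    """Drop one UTF-8 BOM if the raw message starts with it."""
    if raw.startswith(UTF8_BOM):
        return raw[len(UTF8_BOM):]
    return raw

def prepend_local_headers(raw: bytes, local_headers: bytes) -> bytes:
    """Hypothetical helper: clean the message, then put local headers in front."""
    return local_headers + strip_leading_bom(raw)

# Placeholder header block and a made-up message that starts with a BOM.
local = b"X-Mozilla-Status: 0001\r\nX-Mozilla-Keys: \r\n"
fetched = UTF8_BOM + b"Subject: test\r\n\r\nbody\r\n"
stored = prepend_local_headers(fetched, local)
print(stored)
```

Only a BOM at the very start of the message is removed, so any such bytes inside the body are left untouched.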

> I opened 3 .eml files on the mail server with Notepad and Notepad++.

Well, the nature of the UTF-8 BOM is that any well-behaved editor will of course not show it. To be safe, you'd have to look with a hex editor. Or in Notepad++ check whether it displays UTF-8 BOM in the status bar or the Encoding menu.
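If no hex editor is at hand, the same check can be sketched in Python by reading the first bytes of the file in binary mode. The file name `sample.eml` and both helper names are made up for this example, and the demo writes its own throwaway file rather than touching real mail.

```python
import codecs
import tempfile
from pathlib import Path

def starts_with_utf8_bom(path: Path) -> bool:
    """Return True if the file's first three bytes are EF BB BF."""
    with path.open("rb") as fh:
        return fh.read(3) == codecs.BOM_UTF8

def hex_prefix(path: Path, count: int = 8) -> str:
    """Return the first `count` bytes as space-separated hex."""
    with path.open("rb") as fh:
        return fh.read(count).hex(" ")

# Demo on a temporary file that deliberately starts with a BOM.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.eml"
    sample.write_bytes(codecs.BOM_UTF8 + b"Subject: test\r\n")
    has_bom = starts_with_utf8_bom(sample)
    prefix = hex_prefix(sample)

print(has_bom, prefix)
```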

Yes, TB could of course detect the BOM after receiving the message from the IMAP server, but it's just totally puzzling where those three bytes originate.

Something else. By default, TB stores a local offline copy of all IMAP messages unless you switch it off (folder properties, synchronization, offline use). Can you check the local copy, either in a mailbox or maildir file. What happens if you save the message as .eml file (File > Save As > File). Do you also get the BOM? And a third thing: Send yourself a message not encoded in UTF-8, but "Western" (windows-1252, ISO-8859-1) and use some Western European characters in it, like äöüáóú. Make sure the charset header is windows-1252. Do you still get the UTF-8 BOM?
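Checking the local copy can be sketched as a scan for lines that begin with the BOM bytes. The helper name `find_bom_lines` and the sample mbox content below are invented for illustration; with a real file you would pass in the bytes read from the mbox file in the profile.

```python
import codecs

def find_bom_lines(data: bytes):
    """Return (line_number, line) for each line starting with a UTF-8 BOM."""
    hits = []
    for number, line in enumerate(data.splitlines(), start=1):
        if line.startswith(codecs.BOM_UTF8):
            hits.append((number, line))
    return hits

# Made-up mbox fragment: the BOM sits in front of the first "real" header.
sample_mbox = (
    b"From - Mon Mar 01 10:00:00 2021\r\n"
    b"X-Mozilla-Status: 0001\r\n"
    b"X-Mozilla-Status2: 00000000\r\n"
    b"X-Mozilla-Keys: \r\n"
    + codecs.BOM_UTF8 + b"Subject: hello\r\n"
    b"\r\n"
    b"body\r\n"
)
hits = find_bom_lines(sample_mbox)
print(hits)
```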

I've determined that this problem originates with ArGoSoft Mail Server.

I installed hMailServer for Windows (free, open-source) and did not have the UTF-8 BOM (ï»¿) characters issue with messages that were archived from the mailbox to the Local Folders.

So this is not a Thunderbird bug.

That said, it might be nice if Thunderbird would ignore or remove the high-ASCII characters in the headers when a message is archived, since it's reading the message and inserting the X-Mozilla-* headers at the beginning before it writes to the Local Folders.

Thanks everyone for your help and discussion.

How do I close this case? First time using Bugzilla.

Thanks for the info and we'll keep your suggestion in mind.

Status: UNCONFIRMED → RESOLVED
Closed: 4 years ago
Resolution: --- → INVALID