Since v128 the inbox/message view shows all new mail as having no subject, no sender and date 1970-01-01 (pop)
Categories: Thunderbird :: Folder and Message Lists, defect, P2
Tracking: thunderbird_esr128? affected, thunderbird132? affected
People: Reporter: pudding, Assigned: gds
References: Blocks 1 open bug
Details: Whiteboard: [datalossy]
Attachments: 4 files (54.58 KB image/png; 1.36 KB patch; 48 bytes text/x-phabricator-request; 3.18 KB patch)
Steps to reproduce:
Exported full mailbox+config from Thunderbird 115, imported on "fresh" unconfigured Thunderbird 128.
Actual results:
All e-mails fetched with Thunderbird 128 are shown in the inbox/message list with a blank sender, a blank subject, and a date of 1970-01-01 01:00:00. Clicking an e-mail to read it does, however, show the sender/subject/etc. correctly in the "header" portion above the e-mail text.
All previous e-mails obtained during the import appear normal in the inbox/message list, as can be seen in the attached screenshot.
Expected results:
All e-mail should of course be shown with the sender, subject, date just as Thunderbird 115 did before I "upgraded".
Is this IMAP or POP/local folders? If POP, it looks like a duplicate of bug 1911076. Windows? 32bit or 64bit? Or which OS?
Comment 2 (Reporter) • 4 months ago
(In reply to Francesco from comment #1)
> Is this IMAP or POP/local folders? If POP, it looks like a duplicate of bug 1911076. Windows? 32bit or 64bit? Or which OS?
This is with POP3, on macOS Sonoma Arm64.
Comment 3 (Reporter) • 4 months ago
More data in case it matters: my mailbox footprint is relatively small, not even 500 MB in size.
Have you tried repairing it? Some reports say that after repairing the "empty" messages are gone altogether, so take a backup copy of the mailbox before you repair. The raw data file should be enough to keep since the system can always reconstruct the .msf from it.
Comment 5 (Assignee) • 4 months ago
I've never done the "export/import" method when updating to a new TB version. Did you try just running TB 128 with your working 115 profile and not do the export/import? Or maybe you are saying the problem only occurs if you do the export/import update but is OK if you just run 128 with the original profile?
FWIW, I can receive new POP3 messages OK when just running 128/64-bit with the original profile.
Comment 6 (Assignee) • 4 months ago
I created a profile with a single POP3 account in 115 and exported it to a zip file. Then I ran 128 and imported the zip file into a new empty profile called "import". I'm able to receive new messages OK and view them in profile "import" running on 128. Tested this on Linux.
Comment 7 (Reporter) • 4 months ago
(In reply to Francesco from comment #4)
> Have you tried repairing it? Some reports say that after repairing the "empty" messages are gone altogether, so take a backup copy of the mailbox before you repair. The raw data file should be enough to keep since the system can always reconstruct the .msf from it.
Repairing the inbox fixed the misinterpreted e-mails that were currently in the folder, but new e-mails fetched afterwards still show up in the message list as blank and with 1970-01-01 date.
Comment 8 (Reporter) • 4 months ago
(In reply to gene smith from comment #5)
> I've never done the "export/import" method when updating to a new TB version. Did you try just running TB 128 with your working 115 profile and not do the export/import? Or maybe you are saying the problem only occurs if you do the export/import update but is OK if you just run 128 with the original profile?
> FWIW, I can receive new POP3 messages OK when just running 128/64-bit with the original profile.
I did not try to start 128 with the old data/profile already in place, as I was migrating to a new computer and new OS installation. It is/was a "fresh" TB 128 with no previous data/config available.
(In reply to Minipudding from comment #7)
> Repairing the inbox fixed the misinterpreted e-mails that were currently in the folder, but new e-mails fetched afterwards still show up in the message list as blank and with 1970-01-01 date.
Very interesting, so you have a repaired mailbox which shows newly received/added messages as corrupted. I'm sure the devs will want to have a look at a snapshot of that mailbox to work out what's going on.
Comment 10 • 4 months ago
(In reply to Minipudding from comment #7)
> (In reply to Francesco from comment #4)
> > Have you tried repairing it? Some reports say that after repairing the "empty" messages are gone altogether, so take a backup copy of the mailbox before you repair. The raw data file should be enough to keep since the system can always reconstruct the .msf from it.
> Repairing the inbox fixed the misinterpreted e-mails that were currently in the folder, but new e-mails fetched afterwards still show up in the message list as blank and with 1970-01-01 date.
Can I ask one question: Are your profile and TB installed locally on your PC?
Comment 11 (Reporter) • 4 months ago
(In reply to Arthur K. (he/him) from comment #10)
> Can I ask one question: Are your profile and TB installed locally on your PC?
Yes, everything including TB is stored on internal storage on local file system in my user's home, and I'm running as a plain normal unprivileged (non-admin) user.
Comment 12 (Reporter) • 4 months ago
A bit more data: the exported mailbox from TB 115 came from an older macOS/x86 setup, before I imported it on a newer macOS/Arm64 setup. The profile was created on a Thunderbird version earlier than 115, which over time saw several "in-app" updates.
Comment 14 (Reporter) • 3 months ago
(In reply to Magnus Melin [:mkmelin] from comment #13)
> Are you able to share the mailbox (privately)?
Unfortunately not. It contains both personal and work-related e-mails.
Is some compromise possible? Partial content? DDL for the schemas?
Comment 15 • 2 months ago
Somehow related to removal of "From" header line?
Comment 18 (Reporter) • 11 days ago
Unfortunately this does not seem to have been a duplicate of 1911916. After updating to 128.4.2 the problem remains in the exact same form.
Comment 19 • 11 days ago
It's not really clear what's going on here. Receiving messages with POP generally works, I'm using it myself.
After bug 1911916 has now been included in the release, messages added to the mailbox now have their so-called envelope date again, for example:
From - Sun Nov 10 18:29:00 2024
However, your report says: no subject, no sender and date 1970-01-01. This points to a totally corrupt mailbox.
So what happens when you repair the mailbox? Do the messages then show some correct content? And what happens when new messages are received into that mailbox?
Comment 20 (Reporter) • 11 days ago
(In reply to Francesco from comment #19)
> It's not really clear what's going on here. Receiving messages with POP generally works, I'm using it myself.
> After bug 1911916 has now been included in the release, messages added to the mailbox now have their so-called envelope date again, for example:
> From - Sun Nov 10 18:29:00 2024
> However, your report says: no subject, no sender and date 1970-01-01. This points to a totally corrupt mailbox.
> So what happens when you repair the mailbox? Do the messages then show some correct content? And what happens when new messages are received into that mailbox?
As mentioned before, if I repair the mailbox the misinterpreted e-mails are corrected and they will remain correct. But each time I receive a new e-mail it ends up the same way.
The contents of the e-mails have so far never been corrupted. They are simply presented/interpreted wrong in the message folder list. If I simply click an e-mail to read it all the details are shown correctly. If I reply to one of the mispresented e-mails, the wrong details are used in the reply - the recipient e-mail address for my reply is correct but it has a blank subject, blank name, 1970-01-01 date.
Comment 21 • 11 days ago
So for a local folder or POP folder, this is how it works:
Messages are downloaded from the POP server and placed into the raw messages file (mbox format). The message is parsed and its details (from, subject, date) are stored in the database file (.msf). TB uses the database to display the message list, since the db file has fast access and avoids repeated mbox parsing.
When the db gets "corrupted", it gets out of step with the underlying raw data in the mbox file. If you repair the folder, the .msf is thrown away, the raw data is parsed again and a new db is constructed.
So according to your previous comment, the repair creates a new "good" db, but further messages received end up bad again. When you click an e-mail, TB actually reads the content from the raw data/mbox, but the "offset" where to find the data for a particular message is actually taken from the db. That's why what you report sounds somewhat contradictory. The reply uses the message information from the db, so if the message wasn't displayed correctly in the message list, the reply will also lack the correct information.
Very hard to work out what's going on without getting the data or seeing it happening.
Another change is shipping in 128.4.3 in the next couple of days which makes the mbox parser more restrictive. If the offset stored in the db doesn't point to a mbox separator "From -" in the raw data, any processing is aborted, likely leading to an empty display.
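The offset check described above can be approximated outside Thunderbird. The following is only a sketch, assuming the byte offsets have already been read out of the .msf by some other tool and that a separator line starts with "From "; the function name `check_offsets` and the sample data are invented for illustration and are not Thunderbird code:

```python
import io
from typing import BinaryIO, Iterable

SEPARATOR = b"From "  # assumed start of an mbox separator line

def check_offsets(mbox: BinaryIO, offsets: Iterable[int]) -> dict[int, bool]:
    """Return, per byte offset, whether a separator line starts there."""
    results = {}
    for offset in offsets:
        mbox.seek(offset)
        results[offset] = mbox.read(len(SEPARATOR)) == SEPARATOR
    return results

# Synthetic two-message mbox; a real file would be opened with open(path, "rb").
sample = (
    b"From - Sun Nov 10 17:50:05 2024\r\n"
    b"Subject: first\n\nbody one\n\r\n"
    b"From - Sun Nov 10 18:29:00 2024\r\n"
    b"Subject: second\n\nbody two\n\r\n"
)
second = sample.index(b"From - Sun Nov 10 18:29:00")
results = check_offsets(io.BytesIO(sample), [0, second, 5])
print(results)  # offsets 0 and `second` hit a separator, offset 5 does not
```

Seeking instead of reading the whole file keeps this usable on mailboxes of several hundred MB.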
Comment 22 (Reporter) • 11 days ago
(In reply to Francesco from comment #21)
> Very hard to work out what's going on without getting the data or seeing it happening.
Unfortunately the mailbox contains both private and work-related e-mails, so I cannot hand it over as-is.
What is "msf"? Is it BDB? Sqlite v2 or 3? Some custom format? Is the format described anywhere such that I can iterate across it and see if any of the offsets point to a bad location in the plain-text mbox file? Is the offset value a byte-offset? Line offset?
Comment 23 • 11 days ago
It's Mork, you can use this script to analyse it: https://www.ggbs.de/extensions/mork.pl. It's a byte offset. Take a look at bug 1920329 comment 18 for inspiration. The offset is stored in the storeToken. If you have a small mbox with a few good messages and even fewer bad ones, you should be able to spot bad offsets easily. I use Notepad++ to get the offset. It displays the character position in the file counting CRLF as two characters (as it should).
Comment 24 (Reporter) • 11 days ago
So I've iterated through the "msf" file now. I currently have two misinterpreted ("non-repaired") e-mails in my mbox file, but I cannot find any certainly corresponding entries for them in the "msf". Though I have two suspects:
{ 'threadParent' => 'ffffffff', 'ProtoThreadFlags' => '0', 'msgThreadId' => '3a7', 'junkscoreorigin' => 'plugin', 'ID' => '3A7', 'flags' => '1', 'LastVisitDate' => 0, 'threadSubject' => '', 'junkpercent' => '8', 'numLines' => '0', 'size' => '6aa', 'junkscore' => '0', 'storeToken' => '322789786' }
{ 'junkpercent' => '3', 'size' => '21d9', 'numLines' => '0', 'junkscore' => '0', 'storeToken' => '322791527', 'threadParent' => 'ffffffff', 'ProtoThreadFlags' => '0', 'msgThreadId' => '3a8', 'junkscoreorigin' => 'plugin', 'ID' => '3A8', 'flags' => '1', 'LastVisitDate' => 0, 'threadSubject' => '' }
And here's a third weird one:
{ 'LastVisitDate' => 0, 'threadSubject' => 'Fuck You', 'ID' => 'FFFFFFFE' }
The sizes of the first two roughly correspond to the sizes of the misinterpreted envelopes in the mbox file. I'm guessing linebreaks or something isn't fully accounted for. The "fuck you" envelope is an old thread in my inbox from long before this bug showed up, but I see no problems with any of the e-mails in the thread. They look fine in the message list and they look fine when I open each of them. I don't know why this sparse entry is present in the "msf".
Any suggestions on how I should proceed to try to untangle this mess?
Comment 25 • 11 days ago
Hmm, the output format isn't anything the script produces. Plus the script dumps out subjects that can be related to messages. The point is that the storeToken entries are pointers into the raw data. What is at those file positions?
Comment 26 (Reporter) • 11 days ago
(In reply to Francesco from comment #25)
> Hmm, the output format isn't anything the script produces. Plus the script dumps out subjects that can be related to messages. The point is that the storeToken entries are pointers into the raw data. What is at those file positions?
It's my own parser I wrote with File::Mork from the same author a few minutes before you linked the other one. What you see are, as far as I can tell, all of the keys present in each of the "msf" entries, without any alterations.
Both positions point to an mbox separator, e.g. From - Sun Nov 10 17:50:05 2024 followed by CRLF. I notice no discrepancy in the sizes of the envelopes, they match when I count from first header (after separator) up to but excluding the finalizing CRLF before next envelope.
Comment 27 • 11 days ago
Maybe you should join the TB dev team ;-) - That said, read in bug 1925085 comment 54 about the size. That confirms your observation: The mbox separator is not counted, and also not the CRLF before the next separator.
So all is well, also clicking onto the message positions correctly and displays the message. In your parser, are the other message headers recorded correctly in your .msf, like subject, from, etc. Why isn't the message displayed correctly in the list if all the data is correct (and should be correct after various repairs)? How about in the table view? Also bad?
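The size rule confirmed above (separator line not counted, and the line break before the next separator not counted either) can be sketched as follows. This is an illustration only, assuming the whole mbox fits in memory and that no message body contains an unescaped line starting with "From "; `message_size` is a hypothetical helper, and the `size` values in the dumps above appear to be hexadecimal, so they would need `int(value, 16)` before comparing:

```python
def message_size(mbox: bytes, offset: int) -> int:
    """Size of the message whose separator line starts at `offset`."""
    # Skip the "From - ..." separator line itself.
    start = mbox.index(b"\n", offset) + 1
    # The message runs up to the next separator line (or the end of the file).
    nxt = mbox.find(b"\nFrom ", start)
    end = len(mbox) if nxt == -1 else nxt + 1
    chunk = mbox[start:end]
    # Exclude the line break that precedes the next separator.
    if chunk.endswith(b"\r\n"):
        chunk = chunk[:-2]
    elif chunk.endswith(b"\n"):
        chunk = chunk[:-1]
    return len(chunk)

sample = (
    b"From - Sun Nov 10 17:50:05 2024\r\n"
    b"Subject: a\n\nhello\n\r\n"
    b"From - Sun Nov 10 18:29:00 2024\r\n"
    b"Subject: b\n\nbye\n\r\n"
)
first_size = message_size(sample, 0)
second_size = message_size(sample, sample.index(b"From - Sun Nov 10 18:29:00"))
print(first_size, second_size)
```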
Comment 28 (Reporter) • 11 days ago
(In reply to Francesco from comment #27)
> Maybe you should join the TB dev team ;-) - That said, read in bug 1925085 comment 54 about the size. That confirms your observation: The mbox separator is not counted, and also not the CRLF before the next separator.
> So all is well, also clicking onto the message positions correctly and displays the message. In your parser, are the other message headers recorded correctly in your .msf, like subject, from, etc. Why isn't the message displayed correctly in the list if all the data is correct (and should be correct after various repairs)? How about in the table view? Also bad?
I'll provide a screenshot of it all just so that the manifestation of the bug is entirely clear: https://i.imgur.com/jpgQY8G.jpeg
If I were to repair the Inbox, the three blank e-mails at the bottom would be fixed and it would be as if nothing had ever been wrong...until the next e-mail arrives.
If I sample a bunch of other normal e-mails from the .msf, they appear fully correct as far as I can tell. Here is a redacted example:
{
'sender' => 'xxxx xxxxxxxx <xxxxxxxx@gmail.com>',
'numLines' => '2b',
'flags' => '3',
'priority' => '1',
'threadSubject' => 'xxxxx xxxx',
'msgThreadId' => '397',
'LastVisitDate' => 0,
'dateReceived' => '66fd1d66',
'ID' => '397',
'recipients' => 'xxxxx xxxxxxxx <xxxxx@xxx.xx>',
'storeToken' => '314156129',
'subject' => 'xxxxx xxxx',
'date' => '66fd1d5e',
'msgOffset' => '12b9a461',
'ProtoThreadFlags' => '0',
'message-id' => 'xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx@gmail.com',
'references' => '',
'threadParent' => 'ffffffff',
'account' => 'account2',
'size' => '10fd'
}
Position goes to the right place in the mbox, the size matches, subject is normal etc. etc. I have not made an attempt to traverse the entire .msf + mbox to see if every entry is sane, because to me this would be akin to reinventing TB's "repair" function. Additionally I don't have a complete understanding of the key-value store and the keys' meaning and uses anyway.
Comment 29 (Reporter) • 11 days ago
Something I notice is that in correct .msf entries the storeToken has a matching hexadecimal counterpart in msgOffset. But perhaps that field is not really used.
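The match between the two fields can be verified with the values from the redacted entry in comment 28. A small sketch, assuming that msgOffset and size are hexadecimal strings and that date is a hexadecimal count of seconds since the Unix epoch (the latter is an assumption based on the value decoding to a plausible timestamp, not on Mork documentation):

```python
from datetime import datetime, timezone

# Values copied from the redacted .msf entry in comment 28.
entry = {
    "storeToken": "314156129",  # decimal byte offset
    "msgOffset": "12b9a461",    # assumed hexadecimal
    "size": "10fd",             # assumed hexadecimal
    "date": "66fd1d5e",         # assumed hexadecimal Unix time
}

store_token = int(entry["storeToken"])
msg_offset = int(entry["msgOffset"], 16)
size_bytes = int(entry["size"], 16)
sent = datetime.fromtimestamp(int(entry["date"], 16), tz=timezone.utc)

print(store_token == msg_offset)  # True: both name the same byte offset
print(size_bytes)
print(sent.isoformat())
```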
Comment 30 • 11 days ago
> I'll provide a screenshot of it all just so that the manifestation of the bug is entirely clear: https://i.imgur.com/jpgQY8G.jpeg
That basically shows what was discussed before. In the reply, the "header details" are missing since they are taken from the db, the body is quoted correctly since it is taken from the raw message.
I'm surprised that msgOffset is still there, that is the old way of storing the offset and was removed in bug 1720047. Apparently the repair function doesn't remove it. Hah, here's a new idea! What happens if instead of repairing the folder, you simply delete the .msf (after taking a backup). That way it's completely built from scratch and as we've seen in bug 1920329 comment 37 msgOffset then finally disappears. Maybe it will make a difference.
I really can't understand why everything is fine after a repair, and then the next message shows up badly in the message list. There must be some quirk somewhere.
Ben, do you have any ideas?
Comment 31 (Reporter) • 10 days ago
(In reply to Francesco from comment #30)
> I'm surprised that msgOffset is still there, that is the old way of storing the offset and was removed in bug 1720047. Apparently the repair function doesn't remove it. Hah, here's a new idea! What happens if instead of repairing the folder, you simply delete the .msf (after taking a backup). That way it's completely built from scratch and as we've seen in bug 1920329 comment 37 msgOffset then finally disappears. Maybe it will make a difference.
Good suggestion, that did "clean up" the .msf which is now void of all msgOffset keys - but unfortunately it doesn't solve the bug.
I notice that the rebuilt-from-scratch .msf still contains the curious malformed "Fuck You" envelope. I will try to locate and remove this envelope from a copy of the mbox and see what happens following a new rebuild.
Comment 32 (Assignee) • 10 days ago
Francesco wrote:
> I really can't understand why everything is fine after a repair, and then the next message shows up badly in the message list. There must be some quirk somewhere.
So after a "repair" (either way) when new messages come in they are unreadable? But the new .msf records for the new msgs look OK?
Maybe this is a lower-level POP3 problem? What are your server settings for the problem POP3 account? Specifically,
- Auto download new messages
- Fetch headers only
- Leave msgs on server
- For at most x days
- Until I delete them
How many messages are you leaving on the server?
Do all the items in Inbox and Inbox.msf come from the server when getting new mail or have you copied stuff from elsewhere into Inbox?
What is your POP3 server type?
I set up an outlook.com pop3 account with over 30k messages still on the server resulting in huge Inbox and Inbox.msf and don't see a problem when new mail comes in. Also, the storetokens shown with mork.pl look correct compared to offsets in Inbox file.
Comment 33 (Assignee) • 10 days ago
Another possible thing that could cause problems is the file popstate.dat (alongside Inbox and Inbox.msf) which might be corrupted. I think if you delete it a re-download of all messages still on the server will occur in order to re-create it. But not 100% sure what will happen and don't want to cause more problems.
Comment 34 • 10 days ago
(In reply to gene smith from comment #33)
> I think if you delete it [popstate.dat] a re-download of all messages still on the server will occur in order to re-create it.
That is correct.
Comment 35 (Reporter) • 10 days ago
(In reply to gene smith from comment #32)
> Francesco wrote:
> > I really can't understand why everything is fine after a repair, and then the next message shows up badly in the message list. There must be some quirk somewhere.
> So after a "repair" (either way) when new messages come in they are unreadable? But the new .msf records for the new msgs look OK?
> Maybe this is a lower-level POP3 problem? What are your server settings for the problem POP3 account? Specifically,
> - Auto download new messages
> - Fetch headers only
> - Leave msgs on server
> - For at most x days
> - Until I delete them
> How many messages are you leaving on the server?
> Do all the items in Inbox and Inbox.msf come from the server when getting new mail or have you copied stuff from elsewhere into Inbox?
> What is your POP3 server type?
> I set up an outlook.com pop3 account with over 30k messages still on the server resulting in huge Inbox and Inbox.msf and don't see a problem when new mail comes in. Also, the storetokens shown with mork.pl look correct compared to offsets in Inbox file.
[x] check for new messages at startup
[x] check for new messages every 3 minutes
[x] automatically download new messages
[ ] fetch headers only
[ ] leave messages on server
No messages are left on the server, they are trashed in the same session before disconnecting.
All messages originate from that same server and account. Nothing has been copied in from elsewhere. But as stated in this report, the current "source" is an exported backup from a previous installation of TB 115. The problems began when importing that backup into a new install of TB 128. After the backup import into the new TB 128 I made no configuration changes. I have tried to redo the whole "installation" of TB and import of the backup, in case it was a temporary fluke, but the behavior is the same.
Regarding POP3 server type Thunderbird's server settings dialog says "POP Mail Server", and I connect to it with POP3S (implicit TLS).
Comment 36 • 10 days ago
Import. That's bad news. It causes all sorts of bugs no one can trace.
Please create a new profile (use thunderbird -p) and set it up. Then copy the "Mail" folder from the old profile to the new one so you get all your local/POP folders.
Comment 37 (Reporter) • 10 days ago
I cut the suspicious "f**k you" envelope out of the mbox and ran my msf parser again to see what would change. This time another "small" msf entry showed up, with the same three keys as the "f**k you" envelope, but with a different subject, thus it must relate to another envelope in the mbox (or whatever bug is occurring is simply having a hiccup somewhere else).
I took a closer look at the "f**k you" envelope and the envelope of the new suspicious msf entry, and I notice some differences compared to most other envelopes in the mbox - the custom headers injected by Thunderbird (i.e. X-Mozilla-Status etc.) have CRLFs on the tail instead of just an LF, which is the case of almost all envelopes in the mbox.
Unfortunately after tidying that up so that all of those headers are in parity (just an LF at the end) there is still no change in the behavior.
At this point I am convinced that the mbox has been corrupted, either during the export in TB 115 or during import in TB 128, and I see no apparent solution besides archiving this mbox and starting a new setup from scratch, without importing my old config.
Comment 38 • 10 days ago
An mbox is just a text file. As you saw, removing the .msf recreates it with no problem. Just create a new profile, transfer the inbox/sent box or the Mail folder over. Alternatively, copy the old mboxes to the new profile with a new name, like inboxOld. According to your screenshot, your setup is really simple, so a manual transfer should be easy. Last question: Do you use any filters?
Comment 39 (Assignee) • 10 days ago
Reporter mentioned the export/import in comment 0. I tested it on an account in comment 5 and comment 6 and it "worked for me" when imported to a new profile.
Ok, sounds like popstate.dat would be (mostly?) empty since nothing left on server.
Probably doesn't matter, but was asking for the server type, e.g., exchange, dovecot, courier, gmail etc.
Comment 40 (Reporter) • 10 days ago
(In reply to Francesco from comment #38)
> An mbox is just a text file. As you saw, removing the .msf recreates it with no problem. Just create a new profile, transfer the inbox/sent box or the Mail folder over. Alternatively, copy the old mboxes to the new profile with a new name, like inboxOld. According to your screenshot, your setup is really simple, so a manual transfer should be easy. Last question: Do you use any filters?
I used one filter for junk mail classification, matching on presence of a header.
I have now deleted everything for TB 128 - profile, caches, configs, everything - and made a new profile, this time using Maildir instead of mbox ... and the problem is still present...
Here is a fresh "1970-01-01" e-mail with redacted content (using a hex editor to make sure nothing sensitive was spoiled). Does anything look off?
X-Account-Key: account1
X-UIDL: b42aea04b73383efe54772ee8590062a
X-Mozilla-Status: 0000
X-Mozilla-Status2: 00000000
X-Mozilla-Keys:
Return-Path: <XXXXX@XXXXXXXXXX.XXXXXXXXXX.XXX>
Delivered-To: XXXXX@XXXXXXXXXX.XXX
Authentication-Results: XXXXXXXX.XXXXXXXXXX.XXX; dkim=pass header.s=2k24
header.d=XXXXXXXXXX.XXXXXXXXXX.XXX header.a=rsa-sha256
Received: from XXXXXXXXXX.XXXXXXXXXX.XXX (XXXXXXXXXX.XXXXXXXXXX.XXX [999.999.99.999])
by XXXXXXXX.XXXXXXXXXX.XXX (OpenSMTPD) with ESMTPS id 147a68ef (TLSv1.3:TLS_AES_256_GCM_SHA384:256:NO)
for <XXXXX@XXXXXXXXXX.XXX>;
Mon, 11 Nov 2024 22:38:15 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; s=2k24; t=1731361094;
x=1731620294; bh=a7QS2KKWt5XuTGorvOAER/yQSnI/DU2oULa2iigWzW4=;
h=subject:to:from:date; d=XXXXXXXXXX.XXXXXXXXXX.XXX; b=d3xGl4ZrLadTQ6O
zDveyO9XUWersUC1ZUuwYoHsnpHG4CdWxTURSwENPCjugKQ3iCLAVqX7mmmKM6Ubk038HY
FMAUvFo/s1oBXWn3zIVyva+FT3IB14huoYdlJBryxHcOTapgeTKZvM4NZNnct6gt+QDU24
Bx9AfBpcu8iSyDKmUmAT7EVtuZ32zXENki77eb43FNiDprVY5jJdpRydrThbRWlMt/LM8F
Po/NKzbC/96sbkxx6seqr3sUjt37vbztzvnOpOAG5vhmDThdJfCaT9J7MVbbTtGpOybTKv
on45cKuWWyT3/NTpmgeMsddxzVkg+SGO/AjxKVQv/jEH/Wg==
Received:
by XXXXXXXXXX.XXXXXXXXXX.XXX (OpenSMTPD) with SMTP id 03e4bcdb
for <XXXXX@XXXXXXXXXX.XXX>;
Mon, 11 Nov 2024 21:38:14 +0000 (UTC)
Date: Mon, 11 Nov 2024 22:38:14 +0100
From: XXXXX XX XXXXXXXXX <XXXXX@XXXXXXXXXXXXXXXXXXXXXXXXX>
To: XXXXX@XXXXXXXXXX.XXX
Subject: XXXXXX
Message-ID: <20241111213814.eyqyrrp25rd256a2@XXXXXXXXXX.XXXXXXXXXX.XXX>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
XXXXX
And here is the same redacted Maildir file in base64 in order to preserve binary integrity, just to be on the safe side:
WC1BY2NvdW50LUtleTogYWNjb3VudDEKWC1VSURMOiBiNDJhZWEwNGI3MzM4M2Vm
ZTU0NzcyZWU4NTkwMDYyYQpYLU1vemlsbGEtU3RhdHVzOiAwMDAwClgtTW96aWxs
YS1TdGF0dXMyOiAwMDAwMDAwMApYLU1vemlsbGEtS2V5czogICAgICAgICAgICAg
ICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICAg
ICAgICAgICAgICAgICAgICAgICAKUmV0dXJuLVBhdGg6IDxYWFhYWEBYWFhYWFhY
WFhYLlhYWFhYWFhYWFguWFhYPgpEZWxpdmVyZWQtVG86IFhYWFhYQFhYWFhYWFhY
WFguWFhYCkF1dGhlbnRpY2F0aW9uLVJlc3VsdHM6IFhYWFhYWFhYLlhYWFhYWFhY
WFguWFhYOyBka2ltPXBhc3MgaGVhZGVyLnM9MmsyNAoJaGVhZGVyLmQ9WFhYWFhY
WFhYWC5YWFhYWFhYWFhYLlhYWCBoZWFkZXIuYT1yc2Etc2hhMjU2ClJlY2VpdmVk
OiBmcm9tIFhYWFhYWFhYWFguWFhYWFhYWFhYWC5YWFggKFhYWFhYWFhYWFguWFhY
WFhYWFhYWC5YWFggWzk5OS45OTkuOTkuOTk5XSkKCWJ5IFhYWFhYWFhYLlhYWFhY
WFhYWFguWFhYIChPcGVuU01UUEQpIHdpdGggRVNNVFBTIGlkIDE0N2E2OGVmIChU
TFN2MS4zOlRMU19BRVNfMjU2X0dDTV9TSEEzODQ6MjU2Ok5PKQoJZm9yIDxYWFhY
WEBYWFhYWFhYWFhYLlhYWD47CglNb24sIDExIE5vdiAyMDI0IDIyOjM4OjE1ICsw
MTAwIChDRVQpCkRLSU0tU2lnbmF0dXJlOiB2PTE7IGE9cnNhLXNoYTI1NjsgYz1y
ZWxheGVkL3JlbGF4ZWQ7IHM9MmsyNDsgdD0xNzMxMzYxMDk0OwoJeD0xNzMxNjIw
Mjk0OyBiaD1hN1FTMktLV3Q1WHVUR29ydk9BRVIveVFTbkkvRFUyb1VMYTJpaWdX
elc0PTsKCWg9c3ViamVjdDp0bzpmcm9tOmRhdGU7IGQ9WFhYWFhYWFhYWC5YWFhY
WFhYWFhYLlhYWDsgYj1kM3hHbDRackxhZFRRNk8KCXpEdmV5TzlYVVdlcnNVQzFa
VXV3WW9Ic25wSEc0Q2RXeFRVUlN3RU5QQ2p1Z0tRM2lDTEFWcVg3bW1tS002VWJr
MDM4SFkKCUZNQVV2Rm8vczFvQlhXbjN6SVZ5dmErRlQzSUIxNGh1b1lkbEpCcnl4
SGNPVGFwZ2VUS1p2TTROWk5uY3Q2Z3QrUURVMjQKCUJ4OUFmQnBjdThpU3lES21V
bUFUN0VWdHVaMzJ6WEVOa2k3N2ViNDNGTmlEcHJWWTVqSmRwUnlkclRoYlJXbE10
L0xNOEYKCVBvL05LemJDLzk2c2JreHg2c2VxcjNzVWp0Mzd2Ynp0enZuT3BPQUc1
dmhtRFRoZEpmQ2FUOUo3TVZiYlR0R3BPeWJUS3YKCW9uNDVjS3VXV3lUMy9OVHBt
Z2VNc2RkeHpWa2crU0dPL0FqeEtWUXYvakVIL1dnPT0KUmVjZWl2ZWQ6IAoJYnkg
WFhYWFhYWFhYWC5YWFhYWFhYWFhYLlhYWCAoT3BlblNNVFBEKSB3aXRoIFNNVFAg
aWQgMDNlNGJjZGIKCWZvciA8WFhYWFhAWFhYWFhYWFhYWC5YWFg+OwoJTW9uLCAx
MSBOb3YgMjAyNCAyMTozODoxNCArMDAwMCAoVVRDKQpEYXRlOiBNb24sIDExIE5v
diAyMDI0IDIyOjM4OjE0ICswMTAwCkZyb206IFhYWFhYIFhYIFhYWFhYWFhYWCA8
WFhYWFhAWFhYWFhYWFhYWFhYWFhYWFhYWFhYWFhYWD4KVG86IFhYWFhYQFhYWFhY
WFhYWFguWFhYClN1YmplY3Q6IFhYWFhYWApNZXNzYWdlLUlEOiA8MjAyNDExMTEy
MTM4MTQuZXlxeXJycDI1cmQyNTZhMkBYWFhYWFhYWFhYLlhYWFhYWFhYWFguWFhY
PgpNSU1FLVZlcnNpb246IDEuMApDb250ZW50LVR5cGU6IHRleHQvcGxhaW47IGNo
YXJzZXQ9dXMtYXNjaWkKQ29udGVudC1EaXNwb3NpdGlvbjogaW5saW5lCgpYWFhY
WAoK
Comment 41 (Reporter) • 10 days ago
If I compare with the original Maildir envelope file directly from the server, as created by OpenSMTPd, there is besides Thunderbird's custom "X-Mozilla-Blah" headers at the head only one single difference: Thunderbird's copy has an additional LF at the end. The original envelope file ends with single 0x0a, Thunderbird's copy has 0x0a 0x0a.
I don't know if it's Thunderbird adding it or if it's the POP3 daemon I am using. Nor do I know if that would have any effect on anything. I have used this POP3 server with Claws, Apple Mail (on macOS and iPhone), Thunderbird <=115, Neomutt, Sylpheed, and even POP3 fetching from inside Gmail webmail. Only TB 128 exhibits this problem. How can I proceed from here?
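The doubled LF is visible in the base64 dump from comment 40 itself: its final group, WAoK, decodes to the bytes 0x58 0x0a 0x0a. A short sketch for counting trailing LF bytes; `trailing_newlines` is just an illustrative helper name, and decoding the last group on its own works because every preceding line of the dump is a multiple of four characters long:

```python
import base64

def trailing_newlines(data: bytes) -> int:
    """Count consecutive LF bytes (0x0a) at the very end of the data."""
    return len(data) - len(data.rstrip(b"\n"))

# Final four-character group of the base64 dump in comment 40.
tail = base64.b64decode("WAoK")
print(tail.hex(" "), trailing_newlines(tail))
```

Running the same helper over the server-side Maildir file and over Thunderbird's copy would show the difference described above without a hex editor.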
Comment 42 (Reporter) • 10 days ago
This is the POP3 daemon I use and here is the part that handles RETR: https://github.com/stolendata/little-peepo/blob/master/little_peepo.pl#L323-L325
It finalizes transmission of the envelope with \r\n.\r\n. I don't know POP3 very well, is this aligned with the specification?
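For reference, RFC 1939 defines a multi-line response as CRLF-terminated lines, with any line that begins with "." prefixed by an extra "." (byte-stuffing), followed by a terminating line that contains only ".". A minimal sketch of that framing and its inverse; this is not the code of the daemon linked above, and both function names are hypothetical:

```python
def frame_multiline(message: bytes) -> bytes:
    """Encode a message as the body of a POP3 multi-line response."""
    lines = message.replace(b"\r\n", b"\n").split(b"\n")
    if lines and lines[-1] == b"":
        lines.pop()  # message ended with a line break; no extra empty line
    out = bytearray()
    for line in lines:
        if line.startswith(b"."):
            line = b"." + line  # byte-stuff lines that begin with a dot
        out += line + b"\r\n"
    out += b".\r\n"  # terminating line: a single dot
    return bytes(out)

def unframe_multiline(wire: bytes) -> bytes:
    """Reverse frame_multiline: drop the terminator and undo byte-stuffing."""
    if not wire.endswith(b".\r\n"):
        raise ValueError("missing multi-line terminator")
    lines = wire[:-3].split(b"\r\n")
    lines.pop()  # empty piece after the last CRLF
    return b"".join(
        (line[1:] if line.startswith(b".") else line) + b"\r\n" for line in lines
    )

message = b"Subject: x\n\n.hidden\nbody\n"
wire = frame_multiline(message)
restored = unframe_multiline(wire)
print(wire)
print(restored)
```

Under this framing a client that strips the terminator and undoes the stuffing gets back exactly one line break after the last line, so a second trailing LF in the stored file would have to be added somewhere after that step.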
Comment 43 (Assignee) • 10 days ago
Back to the original mbox, you said:
> I took a closer look at the "f**k you" envelope and the envelope of the new suspicious msf entry, and I notice some differences compared to most other envelopes in the mbox - the custom headers injected by Thunderbird (i.e. X-Mozilla-Status etc.) have CRLFs on the tail instead of just an LF, which is the case of almost all envelopes in the mbox.
> Unfortunately after tidying that up so that all of those headers are in parity (just an LF at the end) there is still no change in the behavior.
Did you repair the folder after editing the mbox file? Or, alternately, delete the .msf and let TB re-create it?
I don't know much about line endings for osx. Maybe "classic" used CRLF and these were created back then? Anyhow, on Linux (which just uses LF) I see CRLF only at the end of each message in the mbox and after the "From - ....". Otherwise all lines just end in LF. This shows a part of my perfectly working POP3 mbox:
:
X19fX19fX19fX19fXwpGcmVlc3VyZmVyIG1haWxpbmcgbGlzdApGcmVlc3VyZmVyQG5tci5tZ2gu
aGFydmFyZC5lZHUKaHR0cHM6Ly9tYWlsLm5tci5tZ2guaGFydmFyZC5lZHUvbWFpbG1hbi9saXN0
aW5mby9mcmVlc3VyZmVy
^M
From - Mon Nov 11 02:56:04 2024^M
X-Mozilla-Status: 0000
X-Mozilla-Status2: 00000000
X-Mozilla-Keys:
X-Account-Key: account79
X-UIDL: 2740
:
Where ^M indicates CRLF in the Vim editor; all other lines just end with LF. From what I understand, modern macOS should be the same.
FWIW, I don't right off see any problem with your maildir file snippet. Was going to maybe suggest converting to maildir as a "hail mary" desperate action. :).
You mention a "POP3 daemon". I assume that is your POP3 server? (Wrote this before I received comment 42.)
> It finalizes transmission of envelope with \r\n.\r\n. I don't know POP3 very well, is this aligned with specification?
Yes, that sounds correct. Maybe that's why every message in mbox ends with CRLF.
Edit: Sorry, sent with formatting messed up. Fixed it.
Comment 44 (Reporter) • 10 days ago
(In reply to gene smith from comment #43)
> Did you repair the folder after editing the mbox file? Or, alternately, delete the .msf and let TB re-create it?
> I don't know much about line endings for osx. Maybe "classic" used CRLF? and these were created back then? Anyhow, on linux (just uses LF) I see CRLF only at the end of each message in mbox and after the "From - ....". Otherwise all lines just end in LF. This shows a part of my perfectly working POP3 mbox:
> :
> X19fX19fX19fX19fXwpGcmVlc3VyZmVyIG1haWxpbmcgbGlzdApGcmVlc3VyZmVyQG5tci5tZ2gu
> aGFydmFyZC5lZHUKaHR0cHM6Ly9tYWlsLm5tci5tZ2guaGFydmFyZC5lZHUvbWFpbG1hbi9saXN0
> aW5mby9mcmVlc3VyZmVy
> ^M
> From - Mon Nov 11 02:56:04 2024^M
> X-Mozilla-Status: 0000
> X-Mozilla-Status2: 00000000
> X-Mozilla-Keys:
> X-Account-Key: account79
> X-UIDL: 2740
> :
> Where ^M indicates CRLF in VIM editor; all other lines just end with LF. From what I understand, modern macOS should be the same.
> FWIW, I don't right off see any problem with your maildir file snippet. Was going to maybe suggest converting to maildir as a "hail mary" desperate action. :).
> You mention a "POP3 daemon". I assume that is your POP3 server? (Wrote this before received comment 42.)
> > It finalizes transmission of envelope with \r\n.\r\n. I don't know POP3 very well, is this aligned with specification?
> Yes, that sounds correct. Maybe that's why every message in mbox ends with CRLF.
I removed the .msf and also manually repaired the mbox after that. Same result, the "broken" e-mails were corrected, everything returned to order, next new e-mail ended up in the same blank/1970 fashion.
All the mails in my "weird" mbox file, after correcting a few culprits, are in alignment with your observation then. Every mbox separator and every envelope ends with CRLF. So still no apparent problem.
Yes I am the admin of the server and the mail setup, I have full access to the configs and everything arriving into the inboxes etc. It's OpenBSD 7.6 with OpenSMTPd and Little Peepo.
Comment 45•10 days ago
So however you set up TB 128 with POP (mbox/maildir; from scratch/import), this is always the result:
New messages get downloaded and have an "empty" entry in the message list, as can be seen here: https://i.imgur.com/jpgQY8G.jpeg
Repair fixes the issue, but the next download breaks it again.
The downloaded messages do get registered in the .msf, when you click on the message in the list, the correct message is displayed, either by positioning to the correct offset in the mbox file or using the correct maildir file.
A reply quotes the body correctly, but the header information is wrong. Additionally, analysing the .msf gives the right information, subject, sender, recipient, all there, see comment 28 for the last .msf extract.
Have you run POP debugging to check whether there's anything reported? Set pref mailnews.pop3.loglevel to All, then check the error console. Likely that won't yield any insight since the message is received and stored.
I'm at the end of my wits.
Reporter
Comment 46•10 days ago
(In reply to Francesco from comment #45)
I set the loglevel to All and fetched another envelope, the saved log from Thunderbird is here: https://paste.debian.net/hidden/3405ceb4/
(Formatting looks a bit wrong but that is how TB created the log file)
Comment 47•10 days ago
Very legible but nothing interesting. There's also Mbox debugging, but that's irrelevant given that you switched to maildir. The maildir file has the correct content, I assume. I forgot to ask whether there's an issue when you don't use the filter, that's really the last idea I have.
Reporter
Comment 48•10 days ago
(In reply to Francesco from comment #47)
Very legible but nothing interesting. There's also Mbox debugging, but that's irrelevant given that you switched to maildir. The maildir file has the correct content, I assume. I forgot to ask whether there's an issue when you don't use the filter, that's really the last idea I have.
The maildir envelope files are correct, with the previously observed difference that they end with two LFs instead of one LF like the original files on the server. Currently no filter is configured. Basically it's just POP+SMTP accounts configured, and some generic changes in the settings. Is there a way for me to obtain a differential on the config, in case some general setting I changed is causing this?
Comment 49•10 days ago
It's a baffling one, and I don't have any instant insights here...
Some general observations:
- The "storeToken" in the .msf file should always point exactly at the "From " separator line for the message. If that's the case, the code should be able to stream out the message from the mbox without a problem, stopping at the next "From " line, or the end of the file.
- The messageSize in the .msf file isn't used to read the message out of the mbox (although a check has gone in recently to throw an error if the read goes 10% past that length (I think), as a sanity check).
- The line endings are probably a red herring - the code generally requires an LF, and will ignore any CRs. So CRLF is fine. LF is fine. CR-only is not.
- Emails in mbox format are supposed to have a blank line between them - that's part of the mbox container format and not part of the message body. However the code should be able to read mbox files without the blank line between emails just fine (mbox is really a family of formats, so there are a few hazy bits. The idea is to be strict about what we write, but accepting of variations when we read).
Some questions:
- Does it still happen if you disable all the filters?
- Does it happen on a fresh account, without the original import? (I realise that's probably a bit of a pain to try out!)
It really sounds like we should be able to reproduce this issue. I'm busy on other stuff right now but will try and have a go at it tomorrow (I'll leave the NI on it to remind me!).
(EDIT: Sorry, missed the bit about starting from scratch with maildir! Curiouser and curiouser...)
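The read path described in the observations above (start at the "From " separator that storeToken points to, stop at the next "From " line or at end of file, treat LF as the line break and tolerate CRs) can be sketched roughly like this in Python. This is an illustration under stated assumptions, not Thunderbird's code: `read_message_at` and the sample data are hypothetical, and the sketch ignores ">From " quoting of body lines.

```python
def read_message_at(mbox: bytes, offset: int) -> bytes:
    """Return the message whose 'From ' separator line starts at `offset`.

    Reading stops just before the next line that starts with 'From ',
    or at the end of the data. Only LF is treated as a line break.
    """
    if not mbox.startswith(b"From ", offset):
        raise ValueError("offset does not point at a 'From ' separator line")
    sep_end = mbox.find(b"\n", offset)        # LF that ends the separator line
    if sep_end == -1:
        return b""                            # separator only, no content
    next_sep = mbox.find(b"\nFrom ", sep_end) # next separator at a line start
    end = len(mbox) if next_sep == -1 else next_sep + 1
    return mbox[sep_end + 1:end]


# Hypothetical two-message mbox with the line endings seen in this bug.
sample_mbox = (
    b"From - Mon Nov 11 02:56:04 2024\r\n"
    b"Subject: one\n"
    b"\n"
    b"body one\n"
    b"\r\n"
    b"From - Mon Nov 11 03:10:00 2024\r\n"
    b"Subject: two\n"
    b"\n"
    b"body two\n"
    b"\r\n"
)

print(read_message_at(sample_mbox, 0))
```

An offset that does not land on a separator raises `ValueError` here, which mirrors the point that the token has to point exactly at the "From " line for the read to work.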
Comment 50•10 days ago
Ben, the NI was for comment 30: Why does a repair not remove the old msgOffset? You need to actually delete the .msf to get rid of it. Bug 1920329 comment 37 asked the same thing, but additionally: why does storeToken/offlineMsgSize not get removed when the message is not offline?
Comment 51•9 days ago
The code which removed .messageOffset didn't do a data-migration phase (it didn't really change the DB usage at all, just made msgOffset redundant). Might be worth adding back in, but it's not too big a deal - nothing accesses it now.
The storeToken/offlineMsgSize not being removed is more of an IMAP-specific thing. IMAP tends to use the Offline flag of the message to tell that there's a local copy of it. (There are a few other differences between IMAP and local/POP use of local messages, and they really do need to be properly unified.)
Comment 52•9 days ago
The code which removed .messageOffset didn't do a data-migration phase ...
Sure, but modern "repair" could/should remove it, as it could clean up the IMAP entries which are not offline.
Reporter
Comment 53•7 days ago
Any insights, suggestions? At this juncture I am willing to create test accounts on the MTA/POP3 on my mail server for someone else in order to see if they can reproduce this and spot something I'm not seeing.
Comment 54•7 days ago
I'm sure Gene would be happy to try it on your server.
Assignee
Comment 55•7 days ago
(In reply to Francesco from comment #54)
I'm sure Gene would be happy to try it on your server.
Sure, no problem. You can send the credentials via my profile email address if you want.
Reporter
Comment 56•6 days ago
(In reply to gene smith from comment #55)
Sure, no problem. You can send the credentials via my profile email address if you want.
Sent. Thanks for looking into it.
Assignee
Comment 57•6 days ago
I've confirmed the issue via the test account provided by the reporter. However, I have not yet researched what is causing it.
Assignee
Comment 58•4 days ago
The first thing I noticed, looking with the network sniffer Wireshark, is that when a message is downloaded with the POP3 RETR command, each line of data is terminated with \n (LF). The POP3 RFC specifies that each response line terminates with \r\n (CRLF). However, responses to other commands generated directly by the server (e.g., CAPA, LIST, UIDL etc.) are correctly terminated with CRLF. Also, the server correctly ends the RETR response with \r\n.\r\n (CRLF.CRLF). (I didn't test it, but I expect the server's TOP response has the same issue.)
Currently TB expects CRLF as line breaks, so it was seeing the whole message as one long line. I made a non-trivial change to LineReader.sys.mjs so that just LF is treated as valid and that fixed the problem.
I then wondered whether the problem exists in older TB versions. I have a computer still running TB 68 and it worked OK with the test account.
I then noticed that the reporter had mentioned that the account was previously running on 115. I wasn't sure if there was a problem with 115, so I tested it and it also worked OK with the test account.
I then moved the 115 pop3 parsing and protocol JS code (LineReader.sys.mjs and Pop3Client.sys.mjs) into daily and it also had the same problem.
Then I went back to unchanged daily and ran with 115 nsPop3Sink.cpp and it worked OK. I then compared the nsPop3Sink.cpp code between daily and 115 and noticed one change pertaining to line endings that has been removed in daily (and is also not in 128). With this change https://phabricator.services.mozilla.com/D190226, the call to write CRLF (MSG_LINEBREAK) was removed in mailnews/local/src/nsPop3Sink.cpp at line 579 with no explanation that I could find. Putting this back in, as seen in the attached patch, fixes the bug without doing any other changes.
Maybe there's a good reason to remove the write of a final CRLF so I'll ask Ben. (My guess would be that since all POP3 lines are supposed to terminate with CRLF, there is no reason to write an extra one at the end of file.)
Edit: The write of MSG_LINEBREAK that the proposed patch puts back in writes CRLF only on Windows; on all other platforms it writes just LF. Also, there is an additional write of CRLF on all platforms ending each message record in the mbox file. See comment 63 for an example.
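To illustrate why a reader that only breaks on CRLF sees the whole message as "one long line" when the server sends LF-terminated payload lines, here is a small Python sketch. The helper names and the sample payload are hypothetical; this is not the LineReader.sys.mjs code, just the same idea in miniature.

```python
def split_crlf_only(payload: bytes) -> list:
    """Split strictly on CRLF, as a reader expecting RFC line endings would."""
    return payload.split(b"\r\n")


def split_lf_tolerant(payload: bytes) -> list:
    """Split on LF and drop any CR that preceded it."""
    return [line.rstrip(b"\r") for line in payload.split(b"\n")]


# Hypothetical RETR payload: LF inside the message, CRLF.CRLF at the end.
retr_payload = b"Subject: hi\nFrom: a@example.org\n\nlast text\n\r\n.\r\n"

strict = split_crlf_only(retr_payload)
tolerant = split_lf_tolerant(retr_payload)

print(len(strict), strict[0])   # the entire message arrives as a single "line"
print(tolerant[:4])             # headers, the blank separator line, then body
```

With the strict split there is never an empty line between the headers and the body, so anything that waits for that blank line to finish header parsing will not fire.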
Comment 59•4 days ago
(In reply to gene smith from comment #58)
Maybe there's a good reason to remove the write of a final CRLF so I'll ask Ben. (My guess would be that since all POP3 lines are supposed to terminate with CRLF, there is no reason to write an extra one at the end of file.)
Oh, that's interesting! I removed the final CRLF write as it's part of the POP protocol, not part of the message, so it shouldn't be written to the mbox. But that data is also being fed to the nsParseNewMailState object, and I'm guessing that the parser is relying on that extra CRLF.
I'd say we apply your patch and put it back in, for now at least. I don't think it'll screw up the mbox stuff, and if it fixes the issue until nsParseNewMailState can be fixed/replaced, that's a good thing.
Assignee
Comment 60•4 days ago
... I don't think it'll screw up the mbox stuff ...
I tested it with a "conformant" server that does terminate RETR response lines with CRLF and it also works OK. So I'll go ahead and submit a formal patch.
Reporter
Comment 61•4 days ago
(In reply to gene smith from comment #58)
The first thing I noticed, looking with network sniffer wireshark, is that when a message is downloaded with the pop3 RETR command, each line of data is terminated with \n (LF). The POP3 RFC specifies that each response line terminates with \r\n (CRLF).
I briefly read parts of the RFC just now and it's not entirely clear to me that CRLF during multi-line responses is required also for the actual e-mail payload sent during RETR. This would mean that POP3 server software should, for transit, alter the envelope's contents such that it is no longer binary identical to what was received/written to disk by the SMTP software, where only an LF is used between headers and more.
Would it be possible to sniff the traffic coming back from dominant POP3 server software, for example Dovecot or whatever is used by Gmail or Hotmail etc. to see if they are in parity? Despite TB 128 being the sudden odd man out - TB <=115, Apple Mail, Mutt etc. don't manifest this bug - I now worry about any eventual changes happening to TB not to fix an apparent bug but to accommodate the hiccups I am having, because the POP3 server software I use is a minority, in fact written specifically for me by a friend over a decade ago, and I think there are extremely few users of it.
Comment 62•4 days ago
By the time that extra linefeed is added in, the POP3 end-of-message has already been detected by the protocol handling code. It's just the nsParseNewMailState object which is relying on it. That class parses the message headers to create a database entry, and applies filter rules.
It should not be relying on POP3 protocol quirks, and generally needs some major refactoring. The filter aspect should be split out for a start.
See Bug 1876407.
Assignee
Comment 63•3 days ago
Reporter minipudding wrote:
I briefly read parts of the RFC just now and it's not entirely clear to me that CRLF during multi-line responses is required also for the actual e-mail payload sent during RETR.
The RFC states that the RETR command produces a "multi-line" response. And it says each line of a multi-line response ends with CRLF. Then again, we might look at the message as a single line containing random LFs, and since the server appends a final CRLF.CRLF it is following the letter of the spec. This can result in a very long line, and I can't find that the RFC ever specifies a max line length, so it is still maybe within the spec. (Although I think there is an "Internet Message" RFC that recommends a line length of not more than 78 and a max of 998, but I'm not sure it applies to this.)
Anyhow, I sent a large message (about 5 MB) from Gmail to an ISP POP3 account, to Outlook/Hotmail and to your server. Only your server has the payload line endings as LF, while the ISP and Outlook show the line endings as CRLF when looking with Wireshark. (Might be easy to fix in the Perl code but I don't know Perl.)
Ben wrote:
By the time that extra linefeed is added in, the POP3 end-of-message has already been detected by the protocol handling code.
The reporter's server is sending this on wire after the "last text" of the message:
last text\n\r\n.\r\n
Without the patch, this ends up at the bottom of the Inbox mbox file and causes the problem:
last text\n\n\r\n
With the patch (put back the write of MSG_LINEBREAK), this ends up at bottom of mbox and there is no problem:
last text\n\n\n\r\n
Don't know why we end up with 3 LFs here or why adding one fixes the bug?
The final \r\n in both is put in by the call to m_msgStore->FinishNewMessage(m_outFileStream, hdr); a bit later in IncorporateComplete().
Reporter
Comment 64•3 days ago
OK, then I suppose it's confirmed that there is an actual bug and change of behavior in TB.
So TB always adds one line-break... which it can't stomach in my particular case... but adding two of them is passable. I don't understand why TB forcefully adds an extra line-break in the first place, if the transmitted envelope already has one.
Wouldn't it be "more correct" if it did not alter the envelope at all unless needed? How would TB fare if it made sure there is just one line-break before the terminator? For example, if the envelope's last line of data before the byte-stuffed EOF (\r\n.\r\n) is "last text\n", TB just finalizes the envelope file as-is along with CRLF termination:
last text\n\r\n
...and if the envelope comes in without a line-break before the byte-stuffed EOF ("last text"), TB helps out by adding one, then terminates the file with the usual CRLF so that it comes out the same way.
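The proposal above can be written down as a tiny Python sketch. This only models the suggestion from this comment, not anything Thunderbird currently does; `finalize_message` is a hypothetical name, and the sketch deliberately leaves existing trailing blank lines alone rather than collapsing them.

```python
def finalize_message(payload: bytes) -> bytes:
    """Make sure the payload ends with a line break, then add the CRLF
    that terminates the message record in the mbox."""
    if not payload.endswith(b"\n"):
        payload += b"\n"          # message arrived without a final line break
    return payload + b"\r\n"      # record terminator


print(finalize_message(b"last text\n"))   # already has a line break
print(finalize_message(b"last text"))     # gets one added
```

Both inputs come out as the same bytes, which is the "comes out the same way" property asked for above.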
Assignee
Comment 65•2 days ago
POP3 server sends message body multi-line responses with lines terminated
with just LF instead of the RFC specified CRLF. This did not cause a
problem until 128esr.
Assignee
Comment 66•2 days ago
Explanation of comment 65 patch:
Turns out that the extra write of a \n wasn't really doing anything. However, as part of the write, it also triggered a parse of the headers. Without the write of the \n in IncorporateComplete(), the headers were never parsed because we never detect the blank line between the headers and the message body due to the \n line endings. So now, instead of doing the write of \n, I trick the parser into just doing the header parse by giving it a blank/empty string in IncorporateComplete().
I also removed some dead code and improved a comment.
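The mechanism explained above can be modelled with a toy parser in Python. To be clear, this is not nsParseNewMailState: `ToyHeaderParser` is a made-up class that only assumes what the explanation states, namely that header parsing is triggered when a blank line is fed in, and that feeding an empty string at the end forces that parse. It does not handle folded headers.

```python
class ToyHeaderParser:
    """Buffers lines until a blank line arrives, then parses the headers."""

    def __init__(self):
        self.buffer = []      # raw data fed so far
        self.headers = None   # stays None until the blank line is seen

    def feed(self, line: bytes) -> None:
        if self.headers is not None:
            return                          # headers done, ignore body lines
        if line.strip(b"\r\n") == b"":
            self._parse_headers()           # blank line: end of headers
        else:
            self.buffer.append(line)

    def _parse_headers(self) -> None:
        self.headers = {}
        for raw in b"".join(self.buffer).splitlines():
            if not raw.strip():
                break                       # reached the body
            name, sep, value = raw.partition(b":")
            if sep:
                self.headers[name.strip().lower()] = value.strip()


# The LF-only message reaches the parser as one long "line".
one_long_line = b"Subject: hi\nFrom: a@example.org\n\nlast text\n"

parser = ToyHeaderParser()
parser.feed(one_long_line)
print(parser.headers)     # None: no blank line was ever seen

parser.feed(b"")          # the "trick": an empty string forces the parse
print(parser.headers)
```

Before the empty string is fed, the toy parser has no subject, sender or date to offer, which corresponds to the blank entry in the message list.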
Assignee
Comment 67•10 hours ago
Re: comment 58, 2nd paragraph.
This low-level patch of the line handling code also fixes the problem. It just allows LF line endings in the server response to be accepted as valid and converts them to CRLF, so the blank line between headers and body is correctly detected and the headers are parsed. Not sure if this is the "non-tricky" fix Ben was referring to in his moz-phab comment.
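The conversion described above (accept bare LF and turn it into CRLF) can be sketched in Python as follows. This is not the actual patch; `normalize_to_crlf` is a hypothetical helper that works on a complete buffer, so it does not deal with a CR and LF that arrive in separate network chunks.

```python
import re

# An LF that is not already preceded by a CR.
_BARE_LF = re.compile(rb"(?<!\r)\n")


def normalize_to_crlf(data: bytes) -> bytes:
    """Convert bare LF line endings to CRLF, leaving existing CRLF alone."""
    return _BARE_LF.sub(b"\r\n", data)


wire = b"Subject: hi\n\nlast text\n\r\n.\r\n"
fixed = normalize_to_crlf(wire)

print(fixed)
print(fixed.split(b"\r\n"))   # now contains the blank header/body separator
```

After normalization, a strict CRLF split yields an empty line between the headers and the body, which is what allows the header parse to be triggered.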