Open Bug 1513093 Opened 6 years ago Updated 2 years ago

Missing Message-Ids are being synthesized (with md5 checksum)

Categories

(Thunderbird :: General, defect)

defect

Tracking

(Not tracked)

People

(Reporter: benc, Unassigned)

Details

For emails missing the `Message-Id` field, TB is generating a stand-in using an md5 checksum (eg "<md5:PRAjuyzOD2qayYqLry%2f+2Q==>"). So a message which came in with a missing `Message-Id` will magically have one when GetMessageId() is called upon it. `Message-Id` is an optional header, according to RFC2822. While this shouldn't be too big a deal - the fake ID is only used internally and the original message source isn't tampered with - it _does_ lead to some unexpected side effects (eg Bug 1259040). The fake ID is generated in `nsParseMailMessageState::FinalizeHeaders()`: https://searchfox.org/comm-central/source/mailnews/local/src/nsParseMailbox.cpp#1478 Ideally the MessageId could just be left blank is missing, but there appears to be at least one place which relies on it as a unique key, `nsImapMailDatabase::UpdatePendingAttributes`: https://searchfox.org/comm-central/source/mailnews/db/msgdb/src/nsImapMailDatabase.cpp#113
And at least one case of some code specifically checking for "md5:" messageIds: https://searchfox.org/comm-central/source/mail/base/content/mailWindowOverlay.js#3307
Please always paste permalinks for source code. The links you've pasted here could be invalid tomorrow :-(
Ahh - good point! updated links: The fake ID is generated in `nsParseMailMessageState::FinalizeHeaders(): https://searchfox.org/comm-central/rev/c0aedb277de2fec61681019e46d323d66ff7f5aa/mailnews/local/src/nsParseMailbox.cpp#1478 nsImapMailDatabase::UpdatePendingAttributes() is one place that seems to rely on a unique key: https://searchfox.org/comm-central/rev/c0aedb277de2fec61681019e46d323d66ff7f5aa/mailnews/db/msgdb/src/nsImapMailDatabase.cpp#113 And at least one case of some code specifically checking for "md5:" messageIds: https://searchfox.org/comm-central/rev/c0aedb277de2fec61681019e46d323d66ff7f5aa/mail/base/content/mailWindowOverlay.js#3307 (not a comprehensive survey - likely there are other places which rely on the md5 fake id)
IIRC there were all kinds of issues when the messageId wasn't there, so I doubt this is easy to change.
(In reply to Magnus Melin [:mkmelin] from comment #4) > IIRC there were all kinds of issues when the messageId wasn't there, so I > doubt this is easy to change. I suspect you're right. A fix to aspire to. I'm a little concerned that an email's Message-ID isn't guaranteed to be unique anyway (eg messages cross-posted to multiple mailing lists), but if nobody has noticed any problems so far it's probably not too big a deal :-) Brain dump on fixing it: Add another attribute (`uniq`, say). Set it to use the `messageID` if there is one, otherwise stash the md5sum there instead and leave the `messageId` attr empty. Then switch over any code which absolutely must have some unique message key to use `uniq` instead of `messageId`.
About uniqueness - in the wild I doubt there are actually two different mails you'd encounter with the same id. But we need to cope since it's sender data, and misconfiguration or malicious intent could produce them. The real world case where you will find duplicate ids is that you can have several copies of the same mail in a folder.
(In reply to Magnus Melin [:mkmelin] from comment #6) > The real world case where you will find duplicate ids is that you can have > several copies of the same mail in a folder. Yes, I was going to ask about this - what is TB's goal in this case? Are the duplicate emails shown as a single email or does each one appear individually?
I think we need to show both. Of course, for any use case where you'd for instance lookup in file system by message id for instance, it doesn't matter which of the mails we'd show.
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.