Closed Bug 472000 Opened 16 years ago Closed 2 years ago

Bad parsing of message boundaries corrupts messages index in eMail browser

Categories

(MailNews Core :: Backend, defect)

1.8 Branch
x86
Windows XP
defect
Not set
major

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: billc, Unassigned)

References

(Blocks 1 open bug, )

Details

(Keywords: testcase)

Attachments

(1 file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.5) Gecko/2008120122 Firefox/3.0.5
Build Identifier: version 2.0.0.19 (20081209) <== latest build available

Each eMail message is delimited at its beginning by a record whose information
fields act as anchors, for example:
  From - Sat Jan 03 14:11:29 2009
where the "From - " field starts in column 1 of the record.
With such a scheme, it is *mandatory* that the message boundary parser obeys
this delimiter record format *exactly*, or else the eMail browser's message
index lists get corrupted when a message body happens to contain lines that
also start with "From " in column 1! Thunderbird appears *not* to obey this
beginning-of-message delimiter format and thereby creates corrupted message
index lists for browsing!
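
To illustrate the failure mode described above, here is a minimal sketch of a boundary rule that keys only on "From " in column 1. It is not Thunderbird's actual parser; the sample text and the function name count_messages_naive are made up for illustration.

```python
# Hypothetical sample: two real messages, where the first body happens to
# contain a line that starts with "From " in column 1.
SAMPLE = (
    "From - Sat Jan 03 14:11:29 2009\n"
    "Subject: newsletter\n"
    "\n"
    "From now on, prices are lower.\n"
    "\n"
    "From - Sat Jan 03 14:12:00 2009\n"
    "Subject: second\n"
    "\n"
    "Hello.\n"
)


def count_messages_naive(mbox_text: str) -> int:
    """Count one message for every line that starts with 'From ' in column 1."""
    return sum(1 for line in mbox_text.splitlines() if line.startswith("From "))


# Prints 3 although SAMPLE holds only 2 messages: the body line is miscounted.
print(count_messages_naive(SAMPLE))
```

Under this rule every unescaped body line that starts with "From " produces one bogus index entry.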

Reproducible: Always

Steps to Reproduce:
Download and use Thunderbird to open the files "Inbox" and "Inbox.msf" from the
URL posted above...there is also a file "Image01.jpg" to show the screen image
of how Thunderbird incorrectly parses the "Inbox" file to generate a bad message
index list for browsing...

Actual Results:  
The message index list shows FIVE entries when there are in actuality only TWO
messages (the first TWO entries in the index list)! See image file from the URL
posted above...

Expected Results:  
The message index list should show only TWO messages (the first TWO entries in 
the index list) rather than FIVE as actually shown. The last THREE entries are
BOGUS!

The fix is to have the beginning of messages delimiter be parsed based upon ALL
the information fields rather than just keying on "From " being located at
column 1 of the record!

    From - Sat Jan 03 14:11:29 2009

Also key on Day-of-Week, Month-of-Year, Day-of-Month, hh:mm:ss, and Year fields
rather than only the first field of this record!
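
A minimal sketch of such a strict check, assuming the delimiter always has exactly the shape of the sample record above (English day and month abbreviations, two-digit day of month, four-digit year). The names SEPARATOR and is_separator are hypothetical, and this is not how Thunderbird implements it.

```python
import re

# Assumed layout, taken from the sample record:
#   From - Sat Jan 03 14:11:29 2009
SEPARATOR = re.compile(
    r"^From - "
    r"(Mon|Tue|Wed|Thu|Fri|Sat|Sun) "
    r"(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec) "
    r"\d{2} \d{2}:\d{2}:\d{2} \d{4}$"
)


def is_separator(line: str) -> bool:
    """Return True only if the whole line matches the assumed delimiter layout."""
    return SEPARATOR.match(line.rstrip("\r\n")) is not None


print(is_separator("From - Sat Jan 03 14:11:29 2009"))
print(is_separator("From now on, prices are lower."))
```

This only narrows the problem: a body line that reproduces the full delimiter layout would still be taken for a message boundary.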
Version: unspecified → 2.0
I thought that, by convention, if the word 'From' was in column 1, it should be escaped as '>From'? These messages are not formed like that. I'm not sure if it needs to be done in MIME parts.
Hmm...for some reason, even if such a convention were adopted, I do still get
lots of eMail messages that are *not* escaped and just occur in the body of 
the message as-is in column 1...so there still needs to be some way to ensure
the integrity of message boundaries?
(In reply to comment #2)
> I do still get lots of eMail messages that are *not* escaped and just occur in the body of the message as-is in column 1...

Read through the following documents. (The Wikipedia mbox page can be found by a Google search for "Unix Mbox".)
> http://en.wikipedia.org/wiki/Mbox
> http://www.qmail.org/man/man5/mbox.html
> http://homepages.tesco.net./~J.deBoynePollard/FGA/mail-mbox-formats.html

Because of the "Unix Mbox" spec (which I believe is a flaw in the Unix Mbox design), Tb escapes "From " in column 1 of a line as ">From " when it writes mail data to its mail folder file (Unix Mbox format) upon mail download. (" From ", escaped by a leading ' ', is probably used for draft/unsent mail with format=flowed.)
If a user copies a Unix mbox style file to Tb's mail directory, the user MUST escape the "From " lines before Tb uses it as a mail folder file in Unix Mbox format.
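
The escaping step can be sketched as follows. This is a minimal illustration of ">From " quoting applied to a message body before it is appended to an mbox file, not Tb's actual code; the function name quote_from_lines is hypothetical.

```python
def quote_from_lines(body: str) -> str:
    """Prefix '>' to every body line that starts with 'From ' in column 1."""
    quoted = []
    for line in body.splitlines(keepends=True):
        if line.startswith("From "):
            line = ">" + line
        quoted.append(line)
    return "".join(quoted)


print(quote_from_lines("Hi\nFrom now on, prices are lower.\nBye\n"))
```

Note that this simple variant cannot be undone reliably, because a body line that already read ">From " before quoting looks the same as a quoted one afterwards.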

(In reply to comment #2)
> so there still needs to be some way to ensure the integrity of message boundaries?

There is no way to escape the "From " line issue of "Unix Mbox" unless Tb stops using "Unix Mbox", or stops providing compatibility with other software that expects Tb to use the popular "Unix Mbox" format, including former versions of Tb, Seamonkey, Mozilla App Suite, Mozilla, and Netscape.
Apple Mail 2 already moved from "Unix Mbox" (multiple mails in one file, as used by Apple Mail 1) to ".emlx" (same as ".eml", a single mail per file). AFAIR, Opera (M2) also changed from "multiple mails per file" to "single mail per file".
I think one of the reasons for the move was the "From " line issue of the "Unix Mbox" format.
Just to confirm that the sample inbox indexes in the same way with:
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.1b3pre) Gecko/20090103 Lightning/1.0pre Shredder/3.0b2pre ID:20090103031218

Interesting that when viewed as an .eml the newsletter views fine.
Component: General → Backend
Product: Thunderbird → MailNews Core
QA Contact: general → backend
Version: 2.0 → 1.8 Branch
wada, where does this bug fit in your grand list of issues?
How do we move this forward?
Keywords: testcase

Ben, is this covered in other bugs, or no longer needed?

Flags: needinfo?(benc)

I'd say this one could be closed.
Since this bug was reported "From "-quoting has been peppered all over the codebase, so it shouldn't be an issue these days.
(All of which I am in the process of sweeping away as part of Bug 1719121 :-)

Flags: needinfo?(benc)
Status: UNCONFIRMED → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME
See Also: → 1719121