Open Bug 503271 Opened 15 years ago Updated 2 years ago

text/plain messages have crlf on linux when saved as .eml file

Product: Thunderbird :: Message Reader UI (defect)
Platform: x86, Linux
Tracking: Not tracked
Status: REOPENED
Reporter: avi; Assignee: Unassigned
Attachments: 2 files
User-Agent:       Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1) Gecko/20090630 Fedora/3.5-1.fc11 Firefox/3.5
Build Identifier: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.1b3pre) Gecko/20090513 Fedora/3.0-2.3.beta2.fc11 Lightning/1.0pre Thunderbird/3.0b2

Saving a text/plain message results in a file with crlf line endings instead of lf which is the OS convention.

Might be related: selecting text to the end of a line also selects a cr.

Reproducible: Always

Steps to Reproduce:
1. Save a text/plain message to disk

Actual Results:  
message contains carriage returns

Expected Results:  
message does not contain carriage returns
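
To check which convention a saved file actually uses, the following is a minimal Python sketch. It is not part of Thunderbird; the function name `count_line_endings` and the sample bytes are made up for illustration. It counts CRLF pairs, bare LFs and bare CRs in raw bytes:

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare LFs and bare CRs in raw file contents."""
    crlf = data.count(b"\r\n")
    return {
        "crlf": crlf,
        "lf": data.count(b"\n") - crlf,  # LF not preceded by CR
        "cr": data.count(b"\r") - crlf,  # CR not followed by LF
    }


# Illustrative sample: a tiny message with CRLF line endings throughout.
sample = b"Subject: test\r\n\r\nbody line\r\n"
print(count_line_endings(sample))  # {'crlf': 3, 'lf': 0, 'cr': 0}
```

For a real file, read it in binary mode (`open(path, "rb")`) so that Python does not translate newlines before they are counted.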

This is incredibly annoying for someone who uses thunderbird to manage patches.
Version: unspecified → Trunk
> Saving a text/plain message results in a file with crlf line endings

Saving to .eml file? Or to .txt file?
In the "save to .eml" case, converting [CRLF] (the line ending defined by RFC 2822 etc.) in the mail data stream to [LF] (the line ending character of text files on Linux) will produce problems like Bug 500479.
Save to .eml.

Saving to .txt corrupts the message in a different way: headers are split into separate lines.
(Off-Topic)

(In reply to comment #2)
> Saving to .txt corrupts the message in a different way: headers are split into separate lines.

You are experiencing Bug 269812(or Bug 257828 / Bug 261923 which are listed in Dependency Tree for Bug 269812)? See also bugs referred by Bug 269812.
.eml file is not .txt file. I recommend using one of the following commands.
  tr -d '\r' < inputfile > outputfile  # delete every CR, even one not followed by LF
  sed -e 's/\r$//' inputfile > outputfile  # DOS to UNIX (remove the CR at end of line; \r needs GNU sed)
  perl -pe 's/\r\n|\n|\r/\n/g' inputfile > outputfile  # Convert to UNIX
( see http://en.wikipedia.org/wiki/Newline )
I believe "Keeping [CRLF] upon saving to .eml" is the correct action on any OS, because I believe the data in an ".eml" file is basically defined by RFC 2822.
Avi Kivity (bug opener), do you agree to "closing as INVALID"?
Adding ".eml" to bug summary for ease of search.
Summary: text/plain messages have crlf on linux when saved → text/plain messages have crlf on linux when saved as .eml file
Avi - would you mind answering Wada's question ?
(In reply to comment #4)
> .eml file is not .txt file.

OK, I take this. Could we get then selection of type in File/Save As sticky, please? I, like many others, never want to save email as .eml, only .txt.
.eml file ... CRLF ends, otherwise correct
EOLs are correct, but the localized headers and the EOL after every header make it corrupted for most email-processing programs.
Matej Cepl, the correct solution to your problem is to attach patch files to the message, rather than pasting the patch into the message body.
(In reply to comment #13)
> Matej Cepl, the correct solution to your problem is to attach patch files to
> the message, rather than pasting the patch into the message body.

I cannot choose how people are sending patches to me ... and to say on lkml that unfortunately my MUA cannot process their emails so they have follow my convention ...
*they have to
Inlining patches is a _REQUIREMENT_ for many lists.
Status: UNCONFIRMED → RESOLVED
Closed: 15 years ago
Resolution: --- → DUPLICATE
(:Aureliano Buendía), this bug is about an entirely different issue from bug 487115, even if the "line ending character" is relevant to both issues.
  bug 487115 :
   save as plain text file, .txt file
   Actual Results:  
    Subject:
    [PATCH] Build fix: switch from "git-foo" to "git foo" for git 1.6
   Expected Results:  
    Subject: [PATCH] Build fix: switch from "git-foo" to "git foo" for git 1.6
   I guess the bug opener is requesting removal of the "new line" after "From: ".
   AFAIK, Tb writes the OS "new line" when saving a .txt file.
  This bug :
   save as message file, .eml file (please note ".eml != .txt")
   "Line ending" is CRLF. It's different from the OS "new line" (LF on Linux).

Re-opening.
Status: RESOLVED → UNCONFIRMED
Resolution: DUPLICATE → ---
(In reply to comment #10)
> Could we get then selection of type in File/Save As sticky, please?

Matej Cepl, if [CRLF] in .eml is a critical issue for you, please open a separate enhancement request for "line ending selection" in the save dialog. As some text editors which are suitable for programming have such a capability, I think the request itself is reasonable. But please read my comment #4 before opening a bug for such a request.

(In reply to comment #11)
> email saved as .eml file
> .eml file ... CRLF ends, otherwise correct

Was the .eml file saved by Tb on Linux?
Agreed this is invalid. Messages when saved as .eml should be in rfc822 format:

   Messages are divided into lines of characters.  A line is a series of
   characters that is delimited with the two characters carriage-return
   and line-feed; that is, the carriage return (CR) character (ASCII
   value 13) followed immediately by the line feed (LF) character (ASCII
   value 10).  (The carriage-return/line-feed pair is usually written in
   this document as "CRLF".)

->INVALID
Status: UNCONFIRMED → RESOLVED
Closed: 15 years ago
Resolution: --- → INVALID
If you want to quote RFC on me, then please don't forget parts which are not making your life easy. What about 1.1. Scope:

...

     Some message systems may  store  messages  in  formats  that
     differ  from the one specified in this standard.  This specifica-
     tion is intended strictly as a definition of what message content
     format is to be passed BETWEEN hosts.

....

Also note general practice (yes, I know, it is not RFC) as recorded for example in mbox(5) manpage (part of mutt and qmail at least):

    An mbox is a text file containing an arbitrary number of e-mail messages.
    Each message consists of a postmark, followed by an e-mail message
    formatted according to RFC822, RFC2822. The file format is line-oriented.
    Lines are separated by line feed characters (ASCII 10).

I would note that this practice is honored by almost all major email clients on Linux (Evolution, mutt, kmail, are just the ones I've checked).
... and of course aside from the mentioned MUAs, LF-only-ended RFC822 formatted file is the format of /var/spool/mail/$LOGNAME which is THE industry standard defining storage of the mails on Unix machines (supported by all MTAs implementing /usr/sbin/sendmail interface, plus procmail and other MDAs).
.eml messages aren't intended to be mailbox format.

Granted the rfc only covers the format to be passed between hosts. But, for message/rfc822 type files, i don't really see any upsides with making the format platform specific. Why would one NOT follow the only standard that specifically could apply? 

One of the main usage of such files is surely to send to another recipient or another system, which is pretty close to what the rfc says.
(In reply to comment #23)
> Granted the rfc only covers the format to be passed between hosts. But, for
> message/rfc822 type files, i don't really see any upsides with making the
> format platform specific. Why would one NOT follow the only standard that
> specifically could apply? 

First of all let's clear up one thing ... there are NO standards for the local storage on the hard drive, just quasi-standards of what's commonly done.

On the other hand, LF-only mboxes are used by ALL email-related programs on Linux, so it is a pretty deeply entrenched practice.

It is not that this bug asks for the introduction of a new standard. On the contrary, it asks for fixing Thunderbird's behavior as the only unusual MUA on Linux (I say unusual because, as I wrote, there are no standards, just generally followed patterns).

> One of the main usage of such files is surely to send to another recipient or
> another system, which is pretty close to what the rfc says.

Now, concerning upsides, you can try to re-read what I (and others) wrote above, and there is no need to type it once again. Here in the Linux-land users are quite used to further process emails with various tools we have (there is a reason why for example patch mentioned by previous commenters can directly apply patches from email messages).

By imposing Windows practices in the Linux-land and by asking for patches as attachments you not only make the process more complicated for everybody, but you are also making it very clear that you don't care about the local practices.
I fully agree with the original poster and #24. Blatantly ignoring the operating system's newline conventions is a terrible idea. This makes Thunderbird much less usable on affected operating systems than it otherwise would be.

A typical use case to save messages is to read, parse and use them in the operating system Thunderbird is running and failing to properly format the messages forces the user either to use another mail client or run "fromdos" before doing anything else, since every other program is unable to work with the attachments as saved by Thunderbird.

Could someone with required access rights reopen this bug? (I don't think I can do it.)

I'm using Seamonkey (and because of this bug, also Mutt) myself but the problem in Thunderbird is the same.

Furthermore, this is a regression from Seamonkey 1.x.
Per comment 24 and comment 25, reopening. This was closed by mistake.
Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: INVALID → ---
When I read the code a while ago, I found this:

http://mxr.mozilla.org/comm-central/source/mailnews/local/src/nsMailboxProtocol.cpp#645
645       /* When we're sending this line to a converter (ie,
646       it's a message/rfc822) use the local line termination
647       convention, not CRLF.  This makes text articles get
648       saved with the local line terminators.  Since SMTP
649       and NNTP mandate the use of CRLF, it is expected that
650       the local system will convert that to the local line
651       terminator as it is read.
652       */

So I thought the line ending character(s) is LF on Linux.

Does anybody know when the characters became CR+LF on all platforms?
Indeed. I still think this isn't really a bug, though of course an inconvenience for some. If you want to do stuff with a message/rfc822 message, the tool should support the line ending format set by rfc822.

Arguing that "LF-only mboxes are used by ALL email-related programs on Linux" is beside the point, as .eml isn't mbox, it's a file representation of a message whose format is defined by rfc822 (and successors).
> as .eml isn't mbox
It is not passed between hosts as well.
> it's a file representation of a message whose format is defined by rfc822 (and successors).
rfc822 (and successors) doesn't define the file representation of the message. Rather, it explicitly limits the scope to messages to be passed between hosts. Also, mbox is a file representation of rfc822 messages even if it is not .eml.
(In reply to Masatoshi Kimura [:emk] from comment #31)
> > as .eml isn't mbox
> It is not passed between hosts as well.

Well, not "that" way, but commonly sent as attachments to a recipient that may not be on the same platform.

> > it's a file representation of a message whose format is defined by rfc822 (and successors).
> rfc822 (and successors) doesn't define the file representation of the
> message. Rather, it explicitly limits the scope to messages to be passed
> between hosts. Also, mbox is a file representation of rfc822 messages even
> if it is not .eml.

Of course messages can have several file representations, and i don't think the file representation is defined per se for .eml files, but given its MIME type, following rfc822 would be implied imho.
(In reply to Magnus Melin from comment #30)
> .eml isn't mbox, it's a file representation of a message whose format
> is defined by rfc822 (and successors).

Yes, that's a nice legal trick to get out of the bug, but it doesn't matter. Nobody asked you to make this file compliant with message/rfc822. The real question is whether you want to make it useful for your users or whether you want to make life intentionally difficult for them.
Lack of interoperability is what make life difficult for users. You're thinking of it from a developer perspective. An end user couldn't care less - he just opens the message with whatever application can handle .eml files. 

I don't know your use case, but scripts can easily convert line endings before further processing - though i wouldn't consider that end-user activity.
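
As an illustration of such a script, here is a minimal Python sketch. The helper names `crlf_to_lf` and `convert_file` are hypothetical and the paths are whatever the caller passes in. It works on raw bytes so that nothing except the CRLF pairs changes:

```python
def crlf_to_lf(data: bytes) -> bytes:
    """Replace every CRLF pair with a single LF; bare CRs are left alone."""
    return data.replace(b"\r\n", b"\n")


def convert_file(src_path: str, dst_path: str) -> None:
    """Read src_path as raw bytes and write an LF-only copy to dst_path."""
    with open(src_path, "rb") as src:  # binary mode: no newline translation
        data = src.read()
    with open(dst_path, "wb") as dst:
        dst.write(crlf_to_lf(data))
```

Writing to a separate output file keeps the original .eml untouched, which matters if the CRLF form is needed again later.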
(In reply to Magnus Melin from comment #34)
> Lack of interoperability is what make life difficult for users.

WE DON'T TALK ABOUT THE WIRE PROTOCOL HERE, INTEROPERABILITY WITH OTHER OPERATING SYSTEMS DOESN’T MATTER HERE!!!

If there is any interoperability issue, then it is that all text files on my system have LF endings, except for this weird one.

> You're
> thinking of it from a developer perspective. An end user couldn't care less
> - he just opens the message with whatever application can handle .eml files. 

yes, and such application will certainly know that on Unix text files (or text-like files) use LF, not CRLF. Only broken Windows applications badly ported to Unix pretend that everybody should follow Windows standards and use CRLF. I don't think Thunderbird is the one, so I don't understand why you try to keep it broken.

> I don't know your use case, but scripts can easily convert line endings
> before further processing - though i wouldn't consider that end-user
> activity.

Yes, for broken applications which cannot understand the operating system they are on, we have to create workarounds. We can, but these are workarounds around bugs of such applications.
(In reply to Magnus Melin from comment #34)
> Lack of interoperability is what make life difficult for users. You're
> thinking of it from a developer perspective. An end user couldn't care less
> - he just opens the message with whatever application can handle .eml files. 

Fortunately, Thunderbird, Outlook Express and Windows Live Mail can open both CR+LF and LF-only eml file.
Outlook 2010 and Evolution do not have the capability to open eml files.

I do not know about MUAs on Mac OS X.
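
Tolerance for both conventions can also be observed outside mail clients. As a small sketch, assuming only Python's standard `email` package (the message bytes below are invented for the example), the same message parses to the same headers with either line ending:

```python
from email import message_from_bytes

# Invented sample message, once with CRLF and once with LF-only line endings.
crlf_message = b"From: a@example.com\r\nSubject: test\r\n\r\nbody\r\n"
lf_message = crlf_message.replace(b"\r\n", b"\n")

for raw in (crlf_message, lf_message):
    msg = message_from_bytes(raw)
    print(msg["From"], "|", msg["Subject"])
```

This says nothing about how any particular MUA behaves; it only shows that a reader can accept both forms without special handling by the user.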
(In reply to Hiroyuki Ikezoe (:hiro) from comment #36)
> (In reply to Magnus Melin from comment #34)
> > Lack of interoperability is what make life difficult for users.
> Fortunately, Thunderbird, Outlook Express and Windows Live Mail can open
> both CR+LF and LF-only eml file.
> I do not know about MUAs on Mac OS X.

Mail.app opens both CR+LF and LF-only eml files.

(And vim handles them transparently by default, too.  I had to use "vi -b" to see the "^M"s.  ;)

Later,
Blake.
(In reply to Matej Cepl from comment #35)
> (In reply to Magnus Melin from comment #34)
> > Lack of interoperability is what make life difficult for users.
> 
> WE DON'T TALK ABOUT THE WIRE PROTOCOL HERE, INTEROPERABILITY WITH OTHER
> OPERATING SYSTEMS DOESN’T MATTER HERE!!!

Just because *you* don't intend to use it elsewhere doesn't mean people in general won't send it to other people on other OSes. Apparently that's not a big deal on the most common apps though, as they all have had to create workarounds. So if it's decided to use platform line endings it isn't that big a deal - it's not the first time standards would be broken to make something easier.

Out of interest, what does Mail.app use when saving eml files?
(In reply to Magnus Melin from comment #38)
> Out of interest, what does Mail.app use when saving eml files?

Text is LF-only.
Eml is CRLF.

Thunderbird shows LF-only attached emails (in attachments) just fine.

Later,
Blake.
Magnus: What you say about standards compliance is valid in principle, I agree with that. However, no other e-mail program I know of on Linux/Unix uses DOS style newlines. I've also never heard anyone mentioning this would have been a problem.

Instead, the problem of using non-native newlines is a real one.

Also, I don't think it's a very common case that you save a message to disk and then attach it to a new message. On top of that, users of Thunderbird and Seamonkey can attach messages using drag & drop instead, and that's what at least I do most of the time.

In principle the real fix for this problem could perhaps be to provide a new file type (mbox) that would use native OS newlines. And preferably also to choose the default type for it. I still don't think it'd be worth the trouble.

Does someone happen to either know or remember why the convention was changed between Seamonkey 1.x and 2.x? I'm not certain about the Thunderbird versions; likely 2.x and 3.x. Did someone file a bug and complain about it, or was it made to better conform to the RFC?
(In reply to Sakari Ailus from comment #40)
> Magnus: What you say about standards compliance is valid in principle, I
> agree with that. However, no other e-mail program I know of on Linux/Unix
> uses DOS style newlines. I've also never heard anyone mentioning this would

Not as native storage format of course but that's not what we're discussing. Mail.app (unixy!) does use CRLF for eml like we do per comment 39.
(In reply to Sakari Ailus from comment #40)
 
> In principle the real fix for this problem could perhaps be to provide a new
> file type (mbox) that would use native OS newlines. And preferably also to
> choose the default type for it. 

I agree. Evolution does so (saving mail as mbox). The main problem here is Thunderbird has no feature to save mail with headers to disk in text (platform dependent line endings).
Why wasn’t Bug 391810 just reverted when this bug report appeared? There seems to have been no explanation in that other bug for the change, yet more than enough here to suggest it is against the grain. Is a patch worth writing?
Why was this bug re-opened, and why are complaints still repeatedly posted to this bug even though it was once closed as INVALID?
This bug's claim is "CRLF in .eml saved by Tb is invalid and is caused by a flaw in Tb's code", and Severity is set to a value for a flaw in Tb's code. Change by Bug 391810 is reasonable. So, INVALID is a pretty natural closing code.
No one is opposed to a request for an "option to save .eml file with LF or OS line ending", and I think no one rejected it. No one says that it's impossible because RFC 5322 exists.
Why you all don't open bug for Enhancement request for user's convenience?
Here is bugzilla.mozilla.org. Here is not user support center. Please request in appropriate way, instead of posting complaints repeatedly.
(In reply to WADA from comment #45)
> Change by Bug 391810 is reasonable.
Why.

> Why you all don't open bug for Enhancement request for user's convenience?
Users don’t want options, they want their software to just work. This current behaviour works against that for me.

> Here is bugzilla.mozilla.org. Here is not user support center. Please request
> in appropriate way, instead of posting complaints repeatedly.
I asked if its worth making a patch, to know if it might be accepted, there’s no point wasting time if it is just going to be WONTFIXed.
(In reply to John Drinkwater (:beta) from comment #46)
> This current behaviour works against that for me.

Any software is never developed for you only, unless you hire programmer, and pay sufficient money or force him to make program for you by gun :-)

The biggest problems with "OS line ending of .eml always" are:
- If the mail is digitally signed, "CRLF to LF conversion by Tb always" means
  that Tb corrupts the mail data.
- An .eml file is sent as a message/rfc822 part, so it can't be sent in
  quoted-printable/base64, because that is clearly prohibited by the RFC.
  So, if a file saved as .eml by Tb always has LF line endings,
  it means that Tb loses the original CRLF in the mail data.
So, as the standard behaviour of a mailer named Thunderbird, "CRLF to LF conversion always by Tb himself" is not acceptable.
These are the reasons for WONTFIX.
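
The signed-mail concern can be illustrated with a minimal Python sketch. It only compares SHA-256 digests of raw bytes and does not model real S/MIME or OpenPGP verification, which define their own canonicalization rules; the helper name `digest` and the sample bytes are made up for the example:

```python
import hashlib


def digest(raw: bytes) -> str:
    """Return the SHA-256 hex digest of the given raw bytes."""
    return hashlib.sha256(raw).hexdigest()


# Invented sample: the bytes a signer might have hashed, with CRLF endings.
original = b"Content-Type: text/plain\r\n\r\nsigned text\r\n"
converted = original.replace(b"\r\n", b"\n")  # what an LF-only save would store

print(digest(original) == digest(converted))  # False
```

Any check that hashes the stored bytes as-is would therefore fail on the LF-only copy unless the reader restores CRLF first.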

"Optional" doesn't mean only "option upon each save". "hidden prefs such as save_eml_in_OS_Line_Ending" is a possible solution. And, if almost all Linux users can't afford to see "CRLF in .eml file", there is a way; Ship Linux build with save_eml_in_OS_Line_Ending=true.
What is best for your convenience is better analyzed in separate bug.
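
The hidden-pref idea could be sketched as follows in Python. This is purely illustrative and not Thunderbird code: the function `eml_bytes_to_write` and its flag `use_os_line_ending` are hypothetical stand-ins for the `save_eml_in_OS_Line_Ending` pref named above, which does not exist as far as this thread shows:

```python
import os


def eml_bytes_to_write(raw: bytes, use_os_line_ending: bool = False) -> bytes:
    """Return the bytes to store for a message received with CRLF endings.

    By default the message is kept verbatim. Only when the (hypothetical)
    option is enabled are CRLF pairs replaced by the platform's line ending.
    """
    if not use_os_line_ending:
        return raw
    return raw.replace(b"\r\n", os.linesep.encode("ascii"))


message = b"Subject: test\r\n\r\nbody\r\n"
print(eml_bytes_to_write(message) == message)  # True: default keeps CRLF
```

Keeping the verbatim form as the default reflects the objections listed above, while a build could choose a different default for platforms where users expect LF.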

Quick history is as follows.
(i) Initially, .eml file was saved with OS line ending.
(ii) Bug 391810 was fixed to resolve interoperability problems.
(iii) However, it was inconvenient for Linux users, and some users requested "OS line ending" as standard behavior of Tb.
(iv) However, it can't be acceptable.
(v) So, enhancement for "optional OS line ending" is requested.
I knew change of (ii) was done at somewhere, but I didn't know bug number until you pointed bug number in comment #44.
I believe above history is better recorded in crisp bug for enhancement request.
I believe many comments on "why WONTFIX" and "Why shouldn't WONTFIX" are useless for enhancement for user's convenience.
Severity: normal → S3