Closed Bug 219593 Opened 21 years ago Closed 21 years ago

Mozilla corrupts text attachments; Mozilla must send text attachments base64 encoded!

Categories

(MailNews Core :: Attachments, defect)

x86
All
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED WORKSFORME

People

(Reporter: metze, Assigned: sspitzer)

References

Details

User-Agent:       Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.4) Gecko/20030624

Mozilla sends text attachments as follows:

Content-Type: text/plain;name="example-01.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: inline;filename="example-01.diff"

the 7bit encoding will corrupt the line endings of Unix text files, which use '\n'
and not '\r\n' as the newline delimiter.
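
A minimal sketch of the effect described above, in Python. This is not Mozilla code; the sample diff bytes and the `to_crlf` helper are made up for illustration, and the helper only imitates what a text-mode transport or client may do to line endings.

```python
import hashlib

# Hypothetical Unix-style diff content: every line ends with a bare LF.
unix_bytes = b"--- a/file.c\n+++ b/file.c\n@@ -1 +1 @@\n-old\n+new\n"


def to_crlf(data: bytes) -> bytes:
    """Rewrite every line ending as CRLF (illustrative helper)."""
    return b"\r\n".join(data.splitlines()) + b"\r\n"


received = to_crlf(unix_bytes)

# The octets differ, so any checksum taken over the file differs too.
same_bytes = received == unix_bytes
same_digest = (
    hashlib.sha256(received).hexdigest() == hashlib.sha256(unix_bytes).hexdigest()
)
print(same_bytes, same_digest)
```

Both comparisons come out false: the text still reads the same, but the file is no longer byte-identical to what was sent.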

according to RFC 2045-2049, text attachments must be transfer encoded,
and this should be in base64.

also the disposition should be attachment instead of inline.



Reproducible: Always

Steps to Reproduce:
1. send a Unix text attachment and receive it with a Windows client
2. you will see that the line endings are now \r\n

Actual Results:  
attachment is corrupted

Expected Results:  
Mozilla should send text attachments base64 encoded by default.
(maybe a config option to restore the old behavior would be good)

so an "encode text attachments with base64/quoted-printable/none (7bit)" option
would be nice.
also a "send text attachments as inline/attachment" option would be good.
this should be the default:

Content-Type: text/plain;name="example-01.diff"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;filename="example-01.diff"
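
For comparison, here is a rough sketch of producing a part with these headers using Python's standard `email` package. It only illustrates the requested header combination, not how Mozilla builds messages; the subject, body text and diff bytes are invented for the example.

```python
from email.message import EmailMessage

# Hypothetical attachment content with Unix (LF-only) line endings.
diff_bytes = b"-old line\n+new line\n"

msg = EmailMessage()
msg["Subject"] = "patch"
msg.set_content("See the attached diff.")

# Passing bytes with maintype="text" keeps the text/plain label, while
# cte="base64" carries the exact octets. add_attachment() marks the part
# with Content-Disposition: attachment.
msg.add_attachment(
    diff_bytes,
    maintype="text",
    subtype="plain",
    cte="base64",
    filename="example-01.diff",
)

part = next(msg.iter_attachments())
print(part["Content-Type"])
print(part["Content-Transfer-Encoding"])
print(part["Content-Disposition"])

# Decoding the transfer encoding returns the original bytes unchanged.
roundtrip = part.get_payload(decode=True)
```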


this also applies to Thunderbird!
Er ... 

RFC 2045 6.8

  "The Base64 Content-Transfer-Encoding is designed to represent arbitrary 
   sequences of octets in a form that need not be humanly readable."

RFC 2046 4.1.1

  "The canonical form of any MIME 'text' subtype MUST always represent a line  
   break as a CRLF sequence.  Similarly, any occurrence of CRLF in MIME 'text' 
   MUST represent a line break.  Use of CR and LF outside of line break 
   sequences is also forbidden."

There we have that Base64 is intended for octet (binary) data, and that any
content-type that starts with "text" (e.g., text/plain) MUST always represent a
line break as CRLF.
>  "The Base64 Content-Transfer-Encoding is designed to represent arbitrary 
>   sequences of octets in a form that need not be humanly readable."
text attachments don't need to be humanly readable during transport.

only message text should be humanly readable...

>  "The canonical form of any MIME 'text' subtype MUST always represent a line  
>   break as a CRLF sequence.  Similarly, any occurrence of CRLF in MIME 'text' 
>   MUST represent a line break.  Use of CR and LF outside of line break 
>   sequences is also forbidden."
yep, but this also applies to the 'on wire' or in-mailbox raw mail,
and base64 uses lines of no more than 76 chars, which are terminated by CRLF.
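
A small Python check of that point, as a sketch only: the payload is made up, and the LF to CRLF replacement just stands in for what the raw mail would look like on the wire.

```python
import base64

# Hypothetical payload: 200 short Unix-style lines with bare LF endings.
payload = b"line with a bare LF\n" * 200

encoded = base64.encodebytes(payload)      # base64 text, lines end with LF
wire = encoded.replace(b"\n", b"\r\n")     # CRLF-terminated, as in raw mail

# No encoded line is longer than 76 characters.
longest = max(len(line) for line in encoded.splitlines())
print(longest)

# Decoding ignores the line breaks and gives back the original octets.
decoded = base64.b64decode(wire)
print(decoded == payload)
```

So the raw message consists of CRLF-terminated lines, while the bare LFs inside the attachment survive untouched.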

>... and that any
>content-type that starts with "text" (e.g., text/plain) MUST always represent a
>line break as CRLF.
using base64 or quoted-printable does *not* conflict with this!

RFC 2045 page 21/22:
-----------
   Because quoted-printable data is generally assumed to be line-
   oriented, it is to be expected that the representation of the breaks
   between the lines of quoted-printable data may be altered in
   transport, in the same manner that plain text mail has always been
   altered in Internet mail when passing between systems with differing
   newline conventions.  If such alterations are likely to constitute a
   corruption of the data, it is probably more sensible to use the
   base64 encoding rather than the quoted-printable encoding.
-------

RFC 2049 3.:
-------
....
   The following guidelines may be useful to anyone devising a data
   format (media type) that is supposed to survive the widest range of
   networking technologies and known broken MTAs unscathed.  Note that
   anything encoded in the base64 encoding will satisfy these rules, but
   that some well-known mechanisms, notably the UNIX uuencode facility,
   will not.  Note also that anything encoded in the Quoted-Printable
   encoding will survive most gateways intact, but possibly not some
   gateways to systems that use the EBCDIC character set.
....
-------
Please read the examples of how plain unencoded text can be changed in
transport!

RFC 2049 4.:
-------
4.  Canonical Encoding Model

   There was some confusion, in earlier versions of these documents,
   regarding the model for when email data was to be converted to
   canonical form and encoded, and in particular how this process would
   affect the treatment of CRLFs, given that the representation of
   newlines varies greatly from system to system.  For this reason, a
   canonical model for encoding is presented below. 
...
    (1)   Creation of local form.

          The body to be transmitted is created in the system's
          native format.  The native character set is used and,
          where appropriate, local end of line conventions are
          used as well.  The body may be a UNIX-style text file,
          or a Sun raster image, or a VMS indexed file, or audio
          data in a system-dependent format stored only in
          memory, or anything else that corresponds to the local
          model for the representation of some form of
          information.  Fundamentally, the data is created in the
          "native" form that corresponds to the type specified by
          the media type.

    (2)   Conversion to canonical form.
...
    (3)   Apply transfer encoding.

          A Content-Transfer-Encoding appropriate for this body
          is applied.  Note that there is no fixed relationship
          between the media type and the transfer encoding.  In
          particular, it may be appropriate to base the choice of
          base64 or quoted-printable on character frequency
          counts which are specific to a given instance of a
          body.

    (4)   Insertion into entity.
...
   Conversion from entity form to local form is accomplished by
   reversing these steps.
...
   For example, a message with the following
   header fields:

     Content-type: text/foo; charset=bar
     Content-Transfer-Encoding: base64

   must be first represented in the text/foo form, then (if necessary)
   represented in the "bar" character set, and finally transformed via
   the base64 algorithm into a mail-safe form.
------
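
The steps of the model quoted above can be sketched roughly like this in Python. The function names and the `local_eol` parameter are hypothetical, step (4) (insertion into the entity) is left out, and the sketch assumes the input contains only the sender's local line ending.

```python
import base64


def encode_text_entity(local_text: bytes, local_eol: bytes) -> bytes:
    # (1) local_text is the body in the sender's native form.
    # (2) Canonical form: every local line break becomes CRLF.
    canonical = local_text.replace(local_eol, b"\r\n")
    # (3) Apply the transfer encoding (base64 here).
    return base64.encodebytes(canonical)


def decode_text_entity(body: bytes, local_eol: bytes) -> bytes:
    # Reverse the steps: undo base64, then map CRLF to the receiver's
    # local line ending.
    canonical = base64.b64decode(body)
    return canonical.replace(b"\r\n", local_eol)


unix_text = b"first line\nsecond line\n"        # hypothetical Unix file
entity_body = encode_text_entity(unix_text, b"\n")

on_unix = decode_text_entity(entity_body, b"\n")
on_windows = decode_text_entity(entity_body, b"\r\n")
print(on_unix == unix_text, on_windows == unix_text)
```

As I read the model, a text body decoded on Windows comes back with CRLF line endings even when base64 is used, because the change happens in step (2), before the transfer encoding is applied. Getting the exact octets back would mean treating the file as opaque data and skipping step (2).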

Note I only talk of text attachments and not of message text!
And I think an attached file should be in a *mail-safe* form!

At least I think it should be possible to configure Mozilla this way.
(maybe on a per-attachment or global basis)
(maybe leave the default as it is now...)
"only message text should be humanly readable..."

Says who?

I didn't see anything anywhere that said text/plain attachments had to be base64
encoded.

There is a way to make all attachments binary. Set the mail.file_attach_binary
pref to true in about:config.
> text attachments don't need to be humanly readable during transport.

I think a lot of people would disagree with that -- there is a reason why the
code is as it is at the moment, and this is that most people like to read their
text/plain attachments inline.
ok, thanks

user_pref("mail.content_disposition_type", 1);
user_pref("mail.file_attach_binary", true);

is all I need...:-)

maybe it would be useful to set this with the UI...

for me the bug is resolved

thx

metze

PS: can anyone point to documentation of all possible
prefs?
marking as WORKSFORME, as suggested by the reporter
Status: UNCONFIRMED → RESOLVED
Closed: 21 years ago
Resolution: --- → WORKSFORME
Verified.
Status: RESOLVED → VERIFIED
Product: MailNews → Core
Product: Core → MailNews Core