Closed Bug 168098 Opened 22 years ago Closed 22 years ago

Binary attachments will be encoded as quoted-printable and get corrupted in some situations

Categories

(MailNews Core :: Attachments, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: A.Sloman, Assigned: Bienvenu)

References

Details

(Whiteboard: Thunderbird0.3)

Attachments

(4 files)

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.2a) Gecko/20020906
Build Identifier: Mozilla 1

My wife is using Mozilla 1.0 on her PC running Windows 2000. She uses Mozilla for web browsing and for reading and sending mail. However, one of her correspondents cannot cope with RTF files when they are sent as attachments in quoted-printable format. There does not seem to be any way to specify that the attachment should be base64-encoded, which the recipient can cope with. Apologies if there is a preference setting that I have been unable to find.

Reproducible: Always

Steps to Reproduce:
1. Compose message.
2. Select RTF file as attachment.
3. Send it.

Recipient complains about gobbledygook message.

Actual Results: Message sent in quoted-printable format.

Expected Results: Should at least have given the option to use base64.

Curiously, when I send the same file as an attachment using Mozilla on my Linux PC, it is sent as base64, not quoted-printable. But I can't find any place to specify that as an option.
QA Contact: trix → yulian
QA Contact: yulian → stephend
*** Bug 200185 has been marked as a duplicate of this bug. ***
*** Bug 195009 has been marked as a duplicate of this bug. ***
*** Bug 200112 has been marked as a duplicate of this bug. ***
The problem is connected to the file types listed in the prefs under Navigator/Helper Applications. If a file type is assigned to an extension and its MIME type has the wrong syntax, sending such a file results in quoted-printable being used. For example, if there is an entry with MIME type *.zip and extension zip, zips will be sent as QP. Try application/zip or even application/grmpf and they will be sent as base64. If the MIME type is *.rtf or something else illegal and the extension is rtf, all RTF files will be sent as QP, and so on. So you can work around this problem by assigning a proper MIME type or by simply deleting the entry for this type. But Mozilla's behaviour should be fixed at two points anyway: 1. Avoid QP even with a bad MIME type. 2. Use a QP encoding that won't corrupt binary data.
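To illustrate what "wrong syntax" means here, the following is a minimal sketch of a syntax check for MIME type strings, based on the type/subtype token grammar of RFC 2045. The function names `is_token` and `looks_like_mime_type` are hypothetical and are not taken from the Mozilla source; the sketch only shows why an entry such as *.zip fails while application/grmpf passes.

```python
# Characters that RFC 2045 forbids inside a token ("tspecials").
TSPECIALS = set('()<>@,;:\\"/[]?=')


def is_token(s: str) -> bool:
    """True if s is a non-empty run of printable ASCII without tspecials."""
    return bool(s) and all(33 <= ord(c) <= 126 and c not in TSPECIALS for c in s)


def looks_like_mime_type(value: str) -> bool:
    """True if value has the form token "/" token (hypothetical helper)."""
    major, sep, minor = value.partition("/")
    return sep == "/" and is_token(major) and is_token(minor)


for candidate in ("*.zip", ".doc", "application/zip", "application/grmpf"):
    print(candidate, looks_like_mime_type(candidate))
```

Note that the check is purely syntactic: application/grmpf is not a registered type, but it is well-formed, which matches the observation above that it is sent as base64.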
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Cannot make mozilla send RTF file in BASE64 instead of quoted printable. → Binary attachments will be encoded as quoted-printable and get corrupted in some situations
*** Bug 204855 has been marked as a duplicate of this bug. ***
I figured out at least one common source for the *.extension entries in the helper applications preferences window. After watching this bug on my user base, I was able to correlate the creation of those *.extension entries with the use of AOL.com Web Mail. The AOL.com email client has some sort of JavaScript-based download invoker that sends *.whatever as the MIME type. Is this a problem with the Mozilla JavaScript interpreter? Should it allow invalid MIME specs to be created? Is it a problem with the MIME handler? Or is this a Mozilla evangelism issue (i.e. someone needs to tell AOL to fix it)?
Also, requesting that this block 1.4 as it inhibits AOL/Netscape/Mozilla from integrating their products well. (Netscape 7.5/8? will need 1.4 to not have this bug in it.)
Flags: blocking1.4?
This is an insidious and serious bug. It should be changed to "critical": there is data loss here, and the casual user would never happen on the explanation! For a couple of months I was trying to track down problems we were having with sending *.doc attachments, until I realized the problem was with Mozilla and not the recipients' ability to handle my files. In a business environment this could be disastrous. This bug shouldn't escape into a "stable" version. It has been present since at least 1.3. When I did a byte-by-byte examination of the transmitted files, I found that a few extra bytes were being inserted in exactly the same spots: the quoted-printable problems referred to here. I had a MIME type of .doc, and removing that fixed it. But then I lose some of the flexibility for attachment handling. * Orest
*** Bug 206998 has been marked as a duplicate of this bug. ***
Orest, what flexibility do you lose in attachment handling when you remove a buggy MIME handler for .doc? Or, to put it the other way round, was the MIME type .doc (or *.doc)? What about using the right MIME type (application/msword) instead of removing the whole entry? But you're right, it's a bad bug. I suspect the bytes being added are 0x0d and maybe 0x0a; can you confirm this? Please vote for this bug (you and all others who suffer from it) if you want to increase the chance of getting it fixed.
Severity: normal → major
OS: Windows 2000 → All
Hardware: PC → All
Christian, sorry, flexibility was not the right word. What I meant: if .doc is ostensibly a valid option, then some users will choose it instead of the "more correct" application/msword, resulting in this sneaky error. As you surmise, when the error conditions apply, Mozilla finds all of the solitary 0x0d and 0x0a bytes, and either prepends an 0x0d or appends an 0x0a to make it a "normal" cr/lf pair. This bug is so obvious now! I added two screen captures of a hex display of the .doc files that nicely demonstrate this.
In the original file (original.jpg) you will find:
  at offset 0x0223: 00 0a 00
  at offset 0x022f: 00 0d 00
In the corrupted version (corrupt.jpg) of the file you will find:
  at offset 0x0223: 00 0d 0a
  at offset 0x0230: 00 0d 0a
* Orest
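The corruption described above can be reproduced outside Mozilla with a few lines of Python. This is only a sketch of the observed effect (every lone 0x0d or 0x0a becomes a 0x0d 0x0a pair), not the actual Mozilla code path; the function name `normalize_line_endings` and the sample bytes are made up for illustration.

```python
import re


def normalize_line_endings(data: bytes) -> bytes:
    """Rewrite CRLF pairs, lone CR and lone LF all as CRLF.

    Harmless for text, destructive for binary data.
    """
    return re.sub(rb"\r\n|\r|\n", b"\r\n", data)


# Hypothetical binary fragment with one lone LF and one lone CR.
original = bytes([0x00, 0x0A, 0x00]) + b"filler" + bytes([0x00, 0x0D, 0x00])
corrupted = normalize_line_endings(original)

print(original.hex(" "))
print(corrupted.hex(" "))
print(len(corrupted) - len(original), "extra bytes")
```

Each lone line-ending byte adds one byte to the output, so everything after the first occurrence is shifted, which matches the offsets moving from 0x022f to 0x0230 in the hex dumps.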
*** Bug 206593 has been marked as a duplicate of this bug. ***
mozilla1.4 shipped. unsetting blocking1.4 request.
Flags: blocking1.4?
nominating for 1.5b blocker -- this is a data loss bug
Flags: blocking1.5b?
taking
Assignee: mscott → scott
I have documented a new twist to this: a Mozilla "infected" with these invalid MIME types can "infect" another Mozilla or even Thunderbird with the same invalid MIME types. I have witnessed this viral behavior in my installed user base. The invalid MIME types are collected by a user using the AOL.com email client. When that user sends an attachment with an invalid MIME type to another user, the other user's computer becomes affected as well, by picking up the MIME type from the email. Another workaround: delete mimetypes.rdf from the chrome directory (the only way to address this in Thunderbird).
I think this is related to Bug #212998 which cites the following places where we could incorrectly encode binary attachments as quoted printable: http://lxr.mozilla.org/seamonkey/source/mailnews/compose/src/nsMsgAttachmentHandler.cpp#264 http://lxr.mozilla.org/seamonkey/source/mailnews/compose/src/nsMsgCompUtils.cpp#1431
Status: NEW → ASSIGNED
Whiteboard: Thunderbird0.3
I'd be happy to look at this - I was working on it under another bug.
taking.
Assignee: scott → bienvenu
Status: ASSIGNED → NEW
there is a hidden pref, "mail.file_attach_binary", which if set to true, will tend to force base64 encoding. You can edit prefs.js, or use about:config to set this.
IMHO Mozilla should send all attachments as binary except when it is SURE it is safe to send an attachment as ASCII (QP). My suggestion on how to identify ASCII files is the following:
- The whole file should be checked, not just the first 16 kB. Mixed ASCII/binary files do occur (I have seen them in .pdf and PostScript, but also in data from the "Envisat" satellite).
- Mozilla should know what the default record terminator (from now on: DRT) is in the underlying operating system. In my short life I have seen CR, LF, CRLF, and binary zero. If a file contains any of these characters not being used as the DRT, the file is definitely NOT ASCII. For example, a CR character anywhere in a file on a UNIX system makes it a binary file.
- I am uneasy about using filename extensions to determine if ASCII encoding is safe. For example, a .doc file on a UNIX system is often ASCII, on a Windows system rarely.
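The proposal above could be sketched roughly as follows. This is an illustration of the suggested rule, not code from Mozilla: the function name `is_safe_text` and the `native_eol` parameter are hypothetical, and "ASCII" is assumed here to mean printable ASCII plus tab.

```python
def is_safe_text(data: bytes, native_eol: bytes = b"\n") -> bool:
    """Return True only if the WHOLE buffer looks like plain ASCII text.

    native_eol is the platform's default record terminator (DRT).
    Any CR or LF that is not part of a DRT makes the data binary.
    """
    # Drop every occurrence of the native terminator; whatever CR or LF
    # bytes remain are "naked" and fail the range check below.
    body = data.replace(native_eol, b"")
    return all(b == 0x09 or 0x20 <= b <= 0x7E for b in body)


print(is_safe_text(b"hello\nworld\n"))                           # Unix text on Unix
print(is_safe_text(b"hello\r\nworld\r\n"))                       # CR present on Unix
print(is_safe_text(b"hello\r\nworld\r\n", native_eol=b"\r\n"))   # DOS text on Windows
print(is_safe_text(b"\x00\x0a\x00"))                             # binary fragment
```

The check deliberately scans the entire buffer and ignores the filename extension, in line with the first and third points of the list.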
One big part of the problem could be solved if Mozilla didn't fiddle with the given line breaks. As it looks to me from my tests, 0x0a is extended to 0x0d0a not only on Windows but on Linux too, so there's nothing platform-specific about it. Saying that CRLF on Linux always means binary would prevent us from getting corrupted files, but it isn't really smart. A line-break-preserving QP would not cause any corruption, neither in pure text nor in binary.
The problem is that we're using quoted-printable on attachments that are binary, and that's just never going to work well. From what I can tell from reading http://www.freesoft.org/CIE/RFC/1521/6.htm we'd need to encode naked CRs and LFs (=0D, =0A) for binary data; we can't just let them through. But we shouldn't be using QP on binary attachments anyway. We do analyze the whole attachment, not the first 16K, at least in the code I'm looking at. I'm not sure where that comment comes from, unless it's some necko code that's sniffing attachments, and that comment applies to remote attachments, as opposed to local files. I'm still having trouble reproducing this problem, despite the fact that my wife had this same problem until I cleaned up her MIME types. If I try to set up the MIME types incorrectly by hand, I'm not having the problem.
Attached file 256 bytes test file
Creating a MIME Type "*.out" with Extension "out" and attaching this file to a new mail produces a corrupt QP file.
Yes, this is a two-part problem, but I really think both parts should be solved. 1. Don't use QP for known binary data or for any unknown data. 2. Encode each 0x0a as =0A and each 0x0d as =0D. And if line breaks have to be inserted to meet the requirements of the standards, a soft line break ("=" followed by 0x0d 0x0a) works fine.
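Point 2 can be demonstrated with Python's standard `binascii` module, whose `b2a_qp` function encodes CR and LF bytes as =0D and =0A when called with `istext=False`. This is only an illustration of a binary-safe quoted-printable encoding; it says nothing about how Mozilla's encoder is implemented, and the sample bytes are made up.

```python
import binascii

# Hypothetical binary fragment containing a naked LF and a naked CR.
data = bytes([0x00, 0x0A, 0x00, 0x41, 0x0D, 0x00])

# istext=False tells the encoder to treat CR and LF as data, not as
# line endings, so they are written as =0A and =0D.
encoded = binascii.b2a_qp(data, istext=False)
decoded = binascii.a2b_qp(encoded)

print(encoded)
print(decoded == data)
```

Because no raw CR or LF byte from the data reaches the wire, a transport that rewrites line endings can no longer alter the payload, and decoding restores the original bytes exactly.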
has anyone tried this with a debug build? I don't see this happening with my debug build, but I do see it happening with my release build. However, my release build is Netscape 7.1, which is another variable...I'll try downloading a release mozilla build and see if it happens there too. if we do #2 on linux and mac, every cr and lf will be encoded. Or did you mean, only do that if it's not the native line ending? That would mean I could send you an attachment from the mac to linux, and if I saved it on linux and sent it on again, every line ending would be encoded. I believe if we do #1, #2 is not that important. Am I missing something?
Yes, this also happens with last week's debug build under Linux, as well as with 1.4 final on Windows. Right, #2 isn't important if we have #1. And yes, I meant encoding them always. On Linux too, Mozilla converts a cr into crlf and an lf into crlf, so on the wire we always have DOS line endings. My point of view is that the receiver should be able to save even a non-binary attachment completely unchanged, that is, with the same line breaks as the original document.
Christian, have you tried it on a build from 8/16 or later? I did check in a change which might help for files larger than 500 bytes (there's a rather arbitrary piece of code that decides files less than 500 bytes are probably not binary, which I can also remove). Always encoding naked CR's or LF's is OK with me.
Hm, my debug build is from the 13th or so; I'll make a new one today and test it. I'd call this piece of code quite obscure: what about small GIFs or PNGs? Better to do some real content checking instead of such a workaround.
I agree; that 500 byte check is arbitrary - this patch removes it.
*** Bug 212998 has been marked as a duplicate of this bug. ***
Attachment #130093 - Flags: review+
Attachment #130093 - Flags: superreview?(scott)
Attachment #130093 - Flags: superreview?(scott) → superreview+
Comment on attachment 130093 [details] [diff] [review] remove size check a=asa (on behalf of drivers) for checkin to 1.5beta
Attachment #130093 - Flags: approval1.5b+
it would be good if we could get this in today. trying to make builds for 1.5b thursday or friday.
fix checked in.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Flags: blocking1.5b?
Product: MailNews → Core
Product: Core → MailNews Core