Closed
Bug 241821
Opened 21 years ago
Closed 20 years ago
Mozilla gives dubious mime-type "text/plain" when I attach a file to outgoing e-mail
Categories
(MailNews Core :: Attachments, defect)
Tracking
(Not tracked)
RESOLVED
DUPLICATE
of bug 238152
People
(Reporter: ishikawa, Assigned: sspitzer)
References
Details
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040113
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.6) Gecko/20040113
This is a problem in 1.7b and 1.6 as well.
(Could be older than this.)
attaching a file gets incorrect mime type
Recently, I had complaints from a few recipients of
my e-mail that the attachment is unreadable.
Or rather, to be exact that my e-mail is UNREADABLE!?
I thought this was bogus since I attach a memo file
as separate attachment after writing a few
lines of message. This can't be "unreadable"!?
(A background: there are three major character sets used in Japan.
JIS, EUC and MS-Kanji.
[Note I simplified the explanation drastically.
But you get the idea. It is very chaotic.]
ISO-2022-JP, a variant of JIS is used for e-mail exchange.
EUC (or ujis) is often used under workstation and PC-unixens.
MS-Kanji and its variants are used by MS operating systems and Mac OS.
(I forgot: sure, Unicode is making inroads these days. But it is
often used internally and not exposed to userland yet.)
If you try to render a Japanese text file assuming incorrect character set,
the display is messed up.
With all the three major Japanese character code sets [plus unicode
used internally in some commercial products],
it is often safe to send a document file [even in so called plain text file]
to a different computer using e-mail attachment.
Viewers such as web browsers and intelligent editors such as Emacs,
commercial editors and word processors can handle code conversion when
it becomes necessary for a particular hardware platform.
This is why I often send a text file prepared by Emacs on my unix/Linux
machines as
- attachment file and let the recipient handle the code set issue, OR
- copy/paste the document into the message buffer of mozilla
before sending so that it would go
out in the interoperable ISO-2022-JP encoding.
I would avoid the latter when the document in question is large because
copy and paste doesn't work easily for, say, a 600KB document from emacs.
I would simply attach the file to my outgoing e-mail in such cases.
I think it used to work without major complaints.
But now there is the problem I noted.)
Back to my original problem.
It turns out that the attached file, which I hope the recipient will
save into an external file and view with a proper viewer, is given a
somewhat problematic MIME type, as shown below. I created a very short
file and attached it to an e-mail to myself to recreate the problematic
situation.
Please note the use of "text/plain" type.
I wonder why mozilla gives the "text/plain" type to
test.dat. I think it is trying to be clever by "GUESSING" the contents
and got the incorrect result!
Example 1:
--------------020902060403030802040501
Content-Type: text/plain;
name="test.dat"
Content-Transfer-Encoding: base64
Content-Disposition: inline;
filename="test.dat"
VGhpcyBmaWxlIGlzIGluIEVVQyBjb2RlLgqks6TspM8gRVVDIKWzobylyaTOpdWloaWkpeuk
x6S5oaMKVGhpcyBpcyBOT1QgaW4gSklTIGNvZGUuCqSzpOykzyBKSVMgpbOhvKXJpMekz6Si
pOqk3qS7pPOhowo=
--------------020902060403030802040501--
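To see what a recipient's mailer is up against, the base64 body of Example 1 can be decoded and then interpreted under different character sets. This is only a minimal Python sketch, not anything Mozilla itself does; the variable names are made up for illustration, and the codec names are those of Python's standard library.

```python
import base64

# Base64 body of Example 1, copied from the MIME part above.
body_b64 = (
    "VGhpcyBmaWxlIGlzIGluIEVVQyBjb2RlLgqks6TspM8gRVVDIKWzobylyaTOpdWloaWkpeuk"
    "x6S5oaMKVGhpcyBpcyBOT1QgaW4gSklTIGNvZGUuCqSzpOykzyBKSVMgpbOhvKXJpMekz6Si"
    "pOqk3qS7pPOhowo="
)
raw = base64.b64decode(body_b64)

# A reader that knows the file is EUC-JP recovers the intended text.
as_euc = raw.decode("euc_jp")

# A reader that assumes Shift_JIS (MS-Kanji) sees something else entirely.
as_sjis = raw.decode("shift_jis", errors="replace")

# A reader that takes the missing charset literally (US-ASCII) cannot
# decode the bytes at all, because they contain octets above 0x7F.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

print(as_euc.splitlines()[0])
print("same text under both guesses:", as_euc == as_sjis)
print("decodes as US-ASCII:", ascii_ok)
```

The same octets give different text depending on which charset the reader assumes, which is the ambiguity a "text/plain" part without a charset parameter leaves to the recipient.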
It seems that some e-mail readers both on Mac and Windows try to use
the above MIME information to display the file contents using ITS OWN
IDEA OF USED CHARACTER SET and thus show garbage on the screen
and worse messed up the recipient's mail folder.
I got curious why I didn't notice this problem since I CC:ed the
same e-mail messages to myself.
First of all, I found out that I hadn't turned on "Display Attachment
inline" in my mozilla setting. (This is a good way to avoid the
problem of corrupted display.)
As soon as I turned it ON, Mozilla showed the attached file in a
readable manner. But the attachment was shown below a clearly marked
horizontal bar (presumably to show that this is part of an
attachment.)
I got suspicious about this. Mozilla must be trying to be very, very
clever in detecting the character set inside a file, it seems, and
showing it in a legitimate manner.
So I turned the automatic character set recognition off. Then the
rendering is messed up. Quite natural and this is all as it should be!
Back to the original problem.
I think that the reason the recipients of my e-mail messages got
mangled message display was that the attachment, which mozilla
unfortunately gave the `incorrect' "text/plain" content-type, was handled
as a text file (because of "text/plain")
and was automagically rendered using the recipient's
mailer's character set recognition routine, which probably was not quite
the same as mozilla's.
(And I believe everyone agrees here that there will be no way the
character set recognition routines will behave in the same manner in
all the applications!)
I think here mozilla is giving a WRONG/INAPPROPRIATE MIME type after all.
When I tried to attach /bin/cat (a binary program file) to an e-mail to
myself, mozilla somehow gave this mime information. Note
"application/x-vnd.mozilla.guess-from-ext".
Example-2:
--------------060403080600000207080505
Content-Type: application/x-vnd.mozilla.guess-from-ext;
name="cat"
Content-Transfer-Encoding: base64
Content-Disposition: inline;
filename="cat"
f0VMRgEBAQAAAAAAAAAAAAIAAwABAAAA4IoECDQAAADYNgAAAAAAADQAIAAGACgAGAAXAAYA
AAA0AAAANIAECDSABAjAAAAAwAAAAAUAAAAEAAAAAwAAAPQAAAD0gAQI9IAECBMAAAATAAAA
..omission ...
--------------060403080600000207080505--
With the above mime-type, there was no way for mozilla to "display"
the contents of the attachment incorrectly no matter what I did to
tweak the "Display attachment inline" and the character code
recognition setting. Again, this is all as it should be. With this
mime type, I think no sane e-mail client will try to render this in
the visible message buffer and mess up the display. No complaint from the
recipient of my e-mail. That will be good.
So my conclusion is that mozilla is doing something funny when it assigns a mime
type to an attachment, and sometimes gets it wrong.
Since "Guessing" can't always succeed, I would rather see the
unreliable guessing turned off completely to avoid the costly exchange
with my e-mail recipients such as "I can't read the latter part of
your e-mail, please re-send" a day after my original e-mail.
This might change "user experience" as some software company's ads
often mentions. So the turning off can be a clearly marked option in
the main menu or something.
Any thoughts?
My take on this is to add a mime type that will err on the safe side.
That is, make sure that the receiving e-mail client will handle the
attached "text" document as an attached file rather than be clever
about showing it inline (unless mozilla offers a menu to override this
somehow when we send an e-mail with an attachment).
Also treat the mime type of received e-mail in a careful manner.
(Not enabling the attachment inline achieves this goal rather well.)
----
An observation of similar bugs in bugzilla.
I typed
"incorrect mime type attachment"
in mozilla bugzilla search and quite a few hits came up.
I think we need to fix the mime handling somehow.
Some stood out since they have something in common.
236212 ... incorrect mime type for PHP attachment.
239849 ... basically the duplicate of 236212.
I mention this bug since the display corruption (or the
lack of display at all) reported is a similar
kind of symptom to what my e-mail
recipients may have experienced.
71551 ... text-related. This bug reports
text-specific issues concerning incorrect mime type.
But I don't agree with the fine-tuning proposals
mentioned in the discussion thread. Guessing the code
inevitably fails. Unless you are 100% sure [ and this
could be asserted ONLY by the user, and s/he could be
wrong sometimes even :-) ], make the
attachment NOT "text" type.
18920 ... not directly related, but this could have happened
to me as well. I once tried to send to the complaining
recipient a MS-WORD DOC version of the document
since I thought this binary document format
would be readable, but somehow
that was not handled very well on the receiving end.
But this particular problem I experienced
COULD be an IE problem.
"Guessing fails inevitably" department:
215005 ... Not only the display gets mangled sometimes,
it seems that some files gets corrupted even.
But this doesn't seem to be related to
incorrect mime type issue.
Reproducible: Always
Steps to Reproduce:
1. Attach a data file to an outgoing e-mail.
The data file probably needs to contain enough text-like
data, but it actually has binary data inside.
Actual Results:
Mozilla gave "text/plain" Content-type to the attachment.
Expected Results:
I think mozilla should give something different.
"Application/binary" or something???
Comment 1•21 years ago
*** Bug 241825 has been marked as a duplicate of this bug. ***
Comment 2•21 years ago
> Any thoughts?
Yes: strive for conciseness. Your report is overwhelmingly long, and a lot of
what isn't noise is supposition. Just the facts, please.
> If you try to render a Japanese text file assuming incorrect character set,
> the display is messed up.
> [snip]
> Example 1:
>
> --------------020902060403030802040501
> Content-Type: text/plain;
> name="test.dat"
> Content-Transfer-Encoding: base64
> Content-Disposition: inline;
> filename="test.dat"
>
> VGhpcyBmaWxlIGlzIGluIEVVQyBjb2RlLgqks6TspM8gRVVDIKWzobylyaTOpdWloaWkpeuk
> x6S5oaMKVGhpcyBpcyBOT1QgaW4gSklTIGNvZGUuCqSzpOykzyBKSVMgpbOhvKXJpMekz6Si
> pOqk3qS7pPOhowo=
> --------------020902060403030802040501--
>
> It seems that some e-mail readers both on Mac and Windows try to use
> the above MIME information to display the file contents using ITS OWN
> IDEA OF USED CHARACTER SET and thus show garbage on the screen
> and worse messed up the recipient's mail folder.
Note that the Content-Type header on this MIME section does not specify a
character set. For instance:
Content-type: text/plain; charset=iso-2022-jp;
Therefore, whatever client is displaying the attachment has to either figure the
charset out heuristically, or assume it's the same character set as the original
mail.
> I think here mozilla is giving a WRONG/INAPPROPRIATE MIME type after all.
If the file is actually plain text, the MIME type is correct. It does need a
charset if it's not the expected text, and Mozilla does not provide a means to
specify that. Bug 71551 addresses that in a simplistic way; bug 72116 requests
a UI for it. This bug should be marked a duplicate of one of those two, or of
bug 192262 which requests a UI to specify the Content-Type.
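For illustration, here is a minimal sketch of what a text attachment with an explicitly declared charset looks like when built with Python's standard `email` package. It says nothing about how Mozilla composes mail; the message text, the file name and the choice of EUC-JP are assumptions made up for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["Subject"] = "memo"
msg.set_content("See the attached memo.")

# Hypothetical file contents; the sender knows they are to be sent as EUC-JP.
memo_text = "This file is in EUC code.\n\u3053\u308c\u306f EUC\n"

# Declaring the charset leaves the receiving client nothing to guess.
msg.add_attachment(
    memo_text,
    subtype="plain",
    charset="euc-jp",
    cte="base64",
    filename="test.dat",
)

part = next(msg.iter_attachments())
print(part["Content-Type"])
print(part["Content-Disposition"])
```

With the charset parameter present, the receiver can decode the part without heuristics, provided the sender's claim about the encoding is right.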
> 236212 ... incorrect mime type for PHP attachment.
Unrelated to this problem.
> 18920
Is not viewable (probably security-related); do you have access to this bug?
> 215005 it seems that some files gets corrupted even.
Unrelated to this problem.
i believe 18920 was a typo as it doesn't seem to fit the description of this bug
(fwiw i can't see the bug, so it is not a mozilla security bug).
Comment 4•21 years ago
(In reply to comment #2)
> Note that the Content-Type header on this MIME section does not specify a
> character set. For instance: Content-type: text/plain; charset=iso-2022-jp;
> Therefore, whatever client is displaying the attachment has to either figure
> the charset out heuristically, or assume it's the same character set
> as the original mail.
No. Default charset of "text" subtype is US-ASCII.
Assuming it's the same character set as the original mail, or automatic charset
detection for text is Mozilla's extended function for user's convenience.
See RFC 2046 ( http://www.faqs.org/rfcs/rfc2046.html )
> 4.1.2. Charset Parameter
> The default character set, which must be assumed
> in the absence of a charset parameter, is US-ASCII.
> The default character set, US-ASCII, has been the subject of some
> confusion and ambiguity in the past. Not only were there some
> ambiguities in the definition, there have been wide variations in
> practice. In order to eliminate such ambiguity and variations in the
> future, it is strongly recommended that new user agents explicitly
> specify a character set as a media type parameter in the Content-Type
> header field. "US-ASCII" does not indicate an arbitrary 7-bit
> character set, but specifies that all octets in the body must be
> interpreted as characters according to the US-ASCII character set.
> National and application-oriented versions of ISO 646 [ISO-646] are
> usually NOT identical to US-ASCII, and in that case their use in
> Internet mail is explicitly discouraged. The omission of the ISO 646
> character set from this document is deliberate in this regard. The
> character set name of "US-ASCII" explicitly refers to the character
> set defined in ANSI X3.4-1986 [US- ASCII].
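A receiver that follows the quoted rule literally can be sketched in a few lines of Python. This is only an illustration with the standard `email` package; the sample part is a made-up stand-in shaped like Example 1, not the reporter's actual message.

```python
from email import message_from_string, policy

# A part shaped like Example 1: text/plain with no charset parameter.
raw_part = (
    'Content-Type: text/plain; name="test.dat"\n'
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "aGVsbG8K\n"
)
part = message_from_string(raw_part, policy=policy.default)

# Nothing was declared by the sender ...
declared = part.get_content_charset()

# ... so a strict reader falls back to the RFC 2046 default.
effective = part.get_content_charset("us-ascii")

print(declared, effective)
```

Anything beyond this fallback, such as reusing the charset of the message body or running auto-detection, is a client-side convenience rather than something the sender can rely on.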
If the mime-type of text is specified (or defaulted to text/plain by omission),
the charset should be specified and valid, although many mailers and text editors
have an automatic character detection mechanism and/or a character set choice mechanism.
For an attachment of unknown MIME type and/or unknown file extension,
Content-Type: application/octet-stream, Content-Transfer-Encoding: base64 or QP
(or 7bit or 8bit if possible), Content-Disposition: attachment is better, I
think, which is a general purpose way of attaching data to a mail.
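The general-purpose form suggested here can be sketched with Python's standard `email` package. This is an assumption-laden illustration, not Mozilla code: the sample bytes and the file name are invented, and base64 is simply what the library picks for binary data.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg.set_content("Memo attached.")

# Hypothetical attachment bytes: ASCII followed by a few EUC-JP octets.
data = b"This file is in EUC code.\n\xa4\xb3\xa4\xec\xa4\xcf EUC\n"

# Opaque type plus "attachment" disposition: the receiver is not
# invited to render the bytes inline or to guess a character set.
msg.add_attachment(
    data,
    maintype="application",
    subtype="octet-stream",
    filename="test.dat",
)

part = next(msg.iter_attachments())
print(part["Content-Type"])
print(part["Content-Disposition"])
```

The receiver gets the bytes back unchanged when saving the attachment, and which viewer and charset to use is decided on the receiving side.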
Comment 5•21 years ago
Ishikawa-san, will adding the extension ".dat" (with a preferable MIME type) to
Profile/"Helper Applications" resolve the problem?
I attached an xxx.dat file (content is Shift_JIS text) with Mozilla on Win-2K under
the following definitions.
(A) Windows Registry definition
HKEY_CLASSES_ROOT\.dat (Name=Content Type , Data=application/x-httpd-php)
(B) Mozilla's Helper Application
Mime-Type = application/x-dat-file, Extension = dat
The generated headers are as follows. (The definition in Helper Application was used)
> Content-Type: application/x-dat-file; name="xxx.dat"
> Content-Transfer-Encoding: base64
> Content-Disposition: inline; filename="xxx.dat"
Reporter
Comment 6•21 years ago
Answer to post #5.
Thank you Wada-san,
Well, since I am using Linux version of mozilla, I have no idea
where the MIME type association is stored.
(Well actually I have a hunch: /etc/mime.types. )
But even if we can override the MIME type given by mozilla for
files with suffix ".dat", this doesn't solve the general problem.
I mean, I can have ".xxa", ".xxb", ".xxc", ..., and other suffices,
and have files that have binary data.
If mozilla decides to give "text/plain" or whatever data
based on ITS OWN IDEA of data contents, which is NOT SHARED
by other mailers on other hardware/software platforms,
the problem I mentioned in the original post will persist.
Giving the application/octet-stream type and
encoding the contents seems to be
the only reliable solution here.
Again, if a particular set of parties (senders and receivers) agree
on certain rules (and can make sure that their email clients
on various platforms behave according to such an agreement),
using those particular rules for adding MIME data types to
whatever the party exchanges among its members
may work.
But unfortunately, in real life, we have problems.
Comment 7•21 years ago
Ishikawa-san, "Helper Applications" is a Mozilla preference.
If you already tried the "Helper Applications" setting without success, it was probably due to
a preceding dot in the extension specification in "Helper Applications".
If you entered ".dat" as the extension, remove the preceding dot ("dat" only as
the extension instead of ".dat").
This seems to be a new Mozilla's bug. See Bug 236212 Comment #12
Comment 8•21 years ago
I confirm this bug. (recreated with trunk-nightly/Win-2K)
The problem can be stated as follows:
If an Extension=>Mime-Type relation is defined in both Helper Applications and
the system's inventory (Windows registry on MS Windows) for the extension of the attached file,
and if the attachment file is guessed to be a text file (not zip format nor jpeg format
nor png format nor gif format ...),
Mozilla generates the attachment with "Content-Disposition: inline" and
"Content-Type: text/plain;" with no charset parameter
even though the content of the attached file is not US-ASCII (EUC-JP or Shift_JIS or
ISO-2022-JP in the reporter's case and my recreation test case).
This is apparently a violation of RFC 2046 for text data other than US-ASCII.
Comment 9•21 years ago
WADA: what do you think this bug is about, that is not covered by either
bug 71551 or bug 72116? Is the problem a bug, or an unimplemented feature?
One possible approach would be to specify a default charset for each text/* type
(and for whatever other textual types there are) in the Helper Apps UI. But
there will always be a case where a file (whether arriving via the net or being
attached from your files) doesn't match the default and doesn't have its own
charset specified. For attachments, we will still need a way to specify the
charset on a per-file basis -- bug 72116.
Another possible approach would be for Mozilla to scan each text attachment to
determine the charset; this could use the same heuristic at work in the View
Source window, which I've seen make correct guesses on messages that were
unspecified or even incorrectly specified. But the problem of needing to
specify it individually is still there, in case of an incorrect scan (?).
Re: Helpers requiring extension entered without the dot (txt instead of .txt):
> This seems to be a new Mozilla's bug. See Bug 236212
As I noted there, bug 170090 (filed against 1.2) covers that problem, so it's
not such a new bug.
Comment 10•21 years ago
(In reply to comment #6)
> Well, since I am using Linux version of mozilla, I have no idea
> where the MIME type association is stored.
I've found Boris Zbarsky's description of mime-type determination from
the extension on Unix.
> When reading from the local filesystem, we ask the OS for the type (at the
> moment, on Unix, this is done by looking at the extension and looking it up in
> the ~/.mime.types file, the /etc/mime.types file, and the gnome-vfs registry).
See Bug 242743 Comment 2
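The extension lookup described in that quote can be imitated with Python's standard `mimetypes` module, which reads the same `mime.types` file format. This is a rough sketch, not Mozilla's implementation: the table entry is hypothetical (borrowed from the Helper Application test in comment 5), and `type_for` is a helper invented for the example.

```python
import io
import mimetypes

# Hypothetical mime.types content, one "type ext [ext ...]" entry per line.
mime_types_text = "application/x-dat-file dat\n"

db = mimetypes.MimeTypes()
db.readfp(io.StringIO(mime_types_text))


def type_for(filename):
    """Look up a type by extension only; fall back to an opaque type."""
    guessed, _encoding = db.guess_type(filename)
    return guessed or "application/octet-stream"


print(type_for("xxx.dat"))
print(type_for("memo.xxa"))
```

A lookup like this never inspects the file contents, so an extension missing from the table ends up as an opaque type instead of a guessed "text/plain".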
Comment 11•21 years ago
(In reply to comment #9)
Mike:
The RFC asks for a charset if "Content-Type: text" is used for data other than US-ASCII.
However, perfect charset determination is impossible for other than US-ASCII.
Therefore, "Content-Type: text" should not be used for other than US-ASCII data
as the automatically determined Content-Type,
even if Mozilla properly guessed that the attached file is a text file.
(I prefer "Content-Type: application/octet-stream;" for text data of unknown
charset.)
And automatically determined Content-Disposition: should be attachment in this
case, even when mail.content_disposition_type setting in prefs.js is "Inline".
In other words, "Content-Type: text" and "Content-Disposition: inline" should be
used only when text data of US-ASCII.
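The rule proposed in this comment can be written down as a small decision function. This is a naive Python sketch of the proposal only, not of what Mozilla does; `choose_headers` is a made-up name, and it assumes the file has already been judged to be text by some other means.

```python
from typing import Tuple


def choose_headers(data: bytes) -> Tuple[str, str]:
    """Return (Content-Type, Content-Disposition) for a text-like file."""
    if data.isascii():
        # Pure US-ASCII: the charset is known, so inline text is safe.
        return ("text/plain; charset=us-ascii", "inline")
    # Charset unknown: send opaque bytes and do not ask for inline display.
    return ("application/octet-stream", "attachment")


print(choose_headers(b"plain ASCII memo\n"))
print(choose_headers(b"\xa4\xb3\xa4\xec\xa4\xcf EUC\n"))
```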
In addition, I think mail.content_disposition_type setting of "inline" should
not be applied for files such as jpeg, gif, even if Mozilla can display them in
inline.
"Content-Disposition: inline" should be used only for "Content-Type: text"
attachment.
If these automatically determined Content-Type: and Content-Disposition: values could be
changed by the user on mail composition or set in preferences, it would be very
convenient for many users.
But I think these are enhancement requests.
(Some of them are already requested, as you mentioned.)
Comment 12•21 years ago
(In reply to comment #11)
> perfect charset determination is impossible for other than US-ASCII.
I don't think 'perfect' is necessary; 'better than we currently have' would be
more than acceptable, particularly since Mozilla is already quite clever about
ID'ing charsets for display in the View Source window (in my experience,
anyway).
> (I prefer "Content-Type: application/octet-stream;" for text data of unknown
> charset.)
Using text/plain with charset=unknown-8bit (as requested in bug 71551) would be
better; why throw away the information that this file was ID'd as text/plain on
the sender's system? Could be particularly useful if saving to a MIME-enabled
filesystem (e.g. BeOS), or for future enhancements to Mozilla's mail/attachment
viewer.
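As a rough illustration of this alternative, a part that keeps the text/plain type but admits that the charset is unknown could be built like this in Python. It is only a sketch using the standard `email` package; the sample bytes, the file name and the "attachment" disposition are assumptions for the example, not what bug 71551 prescribes in detail.

```python
from email import encoders
from email.mime.base import MIMEBase

# Hypothetical file contents: ASCII followed by a few EUC-JP octets.
data = b"This file is in EUC code.\n\xa4\xb3\xa4\xec\xa4\xcf EUC\n"

# Keep the "this is text" information, but declare the charset as unknown.
part = MIMEBase("text", "plain", charset="unknown-8bit", name="test.dat")
part.set_payload(data)
encoders.encode_base64(part)  # also sets Content-Transfer-Encoding: base64
part.add_header("Content-Disposition", "attachment", filename="test.dat")

print(part["Content-Type"])
```

The receiver still learns that the sender considered the file to be text, while being told explicitly not to assume US-ASCII or the charset of the message body.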
> And automatically determined Content-Disposition: should be attachment in this
> case, even when mail.content_disposition_type setting in prefs.js is "Inline".
> In other words, "Content-Type: text" and "Content-Disposition: inline" should
> be used only when text data of US-ASCII.
OK, but I don't think that's this bug. See bug 65794.
> In addition, I think mail.content_disposition_type setting of "inline" should
> not be applied for files such as jpeg, gif, even if Mozilla can display them
> in inline.
> "Content-Disposition: inline" should be used only for "Content-Type: text"
> attachment.
That I don't agree with. When my sister sends me a photo of her garden, I don't
want to have to bother making an explicit action (and switching to a different
window) just to view it. But again, that's Not This Bug.
I still do not see any reason this bug is not a dupe of one of the two suggested
in comment 9.
Comment 13•21 years ago
(In reply to comment #12)
Mike, sorry for bothering you with my thoughts on issues other than charset.
> I still do not see any reason this bug is not a dupe of one of the two
> suggested in comment 9.
I think Bug 71551 (using text/plain with charset=unknown-8bit) is a very good idea,
and thanks to your suggestion I now believe Bug 71551 is one of the best solutions.
So I also think this bug can be closed as a DUPE of another bug.
But I cannot say which bug it is a DUPE of.
The decision should be made by people who have the privilege of marking DUPEs, including
the bug opener.
Please note that I don't have that privilege.
Reporter
Comment 14•21 years ago
I experienced a similar problem again today and
looked into the problem with a fresh viewpoint.
I am now inclined to say that the Mozilla mailer should not
send an attachment file as inline contents and should let
the receiver use its own attachment handler (which
presumably has a better idea of how to display contents
including the correct deduction of character code system, etc.).
So let's get rid of Content-Disposition: inline
and instead use
Content-Disposition: attachment
this will solve the problem.
Related bug reports:
Bug 244829 mentions
>Expected Results:
>Display the text content
>Or do not display inline
I would now concur with "do not display inline".
This Bug 244829 is probably worth reading to resolve this
bug report.
The comment #7 had this to say:
>Currently, I think it's
>assumed that text attachment has the same character encoding as the main body of
>the message. It doesn't hold in cases like this but in the majority of the cases
>it holds (although it may change as time goes by).
This assumption IS and HAS BEEN INVALID in Japan!!!
Bug 238152 has this to say:
> It's not easy to determine the charset of a text file being attached
>without prompting users to pick one. However, there are a couple of possibilities:
>1. assume text/* file being attached is in the locale character encoding
>2. assume it's in the same character encoding as the current character encoding
>for the mail composition
>3. prompt users to pick one
Assumptions 1 and 2 are invalid. We may have a different coding
in the attached file after all.
3 is prone to errors.
Since we can't get it right all the time, we should abandon
the idea of specifying correct char set and just decide
to attach the file as "non"-inlined application-dependent
file. (Only the receiver cares about how to read it
and s/he has the array of reading tools on her/his end.)
it is not up to us (the sender of the attached file)
to worry about it.
Updated•21 years ago
Product: MailNews → Core
Comment 15•20 years ago
(In reply to comment #14)
> So let's get rid of Content-Disposition: inline
> and instead use
> Content-Disposition: attachment
>
> this will solve the problem.
Well, yes and no. Per bug 65794, adding
user_pref("mail.content_disposition_type",1);
to user.js will send all attachments out as 'attachment' -- but Mozilla will
ignore that disposition for known text/* types, and (attempt to) display them
inline -- bug
> Related bug reports: [...] Bug 238152
Thank you for finding that -- marking this as a dupe.
> Since we can't get it right all the time, we should abandon
> the idea of specifying correct char set and just decide
> to attach the file as "non"-inlined application-dependent
> file.
Again: bug 71551 -- giving a charset of 'unknown' -- is an even better approach
than changing Content-Disposition.
*** This bug has been marked as a duplicate of 238152 ***
Status: UNCONFIRMED → RESOLVED
Closed: 20 years ago
Resolution: --- → DUPLICATE
Comment 16•20 years ago
(In reply to comment #15)
> but Mozilla will
> ignore that disposition for known text/* types, and (attempt to) display them
> inline -- bug
Sorry -- that should be "bug 147461"
Updated•17 years ago
Product: Core → MailNews Core