Message is displayed despite charset specifying invalid character set

RESOLVED INVALID

Status

Thunderbird
Mail Window Front End
--
minor
RESOLVED INVALID
13 years ago
13 years ago

People

(Reporter: David R. Conrad, Assigned: Scott MacGregor)

Description

13 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.6) Gecko/20050317 Firefox/1.0.2
Build Identifier: Thunderbird 1.0.2 (20050317)

I received a multipart/alternative MIME message that specified, in the headers
of the first (and only) alternative section,

Content-Type: text/html;
	charset="iso-8029-2"

There is no such character set, but the message was displayed regardless.

Reproducible: Didn't try

Steps to Reproduce:
1. Craft a MIME message that specifies charset=<something invalid>
2. Send it to yourself.
3. View the message.

Actual Results:
The message was displayed, although the question-mark-in-a-diamond was displayed
in place of a few non-ASCII characters.

Expected Results:  
It is unclear.  The current behavior, which appears to be either assuming
US-ASCII or Unicode and attempting to proceed, may be best, but refusing to
display the message at all might be more correct.

The message contained raw bytes with the high bit set (HTML character entities
were not used): 0x93 and 0x94 in positions that would make sense for quotation
marks, and 0xAE in a position that would make sense for a registered trademark
symbol. These seem to indicate that the "real" encoding of the message was
windows-1252.
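
The windows-1252 inference can be checked with a short Python sketch. The sample bytes below are made up for illustration (only the values 0x93, 0x94 and 0xAE come from the report), and the comparison against iso-8859-2 rests on the assumption that the bogus label may have been a typo for that charset; nothing in the message confirms this.

```python
import codecs

# Hypothetical sample: only the byte values 0x93, 0x94, 0xAE come from the report.
SAMPLE = b"\x93quoted\x94 Brand\xae"


def charset_is_known(name: str) -> bool:
    """Return True if Python's codec registry recognises the charset label."""
    try:
        codecs.lookup(name)
        return True
    except LookupError:
        return False


# In windows-1252 the bytes map to curly quotes and the registered sign.
as_cp1252 = SAMPLE.decode("windows-1252")

# In iso-8859-2 (an assumed intended label) 0x93/0x94 are C1 control
# characters and 0xAE is a capital Z with caron, which fits the text poorly.
as_latin2 = SAMPLE.decode("iso-8859-2")

print(charset_is_known("iso-8029-2"))  # the label from the report
print(repr(as_cp1252))
print(repr(as_latin2))
```

Under windows-1252 the bytes come out as typographic quotation marks and a registered trademark sign, which fits the positions described above.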

I could provide the actual message, but it was a garden-variety phishing attempt
with nothing to distinguish it but that: 1) it was multipart/alternative, but
only had one part, 2) the body of the message was base64 encoded, and 3) it
claimed to be in charset=iso-8029-2.
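
For anyone wanting to reproduce this without the original message, here is a minimal sketch that builds a message with the same three properties using Python's standard `email` package. The body text, subject, and the function name `build_message` are invented for illustration; the original phishing message is not reproduced here.

```python
import base64
from email import message_from_bytes
from email.mime.multipart import MIMEMultipart
from email.mime.nonmultipart import MIMENonMultipart

# Hypothetical body containing the raw high-bit bytes mentioned in the report.
BODY = b"<html><body>\x93Hello\x94 Brand\xae</body></html>"


def build_message() -> MIMEMultipart:
    """Build a one-part multipart/alternative message with a bogus charset."""
    # Set the headers by hand so no real codec is needed for "iso-8029-2".
    part = MIMENonMultipart("text", "html", charset="iso-8029-2")
    part["Content-Transfer-Encoding"] = "base64"
    part.set_payload(base64.encodebytes(BODY).decode("ascii"))

    outer = MIMEMultipart("alternative")
    outer["Subject"] = "charset test"  # placeholder subject
    outer.attach(part)
    return outer


if __name__ == "__main__":
    raw = build_message().as_bytes()
    parsed = message_from_bytes(raw)
    only_part = parsed.get_payload()[0]
    print(only_part.get_content_charset())
    print(only_part.get_payload(decode=True))
```

The serialized bytes can be saved as an .eml file or sent to yourself to follow the steps to reproduce.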

In another instance of a damaged MIME message, where the "boundary" for a
multipart message is malformed, Thunderbird does not display the message.  I
have several times received spam that was damaged in this way, and chuckled at
the fact that the spammers had shot themselves in the foot in this manner.

I would have thought that specifying something totally bogus for the charset
would have the same effect.  The actual behavior, of trying to recover, may in
some sense be better, but what is the right thing to do here?

P.S. ISO-8029 is, in fact, a standard for plastic water hoses.  Quoting from
http://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=15039
"Lays down the requirements for two types of textile reinforced thermoplastics
collapsible water hoses for general applications...."

Comment 1

13 years ago
Dupe of bug 251634?

Comment 2

13 years ago
(In reply to comment #0)
> Expected Results:  
> It is unclear.  The current behavior, which appears to be either assuming
> US-ASCII or Unicode and attempting to proceed, may be best, but refusing to
> display the message at all might be more correct.
> [...]
> In another instance of a damaged MIME message, where the "boundary" for a
> multipart message is malformed, Thunderbird does not display the message.

And there is a bug filed somewhere asking for Moz to somehow determine that the
broken boundaries "match".  In general, the policy is to display the message if
it is at all possible to figure it out.
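
A "display if at all possible" policy can be sketched in Python as follows. This is not Thunderbird's actual logic, which the thread does not describe; the function name `decode_leniently` and the choice of fallback charset are assumptions made for illustration.

```python
def decode_leniently(raw: bytes, declared: str, fallback: str = "utf-8") -> str:
    """Decode with the declared charset, falling back rather than failing.

    Hypothetical helper: an unknown charset label or undecodable bytes
    never prevent display; bad bytes become U+FFFD instead.
    """
    try:
        return raw.decode(declared)
    except LookupError:
        # Unknown label such as "iso-8029-2": use the fallback charset.
        return raw.decode(fallback, errors="replace")
    except UnicodeDecodeError:
        # Known charset but invalid bytes: keep what can be decoded.
        return raw.decode(declared, errors="replace")


if __name__ == "__main__":
    raw = b"\x93Hi\x94"
    print(repr(decode_leniently(raw, "iso-8029-2")))
    print(repr(decode_leniently(raw, "iso-8029-2", fallback="windows-1252")))
```

With a UTF-8 fallback the two high-bit bytes become U+FFFD, the question mark in a diamond described in the report, while a windows-1252 fallback would render them as quotation marks.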

I refer you again to bug 251634, which is related but not really the same.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 13 years ago
Resolution: --- → INVALID