Open
Bug 377599
Opened 17 years ago
Updated 17 years ago
From: header displayed wrong when contains 0xB0 character
Categories
(SeaMonkey :: MailNews: Message Display, defect)
Tracking
(Not tracked)
NEW
People
(Reporter: nelson, Unassigned)
Details
Attachments
(3 files)
Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a4pre) Gecko/20070410 SeaMonkey/1.5a

In a newsgroup I read regularly, there appeared a message with a From: header that looked like this:

> From: "° some name °" <someaddr@domain.tld>
         ^           ^
where the two characters marked above are each a single byte containing hex B0. A copy of that message is attached.

When I view that message in the newsgroup, the message is displayed with a default character set of UTF-8. When that message is viewed with the UTF-8 character set, the From: header in the message header pane looks like this:

From: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX

where each character shown as an "X" above actually looks like a rectangle containing 4 hex characters. Here is an ASCII art approximation:

+-----+
| F F |
| F D |
+-----+

I see two things wrong with that:

1) Not only are the two 0xB0 characters replaced by that odd FFFD character, but the entire From line is replaced with them.

2) There are FAR MORE characters in that displayed From: header than there are in the actual From: header in the message. The displayed width of the From: header does not even begin to approximate the actual length of the real address in the message's From: header.

IMO, that From: header *SHOULD* be displayed something like this:

From: "? some name ?" <someaddr@domain.tld>

That is, the invalid characters (if that's what they are) should each be replaced with a single replacement character, such as "?", and the valid characters in the rest of the header should be displayed as is.

I copied this message to a local mail folder, and when I view it there, the default character set is different. It is western (ISO-8859-1). In that character set, the 0xB0 character displays as a degree symbol.

This different character set behavior surprised me, and raises these questions:
Is the default character set for displayed messages a per-folder setting?
or per-server setting?
Or a single pref for all folders/servers? (as I thought)
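For readers who want to reproduce the two interpretations outside the mail client, here is a minimal Python sketch. It assumes the raw header bytes are exactly as described in the report (a single 0xB0 byte on each side of the name); it says nothing about how SeaMonkey itself decodes headers.

```python
# Raw bytes of the From: header as described in the report (assumed layout).
raw_from = b'"\xb0 some name \xb0" <someaddr@domain.tld>'

# ISO-8859-1 maps every byte to a character; 0xB0 is U+00B0 DEGREE SIGN.
as_latin1 = raw_from.decode("iso-8859-1")

# Strict UTF-8 rejects the header: 0xB0 can only appear as a continuation
# byte after a lead byte, never on its own.
bad_offset = None
try:
    raw_from.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError as exc:
    utf8_ok = False
    bad_offset = exc.start  # index of the first offending byte

print(as_latin1)
print(utf8_ok, bad_offset)
```

The same bytes are therefore a valid header under ISO-8859-1 and an invalid one under UTF-8, which matches the difference seen between the local folder and the newsgroup.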
Comment 1•17 years ago
> When that message is viewed with the UTF-8 character set,

It's not a charset, just an encoding. ;-)

> the From: header in the message header pane Looks like this:
> From: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
> where each character shown as an "X" above actually looks like a rectangle
> containing 4 hex characters. Here is an ASCII art approximation:
> +-----+
> | F F |
> | F D |
> +-----+

Maybe you could provide a screenshot? (This sounds like you probably don't have a Unicode font available.)

> 1) Not only are the two 0xB0 characters replaced by that odd FFFD character,
> but the entire From line is replaced with them.

0xFFFD is the correct character to show if 0xB0 is found in a UTF-8 string; its meaning is "invalid character" (a lone 0xB0 byte is invalid in UTF-8). It usually looks like a 'white' question mark inside a 'black' rhombus. Showing only lots of 0xFFFD surely is a bug.

> IMO, that From: header *SHOULD* be displayed something like this:
> From: "? some name ?" <someaddr@domain.tld>
>
> That is, the invalid characters (if that's what they are) should each be
> replaced with a single replacement character, such as "?", and the valid
> characters in the rest of the header should be displayed as is.

Exactly.
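The expected behavior agreed on above (one replacement character per invalid byte, everything else untouched) can be sketched with Python's built-in "replace" error handler. This only illustrates the expected output; it is not the mail client's decoding code, and the header bytes are assumed from the report.

```python
# Header bytes as described in comment 0 (assumed layout).
raw_from = b'"\xb0 some name \xb0" <someaddr@domain.tld>'

# Lenient UTF-8 decode: each invalid byte becomes one U+FFFD,
# and all valid ASCII bytes pass through unchanged.
lenient = raw_from.decode("utf-8", errors="replace")

# Number of replacement characters produced.
replaced = lenient.count("\ufffd")

# The reporter's suggested rendering, using "?" as the visible substitute.
as_question_marks = lenient.replace("\ufffd", "?")

print(replaced)
print(as_question_marks)
```

Because each bad byte maps to exactly one character here, the decoded string has the same length as the raw header, in contrast to the much longer run of U+FFFD boxes shown in the header pane.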
Reporter
Comment 2•17 years ago
I *think* I have numerous Unicode fonts, but it's not clear to me how to select one for UTF-8 use. SM prefs let me select fonts for various languages, but UTF-8 is not one of the languages for which I can select a font (naturally enough, I suppose).

I am seeing this problem on two different systems, but the actual character being displayed repeatedly in the From: header is different on the two. I can get a (better) screen capture from the other system shortly.
Reporter
Comment 3•17 years ago
I suspect these are the characters you were expecting to see. The question is: why are they repeated all the way across the From: header?
Comment 4•17 years ago
A similar result is observed with your test data when Default Character Encoding = ISO-2022-JP.

What happens when the "character encoding" setting under Tools/Options is changed from UTF-8 to windows-1252 or iso-8859-1?
- Tools/Options/Display/Formatting/Fonts :
- Character Encodings :
- Incoming Mail : western(windows-1252) or iso-8859-1

"View/Character Encoding" is applied to the message text body and the message header pane, but doesn't seem to apply to the thread pane (mail list pane) when the header is not encoded. Since the From: header carries no encoding, the RFC defines its charset as us-ascii, but Tb seems to always use the "Tools/Options/Character Encoding" setting in such a case, for the user's convenience.

As far as I remember, an "enhancement of header data display in thread pane" (a simpler way to force a specific charset, or a display option) has already been requested for cases like yours, where a header is invalidly encoded or contains data that is invalid for its encoding. But I can't recall the bug number...
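The fallback described above (an unlabeled header is nominally us-ascii, but the client applies a configured default encoding when 8-bit bytes appear) can be sketched roughly as follows. The function name `decode_unlabeled_header` and its `default_charset` parameter are hypothetical, chosen for illustration; this is an assumption about the general approach, not the actual SeaMonkey or Thunderbird code.

```python
def decode_unlabeled_header(raw: bytes, default_charset: str) -> str:
    """Decode a header value that contains no RFC 2047 encoded-words.

    Hypothetical helper: pure ASCII is returned as-is; otherwise the
    caller's default charset is applied, and bytes that are invalid in
    that charset are each replaced with U+FFFD.
    """
    try:
        # Per the RFC, an unencoded header should be us-ascii.
        return raw.decode("ascii")
    except UnicodeDecodeError:
        # 8-bit data present: fall back to the folder/account default.
        return raw.decode(default_charset, errors="replace")


raw_from = b'"\xb0 some name \xb0" <someaddr@domain.tld>'

western = decode_unlabeled_header(raw_from, "iso-8859-1")
unicode_default = decode_unlabeled_header(raw_from, "utf-8")
plain = decode_unlabeled_header(b"<someaddr@domain.tld>", "utf-8")

print(western)
print(unicode_default)
print(plain)
```

With a western default the 0xB0 bytes come out as degree signs, and with a UTF-8 default they come out as two replacement characters, so the visible result depends entirely on which default the folder is configured with.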
Comment 5•17 years ago
(In reply to comment #0)
> Is the default character set for displayed messages a per-folder setting?
> or per-server setting?
> Or a single pref for all folders/servers? (as I thought)

Sorry, I missed this important part of your comment. The "Default Character Encoding" setting has already been changed from per-account to per-folder:
Context menu of folder/newsgroup => Properties => General Information tab