Open Bug 1228193 Opened 9 years ago Updated 2 years ago

UX problem selecting font for UTF-8 encoded plain text mail (see comment #8)

Categories

(Thunderbird :: General, defect)

38 Branch
defect

Tracking

(Not tracked)

People

(Reporter: foo459, Unassigned)

References

(Depends on 1 open bug)

Attachments

(1 file)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0
Build ID: 20140212131424

Steps to reproduce:

Selecting a new monospace font for a plain text message does not affect the text in the plain text message.

This appears to be essentially the same as bug 546877, which I am reopening here, since there seems to be no way to actually reopen it.

The problem seems to stem from the poor correspondence of the choices in the "Fonts for:" pulldown box, and the charset in the actual message.

For example, a message to me from bugzilla itself says
'Content-Type: text/plain; charset="UTF-8"'
However, when this message is selected, the font chooser in options > display > advanced defaults to "Latin."  A normal person would assume that this message was in a Latin font, but TB does not think so, so changing the font for "Latin" has no effect.

So, that is problem #1:

Language type should correspond to the currently selected message.
And, it should know what type "UTF-8" happens to be.

I note that if I choose "Other Writing Systems", the font for the plain text message is indeed changed as desired.

This is a bug.  I can understand if you don't want to spend resources to fix it, but it's still a bug and should not be listed as "resolved."
I don't understand the problem you're reporting.

When I select a BMO message to display in the message pane which is encoded in UTF-8 the menu "View > Character Encoding" correctly shows "Unicode".

Equally a charset=ISO-8859-1 encoded message shows as "Western".

UTF-8 falls into the category "Other writing systems", so you need to change the settings for "Other writing systems" to change the display of any UTF-8 encoded message.

> it should know what type "UTF-8" happens to be.
I don't understand this comment. UTF-8 is an encoding that can encode *all* the languages of the planet in one single encoding, while offering backward compatibility with 7bit ASCII (https://en.wikipedia.org/wiki/UTF-8).

No one can know what UTF-8 "happens" to be, it can be German: Heute ist es heiß, Spanish: ¡Buenos días!, it can be Japanese: テスト or Korean: 안녕하세요? or French: Voilà, Français and even Chinese: 中文. Sorry, I don't speak Russian, but you get the point ;-)
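To make the point about UTF-8 concrete, here is a minimal Python sketch (not part of the original discussion). It encodes a few of the sample strings above and checks that plain ASCII text is byte-for-byte unchanged in UTF-8. The helper name `utf8_summary` is made up for this illustration.

```python
# Sample strings taken from the comment above; labels are for display only.
samples = {
    "ASCII": "Hello",
    "German": "Heute ist es heiß",
    "Japanese": "テスト",
    "Korean": "안녕하세요?",
    "Chinese": "中文",
}


def utf8_summary(text):
    """Return (number of characters, number of bytes when encoded as UTF-8)."""
    return len(text), len(text.encode("utf-8"))


for label, text in samples.items():
    chars, size = utf8_summary(text)
    print(f"{label}: {chars} characters, {size} bytes in UTF-8")

# ASCII text is identical in ASCII and UTF-8 (the backward compatibility
# mentioned above).
assert "Hello".encode("utf-8") == "Hello".encode("ascii")
```

The charset label alone therefore says nothing about which script a message is written in; the same encoding carries all of them.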

Unless you convince me otherwise, I will mark this bug as invalid, just as bug 546877, which is NOT listed as resolved but invalid.
Ok, let me put it this way.  The average user, such as for example, me, is not likely to know the classifications of all the possible encoding systems that tb can understand.

Therefore, when I see an email message, and I want a different font, and I go to an option that wants me to select a font, I naturally assume that this defaults to whatever category the current encoding falls into.  Alternately, I would expect this to be discoverable without having expert knowledge of coding systems.

So, it is true that the current behavior is "correct."  It's just of no use to a user without expert knowledge of coding systems, unless the current message's encoding by chance happens to be in the default category.

One solution is for tb to examine the current message, and choose the default in the font chooser accordingly.
to be clearer:  choose the default *category* in the font chooser accordingly.
(In reply to peter from comment #2)
> One solution is for tb to examine the current message, and choose the
> default in the font chooser accordingly.

Impossible. The message could contain this text:
¡Buenos días! - テスト - 안녕하세요? - Voilà, Français - 中文.
Which category do you choose now?
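A rough sketch of why no single category fits such a message: the Python snippet below counts letters per script in that example text. It uses the first word of each Unicode character name as a crude script label, which is an assumption made for illustration and not how Gecko classifies text. The function name `script_counts` is hypothetical.

```python
import unicodedata
from collections import Counter


def script_counts(text):
    """Count letters per script, using the first word of the Unicode
    character name (e.g. LATIN, KATAKANA, HANGUL, CJK) as a rough label."""
    counts = Counter()
    for ch in text:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name:
                counts[name.split()[0]] += 1
    return counts


mixed = "¡Buenos días! - テスト - 안녕하세요? - Voilà, Français - 中文."
for script, count in script_counts(mixed).most_common():
    print(script, count)
```

A "most frequent script wins" rule would pick a category here, but it would still leave the other scripts in the message governed by a different font setting.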

Look, set the fonts for all "writing systems" to the fonts you want to use and be done with it. I have the same fonts set for "Latin" and "Other writing systems".
Status: UNCONFIRMED → RESOLVED
Closed: 9 years ago
Resolution: --- → INVALID
I observe that the message that bugzilla sent me with your last reply is again 'Content-Type: text/plain; charset="UTF-8"', and I can change both font size, and font (for at least the latin part) under "other writing systems."

So, I would say that if the message contains charset="UTF-8", it still should choose "other writing systems."  It can't choose the right font for me, though, obviously.  I am not intimately familiar with all the encoding niceties, so I'm sure you can come up with some counterexample that happens one time out of a thousand where that will fail, or even one time in ten.

More generally, I would say that the goal in UI design should be to not sacrifice good handling for high probability events in order to optimize the best handling for low probability events.

In this case, the default category, as best I can tell, is unrelated to the message content.  This is like a stopped clock being right twice a day.  If instead, the default category reflected what is most likely the desired one, based on the content of the message, in particular what charset it claims to use, it might not be right 100% of the time, but it would be more frequently right than a stopped clock.

As for me personally, at this point I know what to do.  A year from now, if tb should still exist, I will no longer remember what to do.  Unfortunately, it's not a set-and-forget thing, because it depends on the display I am using, and on the current state of my eyesight, which varies from day to day, not to mention from year to year.
Feel free to raise an enhancement request. I haven't understood what you mean by category. But you say that the system should inspect the content of the message to select the display format. So if UTF-8 encoded content is mostly Western/Latin, then that should be used.

As for forgetting: Everything I know I will forget, I write down. There are so many things I need to work out at some stage and then I won't need the knowledge for years. So I write it down, not on a piece of paper which I also won't find, but in a searchable electronic format (example: How to disable Windows auto-reboot after Windows update).
"So if UTF-8 encoded content is mostly Western/Latin, then that should be used."

What I am saying is that right now, if the message is UTF-8, then the category I need to choose in the font chooser, in order to have it apply to the current message, as near as I can tell, is "other writing systems," but it does not default to that category.  So, I am saying that if the message claims to be UTF-8, then the font chooser should choose that category by default, making the assumption that I want to change the current message.
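The proposal in the previous paragraph could be sketched roughly as follows. This is a Python illustration, not Thunderbird code: the mapping table and the function name `default_font_category` are hypothetical, and the two entries only restate behavior described in this bug (UTF-8 under "Other Writing Systems"; ISO-8859-1 under "Latin" is an assumption).

```python
from email import message_from_string

# Hypothetical mapping for illustration only; not taken from Mozilla sources.
CHARSET_TO_CATEGORY = {
    "utf-8": "Other Writing Systems",
    "iso-8859-1": "Latin",  # assumption
}


def default_font_category(raw_message, fallback="Latin"):
    """Pick the "Fonts for:" entry to pre-select from the message's charset."""
    msg = message_from_string(raw_message)
    # get_content_charset() returns the charset in lower case, or None.
    charset = msg.get_content_charset()
    return CHARSET_TO_CATEGORY.get(charset, fallback)


raw = 'Content-Type: text/plain; charset="UTF-8"\n\nHello\n'
print(default_font_category(raw))
```

The point of the sketch is only that the pre-selected entry would follow the declared charset of the message being viewed, instead of always starting at the same entry.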

(BTW, the current message does not actually change when I click 'ok' enough times in the options dialogs -- I must look at another message, then come back.  Maybe that is another bug, I mean enhancement, that could be done.)

I am not sure if you are interpreting what I am saying as a request that if the message says it is UTF-8, *and* the predominant characters are Latin, then the "Latin" category of the font chooser should apply to that message. (The font chooser has a field called "Fonts for" which I am calling "category.")  Yes, that would be another way to fix it, but it would be a lot more complicated than just having the chooser default to (i.e., pre-choose) "other writing systems" when the message is UTF-8.

Right now, I am thousands of miles from home (in metric, er, millions of meters from home);  I will wait til I get home next week to submit it as an enhancement.  Hopefully, if I write it down, I will remember.
OK. Let me summarise.

The main complaint is that for messages received in UTF-8, you have to select "Other Writing Systems" in
  Tools > Options, Display, Formatting, Fonts & Colors, Advanced, Fonts for:
to set the font for plain text messages. And that is not obvious.

In general, I find the options on the text encoding menu and the "Fonts for:" options quite confusing and I will attach a picture showing both. Note that the reporter refers to "Fonts for:" as "category".

Now the bad news is that this is not Thunderbird functionality, it is in fact Mozilla functionality which is used in Firefox and Thunderbird. This can only be changed for both products and the chances to get this approved are very low.

In Firefox the features are used as follows:
The "Text encoding" on the view menu shows how the web page, or in case of Thunderbird, the e-mail is encoded.

The "Fonts for:" ("category") is basically a legacy feature coming from the time when websites allowed users to select the fonts and sizes they wanted to use. These days websites select the fonts and sizes and most of the web would look very strange if users were to select their own. In fact, most of the web is done in CSS and these font settings have no effect. In Google Chrome you even need a plug-in to set fonts for different writing systems.

Only poor old Thunderbird relies on the feature to select a font for plain text e-mail which doesn't carry font information.

As I said before: To ensure the font you like, set the font for "Other Writing Systems" and "Latin".

Come to think of it, this bug is a "wontfix" rather than "invalid".

References:
Writing systems are a science of their own; here are a few links for further reading:
https://en.wikipedia.org/wiki/Writing_system
https://en.wikipedia.org/wiki/Script_%28Unicode%29
https://en.wikipedia.org/wiki/ISO_15924 - http://unicode.org/iso15924/iso15924-codes.html
Component: Untriaged → General
Resolution: INVALID → WONTFIX
Summary: Changing monospace font does not affect plain text message → UX problem selecting font for UTF-8 encoded plain text mail (really a Firefox problem, see comment #8)
Henri, I saw your posts on dev-platform and I believe you know something about encodings.

I've always been wondering about the relationship between encodings and so-called writing systems. In Firefox and Thunderbird there are options to assign fonts to a "writing system". My questions are:
- How does the system work out the "writing system"? Is it derived from
  the encoding? My picture shows that encoding "Chinese, Traditional" can be
  associated with two writing systems, Traditional Chinese, Hong Kong and Taiwan.
- How are Armenian, Bengali, Devanagari, Ethiopic, Georgian, etc. detected?
- If something is encoded in Unicode, then "Other Writing Systems" is associated,
  rather than looking at the content and associating with the respective writing system.

If you don't know, please point me to someone who does.
Flags: needinfo?(hsivonen)
First, Gecko looks up the language of the text node being rendered by walking up the DOM until it finds the language declared using a lang attribute. This code predates the standardization of writing-system (script) subtags in language tags, so tags like ja-Latn don't map to the Latin font prefs. Rather, the mapping is done based on the language according to https://mxr.mozilla.org/mozilla-central/source/intl/locale/langGroups.properties which is not exactly in sync with the IANA registry.

If there is no declared language, the writing system is guessed from the character encoding according to the mapping in https://mxr.mozilla.org/mozilla-central/source/dom/encoding/encodingsgroups.properties?force=1 .

This stuff is not in great shape. The system was designed at a time when even Western European and Central European fonts were separate (we only recently merged these as well as some others under Latin) and when UTF-8 wasn't in much use. Making this subsystem great never seems high enough a priority to get proper developer time. (The Latin merge was exceptional.)

Also, obviously, looking for the lang attribute doesn't help with text/plain.
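The two-step lookup described in this comment can be sketched in Python as below. This is an illustration of the described order of checks, not Gecko code. The `Node` class and both tables are hypothetical, and the table entries are NOT copied from langGroups.properties or encodingsgroups.properties. Falling back to the encoding when a declared language is missing from the table is also an assumption.

```python
class Node:
    """Minimal stand-in for a DOM node with an optional lang attribute."""

    def __init__(self, lang=None, parent=None):
        self.lang = lang
        self.parent = parent


# Illustrative entries only (assumptions, not the real property files).
LANG_TO_GROUP = {"ja": "Japanese", "ko": "Korean", "de": "Latin"}
ENCODING_TO_GROUP = {"shift_jis": "Japanese", "utf-8": "Other Writing Systems"}


def declared_lang(node):
    """Walk up the tree until a node declares a language."""
    while node is not None:
        if node.lang:
            return node.lang
        node = node.parent
    return None


def font_group(node, encoding):
    """Language first; if none is declared, guess from the encoding."""
    lang = declared_lang(node)
    if lang is not None:
        # Only the language part is used, so "ja-Latn" is treated as "ja".
        group = LANG_TO_GROUP.get(lang.split("-")[0].lower())
        if group is not None:
            return group
    return ENCODING_TO_GROUP.get(encoding.lower(), "Other Writing Systems")


text_node = Node(parent=Node(lang="ja-Latn"))
print(font_group(text_node, "utf-8"))  # language found on an ancestor
print(font_group(Node(), "utf-8"))     # no lang anywhere, as with text/plain
```

For text/plain mail there is no lang attribute to find, so only the second step applies, which is why UTF-8 messages end up under "Other Writing Systems".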
Flags: needinfo?(hsivonen)
Ok, thanks for taking this seriously and working out the details.
Henri, thank you for the information, it is most interesting.

In the Thunderbird e-mail composition window we do have a "lang" attribute on document.documentElement, and it is my personal project to transmit the language attribute in an e-mail header as per bug 1201836. If an e-mail with language information is received, we could make sure that Gecko finds the language so the font mapping would work better for UTF-8 as the reporter suggested.
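As a rough sketch of the receiving side, assuming the language is carried in a Content-Language header (RFC 3282 defines such a header; whether bug 1201836 uses exactly this header is an assumption here), the declared languages could be read like this in Python. The function name `declared_languages` is hypothetical.

```python
from email import message_from_string


def declared_languages(raw_message):
    """Return the language tags from a Content-Language header, if any."""
    msg = message_from_string(raw_message)
    header = msg.get("Content-Language", "")
    # The header may list several comma-separated tags.
    return [tag.strip().lower() for tag in header.split(",") if tag.strip()]


raw = (
    'Content-Type: text/plain; charset="UTF-8"\n'
    "Content-Language: de, en-US\n"
    "\n"
    "Heute ist es heiß\n"
)
print(declared_languages(raw))
```

A tag obtained this way could then feed the language-based font lookup, so that UTF-8 plain text no longer has to fall back to the encoding-based guess.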

That's a bit down the track but it's very helpful to understand the language to writing system and the encoding to writing system lookup.
Summary: UX problem selecting font for UTF-8 encoded plain text mail (really a Firefox problem, see comment #8) → UX problem selecting font for UTF-8 encoded plain text mail (see comment #8)
Let's revisit this when bug 1201836 is fixed.
Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: WONTFIX → ---
Status: REOPENED → NEW
Depends on: 1201836
See Also: → 1254053
See Also: → 1501655
Severity: normal → S3