Closed Bug 224391 Opened 21 years ago Closed 15 years ago

Outgoing mail/news should use UTF-8 by default

Categories

(Thunderbird :: Preferences, enhancement)

enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: mozilla, Unassigned)

References

Details

(Whiteboard: [patchlove])

Attachments

(1 obsolete file)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.6a) Gecko/20031029 Firebird/0.7+
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-GB; rv:1.6a) Gecko/20031029 Thunderbird/0.4a

For internationalisation purposes, I think it would make sense.

Reproducible: Always

Steps to Reproduce:
Component: Message Compose Window → Preferences
QA Contact: preferences
In bug 374294 Magnus Melin wrote (2007-03-17 05:04:03 PDT)

> xref/dupe bug 224391. Unfortunately, webmail support for utf-8 messages is not
> very good.

Well, there are at least 4 webmails that have UTF-8 support: Horde, Roundcube, Openwebmail, Squirrelmail. In Gmail, an option exists to send all mail in UTF-8.
A webmail client that does not support UTF-8 in 2007 should be considered outdated/broken.
It should also be noted that most software development tools now send all their emails in UTF-8 (e.g. Bugzilla, Subversion, Trac, ...).

The users thus need to be prepared to receive UTF-8 encoded mails anyway.

So why not change Thunderbird as well?

This will definitely speed up the process of identifying non-compliant software and trigger efforts to fix it, which will be a clear benefit for everyone.

It doesn't make sense to tag your email as UTF-8 if in fact you're just sending plain ASCII. There's no sense in harassing old mail clients with UTF-8 if your content is just latin-1.

MUAs should adhere to the golden principle of "be liberal in what to accept, but conservative in what you send", and the latter means "use the smallest, most-spread charset possible".

(Whether windows-1252 is a suitable default or not beats me, I turned it off. ;-))
Attached patch Make UTF-8 default. (obsolete) — Splinter Review
Just a straightforward defaults change.
It patches both mail/ and mailnews/; unfortunately I do not know the details of why it is there twice.
Test build for Fedora is downloadable at (it will get removed in some time):
  http://koji.fedoraproject.org/scratch/jkratoch/task_287777/
(In reply to comment #4)
> It doesn't make sense to tag your email as UTF-8 if in fact you're just
> sending plain ASCII.

Still it is much worse to make Thunderbird unusable for the non-English part of the world.  It is terrible to see mails from the Free software more broken than those from Outlook.


> There's no sense in harassing old mail clients with UTF-8 if your content
> is just latin-1.

This is a minor issue.  Please support your claims by facts - name any MUA which is unable to handle `Content-type: text/plain; charset=utf8' with ASCII content.
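
To make the case under discussion concrete, here is a minimal sketch of a message whose body is pure ASCII but which is labelled as UTF-8. It uses Python's standard `email` package purely for illustration; it is not Thunderbird code, and the subject and body text are made up. Note that the registered MIME name is `utf-8`; the `utf8` spelling above is shorthand.

```python
from email.message import EmailMessage

# Build a plain-text message whose body is pure ASCII but labelled UTF-8.
msg = EmailMessage()
msg["Subject"] = "Charset test"  # illustrative subject
msg.set_content("Plain ASCII body.", charset="utf-8")

charset = msg.get_content_charset()
transfer_encoding = msg["Content-Transfer-Encoding"]
body_bytes = msg.get_payload(decode=True)

# The headers announce UTF-8, yet every body byte is 7-bit ASCII.
print(msg["Content-Type"])
print(transfer_encoding)
print(all(b < 0x80 for b in body_bytes))
```

A receiving client that ignores or does not recognise the charset label still sees only 7-bit bytes in the body of such a message.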

But I know about Japanese mobile devices which cannot parse Japanese Kanji in `utf-8' while they can display it in `iso-2022-jp'.  As there are IMO no such devices in the Western world, if utf-8 is unacceptable I propose changing the `iso-8859-1' default to `iso-2022-jp', as that would at least solve existing problems for a substantial number of users.


> MUAs should adhere to the golden principle of "be liberal in what to accept,
> but conservative in what you send", and the latter means "use the smallest,
> most-spread charset possible".

Sure, using Mutt which does so well.  Give a patch or be quiet.
> It patches both mail/ and mailnews/ , unfortunately I do not know the details why it
> is there twice.

/mail is for Thunderbird, /mailnews is for SeaMonkey.

This also means I'll veto any change to the SM default, unless a compelling reason is given to change it. The comments here so far are just "because we can", which isn't very helpful.
Not sending UTF-8 does no harm - if you enter non-fitting characters, we "autoraise" the charset anyway...

> Still it is much worse to make Thunderbird unusable for the non-English part
> of the world.  It is terrible to see mails from the Free software more broken
> than those from Outlook.

Huh? We don't send broken mails, we autodetect unfitting characters.

> This is a minor issue.  Please support your claims by facts - name any MUA
> which is unable to handle `Content-type: text/plain; charset=utf8' with ASCII
> content.

Just because I don't know any Urdu speakers, the language doesn't exist? 
Strange argument.
(In reply to comment #7)
> This also means I'll veto any change to the SM default,
...
> Not sending UTF-8 does no harm - if you enter non-fitting characters, we
> "autoraise" the charset anyway...

It is proven in the real world that users click "Send anyway".  They do not understand what a charset means.  With the current Thunderbird, regular users send mails containing invalid national characters labelled with Content-type charset iso-8859-1.  Otherwise I would not have raised this bug report and I would not be here.
Hello,

> MUAs should adhere to the golden principle of "be liberal in what to accept,
> but conservative in what you send", and the latter means "use the smallest,
> most-spread charset possible".

I totally agree here - mutt *should not* be cited as an old-fashioned, obsolete application. It does *an excellent* job in its segment (where no X session is available)... Many of us use it day by day, as our main e-mail client.

IMHO, the above is a deeper, "political" question: do we want to stop Thunderbird's progress for the sake of some obscure, old e-mail client? As Jan said, *all* major e-mail clients in 2007 know about UTF-8...

Yes, I think this should be a case where *we want* to break compatibility - at the risk of breaking some very old systems. *They* will be forced to upgrade and that's all.

This is especially true about some websites (Yahoo ?), mobile phone manufacturers and their e-mail clients. BTW, I will *adore* to see a smartphone (Motorola Q9, Nokia E61i...) that knows how to read/edit ISO OpenDocument files instead of .doc, .xls and all that Microsoft ****...


Regards,
Răzvan
(In reply to comment #4)
> It doesn't make sense to tag your email as UTF-8 if in fact you're just sending
> plain ASCII. There's no sense in harassing old mail clients with UTF-8 if your
> content is just latin-1.

Such a view is totally western-focused and does not take any other countries into account. Just in the EU, more than 10 different charsets are being used today, which is a complete mess - and UTF-8 is the only viable solution.

When you send your email in latin-1, it means that in several email clients it's only possible to reply in latin-1 - and all other typed characters are lost. For example, Outlook Express just discards all non-latin-1 characters without any warning and sends a broken message. Netscape gives a warning, but does not offer to upconvert to UTF-8. Even with TB, the user gets a confusing "Send in UTF-8" dialog - but most users click "Send anyway" and break the correctly written message. As a result, all non-latin-1 users receive several broken emails every day.
  
> MUAs should adhere to the golden principle of "be liberal in what to accept,
> but conservative in what you send", and the latter means "use the smallest,
> most-spread charset possible".

Using the smallest charset possible is the main source of email problems today.
It requires every email client to know about all possible charsets used on the net and to implement bug-free charset conversions on both the sender and the recipient side.
It also means that your outgoing emails will be sent in different charsets based on the content - and your recipients might be able to read some of your messages correctly while other messages will be broken. I've even seen an email client which encoded a mix of latin-2 and Cyrillic characters into some Japanese charset! In short - this principle introduces nondeterministic behaviour and random problems which are hard to diagnose and fix.

On the other hand, when we change to UTF-8 as a default, all sane email clients will be able to:
- correctly display every message
- allow the user to reply without any restrictions / lost characters
- avoid unnecessary conversion between the editor (which today usually runs in UTF-8 anyway) and email transfer encoding
- simplify email processing as such and avoid nondeterministic bugs.

Please also note that UTF-8 promoting sites have been suggesting this approach for all email clients for some time already - see e.g. http://www.utf-8.sk
Also for mutt, which was mentioned here before, the suggestion is to use:

    set send_charset="utf-8"
(In reply to comment #8)
> > Not sending UTF-8 does no harm - if you enter non-fitting characters, we
> > "autoraise" the charset anyway...
> 
> It is proven in the real world that the users click "Send anyway".

It is? I strongly doubt that, especially since "Send as UTF-8" is the default button in that dialog. (Of course, you'll always have a certain amount of morons not reading/understanding/doing the right thing.)

But that's what we have intl.fallbackCharsetList.* prefs for.

Per (non-localized) default, messages assume ISO-8859-1, so even pure ASCII is sent as such (IIRC, to suit some Outlook recipients).

If a character doesn't fit into ISO-8859-1, the fallback list in intl.fallbackCharsetList.ISO-8859-1 is used, which currently consists of just windows-1252. Thus, a '€' will result in windows-1252 without further questions, while a '⁂' will trigger the UTF-8 dialog.
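
The fallback behaviour described above can be sketched roughly as follows. This is an illustration in Python, not the actual mailnews code; the function names `fits` and `pick_charset` are hypothetical, and the default and fallback values are simply the ones mentioned in this comment.

```python
def fits(text, charset):
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def pick_charset(text, default="iso-8859-1", fallbacks=("windows-1252",)):
    """Return the first charset able to represent text, or None.

    None stands for the situation in which the UTF-8 dialog would be shown.
    """
    for charset in (default, *fallbacks):
        if fits(text, charset):
            return charset
    return None


print(pick_charset("plain text"))        # fits the default
print(pick_charset("price: 5 \u20ac"))   # euro sign needs the fallback
print(pick_charset("\u2042"))            # asterism fits neither charset
```

Only the third case falls through the whole list, which is the point at which the user would be asked about UTF-8.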

> They do not understand what does charset mean.

No doubt about that.



(In reply to comment #9)
> Yes, I think this should be a case where *we want* to break compatibility

No. (See below.)



(In reply to comment #10)
> > It doesn't make sense to tag your email as UTF-8 if in fact you're just
> > sending plain ASCII. There's no sense in harassing old mail clients with
> > UTF-8 if your content is just latin-1.
> 
> Such view is totally western-focused and does not take any other countries
> into account.

That was just an example. You can replace latin-1 with KOI8-U or Big5, etc. and it still holds true.

> When you send your email in latin-1, it means that on several email clients
> it's only possible to reply in latin-1 - and all other typed characters are
> lost.

This argument is basically "let us break other clients (by using UTF-8), because otherwise we break other clients (who don't support $feature)"...

> Even with TB, user gets a confusing "Send in UTF-8" dialog - but most users
> click "Send anyway" and break the correctly written message.

While I still doubt that "most" users do so (how do you count those who don't?), I agree that this dialog isn't very user-friendly - it's frighteningly full of text about an issue normal users don't care about.

> It requires every email client to know about all possible charsets used on 
> the net and implement bug-free charset conversions on both sender & 
> recipient side.

Yeah, that shouldn't be the case in an ideal world. But, alas, we don't have one.

> It also means, that your outgoing emails will be sent in different charsets
> based on the content - and your recipients might be able to read some of your
> messages correctly while other messages will be broken.

That's why we stick to the charset used by the sender when replying to it (as long as our new content fits into it).

> On the other hand, when we change to UTF-8 as a default, all sane email
> clients will be able to:

"Sane" is a strange category here, especially when applied only to clients which fit into your point of view...



So I can summarize this discussion into "I am annoyed by some users who are confused by the 'Use UTF-8?' dialogue and send me broken mail, so I want St. Florian's principle applied and annoy others with UTF-8". ;-)

A simple solution to this problem (which I do see, I can understand your frustration!) which still would stay friendly to older clients would be adding UTF-8 as the final charset in the fallback list:

intl.fallbackCharsetList.ISO-8859-1=windows-1252,UTF-8

(maybe even dropping that windows-1252 abomination).

Actually, I think the default should be ASCII and have a localizable fallback list like "$localized_charset,UTF-8".

This would kill this useless dialog for almost everyone.
(In reply to comment #11)
> (In reply to comment #8)
> > > Not sending UTF-8 does no harm - if you enter non-fitting characters, we
> > > "autoraise" the charset anyway...
> > 
> > It is proven in the real world that the users click "Send anyway".
> 
> It is? I strongly doubt that, especially since "Send as UTF-8" is the default
> button in that dialog. (Of course, you'll always have a certain amount of
> morons not reading/understanding/doing the right thing.)

So far I have had to instruct two women I met who were using Thunderbird; both were sending corrupted national characters as iso-8859-1.  I do not use Thunderbird, I do not know what dialogs it provides or what its fallback/whatever settings are.  I do not know what those women clicked, but they were trying to use it as best they could.

Your solution suggestions at the bottom may apply, I do not know.  And I believe any such charset dialog makes no sense in Thunderbird as it should try to be user friendly.  Others still have the `Config Editor' there or Mutt/Emacs-VM.
(In reply to comment #11)

> If a character doesn't fit into ISO-8859-1, the fallback list in
> intl.fallbackCharsetList.ISO-8859-1 is used, which currently just consists of
> windows-1252. Thus, a '€' will result in windows-1252 without further
> questions, while a '⁂' will trigger the UTF-8 dialog.

So this is the first major problem. Windows-1252 is a proprietary charset of one vendor, which should be avoided on principle. All RFCs mention that ISO charsets should be used. At minimum, this needs to be replaced by UTF-8, as is the case with e.g. mutt.

> > When you send your email in latin-1, it means that on several email clients
> > it's only possible to reply in latin-1 - and all other typed characters are
> > lost.
> 
> This argument is basically "let us break other clients (by using UTF-8),
> because otherwise we break other clients (who don't support $feature)"...

Could you be more specific about which clients we are going to "break" by using UTF-8?
According to the IMC recommendation from August 1998 (!), all email clients created or revised after 1.1.1999 should be capable of sending/receiving email in UTF-8.
Are you really advocating terribly outdated and broken software whose developers were not able to add UTF-8 support during the last 9 years?!

I mean, on the web, there's no discussion anymore that web designers should be disallowed to use UTF-8 since Netscape Navigator 0.1beta did not support it.
The mess and problems we're seeing with international email are a direct consequence of the "smallest charset" approach and no real push for UTF-8 support.

> That's why we stick to the charset used by the sender when replying to it (as
> long as our new content fits into it).

This approach only works with latin-1 since many countries fit into it.
It fails miserably everywhere else and UTF-8 is the only decent solution.

> > On the other hand, when we change to UTF-8 as a default, all sane email
> > clients will be able to:
> 
> "Sane" is a strange category here, especially when applied only to clients
> which fit into your point of view...

Sane here means that those clients are keeping up with standards and were updated during the last 9 years... People have probably trashed their old computers and reinstalled their OS several times within this time period.

> So I can summarize this discussion into "I am annoyed by some users who are
> confused by the 'Use UTF-8?' dialogue and send me broken mail, so I want St.
> Florian's principle applied and annoy others with UTF-8". ;-)

No, this is not the point. The summary should read that people living in non-latin-1 countries are annoyed to see several broken emails every day due to an unfounded attempt to be tolerant of a few email clients like Eudora, which were not able to keep up with standards during the last 9 years and which are dead anyway.
 
> Actually, I think the default should be ASCII and have a localizable fallback
> list like "$localized_charset,UTF-8".

This still keeps nondeterministic behaviour and unnecessary problems for many sysadmins. For what reason - a few terribly outdated email clients which have died anyway? No one is complaining to Wikipedia and most international web pages about using UTF-8 everywhere - and UTF-8-incapable clients are simply not there anymore. Most operating systems have switched to UTF-8 internally as well. So why should we stick with legacy charsets just for emails ???

(In reply to comment #13)
[windows-1252 is default fallback for (only) latin-1]
> So this is the first major problem.

Actually, no. If your character doesn't fall into any of the fallbacks, you still get the dialog.

> At minimum, this needs to be replaced by UTF-8,

Mind that I don't like/use windows-1252 either (comment #4), I just don't think that _replacing_ it by UTF-8 is useful.

> > This argument is basically "let us break other clients (by using UTF-8),
> > because otherwise we break other clients (who don't support $feature)"...
> 
> Could you be more specific which clients are we going to "break" by using
> UTF-8?

I won't do that, because that is completely irrelevant.
You're proposing to break something valid without any need! 

> > That's why we stick to the charset used by the sender when replying to
> > it (as long as our new content fits into it).
> 
> This approach only works with latin-1 since many countries fit into it.
> It fails miserably everywhere else

How that?!

> > Actually, I think the default should be ASCII and have a localizable 
> > fallback list like "$localized_charset,UTF-8".
[...]
> No one is complaining to wikipedia and most international web pages for
> using UTF-8 everywhere - and UTF8-incapable clients are simply not there
> anymore.

I don't care a bit for what web pages do, it's completely irrelevant here.

> So why should we stick with legacy charsets just for emails ???

No, but because we're not just a mail client, but a Usenet client as well.
And Usenet clients are still found running on many, even very old, systems.
Since appending UTF-8 to the end of the fallback list will solve your problem without harming other uses (yes, I do think there's no need anymore to warn when sending UTF-8 in this day and age), I think it's the way to go.
(In reply to comment #14)

> > Could you be more specific which clients are we going to "break" by using
> > UTF-8?
> 
> I won't do that, because that is completely irrelevant.
> You're proposing to break something valid without any need!

I'm definitely not proposing to break something. UTF-8 is a valid charset recognized by RFCs and ISO, and by using it nothing should break. US-ASCII is contained in UTF-8 at exactly the same codepoints, so even an email client that does not understand what UTF-8 is would display ASCII text without any problems.
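
The ASCII-compatibility claim is easy to check. The short Python sketch below (sample strings are arbitrary) shows that ASCII-only text produces identical bytes in US-ASCII, UTF-8 and ISO-8859-1, and that the encodings only diverge once a non-ASCII character appears.

```python
text = "Hello, world!"

ascii_bytes = text.encode("ascii")
utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("iso-8859-1")

# ASCII-only text is byte-for-byte identical in all three encodings.
same_bytes = ascii_bytes == utf8_bytes == latin1_bytes
print(same_bytes)

# The encodings differ only for non-ASCII characters, e.g. e-acute.
e_acute_utf8 = "\u00e9".encode("utf-8")
e_acute_latin1 = "\u00e9".encode("iso-8859-1")
print(e_acute_utf8, e_acute_latin1)
```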

Thus what's being proposed is just to use a different default (UTF-8 instead of latin-1), which is much better suited for today's environment.
 
> > > That's why we stick to the charset used by the sender when replying to
> > > it (as long as our new content fits into it).
> > 
> > This approach only works with latin-1 since many countries fit into it.
> > It fails miserably everywhere else
> 
> How that?!

As a latin-1 user, you probably have no chance to notice all the mess around legacy charsets. Most applications use latin-1 as default and latin-1 fonts are installed everywhere so even if something breaks, latin-1 text is always displayed correctly.

However, as soon as you need something else (e.g. latin-2) you'll find out, that every such problem results in displaying latin-1 characters instead of the correct ones and texts get broken. UTF-8 solves this cleanly, since every character has a different codepoint in UTF-8 so it's not possible to misinterpret the text.
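
As an illustration of the failure mode described here, the following Python sketch encodes a Czech word in latin-2 and then reads the bytes back as latin-1, the way a client falling back to the wrong default would. The sample word is arbitrary. The last step shows one practical difference: these legacy bytes are not valid UTF-8, so a strict UTF-8 decoder rejects them instead of silently showing wrong letters.

```python
word = "\u010de\u0161tina"  # "cestina" with carons on the c and the s

# Sent as latin-2, but displayed by a client that assumes latin-1.
latin2_bytes = word.encode("iso-8859-2")
misread = latin2_bytes.decode("iso-8859-1")
print(misread)  # wrong letters, and no error is raised

# The same bytes are not valid UTF-8, so the mislabelling is detectable.
try:
    latin2_bytes.decode("utf-8")
    detected = False
except UnicodeDecodeError:
    detected = True
print(detected)
```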
  
> No, but because we're not just a mail client, but a Usenet client as well.
> And Usenet clients are still found running on many, even very old, systems.

Those systems would probably reject any 8-bit encoding, so the only usable charset for them might be US-ASCII. If usenet is the main concern, a better
solution would be to use different defaults: UTF-8 for email and US-ASCII for news.

> Since appending UTF-8 to the end of the fallback list will solve your problem
> without harming other uses (yes, I do think there's no need anymore to warn
> when sending UTF-8 in this day and age), I think it's the way to go.

This would just eliminate the "Send in UTF-8" dialog, but none of the other problems I've mentioned before. It definitely won't help to speed up UTF-8 adoption (which is the optimal solution) but instead would prolong the current mess with many different charsets and problems for non-latin1 users. You said that web is irrelevant here - but it's a good example how the change into UTF-8 helped to sort out problems and eliminated outdated/broken applications. I believe it's time to do this for email as well.

(In reply to comment #15)
> I'm definitely not proposing to break something.

You're proposing to break backwards compatibility.

> > > > That's why we stick to the charset used by the sender when replying to
> > > > it (as long as our new content fits into it).
> > > 
> > > This approach only works with latin-1 since many countries fit into it.
> > > It fails miserably everywhere else
> > 
> > How that?!
> 
> As a latin-1 user, you probably have no chance to notice all the mess around
> legacy charsets.

Oh, I have. ISO-8859-15 (containing €) is still far less spread than -1... 

> > No, but because we're not just a mail client, but a Usenet client as well.
> > And Usenet clients are still found running on many, even very old, systems.
> 
> Those systems would probably reject any 8-bit encoding, so the only usable
> charset for them might be US-ASCII.

(As a side note: we even violate the specs here, because by default we send "7Bit" along with "ISO-8859-1" if the mail is pure ASCII, just to please Outlook. That stinks.)

[UTF-8 as final fallback]
> This would just eliminate the "Send in UTF-8" dialog, but none of the other
> problems I've mentioned before.

If the recipient doesn't understand your charset, it doesn't understand it. This is as true for latin-2 as for UTF-8. With UTF-8 as final fallback, every mail you write which wouldn't fit into ISO-8859-1 would be sent as UTF-8 without asking.

> It definitely won't help to speed up UTF-8 adoption (which is the optimal 
> solution)

It would, because we would just send either ASCII/latin-1 (or a similar localization) or UTF-8.


All this so far actually applies only to new mails, since by default we respect the charset the sender was using when replying. In that case, neither "your" default charset nor "my" fallback will come into play, you'd still get asked about UTF-8 if you enter non-fitting characters...
(That's because we would look for a fallback for that particular charset, which we possibly don't have - we can't have fallback prefs for each and every charset out there.)

So, letting aside the question whether we should try to send minimal charsets or UTF-8 directly, we lack the ability to say "don't ask about sending UTF-8 (if needed), just do it".
(In reply to comment #16)
> You're proposing to break backwards compatibility.

Well, if we really want to be 100 % backwards compatible, we must not use
anything else than 7bit ASCII, since MIME extensions came much later than
email started ;-)
  
> [UTF-8 as final fallback] 
> All this so far actually applies only to new mails, since by default we respect
> the charset the sender was using when replying. In that case, neither "your"
> default charset nor "my" fallback will come into play, you'd still get asked
> about UTF-8 if you enter non-fitting characters...

Ah, that's too bad... This is the most typical case where "Send in UTF-8" appears - user receives email in different charset and replies to it with non-fitting characters.

That's why we configure UTF-8 as the default charset together with "Use the default character encoding in replies"; we also need the fix from bug 301291.
With such a configuration everything works as expected.

Now the main question is what to put as defaults - if we can't eliminate the
"Send in UTF-8" dialog for replies, it doesn't make much sense to mess with
the default charset or final fallback... Using the sender's charset in replies is the
same sort of "extreme safety" as attempts to avoid UTF-8 - and our experience
shows that those are not required today (the whole university runs the above-mentioned setup, webmail is configured for UTF-8 only and we receive far fewer complaints than before, when we were using legacy charsets).
I guess we could just eliminate the dialog here <http://mxr.mozilla.org/seamonkey/source/mailnews/compose/resources/content/MsgComposeCommands.js#1803> (in the SM case, similar for TB) after calling checkCharsetConversion, ie delete lines 1807-1818 and 1820-1825.

That'd leave all our fallback charset magic intact, and if the user still enters non-fitting characters, we'd send UTF-8 without any further questioning.
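
The intended effect of dropping the dialog can be sketched as below. This is a Python illustration of the behaviour being proposed, not the code in MsgComposeCommands.js; `fits` and `charset_for_reply` are hypothetical names, and the charsets in the usage lines are just examples.

```python
def fits(text, charset):
    """Return True if every character of text can be encoded in charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


def charset_for_reply(reply_text, sender_charset, fallbacks=()):
    """Keep the sender's charset (or a fallback) while the text fits.

    If nothing fits, return UTF-8 directly instead of asking the user.
    """
    for charset in (sender_charset, *fallbacks):
        if fits(reply_text, charset):
            return charset
    return "utf-8"


print(charset_for_reply("thanks", "koi8-u"))
print(charset_for_reply("5 \u20ac", "iso-8859-1", ("windows-1252",)))
print(charset_for_reply("\u010dau", "iso-8859-1", ("windows-1252",)))
```

The first two calls keep a legacy charset because the text fits; the third cannot be represented in either charset and is therefore sent as UTF-8 without a prompt.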
OK, this looks like a good solution for eliminating this annoying dialog.

But two comments:

- the line 1826 should be removed as well, as this is a closing bracket corresponding to the one at line 1817
- shouldn't we also omit   gPromptService &&    from the if() on line 1804 ?

In order to make some progress, could you please go ahead with this change - although it's only a partial fix, it's a step in the right direction.
Let's however keep this bug open to be able to reconsider and possibly take further steps at later time.
Hello,


IMHO, here's what I suggest:

Letting Thunderbird code exactly as it is now (take 2.0.0.9 as a reference), please implement only a new *default setting* (setting after installing the program from standard kit):


- in Tools -> Options -> Display -> Fonts -> Character Encoding, put both 
Outgoing Mail and Incoming Mail to "Unicode (UTF-8)";

- put a mark at "Apply the default character encoding to all incoming messages";

- put a mark at "Use the default character encodings in replies".

- use a default font that is UTF-8 aware.


Would this change trigger a nuclear war ?   ;-)

Could we see how much "backward compatibility" gets broken, in practice ? How many users/e-mail clients actually complain via "bug" reports ? I suspect not very many...

If the next 2-3 releases of Thunderbird contain this new setting (assuming risks), we can easily evaluate, from the feedback we get, if users are *really* affected negatively. I seriously doubt that.



Regards,
Răzvan
Răzvan,

please note that the option "Apply the default character encoding to all incoming messages" has catastrophic side-effects and is a very serious RFC violation - I've opened bug 408335 against this one - it should IMHO disappear completely.

I fully agree with all the other settings, which ensure that all email is sent in UTF-8 by default. This is exactly what we're using already.
Comment 18 is split off to bug 410333.
related if not dup bug 401358 bug 415368 (same imo)
I think this bug has become a dupe of bug 410333
Assignee: mscott → nobody
Flags: wanted-thunderbird3?
Whiteboard: [patchlove]
Comment on attachment 292592 [details] [diff] [review]
Make UTF-8 default.

Bitrotted. mailnews/base/resources/locale/en-US/messenger.properties no longer exists.
(In reply to comment #24)
> I think this bug has become a dupe of bug 410333

And I'd agree. Duping. Please reopen with reasons why.
Status: NEW → RESOLVED
Closed: 15 years ago
Flags: wanted-thunderbird3?
Resolution: --- → DUPLICATE
Not a direct dupe, this bug per se is unfortunately wontfix for now per above discussion.
Resolution: DUPLICATE → WONTFIX
7½ years and this is a wontfix???  It should be possible to set your default encoding, and the default should be UTF-8.  Time to move out of the 1970's!

Oh yeah that was a utf char up there.  I also like to write 80˚ with a degrees symbol, but thunderbird changed it to a ?.  That's pathetic.
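
A replacement by "?" is what a lossy conversion to a legacy charset produces. The Python sketch below assumes the character typed above is U+02DA RING ABOVE (which is how it appears on this page) rather than U+00B0 DEGREE SIGN; only the latter exists in ISO-8859-1. It illustrates the encoding behaviour in general, not Thunderbird's actual code path.

```python
ring_above = "80\u02da"   # ring above, assumed to be the character typed
degree_sign = "80\u00b0"  # the real degree sign

# Ring above has no ISO-8859-1 code, so a lossy conversion yields "?".
lossy = ring_above.encode("iso-8859-1", errors="replace")
print(lossy)

# The degree sign fits ISO-8859-1, and UTF-8 can carry either character.
ok_latin1 = degree_sign.encode("iso-8859-1")
ok_utf8 = ring_above.encode("utf-8")
print(ok_latin1, ok_utf8)
```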
Well it is possible to set your own defaults, no problem here. This bug is about making this default in default installation.
Hello Magnus and company. In my professional opinion Thunderbird should send email as UTF-8 by default, out of the box. I recognize that the devs feel differently, however the comments here are mired, outdated, and difficult to follow. In comment #27 (2009) Magnus mentions that "this bug per se is unfortunately wontfix for now per above discussion". As per the discussion, there is a very convincing argument for making UTF-8 the default and very little argument for not doing so. What is the concise reason, then, for not making UTF-8 the default encoding for outgoing mail?

I know of no production mail servers today that do not support UTF-8.

Thank you.
I'm pretty sure there are still a bunch of crappy webmail interfaces that mangle utf8. Not that i think they are really much to care for, but the point is that making utf-8 default gives very little wins - we already use utf-8 where it's appropriate like mixed charset replies.
(In reply to Magnus Melin from comment #32)
> making utf-8 default gives very little wins - we already use
> utf-8 where it's appropriate like mixed charset replies.

We don't (unless something has changed), read the comment #8 and the comment #12.
We do use utf-8 if needed since bug 410333 (see comment 22).
> I'm pretty sure there are still a bunch of **** webmail
> interfaces that mangle utf8.

I would love to know which interfaces those are. I manage a site for fixing wrongly-encoded text (http://gibberish.co.il) and since none of the letters of our alphabet are found in ISO-8859-1, I would know of any large webmail providers that are problematic. The last one that I remember being a real problem was Yahoo, but they've been fine for years now.



> Not that i think they are really much to care for, but the point is that
> making utf-8 default gives very little wins - we already use utf-8 where it's
> appropriate like mixed charset replies.

The problem is for users who don't know what a charset is: they install Tbird and start using it. The thought that only mixed charsets are problematic might be valid for users whose alphabet consists largely of letters found in ISO-8859-1. However, when users whose alphabets have no letters in ISO-8859-1 have to start changing "computer people" settings they won't: they will just say "No, Thunderbird doesn't work for me" and won't use it and also recommend against it.
(In reply to Dotan Cohen from comment #35)
> I would love to know which interfaces those are. I manage a site for fixing
> wrongly-encoded text (http://gibberish.co.il) and since none of the letters
> of our alphabet are found in ISO-8859-1, I would know of any large webmail
> providers that are problematic. The last one that I remember being a real
> problem was Yahoo, but they've been fine for years now.

I don't think large providers have many problems; it's the small ones that small ISPs and some schools/small universities use. At least I saw problems a year back...

> > Not that i think they are really much to care for, but the point is that
> > making utf-8 default gives very little wins - we already use utf-8 where it's
> > appropriate like mixed charset replies.
> 
> The problem is for users who don't know what a charset is: they install
> Tbird and start using it. The thought that only mixed charsets are
> problematic might be valid for users whose alphabet consists largely of
> letters found in ISO-8859-1. However, when users whose alphabets have no
> letters in ISO-8859-1 have to start changing "computer people" settings they
> won't: they will just say "No, Thunderbird doesn't work for me" and won't
> use it and also recommend against it.

But they don't have to change anything. If the chars don't fit, we silently upgrade to utf-8.
Ok, let's suppose that the system has no problem; I had some mailing with French people, but maybe it was due to their Outlook(?) or any other similar one (then i noticed that my Thunderbird wasn't using UTF-8).
So according to the last comments there is some IF somewhere in the code:
IF (outgoing_email_encode == MIXED_ENCODINGS)
   outgoing_email_encode=UTF-8;

OK, but since UTF-8 supports all languages, and nowadays all systems support UTF-8 (warmly suggested by W3C), if the default were UTF-8 there would be no need to add any IF. The user can still have the possibility to change the default encoding (in memory of old times).

Anyway, i set UTF-8 in my thunderbird and I don't think about (and nobody complained about my emails).
Regards
An error in my text, sorry:
* I had some problems during mailing with French people
(In reply to Magnus Melin from comment #32)
> I'm pretty sure there are still a bunch of **** webmail interfaces that
> mangle utf8. 

What about **** webmail interfaces that mangle ISO-8859-1?

Look at https://groups.google.com/forum/?hl=sv&fromgroups=#!topic/rails-se/_Xi1LTc3jdM

I think UTF-8 as default makes a lot of sense.