Closed Bug 1070986 Opened 5 years ago Closed 5 years ago

React to the removal of us-ascii as a Gecko-canonical name

Categories

(MailNews Core :: Internationalization, defect, major)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 37.0

People

(Reporter: hsivonen, Assigned: mkmelin)

References

Details

Attachments

(2 files, 2 obsolete files)

Bug 1069954 removed us-ascii as a Gecko-canonical encoding. Please update charsetalias.properties to map us-ascii to windows-1252 and check that 1069954 comment 3 is correct.

If 1069954 comment 3 is not correct, there's always the option of importing us-ascii to c-c the way UTF-7 was imported.
This also removes the mail.label_ascii_only_mail_as_us_ascii pref
I can't remember ever seeing a bug report where this would be needed to be set, and if it actually was an issue nowadays we'd probably have plenty since it's just a hidden pref.

Re the compose windows change: yes that code is called for every change to the subject, but getting the label is very fast (at least now, it's below one ms) so the globals can go. Besides that, it was just doing things it doesn't need anymore. 

A couple of cases in send code had if-missing-use-ascii type of conditions. I made those UTF-8 instead. I guess it could be latin1 too, but since we aim for utf-8 in the long run there's little reason not to use that, IMO.

I did not change the places in mime where we do special stuff for us-ascii, as it's hard to know what it affects (is it input/output and so on). AFAIK, nothing's broken, and hopefully that code all go away at some point.

Try was happy with this - https://treeherder.mozilla.org/ui/#/jobs?repo=try-comm-central&revision=3efff4f95ecf
Assignee: nobody → mkmelin+mozilla
Status: NEW → ASSIGNED
Attachment #8529248 - Flags: review?(Pidgeot18)
(In reply to Magnus Melin from comment #2)
> This also removes the mail.label_ascii_only_mail_as_us_ascii pref
> I can't remember ever seeing a bug report where this would be needed to be
> set, and if it actually was an issue nowadays we'd probably have plenty
> since it's just a hidden pref.

So, in theory, we should label ASCII mail as the charset ASCII. However some clients (i.e., Outlook) use the charset of the message as the charset of the reply (and silently mojibake IIRC?), so labelling ASCII just because you never used non-ASCII characters turns out to interact very badly. I guess the ideal situation here is to not label the charset at all, but if the current setup works well enough, there's no reason to change it.
But why do the complication? I just don't see how setting it to UTF-8 would hurt anybody, and you'd get always defined behavior. If you don't set a charset someone is bound to do something unexpected.
There are people who filter on charsets, and even our own font selection is based on charsets. So there is some value in advertising a "weaker" charset, although our drive for UTF-8-everywhere should hopefully make it less of an issue.
Skipped the pref removal part
Attachment #8529248 - Attachment is obsolete: true
Attachment #8529248 - Flags: review?(Pidgeot18)
Attachment #8530657 - Flags: review?(Pidgeot18)
Comment on attachment 8530657 [details] [diff] [review]
bug1070986_react_to_ascii_encoder_removal.patch

Review of attachment 8530657 [details] [diff] [review]:
-----------------------------------------------------------------

::: mail/components/compose/content/MsgComposeCommands.js
@@ +2410,5 @@
>  }
>  
>  function GetCharsetUIString()
>  {
> +  // The charset here is already the canonical charset (not an alias).

I'm not 100% sure this is the case, but when I scampered off to find a counterexample, I couldn't.

::: mailnews/mailnews.js
@@ -123,5 @@
>  pref("mail.strictly_mime_headers",          true);
>  // 0/1 (name param is encoded in a legacy way), 2(RFC 2231 only)
>  // 0 the name param is never separated to multiple lines.
>  pref("mail.strictly_mime.parm_folding",     1);
> -pref("mail.label_ascii_only_mail_as_us_ascii", false);

I think when you removed the don't check for US-ASCII changes, you forgot to get rid of this part of the diff as well.
Attachment #8530657 - Flags: review?(Pidgeot18) → feedback+
Put it in try-catch in case there is some odd edge case.
Attachment #8530657 - Attachment is obsolete: true
Attachment #8538651 - Flags: review?(Pidgeot18)
Comment on attachment 8538651 [details] [diff] [review]
bug1070986_react_to_ascii_encoder_removal.patch

Review of attachment 8538651 [details] [diff] [review]:
-----------------------------------------------------------------

::: mail/components/compose/content/MsgComposeCommands.js
@@ +2415,5 @@
> +  let charset = gMsgCompose.compFields.characterSet;
> +  if (charset && charset != gMsgCompose.compFields.defaultCharacterSet) {
> +    try {
> +      return " - " + gCharsetConvertManager.getCharsetTitle(charset);
> +    } 

EWHITESPACE
Attachment #8538651 - Flags: review?(Pidgeot18) → review+
https://hg.mozilla.org/comm-central/rev/3770fda8192e -> FIXED
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 37.0
No longer blocks: cc-backlog
Followup fix. Missed the part that the charsets can be different case.
(gCharsetConvertManager doesn't care which though)
Attachment #8543376 - Flags: review?(Pidgeot18)
Comment on attachment 8543376 [details] [diff] [review]
bug1070986_compose_charset_title.patch

Review of attachment 8543376 [details] [diff] [review]:
-----------------------------------------------------------------

::: mail/components/compose/content/MsgComposeCommands.js
@@ +2413,5 @@
>    // The charset here is already the canonical charset (not an alias).
>    let charset = gMsgCompose.compFields.characterSet;
> +  if (!charset)
> +    return "";
> +  

EWHITESPACE
Attachment #8543376 - Flags: review?(Pidgeot18) → review+
You need to log in before you can comment on or make changes to this bug.