Closed Bug 253575 Opened 20 years ago Closed 8 years ago

Revamp of Character Encoding settings (preferences, Auto-Detect, Customize List, menu order)

Categories

(Core :: Internationalization, enhancement)

enhancement
Not set
normal

RESOLVED WORKSFORME

People

(Reporter: mcow, Assigned: smontagu)

References

(Blocks 1 open bug)

Details

(Keywords: intl)

The Character Encoding submenu currently in Mozilla (browser and mail/news,
1.8, including recent work from bug 52157) appears as:
    Auto-Detect      >
    More Encodings   >
    Customize List...
    -----
    [list encodings]
    [recent encodings]

I think there are a couple of problems here.  The primary problem is that
there are two settings which properly belong in Preferences: the Customized
List, and the default state of Auto-Detect:
 - the Custom List is clearly a preference, and a fairly obscure setting
   that is not typically changed on the fly.
 - the way that I use Auto-Detect, I want to either have it always off or
   always Universal (or better, Western, but there is no Western
   autodetector).  For the odd foreign-charset page that autodetects
   incorrectly, I would like to change the autodetection for that page, but
   *not* change my default setting -- I don't want to have to go back and
   reset the setting for the next page.

Also, the order of the menu items at this level is nonintuitive.  Therefore:

First, I suggest that "Customize" become a button on the prefs page
(Preferences|Navigator|Languages, which has a very large multi-line edit field
that, for most people, contains two or three lines at most).  The button would
open the same dialog seen now from the menu.

Second, I suggest that the current Auto-Detect menu be placed on the same prefs page as a
dropdown, with nearly the same structure it has now:  Off at the top, followed
by the set of available detectors -- except, put Universal on top, as it is
(supposed to be) a superset of the others.  (If the Universal detector ever
gets good enough, this preference would become a checkbox instead of a
dropdown.)

Third, I suggest implementing a new, slightly different Auto-Detect feature in
the menu; not a radio-selection, but an action.  Clicking an item in this
submenu would apply the selected autodetector to the current page, but NOT
change the default autodetection setting.  There would be no OFF setting in
this submenu, but (again) put Universal on top.  The menu item itself could be
called "Auto-Detect" or "Apply Auto-Detection" or perhaps "Force
Auto-Detection".
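To make the distinction between the stored preference and the one-shot menu action concrete, here is a minimal sketch in Python.  It is not Mozilla code; the class and method names (EncodingSettings, apply_detector_to_page, detector_for) are hypothetical and exist only to illustrate the proposed behavior: the default is set in one place, and the menu action records an override for a single page.

```python
from dataclasses import dataclass, field


@dataclass
class EncodingSettings:
    """Hypothetical model: one persistent default plus per-page overrides."""

    # The preference from the prefs page, e.g. "off" or "universal".
    default_detector: str = "off"
    # Overrides created by the menu action, keyed by page URL.
    page_overrides: dict = field(default_factory=dict)

    def apply_detector_to_page(self, url: str, detector: str) -> None:
        # Menu action: affects only this page and never touches the default.
        self.page_overrides[url] = detector

    def detector_for(self, url: str) -> str:
        # A page uses its own override if it has one, else the default.
        return self.page_overrides.get(url, self.default_detector)


settings = EncodingSettings(default_detector="off")
settings.apply_detector_to_page("http://example.jp/", "japanese")
```

After the action, only the one page uses the Japanese detector; every other page still gets the default, so there is nothing to reset afterwards.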

Fourth, and finally, reorder the menu as follows:
     [list encodings]
     [recent encodings]
     More            >
     -----
     Auto-detect     >

Putting More *after* the list of encodings is apparently according to some
specification, per bug 47343.  I can't find any spec document for this menu
on mozilla.org that isn't obviously obsolete.  If someone felt strongly about
leaving the Auto-Detect setting *above* the list, I wouldn't fight that, but I
think it's clearer when placed at the bottom.
I fully agree that much of this UI doesn't belong in the menu, but since it also
affects other components, it doesn't really belong under Navigator either. I'd
rather have a special Character Encoding menu under Appearance (somewhere
between Fonts and Languages).

Here's a rough sketch of what this pref panel could look like:

  |- Character Encoding -------------------------------------------|
  |                                                                |
  | For pages that do not define character encoding:               |
  |                                                                |
  |     Default:      | Unicode (UTF-8)   |V|                      |
  | [v] Auto Detect:  | Universal         |V|                      |
  |                                                                |
  | Customize static list of active character encodings:           |
  |  ________________________    _________________________         |
  | |Available Encodings:    |  |Active Encodings:        | [Up]   |
  | | Arabic (IBM-864)       |  | Unicode (UTF-8)         | [Down] |
  | | Arabic (ISO-8859-6)    |  | Western (ISO-8859-1)    |        |
  | | ...                    |  | ...                     |        |
  |  ------------------------    -------------------------         |
  |   [Add]                        [Remove]                        |
  ------------------------------------------------------------------

As for the menu, I would prefer it as simple as possible:

     [Static list encodings]
     [Recent encodings]
      More Encodings      >

If we really want to keep providing this amount of control from the UI, we also
need to provide better documentation about the logic Mozilla employs in choosing
the encoding. Even Mozilla veterans find it too obscure:
http://www.squarefree.com/archives/000499.html (see this page for interesting
links about this subject)

Prog.
A prefs page other than "Languages" for those settings would be fine with me.  
There are other pages under "Navigator" which extend to Mail/News, however -- 
Helper Apps and Downloads.  And it's not really an "appearance" issue -- these 
settings are more functional than that.  Under "Advanced" maybe...

I could see removing "Auto-Detect" (the action, not the setting) from the 
Encoding submenu and making it the top item under "More," but I would not like 
to see it removed entirely.  If a page comes up that I think is Japanese, I 
don't want to have to try each individual Japanese (and Unicode) encoding when 
there's a Japanese autodetector that will probably do the job correctly.
(In reply to comment #2)

> I could see removing "Auto-Detect" (the action, not the setting) from the 
> Encoding submenu and making it the top item under "More," but I would not like 
> to see it removed entirely.  

I agree. Actually, just leaving 'auto-detect' where it is is better. Moving it
to 'More' would just make it less accessible.  
Blocks: 254868
Isn't this a dupe of Bug 47343?

Prog.
(In reply to comment #4)
> Isn't this a dupe of Bug 47343?

No.  That bug is about one specific part of the menu: putting the dynamic list 
at the top of the menu rather than the end -- as was specifically pointed out in 
the original report (comment 0 of this bug).

This bug also includes bug 185123.
QA Contact: amyy → i18n
99.999% of my incoming email is of only two character encodings: Western ISO-8859-1 and Unicode UTF-8. It is an unnecessary hassle to have to wend through View | Character Encoding and then select the encoding I want from the dropdown. 

First, an "Auto-detect" should, intuitively, recognize those two encodings and display them properly without having to intervene manually. If the programmers are unable to find a way to do that, then the next best option would be a preference to

Second, provide a one-button toggle on the message pane to flip between two pre-set encodings, such as the two above.

Same for composition. I would like to be able to flip between the two encodings quickly, with one button, from one message to the next.

And why has it taken so long to fix this?
(In reply to comment #6)
> Second, provide a one-button toggle on the message pane to flip between two
> pre-set encodings, such as the two above.

Sounds like a great idea for an extension.

BTW: Is there an extension ideas list somewhere like the one at userscripts.org?
(In reply to comment #6)
> for composition. I would like to be able to flip between the two encodings
> quickly, with one button, from one message to the next.

This isn't strictly necessary.  If you set your default encoding to ISO-8859-1 and set the preference 
  intl.fallbackCharsetList.ISO-8859-1
to
  UTF-8
then if any characters are entered that aren't 8859-1, the system automatically sends the message out as UTF-8.

This doesn't help if there are certain recipients who require UTF-8, for whatever reason, and you compose a message entirely with 8859-1 characters -- that message will go out as 8859-1, and be displayed wrongly in the UTF-8 reader.
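The fallback behavior described above can be sketched roughly as follows.  This is only an illustration in Python, not the actual mail code; the function name choose_outgoing_charset and its parameters are made up for the example.

```python
def choose_outgoing_charset(text, default="iso-8859-1", fallback="utf-8"):
    """Return the default charset if it can represent the whole message,
    otherwise the fallback charset (illustrative sketch only)."""
    try:
        # If every character fits in the default charset, keep it.
        text.encode(default)
        return default
    except UnicodeEncodeError:
        # At least one character is outside the default charset.
        return fallback
```

A message containing only 8859-1 characters keeps the default, which is exactly the limitation noted above: a recipient who needs UTF-8 will still receive 8859-1 unless the text happens to contain something outside that charset.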


But neither that, nor anything else in your comment, is directly pertinent to this bug.  If you're seeing AutoDetect fail to correctly detect either of those encodings, that should be opened as a separate bug, with sample messages.
(In reply to comment #8)
> This isn't strictly necessary.  If you set your default encoding to ISO-8859-1
> and set the preference 
>   intl.fallbackCharsetList.ISO-8859-1
> to
>   UTF-8
> then if any characters are entered that aren't 8859-1, the system automatically
> sends the message out as UTF-8.

??! How does one do that through the user interface? Are you recommending editing some configuration file somewhere, that is liable to be overwritten in some future upgrade?

> If you're seeing AutoDetect fail to correctly detect either of those
> encodings, that should be opened as a separate bug, with sample messages.

When I click on View | Character Encoding | Auto-detect I am presented with the following options:

Chinese
East Asian
Japanese
Korean
Off
Russian
Simplified Chinese
Traditional Chinese
Ukrainian
Universal

Where in that list is a choice of a group of encodings that includes just the ones I want? Perhaps this is a training issue, but I see no option other than Off that works for one encoding, and none that support the two I want.
It is editing a config file, but you can do it through the Config Editor in Advanced Options.  The pref doesn't exist by default, but you can create it.  I set this pref years ago and have been through many upgrades and it's never gone away.

The AutoDetect modules are more useful for someone getting mail in Chinese or Japanese, which can come in a number of different encodings and, especially back when this bug was opened, often didn't/don't have the encoding specified.  All of the detectors are supposed to be able to recognize UTF-8 (or so I read in another bug, years ago).  Universal was intended to handle anything but didn't.

Detecting ISO-8859-1 is extra difficult: the problem is not distinguishing it from UTF-8 but distinguishing it from ISO-8859-2, -3, -4, etc., which are all very similar; they may have similar distributions of their respective non-ASCII characters as well.
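A small Python sketch of why this is so (an illustration only, not how the Mozilla detectors are implemented; guess_charset is a hypothetical helper, and falling back to ISO-8859-1 is an assumption).  UTF-8 has strict structural rules, so invalid input is easy to reject, whereas a byte such as 0xE8 is valid in every ISO-8859-x charset and merely maps to a different character in each.

```python
def guess_charset(raw: bytes) -> str:
    """Illustrative heuristic: accept UTF-8 if the bytes are valid UTF-8,
    otherwise fall back to an assumed single-byte charset."""
    try:
        raw.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Assumption: nothing in the bytes tells us WHICH ISO-8859-x it is.
        return "iso-8859-1"


sample = b"\xe8"  # not valid UTF-8 on its own

# The same byte decodes without error under both charsets,
# but yields a different character in each.
as_latin1 = sample.decode("iso-8859-1")
as_latin2 = sample.decode("iso-8859-2")
```

Note that pure ASCII input is reported as UTF-8 by this sketch, since ASCII is valid in all of these encodings.  Telling the ISO-8859 variants apart needs statistical guessing about which characters are plausible, which is where detectors go wrong.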


Anyway, I primarily opened this as a UI bug -- just looking to move the settings into a less confusing configuration.  It's less important these days because a lot more mail is properly encoded and tagged with the right headers, and UTF-8 is more widespread.
Thanks. I look forward to that fix. I get about 3000 messages a day (a quarter of them expecting replies). About 65% are in ISO-8859-1, about 30% (and rising) are in UTF-8, and the rest are HTML or graphics. That means to read many of the UTF-8 messages I have to change the character encodings of them, which means wending through the menus to do it for about 900 messages if I did it for all of them. 

Even worse, TB doesn't remember the setting for that message when I come back to it, so I have to do it all over again. Making the choice persistent for that message would also be a good fix.
This UI has been streamlined.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME