Closed Bug 936442 Opened 6 years ago Closed 6 years ago

Australis: Character encoding widget shows encodings that are not in the Encoding Standard

Categories

(Firefox :: Toolbars and Customization, defect)

x86_64
Linux
defect
Not set

Tracking

()

VERIFIED FIXED
Firefox 30
Tracking Status
firefox29 --- verified
firefox30 --- verified

People

(Reporter: hsivonen, Assigned: Gijs)

References

(Blocks 1 open bug)

Details

(Whiteboard: [Australis:P2])

Attachments

(3 files, 5 obsolete files)

The Australis character encoding widget recreates bug 805374.

Instead, the Australis character encoding widget should not list encodings that are not in the Encoding Standard. http://encoding.spec.whatwg.org/
Henri, is the API from the Encoding spec already available to us? Or at least the valid encodings enum?
Flags: needinfo?(hsivonen)
(In reply to Mike de Boer [:mikedeboer] from comment #1)
> Henri, is the API from the Encoding spec already available to us?

It is, so you can enumerate all encodings the old-fashioned way and filter out the ones for which new TextDecoder(enc) throws.

> Or at
> least the valid encodings enum?

This isn't available as an API.
Flags: needinfo?(hsivonen)
Cool, I'll take your suggested method!
Whiteboard: [Australis:M?][Australis:P2]
(In reply to Henri Sivonen (:hsivonen) from comment #2)
> (In reply to Mike de Boer [:mikedeboer] from comment #1)
> > Henri, is the API from the Encoding spec already available to us?
> 
> It is, so you can enumerate all encodings the old-fashioned way and filter
> out the ones for which new TextDecoder(enc) throws.

We need to create decoders for every possible encoding to check if we can create such a decoder? That sounds... bad?
(In reply to :Gijs Kruitbosch from comment #4)
> We need to create decoders for every possible encoding to check if we can
> create such a decoder? That sounds... bad?

Alternatively, you could hard-code an array with the encodings from the Encoding Standard in it.
I'm guessing this is what we need to do? I'm not sure what encodings to check as to having been removed, but at least this doesn't break anything. Also, I'm assuming that the charset detector won't propose invalid encodings - Henri, is that assumption correct?
Attachment #831508 - Flags: review?(mdeboer)
Attachment #831508 - Flags: review?(hsivonen)
Assignee: nobody → gijskruitbosch+bugs
Status: NEW → ASSIGNED
Comment on attachment 831508 [details] [diff] [review]
don't propose non-standardized encodings,

(In reply to :Gijs Kruitbosch from comment #6)
> I'm guessing this is what we need to do?

The filtering looks OK. The source of the initial list to be filtered looks questionable, though, but I guess that's out of the scope of this bug.

> Also, I'm assuming that the charset detector won't propose invalid encodings
> - Henri, is that assumption correct?

The detectors can propose non-Encoding Standard encodings, but how is that relevant to this UI?
Attachment #831508 - Flags: review?(hsivonen) → review+
(In reply to Henri Sivonen (:hsivonen) from comment #7)
> > Also, I'm assuming that the charset detector won't propose invalid encodings
> > - Henri, is that assumption correct?
> 
> The detectors can propose non-Encoding Standard encodings, but how is that
> relevant to this UI?

Because this patch only touches the code generating the lists, not the detected encodings. Compare these three lines:

https://mxr.mozilla.org/projects-central/source/ux/browser/components/customizableui/src/CustomizableWidgets.jsm?mark=656-656,659-660#646

The bottom two lines will pass through this code, the other one uses a different method. AFAIK this is all a pretty direct mirroring of the original C++ implementation of the charset menu.

I figured the detector would only provide standardized encodings by itself. If that assumption is wrong (?) then the patch should be updated to make that method use the same checks as the list one.
Flags: needinfo?(hsivonen)
Comment on attachment 831508 [details] [diff] [review]
don't propose non-standardized encodings,

Review of attachment 831508 [details] [diff] [review]:
-----------------------------------------------------------------

Could you post a screenshot with 'before' and 'after' captions, so we can see the practical result?

I'd also like to know if the list coming from `nsICharsetConverterManager::GetCharsetDetectorList()` needs checking too. Wouldn't hurt anyway.

Another concern I have is that this makes the contents of this subview and the Menu > View > Character Encoding menu contents out of sync. I know that now the subview will show the *correct* list(s), but how do we fix the other?

::: browser/components/customizableui/src/CustomizableWidgets.jsm
@@ +595,5 @@
>        let items = [];
>        for (let charset of list) {
>          charset = charset.trim();
>  
> +        if (this.blacklist.indexOf(charset) != -1)

please use .contains(charset)

@@ +607,5 @@
>  
>          if (notForBrowser)
>            continue;
>  
> +        if (this.whitelist.indexOf(charset) == -1) {

also use `.contains(charset)` here.
Attachment #831508 - Flags: review?(mdeboer) → review+
(In reply to Mike de Boer [:mikedeboer] from comment #9)
> please use .contains(charset)

not going to work for arrays.
Comment on attachment 831508 [details] [diff] [review]
don't propose non-standardized encodings,

>+    blacklist: [],
>+    whitelist: [],

seems like these should be named _blacklist and _whitelist, as they're not supposed to be used outside of this object.
(In reply to Mike de Boer [:mikedeboer] from comment #9)
> Comment on attachment 831508 [details] [diff] [review]
> don't propose non-standardized encodings,
> 
> Review of attachment 831508 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Could you post a screenshot with 'before' and 'after' captions, so we can
> see the practical result?


At least on my profile, with a random website (the Red Cross' appeal for the Phillipines) I don't see a difference. Perhaps Henri has an example with which we can actually test this?
(In reply to Mike de Boer [:mikedeboer] from comment #9)
> I'd also like to know if the list coming from
> `nsICharsetConverterManager::GetCharsetDetectorList()` needs checking too.
> Wouldn't hurt anyway.

It seems this returns detector names (a la "chardet.zhcn_parallel_state_machine") which don't fly with the TextDecoder, so we shouldn't.
Comment on attachment 831508 [details] [diff] [review]
don't propose non-standardized encodings,

Oops. Sorry. I failed to notice that this doesn't filter out encodings that don't throw but are nonetheless inappropriate for the menu.

These are:
 * UTF-16BE and UTF-16LE
 * The replacement encoding
 * Probably x-user-defined
 * Possibly hz-gb-2312

Instead of taking a legacy list and removing stuff from it, I suggest hardcoding the possible encodings:

(Globally relevant)
UTF-8
windows-1252

(Arabic)
windows-1256
ISO-8859-6

(Baltic)
windows-1257
ISO-8859-4
ISO-8859-13

(Central European)
windows-1250
ISO-8859-2

(Chinese, Simplified)
gbk
gb18030

(Chinese, Traditional)
Big5

(Cyrillic)
windows-1251
ISO-8859-5
KOI8-R
KOI8-U
IBM866
x-mac-cyrillic

(Greek)
windows-1253
ISO-8859-7

(Hebrew)
windows-1255
ISO-8859-8
ISO-8859-8-I

(Japanese)
Shift_JIS
EUC-JP
ISO-2022-JP

(Korean)
EUC-KR

(Thai)
windows-874

(Turkish)
windows-1254

(Vietnamese)
windows-1258

(Rare European encodings; not exposed by IE11)
ISO-8859-3
ISO-8859-10
ISO-8859-14
ISO-8859-15
ISO-8859-16
macintosh
Attachment #831508 - Flags: review+ → review-
(In reply to :Gijs Kruitbosch from comment #8)
> (In reply to Henri Sivonen (:hsivonen) from comment #7)
> > > Also, I'm assuming that the charset detector won't propose invalid encodings
> > > - Henri, is that assumption correct?
> > 
> > The detectors can propose non-Encoding Standard encodings, but how is that
> > relevant to this UI?
> 
> Because this patch only touches the code generating the lists, not the
> detected encodings. Compare these three lines:
> 
> https://mxr.mozilla.org/projects-central/source/ux/browser/components/
> customizableui/src/CustomizableWidgets.jsm?mark=656-656,659-660#646
> 
> The bottom two lines will pass through this code, the other one uses a
> different method. AFAIK this is all a pretty direct mirroring of the
> original C++ implementation of the charset menu.
> 
> I figured the detector would only provide standardized encodings by itself.
> If that assumption is wrong (?) then the patch should be updated to make
> that method use the same checks as the list one.

It seems to me the detector line deals with names of detectors rather than names of encodings returned by detectors. What am I missing?
Flags: needinfo?(hsivonen)
Attachment #832128 - Flags: review?(hsivonen)
Attachment #831508 - Attachment is obsolete: true
Comment on attachment 832128 [details] [diff] [review]
don't propose non-standardized encodings,

What's the reason for using this list to filter another list instead of just using this list directly?
(In reply to Henri Sivonen (:hsivonen) from comment #17)
> Comment on attachment 832128 [details] [diff] [review]
> don't propose non-standardized encodings,
> 
> What's the reason for using this list to filter another list instead of just
> using this list directly?

As far as I can see, we currently propose only some of the items on this list. Showing the entire list would show all 25-odd items rather than the 2 we currently show (again, for my test page, just UTF-8 and Windows-1252). I'm not aware of a better heuristic to use to determine which items should be shown, given the list you gave.
Comment on attachment 832128 [details] [diff] [review]
don't propose non-standardized encodings,

OK.
Attachment #832128 - Flags: review?(hsivonen) → review+
Comment on attachment 832128 [details] [diff] [review]
don't propose non-standardized encodings,

Before, After: http://imgur.com/UNws0jz,wgZK9KJ#1
Attachment #832128 - Flags: review?(mdeboer)
Comment on attachment 832128 [details] [diff] [review]
don't propose non-standardized encodings,

Review of attachment 832128 [details] [diff] [review]:
-----------------------------------------------------------------

Even though we should land this as-is, it makes me really sad.

Henri, I appreciate that you produced this list for us. There are two remaining issues:

1) This bug increases the priority for a follow-up after Australis landing to separate all this logic out to a JSM, also mentioned in https://bugzilla.mozilla.org/show_bug.cgi?id=865916#c12. Then we need to do away with the C++/ XBL implementation of the Character Encoding menu, because it will be out-of-sync from now on.
2) Henri, you provided a whitelist for us - which we now hardcoded for Australis - and the data source we used before is the blacklist located at `intl/uconv/src/charsetData.properties`. What's your strategy to keep both of these up to date? Should our whitelist be merged to uconv as well?

::: browser/components/customizableui/src/CustomizableWidgets.jsm
@@ +21,5 @@
>  const kNSXUL = "http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul";
>  const kPrefCustomizationDebug = "browser.uiCustomization.debug";
>  const kWidePanelItemClass = "panel-wide-item";
>  
> +// See bug 936442 for the rationale for this list

nit: 'See bug 936442 for the rationale behind this list'

@@ +23,5 @@
>  const kWidePanelItemClass = "panel-wide-item";
>  
> +// See bug 936442 for the rationale for this list
> +const kCharsets = new Set([
> +  //(Globally relevant)

nit: plz rewrite to our conventions re. comments: `// Globally relevant.`, also for the other labels below.

@@ +87,5 @@
> +  "iso-8859-10",
> +  "iso-8859-14",
> +  "iso-8859-15",
> +  "iso-8859-16",
> +  "macintosh",

No need for a trailing comma.
Attachment #832128 - Flags: review?(mdeboer) → review+
Henri, could you please look at the two issues I mentioned in comment 21?
Flags: needinfo?(hsivonen)
(In reply to Mike de Boer [:mikedeboer] from comment #21)
> 1) This bug increases the priority for a follow-up after Australis landing
> to separate all this logic out to a JSM, also mentioned in
> https://bugzilla.mozilla.org/show_bug.cgi?id=865916#c12. Then we need to do
> away with the C++/ XBL implementation of the Character Encoding menu,
> because it will be out-of-sync from now on.

The patch in bug 805374 re-implements the encoding list in the *browser* View menu in JS.

> 2) Henri, you provided a whitelist for us - which we now hardcoded for
> Australis - and the data source we used before is the blacklist located at
> `intl/uconv/src/charsetData.properties`. What's your strategy to keep both
> of these up to date? Should our whitelist be merged to uconv as well?

I think the list used here and the list used in bug 805374 should become a single shared JS constant after Australis lands. I would prefer to get rid of charsetData.properties and nsCharsetMenu.cpp in m-c (either really get rid of them or throw them over the wall to c-c).
Flags: needinfo?(hsivonen)
I don't think we should land this. We should wait until bug 805374 is fixed and then use that implementation. No point doing the same work twice - and plans on that bug are still in flux, too.
Depends on: 805374
Whiteboard: [Australis:M?][Australis:P2] → [Australis:M?][Australis:P4]
Attachment #832128 - Attachment is obsolete: true
(In reply to Henri Sivonen (:hsivonen) from comment #23)
> (In reply to Mike de Boer [:mikedeboer] from comment #21)
> > 1) This bug increases the priority for a follow-up after Australis landing
> > to separate all this logic out to a JSM, also mentioned in
> > https://bugzilla.mozilla.org/show_bug.cgi?id=865916#c12. Then we need to do
> > away with the C++/ XBL implementation of the Character Encoding menu,
> > because it will be out-of-sync from now on.
> 
> The patch in bug 805374 re-implements the encoding list in the *browser*
> View menu in JS.
> 
> > 2) Henri, you provided a whitelist for us - which we now hardcoded for
> > Australis - and the data source we used before is the blacklist located at
> > `intl/uconv/src/charsetData.properties`. What's your strategy to keep both
> > of these up to date? Should our whitelist be merged to uconv as well?
> 
> I think the list used here and the list used in bug 805374 should become a
> single shared JS constant after Australis lands. I would prefer to get rid
> of charsetData.properties and nsCharsetMenu.cpp in m-c (either really get
> rid of them or throw them over the wall to c-c).

Right. The contents of this list should mimic those of the view menu, except we can't do submenus. Fortunately, bug 805374 is reimplementing stuff so there's no more submenus, AFAICT, so presumably we can just share most of the implementation at that point.
There's still the autodetector submenu. I'm uneasy with Australis offering the autodetectors without burying them in the submenu, since hopefully we'd get to the point where IE/Chrome/Safari are with autodetection only for Japanese. (Bug 844115, bug 845791, bug 844118, bug 844120.)

Is there a reason why the Australis widget can't simply #include browser-charsetmenu.inc?
(In reply to Henri Sivonen (:hsivonen) from comment #26)
> There's still the autodetector submenu. I'm uneasy with Australis offering
> the autodetectors without burying them in the submenu, since hopefully we'd
> get to the point where IE/Chrome/Safari are with autodetection only for
> Japanese. (Bug 844115, bug 845791, bug 844118, bug 844120.)

I'm not sure I understand completely, but do you mean that landing the most recent patch would take away your concern here?

> Is there a reason why the Australis widget can't simply #include
> browser-charsetmenu.inc?

Yes, a couple (we too like doing less work instead of going all the way like this)

1) biggest difference: this interactive list lives in a panel, with quite a number of different semantics.
2) the widget can be placed in multiple areas (ie it's customizable) and it's appearance must change depending on its position. It looks different when placed in a toolbar or in the MenuPanel.

I'm sure there are more reasons, but I can't think of 'em atm.
Blocks: 943252
(In reply to Mike de Boer [:mikedeboer] from comment #27)
> I'm not sure I understand completely, but do you mean that landing the most
> recent patch would take away your concern here?

My detector concern would be addressed by hard-coding a list of only ["off", "ja_parallel_state_machine", "ruprop", "ukprob"] in the detector part of the Australis menu.
Now that our main UI has been updated we should update sooner rather than later.
Whiteboard: [Australis:M?][Australis:P4] → [Australis:M?][Australis:P2]
Depends on: 943524
Mike said he'd like to take this, so punting. :-)
Assignee: gijskruitbosch+bugs → mdeboer
Attached patch Patch 1: non-Australis changes (obsolete) — Splinter Review
Attachment #8340009 - Flags: review?(gijskruitbosch+bugs)
Attached patch Patch 2: Australis changes. (obsolete) — Splinter Review
Attachment #8340010 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8340009 [details] [diff] [review]
Patch 1: non-Australis changes

Review of attachment 8340009 [details] [diff] [review]:
-----------------------------------------------------------------

So there's a bunch of feedback below, but really the main point here is that you've adapted all the non-Australis code to fit into the widget-view-based DOM structure, changing as little Australis-based code as possible. That's the wrong way around. We should update the DOM structure of the widget to match the current layout of the menu, and insert the same separators.

Furthermore, the main menu and appmenu don't deal with setting the active charset/detector in here, because that happens in browser.js. You now built in code to deal with this into the module, but only somewhat. And instead of using the fact that we now have a menu that is generated once instead of all the time, you're making the Australis code generate it into the widget over and over again. Instead, you should keep the dealing with which encoding is active inside the Australis code, or, alternatively, move both browser.js' MultiplexHandler and friends, and some of the Australis bits, into this module (despite the generic names, MultiplexHandler et al. are only used by this menu) and make this JSM deal with both possibilities. That'll simplify a bunch of the code, because now responsibilities are strewn through 3 places instead of 2. I'm fine with keeping it 2, or reducing it to 1, but you're increasing the code fragmentation and that's not OK.

::: browser/locales/en-US/chrome/browser/charsetMenu.dtd
@@ -4,5 @@
>  
>  <!ENTITY charsetMenu.label             "Character Encoding">
>  <!ENTITY charsetMenu.accesskey         "C">
>  <!ENTITY charsetMenuAutodet.label      "Auto-Detect">
>  <!ENTITY charsetMenuAutodet.accesskey  "D"><!-- A reserved for Arabic -->

Why did you leave this one? It's now duplicated.

::: browser/modules/CharsetMenu.jsm
@@ +12,5 @@
>    const kUrl = "chrome://browser/locale/charsetMenu.properties";
>    return Services.strings.createBundle(kUrl);
>  });
> +
> +const kAutoDetectors = new Map(Iterator({

You never access this as a map, so don't make it one. Nested arrays work just fine for the for (let [a,b] of x) loop.

@@ +83,5 @@
>    // "macintosh"
>  ]);
>  
>  // Always at the start of the menu, in this order, followed by a separator.
> +const kPinned = new Set([

This is wrong. This list is ordered, so it shouldn't be a Set.

@@ +88,3 @@
>    "UTF-8",
>    "windows-1252"
> +]);

I'd actually like to just delete the pinned items from the first set immediately after declaring the pinned items. It'll simplify the rest of the code and remove the need to clone stuff like the old code here did.

@@ +88,5 @@
>    "UTF-8",
>    "windows-1252"
> +]);
> +
> +function createItem(doc, encoding, idPrefix = "", showAccessKeys = false,

This isn't always an encoding, and that creates all the complications otherwise in here. I'd much prefer:

function createDOMNode(doc, nodeData) {
}

where you just ensure that the nodeData had a localName (to decide menuitem[type="radio"] vs. toolbarbutton), id, label, name, accesskey.

and then you should just pass the right information to begin with, rather than stowing all the complications in here, where some of these complications make no sense to me and I have to go finding the callsites to figure out why you have both 'namePrefix' and 'group' and why one overrides the other but only for one property... the names of the parameters certainly don't tell me.

@@ +92,5 @@
> +function createItem(doc, encoding, idPrefix = "", showAccessKeys = false,
> +                    group = "charset", namePrefix = null) {
> +  let menuItem = doc.createElement("menuitem");
> +  menuItem.setAttribute("type", "radio");
> +  menuItem.setAttribute("name", (namePrefix || group) + "Group");

This doesn't make much sense.

@@ +115,5 @@
> +  return menu.appendChild(doc.createElement("menupopup"));
> +}
> +
> +function isAlreadyBuilt(parent) {
> +  // The detector menu and/ or charset menu are already built when no childNodes

Nit: s#and/ or#and/or#

@@ +123,2 @@
>  
>  this.CharsetMenu = Object.freeze({

While we're here, can you move the Object.freeze to be its own call, and just make this a:

let CharsetMenu = {

?

I missed this in my last review of bug 805374

@@ +127,5 @@
> +   * submenu.
> +   *
> +   * @param eventOrNode      Event that caused this handler to be invoked. Usually
> +   *                         originates from `onpopupshowing`. May also be a direct
> +   *                         reference to a DOM node.

If we're doing this anyway, just always pass in a node and update the callers in the .inc file.

@@ +131,5 @@
> +   *                         reference to a DOM node.
> +   * @param idPrefix         String that will prefix the `id` property of each
> +   *                         created menu item.
> +   * @param showAccessKeys   Boolean flag to set the `accesskey` property for each
> +   *                         created menu item. Defaults to FALSE.

Nit: false should be lowercase

@@ +134,5 @@
> +   * @param showAccessKeys   Boolean flag to set the `accesskey` property for each
> +   *                         created menu item. Defaults to FALSE.
> +   * @param popupFactory     Factory method that returns a DOM node of the Auto-Detect
> +   *                         submenu. Defaults to the internal `createAutoDetectPopup`
> +   *                         method.

I object strenuously to this naming. Let's not pretend we're in Java-land. These are just functions, they're not methods and they're not factories.

This particular one doesn't even try to be a factory, because the point of a factory is that it creates an object, and the 'popupFactory' in some cases just returns an existing thing (which it didn't create itself).

In this particular case, just make it a boolean on whether or not we need to create a submenu to stick the detectors in or not.

@@ +136,5 @@
> +   * @param popupFactory     Factory method that returns a DOM node of the Auto-Detect
> +   *                         submenu. Defaults to the internal `createAutoDetectPopup`
> +   *                         method.
> +   * @param itemFactory      Factory method that returns a DOM node for a menu
> +   *                         item. Defaults to the internal `createItem` method.

I think I'd personally prefer if this was just a bool to indicate which item to generate. If it's a menuitem, create it and set type="radio", otherwise create a toolbarbutton and set all the properties on there.

@@ +139,5 @@
> +   * @param itemFactory      Factory method that returns a DOM node for a menu
> +   *                         item. Defaults to the internal `createItem` method.
> +   * @param separatorFactory Factory method that returns a DOM node for a menu
> +   *                         separator. Defaults to the internal `createSeparator`
> +   *                         method.

There's no need to overengineer this. Subviews use menuseparators already, so I don't see why this needs to be abstracted. You should leave separators between the detectors and the actual charsets, and between the 2 items at the top and all the other charsets. Don't do this.

@@ +149,4 @@
>        return;
> +
> +    // First build the Auto-Detect submenu.
> +    let accesskey;

let accesskey = showAccessKeys ? blah : null;

@@ +155,5 @@
> +    let popup = popupFactory(parent, gBundle.GetStringFromName("charsetMenuAutodet"), accesskey);
> +    this.buildAutoDetectMenu(popup, idPrefix, showAccessKeys, itemFactory);
> +
> +    // Then build the list of encodings.
> +    this.buildEncodingsMenu(parent, idPrefix, showAccessKeys, itemFactory, separatorFactory);

There is no need to split these two parts out, also because the second of these two methods assumes the first one is called before it by starting out with inserting a separator, so you could never call it independently anyway. Just keep stuff in here.

@@ +175,5 @@
> +    // Fetch the list of auto-detectors and build items for them.
> +    let detectors = this.getAutoDetectors();
> +    for (let item of detectors) {
> +      parent.appendChild(itemFactory(doc, item, idPrefix, showAccessKeys,
> +                         "chardet", "detector"));

Please fix the indentation here

@@ +203,5 @@
> +    let sep = separatorFactory(doc);
> +    if (sep)
> +      parent.appendChild(sep);
> +
> +    for (let item of encodings.pinned)

Nit: braces for for loops.

Also, use a more descriptive name than 'item'

@@ +210,5 @@
> +    sep = separatorFactory(doc);
> +    if (sep)
> +      parent.appendChild(sep);
> +
> +    for (let item of encodings.other)

And here.

@@ +218,5 @@
> +  /**
> +   * Retrieve a full list of items for the Auto-Detect submenu.
> +   *
> +   * @returns an Array of items. Each items has the following signature:
> +   * <code lang="js">

Ugh. Don't do this. See also below.

@@ +226,5 @@
> +   *   accesskey: "J",                  // Keyboard shortcut key
> +   *   active: true                     // TRUE when this encoding is currently active
> +   * }
> +   * </code>
> +   */

See also below - this method should have a cache, and you should keep the active checking/setting elsewhere.

@@ +254,5 @@
> +    return detectors;
> +  },
> +
> +  /**
> +   * Retrieve a full list of items for the Character Encoding menu.

Nit: s/full//

Another nit: retrieve implies it's just returning some internal value, but you're actually constructing stuff, and there's no lazy caching or anything, so I'd say that's a misnomer too.

@@ +257,5 @@
> +  /**
> +   * Retrieve a full list of items for the Character Encoding menu.
> +   *
> +   * @param doc Current DOM document object.
> +   * @returns an Array of items. Each items has the following signature:

Nit: grammar

@@ +258,5 @@
> +   * Retrieve a full list of items for the Character Encoding menu.
> +   *
> +   * @param doc Current DOM document object.
> +   * @returns an Array of items. Each items has the following signature:
> +   * <code lang="js">

Again, please don't do this.

@@ +264,5 @@
> +   *   id: "UTF-8",      // ID part of the encoding
> +   *   title: "Unicode", // Title used for item labels
> +   *   accesskey: "U",   // Keyboard shortcut key
> +   *   active: true      // TRUE when this encoding is currently active
> +   * }

This would be more straightforward (and wouldn't need the braces or colons or quotes) if you just described the different recognized properties.

@@ +267,5 @@
> +   *   active: true      // TRUE when this encoding is currently active
> +   * }
> +   * </code>
> +   */
> +  getEncodings: function(doc) {

So... I see the point of doing this... sort of... but then you didn't implement it? Namely: this is a plain JS object, so you should cache it and not generate all this info more than once. Obviously that means you shouldn't make it set the active encoding. I'm not sure why that's necessary anyway, because the previous module didn't do that either. Just don't bother with it and only check that bit when creating the item?

@@ +350,5 @@
> +   *
> +   * @param doc Current DOM document object.
> +   */
> +  getActiveEncoding: function(doc) {
> +    return doc.defaultView.content.document.characterSet;

You should be nullchecking this.
Attachment #8340009 - Flags: review?(gijskruitbosch+bugs) → review-
Comment on attachment 8340010 [details] [diff] [review]
Patch 2: Australis changes.

Review of attachment 8340010 [details] [diff] [review]:
-----------------------------------------------------------------

I'm clearing this because of my r- of the other patch.
Attachment #8340010 - Flags: review?(gijskruitbosch+bugs)
See Also: → 947500
Duplicate of this bug: 947500
Per discussion IRL, taking.
Assignee: mdeboer → gijskruitbosch+bugs
Blair, what do you think of this? I noticed all this moved to toolkit, so I'm not sure how much we should be changing it for the panel UI specifically, but without any labels that list is seriously weird-looking.
Attachment #8362578 - Flags: review?(bmcbride)
Attachment #8340009 - Attachment is obsolete: true
Attachment #8340010 - Attachment is obsolete: true
Comment on attachment 8362578 [details] [diff] [review]
Australis character encoding widget needs updating,

Review of attachment 8362578 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/customizableui/src/CustomizableWidgets.jsm
@@ +658,2 @@
>          window.BrowserSetForcedCharacterSet(value);
> +      } else if (section == "chardet") {

Egh, clearly this should be node.id.contains("chardet")
Blocks: 948978
Comment on attachment 8362578 [details] [diff] [review]
Australis character encoding widget needs updating,

Review of attachment 8362578 [details] [diff] [review]:
-----------------------------------------------------------------

r- based on my just-invented grunts-per-patch scoring system.

This should roughly work - one main issue from a toolkit standpoint though (realMenu param - see below).

General notes: (Some of these aren't directly related to this bug, but arguably the bug has changed, so they are directly related to this patch)
* The sub-headings in the Australis panel really need some visual polish to make them look different from the clickable toolbarbuttons.
* Really need to make that sub-panel have a max-height to it scrolls. On my 1080p screen I can position the window so part of the panel is unusable with this patch - it'll be much worse on lower resolution screens.

::: browser/components/customizableui/src/CustomizableWidgets.jsm
@@ -681,5 @@
> -      while (containerElem.firstChild) {
> -        containerElem.removeChild(containerElem.firstChild);
> -      }
> -
> -      containerElem.addEventListener("command", this.onCommand, false);

I don't know about you, but I was thinking it might be useful to be able to click on things.

@@ +655,2 @@
>  
> +      if (node.id.contains("charset")) {

We know the prefix - should just check the ID starts with prefix+"charset". Safer and faster.

@@ +655,3 @@
>  
> +      if (node.id.contains("charset")) {
> +        let value = node.id.indexOf("charset." + "charset.".length);

Erm...?!

@@ +658,3 @@
>          window.BrowserSetForcedCharacterSet(value);
> +      } else if (section == "chardet") {
> +        let value = node.id.indexOf("chardet." + "chardet.".length);

Ditto: Eh?!

::: toolkit/modules/CharsetMenu.jsm
@@ +93,5 @@
> +
> +let gDetectorInfoCache, gCharsetInfoCache, gPinnedInfoCache;
> +
> +let CharsetMenu = {
> +  build: function BuildCharsetMenu(parent, idPrefix="", showAccessKeys=true, realMenu=true) {

The realMenu parameter feels rather hacky, and is very tightly coupled with what we want for Australis today - which isn't good for toolkit code. Can you find a nice way to change that to an optional builder function (or object), that CustomizableWidgets.jsm supplies to build the DOM nodes? And if it's not supplied, fall back to building a normal menu.

Alternatively, we could have two methods here - one that returns a data structure, and one that builds a DOM tree for a normal menu based on that data structure. And CustomizableWidgets.jsm uses the first method to get the data, and builds the DOM itself. 


Also: Kill the function name, now you've moved Object.freeze?

@@ +104,5 @@
> +      node.setAttribute("label", nodeInfo.label);
> +      if (showAccessKeys && nodeInfo.accesskey) {
> +        node.setAttribute("accesskey", nodeInfo.accesskey);
> +      }
> +      node.id = idPrefix ? idPrefix + nodeInfo.id : nodeInfo.id;

This line is awful to read :)

@@ +106,5 @@
> +        node.setAttribute("accesskey", nodeInfo.accesskey);
> +      }
> +      node.id = idPrefix ? idPrefix + nodeInfo.id : nodeInfo.id;
> +      return node;
> +    };

No semi-colon, this is a function not a function-expression.

@@ +163,5 @@
> +    gCharsetInfoCache.forEach(charsetInfo => parent.appendChild(createDOMNode(doc, charsetInfo)));
> +  },
> +
> +  getDetectorInfo: function() {
> +    return kAutoDetectors.map(([detectorName, DOMID]) => ({

Nit: DOMID makes my right eye twitch.

@@ +223,5 @@
> +  _getDetectorAccesskey: function(detector) {
> +    try {
> +      return gBundle.GetStringFromName("charsetMenuAutodet." + detector + ".key");
> +    } catch (ex) {}
> +    return detector;

This fall back doesn't look right for an accesskey.
Attachment #8362578 - Flags: review?(bmcbride) → review-
OK. I've reworked the patch to only touch toolkit, except for a temporary bit of code to allow the current CUI code to use the .properties file. If we want to ship in 29, we need the strings part landed before next week, hence splitting this up. Blair, does this part at least look landable?
Attachment #8365881 - Flags: review?(bmcbride)
Comment on attachment 8365881 [details] [diff] [review]
part 1: refactor CharsetMenu to allow Australis code to use it,

>+charsetMenuAutodet.uk = Ukranian

Please avoid regressing bug 952736.
Thanks, Henri. :-)
Attachment #8365883 - Flags: review?(bmcbride)
Attachment #8365881 - Attachment is obsolete: true
Attachment #8365881 - Flags: review?(bmcbride)
Comment on attachment 8365883 [details] [diff] [review]
part 1: refactor CharsetMenu to allow Australis code to use it,

>diff --git a/browser/base/content/browser-charsetmenu.inc b/browser/base/content/browser-charsetmenu.inc
>-    <menu label="&charsetMenuAutodet.label;"
>-#ifndef OMIT_ACCESSKEYS
>-          accesskey="&charsetMenuAutodet.accesskey;"
>-#endif
>-        >
>-      <menupopup>
Why are you creating this element dynamically? (Ideally I wanted to be able to have a charset menu without an autodetect option.)

>+const kAutoDetectors = [
>+  ["off", "off"],
>+  ["ja", "ja_parallel_state_machine"],
>+  ["ru", "ruprob"],
>+  ["uk", "ukprob"]
>+];
The detector name is actually "", not "off", but an id of "chardet." looked silly. Of course, the real silliness is embedding the detector name in an id instead of using a custom attribute for it, see bug 956657.

>+      gCharsetInfoCache = this.getCharsetInfo([...kEncodings]);
Why are you cloning this array?
Blocks: 956657
(In reply to neil@parkwaycc.co.uk from comment #43)
> Comment on attachment 8365883 [details] [diff] [review]
> part 1: refactor CharsetMenu to allow Australis code to use it,
> 
> >diff --git a/browser/base/content/browser-charsetmenu.inc b/browser/base/content/browser-charsetmenu.inc
> >-    <menu label="&charsetMenuAutodet.label;"
> >-#ifndef OMIT_ACCESSKEYS
> >-          accesskey="&charsetMenuAutodet.accesskey;"
> >-#endif
> >-        >
> >-      <menupopup>
> Why are you creating this element dynamically? (Ideally I wanted to be able
> to have a charset menu without an autodetect option.)

De-duplication, basically. Sharing more code. The current bits will let you have a menu without the autodetect option, you'll just need to build it yourself off the data getData() provides.


> >+const kAutoDetectors = [
> >+  ["off", "off"],
> >+  ["ja", "ja_parallel_state_machine"],
> >+  ["ru", "ruprob"],
> >+  ["uk", "ukprob"]
> >+];
> The detector name is actually "", not "off", but an id of "chardet." looked
> silly. Of course, the real silliness is embedding the detector name in an id
> instead of using a custom attribute for it, see bug 956657.

Right, but changing that here will hold up this bug even more (because then I get to fix all the consumers), and at least the string changes kind of need to make 29 if we want to ship Australis then.

> >+      gCharsetInfoCache = this.getCharsetInfo([...kEncodings]);
> Why are you cloning this array?

It's not an array, it's a Set. It's handy that it's a Set because we get O(1) (and easy-to-read) .delete() for the pinned items that need to be removed from there. It needs to be an array because that's what getCharsetInfo deals in. In principle, getCharsetInfo could probably use a generator expression ([getInfo(encoding) for (encoding of encodings)] instead of .map to do what it does, and that'd remove the requirement here. I'm not sure that kind of thing is worth a new patch though.
(In reply to Gijs Kruitbosch from comment #44)
> (In reply to comment #43)
> > Why are you creating this element dynamically? (Ideally I wanted to be able
> > to have a charset menu without an autodetect option.)
> 
> De-duplication, basically. Sharing more code. The current bits will let you
> have a menu without the autodetect option, you'll just need to build it
> yourself off the data getData() provides.

[So the irony is that I'm having to duplicate more code than you're saving overall by sharing the script to create the menu instead of leaving it in XUL in both files.]

> > Why are you cloning this array?
> 
> It's not an array, it's a Set.

Sorry, I'd overlooked that.

> It's handy that it's a Set because we get
> O(1) (and easy-to-read) .delete() for the pinned items that need to be
> removed from there.

[Huh, why are they even put in that set in the first place?]
(In reply to neil@parkwaycc.co.uk from comment #45)
> (In reply to Gijs Kruitbosch from comment #44)
> > (In reply to comment #43)
> > > Why are you creating this element dynamically? (Ideally I wanted to be able
> > > to have a charset menu without an autodetect option.)
> > 
> > De-duplication, basically. Sharing more code. The current bits will let you
> > have a menu without the autodetect option, you'll just need to build it
> > yourself off the data getData() provides.
> 
> [So the irony is that I'm having to duplicate more code than you're saving
> overall by sharing the script to create the menu instead of leaving it in
> XUL in both files.]

That sounds bad. Would things be better if the work inside the build method was split up in a buildDetectors and buildCharsets method, and you could elect to just call the latter one, and otherwise get the same output for the charsets? That wouldn't be hard... But also, you could do that in your patch, right? Why are you having to duplicate more code?
 
> > > Why are you cloning this array?
> > 
> > It's not an array, it's a Set.
> 
> Sorry, I'd overlooked that.
> 
> > It's handy that it's a Set because we get
> > O(1) (and easy-to-read) .delete() for the pinned items that need to be
> > removed from there.
> 
> [Huh, why are they even put in that set in the first place?]

Meh. Because it seemed to be clearer for future readers to have a complete set of encodings than one that magically missed UTF8 and Windows-1252 than to spell out which ones were 'pinned' and remove them from a complete list of front-end-supported encodings.
(In reply to Gijs Kruitbosch from comment #46)
> (In reply to comment #45)
> > [So the irony is that I'm having to duplicate more code than you're saving
> > overall by sharing the script to create the menu instead of leaving it in
> > XUL in both files.]
> 
> That sounds bad. Would things be better if the work inside the build method
> was split up in a buildDetectors and buildCharsets method, and you could
> elect to just call the latter one, and otherwise get the same output for the
> charsets? That wouldn't be hard... But also, you could do that in your
> patch, right? Why are you having to duplicate more code?

Sure, I could do that in my patch for bug 956657. (Sorry, I was thinking of what patch(es) I might need for comm-central at the time.)

> > [Huh, why are they even put in that set in the first place?]
> 
> Meh. Because it seemed to be clearer for future readers to have a complete
> set of encodings than one that magically missed UTF8 and Windows-1252 than
> to spell out which ones were 'pinned' and remove them from a complete list
> of front-end-supported encodings.

Isn't that what comments are for? ;-)
FWIW, I would have preferred to get bug 959061 in the current release train, but I didn't  want to collide with this bug. I wonder if there's any chance of getting both in before 29 moves to Aurora... :-(
(In reply to neil@parkwaycc.co.uk from comment #47)
> > Meh. Because it seemed to be clearer for future readers to have a complete
> > set of encodings than one that magically missed UTF8 and Windows-1252 than
> > to spell out which ones were 'pinned' and remove them from a complete list
> > of front-end-supported encodings.
> 
> Isn't that what comments are for? ;-)

Hahahaha.... *ahem* sorry. Mikes way is more foolproof.
(In reply to Blair McBride [:Unfocused] from comment #49)
> Hahahaha.... *ahem* sorry. Mikes way is more foolproof.

Er, Gijs' way.
(In reply to Henri Sivonen (:hsivonen) from comment #48)
> FWIW, I would have preferred to get bug 959061 in the current release train,
> but I didn't  want to collide with this bug. I wonder if there's any chance
> of getting both in before 29 moves to Aurora... :-(

Yea, should be able to get both in.
(Sorry, this being Australis, it's had to get priority.)
Attachment #8365883 - Flags: review?(bmcbride) → review+
remote:   https://hg.mozilla.org/integration/fx-team/rev/33f826ee0dfc
Whiteboard: [Australis:M?][Australis:P2] → [Australis:P2][leave open]
>   7.215 +  _getCharsetAccessKey: function(charset) {
>   7.216 +    try {
>   7.217 +      accesskey = gBundle.GetStringFromName(charset + ".key");
>   7.218 +    } catch (ex) {}
>   7.219 +    return "";
>   7.220 +  },

The line inside the try block should be a return statement instead of being an assignment, right?
(In reply to Henri Sivonen (:hsivonen) from comment #54)
> The line inside the try block should be a return statement instead of being
> an assignment, right?

Yes, it should.
(In reply to Mike de Boer [:mikedeboer] from comment #55)
> (In reply to Henri Sivonen (:hsivonen) from comment #54)
> > The line inside the try block should be a return statement instead of being
> > an assignment, right?
> 
> Yes, it should.

Ugh. Can't believe I missed that. Fixed on fx-team:

remote:   https://hg.mozilla.org/integration/fx-team/rev/edb9aca4de06
Attachment #8365883 - Flags: checkin+
For part 2, please note the new FoldCharset() function per bug 959061 comment 8. That function treats ISO-8859-8-I as windows-1255 (two Hebrew encodings that differ by one currency symbol code point) and gb18030 as gbk (two Simplified Chinese encodings that now decode the same even though the encoders differ) for the purposes of showing which item is currently active.
(In reply to Tim Taubert [:ttaubert] from comment #53)
> https://hg.mozilla.org/mozilla-central/rev/33f826ee0dfc

Sun Feb 02 2014 23:44:34
Error: undefined entity
Source file: chrome://communicator/content/charsetOverlay.xul
Line: 34, Column: 7
Source code:
      <menu id="charsetMenuAutodet"

Bah humbug.
See Also: → 943732
Comment on attachment 8369758 [details] [diff] [review]
, Australis charset view part 2: implement new view for Australis,

Review of attachment 8369758 [details] [diff] [review]:
-----------------------------------------------------------------

::: browser/components/customizableui/src/CustomizableWidgets.jsm
@@ +676,5 @@
> +      let currentCharset = content && content.document && content.document.characterSet;
> +      if (currentCharset) {
> +        currentCharset = aDocument.defaultView.FoldCharset(currentCharset);
> +      }
> +      currentCharset = currentCharset ? "charset." + currentCharset : "";

Nit: If you're using ternary operators, add brackets to avoid ambiguity/aid readability around any expression that's more than just an identifier/value.

@@ +681,5 @@
> +
> +      let pinnedContainer = aDocument.getElementById("PanelUI-characterEncodingView-pinned");
> +      let charsetContainer = aDocument.getElementById("PanelUI-characterEncodingView-charsets");
> +      let elements = [...(pinnedContainer.childNodes)];
> +      elements = elements.concat([...(charsetContainer.childNodes)]);

Nit: Can shorten this if you'd like:
  let elements = [...pinnedContainer.childNodes, ...charsetContainer.childNodes];
Attachment #8369758 - Flags: review?(bmcbride) → review+
w/ nits:

remote:   https://hg.mozilla.org/integration/fx-team/rev/84fef38d71e8
Whiteboard: [Australis:P2][leave open] → [Australis:P2][fixed-in-fx-team]
Depends on: 968277
https://hg.mozilla.org/mozilla-central/rev/84fef38d71e8
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [Australis:P2][fixed-in-fx-team] → [Australis:P2]
Target Milestone: --- → Firefox 30
Comment on attachment 8369758 [details] [diff] [review]
, Australis charset view part 2: implement new view for Australis,

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Australis
User impact if declined: Confusing character encoding subview where sometimes no items are selected, where the wrong items are shown/hidden (see dependent bugs)
Testing completed (on m-c, etc.): been on m-c, no known regressions
Risk to taking this patch (and alternatives if risky): reasonably low. Backing out the other changes for the main menu isn't really an option at this point.
String or IDL/UUID changes made by this patch: None.
Attachment #8369758 - Flags: approval-mozilla-aurora?
Attachment #8369758 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Depends on: 969279
No longer depends on: 969279
QA Contact: cornel.ionce
Verified this issue as fixed on latest Aurora (build ID: 20140307004002) and on latest Nightly (build ID: 20140307030202).
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.