Our current "language groups" are a leftover from the pre-Unicode world of multiple codepages and codepage-specific fonts, and do not serve our needs well any longer. They are used as the key to specifying default fonts, but the distinctions they make are no longer helpful. For example, "Baltic", "Turkish" and "Western" languages are all written in Latin script, and should normally share font preferences; the split dates back to old codepages where no single codepage provided all the necessary accented characters, but is obsolete in the Unicode world. And the meanings of "Other Languages" and "User Defined" (does that mean x-unicode internally?) are far from clear in the font preferences.

At the same time, the langGroups do not provide users with the flexibility they want - hence periodic requests to create new langGroup values such as Tibetan, Persian, Macedonian, etc., whenever people want to set specific default fonts for a language that is not currently exposed as its own "group".

To update and improve this situation, and to allow better user control, I propose that we replace the langGroup-based font prefs with a system based on BCP 47 "Tags for Identifying Languages" (http://tools.ietf.org/html/bcp47). This provides a standard model for tags that can incorporate language, script, and region, as well as rules for specific/generic tag matching.

I envisage that font preferences will primarily be expressed in terms of script, with the language subtag normally being a "wildcard" (and the trailing region subtag being omitted); for example, the default Latin font would be listed under "*-Latn", the default Arabic as "*-Arab", etc. However, there will be the flexibility to create preferences for specific languages, so that if Persian users want different default fonts from the Arabic one, these can be provided as "fa-Arab". Or a different font preference for West African Arabic might be specified as "*-Arab-011" (where 011 is the IANA-registered subtag for Western Africa).

Fonts will then be resolved using the script of the text, in combination with the language (where available) and the user's locale (if not overridden by an extended lang tag), and finding the most specific match among the available preferences. So Arabic-script text tagged as "fa" would use the "fa-Arab" fonts if defined; but if not, it would fall back to the "*-Arab" fonts.

This will give us a consistent, extensible model where localizers or users can specify additional font preferences as needed, and have them automatically used in the right contexts, rather than being constrained by the fixed (and artificial) collection of defined langGroups. The font preferences UI will need some corresponding redesign; with care, we should be able to make it both clearer and more useful.
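To make the proposed fallback concrete, here is a rough sketch in Python of how the "most specific match" lookup might work. Nothing here is existing Gecko code: the function names, the shape of the preference table, and the font names are placeholders, and the ordering of candidates (an exact language match is tried before a wildcard-language-plus-region match) is only one possible reading of "most specific".

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical preference table keyed by BCP 47-style tags, using "*" as the
# language wildcard as in the proposal. Font names are placeholders only.
FONT_PREFS: Dict[str, List[str]] = {
    "*-Latn": ["Example Latin Serif"],
    "*-Arab": ["Example Arabic Naskh"],
    "fa-Arab": ["Example Persian Naskh"],
    "*-Arab-011": ["Example West African Arabic"],
}


def candidate_tags(script: str,
                   language: Optional[str] = None,
                   region: Optional[str] = None) -> List[str]:
    """List tags to try, from most specific to least specific.

    Assumed order: lang-Script-Region, lang-Script, *-Script-Region, *-Script.
    """
    languages = [language, "*"] if language else ["*"]
    regions = [region, None] if region else [None]
    tags = []
    for lang in languages:
        for reg in regions:
            tag = f"{lang}-{script}"
            if reg:
                tag += f"-{reg}"
            tags.append(tag)
    return tags


def resolve_fonts(prefs: Dict[str, List[str]],
                  script: str,
                  language: Optional[str] = None,
                  region: Optional[str] = None
                  ) -> Optional[Tuple[str, List[str]]]:
    """Return (matched tag, font list) for the first candidate found in prefs."""
    for tag in candidate_tags(script, language, region):
        if tag in prefs:
            return tag, prefs[tag]
    return None  # no preference defined for this script at all
```

With this table, Arabic-script text tagged "fa" resolves to the "fa-Arab" entry; if that entry is removed, the same lookup falls back to "*-Arab". Text in a script with no entry (e.g. "Tibt") gets no match, and the caller would need some global default.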
This sounds good to me, but I don't think it should be a priority.
Rescoping this bug to focus specifically on font and encoding negotiation, per bug 356038. See also: https://wiki.mozilla.org/User:GPHemsley/BCP_47
Hardware: x86 → All
Summary: Replace langGroups with BCP 47-based language/script/region coding → Implement font and encoding negotiation based on BCP 47
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #1)
> This sounds good to me, but I don't think it should be a priority.

Considering that this change would be desirable but is not a priority for the core gfx developers, I wonder if a gfx developer would be willing to mark this as a mentored bug and see what happens.
I'm not sure if the last comment and dependency changes belong more here or in bug 356038.
Is that something that we can do now with LocaleService::negotiateLanguages and LocaleService::Locale?
(In reply to Zibi Braniecki [:gandalf][:zibi] from comment #7)
> Is that something that we can do now with LocaleService::negotiateLanguages
> and LocaleService::Locale?

I think in principle we could probably use NegotiateLanguages to choose which font prefs to apply to a given run of text, but before that is actually feasible we'd need to restructure how the prefs themselves are specified (get rid of the archaic charset-based "langGroup" concept, and use prefs labelled with BCP47 tags instead). And design a new UI to work with that.

So that's the real work to be done here, I think. LocaleService doesn't directly help us with that, but once that's done, we can probably use it to handle the actual negotiation step.
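As a rough illustration of the pref restructuring step, here is a sketch in Python of relabelling langGroup-keyed font lists with BCP 47-style tags. It is not based on any existing migration code. The internal group identifiers ("x-western", "x-baltic", "tr") are assumed spellings of the "Western", "Baltic" and "Turkish" groups mentioned in the description, and the mapping covers only those three; anything else is deliberately left out for manual handling.

```python
from typing import Dict, List

# Hypothetical mapping from legacy langGroup identifiers to script-based tags.
# Only the three Latin-script groups named in the description are covered.
LEGACY_GROUP_TO_TAG: Dict[str, str] = {
    "x-western": "*-Latn",
    "x-baltic": "*-Latn",
    "tr": "*-Latn",
}


def migrate_font_prefs(legacy: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Merge langGroup-keyed font lists into tag-keyed lists.

    Fonts from groups that map to the same tag are concatenated in the order
    the groups are encountered, with duplicates dropped. Groups with no known
    mapping are skipped rather than guessed at.
    """
    migrated: Dict[str, List[str]] = {}
    for group, fonts in legacy.items():
        tag = LEGACY_GROUP_TO_TAG.get(group)
        if tag is None:
            continue  # unclear groups (e.g. "User Defined") need a human decision
        merged = migrated.setdefault(tag, [])
        for font in fonts:
            if font not in merged:
                merged.append(font)
    return migrated
```

The point of the sketch is that several legacy groups collapse into a single script-based entry, which is where user-visible behaviour could change if the old groups had different fonts configured.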