Closed Bug 1023451 Opened 5 years ago Closed 5 years ago

Some locale names not displayable in locale picker with default fonts

Categories

(Firefox for Android :: Locale switching and selection, defect)

32 Branch
All
Android
defect
Not set

Tracking

()

VERIFIED FIXED
Firefox 33
Tracking Status
firefox32 --- fixed
firefox33 --- verified
fennec 32+ ---

People

(Reporter: aaronmt, Assigned: rnewman)

References

(Blocks 1 open bug)

Details

Attachments

(5 files, 1 obsolete file)

Attached image screenshot.png
Currently, shipping locales are not displayed if the device is missing a font. In my case, my Nexus 7 (2013, 4.4.3) does not ship with a Bangla font. The user will see the locales listed as whitespace or empty lines.

See screenshot.

In my case, the affected locales are 'gu-IN', 'pa-IN', and 'or'.

Selecting one of those locales essentially 'breaks' the browser: no strings are shown (see the second screenshot attached).

--
Asus Nexus 7 (2013, Android 4.4.3)
Aurora (06/10)
Lastly, content will show the traditional missing-glyph blocks.
Anecdotally, HTC, Samsung, and Sony devices ship with enough font coverage. 

I've done a little research on whether it's possible to detect if a device can render a glyph; it's not easy, but there are hacks (like drawing to two bitmaps and comparing). 

Lastly, this might be our encouragement to ship Fira Sans, if it has enough glyphs.
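
A rough sketch of the two-bitmap comparison, for illustration only. It is plain Java and assumes both renderings have already been captured as raw pixel arrays; the class and method names (`GlyphCompare`, `looksUnrendered`) are made up for this comment and don't come from any patch.

```java
import java.util.Arrays;

/** Hypothetical helper: compares two already-rendered pixel buffers. */
final class GlyphCompare {
    private GlyphCompare() {}

    /**
     * Returns true if the candidate rendering is pixel-for-pixel identical to
     * the rendering of a character that is known to be missing on this device.
     * Identical output suggests the candidate has no usable glyph either.
     */
    static boolean looksUnrendered(byte[] candidatePixels, byte[] knownMissingPixels) {
        if (candidatePixels == null || knownMissingPixels == null) {
            throw new IllegalArgumentException("pixel buffers must not be null");
        }
        return Arrays.equals(candidatePixels, knownMissingPixels);
    }
}
```

The comparison is only as good as the choice of the "known missing" reference character, which is part of why this counts as a hack.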
(In reply to Richard Newman [:rnewman] from comment #3)
> Anecdotally, HTC, Samsung, and Sony devices ship with enough font coverage. 
> 
> I've done a little research on whether it's possible to detect if a device
> can render a glyph; it's not easy, but there are hacks (like drawing to two
> bitmaps and comparing). 
> 
> Lastly, this might be our encouragement to ship Fira Sans, if it has enough
> glyphs.

There's been a lot of work for Bengali scripts in Firefox OS. I imagine that Fira Sans would have the glyphs for this.
What other options do we have here? 

I don't want to replace our default sans serif typeface with Fira. I would actually prefer to fall back to the android system fonts for glyphs we don't support, if we have to. 

For the languages it supports, Clear Sans is a much stronger choice than Fira for Fennec in terms of overall design and readability.
(In reply to Ian Barlow (:ibarlow) from comment #5)
> What other options do we have here? 
> 
> I don't want to replace our default sans serif typeface with Fira. I would
> actually prefer to fall back to the android system fonts for glyphs we don't
> support, if we have to. 
That's part of the challenge here. We're venturing into the land of unsupported Android locales, so the Android system fonts will likely not be a stronger option than what we would need to deliver to provide comprehensive character support.

If this is Clear Sans (http://www.fontsquirrel.com/fonts/clear-sans), it seems like their glyphs don't support the range of code points we'd need.

The info here may be useful: http://www.unicode.org/resources/fonts.html
I don't think we specify a particular font, so we're using Android default (Roboto?). Evidence for this: it works on my HTC.
Summary: Care for listed locales with missing device font → Some locale names not displayable in locale picker with default fonts
A related issue: it's not clear that all Android versions will display Bengali, Tamil, etc. even with the correct font -- reports online indicate that either (a) this was fixed in Jellybean, or (b) Samsung and friends ship harfbuzz or a custom skia to render text correctly.

We should be aware that if we manage to get the fonts right, we still might not have usable character conjunction.
I did a quick survey.

https://etherpad.mozilla.org/aurora-locale-font-test

I think the conclusion is: if we're committed to using Android system fonts, then we must only enable affected locales ('gu-IN', 'pa-IN', 'or') on Samsung and HTC devices, probably only recent ones.

(That is: we use that as a heuristic for two attributes: having the right fonts, and having the right conjunction behavior. We could also attempt to actually render the text and see if it's correct.)

All Google devices are broken, and I suspect that even Android devices that show "Bengali (India)" won't actually display Bengali text correctly.

This solution will leave rooted users out in the cold.
Second heuristic: if your device provides "Bengali (India)" (or any other non-localized string) for a locale, it can't display it anyway. If it could, it would get the name right.
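
A minimal sketch of that heuristic, assuming the locale's native name is written in a non-Latin script, so an ASCII-only label can only be a fallback. The names `LabelHeuristic` and `looksLikeFallbackLabel` are hypothetical.

```java
/** Illustrative only: flags labels that cannot be in a non-Latin native script. */
final class LabelHeuristic {
    private LabelHeuristic() {}

    /**
     * Returns true if the label is empty or consists solely of ASCII code points,
     * e.g. an English name returned in place of the native one.
     */
    static boolean looksLikeFallbackLabel(String label) {
        if (label == null || label.isEmpty()) {
            return true;
        }
        for (int i = 0; i < label.length(); ) {
            int codePoint = label.codePointAt(i);
            if (codePoint > 0x7F) {
                return false; // At least one non-ASCII character: not a plain English label.
            }
            i += Character.charCount(codePoint);
        }
        return true;
    }
}
```

This would misfire for any locale whose native name is itself ASCII, so it only makes sense for the handful of locales under discussion.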
As I understand it, this bug has a larger scope now in that essentially we can't ship locales that will not display correctly with fonts shipped with different devices.
(In reply to Aaron Train [:aaronmt] from comment #12)
> As I understand it, this bug has a larger scope now in that essentially we
> can't ship locales that will not display correctly with fonts shipped with
> different devices.

I would argue that that's poisoning the well. The locales in question for this particular bug work on the majority of devices in India, which is ultimately the target audience and where the bulk of users will be accessing these localizations. People who want to use a locale outside of the regional distribution of the device are the main people affected by this, as OEMs are continually adding support for more and more regional languages.

There are bugs on file currently to explore alternative routes of font delivery/management in Fennec. We should consider the ultimate costs of shipping a Unicode-enabled font to use as the default when the system default can't perform.
(In reply to Aaron Train [:aaronmt] from comment #12)
> As I understand it, this bug has a larger scope now in that essentially we
> can't ship locales that will not display correctly with fonts shipped with
> different devices.

It's also worth noting that this largely affects only three locales, and the failing doesn't really generalize to other locales we might add.

(Issues with font rendering might affect others, but that's always been the case with Android, even for built-in locales.)
Assignee: nobody → rnewman
tracking-fennec: ? → 32+
There are at least two bugs here.

The first is to stop users getting into a bad situation -- hide locales that we can't actually use. That's what I'm tackling now. This will be a mix of heuristics and runtime checks.

The second (and onward) are to remove the need for restriction, via:
  * Shipping new fonts, allowing Android's own renderer to work for more locales (particularly on 4.4+)
  * Bundling harfbuzz and new fonts (which is the indic-text-renderer concept in Comment 15)
  * ...


Perhaps obviously, doing a proper job of part 2 (I doubt that just throwing fonts at the problem will suffice) involves significant low-level work, and I don't plan even to scope that in the near future.
Status: NEW → ASSIGNED
To my great surprise, this seems to work on the first attempt. Testing on my Nexus 10 it skips three locales (but keeps Bangla because it can display it!), and on my HTC One it keeps all three.
APK is here:

http://people.mozilla.org/~rnewman/fennec/skip-unsupported.apk

Aaron, could you try this on your Nexus and a modern Samsung or HTC, see how it works for you?
Flags: needinfo?(aaron.train)
(In reply to Richard Newman [:rnewman] from comment #19)
> APK is here:
> 
> http://people.mozilla.org/~rnewman/fennec/skip-unsupported.apk
> 
> Aaron, could you try this on your Nexus and a modern Samsung or HTC, see how
> it works for you?

I tried this on the Samsung Galaxy S5 (4.4.2), my LG Nexus 5 (4.4.3), and an HTC One (4.4.2). Those locales are excluded only on my Nexus 5.

On my Nexus 5, I do not see pa_IN, gu_IN nor 'or'.

On the other devices they are available in the menu in proper script. Note that they are all pretty much untranslated (w/ english strings) everywhere in the browser.
Flags: needinfo?(aaron.train)
(In reply to Aaron Train [:aaronmt] from comment #20)

> On my Nexus 5, I do not see pa_IN, gu_IN nor 'or'.
> 
> On the other devices they are available in the menu in proper script.

That's awesome, thanks!

> that they are all pretty much untranslated (w/ english strings) everywhere
> in the browser.

Yeah, I didn't do a proper build, so I don't think it packed the strings for the non-Nightly locales; just wanted to validate the filtering.


Jeff, does this approach seem sane to you, bearing in mind Comment 16?
(In reply to Richard Newman [:rnewman] from comment #21)
> (In reply to Aaron Train [:aaronmt] from comment #20)
> 
> > On my Nexus 5, I do not see pa_IN, gu_IN nor 'or'.
> > 
> > On the other devices they are available in the menu in proper script.
> 
> That's awesome, thanks!
> 
> > that they are all pretty much untranslated (w/ english strings) everywhere
> > in the browser.
> 
> Yeah, I didn't do a proper build, so I don't think it packed the strings for
> the non-Nightly locales; just wanted to validate the filtering.
> 
> 
> Jeff, does this approach seem sane to you, bearing in mind Comment 16?

Yes, this approach makes sense until we can scope out the long term solution. Thanks for your quick action, rnewman :-)
Attachment #8439691 - Flags: review?(michael.l.comella)
Attachment #8439692 - Flags: review?(michael.l.comella)
Attachment #8439691 - Flags: review?(michael.l.comella) → review+
Comment on attachment 8439692 [details] [diff] [review]
Part 2: apply basic heuristics for locale usability. v1

Review of attachment 8439692 [details] [diff] [review]:
-----------------------------------------------------------------

Are we using the default android font? If so, are these locales not appearing on some devices because each device ships their own default fonts with different sets of locales?

::: mobile/android/base/preferences/LocaleListPreference.java
@@ +32,5 @@
> +     * initial solution.
> +     */
> +    private static class CharacterValidator {
> +        private static final int BITMAP_WIDTH = 32;
> +        private static final int BITMAP_HEIGHT = 48;

As per discussion on IRC, add a comment as to why you chose these sizes. :)

@@ +47,5 @@
> +            Canvas c = new Canvas(b);
> +            c.drawText(text, 0, BITMAP_HEIGHT / 2, this.paint);
> +            return b;
> +        }
> +        private static byte[] getPixels(Bitmap b) {

nit: newline above.

@@ +139,5 @@
> +            final boolean checkFirstCharacter;
> +
> +            // Oh, for Java 7 switch statements.
> +            if (this.tag.equals("bn-IN")) {
> +                if (!this.nativeName.startsWith("বাংলা")) {

Why is this font an exception?

@@ +147,5 @@
> +                }
> +                checkFirstCharacter = true;
> +            } else if (this.tag.equals("or") ||
> +                       this.tag.equals("pa-IN") ||
> +                       this.tag.equals("gu-IN")) {

How did you select these fonts? Is it a whitelist of fonts that have issues?
(In reply to Michael Comella (:mcomella) from comment #23)

> Are we using the default android font?

Yes.


> If so, are these locales not
> appearing on some devices because each device ships their own default fonts
> with different sets of locales?

There are a bunch of concepts here which I haven't defined yet in one place, so here goes.


Locale: "bn-IN". Corresponds to a language (in this case Bengali/Bangla, as spoken in India) along with a collection of rules (e.g., whether the decimal separator is "." as in English, or "," as in Finnish).

Script: languages are written in a script. Bengali is also the name of the script in this case. Russian: Cyrillic script. English: Latin script. Etc.

Glyph: very loosely speaking, scripts are composed of glyphs. A glyph is a picture that represents a character. Again loosely speaking, glyphs are identified by Unicode codepoints. Not all languages will use all of the glyphs in their script -- English pretty much gets away with US-ASCII, which doesn't include, e.g., þ.

Label: this varies based on the locale you're using for display, and also based on the Java environment. "Bengali (India)" or "বাংলা" would both be acceptable, but in this case the latter is what we expect.

Font: Android uses font fallbacks. Some devices ship fonts which contain more glyphs (and thus support more scripts) than others. Some devices are better at rendering glyphs into text (shipping non-standard stuff like harfbuzz), which is important because many (most?) languages are more complicated than just "draw these simple glyphs in order".

Again, loosely speaking, a font is a bag of glyphs.
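
To make the "bag of glyphs" and fallback ideas concrete, here's a toy model in plain Java. It is not how Android implements fallback; `FallbackChain` and `ToyFont` are invented for this comment, and each "font" is just a set of code points.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Toy model of font fallback; not a description of Android's implementation. */
final class FallbackChain {
    private FallbackChain() {}

    /** A toy font: a name plus the set of code points it has glyphs for. */
    static final class ToyFont {
        final String name;
        final Set<Integer> codePoints;

        ToyFont(String name, Integer... codePoints) {
            this.name = name;
            this.codePoints = new HashSet<Integer>(Arrays.asList(codePoints));
        }
    }

    /**
     * Walks the chain in order and returns the name of the first font that
     * covers the code point, or null if no font in the chain does.
     */
    static String fontFor(List<ToyFont> chain, int codePoint) {
        for (ToyFont font : chain) {
            if (font.codePoints.contains(codePoint)) {
                return font.name;
            }
        }
        return null;
    }
}
```

In this model a null result corresponds to the total failure case: nothing in the device's chain has the glyph.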


Rendering text: the act of drawing a sequence of glyphs, supplied by a font, according to the rules of the locale/script.


We've found that Android devices fail at this for one of several reasons, the primary two being:

* Total failure: missing glyphs in the shipping fonts. They draw whitespace instead (when they should really draw a .notdef glyph).
* Moderate failure: missing rendering capabilities (e.g., ligatures, conjunct glyphs) -- see Comment 8 and later.


What we're trying to do with this patch is solve the former: attempt to figure out if this device is able to render text in the language used by this locale at all, or if it's just going to render whitespace or something else obviously wrong. We don't want to let a user pick a locale that is utterly unusable.

The hack we're using to do that is:

* Is the label of the locale at least in the right script? We've seen some devices that _know Bengali exists_ but don't have enough support, so they return the label in English. No chance they'll render the script correctly.

* Is the font hierarchy on the device able to render the name of the locale in its native script?

We figure that out by drawing the first character of the name. If it draws whitespace, we assume that the font stack on the device is missing the glyph.


We do not yet address bad rendering, because that's a harder problem.
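
For illustration only, the two checks above could be sketched like this in plain Java. This is not the code in the patch: the script check here uses the Java SE `Character.UnicodeScript` API instead of a prefix comparison, the pixel array is assumed to have been extracted from the rendered bitmap already, and all names (`LocaleUsabilityCheck` and its methods) are hypothetical.

```java
/** Illustrative only; names are hypothetical and not taken from the patch. */
final class LocaleUsabilityCheck {
    private LocaleUsabilityCheck() {}

    /** Check 1: does the label start with a character from the expected script? */
    static boolean labelStartsInScript(String label, Character.UnicodeScript expected) {
        if (label == null || label.isEmpty()) {
            return false;
        }
        return Character.UnicodeScript.of(label.codePointAt(0)) == expected;
    }

    /** Extracts the first character of a non-empty label, keeping surrogate pairs intact. */
    static String firstCharacter(String label) {
        return label.substring(0, label.offsetByCodePoints(0, 1));
    }

    /**
     * Check 2: treats a rendering as whitespace when every pixel has the same
     * value, i.e. drawing the character left the bitmap untouched.
     */
    static boolean isBlank(byte[] pixels) {
        for (byte pixel : pixels) {
            if (pixel != pixels[0]) {
                return false;
            }
        }
        return true;
    }
}
```

Neither check says anything about whether conjuncts or ligatures come out right; a device can pass both and still render the text badly.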


> > +            // Oh, for Java 7 switch statements.
> > +            if (this.tag.equals("bn-IN")) {
> > +                if (!this.nativeName.startsWith("বাংলা")) {
> 
> Why is this font an exception?

s/font/locale:

This is a check to make sure we don't get "Bengali (India)" with no font support. We can't just check that we have a printable label -- we also need to make sure it's not a fallback English label!


> How did you select these fonts? Is it a whitelist of fonts that have issues?

Those are locales that aren't supported by mainstream Android. We did testing (e.g., Comment 10) to figure out the problematic locales.

There's not much point in putting this in some kind of declarative file; it doesn't change very often, and the test is unlikely to generalize to every locale.
nit: Add a brief comment to `CharacterValidator` describing roughly what it, and the analysis we use it for, is accomplishing, as per your latest comment 24. 

(In reply to Richard Newman [:rnewman] from comment #24)
> > Why is this font an exception?
> 
> s/font/locale:
> 
> This is a check to make sure we don't get "Bengali (India)" with no font
> support. We can't just check that we have a printable label -- we also need
> to make sure it's not a fallback English label!

Why don't the other locales we've whitelisted need similar considerations?

> > How did you select these fonts? Is it a whitelist of fonts that have issues?
> 
> Those are locales that aren't supported by mainstream Android. We did
> testing (e.g., Comment 10) to figure out the problematic locales.
> 
> There's not much point in putting this in some kind of declarative file; it
> doesn't change very often, and the test is unlikely to generalize to every
> locale.

nit: Add a comment mentioning that this listing was found through the testing in comment 10 (just direct link to it in the code?).
(In reply to Michael Comella (:mcomella) from comment #25)
> nit: Add a brief comment to `CharacterValidator` describing roughly what it,
> and the analysis we use it for, is accomplishing, as per your latest comment
> 24. 
> 
> (In reply to Richard Newman [:rnewman] from comment #24)
> > > Why is this font an exception?
> > 
> > s/font/locale:
> > 
> > This is a check to make sure we don't get "Bengali (India)" with no font
> > support. We can't just check that we have a printable label -- we also need
> > to make sure it's not a fallback English label!
> 
> Why don't the other locales we've whitelisted need similar considerations?
They do, but they also use the Bengali script, which is why Bengali is being called out here.
> 
> > > How did you select these fonts? Is it a whitelist of fonts that have issues?
> > 
> > Those are locales that aren't supported by mainstream Android. We did
> > testing (e.g., Comment 10) to figure out the problematic locales.
> > 
> > There's not much point in putting this in some kind of declarative file; it
> > doesn't change very often, and the test is unlikely to generalize to every
> > locale.
> 
> nit: Add a comment mentioning that this listing was found through the
> testing in comment 10 (just direct link to it in the code?).
> > Why don't the other locales we've whitelisted need similar considerations?
>
> They do, but they also use the Bengali script, which is why Bengali is being
> called out here.

More specifically: Bengali is the only locale we saw where the label could be English when the script was missing. This is a workaround for a flaw in the hack -- the other locales that use Bengali script don't fall back to English labels, so the character check works (and generalizes better).
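
Roughly, the per-locale decision has this shape. This is a simplified sketch written for this comment, not the patch itself: the `CharacterCheck` interface stands in for the bitmap-based check, the parts of the diff not quoted in the review are guessed at, and the Bengali name is written with Unicode escapes.

```java
/** Simplified sketch of the per-locale decision; not the code that landed. */
final class LocaleFilter {
    private LocaleFilter() {}

    /** Hypothetical hook for the "can this device draw the character?" check. */
    interface CharacterCheck {
        boolean canRender(String character);
    }

    // The native name expected for bn-IN, as quoted in the review.
    private static final String BENGALI_NAME = "\u09AC\u09BE\u0982\u09B2\u09BE";

    static boolean isUsable(String tag, String nativeName, CharacterCheck check) {
        if (nativeName == null || nativeName.isEmpty()) {
            return false;
        }
        final boolean checkFirstCharacter;
        if ("bn-IN".equals(tag)) {
            // Bengali is the one case seen with an English fallback label,
            // so reject that before looking at any glyphs.
            if (!nativeName.startsWith(BENGALI_NAME)) {
                return false;
            }
            checkFirstCharacter = true;
        } else if ("or".equals(tag) || "pa-IN".equals(tag) || "gu-IN".equals(tag)) {
            checkFirstCharacter = true;
        } else {
            // Locales not found to be problematic are not checked at all.
            checkFirstCharacter = false;
        }
        if (!checkFirstCharacter) {
            return true;
        }
        String first = nativeName.substring(0, nativeName.offsetByCodePoints(0, 1));
        return check.canRender(first);
    }
}
```

Keeping the rendering check behind an interface also means the branching can be exercised without a device.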
Comments addressed.
Attachment #8443485 - Flags: review?(michael.l.comella)
Attachment #8439692 - Attachment is obsolete: true
Attachment #8439692 - Flags: review?(michael.l.comella)
Attachment #8443485 - Flags: review?(michael.l.comella) → review+
   https://hg.mozilla.org/integration/fx-team/rev/12388987c0f2
   https://hg.mozilla.org/integration/fx-team/rev/b2fd2f60ebad

ni for uplift.
Flags: needinfo?(rnewman)
Hardware: ARM → All
Target Milestone: --- → Firefox 33
Comment on attachment 8443485 [details] [diff] [review]
Part 2: apply basic heuristics for locale usability. v2

Approval request for parts 1 and 2.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 
  Android limitations.

User impact if declined: 
  Some phones will show non-displayable locales in the picker.

Testing completed (on m-c, etc.): 
  Tested by hand, baked on Nightly.

Risk to taking this patch (and alternatives if risky): 
  All risk is isolated to the locale picker, so we're happy with that. Alternative would be some kind of whitelist/blacklist using device names, but that sucks hard.

String or IDL/UUID changes made by this patch:
  None
Attachment #8443485 - Flags: approval-mozilla-aurora?
Flags: needinfo?(rnewman)
Status: RESOLVED → VERIFIED
Comment on attachment 8443485 [details] [diff] [review]
Part 2: apply basic heuristics for locale usability. v2

Verified in 33. Taking it for 32.
Attachment #8443485 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Depends on: 1049217
Blocks: 1075550
Blocks: 1136171