Closed Bug 706888 Opened 13 years ago Closed 12 years ago

Arabic joining does not work properly on the native UI

Categories

(Core :: Graphics, defect, P3)

ARM
Android
defect

Tracking

()

RESOLVED FIXED
mozilla14
Tracking Status
fennec 12+ ---

People

(Reporter: ehsan.akhgari, Assigned: jfkthame)

References

()

Details

(Keywords: intl)

Attachments

(4 files)

It works for some of the headlines on this page, so I suspect that it might be related to fonts somehow.  I don't remember having this problem on the XUL version.
Keywords: intl
(In reply to Ehsan Akhgari [:ehsan] from comment #0)
> It works for some of the headlines on this page, so I suspect that it might
> be related to fonts somehow.  I don't remember having this problem on the
> XUL version.

Does it happen on XUL?
OS: Mac OS X → Android
Hardware: x86 → ARM
I might be missing something but short of the address bar the the text looks the same.
The page content looks fine; however, the address bar is thoroughly broken. This is a pretty serious functional regression that we're going to suffer as a result of the switch to native UI, but fixing it will be tricky on devices that don't natively support Arabic (or other complex-shaping scripts).
Do we know if this is a problem on Android device that have their OS defaulted to Arabic?
ehsan, which android OS and device are you on?  arabic isn't officially supported in versions less than 3.  http://blog.afriza.co.cc/archives/23
correction: wasn't officially support until honeycomb.  It now is.
This is an Android issue
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → INVALID
It's still a Firefox regression between the XUL version and the Android-native version, though, as we used to be able to handle this.
http://colincooper.net/?p=238 mentions that arabic is in 2.3, confirmed by http://developer.android.com/sdk/android-3.2.html#locs.

That doesn't mean that it's not without glitches. Hindi for example is having a split story in terms of rendering, some devices seem to be doing it OK, though stock seems to be broken in rendering.
I'm on Gingerbread based CyanogenMod, which has correct system-wide support for Arabic shaping.  But I was not talking about the parts of the UI drawn by Android, I was talking about the content area part drawn by Gecko.

In today's nightly, here's the behavior I get:

The shaping is broken on bbcpersian.com as it is loaded.  It gets fixed when it's finished loading.

The shaping is completely broken on Google search.

I will attach a screenshot.
Status: RESOLVED → REOPENED
Resolution: INVALID → ---
Attached image screenshot
I think the issue here is that your device has a font that includes Arabic glyphs but lacks OpenType layout tables, so it doesn't really support Arabic-script rendering. The bbcpersian site uses a web font, so once that loads, it renders OK, but sites that rely on the device's local font support will be broken. See also bug 676068.
Attachment #585584 - Attachment mime type: text/plain → application/octet-stream
OK, so the problem is that your device has a bad DroidSansArabic font. Specifically, the font includes GDEF/GSUB/GPOS tables, which should support OpenType Arabic shaping, but those tables are completely wrong - the substitution rules, for example, refer to non-existent glyph IDs.

This can be confirmed by opening the .ttf file with FontForge, for example - it spews a huge number of warnings about nonexistent glyphs as it loads the OpenType tables - or by running it through Microsoft's Font Validator.

My guess is that this DroidSansArabic font was created by taking a "complete" DroidSans (including Arabic glyphs and with presumably-correct OpenType tables), and stripping out the non-Arabic glyphs. This means that the remaining glyphs have completely different glyph IDs than they had in the original font - but the OpenType tables were not updated to match. So harfbuzz Arabic shaping necessarily fails.

The reason the font works in native Android apps is almost certainly because the Java code implements Arabic shaping based on the (deprecated) Unicode presentation forms, ignoring the OpenType layout tables. That's a poor solution, as it cannot take advantage of all the features of OpenType fonts (e.g. contextual variants of the glyphs, beyond the basic init/medi/fina/isol forms), nor can it support all Arabic-script languages (as the repertoire of presentation forms is limited and will not be extended to cover newer Arabic-script letters used in minority languages).

This is similar to the situation in bug 676068, but it's actually somewhat worse. In the case of the LG devices described there, the (no-longer-valid) OpenType tables have been stripped from DroidSansArabic; we could, in principle, detect that and fall back to a different font that does support Arabic shaping (either shipped with Firefox or downloaded on demand). Here, however, the font *does* have OpenType tables, and those tables claim support for the Arabic script, so it'll be harder for us to detect the brokenness and trigger a fallback. I foresee some ugly hackery if we're going to work around this.... (sigh).
Digging a bit further.....it turns out that the copy of DroidSansArabic from your device has several oddities:

- while the OpenType tables may not be quite as broken as I first thought - I was somewhat misled by FontForge, I think - they do seem to include references to glyphs that have been stripped out of the font, and this is probably why harfbuzz rejects them.

- the font contains a TSIV table, which is textual source code from Microsoft's VOLT tool, and should have been removed before shipping the font - it just bloats the file.

- the font also contains 'morx', 'prop' and 'just' tables, which are specific to Apple's AAT technology; unless these are supported on the Android device (which seems highly unlikely), they also represent pointless bloat.

- the font does not conform to TrueType/OpenType recommendations (it's unclear just how "normative" this requirement is) regarding the first four glyphs, which are expected to be .notdef, .null, CR and space. In particular, the absence of a space character may be confusing our code.

Looking at the VOLT source in the TSIV table, there are references to many glyphs that are not present in the font; and the glyph IDs mentioned in the GDEF table for Arabic glyphs that _are_ present do not match the actual glyph IDs in the font. What I currently suspect is that the OpenType tables were compiled (with VOLT) against a "complete" font, and then _subsequently_ some "subsetting" tool was used to remove non-Arabic glyphs; this left the font with OpenType tables that do not match the final glyph repertoire.
tracking-fennec: --- → 12+
Component: General → Graphics
Priority: -- → P3
Product: Fennec Native → Core
QA Contact: general → thebes
Version: unspecified → Trunk
We now have support for presentation-forms shaping in harfbuzz. However, this will only take effect for fonts that lack a GSUB table.

In the case of this particular version of Droid Sans Arabic, the problem is that it *does* have a GSUB table (and so we still use the OpenType shaping path), but that table is completely incorrect for the glyphs actually present in the font. And so no useful shaping happens.

I've pushed a tryserver build at https://tbpl.mozilla.org/?tree=Try&rev=1784e001fcad that includes a patch intended to detect this particular version of the font and disregard its GSUB table, so that we'll use fallback shaping instead. Could you please try this build on the device with this bad font, and let me know whether it actually works? Thanks.
This attempts to work around the problem by detecting this specific broken version of Droid Sans Arabic, and ignoring its GSUB table. The font checksum in the 'head' table is used to recognize this specific font and avoid affecting other versions.

If we find that there are lots of slightly different but similarly broken versions of the font in circulation, we might want to consider unconditionally disabling GSUB for Droid Sans Arabic, and relying solely on presentation-forms shaping. I dislike that on principle, as presentation-forms shaping is a bad solution for Arabic-script rendering, but in practice it would probably be OK as this font is unlikely to use any more complex features.
I tested the try server build Jonathan posted in comment 16, and it works great for me!
Comment on attachment 617820 [details] [diff] [review]
patch, ignore GSUB in broken version of Droid Sans Arabic

Ehsan reports that this fixed Arabic/Persian rendering on his device with the broken font. I think this is a sufficiently low risk that we should take it, or something along these lines, for the Android product ASAP.
Attachment #617820 - Flags: review?(jdaggett)
I actually faced something which looks to be a regression in this build.  While using Google reader, sometimes the first line of the collapsed items which is displayed in a non-bold font disappears when you interact with the app.  It doesn't reproduce reliably for me, but usually I see this problem when I open a few posts and pan around.
(In reply to Ehsan Akhgari [:ehsan] from comment #20)
> I actually faced something which looks to be a regression in this build.

Which build?  With the harfbuzz update applied or with the patch attached here?
(In reply to John Daggett (:jtd) from comment #21)
> (In reply to Ehsan Akhgari [:ehsan] from comment #20)
> > I actually faced something which looks to be a regression in this build.
> 
> Which build?  With the harfbuzz update applied or with the patch attached
> here?

The try server build in comment 16.
Comment on attachment 617820 [details] [diff] [review]
patch, ignore GSUB in broken version of Droid Sans Arabic

This is unfortunate but it can't be helped I guess.
Attachment #617820 - Flags: review?(jdaggett) → review+
(In reply to Jonathan Kew (:jfkthame) from comment #17)
> If we find that there are lots of slightly different but similarly broken
> versions of the font in circulation, we might want to consider
> unconditionally disabling GSUB for Droid Sans Arabic, and relying solely on
> presentation-forms shaping. I dislike that on principle, as
> presentation-forms shaping is a bad solution for Arabic-script rendering,
> but in practice it would probably be OK as this font is unlikely to use any
> more complex features.

I think if we find enough broken versions of this font we should simply ship a sane version of the font with Firefox and not add more hacky font logic.  We don't have logic to load Firefox-specific fonts currently but I think that's relatively easy to do and the MathML folks seem to be aiming for something effectively close to that.
(In reply to Ehsan Akhgari [:ehsan] from comment #20)
> I actually faced something which looks to be a regression in this build. 
> While using Google reader, sometimes the first line of the collapsed items
> which is displayed in a non-bold font disappears when you interact with the
> app.  It doesn't reproduce reliably for me, but usually I see this problem
> when I open a few posts and pan around.

Strange. I don't see how this patch could cause something like that - maybe an existing erratic issue? Could you try to reproduce with a recent nightly that doesn't have the arabic fix?

Was this specifically when reading Arabic/Persian stuff in GR, or do you mean you saw issues on *non-Arabic* pages when using the try build?
(In reply to Jonathan Kew (:jfkthame) from comment #25)
> (In reply to Ehsan Akhgari [:ehsan] from comment #20)
> > I actually faced something which looks to be a regression in this build. 
> > While using Google reader, sometimes the first line of the collapsed items
> > which is displayed in a non-bold font disappears when you interact with the
> > app.  It doesn't reproduce reliably for me, but usually I see this problem
> > when I open a few posts and pan around.
> 
> Strange. I don't see how this patch could cause something like that - maybe
> an existing erratic issue? Could you try to reproduce with a recent nightly
> that doesn't have the arabic fix?

I have not seen that issue in any other Fennec build.

> Was this specifically when reading Arabic/Persian stuff in GR, or do you
> mean you saw issues on *non-Arabic* pages when using the try build?

No, this was an issue when reading English content on Google Reader (Planet Mozilla to be precise.)

But yeah maybe this was a problem unrelated to this?  If you can make a new try server build based on a more recent m-c revision, I'd be happy to test it.
Could you try https://tbpl.mozilla.org/?tree=Try&rev=79ade3199eab and see if that shows problems? (Based on mozilla-inbound tip  as of this morning.)
This try build doesn't have that bug, so your patch is not to blame here.
https://hg.mozilla.org/mozilla-central/rev/42ef0ce00b8c
Assignee: nobody → jfkthame
Status: REOPENED → RESOLVED
Closed: 13 years ago12 years ago
Resolution: --- → FIXED
Comment on attachment 617820 [details] [diff] [review]
patch, ignore GSUB in broken version of Droid Sans Arabic

[Approval Request Comment]
Regression caused by (bug #): broken font on some Android devices

User impact if declined: Arabic text unreadable on affected devices

Testing completed (on m-c, etc.): since this landed on m-c, the issue is fixed in Nightly (as confirmed by Ehsan)

Risk to taking this patch (and alternatives if risky): minimal risk - mobile-only change, narrowly targeted to only affect the one known-broken font

String changes made by this patch: none

If we uplift this to Aurora, does that mean it'll go into the native-fennec beta release? Given the impact for users with affected devices, and the fact that the patch carries essentially no risk to anything else - it specifically targets just one version of one font - I think we should fix this for the beta if at all possible.
Attachment #617820 - Flags: approval-mozilla-aurora?
Comment on attachment 617820 [details] [diff] [review]
patch, ignore GSUB in broken version of Droid Sans Arabic

[Triage Comment]
Given the narrow scope and high reward, approving for Aurora 14.
Attachment #617820 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/37b8a2df105d
Target Milestone: mozilla15 → mozilla14
Depends on: 758257
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: