Bug 674335 - BBC.co.uk - Arabic text not connected in Firefox 6.0
Status: RESOLVED WORKSFORME
Keywords: regression, relnote
Product: Tech Evangelism Graveyard
Classification: Graveyard
Component: Arabic
Version: unspecified
Platform: All All
Importance: -- major
Target Milestone: ---
Assigned To: arabic
Reported: 2011-07-26 13:29 PDT by [:Cww]
Modified: 2015-04-19 23:36 PDT

Attachments
http://www.bbc.co.uk/arabic/ (224.62 KB, image/jpeg)
2011-07-26 13:29 PDT, [:Cww]

Description [:Cww] 2011-07-26 13:29:53 PDT
Created attachment 548565
http://www.bbc.co.uk/arabic/

Arabic text appears to be rendered with the disconnected/terminal forms of characters rather than the connected ones, so words are unreadable in some places, notably the headlines on BBC Arabic.

http://www.bbc.co.uk/arabic/
Comment 1 [:Cww] 2011-07-26 13:31:20 PDT
The screenshot above is from Windows 7 with Firefox 6, but the page works fine on Mac with Firefox 8, so we'll probably have to do regression testing.
Comment 2 Jonathan Kew (:jfkthame) 2011-07-26 13:52:14 PDT
The BBC site is using a webfont (http://www.bbc.co.uk/worldservice/fonts/arabic/nassim/bbc-nassim-bold.woff) that has bad OpenType tables, and the OTS font sanitizer strips them; this means Arabic shaping cannot work properly for this font. (Note that it's only the bold face that has problems; the regular face works OK.)

You can confirm this by disabling the sanitizer (set gfx.downloadable_fonts.sanitize to false) and reloading the page; the bold text will then shape correctly (but you're also vulnerable to potentially malicious downloadable fonts, so this isn't a recommended solution).

(The problem didn't show up in FF4 because OTS didn't yet validate the OpenType tables at that point.)

Moving to Tech Evangelism; the BBC Arabic site should be encouraged to validate/fix the fonts being used.
Comment 3 :Ehsan Akhgari 2011-07-26 15:04:54 PDT
A useful tech evangelism data point is to tell BBC that their BBC Persian website uses a font which doesn't have this problem (AFAIK).
Comment 4 christian 2011-07-26 15:10:21 PDT
Kev, do we have any contacts at the BBC? I might be able to dig up some if we don't.
Comment 5 tntypography 2011-08-08 08:24:00 PDT
Could you please specify what OTS considers "bad OpenType tables"? What exactly is it looking for? Is there a log-file that would tell me what it considers "bad" in the BBC font? Or could somebody point out a full list, (?) spec of the OTS's filtering routine in order to validate/fix the fonts for FF as suggested?

Thanks in advance,
tn
Comment 6 [:Cww] 2011-08-08 14:32:31 PDT
I'm working on finding an engineer that you can work with, tn.

I'm going to quote the following from your email for more context:

I did validate the fonts and tested them quite extensively, so I would be rather surprised if there actually is an error. I am mainly intrigued that there is a difference between Bold and Regular as they have largely identical OT tables -- only the values for kerning and mark positioning are different, but the architecture should be identical. The Persian fonts are entirely different to the Arabic ones from an OT perspective.
Comment 7 Robert O'Callahan (:roc) (email my personal email if necessary) 2011-08-08 15:43:33 PDT
Jonathan Kew will answer all your questions :-)
Comment 8 Jonathan Kew (:jfkthame) 2011-08-09 04:26:18 PDT
(In reply to tntypography from comment #5)
> Could you please specify what OTS considers "bad OpenType tables"? What
> exactly is it looking for? Is there a log-file that would tell me what it
> considers "bad" in the BBC font? Or could somebody point out a full list,
> (?) spec of the OTS's filtering routine in order to validate/fix the fonts
> for FF as suggested?

Unfortunately, there's not currently any easy way (AFAIK) to determine exactly why OTS rejects a font table, short of running a debug build and setting breakpoints to detect when it returns an error. (We want to improve this, so that the Firefox web console can give you specific messages that identify the problem - see bug 670901.)

By using the (rejected) "part 2" patch from bug 494130, I was able to determine that the problem with the BBC Nassim Bold font is that the GPOS table is rejected for having a bad Script table; more precisely, the 'DFLT' (default script) entry does not satisfy the OpenType specification.

<quote src="http://www.microsoft.com/typography/otspec/chapter2.htm">
If a Script table with the script tag 'DFLT' (default) is present in the ScriptList table, it must have a non-NULL DefaultLangSys and LangSysCount must be equal to 0.
</quote>

Dumping the GPOS table with TTX (http://sourceforge.net/projects/fonttools/), I get the following (excerpt) showing the problem:

  <GPOS>
    <Version value="1.0"/>
    <ScriptList>
      <!-- ScriptCount=3 -->
      <ScriptRecord index="0">
        <ScriptTag value="DFLT"/>
        <Script>
          <DefaultLangSys>
            <ReqFeatureIndex value="65535"/>
            <!-- FeatureCount=2 -->
            <FeatureIndex index="0" value="3"/>
            <FeatureIndex index="1" value="8"/>
          </DefaultLangSys>
          <!-- LangSysCount=1 -->
          <LangSysRecord index="0">
            <LangSysTag value="ARA "/>
            <LangSys>
              <ReqFeatureIndex value="65535"/>
              <!-- FeatureCount=2 -->
              <FeatureIndex index="0" value="2"/>
              <FeatureIndex index="1" value="7"/>
            </LangSys>
          </LangSysRecord>
        </Script>
      </ScriptRecord>
      <ScriptRecord index="1">
        <ScriptTag value="arab"/>
        <Script>
          <DefaultLangSys>
            <ReqFeatureIndex value="65535"/>
            <!-- FeatureCount=2 -->
            <FeatureIndex index="0" value="1"/>
            <FeatureIndex index="1" value="6"/>
          </DefaultLangSys>
          <!-- LangSysCount=1 -->
          <LangSysRecord index="0">
            <LangSysTag value="ARA "/>
            <LangSys>
              <ReqFeatureIndex value="65535"/>
              <!-- FeatureCount=2 -->
              <FeatureIndex index="0" value="0"/>
              <FeatureIndex index="1" value="5"/>
            </LangSys>
          </LangSysRecord>
        </Script>
      </ScriptRecord>
      <ScriptRecord index="2">
        <ScriptTag value="latn"/>
        <Script>
          <DefaultLangSys>
            <ReqFeatureIndex value="65535"/>
            <!-- FeatureCount=1 -->
            <FeatureIndex index="0" value="4"/>
          </DefaultLangSys>
          <!-- LangSysCount=0 -->
        </Script>
      </ScriptRecord>
    </ScriptList>

Notice that in the DFLT script entry here, the LangSysCount is 1, not zero (as required). This causes OTS to discard the table as invalid; and if any of the OpenType Layout tables are invalid, they are all dropped as a group. Result: no Arabic shaping.
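
For anyone who wants to check other fonts for the same problem, here is a minimal sketch in Python that scans a TTX dump for 'DFLT' script entries that break the rule quoted above. It assumes the dump uses the element names shown in the excerpt; the function name `find_dflt_violations` and the trimmed sample are illustrative only, and this is not how OTS itself implements the check.

```python
import xml.etree.ElementTree as ET


def find_dflt_violations(layout_table):
    """Return a list of problems found in 'DFLT' script entries.

    `layout_table` is the <GPOS> (or <GSUB>) element of a TTX dump.
    """
    problems = []
    for record in layout_table.iter("ScriptRecord"):
        if record.find("ScriptTag").get("value") != "DFLT":
            continue  # the rule only applies to the uppercase 'DFLT' tag
        script = record.find("Script")
        if script.find("DefaultLangSys") is None:
            problems.append("DFLT script has no DefaultLangSys")
        lang_sys_records = script.findall("LangSysRecord")
        if lang_sys_records:
            tags = [r.find("LangSysTag").get("value") for r in lang_sys_records]
            problems.append(
                f"DFLT script has LangSysCount={len(tags)} (tags: {tags})"
            )
    return problems


# Trimmed-down version of the Bold excerpt above (feature indices omitted).
BOLD_EXCERPT = """
<GPOS>
  <ScriptList>
    <ScriptRecord index="0">
      <ScriptTag value="DFLT"/>
      <Script>
        <DefaultLangSys>
          <ReqFeatureIndex value="65535"/>
        </DefaultLangSys>
        <LangSysRecord index="0">
          <LangSysTag value="ARA "/>
          <LangSys>
            <ReqFeatureIndex value="65535"/>
          </LangSys>
        </LangSysRecord>
      </Script>
    </ScriptRecord>
  </ScriptList>
</GPOS>
""".strip()

problems = find_dflt_violations(ET.fromstring(BOLD_EXCERPT))
for problem in problems:
    print(problem)
```

Because the comparison is against the uppercase tag only, a script entry tagged 'dflt' is not reported by this sketch.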

This problem does not occur with the Regular face because the "default" script entry is tagged 'dflt' (lowercase) instead of 'DFLT' (uppercase):

  <GPOS>
    <Version value="1.0"/>
    <ScriptList>
      <!-- ScriptCount=3 -->
      <ScriptRecord index="0">
        <ScriptTag value="arab"/>
        <Script>
          <DefaultLangSys>
            <ReqFeatureIndex value="65535"/>
            <!-- FeatureCount=2 -->
            <FeatureIndex index="0" value="1"/>
            <FeatureIndex index="1" value="6"/>
          </DefaultLangSys>
          <!-- LangSysCount=1 -->
          <LangSysRecord index="0">
            <LangSysTag value="ARA "/>
            <LangSys>
              <ReqFeatureIndex value="65535"/>
              <!-- FeatureCount=2 -->
              <FeatureIndex index="0" value="0"/>
              <FeatureIndex index="1" value="5"/>
            </LangSys>
          </LangSysRecord>
        </Script>
      </ScriptRecord>
      <ScriptRecord index="1">
        <ScriptTag value="dflt"/>
        <Script>
          <DefaultLangSys>
            <ReqFeatureIndex value="65535"/>
            <!-- FeatureCount=2 -->
            <FeatureIndex index="0" value="3"/>
            <FeatureIndex index="1" value="8"/>
          </DefaultLangSys>
          <!-- LangSysCount=1 -->
          <LangSysRecord index="0">
            <LangSysTag value="ARA "/>
            <LangSys>
              <ReqFeatureIndex value="65535"/>
              <!-- FeatureCount=2 -->
              <FeatureIndex index="0" value="2"/>
              <FeatureIndex index="1" value="7"/>
            </LangSys>
          </LangSysRecord>
        </Script>
      </ScriptRecord>
      <ScriptRecord index="2">
        <ScriptTag value="latn"/>
        <Script>
          <DefaultLangSys>
            <ReqFeatureIndex value="65535"/>
            <!-- FeatureCount=1 -->
            <FeatureIndex index="0" value="4"/>
          </DefaultLangSys>
          <!-- LangSysCount=0 -->
        </Script>
      </ScriptRecord>
    </ScriptList>

This also appears to be an error - the correct tag for the "default" script is the uppercase form - but has the effect of turning the intended "default" script entry into a script with the tag 'dflt', which is then harmlessly ignored.

So in summary: the Regular font should have its default script entry properly tagged as 'DFLT' rather than 'dflt', and *both* the Regular and Bold fonts need to have the 'ARA ' language system record removed from the 'DFLT' script entry.
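
As a rough illustration of those two fixes, the following Python sketch applies them to a TTX dump using only the standard library. This is an assumption-laden example, not a recommended production workflow: the function name `repair_default_script` and the trimmed sample are made up for illustration, the proper fix belongs in the font sources or build tooling, and an edited dump would still need to be recompiled and tested. The sketch also re-sorts the script records, since the OpenType spec lists them alphabetically by tag and 'DFLT' sorts before 'arab'.

```python
import xml.etree.ElementTree as ET


def repair_default_script(layout_table):
    """Apply both fixes in place to a TTX <GPOS> (or <GSUB>) element."""
    script_list = layout_table.find("ScriptList")
    records = script_list.findall("ScriptRecord")
    for record in records:
        tag = record.find("ScriptTag")
        # Fix 1: the default script must be tagged 'DFLT', not 'dflt'.
        if tag.get("value") == "dflt":
            tag.set("value", "DFLT")
        # Fix 2: 'DFLT' may only carry a DefaultLangSys (LangSysCount == 0).
        if tag.get("value") == "DFLT":
            script = record.find("Script")
            for lang_sys_record in script.findall("LangSysRecord"):
                script.remove(lang_sys_record)
    # Renaming can change the alphabetical order, so re-sort and renumber.
    records.sort(key=lambda r: r.find("ScriptTag").get("value"))
    for record in script_list.findall("ScriptRecord"):
        script_list.remove(record)
    for index, record in enumerate(records):
        record.set("index", str(index))
        script_list.append(record)


# Trimmed-down version of the Regular excerpt above (feature indices omitted).
REGULAR_EXCERPT = """
<GPOS>
  <ScriptList>
    <ScriptRecord index="0">
      <ScriptTag value="arab"/>
      <Script>
        <DefaultLangSys/>
        <LangSysRecord index="0">
          <LangSysTag value="ARA "/>
          <LangSys/>
        </LangSysRecord>
      </Script>
    </ScriptRecord>
    <ScriptRecord index="1">
      <ScriptTag value="dflt"/>
      <Script>
        <DefaultLangSys/>
        <LangSysRecord index="0">
          <LangSysTag value="ARA "/>
          <LangSys/>
        </LangSysRecord>
      </Script>
    </ScriptRecord>
  </ScriptList>
</GPOS>
""".strip()

table = ET.fromstring(REGULAR_EXCERPT)
repair_default_script(table)
print(ET.tostring(table, encoding="unicode"))
```

Note that the 'ARA ' record under the 'arab' script is left alone; only the default script entry is touched.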
Comment 9 tntypography 2011-08-09 07:57:49 PDT
Jonathan,
this is brilliant, thank you very much for the most detailed report! It's interesting that MS FontVal does not have an issue with this error, hence I did not come across it.
Thanks again!
tn
Comment 10 tntypography 2011-08-09 08:03:48 PDT
I have a follow-up question for you guys: at the moment we obfuscate name tables in .ttfs to make them slightly more pirating-proof. Hence, they would not comply with the spec either (though not the OT tables) -- are there any plans to filter such fonts too in the future?
Best
tn
Comment 11 Jonathan Kew (:jfkthame) 2011-08-09 08:23:50 PDT
(In reply to tntypography from comment #10)
> I have a follow-up question for you guys: at the moment we obfuscate name
> tables in .ttfs to make them slightly more pirating-proof. Hence, they would
> not comply with the spec either (though not the OT tables) -- are there any
> plans to filter such fonts too in the future?
> Best
> tn

Ah yes, I meant to mention this - I notice that MS Font Validator does complain about a couple of issues in the 'name' table. Although this does not (currently) prevent the font working, I would strongly urge you to avoid the practice, for several reasons:

- There is a clear trend towards more rigorous validation (not just in Firefox - the OTS library we're using originates with the Chromium project, and I think system libraries on various platforms are also becoming stricter) and so I think there's a significant risk that an out-of-spec font that happens to work today will fail at some time in the future;

- Even though the font works in commonly-tested scenarios, there's a non-zero risk that it may fail in more obscure situations, and it won't be at all clear to users (or support staff, if questions are asked) what is going wrong. For example, failure to provide a valid PostScript name might cause a printing failure, but only with certain printer drivers and PS interpreters. None of us want to be faced with debugging such issues.

- We make information on downloadable fonts accessible to end users (partly in response to the expressed desire of font designers/vendors, BTW). Currently, the fontinfo extension (https://addons.mozilla.org/en-US/firefox/addon/fontinfo/) offers a simple implementation of this for FF 7.0 and later. If you don't provide proper names, then users are presented with much-less-useful information (in the worst case, generic placeholders such as "MISSING" or "OTS derived font"). The user experience is much better if the font presents itself as "BBC Nassim Regular".

So please reconsider this practice - I'd recommend providing at least a proper FullName (and preferably Family and Subfamily as well, though we don't currently use those AFAIR), for both MS and Mac platforms, as well as a valid PostScript name.
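
To make that recommendation concrete, here is a small Python sketch that reports which of those names are absent from a TTX dump of the 'name' table. It relies on the standard name IDs (4 = full name, 6 = PostScript name) and platform IDs (1 = Macintosh, 3 = Windows); the helper name `missing_names` and the sample records (including the PostScript name "BBCNassim-Regular") are hypothetical. It only checks that a non-empty record exists, not that the name is meaningful or a valid PostScript name.

```python
import xml.etree.ElementTree as ET

# Names recommended above, keyed by OpenType nameID.
REQUIRED_NAMES = {4: "Full name", 6: "PostScript name"}
# Platforms the names should be provided for, keyed by platformID.
PLATFORMS = {1: "Macintosh", 3: "Windows"}


def missing_names(name_table):
    """List the (platform, name) combinations with no non-empty record."""
    present = set()
    for record in name_table.iter("namerecord"):
        if (record.text or "").strip():
            key = (int(record.get("platformID")), int(record.get("nameID")))
            present.add(key)
    return [
        f"{PLATFORMS[platform]}: {label} (nameID {name_id})"
        for platform in sorted(PLATFORMS)
        for name_id, label in sorted(REQUIRED_NAMES.items())
        if (platform, name_id) not in present
    ]


# Hypothetical dump that only carries Windows-platform names.
SAMPLE = """
<name>
  <namerecord nameID="4" platformID="3" platEncID="1" langID="0x409">
    BBC Nassim Regular
  </namerecord>
  <namerecord nameID="6" platformID="3" platEncID="1" langID="0x409">
    BBCNassim-Regular
  </namerecord>
</name>
""".strip()

missing = missing_names(ET.fromstring(SAMPLE))
for entry in missing:
    print("missing:", entry)
```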
Comment 12 tntypography 2011-08-09 14:49:01 PDT
Thanks again Jonathan, I will give it a thought - your reasoning is compelling, particularly as it only is a deterrent, nothing like a bullet-proof solution.
Comment 13 Thomas Phinney 2011-09-07 16:29:10 PDT
I'll point out that it's easy to use names that are legal, but are gibberish. That's what we aim for in our web font obfuscation for WebINK.
Comment 14 Jonathan Kew (:jfkthame) 2011-09-21 12:58:24 PDT
The BBC Arabic site now renders properly for me in Nightly builds, so I think updated/fixed fonts have been deployed.
Comment 15 Kutlu Çanlıoğlu 2011-09-29 05:11:56 PDT
Hi all,
We are having similar problems with our Hindi site which re-launched recently with an embedded font.
Firefox completely ignores the embedded font on the Mac and displays a system font. 
Would you be able to perform a similar diagnosis on the embedded font we use as you did for our Arabic font?
We have 2 test pages that you can access here:
http://extdev.bbc.co.uk/hindi/index_mangal4.shtml
and
http://extdev.bbc.co.uk/hindi/gel_test/font-hindi.shtml

We've had to block the embedded font only to IE for now on the live site, so looking at bbc.co.uk/hindi on firefox wouldn't help.

Any help would be much appreciated.
Comment 16 Jonathan Kew (:jfkthame) 2011-09-29 06:15:54 PDT
(In reply to Kutlu Çanlıoğlu from comment #15)
> Hi all,
> We are having similar problems with our Hindi site which re-launched
> recently with an embedded font.
> Firefox completely ignores the embedded font on the Mac and displays a
> system font. 
> Would you be able to perform a similar diagnosis on the embedded font we use
> as you did for our Arabic font?
> We have 2 test pages that you can access here:
> http://extdev.bbc.co.uk/hindi/index_mangal4.shtml
> and
> http://extdev.bbc.co.uk/hindi/gel_test/font-hindi.shtml
> 
> We've had to block the embedded font only to IE for now on the live site,
> so looking at bbc.co.uk/hindi on firefox wouldn't help.
> 
> Any help would be much appreciated.

I think this is an unrelated issue (note that it's Mac-specific, unlike the Arabic problem). In this case, the problem is that the Mangal font you're using only supports OpenType shaping features, but on OS X we currently don't have OpenType support for Devanagari; we use Apple's Core Text APIs, which would require a font with AAT tables in order to render Devanagari script properly.

Firefox on OS X recognizes that the Mangal font doesn't have AAT tables, and therefore it rejects it for Devanagari text and falls back to an alternative font.

I tried your index_mangal4.shtml page in Safari on OS X, and noted that it doesn't render correctly there either. It looks like Safari does use the downloaded font, but it doesn't fully support the OpenType features needed and therefore lots of conjunct forms, etc, are lacking.

So in short, to make this work with current browsers on OS X - not only Firefox - you'd need AAT support added to the font. Eventually, we intend to fully support OpenType fonts for Indic scripts as well, but this code isn't ready yet.
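
If it helps to see which kind of shaping tables a given font carries, the following Python sketch reads the table directory of an uncompressed sfnt (TTF/OTF) file and reports whether OpenType layout tables ('GSUB'/'GPOS') or AAT layout tables ('morx'/'mort') are listed. This is only an illustration: the function names are made up, it does not handle WOFF, and it is an assumption that a simple presence check like this approximates what Firefox looks at on OS X.

```python
import struct


def table_tags(font_data):
    """Return the set of table tags in an sfnt table directory."""
    # sfnt header: uint32 version, then uint16 numTables (12 bytes in total).
    num_tables = struct.unpack(">H", font_data[4:6])[0]
    tags = set()
    for i in range(num_tables):
        # Each 16-byte table record starts with a 4-byte tag.
        offset = 12 + 16 * i
        tags.add(font_data[offset:offset + 4].decode("latin-1"))
    return tags


def shaping_support(font_data):
    """Report which families of layout tables the font lists."""
    tags = table_tags(font_data)
    return {
        "opentype": bool(tags & {"GSUB", "GPOS"}),
        "aat": bool(tags & {"morx", "mort"}),
    }


def fake_directory(tags):
    """Build a synthetic header and table directory for demonstration only."""
    header = struct.pack(">IHHHH", 0x00010000, len(tags), 0, 0, 0)
    records = b"".join(tag.encode("latin-1") + bytes(12) for tag in tags)
    return header + records


# A font with OpenType layout tables only, as described for Mangal above.
print(shaping_support(fake_directory(["GPOS", "GSUB", "cmap"])))
```

With a real file, the bytes would come from something like `open(path, "rb").read()` instead of the synthetic directory.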

(I'm closing this as WORKSFORME, as the problem with the Arabic site has now been resolved, as far as I can tell. If you need to follow up about issues with other scripts such as Devanagari, please file a separate report.)
