Last Comment Bug 738136 - more scripts need to be excluded from cmaps on OSX
: more scripts need to be excluded from cmaps on OSX
Status: RESOLVED FIXED
: regression
Product: Core
Classification: Components
Component: Graphics (show other bugs)
: unspecified
: x86 Mac OS X
: -- normal (vote)
: mozilla14
Assigned To: John Daggett (:jtd)
:
Mentors:
http://lo.wikipedia.org/wiki/%E0%BB%9...
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-03-21 20:48 PDT by John Daggett (:jtd)
Modified: 2012-04-10 08:46 PDT (History)
2 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
screenshot of testcase on 10.6 (213.59 KB, image/png)
2012-03-21 20:49 PDT, John Daggett (:jtd)
no flags Details
patch, add Lao to the scripts that require shaping under OSX (1.46 KB, patch)
2012-03-21 21:04 PDT, John Daggett (:jtd)
jfkthame: review+
Details | Diff | Review

Description John Daggett (:jtd) 2012-03-21 20:48:38 PDT
Steps: Click on the URL, the Wikipedia page for Lao.
Result: On OSX 10.6, the page shows up as lots of strings of "> > > > > > > >"
Comment 1 John Daggett (:jtd) 2012-03-21 20:49:47 PDT
Created attachment 608212 [details]
screenshot of testcase on 10.6
Comment 2 John Daggett (:jtd) 2012-03-21 21:04:28 PDT
Created attachment 608217 [details] [diff] [review]
patch, add Lao to the scripts that require shaping under OSX

Jonathan, can you suggest modifications and/or just modify the patch to include other scripts you think should be included here?

This fixes the problem for OSX 10.6/10.7 available fonts.
Comment 3 Jonathan Kew (:jfkthame) 2012-03-27 13:45:51 PDT
There are several more scripts that require shaping for proper/legible rendering - some additional Indic scripts that aren't included in the current code, plus a bunch of SE Asian scripts.

However, there are a couple of complications: first, OS X 10.7 handles OpenType shaping for the north Indian scripts (but not south Indian, AFAICT), which means that on 10.7, we don't want to exclude north Indian codepoints if OpenType layout tables are present, whereas on 10.6 we can only use them if there are AAT tables.

And second, if we exclude codepoints that we know we won't be able to shape (due to lack of layout tables), this has the unfortunate side-effect of breaking pdf.js display of documents that use these codepoints with an embedded font. Pdf.js doesn't require any shaping support, because the PDF already provides the positioned glyphs; but it does need us to load the font without blanking the codepoints from the cmap.
Comment 4 John Daggett (:jtd) 2012-03-27 14:35:00 PDT
(In reply to Jonathan Kew (:jfkthame) from comment #3)
> There are several more scripts that require shaping for proper/legible
> rendering - some additional Indic scripts that aren't included in the
> current code, plus a bunch of SE Asian scripts.

So this code is a simple fix for a problem that happens in practice.  If you have a patch for the other scripts, that's fine, but if not then I think we should take the fix.

> However, there are a couple of complications: first, OS X 10.7 handles
> OpenType shaping for the north Indian scripts (but not south Indian,
> AFAICT), which means that on 10.7, we don't want to exclude north Indian
> codepoints if OpenType layout tables are present, whereas on 10.6 we can
> only use them if there are AAT tables.

This is another bug which should be specific to Indic handling in 10.7.

> And second, if we exclude codepoints that we know we won't be able to shape
> (due to lack of layout tables), this has the unfortunate side-effect of
> breaking pdf.js display of documents that use these codepoints with an
> embedded font. Pdf.js doesn't require any shaping support, because the PDF
> already provides the positioned glyphs; but it does need us to load the font
> without blanking the codepoints from the cmap.

This also sounds like something that should be in a separate bug.  Maybe the solution here is an internal, extension-accessible API to disable shaping and sets a flag on the user font set so that the cmap is handled differently.  We also need real examples to verify that this works.
Comment 5 Jonathan Kew (:jfkthame) 2012-03-27 15:30:45 PDT
(In reply to John Daggett (:jtd) from comment #4)
> (In reply to Jonathan Kew (:jfkthame) from comment #3)
> > There are several more scripts that require shaping for proper/legible
> > rendering - some additional Indic scripts that aren't included in the
> > current code, plus a bunch of SE Asian scripts.
> 
> So this code is a simple fix for a problem that happens in practice.  If you
> have a patch for the other scripts, that's fine, but if not then I think we
> should take the fix.

I have a patch that covers other scripts, but then I ran into the problem that it disrupts pdf.js because it effectively "blanks out" glyphs that the pdf renderer wants to draw.

Do 10.5 and 10.6 ship with an AAT-enabled Lao font?

With the patch applied, and no Lao font installed locally, does http://www.unicode.org/charts/PDF/U0E80.pdf display properly through pdf.js?
Comment 6 John Daggett (:jtd) 2012-03-28 10:47:12 PDT
The problem font here is Arial Unicode MS, which is included on 10.6.  There doesn't appear to be a font that does support Lao.  Lion includes a number of fonts that support various less common scripts.  In the Lao case the CoreText fallback method chooses Lao Sangam MN.
Comment 7 John Daggett (:jtd) 2012-03-28 10:50:12 PDT
Firefox 11 renders the test page correctly on 10.7 because the "read all cmaps" fallback picks up Lao Sangam because it's later in the alphabet than Arial Unicode MS.
Comment 8 John Daggett (:jtd) 2012-04-08 21:05:37 PDT
review ping.

I think we don't need to include other scripts for this bug, since those aren't needed to address the regression here.  I'd actually like to push to land this on Aurora also and for that reason I think it makes sense to limit the scope of the patch to just the logic needed to fix the regression that exists on 10.7 on trunk but not in FF11.
Comment 9 Jonathan Kew (:jfkthame) 2012-04-09 06:10:21 PDT
Comment on attachment 608217 [details] [diff] [review]
patch, add Lao to the scripts that require shaping under OSX

Review of attachment 608217 [details] [diff] [review]:
-----------------------------------------------------------------

OK, let's go ahead and do this. There are definitely more scripts that cannot be rendered properly without layout tables, but I haven't seen any other cases where Core Text messes things up for us like the Lao case, so that's the most critical issue.

::: gfx/thebes/gfxMacPlatformFontList.mm
@@ +181,5 @@
>  const ScriptRange gScriptsThatRequireShaping[] = {
>      { eComplexScriptArabic, 0x0600, 0x077F },   // Basic Arabic, Syriac, Arabic Supplement
>      { eComplexScriptIndic, 0x0900, 0x0D7F },     // Indic scripts - Devanagari, Bengali, ..., Malayalam
> +    { eComplexScriptTibetan, 0x0F00, 0x0FFF },     // Tibetan
> +    { eComplexScriptLao, 0x0E80, 0x0EFF } // Lao

These two ranges are contiguous, so combine them into a single Lao_Tibetan range.

@@ +186,2 @@
>      // Thai seems to be "renderable" without AAT morphing tables
>      // xxx - Lao, Khmer?

And remove Lao from the comment here.
Comment 10 Jonathan Kew (:jfkthame) 2012-04-09 06:29:27 PDT
Comment on attachment 608217 [details] [diff] [review]
patch, add Lao to the scripts that require shaping under OSX

Retracting r+ on this, temporarily at least, as I just remembered the impact on pdf.js, which concerns me. Poking at things a bit more...
Comment 11 Jonathan Kew (:jfkthame) 2012-04-09 11:10:59 PDT
Comment on attachment 608217 [details] [diff] [review]
patch, add Lao to the scripts that require shaping under OSX

Well... I can't reproduce the problem in pdf.js any more, though I'm not sure why. Maybe there's been a change in how it handles fonts? AFAICT from a bit of tracing, it's relying entirely on PUA codepoints to render glyphs for complex-script text.

So I guess we can go ahead with this, and just keep an eye open for any possible fallout in pdf.js (or elsewhere), but it seems to be safe at this point.
Comment 12 John Daggett (:jtd) 2012-04-09 19:09:02 PDT
Updated based on review comments, pushed to mozilla-inbound
https://hg.mozilla.org/integration/mozilla-inbound/rev/a4b2bc52e384
Comment 13 :Ehsan Akhgari (busy, don't ask for review please) 2012-04-10 08:46:20 PDT
https://hg.mozilla.org/mozilla-central/rev/a4b2bc52e384

Note You need to log in before you can comment on or make changes to this bug.