Handle variant selectors properly




11 years ago
9 years ago


(Reporter: Jean-Marc Desperrier, Assigned: smontagu)


(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)




(1 attachment)



11 years ago
Unicode variation selectors are supported very inconsistently with the new cairo-based text rendering.

Steps to reproduce:
- Download and install a recent version of code2000: http://www.code2000.net/code2000_page.htm
- Display the page: http://jmdesp.free.fr/i18n/varsel/math-variation-selectors-axx.html
- The variation selector character is displayed as a [VS1] glyph.
- The characters displayed in the columns with and without the variation selector are the same.
- Go to Options, select the Content tab, and click the Advanced button in the Fonts & Colors box. Uncheck "Allow pages to choose their own font, instead of my selection above". The selection above should be Times, Arial, Courier New. Click OK to close the option dialogs.
- The display of the page is updated.
- The [VS1] glyph is no longer displayed. The characters in the second column are now the VS1 variants instead of being the same as in the first column.

It seems that the variation selector doesn't work when code2000 is directly selected for the display, but that it does work when code2000 is reached through the font fallback mechanism.

Comment 1

11 years ago
I changed the sample for easier reproduction.

There are three columns now: the first two force code2000, and the third leaves the default font setting.

So there is no longer any need to change the Fx option; the third column shows what happens when code2000 is used through fallback instead of being directly selected. In fact, "Allow pages to choose their own font, instead of my selection above" should be checked to see the problem in the second column.


Comment 2

11 years ago
Here's an interesting data point: the testcase displays as expected if I set browser.display.auto_quality_min_font_size to 0. So for some reason we are not going through the Uniscribe path when code2000 is the default font.

Comment 3

11 years ago
What's happening is that ScriptIsComplex is returning FALSE for the strings with variation selectors, so we take the GDI path, but when doing font fallback from the GDI path, we switch to the Uniscribe path.
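
For illustration only, here is a minimal sketch of the kind of check that would keep such strings off the simple path. It is not the code in the tree: `IsVariationSelector` and `NeedsComplexPath` are hypothetical names, and the `scriptIsComplex` parameter merely stands in for whatever the platform check reported.

```cpp
#include <string>

// Variation Selectors block: U+FE00..U+FE0F (VS1..VS16).
// The supplementary selectors U+E0100..U+E01EF are left out for brevity.
constexpr bool IsVariationSelector(char16_t c) {
    return c >= 0xFE00 && c <= 0xFE0F;
}

// Hypothetical helper: returns true if the run should bypass the simple path.
// `scriptIsComplex` is an assumed input modelling the platform's own answer.
bool NeedsComplexPath(const std::u16string& text, bool scriptIsComplex) {
    if (scriptIsComplex) {
        return true;
    }
    // Even if the platform says "simple", a variation selector in the run
    // means the shaper has to see the base character and selector together.
    for (char16_t c : text) {
        if (IsVariationSelector(c)) {
            return true;
        }
    }
    return false;
}
```

With this sketch, a run such as U+2229 followed by U+FE00 is routed to the complex path even when the platform check reports FALSE.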

Comment 4

11 years ago
Nice to see progress on this.
The test case is derived from the following page: http://babelstone.blogspot.com/2007/06/secret-life-of-variation-selectors.html. On that page the characters between U+2229 and U+22DB in fact behaved differently (their base versions are also available in fonts much more common than code2000), and I need to check again whether your explanation covers what I saw with them.

Comment 5

11 years ago
Created attachment 275951 [details] [diff] [review]
Patch v.0 -- use a textrun flag instead of ScriptIsComplex

I haven't tested this on Linux yet, and I may refine the definition of IS_COMPLEX_SCRIPT a bit more, but I'd welcome comments from Robert and Stuart (and anyone else!) on the general approach.
Assignee: nobody → smontagu

Comment 6

11 years ago
-            (gfxTextRunFactory::TEXT_OPTIMIZE_SPEED | gfxTextRunFactory::TEXT_IS_RTL)) ==
+            (gfxTextRunFactory::TEXT_OPTIMIZE_SPEED | gfxTextRunFactory::TEXT_NEED_CTL)) ==

Why are you removing the TEXT_IS_RTL check? It's not subsumed by TEXT_NEED_CTL ... for example style could be forcing the direction. And we need to bypass the simple path for that case because of glyph mirroring.

+#define IS_COMPLEX_SCRIPT(c) \

Since this is just an optimization, instead of this large list which will probably require extension, would it be better to identify some common characters that *aren't* complex and leave it at that? Say, Latin-1 and Han?
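
A rough sketch of what such an allow-list could look like, assuming "Latin-1" means U+0000..U+00FF and "Han" means only the main CJK Unified Ideographs block. `IsKnownSimpleChar` and `CanUseFastPath` are hypothetical names and are not the IS_COMPLEX_SCRIPT macro from the patch.

```cpp
#include <string>

// Hypothetical predicate: true only for characters assumed never to need
// complex shaping. Everything else is conservatively treated as complex.
constexpr bool IsKnownSimpleChar(char32_t c) {
    // Basic Latin plus Latin-1 Supplement.
    if (c <= 0x00FF) {
        return true;
    }
    // CJK Unified Ideographs, main block only (U+4E00..U+9FFF).
    if (c >= 0x4E00 && c <= 0x9FFF) {
        return true;
    }
    return false;
}

// Hypothetical helper: the fast path is allowed only if every character in
// the run is on the allow-list.
bool CanUseFastPath(const std::u32string& text) {
    for (char32_t c : text) {
        if (!IsKnownSimpleChar(c)) {
            return false;
        }
    }
    return true;
}
```

The trade-off is that unlisted but simple scripts lose the optimization, while new complex characters never need to be added to a list.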

Comment 7

11 years ago
I rechecked what happens with U+2229 through U+22DB, and I have found out they indeed show another problem.

When doing font fallback for a character that has VS1 applied, the fonts that contain the base character will be considered OK for the fallback and will be used without searching further for a font that can handle the VS1.

For example, fallback for "U+2229,VS1" will use Arial Unicode MS or Batang even when the system also has code2000, which could display it properly.

This requires more work, because the correct solution would be to first search for the VS1 variant, but if not found to settle for fonts containing only the base character.

Should this go in a separate bug?
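
The two-pass search described above can be sketched as follows. This is only an illustration under an assumed font model: `FontInfo` and `PickFallbackFont` are hypothetical and do not correspond to anything in gfxWindowsFonts.cpp.

```cpp
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical font model: the base characters a font covers and the
// (base, selector) sequences it can render as distinct variants.
struct FontInfo {
    std::string name;
    std::set<char32_t> chars;
    std::set<std::pair<char32_t, char32_t>> variants;
};

// Returns the chosen font, or nullptr if no font has even the base character.
const FontInfo* PickFallbackFont(const std::vector<FontInfo>& fonts,
                                 char32_t base, char32_t selector) {
    // Pass 1: prefer a font that supports the full base + selector sequence.
    for (const FontInfo& f : fonts) {
        if (f.variants.count(std::make_pair(base, selector)) != 0) {
            return &f;
        }
    }
    // Pass 2: settle for the first font that has only the base character.
    for (const FontInfo& f : fonts) {
        if (f.chars.count(base) != 0) {
            return &f;
        }
    }
    return nullptr;
}
```

With a font list of Arial Unicode MS followed by code2000, a single-pass search stops at the first font, while this version skips ahead to code2000 when only code2000 has the VS1 variant.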


Comment 8

11 years ago
That can quite likely be solved in a similar way to IsJoiner in gfxWindowsFonts.cpp, but better make it a separate bug anyway.


11 years ago
Blocks: 392588

Comment 9

11 years ago
I created bug 392588 for the second issue

Comment 10

11 years ago
For info, the Unicode site now has a page about what to do when you can't display a character:
  The expected rendering behavior for the sequence of character plus a variation selector (C+VS) is as follows:
    * If C + VS is listed in StandardizedVariants.txt and supported by the rendering system, then display with the specified glyph.
    * Otherwise, display with the normal glyph for C (with no visible rendering for the VS).
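
As a rough sketch of that rule, assuming the set of standardized sequences and the set of sequences the rendering system supports are both available as plain lookups (`ChooseGlyph`, `SequenceSet` and `GlyphChoice` are hypothetical names):

```cpp
#include <set>
#include <utility>

// (base character, variation selector)
using Sequence = std::pair<char32_t, char32_t>;
using SequenceSet = std::set<Sequence>;

enum class GlyphChoice {
    Variant,   // draw the glyph specified for C + VS
    BaseOnly   // draw the normal glyph for C; the VS itself stays invisible
};

// `standardized` models the entries of StandardizedVariants.txt and
// `supported` models what the rendering system can draw. Both are assumptions.
GlyphChoice ChooseGlyph(char32_t c, char32_t vs,
                        const SequenceSet& standardized,
                        const SequenceSet& supported) {
    const Sequence seq(c, vs);
    if (standardized.count(seq) != 0 && supported.count(seq) != 0) {
        return GlyphChoice::Variant;
    }
    return GlyphChoice::BaseOnly;
}
```

In neither branch is the selector drawn as a visible glyph, so the [VS1] box from the original report is not an outcome this rule allows.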

Comment 11

10 years ago
This problem now seems to be WORKSFORME on recent builds.
Bug 392588 is still there, though.

Comment 12

9 years ago
Variation selectors now work correctly in the base case. 
bug 392588 is still there
Last Resolved: 9 years ago
Resolution: --- → FIXED