lang attribute not passed to font layout/rendering layer




Layout: Text
6 years ago
5 years ago


(Reporter: Steve White, Unassigned)


7 Branch

Firefox Tracking Flags

(Not tracked)



(7 attachments)



6 years ago
Created attachment 566692 [details]
correct rendering by pango-view

User Agent: Mozilla/5.0 (X11; Linux i686; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
Build ID: 20110928224103

Steps to reproduce:

Have a font with a 'locl' lookup for Cyrillic text which replaces a few Russian letters with Serbian/Macedonian ones.

Tested it with pango-view (see attachment), with input like this:
<span lang="ru"> б г д п т </span> <span lang="ru"> <i>б г д п т</i> </span>
<span lang="sr"> б г д п т </span> <span lang="sr"> <i>б г д п т</i> </span>

(Note very few fonts make this distinction.)

Wrote HTML showing the variable characters in between <span> tags with 
Viewed this with Firefox.
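
A test page of this kind can be sketched as follows. This is only an illustration, assuming the <span> tags carried lang="ru" and lang="sr" as in the pango-view input above. The function name build_test_page and the default font family are hypothetical and are not taken from the attached test case.

```python
from html import escape

SAMPLE = "б г д п т"  # letters whose Serbian forms differ from the Russian ones


def build_test_page(font_family="DejaVu Serif", langs=("ru", "sr"), sample=SAMPLE):
    """Return an HTML page showing `sample` upright and italic, once per language."""
    rows = []
    for lang in langs:
        text = escape(sample)
        rows.append(
            f'<p><span lang="{lang}">{text}</span> '
            f'<span lang="{lang}"><i>{text}</i></span></p>'
        )
    return "\n".join([
        "<!DOCTYPE html>",
        "<html>",
        '<head><meta charset="utf-8">',
        # The font family is an assumption; use whichever font has the 'locl' lookup.
        f"<style>body {{ font-family: '{font_family}'; font-size: 32px; }}</style>",
        "</head>",
        "<body>",
        *rows,
        "</body>",
        "</html>",
    ])


if __name__ == "__main__":
    print(build_test_page())
```

If the lang attribute reaches the shaper, the two rows should render differently in a font that has the 'locl' lookup.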

Actual results:

The text is displayed, I see the font is the correct one.
(I cleared all the caches, re-started Firefox after installing the font; I have reason to be confident about the font being correct.)
However, no distinction was visible between the lang="sr" text and the lang="ru" text.

This would be consistent with the browser not passing the lang tag properly to the underlying font rendering software (Pango in this case, I think).

Expected results:

The text from the two languages should have appeared quite different, as in the attached image.

Comment 1

6 years ago
Does this work in any other browser?
Component: General → Layout: Text
Product: Firefox → Core
QA Contact: general → layout.fonts-and-text

Comment 2

I thought we'd fixed this (see bug 24139). Please attach your complete (minimal) testcase, and specify the exact font version being used so that we can look into it more closely.

Comment 3

6 years ago
Created attachment 566872 [details]
pango-view output with DejaVu

pango-view --font "DejaVu Serif Italic 32" --markup Serbian-pango.html

where Serbian-pango.html contains
<span lang="ru"> б г д п т ѓ</span> <span lang="ru"> <i>б г д п т ѓ</i> </span>
<span lang="sr"> б г д п т ѓ</span> <span lang="sr"> <i>б г д п т ѓ</i> </span>

Comment 4

6 years ago
Created attachment 566873 [details]
HTML test file using DejaVu

Comment 5

6 years ago
Does it work in other browsers?  Not that I can see, in Linux anyway.


And OpenOffice etc... Just never mind.  They are pathetically broken when it comes to font features turned on by locale.

Comment 6

6 years ago
The sample is more interesting with

pango-view --font "DejaVu Serif 32" --markup Serbian-pango.html

Comment 7

6 years ago
I'm experiencing the same problem. Even this official test page doesn't work with my Firefox 7.0.1 on Ubuntu 11.04. Still, I can see Serbian glyphs on the Serbian Wikipedia, but only there. Why?

Comment 8

6 years ago
Official test page:

Comment 9

6 years ago
Yes, I looked at some Serbian pages, such as the Serbian Wikipedia page on Macedonian orthography (look up Macedonian orthography, then click on language "Српски/Srpski").

In some places I see the Serbian "be" form; in others, the Russian.

For those not sensitive to Cyrillic: we're mostly looking at the letter "be", which looks rather like a Greek beta.  The Russian form has a tail that begins on the left side and has a pronounced upward flourish.  The Serbian form starts in the top middle of the circle, and ends more horizontally.  

In the DejaVu fonts, the distinction is made by a 'locl' substitution lookup for Serbian and Macedonian, which should be triggered by lang="sr".

Trying to determine why sometimes Serbian 'be' is appearing in that page, I was changing things all over the small example I attached.  

I am seeing very chaotic behavior here.  Things that should not affect it do, and things that should pick the language fail to do so.  Really crazy.  Something's buggy.  But I haven't yet determined what triggers it.

It isn't CSS though (at least as set in the page).

Comment 10

6 years ago
Created attachment 567240 [details]
Alessandro's test showing Serbian (on my system at least)

Comment 11

6 years ago
More experimentation, using Alessandro's example page.

Putting lang="sr" in the <html> tag has a strong influence.  Of course, then unless otherwise specified, *all* the Cyrillic text should be modified for Serbian.  This should not be necessary, either.  Putting it in any element tag should set the default language for that element and its children.

It is clear at least that the lang attribute is not being correctly dealt with in the containment hierarchy.
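
The inheritance rule described above (an element without its own lang takes the language of its nearest ancestor that has one) can be sketched as follows. This only illustrates the expected behavior; it is not how Gecko resolves the attribute, and the function name effective_langs and the sample markup are made up for the example.

```python
import xml.etree.ElementTree as ET


def effective_langs(markup):
    """Return (text, lang) pairs; lang comes from the nearest ancestor that sets it."""
    root = ET.fromstring(markup)
    results = []

    def walk(elem, inherited):
        # An element's own lang attribute overrides whatever it inherited.
        lang = elem.get("lang", inherited)
        if elem.text and elem.text.strip():
            results.append((elem.text.strip(), lang))
        for child in elem:
            walk(child, lang)
            # Text following a child still belongs to the current element.
            if child.tail and child.tail.strip():
                results.append((child.tail.strip(), lang))

    walk(root, None)
    return results


if __name__ == "__main__":
    doc = ('<html lang="sr"><body><p>б г д</p>'
           '<p lang="ru"><i>б г д</i> п т</p></body></html>')
    for text, lang in effective_langs(doc):
        print(lang, text)
```

With lang="sr" on the root, the first paragraph should be treated as Serbian, while the second paragraph and its italic child should be treated as Russian.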

However, I think it's worse than that.  I seem to be seeing things change sometimes with just a second screen refresh, and in the other test file, simply adding text somewhere could change the apparent language in an unrelated element.

At this moment, I have Serbian Cyrillic showing in Alessandro's example.  See attached.

Comment 12

I'm confirming this bug, as it's clear there's a problem somewhere that needs investigation and fixing - my hunch is that it may turn out to be a problem with the application of OpenType features, rather than the handling of the 'lang' attribute, but that's only speculation until we track this down.

The erratic behavior described in comment 11 suggests there may be an uninitialized value somewhere that's "randomly" affecting whether the feature gets applied correctly.
Ever confirmed: true

Comment 13

6 years ago

Just to rattle your cage:  I'm seeing this again in FF 9.0.1 with a font that makes a distinction between Yiddish and Hebrew vowel marks. 

The marks are properly positioned by XeTeX using this font.  Also pango-view has no problem.

Firefox behaves erratically.  If two table cells contain Hebrew, but one has 'lang' attribute "he" and the other "yi", it will position marks according to whichever language came *first*.  Weirder, simply reversing them in the file and re-loading isn't enough to make it forget.  I think I've gone so far as to delete the cache and re-start, to get it to see that I've changed the 'lang' attributes.

It's as though it associates a language with a script, and then won't let go of the association.

Comment 14

(In reply to Steve White from comment #13)
> Firefox behaves erratically.  If two table cells contain Hebrew, but one has
> 'lang' attribute "he" and the other "yi", it will position marks according
> to whichever language came *first*.

Do the two table cells contain the same characters? I ask because this reminds me of bug 386339 comment 1. The text run word cache has probably changed a lot since then so it may be completely irrelevant, but fonts with "locl" are certainly an example of what I asked there in bug 386339 comment 6 about the same sequence of unicode codepoints not being rendered with the same glyphs.
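
To illustrate the kind of failure being asked about: if shaped words are cached under a key that omits the language, the first language seen for a given string wins, which would match the "whichever came first" symptom. The sketch below is hypothetical. WordCache and shape are invented names, and nothing here is taken from Gecko's text run code or from the patch in any of the bugs mentioned.

```python
def shape(text, lang):
    """Stand-in for a shaper: tag each character with the language used for 'locl'."""
    return [f"{ch}/{lang}" for ch in text]


class WordCache:
    """Toy shaped-word cache; `key_includes_lang` switches between the two behaviors."""

    def __init__(self, key_includes_lang):
        self.key_includes_lang = key_includes_lang
        self.entries = {}

    def lookup(self, text, lang):
        # When the language is left out of the key, identical codepoints
        # collide even though they should be shaped differently.
        key = (text, lang) if self.key_includes_lang else text
        if key not in self.entries:
            self.entries[key] = shape(text, lang)
        return self.entries[key]


if __name__ == "__main__":
    stale = WordCache(key_includes_lang=False)
    stale.lookup("б", "ru")
    print(stale.lookup("б", "sr"))  # the earlier "ru" entry is reused

    keyed = WordCache(key_includes_lang=True)
    keyed.lookup("б", "ru")
    print(keyed.lookup("б", "sr"))  # shaped separately for "sr"
```

This is why it matters whether the two cells contain the same characters: only identical strings would collide in such a cache.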

Comment 15

6 years ago
Created attachment 593428 [details]
Hebrew mark placement example

This requires a font which makes a distinction between the placement in Yiddish and Hebrew of vowel marks under the yod and yodyod consonants.

Comment 16

6 years ago
Hi, I tried altering the text in various ways.  It seems to have no effect.
Find the example attached.

1) This needs a special font that makes a distinction between the languages.
   The current SVN version of GNU FreeSans is the example I'm using.
   You can build this with FontForge.  I could also send you a snapshot if you like.

2) A correspondent has told me that a similar sample works with that version 
   of FreeSans under Mac OS with the latest Firefox 10.  
   I have tried it under Windows with Firefox 10 and it fails.  
   I've asked him for the exact HTML file he's using.
Attachment #593428 - Attachment mime type: text/plain → text/html

Comment 17

6 years ago
He has verified that the same "yiddish" HTML file works with Firefox 10 on Mac OS.
Again, it does not work for me with Firefox 10 on Windows 7.
Steve, can you retest with a nightly? I see the Yiddish/Hebrew issue on Linux but only with builds before 2012-01-07, so I think it was fixed by bug 703100
Attachment #566873 - Attachment mime type: text/plain → text/html
Attachment #567240 - Attachment mime type: text/plain → text/html
On Windows, I think you'll need to set gfx.font_rendering.harfbuzz.scripts to 7 (instead of the default 3) so that Hebrew script is rendered using harfbuzz in order to get the proper result here.
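
The pref value reads as a bitmask of script groups. The sketch below assumes the bit assignments 1 = default (simple) scripts, 2 = Arabic, 4 = Hebrew, as recalled from Gecko source of that period. Only the values 3 and 7 come from the comment above, and the constant and function names are illustrative.

```python
# Assumed bit assignments; they are not confirmed anywhere in this bug report.
SHAPING_DEFAULT = 0x1  # simple scripts such as Latin and Cyrillic
SHAPING_ARABIC = 0x2
SHAPING_HEBREW = 0x4


def harfbuzz_enabled(pref_value, script_flag):
    """Return True if the bit for the given script group is set in the pref."""
    return bool(pref_value & script_flag)


if __name__ == "__main__":
    for value in (3, 7):
        print(value, "Hebrew via harfbuzz:", harfbuzz_enabled(value, SHAPING_HEBREW))
```

Under that assumption, the default of 3 leaves the Hebrew bit unset, and 7 adds it.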

Comment 20

6 years ago
Created attachment 593493 [details]
working example

Comment 21

6 years ago
Created attachment 593496 [details]
working example-Hebrew marks

Comment 22

6 years ago
That's it.

Nightly 12.0a1 (2012-01-31) seems to have the problem solved.

I attached shots of the Russian/Serbian Cyrillic distinction as well as Hebrew/Yiddish.  These are using the SVN versions of GNU FreeFont's FreeSans and FreeSerif.

Thanks guys!
Last Resolved: 6 years ago
Depends on: 703100
Resolution: --- → FIXED
Duplicate of this bug: 741093