Created attachment 566692 [details]
correct rendering by pango-view
User Agent: Mozilla/5.0 (X11; Linux i686; rv:7.0.1) Gecko/20100101 Firefox/7.0.1
Build ID: 20110928224103
Steps to reproduce:
Have a font with a 'locl' lookup for Cyrillic text which replaces a few Russian letters with Serbian/Macedonian ones.
Tested it with pango-view (see attachment), with input like this:
<span lang="ru"> б г д п т </span> <span lang="ru"> <i>б г д п т</i> </span>
<span lang="sr"> б г д п т </span> <span lang="sr"> <i>б г д п т</i> </span>
(Note very few fonts make this distinction.)
Wrote HTML showing the variable characters in between <span> tags with
Viewed this with Firefox.
The text is displayed, and I can see that the font is the correct one.
(I cleared all the caches, re-started Firefox after installing the font; I have reason to be confident about the font being correct.)
However, no distinction was visible between the lang="sr" text and the lang="ru" text.
This would be consistent with the browser not passing the lang tag properly to the underlying font rendering software (Pango in this case, I think).
The text from the two languages should have appeared quite different, as in the attached image.
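For anyone wanting to reproduce this, a minimal test page along the lines described above could be generated as follows. This is only a sketch: the font family and output file name are placeholders, not values taken from this report, and the page layout is an assumption.

```python
# Sketch: write a minimal HTML page showing the same Cyrillic letters
# tagged once as Russian and once as Serbian.
from pathlib import Path

FONT_FAMILY = "DejaVu Serif"        # assumption: any font with a Serbian 'locl' lookup
OUT_PATH = Path("locl-test.html")   # hypothetical file name
LETTERS = "б г д п т"               # the letters used in the pango-view input


def build_test_page(font_family=FONT_FAMILY):
    """Return the HTML source of the test page as a string."""
    rows = []
    for lang in ("ru", "sr"):
        # One upright and one italic sample per language.
        rows.append(
            f'<p><span lang="{lang}">{LETTERS}</span> '
            f'<span lang="{lang}"><i>{LETTERS}</i></span></p>'
        )
    return (
        "<!DOCTYPE html>\n"
        '<html>\n<head>\n<meta charset="utf-8">\n'
        f"<style>body {{ font-family: '{font_family}'; font-size: 32px; }}</style>\n"
        "</head>\n<body>\n" + "\n".join(rows) + "\n</body>\n</html>\n"
    )


if __name__ == "__main__":
    OUT_PATH.write_text(build_test_page(), encoding="utf-8")
```

If the lang attribute reaches the shaper, the two rows should render differently; in the behavior reported here they look identical.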
Does this work in any other browser?
I thought we'd fixed this (see bug 24139). Please attach your complete (minimal) testcase, and specify the exact font version being used so that we can look into it more closely.
Created attachment 566872 [details]
pango-view output with DejaVu
pango-view --font "DejaVu Serif Italic 32" --markup Serbian-pango.html
where Serbian-pango.html contains
<span lang="ru"> б г д п т ѓ</span> <span lang="ru"> <i>б г д п т ѓ</i> </span>
<span lang="sr"> б г д п т ѓ</span> <span lang="sr"> <i>б г д п т ѓ</i> </span>
Created attachment 566873 [details]
HTML test file using DejaVu
Does it work in other browsers? Not that I can see, in Linux anyway.
And OpenOffice etc... Just never mind. They are pathetically broken when it comes to font features turned on by locale.
The sample is more interesting with
pango-view --font "DejaVu Serif 32" --markup Serbian-pango.html
I'm experiencing the same problem. Even this official test page doesn't work with my Firefox 7.0.1 on Ubuntu 11.04. Still, I can see Serbian glyphs on the Serbian Wikipedia, but only there. Why?
Official test page: http://people.mozilla.org/~jdaggett/webfonts/serbianglyphs.html
Yes, I looked at some Serbian pages, such as the Serbian Wikipedia page on Macedonian orthography (look up Macedonian orthography, then click on language "Српски/Srpski").
In some places I see the Serbian "be" form; in other places, the Russian.
For those not sensitive to Cyrillic: we're mostly looking at the letter "be", which looks rather like a Greek beta. The Russian form has a tail that begins on the left side and has a pronounced upward flourish. The Serbian form starts in the top middle of the circle, and ends more horizontally.
In the DejaVu fonts, the distinction is made by a 'locl' substitution lookup for Serbian and Macedonian, which should be triggered by lang="sr".
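To make the expected mechanism concrete, here is a rough sketch of how a lang value could select an OpenType language system, and with it the 'locl' feature. The language system tags (RUS, SRB, MKD) are the registered OpenType ones; the font description is made-up data for illustration, not something read from the DejaVu files.

```python
# Sketch: lang attribute -> OpenType language system -> enabled features.

# ISO 639 language code -> OpenType language system tag (4 chars, space padded).
LANG_TO_OT_TAG = {
    "ru": "RUS ",
    "sr": "SRB ",
    "mk": "MKD ",
}

# Hypothetical font: substitution features per language system of the 'cyrl' script.
CYRL_LANGSYS_FEATURES = {
    "dflt": {"liga"},
    "SRB ": {"liga", "locl"},
    "MKD ": {"liga", "locl"},
}


def features_for(lang):
    """Return the features a shaper would apply to text tagged with `lang`."""
    # Use only the primary subtag, so "sr-Cyrl" behaves like "sr".
    tag = LANG_TO_OT_TAG.get(lang.split("-")[0].lower())
    # Fall back to the default language system when the font has no entry.
    return CYRL_LANGSYS_FEATURES.get(tag, CYRL_LANGSYS_FEATURES["dflt"])
```

With this toy data, features_for("sr") includes 'locl' and features_for("ru") does not, which is the distinction the browser fails to show.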
Trying to determine why sometimes Serbian 'be' is appearing in that page, I was changing things all over the small example I attached.
I am seeing very chaotic behavior here. Things that should not affect it do, and things that should pick the language fail to do so. Really crazy. Something's buggy. But I haven't yet determined what triggers it.
It isn't CSS though (at least as set in the page).
Created attachment 567240 [details]
Alessandro's test showing Serbian (on my system at least)
More experimentation, using Alessandro's example page.
Putting lang="sr" in the <html> tag has a strong influence. Of course, then unless otherwise specified, *all* the Cyrillic text should be modified for Serbian. This should not be necessary, either. Putting it in any element tag should set the default language for that element and its children.
It is clear at least that the lang attribute is not being correctly dealt with in the containment hierarchy.
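The expected inheritance can be sketched with Python's standard html.parser: each element takes its own lang if it has one, otherwise its parent's. This is a simplified model (it ignores xml:lang, HTTP headers and implied end tags), and the class and function names are made up for the example.

```python
# Sketch: compute the effective language of each text run in a document.
from html.parser import HTMLParser

# Elements without an end tag; they never push onto the stack.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input"}


class LangTracker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = [""]   # effective lang per open element; "" means unknown
        self.runs = []      # (effective_lang, text) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        lang = dict(attrs).get("lang")
        # Inherit the parent's language when the element sets none.
        self.stack.append(lang if lang is not None else self.stack[-1])

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.runs.append((self.stack[-1], text))


def effective_langs(html):
    """Return (lang, text) pairs for every non-empty text node."""
    tracker = LangTracker()
    tracker.feed(html)
    tracker.close()
    return tracker.runs
```

For a page with lang="sr" on the html element and lang="ru" on a single paragraph, only the text inside that paragraph should come out as Russian; everything else should be treated as Serbian.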
However, I think it's worse than that. I seem to be seeing things change sometimes with just a second screen refresh, and in the other test file, simply adding text somewhere could change the apparent language in an unrelated element.
At this moment, I have Serbian Cyrillic showing in Alessandro's example. See attached.
I'm confirming this bug, as it's clear there's a problem somewhere that needs investigation and fixing - my hunch is that it may turn out to be a problem with the application of OpenType features, rather than the handling of the 'lang' attribute, but that's only speculation until we track this down.
The erratic behavior described in comment 11 suggests there may be an uninitialized value somewhere that's "randomly" affecting whether the feature gets applied correctly.
Just to rattle your cage: I'm seeing this again in FF 9.0.1 with a font that makes a distinction between Yiddish and Hebrew vowel marks.
The marks are properly positioned by XeTeX using this font. Also pango-view has no problem.
Firefox behaves erratically. If two table cells contain Hebrew, but one has 'lang' attribute "he" and the other "yi", it will position marks according to whichever language came *first*. Weirder, simply reversing them in the file and re-loading isn't enough to make it forget. I think I've gone so far as to delete the cache and re-start, to get it to see that I've changed the 'lang' attributes.
It's as though it associates a language with a script, and then won't let go of the association.
(In reply to Steve White from comment #13)
> Firefox behaves erratically. If two table cells contain Hebrew, but one has
> 'lang' attribute "he" and the other "yi", it will position marks according
> to whichever language came *first*.
Do the two table cells contain the same characters? I ask because this reminds me of bug 386339 comment 1. The text run word cache has probably changed a lot since then, so it may be completely irrelevant. Still, fonts with "locl" are certainly an example of what I asked about in bug 386339 comment 6: the same sequence of Unicode codepoints not being rendered with the same glyphs.
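To illustrate the kind of failure being hypothesized here, a cache keyed only on the text would return whatever was shaped first, regardless of language. The following is purely a toy model with invented names; it is not Gecko's text run cache and makes no claim about how that cache is actually keyed.

```python
# Hypothetical illustration of a shaping cache that forgets the language.

def shape(text, lang):
    """Stand-in shaper: marks the result with the language used to shape it."""
    return f"{text}@{lang}"


class TextOnlyCache:
    """Buggy variant: the key ignores the language."""

    def __init__(self):
        self._cache = {}

    def get(self, text, lang):
        if text not in self._cache:
            self._cache[text] = shape(text, lang)
        return self._cache[text]


class TextAndLangCache:
    """Variant whose key includes the language."""

    def __init__(self):
        self._cache = {}

    def get(self, text, lang):
        key = (text, lang)
        if key not in self._cache:
            self._cache[key] = shape(text, lang)
        return self._cache[key]
```

With the first variant, the same Hebrew-script string requested as "he" and then as "yi" comes back shaped as "he" both times, which matches the "whichever language came first" symptom described in comment 13.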
Created attachment 593428 [details]
Hebrew mark placement example
This requires a font which makes a distinction between the placement in Yiddish and Hebrew of vowel marks under the yod and yodyod consonants.
Hi, I tried altering the text in various ways. It seems to have no effect.
Find the example attached.
1) This needs a special font that makes a distinction between the languages.
The current SVN version of GNU FreeSans is the example I'm using.
You can build this with FontForge. I could also send you a snapshot if you like.
2) A correspondent has told me that a similar sample works with that version
of FreeSans under Mac OS with the latest Firefox 10.
I have tried it under Windows with Firefox 10 and it fails.
I've asked him for the exact HTML file he's using.
He has verified that the same "yiddish" HTML file works with Firefox 10 on Mac OS.
Again, it does not work for me with Firefox 10 on Windows 7.
Steve, can you retest with a nightly? I see the Yiddish/Hebrew issue on Linux, but only with builds before 2012-01-07, so I think it was fixed by bug 703100.
On Windows, I think you'll need to set gfx.font_rendering.harfbuzz.scripts to 7 (instead of the default 3) so that Hebrew script is rendered using harfbuzz in order to get the proper result here.
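Assuming the pref is a bitmask with one bit per group of scripts (the comment above does not spell this out), the difference between the two values can be checked with a few lines. The helper name is made up, and the meaning of the bits other than the one added here is not stated in this thread.

```python
# Sketch: which bits does the suggested pref value add over the default?
DEFAULT_MASK = 3   # default value mentioned above
HEBREW_MASK = 7    # value suggested for Windows


def newly_enabled_bits(old, new):
    """Return the individual bit values set in `new` but not in `old`."""
    added = new & ~old
    return [1 << i for i in range(added.bit_length()) if added & (1 << i)]


if __name__ == "__main__":
    print(newly_enabled_bits(DEFAULT_MASK, HEBREW_MASK))
```

Going from 3 to 7 adds only the bit with value 4, so under that assumption it is this bit that switches Hebrew script over to harfbuzz shaping.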
Created attachment 593493 [details]
Created attachment 593496 [details]
working example-Hebrew marks
Nightly 12.0a1 (2012-01-31) seems to have the problem solved.
I attached shots of the Russian/Serbian Cyrillic distinction as well as Hebrew/Yiddish. These are using the SVN versions of GNU FreeFont's FreeSans and FreeSerif.
*** Bug 741093 has been marked as a duplicate of this bug. ***