Closed Bug 404848 Opened 17 years ago Closed 8 years ago

Some Latin diacritics misplaced

Categories

(Core :: Graphics, defect, P2)

1.9.2 Branch
x86
Windows XP
defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: smontagu, Unassigned)

References

Details

(Keywords: regression, testcase)

Attachments

(9 files, 1 obsolete file)

Attached file testcase
This regressed on Windows between the 2007092802 and 2007092902 nightlies, but I don't see any relevant checkins in that time frame: http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=all&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2007-09-28+02%3A00&maxdate=2007-09-29+02%3A00&cvsroot=%2Fcvsroot
Flags: blocking1.9?
Which character in the testcase is different?
I'm running 2007112302 on Linux, and I see unacceptable differences in the letter "i":  in the second line, the dot is still on the "i", and the accent is just globbed on top.

On my machine, the second line is less pleasing overall, with much squarer accents -- the dots of the umlaut/diaeresis are squares instead of circular, for example.  I can believe that's a problem with the fonts that I have installed rather than with the application.
Attached image on windows Vista
Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9b2pre) Gecko/2007112305 Minefield/3.0b2pre ID:2007112305

Screenshot from Windows Vista; it doesn't seem to show the same problem...
Attached image on Xubuntu Linux
As rendered by Seamonkey nightly 2007112302.
Flags: blocking1.9? → blocking1.9+
Priority: -- → P3
On Windows XP (standard install, no complex script support installed), I'm getting a different result. The i-circumflex and i-trema are damaged, but many accents are also offset to the right.

http://img155.imageshack.us/my.php?image=screenshotek1.png

Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9b3pre) Gecko/2007121205 Minefield/3.0b3pre
Note that the accents that have the offset are also the ones that look blocky. It seems that these are not the built-in characters in the font, but are combined from 2 characters (^ + i = î).
Simon, can you dig into this a bit more?
Assignee: nobody → smontagu
I still don't understand the 2007-09-29 regression date, but what is happening is that we are passing the combining accents separately to WhichFontSupportsChar(), and apparently some fonts claim not to support them as separate characters even though they handle them correctly in context after a base character.
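
A minimal sketch of the effect described above, assuming a per-character font lookup. The font names' cmap contents and the Python function `which_font_supports_char` are hypothetical stand-ins for illustration, not the actual Gecko code behind WhichFontSupportsChar():

```python
# Hypothetical cmaps: the set of code points each font claims to cover.
# "Times New Roman" here lacks the standalone combining marks.
FONT_CMAPS = {
    "Times New Roman": {ord(c) for c in "abcdefghijklmnopqrstuvwxyz"} | {0x00E4},
    "Fallback Font": {0x0303, 0x0308},
}


def which_font_supports_char(ch, preferred):
    """Return the first font whose cmap contains ch, trying `preferred` first."""
    order = [preferred] + [name for name in FONT_CMAPS if name != preferred]
    for name in order:
        if ord(ch) in FONT_CMAPS[name]:
            return name
    return None


def font_runs(text, preferred):
    """Split text into (font, substring) runs using per-character matching."""
    runs = []
    for ch in text:
        font = which_font_supports_char(ch, preferred)
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs


# The base letter and its combining diaeresis land in different font runs,
# so no single font ever sees the pair together.
print(font_runs("a\u0308", "Times New Roman"))
```

With these assumed cmaps, the precomposed U+00E4 stays in one run while the decomposed pair is split across two fonts, which is the situation in which the mark can no longer be positioned relative to its base.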
how important is this?
Priority: P3 → P4
I don't set priorities but to me this is a major regression and makes the combining characters unusable.  What's most vexing is that this works perfectly in Firefox 2.
I'm not sure if this only affects certain fonts, and if so, if they're common fonts.  We need some more investigation here, especially based on comment #9.
Flags: blocking1.9?
Flags: tracking1.9+
Flags: blocking1.9?
Flags: blocking1.9+
Priority: P4 → P2
So the problem here is that we match Times New Roman for the 'a' and then a different font for the #x30?.  If we want this to work "right" then we'd need to change the font we use for 'a' as well.  We could peek at nextCh and see what the best font for it is if it is in the combining char range and try to use it for ch.
Attached patch fix? (obsolete) — Splinter Review
soooo.

the if 0 code in this patch is "correct" but ends up picking a different font.
the if 1 code doesn't seem right but works.

thoughts?
Assignee: smontagu → pavlov
Status: NEW → ASSIGNED
Attachment #311898 - Flags: superreview?(smontagu)
Attachment #311898 - Flags: review?(jdaggett)
To run, first save the file and open locally (uses XPConnect to pull in list of local fonts).

This shows that the problem is definitely font-dependent, some fonts render fine (Arial Unicode MS, Courier New) but others don't (Arial, Times New Roman).

I don't think the patch really works, since it's only dealing with the one-char look ahead case.  Combining marks can be stacked so you could have a situation like:

a + umlaut + tilde
&#x61 + &#x308 + &#x303

Seems like the ideal fallback would be to fallback with the entire cluster (i.e. base char plus all combining chars) and then iterate through:

- is there a separate codepoint for the base + combining char combination?  test if that's in the cmap and use that (since fonts seem to do a better job with these)

- are all base + combining chars in the cmap?  if not, fallback to the next font (hmmm???)

But, argh, our win+mac font matching definitely is not set up to handle this easily.
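
The cluster-based fallback outlined above could be sketched roughly as follows. This is an illustration only: the `(name, cmap_set)` font list is a hypothetical structure, and clusters are approximated with `unicodedata.combining()`, which misses marks whose canonical combining class is 0.

```python
import unicodedata


def split_clusters(text):
    """Group each base character with the combining marks that follow it."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.combining(ch):
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


def font_for_cluster(cluster, fonts):
    """Pick a font for a whole cluster.

    fonts is an ordered list of (name, cmap_set) pairs (hypothetical shape).
    Returns (font_name, text_to_shape), or (None, cluster) if nothing fits.
    """
    composed = unicodedata.normalize("NFC", cluster)
    for name, cmap in fonts:
        # Step 1: a single precomposed code point that the font maps directly.
        if len(composed) == 1 and ord(composed) in cmap:
            return name, composed
        # Step 2: base and every combining mark are all in this font's cmap.
        if all(ord(c) in cmap for c in cluster):
            return name, cluster
        # Otherwise fall through to the next font in the list.
    return None, cluster


fonts = [
    ("A", {0x61, 0x71, 0xE4}),           # has precomposed a-diaeresis only
    ("B", {0x61, 0x71, 0x0303, 0x0308}),  # has the standalone marks
]
for cluster in split_clusters("a\u0308q\u0308a\u0308\u0303"):
    print([f"U+{ord(c):04X}" for c in cluster], font_for_cluster(cluster, fonts)[0])
```

Because the decision is made per cluster rather than per character, stacked marks such as a + umlaut + tilde are kept with their base and sent to one font.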
This isn't perfect, but is probably as good as we're going to do for now.  As this is XP only and only affects a certain set of fonts, we should probably either take this patch or not fix it.  The only other real option would be to parse the ligature tables and look for codepoints they do things with that aren't in the cmap.  I don't think we want to do that for 1.9.
Attachment #311898 - Attachment is obsolete: true
Attachment #311943 - Flags: superreview?(vladimir)
Attachment #311943 - Flags: review?(smontagu)
Attachment #311898 - Flags: superreview?(smontagu)
Attachment #311898 - Flags: review?(jdaggett)
Simple testcase that shows related problem on Mac.  This is not exactly a real-world case since most folks don't need to render q + umlaut.

What's going on?  ATS is swizzling around the cmap for fonts that lack combining mark codepoints.  

cmap for Trebuchet MS:

<map charValue="0x02DD" glyphRefID="223"/>
<map charValue="0x0394" glyphRefID="168"/>

i.e. no combining marks

But when the cmap is loaded via ATSFontGetTable in MacOSFontEntry::ReadCMAP, combining mark codepoints magically appear:

0x0300 - 0x0304, 0x0306 - 0x030C, 0x030F, 0x0311, 0x0313, 0x031B, etc.

When ATSUI is handed the sequence a + umlaut, it passes back the glyph id for U+00E4, the combined codepoint.  So when there's no combined character, as in the q + umlaut case, we render pooh on the screen.
Stuart, I don't know why you think this is Windows (or even XP) only. I definitely still see the problem with the original testcase (attachment 289717 [details]) with Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b5pre) Gecko/2008032601 SeaMonkey/2.0a1pre. It still displays like attachment 289951 [details] for me. As John sees problems on Mac, too, I am changing this to All/All.
OS: Windows XP → All
Hardware: PC → All
Comment on attachment 311943 [details] [diff] [review]
use the other path

Looks fine from my end.
Attachment #311943 - Flags: superreview?(vladimir) → superreview+
Peter: the font code on each platform is completely different.  This is an issue on Windows due to the fonts that are installed (namely, the versions of Times New Roman and others) on XP that are fixed on Vista.

Linux and Mac should get separate bugs.
OS: All → Windows XP
Hardware: All → PC
I'm well aware that the font backends are different. But that doesn't necessarily mean that one has to open different bugs because that creates additional overhead (for pointing to existing testcases and copying around text and URLs, makes it more difficult when searching for bugs). Anyway, opened Bug 425650 and Bug 425651 as requested.
Comment on attachment 311943 [details] [diff] [review]
use the other path

I prefer the first option, because it gives more consistent results for more fonts. This one tends to produce a "ransom note" effect.

The Alan Wood page comes out fine in a font like Code2000. FWIW, it's broken on IE7 also with default font settings.
Attachment #311943 - Flags: review?(smontagu) → review-
It looks like "text content SHOULD be in fully-normalized form"

http://www.w3.org/TR/2004/WD-charmod-norm-20040225/#sec-NormalizationApplication

and that form is NFC

http://www.w3.org/TR/2004/WD-charmod-norm-20040225/#sec-ChoiceNFC

Such text wouldn't see this bug so I question why this bug should block.
Are decomposed characters seen on the web frequently?

Or is this bug more about characters that don't have a precomposed form?
If so, the first option would not be a good solution (unless Uniscribe does some font fallback to find the combining marks).
Any thoughts on just minusing this bug?  I don't like either of my patches and I'm not sure it will affect much in the real world, and it is really a font bug.
Flags: blocking1.9+ → blocking1.9-
Karl,

even NFC can contain combining diacritics, not all base character + diacritic combinations exist as precomposed characters, so for various languages combining diacritics are a fact of life, even with NFC.
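
A quick way to check this point, using Python's standard `unicodedata` module (the helper name `nfc_codepoints` is just for illustration):

```python
import unicodedata


def nfc_codepoints(text):
    """Normalize to NFC and list the resulting code points."""
    composed = unicodedata.normalize("NFC", text)
    return [f"U+{ord(c):04X}" for c in composed]


# a + combining diaeresis has a precomposed form, so NFC yields one code point.
print(nfc_codepoints("a\u0308"))   # ['U+00E4']
# q + combining diaeresis and m + combining macron have none, so the
# combining mark survives normalization.
print(nfc_codepoints("q\u0308"))   # ['U+0071', 'U+0308']
print(nfc_codepoints("m\u0304"))   # ['U+006D', 'U+0304']
```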

As far as I can see this isn't a bug to fix in Firefox; it's a web development issue. If an appropriate version of Uniscribe is being used and the OpenType font has mark and mkmk features defined that accommodate the base character + diacritic(s) combination required, then everything works fine and as expected.

The situation is complicated on Windows by the fact that support for certain base character and diacritic combinations was introduced in Windows 2000 and in the core fonts in order to support Microsoft's implementation of Vietnamese.

More complete combining diacritic support wasn't introduced until the version of Uniscribe in Windows XP SP2. Core fonts weren't updated until Windows Vista.

Alternatively, you could use Graphite as an alternative rendering system.

There are limitations with the OS and there are limitations with the fonts. It's not enough to have a font with a glyph for the combining diacritic; you need a font with appropriate mark and mkmk OpenType features.

Andrew
I disagree that there is no bug to fix in FF; this used to work in FF2 but doesn't work in FF3, and this is for a page with no font family defined.  So whatever basic style FF3 is applying doesn't work with combining characters on Windows XP (I haven't tested it on Linux).  Maybe some fonts that a web designer calls for won't work but the defaults should work (especially since they USED to).
It may be necessary to be more specific.

I'd expect Firefox to use the fonts and font rendering available on Windows, rather than trying to implement its own rendering. As on any other operating system, language support, fonts and rendering technologies change over time.

Win2000 onwards used one method to support Vietnamese tone markers. OpenType mark and mkmk feature support was then introduced in WinXP-SP2.

If it's part of Microsoft's Vietnamese repertoire and the default font being used has Windows-1258/Vietnamese support, then maybe there is a bug.

If it's a base character + combining diacritic outside Microsoft's Vietnamese repertoire and on WinXP-SP1 or earlier, then there is no rendering support.

If it's a base character + combining diacritic outside Microsoft's Vietnamese repertoire and on WinXP-SP2, then most likely it's a font issue. WinXP-SP2 has the rendering support, but the default install has no appropriate fonts.

If it's Windows Vista, and using core fonts as default fonts, then it's a bug in Firefox.

Basically, to determine if it's a bug in Firefox, you'd need to identify

* the version of windows, 
* the actual font being used by default (including the version number of the font), i.e. Arial v. 3.06 will not work, but Arial 5.0 will.
* the version of uniscribe Firefox is using
* which groupings of base character + combining diacritics work or don't work?

If it's a combination that works in my other applications, then I'd expect it to work in Firefox. If it doesn't work in other Uniscribe-aware applications then I wouldn't expect it to work in Firefox either.

An example that doesn't specify an appropriate font isn't likely to work unless you're on Vista and using a core font by default, and even that may not work for some combinations.

The best approach for pre-Windows XP-SP2 installations is to patch Firefox to use Graphite.

Just my two cents worth

I only expect Firefox to support what actually is supported by the OS, unless you want a completely separate rendering system in Firefox.

Andrew
I don't have all the information on hand as it's my work PC that has the relevant test builds of Firefox installed, but it is definitely Windows XP SP2.  As for the font that is used, it is whatever font Firefox 3 uses on Windows XP SP2 whenever you don't specify any font at all anywhere on a page.

I'm afraid I don't know anything about uniscribe so I can't comment on that unless you can tell me how I can tell which version I am using.

As for the groupings of base character and diacritic, the ā fails for me on Windows XP SP2.  I just tested it on Fedora 8 and it works fine (but the Chinese font that is selected looks very odd... another annoying problem but not as bad).

I can sympathize with your comment about supporting what the OS provides, but why does it work fine in FF2 on the exact same setup?
On Win XP-SP2 there are no fonts available that are designed to position the combining macron correctly. You'd need to install a third-party font and set it up as your default.

You'd also have to enable complex script rendering on Win XP-SP2, since this isn't active by default.

Why did it work on FF2? A few possibilities:

* FF2 was tweaking something it shouldn't have been.
* The default font you were using was different: a different font, or maybe a different version.
* Alternatively, it could have been a fluke.

Some fonts like Arial Unicode MS don't support diacritic positioning, but give somewhat adequate results with certain characters, i.e. a, e, o, u with macron, but you see rendering issues with the i or any of the uppercase characters. The glyph is just placed in a default location that works with some letters, doesn't with others, and tends to overstrike capital letters.

Basically, correct positioning of diacritics and diacritic stacking is a font and rendering system issue. Win XP-SP2 has the rendering capabilities if you enable complex script support, but needs you to download and install appropriate fonts to make use of this rendering.

Andrew

I can confirm that (for my testcase) I am only testing the affected page on systems that have Asian language support enabled.

Personally I still see this as a regression and I'm sure most users will agree that something which used to work should continue to work.  Is there a workaround that works on Windows XP SP2 with Asian language support, besides just trying every font until I find one that works, and hoping that one is installed on my users' systems?  The vexing problem is that this only affects FF3 on one platform.  The exact same page works on every other combination of browser/platform I've tried (IE6, IE7, Opera, Safari, FF2: all XP SP2 w/ Asian language support, Vista FF2, IE7, Safari 3, Fedora 7/8 FF2, FF3, Konqueror).
(In reply to comment #32)
> I can confirm that (for my testcase)

Which is your testcase?
(In reply to comment #33)
> (In reply to comment #32)
> > I can confirm that (for my testcase)
> 
> Which is your testcase?

I guess it wasn't carried over when my bug was marked a dupe of this one, see the files attached to https://bugzilla.mozilla.org/show_bug.cgi?id=407911
I'm kind of jealously looking at the Xubuntu screenshot. On Gentoo Linux, things seem to be completely broken.
Just wanted to comment that I verified that Firefox 2 on Windows 2000 DOES place the combining characters properly using the same test case I attached to bug 407911.  So I think the discussion about Windows XP's font support is a red herring.
Screen shot of Firefox 2.0.0.15 rendering behaviour when a combining macron is present in data.
Hi Mark,

I just tested on Firefox 2.0.0.15 on Windows XP-SP2, and the results are dependent on
the font.

If I remember your test correctly, you tested a lowercase a and a combining
diacritic. The lowercase vowels a, e, o, u are of approximately similar widths, so at lower
point sizes a well-designed font may position the diacritic correctly without having
appropriate support for combining diacritics.

A better test would include:

* uppercase letters - fonts without mark and mkmk opentype features will
overstrike uppercase characters.
* the letter i - fonts without mark and mkmk opentype features will not centre
macron over lowercase letter, will leave dot above i. Will overstrike uppercase
I, with macron off centre.
* the letters m and M. macron over lowercase m will be off centre. The letter
is too wide for simple placement to get it right. Uppercase M will be off
centre and overstrike letter.

For a more thorough test, try adding a second stacking diacritic to the mix and
see what happens. OpenType fonts with the mark feature may also have the
mkmk feature.

From tests locally on Firefox 2, I'd suggest that your results were more pot
luck than an actual indication of text layout and font rendering support.

I'll run my test on Firefox 3 as well to confirm.

Characters used: 
ā Ā ē Ē ī Ī ō Ō ū Ū m̄ M̄
ā́ Ā́ ḗ Ḗ ī́ Ī́ ṓ Ṓ ū́ Ū́ m̄́ M̄́

See screenshot above.
Might be worth documenting internal Windows support for combining diacritics.

Windows 2000 introduced support for Vietnamese. Microsoft's Vietnamese Unicode support was based on the model they used for the Windows-1258 codepage, i.e. the diacritics U+0300 U+0301 U+0303 U+0309 U+0323 could be used with the following base characters: a A ă Ă â Â e E ê Ê i I o O ô Ô ơ Ơ u U ư Ư y Y

Fonts that supported Vietnamese using this model would indicate support for the Windows-1258 codepage. Most non-Microsoft Vietnamese Unicode solutions used precomposed characters instead of using combining diacritics.

Any language using a subset of the above Vietnamese character + combining diacritic combinations would display correctly on Windows 2000 onwards.
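
As a rough illustration of that rule, the repertoire can be written down as two sets and a sequence checked against them. The function name is hypothetical, and uppercase Â is included alongside â on the assumption that the list pairs each lowercase letter with its uppercase form.

```python
# Tone marks listed above: grave, acute, tilde, hook above, dot below.
VIETNAMESE_MARKS = {"\u0300", "\u0301", "\u0303", "\u0309", "\u0323"}

# Base letters listed above, written with escapes for the non-ASCII ones:
# a-breve, a-circumflex, e-circumflex, o-circumflex, o-horn, u-horn.
VIETNAMESE_BASES = set(
    "aAeEiIoOuUyY"
    "\u0103\u0102\u00e2\u00c2\u00ea\u00ca"
    "\u00f4\u00d4\u01a1\u01a0\u01b0\u01af"
)


def in_vietnamese_repertoire(cluster):
    """True if cluster is one listed base letter followed by one listed mark."""
    return (
        len(cluster) == 2
        and cluster[0] in VIETNAMESE_BASES
        and cluster[1] in VIETNAMESE_MARKS
    )


print(in_vietnamese_repertoire("a\u0301"))  # a + acute: inside the repertoire
print(in_vietnamese_repertoire("a\u0304"))  # a + macron: outside it
print(in_vietnamese_repertoire("m\u0301"))  # m is not a listed base letter
```

Under the description above, only sequences for which this check passes would be expected to display correctly with Windows-1258 fonts on Windows 2000 and later.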

Nothing much changed until Windows XP (Service Pack 2) which included an updated version of usp10.dll which had support for the mark and mkmk OpenType features for the Latin and Cyrillic scripts. Microsoft did not update its core fonts at this time. So only a handful of third party OpenType fonts were available with more complete combining diacritic and diacritic stacking support. For combining diacritics to work, Latin and Cyrillic scripts need to be treated as Complex scripts. On Windows XP-SP2 this means that the complex script language collection needs to be installed. 

Windows Vista introduced updated core fonts with combining diacritic support.

The only time I've seen Firefox supporting combining diacritics on versions of Windows prior to WinXP-SP2 was with versions of Firefox 1.5 patched to use the Graphite rendering engine rather than Uniscribe.

Andrew

(In reply to comment #35)
> Created an attachment (id=329215) [details]
> screenshot: FF3/xulrunner1.9 on Gentoo Linux

Please see bug 425651 for Linux issues.  (This bug is for MS Windows.)
The failure of modern browsers to work properly with African scripts has been identified as one of the reasons why the Wikipedias in African languages are failing.

At this moment only Apple's Safari is able to show these characters in edit mode properly. To me this is proof that it can be done and that it is also a browser issue.

http://ln.wikipedia.org/w/index.php?title=Utilisateur:GerardM&action=edit 

This is the page that should be good. It is ok for 

http://ln.wikipedia.org/wiki/Utilisateur:GerardM

Thanks,
     GerardM
     member language committee of the Wikimedia Foundation
This seems to be a bug that only happens on Windows XP, which is a pretty big deal in that a plurality (if not a majority) of our users are on it. Is it fixable? Or should we simply punt?

(John, feel free to unassign if you're not interested in working on this.)
Assignee: pavlov → jdaggett
It happens on Windows XP under certain conditions.

It is not a Firefox bug; it is an issue relating to fonts and font rendering systems on various operating systems.

On pre-Vista Windows platforms the situation is complicated. The only keyboard layout shipped with Windows that uses combining diacritics is the Vietnamese keyboard. It uses combining diacritics for the five tone markers. If fonts designed to support the Windows-1258 codepage are used, and if the base character and combining diacritic combinations are ones used in Microsoft's representation of Vietnamese data (neither NFC nor NFD) ... then data should display correctly. In Windows XP, and Windows XP SP2, this will be the only support you will find for combining diacritics.

Windows XP Service Pack 2 included an updated version of Uniscribe that added support for OpenType fonts that were designed for supporting combining diacritics. NO system fonts were updated and shipped with combining diacritic support (other than what was already available for Vietnamese).

To enable combining diacritic support on WinXP SP2/3 you need to enable complex script support and install OpenType fonts with the base character + diacritic support you require. The SIL Latin fonts and DejaVu Sans fonts would be appropriate.

As Gerard indicated, this particularly affects African languages. But the issue is twofold. Firefox will work correctly and display correctly on Win XP SP2/3 if users have already enabled Windows to support their language. If they are still having problems, then the issue is more likely to be a two-pronged problem:
1) poorly designed font stack in a site's stylesheet, and
2) poor font fallback mechanism (i.e. Firefox not selecting an appropriate fallback font in the absence of a required font).

The other source of problems is that the computer in question has never been configured to support the language that is being used.

Gerard, I have some old notes I wrote ages ago on Windows XP support for African languages; I'll send the link to you.

Andrew
Assignee: jdaggett → nobody
Status: ASSIGNED → NEW
This is still happening all the way into Win 7.
http://www.reddit.com/r/funny/comments/1uxbeb/state_alters_420_mm_sign_to_thwart_thieves/cemz5q8

However as the same rendering in FF occurs in IE11, looks like the princess is in another castle.
The Testcase is WFM using Nightly 49.0a1 (2016-05-27) on Windows 10 Desktop
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WORKSFORME
Version: Trunk → 1.9.2 Branch