Closed Bug 670495 Opened 14 years ago Closed 13 years ago

multiple text rendering regressions with Unicode combining marks on Windows

Categories

(Core :: Layout: Text and Fonts, defect)

x86
Windows XP
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: bugzilla, Unassigned)

Details

(Keywords: testcase)

Attachments

(3 files)

The following minimal test case rendered correctly in Firefox 3.6.18 on Windows and MANY previous versions of Firefox (it also renders correctly in IE6 and later, and in Chrome and Safari on all platforms I have tried), but it now renders incorrectly in Firefox 5 on Windows:

http://lurkertech.com/bugzilla/testcm.html

The text renders incorrectly in two different ways for the common Windows fonts "Segoe UI" and "Arial Unicode MS."

For "Arial Unicode MS" (shown in the test page above), you can see that the dot over the "i" does not disappear and the mark is not centered. I am not sure whether "i" is the only affected base character.

For "Segoe UI," the problems are even worse, as can be seen in the test page above. All of the grave and acute accents render incorrectly! This pretty much wipes out every European language!

All my tests were done on machines with Windows XP.

The problem is almost certainly related to the fact that the UTF-8 Unicode text in the test file uses Unicode combining marks, such as:

U+0300 COMBINING GRAVE ACCENT

This is a pretty serious regression, because even incredibly common sequences found in many European languages now render incorrectly. Until now, "Arial Unicode MS" and "Segoe UI" were the only Windows fonts that rendered combining marks correctly. Now there are NO Windows fonts that seem to render the test sequence correctly in Firefox 5.

Please do NOT just say "well don't use combining marks." There are MANY important sequences in many languages that CANNOT be represented with pre-composed characters, because the desired combination of base+mark does not exist in Unicode. Plus, it's just plain wrong not to render combining marks correctly.

IMPORTANT: You should be aware that with some other popular Windows fonts like "Tahoma" and "Arial" ("Arial" is not the same as "Arial Unicode MS"), the test case I have provided has never rendered correctly, due to a bug in those fonts from Microsoft. So don't get confused by accidentally using those in your test. "Arial Unicode MS" and "Segoe UI" have always worked, and now the regression is definitely in Firefox.

"Arial Unicode MS" and "Segoe UI" are the fonts that many of us rely on for websites that need to use combinations of combining marks for which no pre-composed character exists. Yes, this isn't politically correct, but we were simply left no choice but to include OS-specific font names, because there was no alternative that gave correct rendering. Now, it seems, there is no way to get correct rendering at all with Firefox 5.

Please fix! Thanks.
Attaching the image testcm1.png showing correct and incorrect rendering with "Arial Unicode MS"
Attaching the image testcm2.png showing correct and incorrect rendering with "Segoe UI"
Attachment #545047 - Attachment mime type: text/plain → text/html
Component: General → Layout: Text
Keywords: testcase
Product: Firefox → Core
QA Contact: general → layout.fonts-and-text
Thank you for the bug report! For "Segoe UI", this looks like bug 654057. Can you try the beta of Firefox 6? I assume that fixes the "Segoe UI" problem; does it also fix the "Arial Unicode MS" problem?
Hello,

> Can you try the beta of Firefox 6? I assume that fixes the "Segoe UI" problem; does it also fix the "Arial Unicode MS" problem?

I just downloaded and tried Firefox 6 Beta (6.0b1). In that version, the "Segoe UI" problem is fixed, but the "Arial Unicode MS" problem is NOT fixed: with "Arial Unicode MS," the letter "i" with combining marks continues to render in the same incorrect way as it did with Firefox 5.
Thanks for checking that. Jonathan, could you take a look?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Right, the Segoe UI issue was bug 654057, now fixed. The problem with Arial Unicode is that the font does not include proper OpenType layout support for the combining marks. Some text systems - including Uniscribe and DirectWrite - will try in many cases to replace sequences of <base character, combining mark> with a single precomposed character (the Unicode NFC normalization form), and this can make certain combinations appear to work nicely even though they are not directly supported by the font. Harfbuzz does not currently implement this behavior, however, and so if the font itself does not properly support the sequence (e.g. by providing 'ccmp' features that implement the base+mark composition, or contextual substitution to replace the 'i' with a dotless form when an accent is present, plus 'mark' positioning features to align the accent nicely), then the result may be poor. This is fundamentally a deficiency in the font, although to some extent it can be mitigated by implementing NFC composition in the text layout system, and I believe Behdad intends to do this in harfbuzz eventually.
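The composition fallback described in the comment above can be sketched in a few lines of Python. This is only an illustration of the idea, not how Uniscribe, DirectWrite, or harfbuzz actually implement it: the function name `compose_with_font_fallback` and the `supported` set (standing in for a font's character coverage) are hypothetical, and only the standard `unicodedata` module is used.

```python
import unicodedata

def compose_with_font_fallback(text, supported):
    """Replace <base, mark> pairs with a precomposed character when NFC
    defines one AND the (hypothetical) font coverage set contains it."""
    out = []
    for ch in text:
        # A non-zero canonical combining class identifies a combining mark.
        if out and unicodedata.combining(ch):
            composed = unicodedata.normalize("NFC", out[-1] + ch)
            if len(composed) == 1 and composed in supported:
                out[-1] = composed  # use the precomposed form
                continue
        out.append(ch)  # otherwise keep the character as-is
    return "".join(out)

# Assumed coverage: the font has "i", "q", U+00EC and the bare grave accent.
font_coverage = {"i", "q", "\u00ec", "\u0300"}

# "i" + U+0300 COMBINING GRAVE ACCENT folds into U+00EC.
print(compose_with_font_fallback("i\u0300", font_coverage) == "\u00ec")
# "q" + U+0300 has no precomposed form, so the pair is left alone and
# correct rendering still depends on the font's own layout tables.
print(compose_with_font_fallback("q\u0300", font_coverage) == "q\u0300")
```

The second case is the one the reporter is concerned about: composition can hide a font's missing layout support only for pairs that Unicode happens to encode as a single character.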
Thanks for the clear explanation. Since this is a regression from Firefox 3.6.18, what changed? Was Firefox 3.6.18 not using harfbuzz? Please do add that support, as a whole lot of websites have come to rely on these small sets of "complete" fonts on each platform for proper text display with combining marks. It would be nice if we could force the font to use the new elegant facilities, but sadly that's not gonna happen as that font is distributed with every version of Microsoft Office and many other Microsoft tools in its current form.
> Was Firefox 3.6.18 not using harfbuzz?

Correct.
I actually have added compose()/decompose() callbacks to harfbuzz already and am implementing those in glib. Have not changed the cmap loop to use them yet. It's tricky, but I expect to get that done this week. As for mark advances, trying to cover all broken cases I've seen, I'm coming to the conclusion that ignoring GDEF and setting zero advance on all Unicode combining marks and perhaps enclosing marks is best. Do we have evidence to the contrary?
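As a rough illustration of the mark-advance idea in the comment above, the following Python sketch zeroes the advance of combining marks based purely on their Unicode general category, ignoring anything the font says. It is an assumption-laden sketch, not harfbuzz code: the function names, the `(character, advance)` glyph representation, and the mapping of "combining marks" to category `Mn` and "enclosing marks" to category `Me` are all my own choices for the example.

```python
import unicodedata

def is_zero_advance_mark(ch, include_enclosing=True):
    """Decide from the Unicode general category alone (no GDEF lookup)
    whether a character should get zero advance."""
    category = unicodedata.category(ch)
    if category == "Mn":  # nonspacing (combining) mark
        return True
    # Enclosing marks are the "perhaps" case, so they are made optional here.
    return include_enclosing and category == "Me"

def zero_mark_advances(glyphs, include_enclosing=True):
    """glyphs is a list of (character, advance) pairs; the layout is hypothetical."""
    return [
        (ch, 0 if is_zero_advance_mark(ch, include_enclosing) else advance)
        for ch, advance in glyphs
    ]

# The advance values below are made-up font units.
shaped = [("i", 500), ("\u0300", 300), ("\u20dd", 800)]
print(zero_mark_advances(shaped))
print(zero_mark_advances(shaped, include_enclosing=False))
```

Spacing combining marks (category `Mc`) keep their advance in this sketch, since they are meant to occupy horizontal space.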
FWIW, the Arial Unicode Issue is still visible on Mozilla/5.0 (Windows NT 5.1; rv:10.0a1) Gecko/20111014 Firefox/10.0a1 ID:20111014030948. Has this been processed in the Harfbuzz Repo yet?
Version: 5 Branch → Trunk
Yes, HarfBuzz does composing now.
...and we implemented the relevant callbacks to enable this in bug 728866. So it'd be worth re-testing things here with a current Nightly build to see where we stand now.
The last remaining issues (with some letters in Segoe UI) have gone away for me in the current nightly (not aurora/beta yet), probably due to some recent Harfbuzz update, maybe bug 780409? Note that I am using Windows 7 with the latest font versions and DW shaping, so issues might still exist for the OP for other reasons.
AFAIK, the issues here are now fixed, mainly thanks to a number of updates in harfbuzz. Closing this as WorksForMe. If there are remaining issues, please file new bugs with specific testcases for anything that's still a problem.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WORKSFORME