support Unicode combining characters
Categories
(Core :: Layout: Text and Fonts, defect, P3)
Tracking
()
| | Tracking | Status |
|---|---|---|
| firefox68 | --- | fixed |
People
(Reporter: gwalla, Assigned: jfkthame)
References
()
Details
(Keywords: intl)
Attachments
(1 file)
Mozilla currently does not support combining characters in Unicode. These are characters that are supposed to be rendered in combination with (above, below, around, overlaid on, etc.) the preceding character (for example: accents). Mozilla displays them, but places them after the preceding character as if they were stand-alone spacing characters. http://www.phon.ucl.ac.uk/home/wells/ipa-unicode.htm is a page on IPA in Unicode, which can be used as a test page for this functionality, since the IPA relies heavily on combining characters for diacritics.
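For anyone reproducing this, a minimal Python sketch of what "combining character" means at the code point level may help. It uses only the standard `unicodedata` module; the helper name `describe` is made up for illustration and has nothing to do with Mozilla code.

```python
import unicodedata

def describe(text):
    """Return (code point, name, combining class) for each character in text."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "<unnamed>"), unicodedata.combining(ch))
        for ch in text
    ]

# "e" followed by U+0301 COMBINING ACUTE ACCENT: two code points, one visual unit.
decomposed = "e\u0301"
for row in describe(decomposed):
    print(row)

# NFC normalization composes this pair into the precomposed character U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# U+2227 LOGICAL AND + U+030A COMBINING RING ABOVE has no precomposed form,
# so normalization leaves it as two code points and the renderer must
# position the mark itself.
ring = "\u2227\u030a"
ring_nfc = unicodedata.normalize("NFC", ring)
```

A nonzero combining class marks a character that attaches to the preceding base. Pairs with no precomposed equivalent are the ones that depend entirely on the text renderer and the font.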
Combining characters bugs seem to be being marked duplicates of bug 60546. Does that make sense here too? (Also, should we have separate bugs for separate platforms, or is there core work that needs to be done as well?)
Reporter | ||
Comment 2•21 years ago
It seems likely that they stem from the same problem. I'm using the ClearlyU font and Adobe's Helvetica Unicode. (Also, fixing typo in summary.)
Comment 3•21 years ago
Separate platforms need separate solutions. On Windows, the newest version of the Uniscribe DLL (which comes _only_ with Office 2003) supports combining diacritic marks for Latin/Greek/Cyrillic. So, if you have Office 2003 and OpenType fonts for Latin/Cyrillic/Greek with the necessary GSUB/GPOS tables, it should just work for 'unjustified' text. For justified text, it doesn't work. On Linux, even with my patch for bug 215219, Mozilla won't be able to render combining diacritic marks correctly (well, with fonts like Code2000 it works more or less, because it uses 'zero-advance' glyphs for combining diacritics), because Pango doesn't support combining diacritic marks for Latin/Greek/Cyrillic. On Mac, ATSUI may already do a reasonable job, but we don't use ATSUI for 'string drawing'. Anyway, it's not a good idea (even if it's possible) to roll our own code to overcome the platform limitations here.
Comment 4•20 years ago
*** Bug 162421 has been marked as a duplicate of this bug. ***
Comment 5•20 years ago
I'm informed that 1.7a on Windows has some level of support for combining characters - a level that isn't matched by the Mac OS X version. See the attempted demo at http://www.epinions.com/content_3826622596
Comment 6•20 years ago
On Linux, this depends on bug 215219 and Pango (there's a Pango bug on combining diacritic marks for Latin/Greek/Cyrillic). On Windows, this depends on bug 218887. For Mac, see bug 205476 and bug 121546.
Comment 7•20 years ago
*** Bug 273901 has been marked as a duplicate of this bug. ***
Seems like this bug is already solved, but when? I háve thís thíng wórking OK. Wíndows XP, Firefox 2.0.0.3 (worked as wéll in 1.5).
Comment 9•17 years ago
(In reply to comment #8)
> Seems like this bug is already solved, but when?
>
> I háve thís thíng wórking OK. Wíndows XP, Firefox 2.0.0.3 (worked as
> wéll in 1.5).
Well, as you write: Windows. Your text does not display correctly for me on FF 2.0.0.3 on quite recent Gentoo Linux.
Comment 10•17 years ago
Looks similar to bug 85373.
Comment 11•16 years ago
Looks like 1.9 has this solved. T̜̀h̆î̤s ṭ̇ẹ̇x̰̎t looks q̌́uíţe g̈öȯd to me on Fedora if I use the DejaVu fonts. Right now it seems to be the font's fault if the marks are misplaced.
Comment 12•15 years ago
ACK, looks nice.
Comment 13•13 years ago
So, can someone responsible for QA confirm this is fixed? The text in comment #11 looks fine to me.
Comment 14•11 years ago
(In reply to Jens Müller (:tessarakt) from comment #13)
> So, can someone responsible for QA confirm this is fixed?
>
> The text in comment #11 looks fine to me.
This is still broken. I'm on OS X and the text in comment #11 looks like ****. Opera and Safari do fine.
My example text to add to this discussion is: ∧̊ . It's a mathematical 'and' mixed with a combining circle above.
Assignee | ||
Comment 15•11 years ago
(In reply to elfprince13 from comment #14)
> (In reply to Jens Müller (:tessarakt) from comment #13)
> > So, can someone responsible for QA confirm this is fixed?
> >
> > The text in comment #11 looks fine to me.
>
> This is still broken. I'm on OS X and the text in comment #11 looks like
> ****. Opera and Safari do fine.
>
> My example text to add to this discussion is: ∧̊ . It's a mathematical
> 'and' mixed with a combining circle above.
Both comment 11 and your example look fine here on OS X (10.7). The results will depend, however, on your font preferences (or on the fonts chosen by the web page) -- if the font being used doesn't support the diacritics, so that fallback has to pick a different font for some of them, it's likely to look bad.
You can determine what font(s) are being used for a given fragment of text by selecting the text and looking in the Fonts panel of the Inspector tool (Tools / Web Developer / Inspector), or installing the fontinfo add-on[1] and then choosing Show Fonts in Selection from the right-click (context) menu for the selected text.
In my case, the comments here are displayed in Courier (as the default monospaced font), and look fine. If you've changed your font preferences, you may see different results.
[1] https://addons.mozilla.org/en-US/firefox/addon/fontinfo/
Comment 16•11 years ago
Yes, I can fiddle with fonts with varying results here in FF (on OS X 10.8); however, on Safari and Opera I'm getting correct rendering results regardless of which font I have selected. For example, if I set the styling to Menlo,courier,mono (respecting system defaults for mono-fonts), I get reasonable results in FF. However, if I set Verdana,sans or Arial,sans I still get reasonable results in Safari and Opera, and FF breaks. Please stop blaming the fonts: both glyphs are rendered, they just render with incorrect positioning.
Assignee | ||
Comment 17•11 years ago
I see varying results, depending on the browser and font choices. Note that neither Verdana nor Arial include the ∧ character, and so font fallback comes into play and some other font must be used. And Verdana doesn't have the combining ring either, so that also falls back to a different font. Results will be variable, depending what particular font(s) happen to be used. One thing that could help here would be bug 543200, as the worst results generally occur if the base character and diacritic end up being rendered with different fonts due to per-character fallback behavior. But for more reliable results, authors should explicitly choose fonts that support the characters they're using.
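To illustrate why per-character fallback splits a base character from its diacritic, here is a small Python sketch. The font names and coverage sets are hypothetical, and the per-cluster variant is just one possible alternative for comparison; it is not a description of Gecko's code or of what bug 543200 proposes.

```python
import unicodedata

# Hypothetical coverage tables; real fonts and their character sets will differ.
FONTS = {
    "PrimaryFont": set("abc"),          # lacks both U+2227 and U+030A
    "SymbolFont": {"\u2227"},           # has the base character only
    "FullFont": {"\u2227", "\u030a"},   # has the base and the combining mark
}
ORDER = ["PrimaryFont", "SymbolFont", "FullFont"]

def clusters(text):
    """Group each base character with the combining marks that follow it."""
    out = []
    for ch in text:
        if out and unicodedata.combining(ch):
            out[-1] += ch
        else:
            out.append(ch)
    return out

def per_character(text):
    """Pick the first font that covers each character independently."""
    return [(ch, next((f for f in ORDER if ch in FONTS[f]), None)) for ch in text]

def per_cluster(text):
    """Pick the first font that covers every character of a cluster."""
    return [
        (cl, next((f for f in ORDER if set(cl) <= FONTS[f]), None))
        for cl in clusters(text)
    ]

sample = "\u2227\u030a"
print(per_character(sample))  # base and mark may land in different fonts
print(per_cluster(sample))    # the whole cluster is kept in one font
```

With these made-up tables, the per-character walk assigns the base to SymbolFont and the ring to FullFont, so no single font's positioning data sees both glyphs, which matches the "worst results" case described above.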
Comment 18•10 years ago
This doesn't look good in firefox: ∇×
Comment 19•10 years ago
Hm.. 2 sentences of my message didn't load. I was saying that some combining chars don't look good in Firefox, for example this one: (I hope it gets displayed).
Comment 20•10 years ago
Well, it doesn't in that case; look here: http://kasperpeulen.github.io/CoffeeTeX/example.html At the beginning it does work; in the last paragraph, however, Firefox fails, while all other browsers render it properly.
Comment 21•10 years ago
First of all, the text in comments 18 and 19 was cut off because of bug 405011. Taking the text from the last paragraph in the testcase, I assume the issue is with the combinations U+1D404 MATHEMATICAL BOLD CAPITAL E (and other letters) with U+20D7 COMBINING RIGHT ARROW ABOVE. They seem to work well for me here on Firefox 32.0.3 in Ubuntu, so this may be a font issue as described in comment 17. The testcase as data URI: data:text/html;charset=utf-8,%E2%88%87%C3%97%F0%9D%90%81%E2%83%97%20-%E2%80%89%201%E2%88%95c%E2%80%89%20%28%E2%88%82%F0%9D%90%84%E2%83%97%29%E2%88%95%28%E2%88%82t%29%20%26%20%3D%20%284%CF%80%29%E2%88%95c%20%F0%9D%90%A3%E2%83%97%20%E2%86%B5
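In case it is useful for checking which sequences the testcase actually contains, the data URI above can be decoded with a few lines of Python. This is only a convenience sketch using the standard library; the helper name `marked_bases` is made up.

```python
import unicodedata
from urllib.parse import unquote

DATA_URI = (
    "data:text/html;charset=utf-8,"
    "%E2%88%87%C3%97%F0%9D%90%81%E2%83%97%20-%E2%80%89%201%E2%88%95c%E2%80%89%20"
    "%28%E2%88%82%F0%9D%90%84%E2%83%97%29%E2%88%95%28%E2%88%82t%29%20%26%20%3D%20"
    "%284%CF%80%29%E2%88%95c%20%F0%9D%90%A3%E2%83%97%20%E2%86%B5"
)

# unquote() decodes percent-escapes as UTF-8 by default.
payload = unquote(DATA_URI.split(",", 1)[1])

def marked_bases(text):
    """Return (base, mark) code point pairs for every combining mark in text."""
    pairs = []
    for prev, ch in zip(text, text[1:]):
        if unicodedata.combining(ch):
            pairs.append((f"U+{ord(prev):04X}", f"U+{ord(ch):04X}"))
    return pairs

for base, mark in marked_bases(payload):
    print(base, unicodedata.name(chr(int(base[2:], 16))), "+", mark)
```

Every combining mark in the payload is U+20D7, and each one follows a MATHEMATICAL BOLD letter, which agrees with the assumption about where the problem lies.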
Comment 22•5 years ago
Still valid =)
Assignee | ||
Comment 23•5 years ago
In general, Firefox does support Unicode combining characters, but this is subject to suitable fonts being used. What doesn't work well is the use of combining characters when they are not supported in the font being used for the rest of the text; in that case, font fallback will kick in and generally yield a poor result, as noted in comment 17.
In the specific example mentioned in comment 20, where we see things like <MATHEMATICAL BOLD CAPITAL E, COMBINING RIGHT ARROW ABOVE>, the problem is that the font being used does not include OpenType positioning rules for the accent. So it appears at its default position as designed in the font, which clashes with tall letters.
If the font were TrueType, fallback diacritic positioning behavior would come into effect, which would raise the accent appropriately, but the current code in gfxHarfBuzzShaper.cpp[1] does not support this for OpenType/CFF fonts, which is the format used here.
So besides bug 543200, mentioned above in comment 17, another thing that would help certain cases (including the comment 20 example) would be to implement fallback positioning support for OpenType/CFF fonts.
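As a rough illustration of why fallback positioning needs glyph extents, here is a Python sketch. It is not HarfBuzz's actual algorithm; the `Extents` type, the `raise_mark_above` helper, the gap value and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Extents:
    """Glyph ink box in font units, y increasing upward (hypothetical type)."""
    y_min: float
    y_max: float

def raise_mark_above(base: Extents, mark: Extents, gap: float = 50.0) -> float:
    """Return the upward shift needed so the mark clears the top of the base.

    If the mark's default position already sits at least `gap` units above
    the base, no shift is applied.
    """
    needed_bottom = base.y_max + gap
    return max(0.0, needed_bottom - mark.y_min)

# A mark designed to sit above x-height letters (made-up numbers).
arrow = Extents(y_min=560.0, y_max=700.0)
small_letter = Extents(y_min=0.0, y_max=500.0)
capital_letter = Extents(y_min=0.0, y_max=700.0)

print(raise_mark_above(small_letter, arrow))    # default position is already clear
print(raise_mark_above(capital_letter, arrow))  # mark must be raised
```

Without the ink box of the base and the mark there is nothing to compute the shift from, so a shaper that cannot obtain extents for a font can only leave the accent at its default position.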
Assignee | ||
Comment 24•5 years ago
This provides glyph-extents support for these fonts, so that fallback diacritic positioning can work.
In principle we could try switching to the hb_ot_font functions for all fonts, but this carries
some risk of regressing other behavior: (1) on some platforms, our glyph-advance callbacks use platform
APIs rather than reading the font file directly, in order to respect hinting that may be in effect;
and (2) the hb_ot_font functions don't currently provide fallbacks for CJK Compatibility Ideographs
Standardized Variants, as implemented for Gecko in bug 989557, so that case would be regressed.
Hence, for the time being, switching only for OpenType/CFF fonts, where the thebes callbacks are
known to be incomplete (no glyph-extents support), is the safer, more conservative approach.
Comment 25•5 years ago
Pushed by jkew@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/76555eaa45e1 For OpenType/CFF fonts, use harfbuzz ot-font functions rather than thebes callbacks. r=jrmuizel
Comment 26•5 years ago
bugherder |