Closed Bug 197649 Opened 17 years ago Closed 7 months ago

support Unicode combining characters

Categories

(Core :: Layout: Text and Fonts, defect, P3)

x86
Linux
defect

RESOLVED FIXED
mozilla68
Tracking Status
firefox68 --- fixed

People

(Reporter: gwalla, Assigned: jfkthame)

Details

(Keywords: intl)

Attachments

(1 file)

Mozilla currently does not support combining characters in Unicode. These are
characters that are supposed to be rendered in combination with (above, below,
around, overlaid on, etc.) the preceding character (for example: accents).
Mozilla displays them, but after the preceding character, as if they were
stand-alone spacing characters.
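
To make the distinction concrete, here is a minimal Python sketch (standard library only; the sample string is an illustrative assumption, not taken from the test page). It shows that a combining character is a separate code point with a non-zero canonical combining class, which a renderer is expected to attach to the preceding base character rather than draw as a spacing character:

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT: two code points, one visible character.
decomposed = "e\u0301"

for ch in decomposed:
    # combining() returns the canonical combining class; 0 means a base (spacing) character.
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)} ccc={unicodedata.combining(ch)}")

# NFC normalization composes the pair into a precomposed character where one exists.
composed = unicodedata.normalize("NFC", decomposed)

# Many pairs have no precomposed form, so the renderer must position the mark itself.
# U+2227 LOGICAL AND + U+030A COMBINING RING ABOVE is one such pair.
no_precomposed = unicodedata.normalize("NFC", "\u2227\u030A")
print(len(composed), len(no_precomposed))
```

Normalization only helps for the pairs that have a precomposed equivalent; for everything else, correct display depends on the text layout engine and the font.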

http://www.phon.ucl.ac.uk/home/wells/ipa-unicode.htm is a page on IPA in
Unicode, which can be used as a test page for this functionality, since the IPA
relies heavily on combining characters for diacritics.
Combining-character bugs seem to be getting marked as duplicates of bug 60546.  Does
that make sense here too?  (Also, should we have separate bugs for separate
platforms, or is there core work that needs to be done as well?)
It seems likely that they stem from the same problem.

I'm using the ClearlyU font and Adobe's Helvetica Unicode.

(also, fixing typo in summary)
Summary: support Unicode combiing characters → support Unicode combining characters
Priority: -- → P3
Target Milestone: --- → Future
Separate platforms need separate solutions. On Windows, the newest version of
the Uniscribe DLL (which comes _only_ with Office 2003) supports combining
diacritic marks for Latin/Greek/Cyrillic. So, if you have Office 2003 and
OpenType fonts for Latin/Cyrillic/Greek with the necessary GSUB/GPOS tables, it
should just work for 'unjustified' text. For justified text, it doesn't work.


On Linux, even with my patch for bug 215219, Mozilla won't be able to render
combining diacritic marks correctly (well, with fonts like Code2000, it works
more or less because it uses 'zero-advance' glyphs for combining diacritics)
because Pango doesn't support combining diacritic marks for Latin/Greek/Cyrillic.

On Mac, ATSUI may already do a reasonable job, but we don't use ATSUI for
'string drawing'. 

Anyway, it's not a good idea (even if it's possible) to roll our own code to
work around the platform limitations here.

Keywords: intl
*** Bug 162421 has been marked as a duplicate of this bug. ***
I'm informed that 1.7a on Windows has some level of support for combining
characters - a level that isn't matched by the Mac OS X version.

See the attempted demo at http://www.epinions.com/content_3826622596
On Linux, this depends on bug 215219 and Pango (there's a Pango bug on combining
diacritic marks for Latin/Greek/Cyrillic).

On Windows, this depends on bug 218887. For Mac, see bug 205476 and bug 121546.
Depends on: CTL2
Depends on: 214715
*** Bug 273901 has been marked as a duplicate of this bug. ***
Seems like this bug is already solved, but when?

I háve thís thíng wórking OK.  Wíndows XP, Firefox 2.0.0.3 (worked as wéll in 1.5).
(In reply to comment #8)
> Seems like this bug is already solved, but when?
> 
> I háve thís thíng wórking OK.  Wíndows XP, Firefox 2.0.0.3 (worked as
> wéll in 1.5).

Well, as you write: Windows. Your text does not display correctly for me in FF 2.0.0.3 on a fairly recent Gentoo Linux.
Looks similar to bug 85373.
Looks like 1.9 has this solved. T̜̀h̆î̤s ṭ̇ẹ̇x̰̎t looks q̌́uíţe g̈öȯd to me on Fedora if I use the DejaVu fonts. Right now it seems to be the font's fault if the marks are misplaced.
ACK, looks nice.
Assignee: layout.fonts-and-text → nobody
QA Contact: ian → layout.fonts-and-text
So, can someone responsible for QA confirm this is fixed?

The text in comment #11 looks fine to me.
(In reply to Jens Müller (:tessarakt) from comment #13)
> So, can someone responsible for QA confirm this is fixed?
> 
> The text in comment #11 looks fine to me.

This is still broken. I'm on OS X and the text in comment #11 looks like ****. Opera and Safari do fine.

My example text to add to this discussion is:  ∧̊ . It's a mathematical 'and' mixed with a combining circle above.
(In reply to elfprince13 from comment #14)
> (In reply to Jens Müller (:tessarakt) from comment #13)
> > So, can someone responsible for QA confirm this is fixed?
> > 
> > The text in comment #11 looks fine to me.
> 
> This is still broken. I'm on OS X and the text in comment #11 looks like
> ****. Opera and Safari do fine.
> 
> My example text to add to this discussion is:  ∧̊ . It's a mathematical
> 'and' mixed with a combining circle above.

Both comment 11 and your example look fine here on OS X (10.7).

The results will depend, however, on your font preferences (or on the fonts chosen by the web page) -- if the font being used doesn't support the diacritics, so that fallback has to pick a different font for some of them, it's likely to look bad.

You can determine what font(s) are being used for a given fragment of text by selecting the text and looking in the Fonts panel of the Inspector tool (Tools / Web Developer / Inspector), or installing the fontinfo add-on[1] and then choosing Show Fonts in Selection from the right-click (context) menu for the selected text.

In my case, the comments here are displayed in Courier (as the default monospaced font), and look fine. If you've changed your font preferences, you may see different results.

[1] https://addons.mozilla.org/en-US/firefox/addon/fontinfo/
Yes, I can fiddle with fonts with varying results here in FF (on OS X 10.8); however, on Safari and Opera I'm getting correct rendering results regardless of which font I have selected. For example, if I set the styling to Menlo,courier,mono (respecting system defaults for mono-fonts), I get reasonable results in FF. However, if I set Verdana,sans or Arial,sans I still get reasonable results in Safari and Opera, and FF breaks. Please stop blaming the fonts: both glyphs are rendered, they just render with incorrect positioning.
I see varying results, depending on the browser and font choices. Note that neither Verdana nor Arial includes the ∧ character, and so font fallback comes into play and some other font must be used. And Verdana doesn't have the combining ring either, so that also falls back to a different font. Results will be variable, depending on which particular font(s) happen to be used.

One thing that could help here would be bug 543200, as the worst results generally occur if the base character and diacritic end up being rendered with different fonts due to per-character fallback behavior. But for more reliable results, authors should explicitly choose fonts that support the characters they're using.
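
As a rough illustration of why per-character fallback hurts here, the following Python sketch contrasts choosing a font for each character independently with choosing one font for a whole base-plus-marks cluster. The font names and coverage sets are hypothetical, the clustering rule (non-zero combining class attaches to the previous character) is a simplification of real grapheme clusters, and this is not Gecko's actual fallback code:

```python
import unicodedata

# Hypothetical coverage tables, for illustration only; real fonts differ.
FONTS = {
    "FontA": set("abc"),                          # no math operators, no combining marks
    "FontB": {"\u2227"},                          # has the base character only
    "FontC": {"\u2227", "\u030A"} | set("abc"),   # has both base and mark
}


def clusters(text):
    """Split text into simple clusters: a base character plus following combining marks."""
    out = []
    for ch in text:
        if out and unicodedata.combining(ch) != 0:
            out[-1] += ch
        else:
            out.append(ch)
    return out


def first_font_covering(chars, order):
    """Return the first font in `order` that covers every character in `chars`, else None."""
    return next((f for f in order if all(c in FONTS[f] for c in chars)), None)


def per_character_fallback(text, order):
    """Pick a font for each character independently."""
    return [(ch, first_font_covering(ch, order)) for ch in text]


def per_cluster_fallback(text, order):
    """Pick a single font for each whole cluster."""
    return [(cl, first_font_covering(cl, order)) for cl in clusters(text)]


order = ["FontA", "FontB", "FontC"]
sample = "\u2227\u030A"  # LOGICAL AND + COMBINING RING ABOVE
print(per_character_fallback(sample, order))
print(per_cluster_fallback(sample, order))
```

In the per-character case the base and the ring come from different fonts, so no font-internal positioning data can relate them; in the per-cluster case a single font handles both.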
This doesn't look good in firefox: ∇×
Hm, two sentences of my message didn't load. I was saying that some combining chars don't look good in Firefox, for example this one: (I hope it gets displayed).
Well, it doesn't. In that case, look here:
http://kasperpeulen.github.io/CoffeeTeX/example.html

At the beginning it does work; in the last paragraph, however, Firefox fails, while all other browsers render it properly.
First of all, the text in comments 18 and 19 was cut off because of bug 405011.

Taking the text from the last paragraph in the testcase, I assume the issue is with the combinations U+1D404 MATHEMATICAL BOLD CAPITAL E (and other letters) with U+20D7 COMBINING RIGHT ARROW ABOVE. They seem to work well for me here on Firefox 32.0.3 in Ubuntu, so this may be a font issue as described in comment 17.

The testcase as data URI:
data:text/html;charset=utf-8,%E2%88%87%C3%97%F0%9D%90%81%E2%83%97%20-%E2%80%89%201%E2%88%95c%E2%80%89%20%28%E2%88%82%F0%9D%90%84%E2%83%97%29%E2%88%95%28%E2%88%82t%29%20%26%20%3D%20%284%CF%80%29%E2%88%95c%20%F0%9D%90%A3%E2%83%97%20%E2%86%B5
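
For anyone who wants to check which sequences the testcase actually contains, here is a small Python sketch (standard library only) that percent-decodes the payload of the data URI above and lists each combining mark together with the character that precedes it. Treating "preceding code point" as the base is a simplifying assumption that happens to hold for this string:

```python
import unicodedata
from urllib.parse import unquote

# Payload of the data URI above (everything after the first comma).
payload = (
    "%E2%88%87%C3%97%F0%9D%90%81%E2%83%97%20-%E2%80%89%201%E2%88%95c%E2%80%89"
    "%20%28%E2%88%82%F0%9D%90%84%E2%83%97%29%E2%88%95%28%E2%88%82t%29%20%26%20"
    "%3D%20%284%CF%80%29%E2%88%95c%20%F0%9D%90%A3%E2%83%97%20%E2%86%B5"
)

text = unquote(payload)  # unquote() decodes the bytes as UTF-8 by default

# Collect (base, mark) pairs: a mark is any character with a non-zero combining class.
pairs = [
    (text[i - 1], ch)
    for i, ch in enumerate(text)
    if i > 0 and unicodedata.combining(ch) != 0
]

for base, mark in pairs:
    print(
        f"U+{ord(base):04X} {unicodedata.name(base)} + "
        f"U+{ord(mark):04X} {unicodedata.name(mark)}"
    )
```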

Still valid =)

In general, Firefox does support Unicode combining characters, but this is subject to suitable fonts being used. What doesn't work well is the use of combining characters when they are not supported in the font being used for the rest of the text; in that case, font fallback will kick in and generally yield a poor result, as noted in comment 17.

In the specific example mentioned in comment 20, where we see things like <MATHEMATICAL BOLD CAPITAL E, COMBINING RIGHT ARROW ABOVE>, the problem is that the font being used does not include OpenType positioning rules for the accent. So it appears at its default position as designed in the font, which clashes with tall letters.

If the font were TrueType, fallback diacritic positioning behavior would come into effect, which would raise the accent appropriately, but the current code in gfxHarfBuzzShaper.cpp[1] does not support this for OpenType/CFF fonts, which is the format used here.

So besides bug 543200, mentioned above in comment 17, another thing that would help certain cases (including the comment 20 example) would be to implement fallback positioning support for OpenType/CFF fonts.

[1] https://searchfox.org/mozilla-central/rev/d143f8ce30d1bcfee7a1227c27bf876a85f8cede/gfx/thebes/gfxHarfBuzzShaper.cpp#566
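
To illustrate why fallback positioning needs glyph extents at all, here is a deliberately simplified Python sketch. The `Extents` type, the `place_mark_above` helper, the `gap` value and the sample numbers are all hypothetical; this is not HarfBuzz's or Gecko's actual algorithm, only the general idea of raising a mark until it clears the base glyph's bounding box:

```python
from dataclasses import dataclass


@dataclass
class Extents:
    """Glyph bounding box in font units, y increasing upward (simplified model)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def place_mark_above(base: Extents, mark: Extents, gap: float = 50.0):
    """Return (dx, dy) to apply to a mark drawn at the base glyph's origin.

    dx centers the mark horizontally over the base; dy raises it (never lowers it)
    so that its bottom edge sits at least `gap` units above the top of the base.
    """
    dx = (base.x_min + base.x_max) / 2 - (mark.x_min + mark.x_max) / 2
    dy = max(0.0, base.y_max + gap - mark.y_min)
    return dx, dy


# A tall base glyph and an accent designed to sit above shorter letters.
tall_base = Extents(x_min=50, y_min=0, x_max=650, y_max=700)
arrow = Extents(x_min=200, y_min=600, x_max=500, y_max=750)

print(place_mark_above(tall_base, arrow))
```

Without the base glyph's `y_max` there is nothing to compute `dy` from, so the mark stays at its designed default height and collides with tall letters.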

This provides glyph-extents support for these fonts, so that fallback diacritic positioning can work.

In principle we could try switching to the hb_ot_font functions for all fonts, but this carries
some risk of regressing other cases: (1) on some platforms, our glyph-advance callbacks use platform
APIs rather than reading the font file directly, in order to respect hinting that may be in effect;
and (2) the hb_ot_font functions don't currently provide fallbacks for CJK Compatibility Ideographs
Standardized Variants, as implemented for Gecko in bug 989557, so that case would be regressed.

Hence, for the time being, switching only for OpenType/CFF fonts, where the thebes callbacks are
known to be incomplete (no glyph-extents support), is the safer, more conservative approach.

Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/76555eaa45e1
For OpenType/CFF fonts, use harfbuzz ot-font functions rather than thebes callbacks. r=jrmuizel
Status: NEW → RESOLVED
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: Future → mozilla68
Assignee: nobody → jfkthame
QA Whiteboard: [qa-68b-p2]