Closed Bug 85373 Opened 24 years ago Closed 11 years ago

combining characters / combining mark not supported

Categories

(Core :: Internationalization, defect)

x86
Windows 98
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME
Future

People

(Reporter: Martin.T.Kutschker, Unassigned)

Details

(Keywords: fonts)

Attachments

(2 files)

From Bugzilla Helper: BuildID: 2001050515 and 2001060703 (milestones 0.9/0.9.1)

In pre-0.9.x milestones the combining overline was displayed correctly above the previous character. Since 0.9 this is broken (you get a separate and rather long overscore), while it still works for other combining characters (e.g. the combining double overline).

Reproducible: Always

Steps to Reproduce: E.g. "√x̅" as a button label (square root of X)

Actual Results: The square root character, the X and an overscore, from left to right.

Expected Results: The square root character, the X with the overscore ABOVE the X.

See the effect by installing mozCalc (http://mozcalc.mozdev.org).
These two testcases work correctly for me on win2k CVS build from today. But they don't work on Linux trunk 2001061121, or Mac 2001060708; on those systems, the 773 displays as "?". To my knowledge, xptoolkit does not deal with character-to-glyph mapping. It just uses the underlying modules. -> i18n
Status: UNCONFIRMED → NEW
Ever confirmed: true
Sigh. Didn't add the testcases or reassign. Let's try that again.

These two testcases work correctly for me on win2k CVS build from today. But they don't work on Linux trunk 2001061121, or Mac 2001060708; on those systems, the 773 displays as "?".

<?xml version="1.0"?>
<?xml-stylesheet href="chrome://global/skin" type="text/css"?>
<window xmlns="http://www.mozilla.org/keymaster/gatekeeper/there.is.only.xul" xmlns:html="http://www.w3.org/1999/xhtml">
  <button label="&#8730;x&#773;"/>
</window>

<html>
<body>
<form>
<input type='button' value="&#8730;x&#773;"/>
</form>
</body>
</html>

To my knowledge, xptoolkit does not deal with character-to-glyph mapping. It just uses the underlying modules. -> i18n
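For reference, the two numeric character references in the testcases above (8730 and 773) can be looked up in the Unicode character database. The following is a minimal sketch in Python, not part of the original report; the helper name `describe` is made up for illustration.

```python
import unicodedata

# The label used in both testcases: &#8730; x &#773;
label = "".join(chr(cp) for cp in (8730, ord("x"), 773))

def describe(text):
    """Return (code point, name, general category, combining class) per character."""
    return [
        (
            f"U+{ord(ch):04X}",
            unicodedata.name(ch),
            unicodedata.category(ch),
            unicodedata.combining(ch),
        )
        for ch in text
    ]

for row in describe(label):
    print(row)
```

The last character is reported with general category "Mn" (nonspacing mark) and a non-zero combining class, which is why it is expected to sit above the preceding "x" rather than occupy its own cell.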
Assignee: trudelle → nhotta
Component: XP Toolkit/Widgets: XUL → Internationalization
QA Contact: jrgm → andreasb
I tried again using Windows build 2001061204 (German Win98 SE), but it did not work either.
Martin, could you attach screen shots, both actual and expected results, thanks.
Attached image expected behaviour
Attached image actual behaviour
We have no plan to support combining marks yet. Marking it as Future.
Assignee: nhotta → ftang
Target Milestone: --- → Future
Hang on! It did work for 0.7 to 0.8.1 and it still works for other combining characters. I remember claims that Mozilla has the best Unicode support to be found. "Future" is really disappointing.
I think we have never supported combining marks correctly. You may accidentally get correct behavior if the font you choose contains both the x and the overline. For example, if you use <div style="font-family: 'Lucida Sans Unicode';">&#8730;x&#773;</div> in HTML, it may still accidentally work.
Status: NEW → ASSIGNED
QA Contact: andreasb → ylong
Changed the subject to reflect the real issue and added "fonts" and "correctness" as keywords, though would have entered "unicode" if it had been allowed.
Keywords: correctness, fonts
Summary: combining overline (0x0305) not combining → combining characters / combining mark not supported
> To my knowledge, xptoolkit does not deal with character-to-glyph mapping.
> It just uses the underlying modules.

I guess this is the case, and the underlying modules don't do much for Latin combining characters and some other combining characters. Whether combining characters are rendered as 'spacing' or 'combining' (non-spacing) seems to depend completely on the font used to render the page, both under Windows 2k/XP and Linux/Unix/X11. For instance, http://www.columbia.edu/kermit/st-erkenwald.html is rendered as expected when the CODE2000 font by James Kass is used, but not when Arial Unicode MS is used under MS Windows XP/2k. Another example is http://jshin.net/i18n/korean/hunmin.html. It is displayed correctly only with fonts in which the glyphs for conjoining Hangul vowels and final consonants have zero width.

It's not clear what Mozilla has to do in this case. It can be argued that Mozilla should delegate this task to the underlying rendering engines available in the OS (Uniscribe, Pango, QuickDraw/AAT) along with fonts with advanced features like OpenType. Others may think that Mozilla should do more for these cases, as it does for Thai (and partly for U+1100 Hangul Jamos under X11).
Sorry for spamming. I didn't realize that the URL field is empty. Because the Middle English sample on the Kermit web page seems as good as any other for demonstrating the issue at hand, it'd be nice if somebody with the privilege to do so would add it to the URL field.
> It's not clear what Mozilla has to do in this case. It can be
> argued that Mozilla should delegate this task
> to underlying rendering engines available in OS'
> (Uniscribe, Pango, QuickDraw/AAT) along
> with fonts with advanced features like OpenType. Others may
> think that Mozilla should do more for these cases like
> it does for Thai (and partly U+1100 Hangul Jamos under X11).

I was wrong to think that the _entire_ task of rendering Latin combining characters (for that matter, any combining characters) can be delegated to the rendering engines offered by the OS. Mozilla still has some work to do. The following is what James Kass (the designer of the CODE2000 Unicode font) wrote to the Unicode list:

Code2000 has only minimal tables for Latin OpenType pending system support from Microsoft needed for testing purposes. For glyph positioning, only 3 kerning pairs are included. For glyph substitution, only 10 discretionary ligatures plus several 'enclosed' glyph substitutions. Whatever is happening in Mozilla to improve the appearance of Latin text with combiners must be an innovation of the good folks at Mozilla. Lacking Latin OpenType support in Uniscribe, many "core" fonts available through Microsoft may well have chosen to use dotted circle glyphs as the default display glyphs while awaiting system support. This is probably a good approach, because fonts like Code2000, Cardo, et cetera, still have to rely on default glyph positioning in most products, which can result in undesirable overstriking. Where they don't overstrike, these combiners are as often as not poorly aligned. There's very little font developers can do to correct this without OpenType support being enabled for Latin script. Microsoft is working on adding Latin OpenType support to Uniscribe.
Meanwhile, some browsers seem to be attempting to display, for example, the string "a" + "combining acute" as "aacute" if the glyph is available in the font as a precomposed glyph and the character is available in the Standard as a precomposed character. This is why some combiners may seem to work in certain cases, and not others. For instance, in Outlook Express on Windows 9x, the following alphabet + combining acutes is entered as the single letter followed by the combining acute mark. In the default font here, the expected result is that every single combining acute glyph would overstrike its corresponding base letter. This is because the default combining glyph position in this font is designed for lower case letters (with no ascenders) heights...: (the following line is in UTF-8. set CharacterCoding to UTF-8) ĀB̄C̄D̄ĒF̄ḠH̄ĪJ̄K̄L̄M̄N̄ŌP̄Q̄R̄S̄T̄ŪV̄W̄X̄ȲZ̄ ...but, on this system, the capital letters A, E, G, I, O, and U are all getting a combining macron at the correct caps height. This is not happening based upon any instructions within the font. Rather, the system appears to be making these substitutions based on some system table, which in turn is based on TUS. (The rest of the capital letters look awful, they're being overstriked (overstricken?) by the combiners.) Anyway, hope this info is helpful.
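The substitution described above (base letter plus combining mark replaced by a precomposed character where the Standard defines one) can be approximated with Unicode canonical composition. This is only a sketch of the idea, assuming NFC normalization as a stand-in for whatever system table was actually in use; the helper name `composes` is hypothetical.

```python
import unicodedata

MACRON = "\u0304"  # COMBINING MACRON, as in the sample alphabet line

def composes(base, mark=MACRON):
    """True if base + mark collapses to one precomposed code point under NFC."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1

# Capital letters for which the Standard has a precomposed letter-with-macron.
precomposed = [c for c in "ABCDEFGHIJKLMNOPQRSTUVWXYZ" if composes(c)]
print(precomposed)

# Letters outside that list stay as two code points and depend on the font
# (or the layout engine) to position the mark.
print([f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFC", "B" + MACRON)])
```

Letters that compose get a dedicated glyph designed at the right height; the rest fall back to the font's default mark position, which matches the mixed result reported in the message.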
This problem seems to affect only some combining characters. For instance, this URI shows that combining key caps combine, but not the combining circle: http://www.cadenceweb.com:8080/newsletter/sheerin/test/index.html#ExpertSet This page provides a more complete test of all combining characters: http://www.cadenceweb.com:8080/newsletter/sheerin/test/ExpertCharacterSet.html
I suspect that the characters that appear to combine do so because the font provides incorrect metrics for the combining glyph.
> I suspect that the characters that appear to combine do so because the font
> provides incorrect metrics for the combining glyph

Certainly it's font dependent as well as platform/toolkit dependent. However, I would not say a font provides "incorrect" metrics if it assigns zero advance width to non-spacing combining characters. Given that neither Uniscribe (at least the released version) nor Pango provides support for combining marks used with Latin/Cyrillic/Greek letters (the ICU layout part is no exception; ATSUI may do better, but I have no info), it's a reasonable fallback to assign zero advance width to non-spacing combining marks so that they work in quite a lot of cases (by simple overstriking). It may not even be a fallback but arguably the 'right thing' to do.

As for the platform-dependent part: Mozilla-Win uses the standard text APIs as opposed to the Uniscribe APIs used by MS IE. One of the largest differences between the two [1] is that the former supports complex script handling on Win2k/XP only, while the latter supports it on any language version of Win9x/ME as well as on Win2k/XP. [2] At the moment, there's little difference between the two approaches in terms of Latin combining mark handling because MS has just begun to implement it in Uniscribe. However, when it becomes available, Mozilla-Win will not be able to handle it on Win9x/ME, while MS IE can do it across Win32 platforms. On other platforms, none of X11/gtk (x11core/FT, Xft) supports it. I'm not sure of the situation on Mac, but Mozilla doesn't use Mac's native ATSUI, so I guess it's similar on Mac.

[1] Another difference is that caret movement/positioning/selection don't work for complex scripts if the standard text APIs are used.
[2] Even the standard text APIs provide support for Hebrew on Hebrew Win9x/ME, Arabic on Arabic Win9x/ME and Thai on Thai Win9x/ME. However, Indic scripts are not so lucky because MS never supported Indic scripts with the Win32 'A' APIs that are used on Win9x/ME.
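To make the zero-advance-width fallback concrete, here is a toy left-to-right layout loop. It is a sketch only: the `Glyph` type, the `place` function and all metric values are invented for illustration and are not taken from Mozilla, Uniscribe, Pango or any real font.

```python
from typing import NamedTuple

class Glyph(NamedTuple):
    advance: int  # how far the pen moves after drawing the glyph
    bearing: int  # horizontal offset of the ink from the pen position

def place(text, font):
    """Naive layout with no mark positioning: return (char, ink_x) per character."""
    pen = 0
    placed = []
    for ch in text:
        glyph = font[ch]
        placed.append((ch, pen + glyph.bearing))
        pen += glyph.advance
    return placed

OVERLINE = "\u0305"  # COMBINING OVERLINE

# Hypothetical metrics in arbitrary units.
# Font A: the mark has zero advance and its ink sits back over the base letter.
zero_width_font = {"x": Glyph(10, 0), OVERLINE: Glyph(0, -10)}
# Font B: the mark is designed like an ordinary spacing character.
spacing_font = {"x": Glyph(10, 0), OVERLINE: Glyph(10, 0)}

print(place("x" + OVERLINE, zero_width_font))  # mark ink starts at the same x as the base
print(place("x" + OVERLINE, spacing_font))     # mark ink starts after the base
```

With the first set of metrics the overline lands on top of the "x" purely by overstriking, without any help from the layout code; with the second it trails the "x" as a separate overscore, which is the behaviour originally reported.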
What a hack. I have not touched Mozilla code for 2 years. I have not read these bugs for 2 years. And they are still there. Just close them as won't fix to clean up.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → WONTFIX
Mass re-open of Frank Tang's "won't fix" debacle. Spam is his responsibility, not my own.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Mass re-assigning of Frank Tang's old bugs that he closed as won't fix and that had to be re-opened. Spam is his fault, not my own.
Assignee: ftang → nobody
Status: REOPENED → NEW
Looks similar to bug 197649.
Do we still (in 2008, with Firefox 3 coming) support Windows 98? Can I close this as Won't Fix?
QA Contact: amyy → i18n
Gecko has supported combining characters since at least Firefox 1.5. Reopen if you think there is a specific case that doesn't work correctly.
Status: NEW → RESOLVED
Closed: 20 years ago → 11 years ago
Resolution: --- → WORKSFORME