Bug 165881 - Combining diacritics don't work
Status: RESOLVED WORKSFORME
Keywords: intl
Product: Core
Classification: Components
Component: Graphics
Version: Trunk
Platform: PowerPC Mac OS X
Importance: -- normal with 6 votes
Target Milestone: ---
Assigned To: Nobody; OK to take it and work on it
Triage Owner: Milan Sreckovic [:milan]
Mentors:
Duplicates: 383062
Depends on: atsui
Blocks: 386573 supercombiner
Reported: 2002-08-31 03:06 PDT by Henri Sivonen (:hsivonen)
Modified: 2008-04-01 23:06 PDT (History)
CC List: 14 users
dsicore: blocking1.9-
reed: wanted1.9+


Attachments
test case (250 bytes, text/html; charset=utf-8)
2002-08-31 03:07 PDT, Henri Sivonen (:hsivonen)
Greek nu with diacritic line above (204 bytes, text/html)
2004-08-02 12:42 PDT, Hannes Mayer
Another test case. (508 bytes, text/html; charset=UTF-8)
2007-01-07 00:24 PST, Wim Lewis
Displays all of the combining diacritical mark codes for testing (1.64 KB, text/html)
2007-05-21 18:24 PDT, Brett Zamir
Updated list of all combining diacritical marks (1.78 KB, text/html)
2007-05-21 18:33 PDT, Brett Zamir
Unicode PDF listing from http://unicode.org/charts/PDF/U0300.pdf (104.38 KB, application/pdf)
2007-05-21 18:53 PDT, Brett Zamir

Description Henri Sivonen (:hsivonen) 2002-08-31 03:06:52 PDT
Build ID: 2002083017 (on OS X 10.2)

Reproducible: Always

Steps to reproduce:
1) Load the test case attachment (upcoming)

Actual results:
Mozilla displays an 'R' followed by an ogonek.

Expected results:
Expected Mozilla to display the 'R' and ogonek combined the way OmniWeb and
TextEdit do.

Additional info:
Since there is no precomposed character LATIN CAPITAL LETTER R WITH OGONEK, it
is appropriate to send this particular combination as non-precomposed. See the
CharMod draft: http://www.w3.org/TR/charmod/
Comment 1 Henri Sivonen (:hsivonen) 2002-08-31 03:07:27 PDT
Created attachment 97387
test case
Comment 2 Henri Sivonen (:hsivonen) 2002-08-31 03:10:10 PDT
Supporting combining diacritics requires the base character and the combining
marks to be passed to ATSUI together, so the one-char-at-a-time ATSUI fallback
won't do.
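
To illustrate the requirement above: before any text reaches the shaping API, each base character has to be grouped with the combining marks that follow it, so the group can be passed as one run. The following is a minimal Python sketch of that grouping step only. `split_into_clusters` is a hypothetical helper, not Mozilla or ATSUI code, and it approximates clusters by Unicode general category instead of implementing full grapheme segmentation.

```python
import unicodedata


def split_into_clusters(text):
    """Group each base character with the combining marks that follow it.

    Hypothetical helper for illustration: a character whose general
    category starts with "M" (Mn, Mc, Me) is attached to the preceding
    cluster; anything else starts a new cluster.
    """
    clusters = []
    for ch in text:
        is_mark = unicodedata.category(ch).startswith("M")
        if is_mark and clusters:
            # Keep the mark with its base so both are drawn together.
            clusters[-1] += ch
        else:
            clusters.append(ch)
    return clusters


# 'R' followed by U+0328 COMBINING OGONEK stays in one cluster.
clusters = split_into_clusters("R\u0328x")
print([[f"U+{ord(c):04X}" for c in cluster] for cluster in clusters])
```

A one-char-at-a-time fallback corresponds to treating every code point as its own cluster, which is why the ogonek ends up drawn after the 'R' instead of attached to it.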
Comment 3 Greg K. 2002-08-31 10:12:17 PDT
It seems to work if you explicitly specify a font such as TITUS Cyberbit Basic
or Arial Unicode MS, but it's strange that it doesn't work if I don't, since
those fonts are defined as my default Unicode fonts.
Comment 4 Roy Yokoyama 2002-09-03 12:03:16 PDT
>Supporting combining diacritics requires the base character and the combining
>marks to be passed to ATSUI together
I'd better assign this to nhotta. Cc'ing ftang.
Comment 5 nhottanscp 2002-09-04 17:00:25 PDT
Reassign to ftang.
Comment 6 Frank Tang 2002-10-24 10:48:15 PDT
assign
Comment 7 Hannes Mayer 2004-08-02 12:31:04 PDT
This bug is quite old, but it seems the problem still persists.
Frank, did you have any luck investigating?

I will attach a file that shows a Greek nu with a diacritic line above (it should
be the sign for an anti-neutrino). With Mozilla 1.7 on Fedora Core 2 (Xorg), the
line is drawn to the right of the nu. With the same Mozilla on Windows, the line
is drawn above the nu.
Comment 8 Hannes Mayer 2004-08-02 12:42:46 PDT
Created attachment 155014
Greek nu with diacritic line above
Comment 9 Adam Hauner 2005-03-02 00:17:57 PST
-> to default owner (rather than ftang's WONTFIX)
Comment 10 Wim Lewis 2007-01-07 00:24:24 PST
Created attachment 250739
Another test case.

I ran into this bug myself just now. Here is a small HTML file that demonstrates the problem. Firefox 2.0 (rv:1.8.1), Mac OS X 10.4.8.
Comment 11 Brett Zamir 2007-05-21 18:24:18 PDT
Created attachment 265593
Displays all of the combining diacritical mark codes for testing
Comment 12 Brett Zamir 2007-05-21 18:31:01 PDT
Comment on attachment 265593
Displays all of the combining diacritical mark codes for testing

0300: a&#768;<br />
0301: a&#769;<br />
0302: a&#770;<br />
0303: a&#771;<br />
0304: a&#772;<br />
0305: a&#773;<br />
0306: a&#774;<br />
0307: a&#775;<br />
0308: a&#776;<br />
0309: a&#777;<br />
030A: a&#778;<br />
030B: a&#779;<br />
030C: a&#780;<br />
030D: a&#781;<br />
030E: a&#782;<br />
030F: a&#783;<br />
0310: a&#784;<br />
0311: a&#785;<br />
0312: a&#786;<br />
0313: a&#787;<br />
0314: a&#788;<br />
0315: a&#789;<br />
0316: a&#790;<br />
0317: a&#791;<br />
0318: a&#792;<br />
0319: a&#793;<br />
031A: a&#794;<br />
031B: a&#795;<br />
031C: a&#796;<br />
031D: a&#797;<br />
031E: a&#798;<br />
031F: a&#799;<br />
0320: a&#800;<br />
0321: a&#801;<br />
0322: a&#802;<br />
0323: a&#803;<br />
0324: a&#804;<br />
0325: a&#805;<br />
0326: a&#806;<br />
0327: a&#807;<br />
0328: a&#808;<br />
0329: a&#809;<br />
032A: a&#810;<br />
032B: a&#811;<br />
032C: a&#812;<br />
032D: a&#813;<br />
032E: a&#814;<br />
032F: a&#815;<br />
0330: a&#816;<br />
0331: a&#817;<br />
0332: a&#818;<br />
0333: a&#819;<br />
0334: a&#820;<br />
0335: a&#821;<br />
0336: a&#822;<br />
0337: a&#823;<br />
0338: a&#824;<br />
0339: a&#825;<br />
033A: a&#826;<br />
033B: a&#827;<br />
033C: a&#828;<br />
033D: a&#829;<br />
033E: a&#830;<br />
033F: a&#831;<br />
0340: a&#832;<br />
0341: a&#833;<br />
0342: a&#834;<br />
0343: a&#835;<br />
0344: a&#836;<br />
0345: a&#837;<br />
0346: a&#838;<br />
0347: a&#839;<br />
0348: a&#840;<br />
0349: a&#841;<br />
034A: a&#842;<br />
034B: a&#843;<br />
034C: a&#844;<br />
034D: a&#845;<br />
034E: a&#846;<br />
034F (non-visible): a&#847;<br />
0350: a&#848;<br />
0351: a&#849;<br />
0352: a&#850;<br />
0353: a&#851;<br />
0354: a&#852;<br />
0355: a&#853;<br />
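
A listing like the one above does not have to be typed by hand. The following is a minimal Python sketch that regenerates the same lines; `combining_mark_lines` and its parameters are hypothetical names chosen for illustration, and the "(non-visible)" label for U+034F is copied from the listing rather than derived from Unicode data.

```python
def combining_mark_lines(first=0x0300, last=0x0355, base="a"):
    """Return one HTML test line per combining mark in [first, last].

    Each line has the form 'HHHH: a&#DDD;<br />', where HHHH is the code
    point in hex and DDD is the same code point as a decimal character
    reference placed after the base letter.
    """
    lines = []
    for cp in range(first, last + 1):
        label = f"{cp:04X}"
        if cp == 0x034F:
            # Label taken from the hand-written listing.
            label += " (non-visible)"
        lines.append(f"{label}: {base}&#{cp};<br />")
    return lines


lines = combining_mark_lines()
print("\n".join(lines))
```

Passing `last=0x036F` would extend the output to the end of the Combining Diacritical Marks block (U+0300 to U+036F).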
Comment 13 Brett Zamir 2007-05-21 18:33:00 PDT
Comment on attachment 265593
Displays all of the combining diacritical mark codes for testing

Comment 14 Brett Zamir 2007-05-21 18:33:57 PDT
Created attachment 265595
Updated list of all combining diacritical marks
Comment 15 Brett Zamir 2007-05-21 18:53:21 PDT
Created attachment 265597
Unicode PDF listing from http://unicode.org/charts/PDF/U0300.pdf

This is the official Unicode listing of the combining diacritical marks (posting is permitted by http://www.unicode.org/copyright.html ).

As a comparison with the updated test file should show, the diacritical marks have these problems:
1) Many are a bit off-kilter (e.g., placed to the side when they should be centered).
2) Some marks are missing entirely: U+0332, U+0333.
3) U+0350 through U+0355 show up as question marks (though I'm not sure whether that is because they need to be combined with the Uralic alphabet).
4) Others appear to have the wrong mark: U+031D, U+031F, U+0325, U+033B.
5) Others seem to be too small: U+0326 and U+0329.
6) One seems just slightly off: U+032C.
Comment 16 Smokey Ardisson (offline for a while; not following bugs - do not email) 2007-06-03 15:24:06 PDT
This seems much better on the trunk, but there are still a number of problems, and a few marks now render as boxes after the base character where they at least rendered as characters after it on the 1.8 branch.
Comment 17 Smokey Ardisson (offline for a while; not following bugs - do not email) 2007-06-03 15:24:53 PDT
*** Bug 383062 has been marked as a duplicate of this bug. ***
Comment 18 Simon Montagu :smontagu 2007-10-08 14:28:32 PDT
The only issue that I see in attachment 265595 is that many marks are shifted to the right of the base character. Repeating my comment from bug 386573: I think the broken cases are whenever there is no precomposed character equivalent to the base+diacritic pair. For example, in 
data:text/html,<p>a&#x30b; e&#x30b; i&#x30b; o&#x30b; u&#x30b;</p>
the double acute is correctly placed only on the o and the u, which match U+0151 LATIN SMALL LETTER O WITH DOUBLE ACUTE and U+0171 LATIN SMALL LETTER U WITH DOUBLE ACUTE.
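
One way to check which base+diacritic pairs have a precomposed equivalent is to normalize the pair to NFC and see whether it collapses to a single code point. The following is a minimal Python sketch of that check; `has_precomposed` is a hypothetical helper, and the NFC test is only an approximation, since a few precomposed characters are excluded from composition by the Unicode normalization rules.

```python
import unicodedata


def has_precomposed(base, mark):
    """Return True if NFC composes base + mark into a single code point."""
    composed = unicodedata.normalize("NFC", base + mark)
    return len(composed) == 1


DOUBLE_ACUTE = "\u030b"  # U+030B COMBINING DOUBLE ACUTE ACCENT

for base in "aeiou":
    print(base, has_precomposed(base, DOUBLE_ACUTE))
# a False
# e False
# i False
# o True
# u True
```

The same check returns False for 'R' followed by U+0328 COMBINING OGONEK, the pair from the original report, which is consistent with the pattern described in this comment.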
Comment 19 Damon Sicore (:damons) 2007-10-09 13:27:12 PDT
Minusing.  Not a regression.  It's a feature that's never worked on the Mac.  Per discussion with Vlad and Stuart.  Marking wanted-1.9.
Comment 20 Brett Zamir 2007-12-09 16:31:28 PST
Sorry for the very delayed reply, but in response to Simon...

x332 and x333 (the first of which I happen to need) don't show their marks at all, and everything from x350 to x36F is showing up as question marks. Compare with the chart at http://www.unicode.org/charts/PDF/U0300.pdf
Comment 21 Brett Zamir 2007-12-09 16:36:31 PST
Oh, and the problem, at least in my case, is on Windows... (sorry I hadn't clarified)
Comment 22 Brett Zamir 2007-12-09 17:50:16 PST
Argh... sorry again, I do see it is working in Firefox 3, albeit with the characters being shifted (finally got around to figuring out how to run two profiles at once)... Nice work...
Comment 23 Smokey Ardisson (offline for a while; not following bugs - do not email) 2008-04-01 23:06:05 PDT
The remaining problem here, which smontagu identified in comment 18, now has bug 425650 to cover it specifically.

Closing this bug WFM on trunk thanks to all the rewrites.