Closed Bug 772268 Opened 12 years ago Closed 4 years ago

Lithuanian doesn't retain the dot in a lowercase i when followed by accents

Categories

(Core :: Internationalization, defect)

defect
Not set
normal


RESOLVED INACTIVE

People

(Reporter: unghost, Unassigned)

References

Details

Attachments

(6 files, 4 obsolete files)

Attached file testcase (obsolete) —
According to http://www.unicode.org/Public/UNIDATA/SpecialCasing.txt :

# Lithuanian retains the dot in a lowercase i when followed by accents.
# Remove DOT ABOVE after "i" with upper or titlecase
0307; 0307; ; ; lt After_Soft_Dotted; # COMBINING DOT ABOVE

A testcase is attached. It works in Chrome and Opera and fails in Firefox.

CC'ing some people from the Lithuanian team. Rimas, can you take a look at the comment at http://habrahabr.ru/post/147387/#comment_4970027 to confirm that I understood the bug correctly?
Depends on: 231162
Alexander, could you post a screenshot of what you're seeing and what version of Firefox you're using? Unless I'm missing something, I'm seeing the intended behavior in Aurora 15.
Attached image Screenshot in 16.0a1 (obsolete) —
(In reply to Gordon P. Hemsley [:gphemsley] from comment #1)
> Alexander, could you post a screenshot of what you're seeing and what
> version of Firefox you're using? Unless I'm missing something, I'm seeing
> the intended behavior in Aurora 15.

Here it is: Mozilla/5.0 (X11; Linux i686; rv:16.0) Gecko/16.0 Firefox/16.0 ID:20120709035118
Attached image Screenshot in 15.0a2 (Linux) (obsolete) —
Attached image Screenshot in 15.0a2 (Windows) (obsolete) —
Interesting. This is what I see on Mac. I wonder if it's a font/OS issue?
Hi folks,

the testcase (and akral's comment mentioned by Alexander) seems to be slightly wrong. Indeed, the dot above a lowercase letter should be preserved when adding a stress mark on the letters i and j in Lithuanian texts, but the order of the diacritics seems to be mixed up. It should be as follows:

i + ` = i̇̀

That is, a letter i (or j) should be followed by U+0307 COMBINING DOT ABOVE and then by the particular combining stress mark (U+0300 COMBINING GRAVE ACCENT in this case), and their visual layout should be the stress mark above the dot above the (dotless) letter i.

I'm not exactly sure what the problem reported in this bug is (I will have to read the whole thread, I guess), but incorrect rendering can certainly be caused by the font in use, the layout engine, or both. There is at least one Lithuanian OpenType font which provides all Lithuanian accented letters in precomposed form (as OpenType ligatures, if I remember the name correctly), and it has been working fine on Windows since at least Firefox 3. You can see an example of its usage here: http://etimologija.baltnexus.lt/?w=viltis (look at the word "vilti̇̀s" in the second block, although you may have to increase the font size a bit).

So, with that font (Palemonas, which is available as freeware from http://www.vlkk.lt/lit/17), rendering is correct. However, with most other fonts (for example, with Courier New, used in Bugzilla's comment box by default), rendering is incorrect, and that should probably be fixed, if possible.
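To make the code point order described above concrete, here is a minimal Python sketch using only the standard `unicodedata` module. The helper `describe` and the variable names are illustrative and do not come from the attached testcases. It lists the code points of the intended sequence (i, U+0307, U+0300) and of the swapped one, and shows that normalization does not repair the swapped order, because both marks share canonical combining class 230.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name, canonical combining class) per character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.combining(ch))
        for ch in text
    ]


# Order described in the comment above: letter, dot above, then stress mark.
correct = "i\u0307\u0300"
# Swapped order: letter, stress mark, then dot above.
swapped = "i\u0300\u0307"

for label, sample in (("correct", correct), ("swapped", swapped)):
    print(label)
    for code_point, name, ccc in describe(sample):
        print(f"  {code_point} {name} (ccc={ccc})")

# Marks with the same combining class are never reordered by normalization,
# so the two sequences stay distinct in every normalization form.
nfc_correct = unicodedata.normalize("NFC", correct)
nfc_swapped = unicodedata.normalize("NFC", swapped)
print(nfc_correct == nfc_swapped)
```

In other words, a testcase has to put U+0307 before the stress mark itself; a normalizer will not fix the order afterwards.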
Note that even though the attachment 640404 [details] is incorrect for Lithuanian, it should still be rendered correctly. Attachment 640433 [details] and attachment 640439 [details] are examples of its correct rendering.
Alexander, can you leave a comment in the thread explaining that the example with Lithuanian is slightly incorrect and should be fixed? I can't post while not logged in, nor while logged in with a bugmenot account. I can write the comment myself if you like. :)
(In reply to Rimas Kudelis from comment #8)
> Note that even though the attachment 640404 [details] is incorrect for
> Lithuanian, it should still be rendered correctly. Attachment 640433 [details]
> and attachment 640439 [details] are examples of its correct
> rendering.

Are you sure that attachment 640433 [details] and attachment 640439 [details] are examples of its correct rendering? They look pretty broken to me. As you said in comment 7, "visual layout should be the stress mark above the dot above the (dotless) letter i." In attachment 640433 [details] and attachment 640439 [details], the visual layout is the dot above the stress mark above the (dotless) letter i (the dot is on top instead of the stress mark). It looks like no browser can render it properly without the Palemonas font installed.

Can you please attach a proper testcase and a screenshot of the intended rendering?
(In reply to Rimas Kudelis from comment #9)
> Alexander, can you leave a comment in the thread explaining that the example
> with Lithuanian is slightly incorrect and should be fixed? I can't post
> while not logged in, nor while logged in with a bugmenot account.
> I can write the comment myself if you like. :)

I cannot, unfortunately. This site is invite-only and I don't have an invite (and it's pretty hard to get one).
Language-specific casing behavior such as the Lithuanian rendering of "i" with accent should only apply when the content is explicitly tagged as being in the relevant language. Attachment 640404 [details] does not have a lang="lt" attribute, so I would not expect it to show Lithuanian behavior; it should use the generic Unicode case-mapping behavior. In addition, the results will still be font-dependent. Some OpenType fonts include Lithuanian-specific tables that will adapt the behavior of accents with "i" appropriately (but again, this is dependent on the data being correctly tagged for language); others do not.
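As a rough illustration of language-gated casing, the following Python sketch applies the Lithuanian SpecialCasing rule quoted at the top of this bug (remove U+0307 after a soft-dotted letter when uppercasing) only when the text is tagged as Lithuanian. This is not Gecko's implementation. The function names are hypothetical, and `SOFT_DOTTED` is a deliberately small subset of the Unicode Soft_Dotted property. Python's built-in `str.upper()` stands in for the generic, language-neutral mapping.

```python
import unicodedata

# Assumption: a simplified subset of the Unicode Soft_Dotted property.
SOFT_DOTTED = {"i", "j", "\u012f"}  # i, j, i with ogonek
DOT_ABOVE = "\u0307"


def after_soft_dotted(text, index):
    """Check the After_Soft_Dotted condition for the character at `index`.

    True when a soft-dotted letter precedes it with no intervening
    character of combining class 0 or 230 (Above).
    """
    for ch in reversed(text[:index]):
        if ch in SOFT_DOTTED:
            return True
        if unicodedata.combining(ch) in (0, 230):
            return False
    return False


def upper_for_lang(text, lang=None):
    """Uppercase `text`, applying the Lithuanian rule only for lang 'lt'."""
    primary = (lang or "").lower().split("-")[0]
    if primary != "lt":
        # Untagged or non-Lithuanian content: generic case mapping only.
        return text.upper()
    kept = [
        ch
        for k, ch in enumerate(text)
        if not (ch == DOT_ABOVE and after_soft_dotted(text, k))
    ]
    return "".join(kept).upper()


word = "vilti\u0307\u0300s"
print(ascii(upper_for_lang(word, "lt")))  # dot above removed
print(ascii(upper_for_lang(word)))        # dot above kept
```

The same input thus yields different uppercase results depending on the language tag, which is why an untagged testcase would be expected to show only the generic behavior.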
Attached file testcase v.2
(In reply to Jonathan Kew (:jfkthame) from comment #12)
> Language-specific casing behavior such as the Lithuanian rendering of "i"
> with accent should only apply when the content is explicitly tagged as being
> in the relevant language. Attachment 640404 [details] does not have a
> lang="lt" attribute, so I would not expect it to show Lithuanian behavior;
> it should use the generic Unicode case-mapping behavior.
>
> In addition, the results will still be font-dependent. Some OpenType fonts
> include Lithuanian-specific tables that will adapt the behavior of accents
> with "i" appropriately (but again, this is dependent on the data being
> correctly tagged for language); others do not.

I've attached a hopefully more correct testcase. It renders fine on the latest Firefox nightly for Linux with the Palemonas font installed. From brief testing, rendering is not affected by the presence or absence of the lang="lt" attribute on <html>; only the Palemonas font declaration matters.
Attachment #640404 - Attachment is obsolete: true
This is really strange. This is what I see with the new testcase: Two dots above! (One I assume just comes from dotted i.) But otherwise, it still shows the diacritic on top, rather than to the side.
Looks good for me on Linux
Attachment #640429 - Attachment is obsolete: true
Looks good for me on Windows too.
Attachment #640431 - Attachment is obsolete: true
Attachment #640416 - Attachment is obsolete: true
(In reply to Alexander L. Slovesnik from comment #10)
> (In reply to Rimas Kudelis from comment #8)
> > Note that even though the attachment 640404 [details] is incorrect for
> > Lithuanian, it should still be rendered correctly. Attachment 640433 [details]
> > and attachment 640439 [details] are examples of its correct rendering.
> Are you sure that attachment 640433 [details] and attachment 640439 [details]
> are examples of its correct rendering? They look pretty broken to
> me. As you said in comment 7, "visual layout should be the stress mark above
> the dot above the (dotless) letter i.". In attachment 640433 [details] and
> attachment 640439 [details] "visual layout is the dot above the stress mark
> above the (dotless) letter i." (dot on the top instead of stress mark).

Sorry for the confusion, I meant that these were correct renderings of the incorrect attachment. Since it consists of an accented lowercase letter i followed by a dot above, the dot above the accent above the letter is the expected rendering (at least that is what I would expect).

(In reply to Jonathan Kew (:jfkthame) from comment #12)
> Language-specific casing behavior such as the Lithuanian rendering of "i"
> with accent should only apply when the content is explicitly tagged as being
> in the relevant language. Attachment 640404 [details] does not have a
> lang="lt" attribute, so I would not expect it to show Lithuanian behavior;
> it should use the generic Unicode case-mapping behavior.

Correct, although we don't have any casing behavior in the attachments so far.

> In addition, the results will still be font-dependent. Some OpenType fonts
> include Lithuanian-specific tables that will adapt the behavior of accents
> with "i" appropriately (but again, this is dependent on the data being
> correctly tagged for language); others do not.

Not sure about that; I guess it depends on the OpenType extensions in use. IIRC, some rendering systems (such as Apple's Quartz) specify language rules themselves, while others (e.g. Graphite) rely on font authors to specify them.

(In reply to Alexander L. Slovesnik from comment #13)
> I've attached hopefully more correct testcase. It renders great on latest
> Firefox nightly for Linux with Palemonas font installed. From brief testing,
> rendering is not affected by presence or absence of <html lang="lt">
> attribute, only Palemonas font declaration matters.

(In reply to Gordon P. Hemsley [:gphemsley] from comment #14)
> This is what I see with the new testcase: Two dots above! (One I assume just
> comes from dotted i.) But otherwise, it still shows the diacritic on top,
> rather than to the side.

That's what I would expect. Like I said, it works with Palemonas because it has dedicated glyphs for letters like i̇̀ and specifies them as OpenType ligatures. What we should be fixing, I guess, is the cases where the user doesn't have such a font, such as Gordon's. But we can only fix this if we have our own rendering system, and I don't think we do. AFAIK, we rely on DirectWrite, Pango and Quartz for that.

I'm going to boldly assume that this works now. Reopen if needed!

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → INACTIVE

The results are font-dependent, but it's not clear to me that we can/should do anything to try and override the rendering with fonts that don't handle it properly.
