Ascending stacked characters do not fit within the constraints of a text input box, and are invisible or only partly visible within it. This is a problem for languages with stacked characters, such as Tibetan (there are many other, more popular character sets that use stacks). Descending stacked characters are not displayed well either. This bug requires a stacking Unicode font to see; one is available from bug 107765, to which this bug is related. (Assigner may want to move the component over to internationalization.) There may be no solution for this. In my opinion it depends upon the whole concept of 'baseline' as a standard artefact of user interfaces. This is why I have marked its severity as 'enhancement'.
reporter: this is not a form submission issue. can you please find out the right target component?
Assignee: alexsavulov → rods
Component: Form Submission → HTML Form Controls
QA Contact: vladimire → madhur
Assignee: rods → kin
Testcase is not reachable, and I'm not sure what this bug is about... jshin, do you know? Reporter, if you could attach a testcase to the bug itself, that would be great.
I guess as long as a font with the right metric is used, we're fine. Without a font to support Tibetan, I really can't test this. I know a couple of Tibetan experts and am gonna ask them where I can get Tibetan fonts. http://www.unicode.org/charts/PDF/U0F00.pdf http://www.unicode.org/versions/Unicode4.0.0/ch09.pdf (section 9.11 is about Tibetan) P.S. Mac OS X may come with a Tibetan truetype font with 'mort' table. 'mort' table is not recognized by truetype drivers on Windows and Xft/fontconfig. So, I need an opentype Tibetan font.
Created attachment 141307 [details] Pictural example of input type=text problem with stacked characters Tibetan cannot have a fixed vertical metric: line heights necessarily change according to character context. There are extreme cases that illustrate this - for instance the syllable hamkshamalavaraya ( 0F67 0FB9 0FA8 0FB3 0FBA 0FBC 0FBB ), a vertical stack that will typically require a vertical metric equivalent to three lines or more, depending on the typeface. (Tibetan is predominantly written left to right, like English, but each syllable is represented as a vertical stack; this means that one can have many Unicode characters in what would normally be one character 'space'.) Because the current approach to <input type="text" /> is a static (against text-entry) rendering, it does not dynamically accommodate stacked Unicode characters. I can see just one possible solution to this ethnocentric text input UI challenge: we will need to be able to dynamically resize the input line when necessary; that is, we will need to recalculate the line height during input (or during rendering, when the input has a value attribute).
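To make the "many Unicode characters in one character space" point concrete, here is a minimal Python sketch. It assumes that a stack can be approximated as one base character followed by nonspacing marks (Unicode general category Mn, which covers the Tibetan subjoined letters). The function name `stack_depths` is hypothetical, and the code point count is only a rough proxy for rendered height, which in practice depends on the font and the shaping engine.

```python
import unicodedata
from typing import List

def stack_depths(text: str) -> List[int]:
    """Group each base character with the nonspacing marks (Mn) that
    follow it, and return the number of code points in each group."""
    depths: List[int] = []
    for ch in text:
        if unicodedata.category(ch) == "Mn" and depths:
            # A subjoined letter or other mark: it joins the current stack.
            depths[-1] += 1
        else:
            # Any other character starts a new group.
            depths.append(1)
    return depths

# The seven code points quoted in the comment above.
HAMKSHAMALAVARAYA = "\u0f67\u0fb9\u0fa8\u0fb3\u0fba\u0fbc\u0fbb"

if __name__ == "__main__":
    print(stack_depths(HAMKSHAMALAVARAYA))
    print(stack_depths("abc"))
```

A single-line input sized for one code point per character cell has no room for a group like this, which is the cropping shown in the attachment.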
Thanks for the screenshot. I knew about the vertical stacking, but I didn't know that it could go that far. How does Mac OS X handle Tibetan text rendering? For instance, how does TextEdit handle multiple lines of Tibetan text? Does it use the fixed line height throughout (a paragraph) or does it adjust the line height depending on the maximum height in a given line (i.e. adjacent lines have different heights) ?
Concerning Mac OS X rendering, it dynamically adjusts line heights in edit environments; this can sometimes look quite unpleasant - use random line heights on any text and it is hard to read! The Tibetans use fixed line heights, but then they overlap line-extending characters, in a manner similar to (but NOT the same as!) drop caps; most normally they adjust kerning on the affected lines (most text is written or carved out of woodblock), or overprint if the text remains legible. Back to Mac OS X: single-line input areas and other single-line UI components are cropped, which blinds the user regarding Tibetan text and other Indic, stack-based character sets. (Tibetan is generally acknowledged to be a 'worst case' for this linguistic feature, as the stacks can be of any height whatsoever.) There are certain proposals by the Chinese government on Tibetan encoding and rendering, such as proposal N2621. These are generally considered to be unworkable, and though there is a lot of mess in the Unicode page 0F00, it suffices for Tibetan text. The "hamkshamalavaraya" issue is exceptional, but common - the syllable belongs to the Kalachakra Tantra cycle, which is very popular globally. When Tibetans render it in text, they 'drop cap' it, or overflow it if it is on the bottom line of a page.
Created attachment 141316 [details] Example of safari rendering multiline tibetan - showing variable lines. This shows Safari's current approach to multi-line editing, with a variable line-height. Note that the text input is static, and (IMHO) fails in that the user is blind to the text. (This also affects URL inputs, but let's not go there now!)
Created attachment 141317 [details] Simple UTF-8 html file that displays the stack problem on text inputs.
(In reply to comment #7) > We will need to be able to dynamically resize the input line when necessary; Which means reflowing not just the input, but the whole page on every keystroke. We used to do this, sorta (though the textbox height would never change). On current hardware, that makes it impossible to keep the display in sync with even an average-speed typist. That's why textboxes are reflow roots now.
(In reply to comment #12) > Which means reflowing not just the input, but the whole page on every keystroke. Well, you know that's not literally true. We actually need to -test- for a possible change of line height on each keystroke. As a general rule, we do not need to reflow at all: only when stacking characters are being keyed. Also, even with stacked syllables, we only need to check whether the current stack defines the maximum line height. Moreover, we can use a (yet-to-be-designed) heuristic to keep line-height-based reflows minimal, by quantizing line height accordingly.
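One way the quantizing idea could look, as a rough Python sketch and not a description of anything the layout code actually does: round the required line height up to the next fixed step, and signal a reflow only when the rounded value changes. The names `quantize_line_height` and `LineHeightTracker`, and the base and step sizes used below, are all hypothetical.

```python
import math

def quantize_line_height(required_px: float, base_px: float, step_px: float) -> float:
    """Round the required height up to base_px plus a whole number of steps."""
    if required_px <= base_px:
        return base_px
    steps = math.ceil((required_px - base_px) / step_px)
    return base_px + steps * step_px

class LineHeightTracker:
    """Remembers the current quantized height of a single input line."""

    def __init__(self, base_px: float, step_px: float) -> None:
        self.base_px = base_px
        self.step_px = step_px
        self.current_px = base_px

    def update(self, tallest_stack_px: float) -> bool:
        """Call on each keystroke with the height of the tallest stack on
        the line. Returns True only when the quantized height changed,
        which is the only case where a reflow would be needed."""
        new_px = quantize_line_height(tallest_stack_px, self.base_px, self.step_px)
        changed = new_px != self.current_px
        self.current_px = new_px
        return changed

if __name__ == "__main__":
    tracker = LineHeightTracker(base_px=16, step_px=8)
    for height in (14, 20, 23, 40, 14):
        print(height, tracker.update(height), tracker.current_px)
```

A larger step means fewer reflows at the cost of more unused vertical space. Measuring the tallest stack cheaply on every keystroke is the part this sketch leaves open.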
Hamkshamalavaraya ཧྐྴྨླྺྼྻྃ(0F67 0FB9 0FA8 0FB3 0FBA 0FBC 0FBB 0F83) is a pretty extreme example - certainly not a combination which occurs in normal everyday Tibetan text. The situation can be improved by using a Tibetan font with appropriately scaled glyphs for rendering such extreme combinations. This combination Hamkshamalavaraya (ཧྐྴྨླྺྼྻྃ) in fact only occurs in a handful of religious texts connected with the Kalacakra tantra - and many Tibetan script fonts will not properly render such rarely occurring sequences. The most complex (deep) stacks in normal Tibetan text occur in such words as སྒྲུབ་བརྒྱུད་ - i.e. three consonants and a subjoined vowel. Real-life examples of websites with lots of Tibetan script text can be found at: <http://www.library.gov.bt/index-DZ.html> and <http://www.thdl.org/index.php?lng=tib> Provided you have a proper OpenType shaping engine and the proper fonts installed, the pages on these two sites seem to render pretty well in Firefox, except for lines not breaking properly after character U+0F0B (bug 394954). - chris