Closed Bug 283271 Opened 20 years ago Closed 17 years ago

CTL cluster-based operations unsupported on Windows

Categories

(Core :: Internationalization, defect)

x86
Windows XP
defect
Not set
normal

RESOLVED DUPLICATE of bug 65896

People

(Reporter: samphan, Assigned: smontagu)

References

Details

(Keywords: intl, meta)

Attachments

(1 file)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20050206 Firefox/1.0
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20050206 Firefox/1.0

Background:

8<-- Unicode 4.0, Section 2.10, Combining Characters -->8
This core concept is known as a grapheme cluster, and it consists of any combining character sequence that contains only nonspacing combining marks, or any sequence of characters that constitutes a Hangul syllable (possibly followed by one or more nonspacing marks). An implementation operating on such a cluster would almost never want to break between its elements for rendering, editing, or other such text process; the grapheme cluster is treated as a single unit.

8<-- 5.11 Editing and Selection, Consistent Text Elements -->8
As far as a user is concerned, the underlying representation of text is not a material concern, but it is important that an editing interface present a uniform implementation of what the user thinks of as characters. (See “‘Characters’ and Grapheme Clusters” in Section 2.10, Combining Characters.) The user expects them to behave as units in terms of mouse selection, arrow key movement, backspacing, and so on. For example, when such behavior is implemented, and an accented letter is represented by a sequence of base character plus nonspacing combining mark, using the right arrow key would logically skip from the start of the base character to the end of the last nonspacing character. In some cases, editing a user-perceived “character” or visual cluster element by element may be the preferred way. For example, a system might have the backspace key delete by using the underlying code point, while the delete key could delete an entire cluster.

8<-- 5.12 Strategies for Handling Nonspacing Marks -->8
When searching strings, remember to check for additional nonspacing marks in the target string that may affect the interpretation of the last matching character.
8<-------------------------------------->8

Besides being shaped correctly, a grapheme cluster must be treated as a unit in the following operations:

1) caret movement (arrow left, right) must respect cluster boundaries
2) mouse selection must start and end on cluster boundaries
3) search must not match only part of a cluster
4) editing: backspace deletes by code point, delete deletes by cluster

Reproducible: Always

Steps to Reproduce:

Example/test case:

A grapheme cluster in Latin may be Áà (contains 4 code points A+ ̀+A+ ̃). And in Thai it may be นี้ดี (น+ ี+ ้+ด+ ี). I'll use "ÁÃนี้ดี" as the test case.

ÁÃนี้ดี

(Shaping may not work correctly. The nonspacing accents should be on the letter A.)

1) caret movement - see bug 100170 (on Solaris)

The caret should move this way:
|Á|Ã|นี้|ดี|
But currently it moves this way:
|A| ́|A| ̃|น| ี| ้|ด| ี|
It looks as if the caret has stopped whenever it moves past a nonspacing mark.

2) mouse selection - see bug 100173 (on Solaris)

Users should not be able to mouse-select only part of a grapheme cluster. Currently, they can. Try selecting only ดี: you may also get ้ from นี้, or you may get only ด. A mouse click on à must move the caret to either before à or after Ã, but not between A and ̃ .

3) search - see bug 157534

Search operations should not find นี in นี้ or A in Ã. However, Mozilla doesn't check whether there is a nonspacing mark to the right of a match (in other words, it doesn't check that matches begin and end at cluster boundaries), so parts of grapheme clusters are marked as matches.

4) deletion - see bug 157546

ÁÃ|นี้ดี
Pressing 'Delete' should delete the whole grapheme cluster 'นี้':
ÁÃ|ดี
Otherwise you'll get:
ÁÃี้ดี

Most of the work for CTL support has been implemented and tested on Linux and Solaris, though the CTL feature only works when enabled through the --enable-ctl build option. However, no work has been done for Windows platforms (except shaping, see bug 218887).
A bug has been filed that suggests the need for generic cross-platform handling of grapheme clusters (bug 229896), but there is still no solution. In any case, we need cluster support on Windows for Mozilla/Firefox/Thunderbird to work correctly with many CTL languages on Windows.
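The caret and search behaviour requested above can be illustrated with a minimal Python sketch. It is not Mozilla code. It assumes a deliberately simplified cluster rule (a base character followed by any run of characters in Unicode categories Mn or Me), which is narrower than the full grapheme cluster rules in Unicode, and the function names are hypothetical.

```python
import unicodedata

# Test string from this report, written with explicit code points:
# A + U+0301, A + U+0303, U+0E19 U+0E35 U+0E49, U+0E14 U+0E35
SAMPLE = "A\u0301A\u0303\u0e19\u0e35\u0e49\u0e14\u0e35"


def is_mark(ch: str) -> bool:
    """Simplified test: nonspacing (Mn) or enclosing (Me) mark."""
    return unicodedata.category(ch) in ("Mn", "Me")


def cluster_boundaries(text: str) -> list[int]:
    """Code point offsets where the caret is allowed to stop."""
    stops = [i for i, ch in enumerate(text) if i == 0 or not is_mark(ch)]
    stops.append(len(text))
    return stops


def find_cluster_aligned(text: str, needle: str) -> list[int]:
    """Offsets of matches that begin and end on cluster boundaries."""
    stops = set(cluster_boundaries(text))
    hits = []
    start = text.find(needle)
    while start != -1:
        if start in stops and start + len(needle) in stops:
            hits.append(start)
        start = text.find(needle, start + 1)
    return hits


if __name__ == "__main__":
    print(cluster_boundaries(SAMPLE))
    print(find_cluster_aligned(SAMPLE, "\u0e19\u0e35"))
    print(find_cluster_aligned(SAMPLE, "\u0e19\u0e35\u0e49"))
```

With this rule the caret stops only at the four cluster edges of the test string, and a search for the two code points U+0E19 U+0E35 is rejected because the match would end in the middle of a cluster.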
Attached file The test case in HTML
Since bugzilla text is monospace, you should use the HTML test case to try the things I mentioned.
Summary: CTL cluster-based operation unsupport → CTL cluster-based operations unsupported
Did you file this bug as a meta bug/tracker? As for point #3, there are cases in which that's a feature. For instance, in Korean, incremental search should match '각', '갈', '간' and other syllables that begin with '가' when '가' is searched for. (unfortunately, the only program I've ever seen implement this incremental search correctly for Korean was a Korean emacs of the mid-1990's) The same can be said of deletion, insert, selection, and cursor movement. In many cases, grapheme-clusters had better be atomic, but not always.
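The Korean case described above can be sketched as follows. This is an illustration only, assuming precomposed Hangul syllables (U+AC00 to U+D7A3) and the standard arithmetic layout of that block, in which each lead-consonant-plus-vowel pair owns 28 consecutive code points, one per optional final consonant. The function name is hypothetical.

```python
S_BASE = 0xAC00   # first precomposed Hangul syllable
S_COUNT = 11172   # number of precomposed syllables
T_COUNT = 28      # final-consonant slots per lead+vowel pair (slot 0 = none)


def incremental_match(typed: str, candidate: str) -> bool:
    """True if `candidate` starts with `typed`, where the last typed
    syllable may still gain a final consonant (e.g. U+AC00 matches
    U+AC01, U+AC04, U+AC08)."""
    if not typed:
        return True
    if len(candidate) < len(typed):
        return False
    head, last = typed[:-1], typed[-1]
    if candidate[:len(head)] != head:
        return False
    other = candidate[len(head)]
    if other == last:
        return True
    li = ord(last) - S_BASE
    oi = ord(other) - S_BASE
    if not (0 <= li < S_COUNT and 0 <= oi < S_COUNT):
        return False
    # Only an open syllable (no final consonant) can still be extended,
    # and only to syllables with the same lead consonant and vowel.
    return li % T_COUNT == 0 and li // T_COUNT == oi // T_COUNT


if __name__ == "__main__":
    for syllable in ("\uac01", "\uac08", "\uac04", "\ub098"):
        print(hex(ord(syllable)), incremental_match("\uac00", syllable))
```

A search layer could use such a predicate for incremental find while still using strict cluster-boundary matching for ordinary find, which is one way to accommodate the "not always atomic" point.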
P'Samphan, I think you had better specify the Unicode code point of each character (e.g. U+0E01) along with the actual character. That would make the problem quicker to reproduce and fix.
can be handy but not necessary. See bug 283146 for a different approach.
In my opinion, this is not a tracker. While it does list the previous and ongoing approaches to the CTL cluster-based operations problem, none of them addresses the point of this bug: having universal CTL cluster-based operations support on every platform (not just Linux or Solaris). Bug 229896 is about a grapheme breaker/iterator, which generates 'a list of graphemes' from a string. This bug is about how to handle those graphemes once we have them (in editing, caret movement, highlight/selection, and search operations). As #2 said, there are exceptions to handle after we have the list of graphemes.
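To illustrate the split between breaking text into graphemes and operating on them, here is a small Python sketch of the editing rule from the original report: Delete removes a whole cluster, Backspace removes one code point. It assumes the same simplified cluster rule (base character plus following Mn/Me marks) and hypothetical function names, and is not taken from any Mozilla source.

```python
import unicodedata

SAMPLE = "A\u0301A\u0303\u0e19\u0e35\u0e49\u0e14\u0e35"


def next_boundary(text: str, pos: int) -> int:
    """Offset just past the cluster that starts at `pos` (pos < len(text))."""
    end = pos + 1
    while end < len(text) and unicodedata.category(text[end]) in ("Mn", "Me"):
        end += 1
    return end


def delete_forward(text: str, caret: int) -> tuple[str, int]:
    """Delete key: remove the whole cluster to the right of the caret."""
    if caret >= len(text):
        return text, caret
    return text[:caret] + text[next_boundary(text, caret):], caret


def backspace(text: str, caret: int) -> tuple[str, int]:
    """Backspace key: remove a single code point to the left of the caret."""
    if caret <= 0:
        return text, caret
    return text[:caret - 1] + text[caret:], caret - 1


if __name__ == "__main__":
    # Caret after the two Latin clusters, before the first Thai cluster.
    print(delete_forward(SAMPLE, 4))
    # Caret after the first Thai cluster: only the tone mark is removed.
    print(backspace(SAMPLE, 7))
```

The operations only need a boundary query from the grapheme breaker, so per-language exceptions could be handled by swapping the boundary function rather than the editing code.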
You have to file separate bugs on each of those issues (per platform if applicable). You cannot have a giant bug like this if it's not a tracker. Absolutely nothing will be done unless this is broken into smaller pieces. By the way, in comment #4, I meant bug 281339.
sorry for bug spam. I really meant bug 260663
Changed to be a tracker bug.

Note: cell is the same as cluster.

1) caret movement: bug 283415 - Windows : Caret must be moved by grapheme cluster boundaries
2) mouse selection: bug 283416 - Windows : Selection must be done by grapheme cluster boundaries
3) search: bug 157534 - All : Edit->Find in Page found substring in Thai display cell, but it shouldn't be
4) deletion: bug 157546 - All : IM: <delete> key should delete WHOLE Thai "display cell"
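For the selection item in the list above, a minimal Python sketch of snapping a click or drag range to cluster boundaries follows. It works on code point offsets only and assumes a simplified cluster rule (base character plus following Mn/Me marks). A real implementation would snap using glyph geometry, and the function names and the tie-breaking choice are assumptions made for illustration.

```python
import unicodedata

SAMPLE = "A\u0301A\u0303\u0e19\u0e35\u0e49\u0e14\u0e35"


def cluster_boundaries(text: str) -> list[int]:
    """Offsets at cluster edges under the simplified Mn/Me rule."""
    stops = [
        i for i, ch in enumerate(text)
        if i == 0 or unicodedata.category(ch) not in ("Mn", "Me")
    ]
    stops.append(len(text))
    return stops


def snap_caret(text: str, pos: int) -> int:
    """Move `pos` to the nearest cluster boundary (ties go to the left)."""
    return min(cluster_boundaries(text), key=lambda s: (abs(s - pos), s))


def snap_selection(text: str, start: int, end: int) -> tuple[int, int]:
    """Snap both ends of a selection so it never splits a cluster."""
    return snap_caret(text, start), snap_caret(text, end)


if __name__ == "__main__":
    # A click between A and its tilde lands before the cluster.
    print(snap_caret(SAMPLE, 3))
    # A drag that starts on the tone mark selects only the last cluster.
    print(snap_selection(SAMPLE, 6, 9))
```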
Depends on: 283415, 283416
Keywords: intl, meta
Summary: CTL cluster-based operations unsupported → CTL cluster-based operations unsupported on Windows
Depends on: 157534, 157546
Blocks: thai
This is an automated message, with ID "auto-resolve01".

This bug has had no comments for a long time. Statistically, we have found that bug reports that have not been confirmed by a second user after three months are highly unlikely to be the source of a fix to the code.

While your input is very important to us, our resources are limited and so we are asking for your help in focussing our efforts. If you can still reproduce this problem in the latest version of the product (see below for how to obtain a copy) or, for feature requests, if it's not present in the latest version and you still believe we should implement it, please visit the URL of this bug (given at the top of this mail) and add a comment to that effect, giving more reproduction information if you have it.

If it is not a problem any longer, you need take no action. If this bug is not changed in any way in the next two weeks, it will be automatically resolved. Thank you for your help in this matter.

The latest beta releases can be obtained from:
Firefox: http://www.mozilla.org/projects/firefox/
Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html
Seamonkey: http://www.mozilla.org/projects/seamonkey/
Since bug 65896 already tracks all Thai support bugs, this bug should be closed to clean up.
Per comment #10, I will close this bug as a duplicate of bug 65896 (so that it refers back here). Note: this bug has a more detailed description of cluster-based operations than bug 65896.
Status: UNCONFIRMED → RESOLVED
Closed: 17 years ago
Resolution: --- → DUPLICATE