Selection must be done by grapheme cluster boundaries




14 years ago
12 years ago


(Reporter: samphan, Assigned: smontagu)


(Blocks: 2 bugs, {intl})

Windows XP




(1 attachment)



14 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.5) Gecko/20041107 Firefox/1.0

A grapheme cluster is what users think of as a character, regardless of its
underlying representation, e.g. a base character plus combining characters.
For example, Á (A +  ́) is a grapheme cluster in Latin and นี้ (น +  ี +  ้) is a
grapheme cluster in Thai. For rendering, editing, and other text processing,
the grapheme cluster is treated as a single unit. See bug 283271 for background.
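
To make the definition concrete, here is a minimal Python sketch that groups a base character with the combining marks that follow it. It is only an approximation: the function name grapheme_clusters is made up for illustration, the rule "general category Mn, Mc or Me extends the previous cluster" is a simplification of the full Unicode segmentation rules, and it is not how Gecko implements clusters.

```python
import unicodedata

# Assumption: characters in these general categories attach to the
# preceding base character. Real segmentation has more rules than this.
COMBINING_CATEGORIES = ("Mn", "Mc", "Me")

def grapheme_clusters(text):
    """Split text into approximate grapheme clusters."""
    clusters = []
    for ch in text:
        if clusters and unicodedata.category(ch) in COMBINING_CATEGORIES:
            # Combining mark: extend the cluster started by the base character.
            clusters[-1] += ch
        else:
            # Base character (or a stray leading mark): start a new cluster.
            clusters.append(ch)
    return clusters

# A + combining acute, then the Thai cluster from the description.
sample = "A\u0301\u0e19\u0e35\u0e49"
print(len(sample), len(grapheme_clusters(sample)))  # prints "5 2"
```

The sample holds five Unicode characters but only two units that a user would call characters, which is the mismatch this bug is about.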

The caret must always be on a cluster boundary. A mouse click on a cluster must
put the caret either before the cluster or after the cluster. Mouse selection
must also fall on cluster boundaries. Users must not be able to make a selection
that contains part of a cluster, either at the beginning or the end of the selection.
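
The requirement above could be sketched as follows in Python. Everything here is illustrative: cluster_boundaries, snap_caret and snap_selection are hypothetical names, the boundary test (a combining mark of category Mn, Mc or Me never starts a cluster) is a simplification, offsets are counted in Unicode characters rather than the units an editor uses internally, and widening a partial selection is just one possible policy that this report does not prescribe.

```python
import unicodedata

# Assumption: combining marks never start a cluster (simplified rule).
COMBINING_CATEGORIES = ("Mn", "Mc", "Me")

def cluster_boundaries(text):
    """Return the offsets at which a caret may legally sit."""
    bounds = [i for i, ch in enumerate(text)
              if i == 0 or unicodedata.category(ch) not in COMBINING_CATEGORIES]
    bounds.append(len(text))  # the end of the text is always a boundary
    return bounds

def snap_caret(text, offset):
    """Move a raw offset to the nearest boundary (ties go to the earlier one)."""
    return min(cluster_boundaries(text), key=lambda b: abs(b - offset))

def snap_selection(text, start, end):
    """Widen [start, end) so that it never contains part of a cluster."""
    start = max(0, min(start, len(text)))
    end = max(start, min(end, len(text)))
    bounds = cluster_boundaries(text)
    new_start = max(b for b in bounds if b <= start)
    new_end = min(b for b in bounds if b >= end)
    return new_start, new_end

text = "A\u0301\u0e19\u0e35\u0e49"      # boundaries at 0, 2 and 5
print(snap_caret(text, 3))            # prints 2
print(snap_selection(text, 1, 3))     # prints (0, 5)
```

A click that lands at offset 3, inside the Thai cluster, ends up at offset 2, and a selection covering offsets 1 to 3 grows to cover both clusters in full.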

Currently, however, in Mozilla applications on Windows (and *nix builds without
--enable-ctl), mouse selection is done by Unicode character.

Reproducible: Always

Steps to Reproduce:
1) Load the attached HTML sample below. It consists of a text input field with 5
grapheme clusters, 2 Latin and 3 Thai.
2) Click somewhere in the text.
3) Type a space to see where the caret actually is, then press Backspace to remove it.
4) Repeat steps 2-3.
5) Try to select some clusters at random and copy them to the clipboard.
6) Paste the copied text somewhere to see what was actually copied.
7) Repeat steps 5-6.
Actual Results:  
You will find that you can move the caret inside a cluster and select part of
a cluster.

Expected Results:  
Selection and the caret must be on cluster boundaries.

Comment 1

14 years ago
Try moving the caret and making a selection with the mouse.


14 years ago
Blocks: 283271


14 years ago
Keywords: intl
see also,
bug 100173 : (Solaris) Thai language selection broken

related bug,
bug 283415 : (Windows) Caret must be moved by grapheme cluster boundaries
Ever confirmed: true
Priority: -- → P3


14 years ago
Blocks: 65896

I understand that this has implications for line wrapping, both in general and inside an edit box. See bug 119860 comment 21 for an example with Unicode characters in the Combining Diacritical Marks block.
Is there a tracking bug about issues related to Combining Diacritical Marks?

Another implication is the line wrapping done in the email notifications of the Bugzilla bug tracking system (and product) when UTF-8 is used. If &#nnnn; encoding is used, it would be nice if neither these numeric references nor HTML entities were broken across lines.

best regards reinhardt [[user:gangleri]]


13 years ago
Blocks: 321607

Comment 4

13 years ago
*** Bug 202354 has been marked as a duplicate of this bug. ***

Comment 5

12 years ago
This appears to be WORKSFORME on trunk; reopen if you can still reproduce in a trunk build.
Last Resolved: 12 years ago
Resolution: --- → WORKSFORME