The new Find Whole Word / Find Exact String option does not find Chinese words

Status: NEW
Assignee: Unassigned
Product: Toolkit
Component: Find Toolbar
Opened: 9 months ago
Updated: 5 months ago
Reporter: JonathanW
Keywords: intl
Firefox Tracking Flags: Not tracked

(Reporter)

Description

9 months ago
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:45.0) Gecko/20100101 Firefox/45.0
Build ID: 20160725105554

Steps to reproduce:

In the nightly beta, use Find in Page with the new whole-word option enabled to search for the single-character word 在 on the page https://zh.wikipedia.org/wiki/Wikipedia:%E9%A6%96%E9%A1%B5


Actual results:

Only one instance is found: " 在1960", the one in which the word is surrounded by non-CJK characters.


Expected results:

Many matches should show up, basically the same set as when whole-word matching is not used.
(Reporter)

Comment 1

9 months ago
If you try a similar search in other browsers that support the whole word option, they do find Chinese words even with the whole word option selected.

The current code appears to rely on a word breaker that places a break wherever the character class changes. I suspect it does not work well for any language that does not separate words with spaces, such as Thai, Chinese, or Japanese. There is more information about languages that do not use spaces here: https://r12a.github.io/scripts/tutorial/part5
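
To illustrate the suspicion above, here is a minimal Python sketch of a breaker that only sees a word boundary where the character class changes. It is not the actual Firefox code; the class names, the helpers `char_class`, `is_break`, `find_whole_word`, and `find_plain`, and the sample string are all hypothetical and chosen only to mimic the reported behaviour.

```python
def char_class(ch):
    """Assumed coarse character classes; a real word breaker is more detailed."""
    if ch.isspace():
        return "space"
    if "\u4e00" <= ch <= "\u9fff":  # CJK Unified Ideographs block only
        return "han"
    if ch.isalnum():
        return "alnum"
    return "other"


def is_break(text, i):
    """True if a boundary falls between text[i - 1] and text[i]."""
    if i <= 0 or i >= len(text):
        return True  # start and end of text always count as boundaries
    return char_class(text[i - 1]) != char_class(text[i])


def find_plain(text, needle):
    """All match offsets, ignoring word boundaries."""
    return [i for i in range(len(text)) if text.startswith(needle, i)]


def find_whole_word(text, needle):
    """Match offsets where both ends of the match sit on a class change."""
    return [
        i
        for i in find_plain(text, needle)
        if is_break(text, i) and is_break(text, i + len(needle))
    ]


sample = "他在北京, 在1960年"
print(find_plain(sample, "在"))       # offsets of every occurrence
print(find_whole_word(sample, "在"))  # only occurrences with class changes on both sides
```

With this sample, the occurrence inside the run of Han characters is rejected because its neighbours belong to the same class, while the occurrence between a space and a digit is accepted, which matches the " 在1960" result in the description.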
Blocks: 269422
Component: Untriaged → Find Toolbar
Keywords: intl
OS: Unspecified → Linux
Product: Firefox → Toolkit
Hardware: Unspecified → All
Version: 45 Branch → unspecified

Comment 2

9 months ago
"which only matches strings surrounded by word-breaking characters, like spaces or punctuation marks in latin-derived languages.", from bug 1282759.
Blocks: 269442
No longer blocks: 269422
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Linux → All