find should normalize strings for equivalents

RESOLVED DUPLICATE of bug 202251

Status

enhancement
12 years ago
3 years ago

People

(Reporter: moyogo, Unassigned)

Tracking

(Blocks: 1 bug, {intl})

1.8.0 Branch

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

12 years ago
User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en; rv:1.8.1.2) Gecko/20061201 Epiphany/2.18 Firefox/2.0.0.2 (Ubuntu-feisty)
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en; rv:1.8.1.2) Gecko/20061201 Epiphany/2.18 Firefox/2.0.0.2 (Ubuntu-feisty)

When searching for text on a web page, precomposed characters are not
matched by their canonically equivalent decomposed sequences, as defined by Unicode.

Example: 
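- A page contains "école", spelled with <U+00E9 LATIN SMALL LETTER E WITH ACUTE>.
- Search for "é" entered as <U+0065 LATIN SMALL LETTER E; U+0301 COMBINING ACUTE ACCENT>.

Result: "école" is not matched, although the decomposed "é" (U+0065 U+0301) is canonically equivalent to the precomposed "é" (U+00E9) and should match it.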

Reproducible: Always
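
The mismatch in the example above can be reproduced outside the browser. The following is only an illustrative sketch in Python, not the find code itself; the names `page_text`, `query` and `code_points` are made up for this example.

```python
import unicodedata

# Hypothetical stand-ins for the page content and the search term.
page_text = "\u00e9cole"  # "école" with precomposed U+00E9
query = "e\u0301"         # "é" typed as U+0065 followed by U+0301

def code_points(s):
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(c):04X}" for c in s]

# A plain code-point comparison does not find the query in the page.
naive_hit = query in page_text

# After canonical decomposition (NFD) the first character of the page
# and the query are the same sequence of code points.
canonically_equal = (
    unicodedata.normalize("NFD", page_text[0])
    == unicodedata.normalize("NFD", query)
)

print(code_points(page_text[0]), code_points(query))
print(naive_hit, canonically_equal)
```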

Updated

12 years ago
Keywords: intl
Component: Find Toolbar / FastFind → Keyboard: Find as you Type
Product: Firefox → Core
QA Contact: fast.find → keyboard.fayt
(Reporter)

Comment 2

12 years ago
(In reply to comment #1)
> Related bug: bug 202251

Both bugs should be fixed together, although I'd suggest fixing bug 374795 (this bug) first.

To fix 374795, it would be enough to normalize strings to NFD before comparison.
To fix 202251, the NFD strings could be stripped of some categories of characters (combining marks, enclosing marks, nonspacing marks, and some punctuation). One might want to go as far as NFKD instead of NFD for opaque match, i.e. ² would match 2.
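
The two steps suggested above could be sketched roughly as follows. This is a hedged illustration in Python using the standard `unicodedata` module, not a proposal for the actual find implementation; the helper names `nfd_contains`, `strip_marks` and `loose_contains` are hypothetical. It assumes that "stripping" means dropping characters in the general categories Mn (nonspacing mark) and Me (enclosing mark), and it leaves the punctuation handling out because the comment does not say which punctuation is meant.

```python
import unicodedata

def nfd_contains(haystack, needle):
    """Bug 374795: compare after normalizing both strings to NFD."""
    return (unicodedata.normalize("NFD", needle)
            in unicodedata.normalize("NFD", haystack))

def strip_marks(s, form="NFD"):
    """Decompose, then drop nonspacing (Mn) and enclosing (Me) marks.

    Assumption: these two categories are the ones meant by the comment.
    """
    decomposed = unicodedata.normalize(form, s)
    return "".join(c for c in decomposed
                   if unicodedata.category(c) not in ("Mn", "Me"))

def loose_contains(haystack, needle, form="NFD"):
    """Bug 202251: compare after decomposing and stripping marks.

    Pass form="NFKD" to also fold compatibility characters such as U+00B2.
    """
    return strip_marks(needle, form) in strip_marks(haystack, form)

print(nfd_contains("\u00e9cole", "e\u0301"))       # precomposed page, decomposed query
print(loose_contains("\u00e9cole", "ecole"))       # accent-insensitive match
print(loose_contains("x\u00b2", "x2"))             # NFD keeps the superscript two distinct
print(loose_contains("x\u00b2", "x2", "NFKD"))     # NFKD folds it to a plain 2
```

Note that a real find feature would also have to map match positions in the normalized string back to offsets in the original text in order to highlight the result, which this sketch does not attempt.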

Comment 3

12 years ago
(In reply to comment #2)
> One might want to go as far as NFKD instead of NFD for opaque
> match, i.e. ² would match 2.

In that case, there should be a checkbox or something; overgeneralizing in search engines is annoying. Surprisingly, Google searches for “²” and not “2”, though, while Yahoo and Yandex search for “2”.

Comment 4

12 years ago
OK, let’s consider this not‐a‐duplicate.

I am now using Unicode quotes and dashes all the time, so I support this enhancement (and it is an enhancement, if I am right that no other browser has this feature).
I am not sure if, say, ‘«’, ‘“’ and ‘"’ can be all considered equivalent without custom configuration data…
Severity: normal → enhancement
Status: UNCONFIRMED → NEW
Ever confirmed: true
Version: unspecified → Trunk

Comment 5

12 years ago
Probably comment #4 is better suited for bug 202251, though.
(Assignee)

Updated

11 years ago
Product: Core → SeaMonkey

Comment 6

8 years ago
This originated as a Firefox bug, so I don't think it should have been moved to SeaMonkey.
Component: Find In Page → General
OS: Linux → All
Product: SeaMonkey → Firefox
QA Contact: keyboard.fayt → general
Hardware: x86 → All
See Also: → bug 202251

Updated

8 years ago
Blocks: 693035

Updated

8 years ago
Component: General → Find Toolbar
Product: Firefox → Toolkit
QA Contact: general → fast.find
Version: Trunk → 1.8.0 Branch

Updated

8 years ago
Duplicate of this bug: 693035

Updated

8 years ago
No longer blocks: 693035
Duplicate of this bug: 982497

Updated

4 years ago
Duplicate of this bug: 670601
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 202251