find should normalize strings for equivalents

RESOLVED DUPLICATE of bug 202251

Status

enhancement
13 years ago
3 years ago

People

(Reporter: moyogo, Unassigned)

Tracking

(Blocks 1 bug, {intl})

1.8.0 Branch
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en; rv:1.8.1.2) Gecko/20061201 Epiphany/2.18 Firefox/2.0.0.2 (Ubuntu-feisty)
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en; rv:1.8.1.2) Gecko/20061201 Epiphany/2.18 Firefox/2.0.0.2 (Ubuntu-feisty)

When searching for text on a web page, precomposed characters are not
matched by their canonically equivalent decomposed sequences (or vice versa), although Unicode defines them as equivalent.

Example: 
- a page has "école" that's with <U+00E9 LATIN SMALL LETTER E WITH ACUTE>.
- search for "é" <U+0065 LATIN SMALL LETTER E;U+0301 COMBINING ACUTE ACCENT>

Result: "école" is not matched, yet the decomposed "é" should match the precomposed "é".
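
To illustrate the mismatch outside the browser, here is a minimal Python sketch. It is not how the find toolbar is implemented; the helper names `codepoints` and `naive_find` are made up for this example. It only shows that the two spellings of "é" are different code point sequences, so a plain substring search cannot pair them up.

```python
import unicodedata

# The two spellings from the example above.
precomposed = "\u00e9"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def codepoints(text):
    """List the code points of a string as U+XXXX labels (hypothetical helper)."""
    return [f"U+{ord(ch):04X}" for ch in text]


def naive_find(page, query):
    """Plain substring search with no normalization; returns -1 when not found."""
    return page.find(query)


if __name__ == "__main__":
    print(codepoints(precomposed))
    print(codepoints(decomposed))
    print(naive_find("\u00e9cole", decomposed))
    # The two forms are canonically equivalent: NFC composes one into the other.
    print(unicodedata.normalize("NFC", decomposed) == precomposed)
```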

Reproducible: Always
Keywords: intl
Component: Find Toolbar / FastFind → Keyboard: Find as you Type
Product: Firefox → Core
QA Contact: fast.find → keyboard.fayt
(In reply to comment #1)
> Related bug: bug 202251

Both bugs should be fixed together, although I'd suggest fixing bug 374795 (this bug) first.

To fix 374795, it would be enough to normalize strings to NFD before comparison.
To fix 202251, the NFD strings could be stripped of some categories of characters (combining marks, enclosing marks, nonspacing marks, and some punctuation). One might want to go as far as NFKD instead of NFD for opaque match, i.e. ² would match 2.
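
A rough sketch of both suggestions, in Python rather than in the actual find code. The function names `nfd_contains`, `strip_marks` and `loose_contains` are invented here, and dropping every character whose general category starts with "M" is one possible reading of "combining marks, enclosing marks, nonspacing marks"; punctuation stripping is left out.

```python
import unicodedata


def nfd_contains(page, query):
    """Bug 374795 idea: normalize both strings to NFD, then compare."""
    return unicodedata.normalize("NFD", query) in unicodedata.normalize("NFD", page)


def strip_marks(text, form="NFD"):
    """Decompose, then drop Mark characters (categories Mn, Mc, Me)."""
    decomposed = unicodedata.normalize(form, text)
    return "".join(
        ch for ch in decomposed
        if not unicodedata.category(ch).startswith("M")
    )


def loose_contains(page, query, form="NFD"):
    """Bug 202251 idea: compare after stripping marks; form may be NFD or NFKD."""
    return strip_marks(query, form) in strip_marks(page, form)


if __name__ == "__main__":
    page = "\u00e9cole"                           # precomposed é
    print(nfd_contains(page, "e\u0301"))          # decomposed query
    print(loose_contains(page, "ecole"))          # unaccented query
    print(loose_contains("x\u00b2", "x2", form="NFKD"))
```

One caveat with comparing raw NFD strings: a query of plain "e" also matches the first code point of a decomposed "é", so a real implementation would probably want to check that a match does not end in the middle of a combining sequence.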
(In reply to comment #2)
> One might want to go as far as NFKD instead of NFD for opaque
> match, i.e. ² would match 2.

In that case, there should be a checkbox or something - overgeneralizing at search engines is annoying. Surprisingly, Google searches for “²” and not “2”, though, while Yahoo and Yandex search for “2”.
OK, let’s consider this not‐a‐duplicate.

I am now using Unicode quotes and dashes all the time, so I support this enhancement (and it is an enhancement, if I am right that no other browser has this feature).
I am not sure if, say, ‘«’, ‘“’ and ‘"’ can be all considered equivalent without custom configuration data…
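
On the quotes question: as far as the Unicode data goes, ‘«’ and ‘“’ have no decomposition, so neither NFD nor NFKD turns them into ‘"’. Treating them as equivalent would need an extra table. Below is a small Python sketch of that idea; the table `QUOTE_FOLD` and the function `fold_quotes` are hypothetical, and the particular set of characters in the table is only an assumption about what a user might expect.

```python
import unicodedata

# Hypothetical folding table: which quotes count as "the same" is a policy
# choice, not something Unicode normalization defines.
QUOTE_FOLD = str.maketrans({
    "\u00ab": '"',   # «
    "\u00bb": '"',   # »
    "\u201c": '"',   # “
    "\u201d": '"',   # ”
    "\u201e": '"',   # „
    "\u2018": "'",   # ‘
    "\u2019": "'",   # ’
})


def fold_quotes(text):
    """Map typographic quotes to their ASCII counterparts using the table above."""
    return text.translate(QUOTE_FOLD)


if __name__ == "__main__":
    # Normalization alone leaves these characters untouched.
    print(unicodedata.normalize("NFKD", "\u00ab") == "\u00ab")
    print(fold_quotes("\u00abfoo\u00bb") == fold_quotes("\u201cfoo\u201d"))
```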
Severity: normal → enhancement
Status: UNCONFIRMED → NEW
Ever confirmed: true
Version: unspecified → Trunk
Probably comment #4 is better suited for bug 202251, though.
Product: Core → SeaMonkey
This originated as a Firefox bug, so it shouldn't have been moved to SeaMonkey, I think.
Component: Find In Page → General
OS: Linux → All
Product: SeaMonkey → Firefox
QA Contact: keyboard.fayt → general
Hardware: x86 → All
See Also: → 202251
Blocks: 693035
Component: General → Find Toolbar
Product: Firefox → Toolkit
QA Contact: general → fast.find
Version: Trunk → 1.8.0 Branch
Duplicate of this bug: 693035
No longer blocks: 693035
Duplicate of this bug: 982497
Duplicate of this bug: 670601
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 202251