Improve search for international users by ignoring accents

RESOLVED DUPLICATE of bug 202251

Status

Firefox
Untriaged
--
minor
4 years ago
4 years ago

People

(Reporter: Marios Titas, Unassigned)

Tracking

26 Branch
x86_64
All
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [bugday-20140212])

(Reporter)

Description

4 years ago
Google Chrome implements a feature that makes searching for text in a page significantly easier for some international users like myself. Suppose that you would like to search for the string "née" but you don't have a keyboard layout that supports French accents. In Google Chrome, typing "nee" in the search box will match "née" in the page.

There is no standard way to search while ignoring accents, but there is a widely used algorithm that works quite well (at least for Western languages). The algorithm is based on a function called unaccent that takes a string s as its input and removes all accents as follows:
1. s:=NFD(s)
2. Remove from s all NonspacingMark characters
3. return NFC(s)
where NFD and NFC are the standard Unicode normalization forms and NonspacingMark is a Unicode character category (Mn). In the first step we separate accents from their base characters, then we remove the accents, and then we put what's left back together. For example, unaccent("ȁȂᾞçĢžᾧ") returns "aAΗcGzω".
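
As a rough illustration only (this is not Firefox or Chrome code), the three steps above can be sketched in Python with the standard unicodedata module. The function name unaccent comes from the description; everything else is an assumption made for the sketch.

```python
import unicodedata


def unaccent(s: str) -> str:
    """Strip accents from s using the NFD / remove-Mn / NFC steps."""
    # Step 1: decompose, so each accent becomes a separate combining character.
    decomposed = unicodedata.normalize("NFD", s)
    # Step 2: drop every character in the NonspacingMark category ("Mn").
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    # Step 3: recompose whatever is left.
    return unicodedata.normalize("NFC", stripped)
```

Letters that have no canonical decomposition (for example "ø" or "ł") pass through unchanged, since there is no separate accent to remove.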

To compare two strings while ignoring accents, we just remove the accents from both and compare the results (see the overview section in [1]).
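
A minimal sketch of such a comparison in Python, again purely illustrative: the helper names equals_ignoring_accents and contains_ignoring_accents are hypothetical, and unaccent is repeated here so that the snippet runs on its own.

```python
import unicodedata


def unaccent(s: str) -> str:
    """NFD, remove NonspacingMark ("Mn") characters, then NFC."""
    decomposed = unicodedata.normalize("NFD", s)
    stripped = "".join(
        ch for ch in decomposed if unicodedata.category(ch) != "Mn"
    )
    return unicodedata.normalize("NFC", stripped)


def equals_ignoring_accents(a: str, b: str) -> bool:
    # Hypothetical helper: two strings are equal if their unaccented forms are.
    return unaccent(a) == unaccent(b)


def contains_ignoring_accents(text: str, query: str) -> bool:
    # Hypothetical helper: substring test on the unaccented forms.
    return unaccent(query) in unaccent(text)


if __name__ == "__main__":
    print(equals_ignoring_accents("née", "nee"))
    print(contains_ignoring_accents("Marie Curie, née Skłodowska", "nee"))
```

This only answers whether a match exists. A real find-in-page feature would also have to map match positions in the unaccented text back to positions in the original text, which this sketch does not attempt.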

[1] http://userguide.icu-project.org/transforms/general

Updated

4 years ago
Status: UNCONFIRMED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → DUPLICATE
Whiteboard: [bugday-20140212]
Duplicate of bug: 202251