Closed Bug 779068 Opened 12 years ago Closed 10 years ago

String normalisation (accents, special chars) for searching contacts

Tracking

(Not tracked)

Status:

RESOLVED WORKSFORME

People

(Reporter: arcturus, Unassigned)

Details

(Whiteboard: Interaction design)

Francisco Jordano [:arcturus] [:francisco]

Reporter

Description

12 years ago

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_4) AppleWebKit/536.11 (KHTML, like Gecko) Chrome/20.0.1132.57 Safari/536.11

Steps to reproduce:

Searching for a contact shouldn't require the user to type accents or other special characters.

For example, the contact:

Manuel Español

should be found by either of these search strings:

espanol
ñol

Actual results:

Right now we work around this with a custom solution:

https://github.com/arcturus/gaia/blob/69862ab5fd016a319ef023df26b488abc9cf4b05/apps/contacts/js/utilities/normalizer.js

It normalises the string to make it searchable.
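For illustration only, a lookup-table normaliser could look like the sketch below. It is not the code from the linked file: `ACCENT_MAP`, `toSearchKey` and `contactMatches` are hypothetical names, and the table covers just a few Latin letters.

```javascript
// Hypothetical lookup table (illustration only, far from complete).
const ACCENT_MAP = {
  '\u00E1': 'a', // á
  '\u00E9': 'e', // é
  '\u00ED': 'i', // í
  '\u00F3': 'o', // ó
  '\u00FA': 'u', // ú
  '\u00FC': 'u', // ü
  '\u00F1': 'n'  // ñ
};

// Character class built from the table keys, e.g. /[áéíóúüñ]/g.
const ACCENT_PATTERN = new RegExp('[' + Object.keys(ACCENT_MAP).join('') + ']', 'g');

// Lower-case the text, then replace each known accented letter with its base letter.
function toSearchKey(text) {
  return text.toLowerCase().replace(ACCENT_PATTERN, (ch) => ACCENT_MAP[ch]);
}

// Substring match on the search keys, so the query may omit or include accents.
function contactMatches(contactName, query) {
  return toSearchKey(contactName).includes(toSearchKey(query));
}
```

With this sketch, `contactMatches('Manuel Español', 'espanol')` and `contactMatches('Manuel Español', 'ñol')` both succeed, but every script or letter missing from the table has to be added by hand, which is the motivation for asking the platform to help.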


Expected results:

The platform should provide a way of normalising strings, such as the Unicode normalization forms:

http://unicode.org/reports/tr15/
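As a rough sketch of what those normalization forms do, the example below uses `String.prototype.normalize`, which is available in engines that implement ECMAScript 2015 or later. The helper name `canonicallyEqual` and the sample strings are made up for this illustration.

```javascript
// The same visible name encoded two ways.
const precomposed = 'Espa\u00F1ol';  // ñ as the single code point U+00F1
const decomposed = 'Espan\u0303ol';  // n followed by U+0303 COMBINING TILDE

// Compare two strings after bringing both to the same normalization form (NFC here).
function canonicallyEqual(a, b) {
  return a.normalize('NFC') === b.normalize('NFC');
}

console.log(precomposed === decomposed);                // false: different code units
console.log(canonicallyEqual(precomposed, decomposed)); // true: canonically equivalent
console.log(precomposed.normalize('NFD') === decomposed); // true: NFD splits ñ into n + U+0303
```

Normalization of this kind makes equivalent encodings compare equal; it does not by itself make "espanol" match "Español".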

ayman maat :maat

Updated

12 years ago

Component: General → Gaia::Contacts

Priority: -- → P3

Whiteboard: Interaction design

ayman maat :maat

Comment 1

12 years ago

Agreed. Normalising the string would deliver a more comfortable and pragmatic UX.

Axel Hecht [:Pike]

Comment 2

11 years ago

I don't think we have something to help here in the js i18n spec, Norbert?

Norbert Lindenberg

Comment 3

11 years ago

The current ECMAScript Internationalization API spec handles only one special case: A Collator with usage="search" can be used to detect that two strings are similar, and in that case accents can be ignored. However, that only works for two complete strings, not for substring matching.
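A minimal sketch of that special case, assuming an engine with `Intl.Collator` and English locale data; the helper name `sameIgnoringAccents` is hypothetical. The `sensitivity: 'base'` option is what makes accent and case differences compare as equal.

```javascript
// Collator tuned for searching: with sensitivity 'base', strings that differ
// only in accents or case compare as equal. The locale is fixed to 'en' so the
// result does not depend on the host's default locale.
const searchCollator = new Intl.Collator('en', { usage: 'search', sensitivity: 'base' });

// True when the collator considers the two complete strings equivalent.
function sameIgnoringAccents(a, b) {
  return searchCollator.compare(a, b) === 0;
}

console.log(sameIgnoringAccents('Español', 'espanol'));        // true: whole strings match
console.log(sameIgnoringAccents('Manuel Español', 'espanol')); // false: no substring matching
```

The second call shows the limitation described above: the collator compares complete strings, so it cannot find "espanol" inside a longer contact name.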

There are actually two separate issues: Unicode normalization and language-specific matching of "similar" strings.

Unicode normalization, as specified in UAX #15 (formerly UTR 15), will be added to the ECMAScript Language Specification, edition 6:
http://wiki.ecmascript.org/doku.php?id=strawman:unicode_normalization

But Unicode normalization only erases differences that are linguistically irrelevant; it doesn't remove diacritics or change case, both of which carry meaning in many languages. In Spanish, for example, pena and peña are different words with different meanings. In German, fliegen and Fliegen are different words.
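To make the distinction concrete, here is a hedged sketch: normalization alone keeps pena and peña distinct, and an application that wants accent-insensitive search has to strip the combining marks itself after decomposing. `foldForSearch` and `nameContains` are hypothetical names, and the range U+0300 to U+036F covers only the basic Combining Diacritical Marks block.

```javascript
// Decompose to NFD, drop combining diacritical marks (U+0300..U+036F),
// then lower-case. This is application-level folding, not Unicode normalization.
function foldForSearch(text) {
  return text.normalize('NFD').replace(/[\u0300-\u036f]/g, '').toLowerCase();
}

// Substring search on the folded forms.
function nameContains(name, query) {
  return foldForSearch(name).includes(foldForSearch(query));
}

console.log('peña'.normalize('NFC') === 'pena'); // false: normalization keeps the tilde
console.log(foldForSearch('peña') === 'pena');   // true: folding deliberately removes it
console.log(nameContains('Manuel Español', 'espanol')); // true
console.log(nameContains('Manuel Español', 'ñol'));     // true
```

The folding deliberately treats pena and peña, or fliegen and Fliegen, as the same string, which is exactly the language-dependent trade-off described above; whether that is acceptable depends on the language and the use case.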

What you then want is another layer to find "similar" strings, where similar depends on the language, the user's understanding of the language, the availability of input mechanisms for the language, and other circumstances. That kind of API isn't on the TC 39 agenda yet.

Francisco Jordano [:arcturus] [:francisco]

Reporter

Updated

10 years ago

Status: UNCONFIRMED → RESOLVED

Closed: 10 years ago

Resolution: --- → WORKSFORME


Categories

(Firefox OS Graveyard :: Gaia::Contacts, defect, P3)
