Closed Bug 924190 Opened 11 years ago Closed 11 years ago

Allow transliterated searches (or those with removed diacritics)

Categories

(Participation Infrastructure :: Phonebook, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

VERIFIED DUPLICATE of bug 841518

People

(Reporter: rimas, Unassigned)

Details

Since we're a global community, it seems natural that we may sometimes search for a Mozillian by a not-quite-correct spelling of their name, e.g. "Jerome" instead of "Jérôme". The phonebook should take that into account when filtering people.
I agree. 

Are you aware of any algorithms or libraries that can accomplish this programmatically? 

In the past[0] I have advocated solving it with a simple "nickname" field. Whatever you put into "nickname" would be considered during searches. This seems like a lightweight way to solve it that might actually provide more coverage than any programmatic approach could, for reasons described in bug 841518 comment 6.

What do you think? Would a "nickname" field (replace "nickname" with whatever field description you think is appropriate) solve this?

[0] Bug 841518 comment 5
(In reply to Justin Crawford [:hoosteeno] from comment #1)
> Are you aware of any algorithms or libraries that can accomplish this
> programmatically?

You could try converting to "ASCII//TRANSLIT" using iconv. However, from reading the bug you mentioned, I guess this might not be enough because of different possible transliteration rules and the fact that iconv doesn't have transliteration rules for all cases.

ICU also has a normalization API [1]; perhaps PyICU provides bindings for it?
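
As a rough illustration of the normalization idea, here is a minimal Python sketch that uses only the standard library's unicodedata module instead of iconv or ICU (that substitution is an assumption made for the example, not something proposed in this bug). The helper names fold_diacritics and name_matches are hypothetical.

```python
import unicodedata


def fold_diacritics(text: str) -> str:
    """Return a case-folded copy of text with combining marks removed.

    Hypothetical helper: decomposes characters (NFKD) and drops the
    combining marks, so "é" becomes "e".
    """
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


def name_matches(query: str, name: str) -> bool:
    """Substring match that ignores case and diacritics (hypothetical helper)."""
    return fold_diacritics(query) in fold_diacritics(name)


print(fold_diacritics("Jérôme"))           # jerome
print(name_matches("Jerome", "Jérôme"))    # True
```

Letters with no Unicode decomposition, such as "ł" or "ø", pass through unchanged, so this only covers removed diacritics and not transliteration in general.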

> In the past[0] I have advocated solving it with a simple "nickname" field.
> Whatever you put into "nickname" would be considered during searches. This
> seems like a lightweight way to solve it that might actually provide more
> coverage than any programmatic approach could, for reasons described in bug
> 841518 comment 6.
> 
> What do you think? Would a "nickname" field (replace "nickname" with
> whatever field description you think is appropriate) solve this?


Regarding nicknames in particular, I'd say we already have them:
1) there is a Username field, and that username appears in my profile link and page, so quite likely it will contain my nickname
2) there are fields for all kinds of external accounts – these also might contain nicknames

Perhaps searching this info as well would improve the situation.

BTW I think this bug can be duped against bug 841518, because the problem being discussed is exactly the same.

[0] Bug 841518 comment 5
[1] http://www.icu-project.org/apiref/icu4c/unorm_8h.html
We can dupe this bug, but maybe I can clarify my thinking around "nicknames". I agree we already have all sorts of account handles, and at least one of them is likely to be a nickname. The idea behind the field I'm talking about is more like, "People might also search for me as...".

This content might be nicknames. It could be "Mikey" or "M-Dog" for "Michael".

But it could also be more like search keywords. For someone named "Jérôme" it could be "Jerome" and "Gerome" and "Jerry". People whose names are often shortened or who have accented characters can probably anticipate at least a few variations, and this lets them do their own SEO. :)
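
To make the "People might also search for me as..." idea concrete, here is a minimal Python sketch of a search that checks user-supplied keywords alongside the name and username. Everything in it is an assumption for illustration: the Profile class, the also_known_as field, the search function, and the sample people are hypothetical and do not reflect the phonebook's actual data model.

```python
import unicodedata
from dataclasses import dataclass, field
from typing import List


def fold(text: str) -> str:
    """Case-fold text and drop combining marks (hypothetical helper)."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    ).casefold()


@dataclass
class Profile:
    # Hypothetical profile record, not the phonebook's real schema.
    full_name: str
    username: str = ""
    also_known_as: List[str] = field(default_factory=list)


def search(profiles: List[Profile], query: str) -> List[Profile]:
    """Return profiles whose name, username, or keywords contain the query."""
    needle = fold(query)
    if not needle:
        return []
    return [
        p
        for p in profiles
        if any(
            needle in fold(candidate)
            for candidate in [p.full_name, p.username, *p.also_known_as]
        )
    ]


# Made-up sample data for the example.
people = [
    Profile("Jérôme Dupont", "jdupont", ["Jerome", "Gerome", "Jerry"]),
    Profile("Michael Smith", "msmith", ["Mikey", "M-Dog"]),
]
print([p.username for p in search(people, "gerome")])  # ['jdupont']
print([p.username for p in search(people, "m-dog")])   # ['msmith']
```

The point of the design is that profile owners supply the variations themselves, so spellings like "Gerome" that no normalization rule would produce still match.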

I'll close as a dupe for now.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → DUPLICATE
I believe this can be fixed without a Latin name field (although a Latin-only name does serve a complementary purpose). Bug 923014 is also related.
Bumping to verified duplicate -- would be nice to see some movement on this bug again in the near future
Status: RESOLVED → VERIFIED