Closed Bug 525271 Opened 16 years ago Closed 11 years ago

Tags do not respect unicode

Categories

(addons.mozilla.org Graveyard :: Developer Pages, defect, P3)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED WONTFIX
4.x (triaged)

People

(Reporter: davedash, Unassigned)

See https://bugzilla.mozilla.org/show_bug.cgi?id=525201#c7

If the first tag in the database is "föö", all future "FOO" and "foo" tags will show up as "föö".

We may need to rethink tags. User-entered tags should store a few things:

- the userid
- the addonid
- the tag as the user entered it
- the normalized tag

and we can get rid of the tags table, or dynamically create an aggregated tags table if we need it.

This way we can find all things "FOo", "foo", "föo" when we go to the /tag/foo page *and* people will have their tags show up as they entered them, not as the first person entered them.

We can incorporate Sphinx to index these rather quickly too, so the tags table will ultimately be useless.

IMO this is low priority since we're not Delicious, and a bit involved, but we should do it one of these days.
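A minimal sketch of the per-user tag record proposed above, in Python. Everything named here (`UserTag`, `normalize_tag`, `addons_for_tag`) is hypothetical and not taken from the AMO codebase. The normalizer assumes that "normalized" means case-folded with diacritics stripped, which is what finding "föo" on the /tag/foo page would require; whether diacritics should be folded at all is disputed in the comments below.

```python
import unicodedata
from dataclasses import dataclass


def normalize_tag(raw: str) -> str:
    """Hypothetical normalizer: strip combining marks, then case-fold."""
    decomposed = unicodedata.normalize("NFKD", raw.strip())
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold()


@dataclass(frozen=True)
class UserTag:
    """One row per (user, add-on, tag), keeping both spellings."""
    user_id: int
    addon_id: int
    raw_tag: str         # the tag exactly as the user entered it
    normalized_tag: str  # the lookup key shared by similar spellings


def make_user_tag(user_id: int, addon_id: int, raw_tag: str) -> UserTag:
    return UserTag(user_id, addon_id, raw_tag, normalize_tag(raw_tag))


def addons_for_tag(rows, query: str):
    """Return the add-on ids whose tags normalize to the same key as query."""
    key = normalize_tag(query)
    return sorted({row.addon_id for row in rows if row.normalized_tag == key})


rows = [
    make_user_tag(1, 10, "FOo"),
    make_user_tag(2, 11, "foo"),
    make_user_tag(3, 12, "föo"),
    make_user_tag(4, 13, "bar"),
]
matches = addons_for_tag(rows, "foo")
```

Each user's `raw_tag` stays untouched, so the display layer can show tags as entered while lookups go through `normalized_tag`.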
> If the first tag in the database is "föö" all "FOO" and "foo" tags in the future will show up as "föö".

This is incorrect and is a recent regression.
Also, I don't think we should try to "normalize" unicode tags to ASCII (föö -> foo), and I am actually pretty sure we don't.
(In reply to comment #2)
> Also, I don't think we should try to "normalize" unicode tags to ASCII (föö -> foo), and I am actually pretty sure we don't.

We don't, and we don't want to do that. That's a different issue.
Let's figure this out this week
Priority: -- → P3
Target Milestone: --- → 5.5
We're going to create a raw_text column to store alongside the current tags. When searching, we'll use a binary comparison on the current tag field. When displaying, we'll use the raw text (and just pick a version).
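The plan above could look roughly like the following sketch. It uses Python's `sqlite3` with an in-memory database purely for illustration; the table layout, the `tag_text` column name, and the helper functions are assumptions, and only `raw_text` comes from the comment. It also assumes the current tag field holds a case-folded form, and that "just pick a version" means taking the earliest row. SQLite's `BINARY` collation compares bytes, so "föö" and "foo" stay distinct.

```python
import sqlite3
import unicodedata


def canonical(raw: str) -> str:
    """Assumed canonical form: NFC-normalized and case-folded, diacritics kept."""
    return unicodedata.normalize("NFC", raw.strip()).casefold()


conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE tags (
        id       INTEGER PRIMARY KEY,
        tag_text TEXT NOT NULL COLLATE BINARY,  -- compared byte by byte
        raw_text TEXT NOT NULL                  -- as the user typed it
    )
    """
)


def add_tag(conn, raw: str) -> None:
    conn.execute(
        "INSERT INTO tags (tag_text, raw_text) VALUES (?, ?)",
        (canonical(raw), raw),
    )


def display_text(conn, query: str):
    """Search on tag_text, display raw_text (here: the first one stored)."""
    row = conn.execute(
        "SELECT raw_text FROM tags WHERE tag_text = ? ORDER BY id LIMIT 1",
        (canonical(query),),
    ).fetchone()
    return row[0] if row else None


for raw in ("Föö", "FOO", "foo"):
    add_tag(conn, raw)
```

With these rows, a search for "foo" matches only the "FOO" and "foo" entries, never "Föö", because the comparison on `tag_text` is binary rather than accent-insensitive.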
Whiteboard: [2010Q1]
Target Milestone: 5.5 → 4.x (triaged)
Whiteboard: [2010Q1]
Summary: Tags do not respect users desired casing → Tags do not respect users' desired casing
Assignee: dd → nobody
Component: Search → Developer Pages
QA Contact: search → developers
So there are 2 issues:

1. Users' desired casing is not supported.
2. Users' desired diacritics (or absence of diacritics) are not supported: the first similar-looking spelling is used.

There's currently no workaround for issue 1. For issue 2, there's a workaround: spacing can be used to distinguish subsequent tags.

P.S. I would rename this bug to "Tags do not fully respect users' desired casing and desired diacritics (or absence thereof)" or split it into 2 bugs. My 2 cents.
1) Case is not maintained and should not be. Tags are designed to discover similar content. Having case support resulted in FireFox, Firefox, firefox and so on. The fragmentation was horrible and reduced the tags' value on the site. Note that other tagging sites like Flickr do not support case sensitivity.

2) I agree.
Summary: Tags do not respect users' desired casing → Tags do not respect unicode
Thanks for filing this. In an effort to not drown in existing reports we're aggressively closing old enhancements and bugs to get the buglist to a reasonable level so we can scope and process bug sprints in an effective manner. Patches for this bug are still welcome.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX
Product: addons.mozilla.org → addons.mozilla.org Graveyard