Closed Bug 669383 Opened 13 years ago Closed 13 years ago

Searching for unicode is hard

Categories

(addons.mozilla.org Graveyard :: Admin/Editor Tools, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 635837
4.x (triaged)

People

(Reporter: krupa.mozbugs, Assigned: jbalogh)

Details

steps to reproduce:
1. Load https://addons.allizom.org/z/en-US/admin/addon-search
2. Search for the add-on '中止ボタンが○○に見えて困る'

expected behavior:
Searching for the add-on's localized name is supported

observed behavior:
No results are returned when searching for an add-on's localized name.

Note: Searching for the en-us name returns results as expected.
Target Milestone: 6.1.5 → 4.x (triaged)
I have a fix for this, if Hudson ever clears.
Assignee: nobody → jbalogh
Target Milestone: 4.x (triaged) → 6.1.5
Should have been fixed somewhere around https://github.com/jbalogh/zamboni/commit/c68448e5c1
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
No search results for https://addons.allizom.org/z/en-US/admin/addon-search?q=%E4%B8%AD%E6%AD%A2%E3%83%9C%E3%82%BF%E3%83%B3%E3%81%8C%E2%97%8B%E2%97%8B%E3%81%AB%E8%A6%8B%E3%81%88%E3%81%A6%E5%9B%B0%E3%82%8B
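To confirm what that URL actually asks for, the percent-encoded q parameter can be decoded and compared with the add-on name from the steps to reproduce. This is a minimal sketch assuming Python 3 and the standard library only (the session later in this bug is Python 2); the variable names are illustrative.

```python
from urllib.parse import urlsplit, parse_qs

# URL copied from the comment above.
url = (
    "https://addons.allizom.org/z/en-US/admin/addon-search?q="
    "%E4%B8%AD%E6%AD%A2%E3%83%9C%E3%82%BF%E3%83%B3%E3%81%8C"
    "%E2%97%8B%E2%97%8B%E3%81%AB%E8%A6%8B%E3%81%88%E3%81%A6"
    "%E5%9B%B0%E3%82%8B"
)

# parse_qs percent-decodes the value and interprets the bytes as UTF-8.
query = parse_qs(urlsplit(url).query)["q"][0]

# Name from the steps to reproduce.
reported_name = "中止ボタンが○○に見えて困る"

print(query == reported_name)
print(["U+%04X" % ord(ch) for ch in query])
```

If the comparison prints True, the request carries the name exactly as reported, so the encoding of the URL itself is not where the mismatch comes from.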
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: 6.1.5 → 4.x (triaged)
The string we're looking for comes back from Elasticsearch (in _source) as u'\u4e2d\u6b62\u30dc\u30bf\u30f3\u304c\u3086\u306e\u3063\u3061\u306b\u898b\u3048\u3066\u56f0\u308b'.

In [1]: x = u'\u4e2d\u6b62\u30dc\u30bf\u30f3\u304c\u3086\u306e\u3063\u3061\u306b\u898b\u3048\u3066\u56f0\u308b'  # (from _source)

In [2]: y = u'\u4e2d\u6b62\u30dc\u30bf\u30f3\u304c\u25cb\u25cb\u306b\u898b\u3048\u3066\u56f0\u308b'  # (from the search logs)

In [3]: x == y
Out[3]: False

In [4]: print x
中止ボタンがゆのっちに見えて困る

In [5]: print y
中止ボタンが○○に見えて困る

The string from the database comes back as u'\u4e2d\u6b62\u30dc\u30bf\u30f3\u304c\u25cb\u25cb\u306b\u898b\u3048\u3066\u56f0\u308b'. That matches the search logs, not what ES returns in _source. This may be a problem with pyes.
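To see exactly where the two values diverge, the strings can be split into their shared prefix, differing middle, and shared suffix. This is a rough sketch assuming Python 3; split_on_difference is a hypothetical helper written for this comment, not part of zamboni or pyes.

```python
def split_on_difference(a, b):
    """Return (common prefix, middle of a, middle of b, common suffix)."""
    shortest = min(len(a), len(b))

    # Walk forward while the characters agree.
    prefix = 0
    while prefix < shortest and a[prefix] == b[prefix]:
        prefix += 1

    # Walk backward while the characters agree, without crossing the prefix.
    suffix = 0
    while (suffix < shortest - prefix
           and a[len(a) - 1 - suffix] == b[len(b) - 1 - suffix]):
        suffix += 1

    return (a[:prefix],
            a[prefix:len(a) - suffix],
            b[prefix:len(b) - suffix],
            a[len(a) - suffix:])


# x is the value from _source, y is the value from the search logs.
x = "\u4e2d\u6b62\u30dc\u30bf\u30f3\u304c\u3086\u306e\u3063\u3061\u306b\u898b\u3048\u3066\u56f0\u308b"
y = "\u4e2d\u6b62\u30dc\u30bf\u30f3\u304c\u25cb\u25cb\u306b\u898b\u3048\u3066\u56f0\u308b"

prefix, middle_x, middle_y, suffix = split_on_difference(x, y)
print(prefix, "|", middle_x, "vs", middle_y, "|", suffix)
```

The two values share everything except the middle: four hiragana characters in the _source value where the database and search logs have two U+25CB circles. That looks like different text rather than a mangled encoding of the same text, though this sketch alone cannot say which side is stale.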

Searching with query.text finds 55 matches. Searching with query.term finds 0 matches.
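The gap between the two counts is consistent with how the query types treat their input: a term query looks up its value as a single exact token without analyzing it, while a text query runs the value through an analyzer first. The toy inverted index below illustrates that difference. It is only a sketch: the per-character analyze function, the ToyIndex class, and documents 2 and 3 are made up for illustration and do not reflect the analyzer or data actually configured on this index.

```python
from collections import defaultdict


def analyze(text):
    # Hypothetical analyzer: one token per non-whitespace character.
    return [ch for ch in text if not ch.isspace()]


class ToyIndex:
    """Toy inverted index mapping each token to the ids of matching docs."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for token in analyze(text):
            self.postings[token].add(doc_id)

    def term(self, value):
        # Not analyzed: the whole value must equal one stored token.
        return set(self.postings.get(value, set()))

    def text(self, value):
        # Analyzed: a document matches if it contains any query token.
        hits = set()
        for token in analyze(value):
            hits |= self.postings.get(token, set())
        return hits


index = ToyIndex()
index.add(1, "中止ボタンがゆのっちに見えて困る")  # value seen in _source
index.add(2, "困った時のボタン")                  # made-up document
index.add(3, "Adblock Plus")                      # made-up document

q = "中止ボタンが○○に見えて困る"
print(index.term(q))  # no stored token equals the full name
print(index.text(q))  # documents sharing at least one token
```

Under these assumptions the term lookup returns nothing because the full name is never stored as one token, while the analyzed lookup returns every document that shares a token with the query, which would also explain a large, loosely related result set.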
Status: REOPENED → NEW
Summary: Searching for the add-on's localized name doesn't return results → Searching for unicode is hard
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → DUPLICATE
Product: addons.mozilla.org → addons.mozilla.org Graveyard