Differentiate between Turkish dotted and dotless "i"

RESOLVED FIXED

Status

enhancement
P2
normal
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: selim, Assigned: jotes)

Tracking

Trunk
Points: ---

Firefox Tracking Flags

(Not tracked)

Details

The Turkish alphabet has two different "i" characters: "i" (uppercase: "İ") and "ı" (uppercase: "I"). Pontoon fails to differentiate these in search.

For example, when searching for "yer imleri", results with "Yer İmleri" aren't returned.

https://pontoon.mozilla.org/tr/firefox-aurora/all-resources/?search=Yer+İmleri
https://pontoon.mozilla.org/tr/firefox-aurora/all-resources/?search=yer+imleri
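
To illustrate the mismatch outside the database, here is a minimal Python sketch. It assumes nothing about Pontoon's code; `turkish_upper` and `matches` are hypothetical helper names. Locale-independent uppercasing maps "i" to "I", so the two spellings never compare equal unless the Turkish mapping is applied first.

```python
# Hypothetical helpers, not part of Pontoon.
# U+0130 is "İ" (capital dotted I), U+0131 is "ı" (small dotless i).
_TR_UPPER = str.maketrans({"i": "\u0130", "\u0131": "I"})


def turkish_upper(text: str) -> str:
    """Uppercase text using the Turkish i/İ and ı/I pairs."""
    # Apply the Turkish-specific mapping first, then the generic one.
    return text.translate(_TR_UPPER).upper()


def matches(query: str, candidate: str) -> bool:
    """Case-insensitive substring test using Turkish case rules."""
    return turkish_upper(query) in turkish_upper(candidate)


# Generic uppercasing turns "i" into "I", which differs from "İ".
generic_hit = "yer imleri".upper() in "Yer \u0130mleri".upper()
turkish_hit = matches("yer imleri", "Yer \u0130mleri")
```

With generic uppercasing the search misses (`generic_hit` is false), while the Turkish-aware comparison finds the string (`turkish_hit` is true).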

More info on Wikipedia: https://en.wikipedia.org/wiki/Dotted_and_dotless_I
This might be related to Postgres behind the scenes. There are a few hits on Google about that, but they are old.

Some indicate that icontains isn't as good as full-text search, so maybe https://docs.djangoproject.com/en/1.10/ref/contrib/postgres/search/ is a good answer.
I can confirm that Pontoon uses a simple __contains, which maps to 'LIKE' in PostgreSQL. We're currently working on improving the speed of the search queries, so I can check whether any of my work will resolve your issue.
After some digging I found the following two links:
https://www.postgresql.org/message-id/Pine.LNX.4.10.10007211331580.1451-100000@ata.cs.hun.edu.tr
http://stackoverflow.com/questions/13029824/postgres-upper-function-on-turkish-character-does-not-return-expected-result

I looked at the code to verify it: Pontoon uses icontains, which translates into 'LIKE UPPER('%query%')', and that may be the reason. I'll look into it to estimate whether it's a quick fix.
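
As a rough sketch of how a collation could change the result of that UPPER() comparison, the snippet below builds a parameterized WHERE fragment with an explicit COLLATE clause. Everything here is an assumption made for illustration: the helper names `escape_like` and `build_search_clause`, the column name `string`, and the collation name `tr_TR` (the available name depends on how the PostgreSQL server was set up). It is not taken from the Pontoon code base.

```python
def escape_like(term: str) -> str:
    """Escape LIKE wildcards so the search term is matched literally."""
    return term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def build_search_clause(column: str, query: str, collation: str = "tr_TR"):
    """Return (sql, params) for a case-insensitive, collation-aware search.

    `column` and `collation` are interpolated as identifiers, so they must
    come from trusted code and never from user input. The search term itself
    is passed as a bound parameter.
    """
    sql = (
        f'UPPER({column} COLLATE "{collation}") '
        f'LIKE UPPER(%s COLLATE "{collation}")'
    )
    params = [f"%{escape_like(query)}%"]
    return sql, params


# Hypothetical usage with an assumed column name.
sql, params = build_search_clause("string", "yer imleri")
```

The resulting pair could be passed to a database cursor's `execute(sql, params)`. With a Turkish collation, PostgreSQL is expected to uppercase "i" to "İ" on both sides, so the two spellings would compare equal.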
Priority: -- → P3
Assignee: nobody → jot
Priority: P3 → P2
Selim, could you have a look at the proposed fix and see if it works OK?
https://aaaaaaaasaaaaaa.herokuapp.com/tr/pontoon-intro/all-resources/

It might take a minute to load the page for the first time.

To add translation, you can log in:
username: pontoon@example.com
password: supersecretpassword
Flags: needinfo?(selim)
(In reply to Matjaz Horvat [:mathjazz] from comment #5)
> Selim, could you have a look at the proposed fix and see if it works OK?
> https://aaaaaaaasaaaaaa.herokuapp.com/tr/pontoon-intro/all-resources/

Yes, I can confirm that it works properly.
Flags: needinfo?(selim)
Thank you!
Commit pushed to master at https://github.com/mozilla/pontoon

https://github.com/mozilla/pontoon/commit/e0008021536eae30df8eb3ed54ab671fdbdc58e5
Fix bug 1346180. Set database collation for search queries. (#588)
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED