Closed Bug 937179 Opened 11 years ago Closed 3 years ago

Awesomescreen searches are case-sensitive for non-ASCII characters

Product: Firefox for Android Graveyard
Component: Data Providers
Type: defect
Priority: P5
Hardware: All
OS: Android
Tracking: fennec+
Status: RESOLVED INCOMPLETE
Reporter: u60234
Assignee: Unassigned

Tested on Firefox 25 and Fennec 28.

Searches for history entries in Awesomescreen are not case-insensitive for non-ASCII characters, such as the Scandinavian letters å, ä and ö.

Steps to reproduce:
- Go to http://www.laplandecostore.se/kontakt.html
- Tap in the URL bar and type 'kundtj'
- Notice the matching entry for 'Lapland Eco Store - Ekologiskt & Giftfritt - KUNDTJÄNST'
- Add a lower-case 'ä' to the search. You may need a Swedish keyboard or another Scandinavian language keyboard for this.

Actual result:
No matches found for 'kundtjä'. I have to use upper-case 'Ä' and type 'kundtjÄ' to find the entry.
We're doing this filtering in a SQL query:
http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/db/LocalBrowserDB.java#144

I wonder if this is some issue with the Android SQLite, or something that we can fix on our end.
Component: Keyboards and IME → Awesomescreen
This is a collation issue. 

At the very least we'd need to use an appropriate locale and use COLLATE LOCALIZED or UNICODE. 

But that doesn't work for LIKE. Probably the only solutions are to manually process strings and use them for searches (blegh), or to switch to a proper text indexing system that correctly supports collation.
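
To illustrate why LIKE falls short here: in a stock SQLite build (no ICU extension loaded), LIKE folds case only for ASCII letters. The following is a minimal sketch using Python's bundled sqlite3 module, which is an assumption made for illustration and not the Android code path; `like_matches` is a hypothetical helper.

```python
import sqlite3

def like_matches(text, pattern):
    """Return True if SQLite's LIKE considers `text` a match for `pattern`."""
    conn = sqlite3.connect(":memory:")
    try:
        (result,) = conn.execute("SELECT ? LIKE ?", (text, pattern)).fetchone()
        return bool(result)
    finally:
        conn.close()

# Title taken from the steps to reproduce in this bug.
TITLE = "Lapland Eco Store - Ekologiskt & Giftfritt - KUNDTJÄNST"

ascii_hit = like_matches(TITLE, "%kundtj%")   # ASCII-only search term
lower_hit = like_matches(TITLE, "%kundtjä%")  # lower-case 'ä' against upper-case 'Ä'
upper_hit = like_matches(TITLE, "%kundtjÄ%")  # same case as the stored title
print(ascii_hit, lower_hit, upper_hit)
```

On a build without ICU, only the first and third terms should match, which mirrors the behavior described in the report.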
A third option is to split search terms on non-ASCII characters and filter the results in memory once they reach a slightly smarter text system (Java). I'm hopeful that an in-DB approach will work. 

Also, this seems easy to write a test for...
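
The in-memory filtering option could look roughly like the sketch below. It is written in Python rather than Java to keep it self-contained, and it assumes the candidate rows were already fetched using the ASCII part of the search term (for example 'kundtj'). `filter_titles` is a hypothetical name, and Unicode normalization (composed versus decomposed 'ä') is not handled.

```python
def filter_titles(titles, term):
    """Keep titles that contain `term`, ignoring case for non-ASCII letters too."""
    needle = term.casefold()
    return [title for title in titles if needle in title.casefold()]

# Hypothetical candidate rows, as if returned by an ASCII-only LIKE '%kundtj%' query
# plus one unrelated entry.
candidates = [
    "Lapland Eco Store - Ekologiskt & Giftfritt - KUNDTJÄNST",
    "Lapland Eco Store - Produkter",
]

matches = filter_titles(candidates, "kundtjä")
print(matches)
```

The same two inputs would also serve as a regression test: the lower-case search term should keep the first title and drop the second.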
This sounds like a really bad annoyance for the bug filer. I think we should track this.
tracking-fennec: --- → ?
I concur re tracking.

Moving this to Data Providers, because this is really a storage layer bug.
Component: Awesomescreen → Data Providers
Hardware: ARM → All
To clarify my proposed possible solutions, while I'm here:

* Use SQLite FTS3, ideally with the unicode61 tokenizer, and use MATCH instead of LIKE (perhaps also with input transformation -- e.g., to remove diacritics). If all of our supported OS revisions support FTS, then this will be a massive speed win. If the tokenizer does case-folding correctly, it will also solve this bug.

Furthermore, the ability to use matchinfo() to improve relevance would improve our Awesomebar results.

(Docs seem to suggest that this is feasible: <http://developer.android.com/training/search/search.html>)

* Manually add a column to the DB containing normalized versions of all values we match against. This ~doubles our space usage, but will make queries slightly faster (only one column to check), and will solve this bug.
  * We could also only normalize non-ASCII fields, but that introduces some complexity.

* Try to use UPPER() in our queries. This will be slow as hell, so I don't advise it in practice.
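
As a rough sketch of the normalized-column option: the table name `history`, the column `title_norm`, and the helpers below are all hypothetical, and Python's sqlite3 module stands in for the Android database layer. Normalization is done in application code when writing, and the search term is normalized the same way before querying.

```python
import sqlite3

def normalize(text):
    """Case-fold for matching. Diacritics are kept; stripping them is a separate choice."""
    return text.casefold()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (url TEXT, title TEXT, title_norm TEXT)")

def add_entry(url, title):
    # Store the normalized copy next to the original value.
    conn.execute(
        "INSERT INTO history (url, title, title_norm) VALUES (?, ?, ?)",
        (url, title, normalize(title)),
    )

def search(term):
    # Note: '%' and '_' inside `term` are not escaped in this sketch.
    pattern = "%" + normalize(term) + "%"
    rows = conn.execute(
        "SELECT title FROM history WHERE title_norm LIKE ?", (pattern,)
    )
    return [row[0] for row in rows]

TITLE = "Lapland Eco Store - Ekologiskt & Giftfritt - KUNDTJÄNST"
add_entry("http://www.laplandecostore.se/kontakt.html", TITLE)
print(search("kundtjä"))
```

Because both sides are folded before they reach SQLite, the ASCII-only case handling of LIKE no longer matters for this query.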
tracking-fennec: ? → +
Blocks: 935025
FYI: there's already an (old!) bug report to track the FTS stuff (bug 808872)
(In reply to Lucas Rocha (:lucasr) from comment #7)
> FYI: there's already an (old!) bug report to track the FTS stuff (bug 808872)

Heh, I'm always at least a year ahead of myself :D

I'm going to stick my neck out and say we should use FTS to solve the majority of this issue, and just be careful the rest of the time.
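
For reference, a minimal sketch of what the FTS approach might look like. It assumes an SQLite build with FTS4 and the unicode61 tokenizer available (not guaranteed on every platform), uses Python's sqlite3 module in place of the Android code, and the table name `history_fts` and the `search` helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# unicode61 folds case for non-ASCII letters when tokenizing.
conn.execute(
    "CREATE VIRTUAL TABLE history_fts USING fts4(title, tokenize=unicode61)"
)

TITLE = "Lapland Eco Store - Ekologiskt & Giftfritt - KUNDTJÄNST"
conn.execute("INSERT INTO history_fts (title) VALUES (?)", (TITLE,))

def search(term):
    # A trailing '*' makes this a token-prefix query.
    # Assumes `term` is a single word without FTS query syntax characters.
    rows = conn.execute(
        "SELECT title FROM history_fts WHERE title MATCH ?", (term + "*",)
    )
    return [row[0] for row in rows]

print(search("kundtjä"))
```

One difference from the current LIKE '%...%' filtering is worth keeping in mind: MATCH works on token prefixes, so a term that starts in the middle of a word would not match.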
Depends on: 808872
filter on [mass-p5]
Priority: -- → P5
We have completed the launch of our new Firefox on Android. Development of the new versions uses GitHub for issue tracking. If the bug still reproduces in a current version of [Firefox on Android nightly](https://play.google.com/store/apps/details?id=org.mozilla.fenix), an issue can be reported at the [Fenix GitHub project](https://github.com/mozilla-mobile/fenix/). If you want to discuss your report, please use [Mozilla's chat](https://wiki.mozilla.org/Matrix#Connect_to_Matrix) server https://chat.mozilla.org and join the [#fenix](https://chat.mozilla.org/#/room/#fenix:mozilla.org) channel.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → INCOMPLETE
Product: Firefox for Android → Firefox for Android Graveyard