Closed
Bug 1173164
Opened 10 years ago
Closed 7 years ago
Use FTS for Awesomebar searches
Categories
(Firefox for iOS :: Data Storage, defect, P2)
Tracking
()
RESOLVED
FIXED
| Tracking | Status | |
|---|---|---|
| fxios | 13.0 | --- |
People
(Reporter: rnewman, Assigned: justindarc)
References
(Depends on 1 open bug)
Details
(Keywords: perf)
Attachments
(1 file)
|
55 bytes,
patch
|
justindarc
:
review+
nalexander
:
feedback+
|
Details | Diff | Splinter Review |
* We do LIKE %foo% querying (over two columns, but that shouldn't matter much). That's a table walk right there. We should use FTS if we can.
Comment 2•10 years ago
|
||
re: other two tracking bugs on bug 1169322 for 1.0, should this?
tracking-fxios:
--- → ?
| Reporter | ||
Comment 3•10 years ago
|
||
Too big for 1.0, unless the perf impact kills us in the polls.
Updated•10 years ago
|
| Reporter | ||
Comment 4•7 years ago
|
||
Requesting tracking: search on my iPad takes up to nine seconds to show results.
OS: iOS 8 → iOS
Summary: Use FTS3 for Awesomebar searches → Use FTS for Awesomebar searches
| Assignee | ||
Updated•7 years ago
|
Priority: -- → P2
| Assignee | ||
Updated•7 years ago
|
Priority: P2 → P3
| Assignee | ||
Comment 6•7 years ago
|
||
This actually looks easier than I had thought.
Reading the SQLite docs here:
https://www.sqlite.org/fts3.html
It appears that all we need to do is add the following compilation flags to sqlcipher.xcodeproj:
-DSQLITE_ENABLE_FTS3
-DSQLITE_ENABLE_FTS3_PARENTHESIS
Then, to create a FTS index of the `history` table and keep it in sync with INSERTs/UPDATEs/DELETEs:
```
CREATE VIRTUAL TABLE history_fts USING fts4(content="history", url, title);
INSERT INTO history_fts(history_fts) VALUES ('rebuild');
CREATE TRIGGER history_beforeupdate BEFORE UPDATE ON history BEGIN
DELETE FROM history_fts WHERE docid=old.rowid;
END;
CREATE TRIGGER history_beforedelete BEFORE DELETE ON history BEGIN
DELETE FROM history_fts WHERE docid=old.rowid;
END;
CREATE TRIGGER history_afterupdate AFTER UPDATE ON history BEGIN
INSERT INTO history_fts(docid, url, title) VALUES (new.rowid, new.url, new.title);
END;
CREATE TRIGGER history_afterinsert AFTER INSERT ON history BEGIN
INSERT INTO history_fts(docid, url, title) VALUES (new.rowid, new.url, new.title);
END;
```
This will create an FTS4 virtual table called `history_fts` that can be queried using FTS like:
```
SELECT docid FROM history_fts WHERE history_fts MATCH 'foo';
```
Assignee: nobody → jdarcangelo
Status: NEW → ASSIGNED
| Assignee | ||
Comment 7•7 years ago
|
||
Attachment #8992746 -
Flags: review?(gkeeley)
| Assignee | ||
Comment 8•7 years ago
|
||
Comment on attachment 8992746 [details] [diff] [review]
GitHub Pull Request
Nick/Richard, does this seem like a reasonable approach to start integrating FTS for awesomebar searches?
Attachment #8992746 -
Flags: feedback?(nalexander)
Attachment #8992746 -
Flags: feedback?(bugzilla)
Comment 9•7 years ago
|
||
Comment on attachment 8992746 [details] [diff] [review]
GitHub Pull Request
Review of attachment 8992746 [details] [diff] [review]:
-----------------------------------------------------------------
Yes, this looks more or less like I would expect. I'm not expert enough to understand the real performance of the triggers (fine until they're not fine; FTS4 has good amortized insertion but bad worst case insertion, IIUC) and the `INNER JOIN`.
::: Storage/SQL/BrowserSchema.swift
@@ +360,4 @@
> )
> """
>
> + let historyFTSCreate =
A little overview of what you're doing here would be good.
```
We maintain a SQLite FTS4 index to the existing content in the history table, ...
```
For Mentat, we're using specific tokenizing options (https://github.com/mozilla/mentat/blob/69c9f512a09436749f9a734142e25fd57287f330/db/src/db.rs#L209). Are you confident your search does what you want, even for `.`, `:`, and `/` in URLs? (I'm not confident Mentat does that well, honestly. Perhaps it doesn't matter.)
@@ +640,4 @@
> FROM view_awesomebar_bookmarks b LEFT JOIN favicons f ON f.id = b.faviconID
> """
>
> + fileprivate let historyBeforeUpdateTrigger = """
Explain to me what would be required to do this incrementally -- i.e., to not always drop and re-insert, but instead to determine whether a material change was made to the history table (`url` and `title` only).
::: Storage/SQL/SQLiteHistory.swift
@@ +266,4 @@
> let whereFragment = (whereData == nil) ? "" : " AND (\(whereData!))"
>
> if let urlFilter = urlFilter?.trimmingCharacters(in: .whitespaces), !urlFilter.isEmpty {
> + let ftsArgs = ["*\(urlFilter)*"]
What happens with `*` in URL filter? And aren't there other characters that need to be escaped?
@@ +307,5 @@
> if includeBookmarks {
> ungroupedSQL.append(" LEFT JOIN view_all_bookmarks ON view_all_bookmarks.url = history.url")
> }
>
> + ungroupedSQL.append(ftsWhereClause)
Explain why you don't need to scope `title` and `url` to `history.{title,url}`? (This code is complex enough that my scan might have missed why.)
::: ThirdParty/sqlcipher/sqlcipher.xcodeproj/project.pbxproj
@@ +154,4 @@
> "-DSQLITE_TEMP_STORE=2",
> "-DSQLITE_THREADSAFE=2",
> "-DSQLCIPHER_CRYPTO_CC",
> + "-DSQLITE_ENABLE_FTS3",
You're confident that you don't need `-DSQLITE_ENABLE_FTS4` as well? It's definitely present in the SQLite source. I know that FTS3 and FTS4 are very closely related...
Attachment #8992746 -
Attachment is patch: true
Attachment #8992746 -
Attachment mime type: text/x-github-pull-request → text/plain
Attachment #8992746 -
Flags: feedback?(nalexander) → feedback+
| Assignee | ||
Comment 10•7 years ago
|
||
(In reply to Nick Alexander :nalexander from comment #9)
Thanks for the feedback, Nick!!
> Comment on attachment 8992746 [details] [diff] [review]
> GitHub Pull Request
>
> Review of attachment 8992746 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Yes, this looks more or less like I would expect. I'm not expert enough to
> understand the real performance of the triggers (fine until they're not
> fine; FTS4 has good amortized insertion but bad worst case insertion, IIUC)
> and the `INNER JOIN`.
>
> ::: Storage/SQL/BrowserSchema.swift
> @@ +360,4 @@
> > )
> > """
> >
> > + let historyFTSCreate =
>
> A little overview of what you're doing here would be good.
> ```
> We maintain a SQLite FTS4 index to the existing content in the history
> table, ...
> ```
>
+1 -- I'll go ahead and add a comment here.
> For Mentat, we're using specific tokenizing options
> (https://github.com/mozilla/mentat/blob/
> 69c9f512a09436749f9a734142e25fd57287f330/db/src/db.rs#L209). Are you
> confident your search does what you want, even for `.`, `:`, and `/` in
> URLs? (I'm not confident Mentat does that well, honestly. Perhaps it
> doesn't matter.)
>
I'll do a little more testing and perhaps tweak this before I land. My initial testing seems to work pretty well for this.
> @@ +640,4 @@
> > FROM view_awesomebar_bookmarks b LEFT JOIN favicons f ON f.id = b.faviconID
> > """
> >
> > + fileprivate let historyBeforeUpdateTrigger = """
>
> Explain to me what would be required to do this incrementally -- i.e., to
> not always drop and re-insert, but instead to determine whether a material
> change was made to the history table (`url` and `title` only).
>
I took these triggers directly from SQLite's documentation for how to keep an FTS index in sync with an external content table:
https://www.sqlite.org/fts3.html#_external_content_fts4_tables_
> ::: Storage/SQL/SQLiteHistory.swift
> @@ +266,4 @@
> > let whereFragment = (whereData == nil) ? "" : " AND (\(whereData!))"
> >
> > if let urlFilter = urlFilter?.trimmingCharacters(in: .whitespaces), !urlFilter.isEmpty {
> > + let ftsArgs = ["*\(urlFilter)*"]
>
> What happens with `*` in URL filter? And aren't there other characters that
> need to be escaped?
>
Good catch. I'll go ahead and escape special chars here.
> @@ +307,5 @@
> > if includeBookmarks {
> > ungroupedSQL.append(" LEFT JOIN view_all_bookmarks ON view_all_bookmarks.url = history.url")
> > }
> >
> > + ungroupedSQL.append(ftsWhereClause)
>
> Explain why you don't need to scope `title` and `url` to
> `history.{title,url}`? (This code is complex enough that my scan might have
> missed why.)
>
The original query was unscoped too. None of the other tables being joined to have a `title` or `url` field besides `history`. A bit further down in this method is another query to search bookmarks that used to reuse this same `WHERE` clause and I'm assuming this is part of why this was never scoped. However, now that this is using `MATCH`, I believe that the column names are automatically assumed to reference the FTS table.
> ::: ThirdParty/sqlcipher/sqlcipher.xcodeproj/project.pbxproj
> @@ +154,4 @@
> > "-DSQLITE_TEMP_STORE=2",
> > "-DSQLITE_THREADSAFE=2",
> > "-DSQLCIPHER_CRYPTO_CC",
> > + "-DSQLITE_ENABLE_FTS3",
>
> You're confident that you don't need `-DSQLITE_ENABLE_FTS4` as well? It's
> definitely present in the SQLite source. I know that FTS3 and FTS4 are very
> closely related...
Yes. According to the docs:
https://www.sqlite.org/fts3.html#compiling_and_enabling_fts3_and_fts4
"Note that enabling FTS3 also makes FTS4 available. There is not a separate SQLITE_ENABLE_FTS4 compile-time option. A build of SQLite either supports both FTS3 and FTS4 or it supports neither."
| Assignee | ||
Comment 11•7 years ago
|
||
Comment on attachment 8992746 [details] [diff] [review]
GitHub Pull Request
Carrying over R+ from GitHub.
Attachment #8992746 -
Flags: review?(gkeeley) → review+
| Assignee | ||
Comment 12•7 years ago
|
||
Landed on master:
https://github.com/mozilla-mobile/firefox-ios/commit/221759181eaa5ef83c537e5140b21142ecf7bfd2
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
| Reporter | ||
Comment 13•7 years ago
|
||
(In reply to Justin D'Arcangelo [:justindarc] from comment #10)
> > For Mentat, we're using specific tokenizing options…
>
> I'll do a little more testing and perhaps tweak this before I land. My
> initial testing seems to work pretty well for this.
Bug 1425333 covers some of the edge cases in tokenizing. You might or might not have regressed from a LIKE-based system: LIKE naturally does full substring matching, whereas FTS relies on tokenizing behavior. You should make sure that you have tests for multilocale tokenizing behavior (e.g., searching for reasonable substrings of Japanese text).
Blocks: 1477051
| Reporter | ||
Updated•7 years ago
|
Attachment #8992746 -
Flags: feedback?(bugzilla)
You need to log in
before you can comment on or make changes to this bug.
Description
•