Open Bug 590376 Opened 14 years ago Updated 2 years ago

Speed up MatchAutoCompleteFunction's functions by lower-casing the token before it hits SQL

Tracking

()

Status:

NEW

People

(Reporter: justin.lebar+bug, Unassigned)

References

Details

(Keywords: perf, Whiteboard: [snt-scrubbed][search-performance])

Justin Lebar (not reading bugmail)

Reporter

Description

•

14 years ago

Even with the patch in bug 570975, MatchAutoCompleteFunction::findOnBoundary and the case-insensitive UTF8 comparator is a major hotspot. findOnBoundary spends the vast majority of its time not finding matches. In particular, the most common operation is: * compare the first character of the token we're searching for to a character in the search string * notice that they're not identical * convert both characters to lower-case * compare the lower-cased characters * notice that they're not identical. One way I think we could speed this up is to convert the token to lower-case before we query the database. Then findOnBoundary could use a custom half-case-insensitive comparator which assumes that one operand is already lowercased. bz pointed out in bug 145975 comment #59 that this will do the Wrong Thing with some letters, such as the Turkish dotless i, but it appears that we don't handle this properly right now anyway. Although they're less of a hotspot in my profiles, we could do the same for MatchAutoCompleteFunction::findAnywhere and MatchAutoCompleteFunction::findBeginning.

Justin Lebar (not reading bugmail)

Reporter

Updated

•

14 years ago

Keywords: perf

Justin Lebar (not reading bugmail)

Reporter

Comment 1

•

14 years ago

For my reference, some relevant code lives in nsPlacesAutoComplete.js; search for params.searchString.

Shawn Wilsher :sdwilsh

Comment 2

•

14 years ago

I'd like to add that we should assert in debug builds that whatever is passed to this is, in fact, lowercase already.

Justin Lebar (not reading bugmail)

Reporter

Comment 3

•

14 years ago

I think it makes sense to convert the search string to UTF-32 before passing it to the autocomplete function. That way, we can skip parsing its Unicode inside the loop altogether.

Shawn Wilsher :sdwilsh

Comment 4

•

14 years ago

(In reply to comment #3) > I think it makes sense to convert the search string to UTF-32 before passing it > to the autocomplete function. That way, we can skip parsing its Unicode inside > the loop altogether. Do we even have UTF-32 string classes?

Justin Lebar (not reading bugmail)

Reporter

Comment 5

•

14 years ago

(In reply to comment #4) > Do we even have UTF-32 string classes? Not to my knowledge, no.

Justin Lebar (not reading bugmail)

Reporter

Comment 6

•

14 years ago

It looks like I can pass blobs through the SQL parameters, but the JS interface is still magic to me.

Shawn Wilsher :sdwilsh

Comment 7

•

14 years ago

(In reply to comment #6) > It looks like I can pass blobs through the SQL parameters, but the JS interface > is still magic to me. I can probably shine light on the magic, but it is unclear what you are trying to do here.

Justin Lebar (not reading bugmail)

Reporter

Comment 8

•

14 years ago

My thought is: When I convert the string the user typed to lower-case, also convert the string to UTF-32. Pass the lower-case UTF-32 string to the search function. This way, the search function does no work on the typed string. The case-insensitive comparison is: * Take a character from the UTF-8 string from the database. * Convert it to lower-case. The result is a uint32. * Compare that lower-case character to the current character in the UTF-32 search string.

Shawn Wilsher :sdwilsh

Comment 9

•

14 years ago

(In reply to comment #6) > It looks like I can pass blobs through the SQL parameters, but the JS interface > is still magic to me. You want bindBlobByName then.

Justin Lebar (not reading bugmail)

Reporter

Comment 10

•

14 years ago

The magic code is in mozStorageAsyncStatementParams, which takes query.params.foo="bar" and calls mStatement->BindByName(foo). For now, I'm just going to add a function mozIStorageStatementParams::bindStringAsUTF32. We can try to figure out a cleaner way to do this later.

Shawn Wilsher :sdwilsh

Comment 11

•

14 years ago

(In reply to comment #10) > The magic code is in mozStorageAsyncStatementParams, which takes > query.params.foo="bar" and calls mStatement->BindByName(foo). > > For now, I'm just going to add a function > mozIStorageStatementParams::bindStringAsUTF32. We can try to figure out a > cleaner way to do this later. I thought you wanted to bind a blob here, in which case I gave you the method you want to call on a mozIStorageBindingParams object (which a mozIStorageAsyncStatement object is)

Marco Bonardo [:mak]

Updated

•

8 years ago

Priority: -- → P3

BMO Automation

Updated

•

2 years ago

Severity: normal → S3

Marco Bonardo [:mak]

Comment 12

•

2 years ago

We should verify if there is an interesting win using a profiler, before moving on with it.

Whiteboard: [snt-scrubbed][search-performance]

Jira Integration Bot

Updated

•

2 years ago

See Also: → https://mozilla-hub.atlassian.net/browse/SNT-546

You need to log in before you can comment on or make changes to this bug.

Bugzilla

Speed up MatchAutoCompleteFunction's functions by lower-casing the token before it hits SQL

Categories

(Toolkit :: Places, defect, P3)

Tracking

()

People

(Reporter: justin.lebar+bug, Unassigned)

References

Details

(Keywords: perf, Whiteboard: [snt-scrubbed][search-performance])

Crash Data

Security

(public)

User Story

Description

Updated

Comment 1

Comment 2

Comment 3

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Updated

Updated

Comment 12

Updated