Closed Bug 446247 Opened 16 years ago Closed 15 years ago

Autocomplete should match any part of the string

Tracking

()

Status:

VERIFIED FIXED

Milestone:

mozilla1.9.2a1

People

(Reporter: ml, Assigned: MattN)

References

(Blocks 1 open bug)

Details

(Keywords: verified1.9.2)

Attachments

(2 files, 8 obsolete files)

v.1 with word boundary bonuses (WIP to apply to 370117) 15 years ago Matthew N. [:MattN] 11.03 KB, patch		Details \| Diff \| Splinter Review
v.1.1 with preference for prefix matching bonus (WIP) 15 years ago Matthew N. [:MattN] 12.12 KB, patch		Details \| Diff \| Splinter Review
v.2 use previous results 15 years ago Matthew N. [:MattN] 93.86 KB, patch		Details \| Diff \| Splinter Review
moz_formhistory DBMonster Schema 15 years ago Matthew N. [:MattN] 1.92 KB, application/xml		Details
v.3 (fix bitrot) 15 years ago Matthew N. [:MattN] 96.57 KB, patch	Dolske : review-	Details \| Diff \| Splinter Review
v.4 (push instead of splice reused + other review comments) 15 years ago Matthew N. [:MattN] 96.04 KB, patch	Dolske : review-	Details \| Diff \| Splinter Review
v.5 bug fix with test & address review 15 years ago Matthew N. [:MattN] 96.66 KB, patch	Dolske : review-	Details \| Diff \| Splinter Review
v.6 trim search, defaultIndex, bonuses 15 years ago Matthew N. [:MattN] 98.29 KB, patch	Dolske : review-	Details \| Diff \| Splinter Review
v.7 address review comments 15 years ago Matthew N. [:MattN] 98.54 KB, patch	Dolske : review+	Details \| Diff \| Splinter Review
v.8 reduce string creation 15 years ago Matthew N. [:MattN] 98.55 KB, patch		Details \| Diff \| Splinter Review

Mike Linksvayer

Reporter

Description

•

16 years ago

User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9) Gecko/2008061015 Firefox/3.0 Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9) Gecko/2008061015 Firefox/3.0 The search bar seems to match previous searches only from the beginning of the string, e.g., if I've previously searched for "perl program" and "programming perl" typing "perl" would only match the first. It would be great if the search bar matched text entered against text anywhere in previous search strings. Reproducible: Always Steps to Reproduce: 1. From search bar, search for "python program" 2. From search bar, search for "programming in python" 3. From search bar, type "python" Actual Results: only "python program" appears in list of previous searches. Expected Results: Both "python program" and "programming in python" should appear in previous searches list.

:Gavin Sharp [email: gavin@gavinsharp.com]

Updated

•

16 years ago

Status: UNCONFIRMED → NEW

Ever confirmed: true

OS: Linux → All

Hardware: PC → All

Mike Connor [:mconnor]

Comment 1

•

16 years ago

Yeah, we should do this when we start making autocomplete smart.

Priority: -- → P3

Summary: search bar should find previous searches as location bar finds visited pages: not only beginning of string → search bar autocomplete should match any part of the string

Target Milestone: --- → Firefox 3.2a1

Matthew N. [:MattN]

Assignee

Comment 2

•

15 years ago

I'll have a patch shortly for satchel that will make this work

Assignee: nobody → mnoorenberghe

Status: NEW → ASSIGNED

Component: Search → Form Manager

Depends on: 469443

Product: Firefox → Toolkit

QA Contact: search → form.manager

Summary: search bar autocomplete should match any part of the string → Autocomplete should match any part of the string

Target Milestone: Firefox 3.6a1 → mozilla1.9.2a1

Matthew N. [:MattN]

Assignee

Comment 3

•

15 years ago

Attached patch v.1 with word boundary bonuses (WIP to apply to 370117) (obsolete) — Details — Splinter Review

Working patch with word boundary bonuses but does not re-use previous search results since word boundary bonuses need to be re-calculated every time the input text is changed.

Attachment #381348 - Flags: review?(dolske)

Matthew N. [:MattN]

Assignee

Comment 4

•

15 years ago

Attached patch v.1.1 with preference for prefix matching bonus (WIP) (obsolete) — Details — Splinter Review

Patch applies to patch on bug 370117. See https://wiki.mozilla.org/Firefox/Namoroka/Improved_form_history for information about prefs in browser.formfill.*.

Attachment #381348 - Attachment is obsolete: true

Attachment #381407 - Flags: review?(dolske)

Attachment #381348 - Flags: review?(dolske)

Matthew N. [:MattN]

Assignee

Comment 5

•

15 years ago

Attached patch v.2 use previous results (obsolete) — Details — Splinter Review

I have added a unit test to check the performance of satchel's autocomplete search. When comparing the implementation in bug 469443 to the implementation in this patch I came up with the following results for search time: Search with 30000 entries for field Old New time (ms) "" w/o previous result 1392 2713 "r" w/o previous result 149 266 "re" w/o previous result 123 806 "rea" w/o previous result 74 216 "rea" with "re" previous result 99 158 "real" with "rea" previous result 8 16 "real" with "re" previous result 105 94 "real" w/o previous result 73 157 Search with 1000 entries for field (more realistic) "" w/o previous result 46 94 "re" w/o previous result 6 30 There are three key outcomes: 1. The SQLite queries are slower now because the query is more complex and returns more records 2. Filtering previous results is faster 3. The first set of data is for a form history database with 30,000 records for the one field I am searching on which is very high and would be extremely rare for the average user.

Attachment #381407 - Attachment is obsolete: true

Attachment #386654 - Flags: review?(dolske)

Attachment #381407 - Flags: review?(dolske)

Matthew N. [:MattN]

Assignee

Comment 6

•

15 years ago

Attached file moz_formhistory DBMonster Schema — Details

I used a tool called DBMonster (http://dbmonster.kernelpanic.pl/) to generate the test databases with random English words. Attached is the XML schema file for the moz_formhistory table. Date column ranges need to be updated before using the schema file. The value field contains random English words with a maximum length of 30. Set dbmonster.jdbc.url to the path to formhistory.sqlite in dbmonster.properties: dbmonster.jdbc.url=jdbc:sqlite:/Users/mozilla/profiles/my_profile/formhistory.sqlite

Justin Dolske [:Dolske]

Comment 7

•

15 years ago

Can you update this patch after updating your patch to bug 370117? [It doesn't apply cleanly right now, so might as well kill two birds with one stone...]

Matthew N. [:MattN]

Assignee

Comment 8

•

15 years ago

Attached patch v.3 (fix bitrot) (obsolete) — Details — Splinter Review

Attachment #386654 - Attachment is obsolete: true

Attachment #389174 - Flags: review?(dolske)

Attachment #386654 - Flags: review?(dolske)

Justin Dolske [:Dolske]

Comment 9

•

15 years ago

Comment on attachment 389174 [details] [diff] [review] v.3 (fix bitrot) >- if (aPreviousResult && >- aSearchString.substr(0, aPreviousResult.searchString.length) == aPreviousResult.searchString) { >+ if (aPreviousResult && aPreviousResult.searchString.length != 1 && >+ aSearchString.indexOf(aPreviousResult.searchString) >= 0) { The indexOf() bit isn't immediately obvious... Add a comment noting that we'll reuse the previous search when the new search result will be a subset of the old result. >+ let tempEntries = result.wrappedJSObject.entries; s/tempEntries/entries/. >+ entryLoop: > for (let i = result.matchCount - 1; i >= 0; i--) { Use tempEntries.length instead of matchCount. >- let match = result.getValueAt(i); >+ let match = tempEntries[i].text; "match" is only used once, but tempEntries[i] is used repeatedly. Better to have a "let entry = entries[i]", and just use entry.text instead of "match". >+ for (var token in searchTokens) { >+ // Remove results that do not contain the token >+ // XXX bug 394604 -- .toLowerCase can be wrong for some intl chars >+ if (match.toLowerCase().indexOf(searchTokens[token]) < 0) { This would be better as "for each (let token in searchTokens)" and "...indexOf(token)". >+ this.log("Removing autocomplete entry '" + match + "'"); >+ tempEntries.splice(i, 1); Hmmmm... An alternate approach to using splice() would be create a new array, and only .push() entries that we want to leave on. I'm not sure what the cost of splice() is, might be good to ask the JS guys. I think shaver made it fast in FF3.0... >+ // re-calculate scores on remaining items >+ tempEntries[i].totalScore = this._calculateScore(tempEntries[i], aSearchString, searchTokens); Just update .totalScore in _calculateScore(), no need to return it. The comment seems superfluous (and maybe slightly wrong). Maybe s/calculateScore/updateEntryWeight/? >+ if (this._debug) // XXX update score in text when debugging >+ tempEntries[i].text = >+ tempEntries[i].text.replace(/$.+$$/, "(" + tempEntries[i].totalScore + ")"); Nuke this, since it makes using form history in debug mode difficult. >+ let trimmedSearchLength = searchString.trim().length; Can we just do "searchString = searchString.trim()" here? I'm not sure having the whitespace is useful, though maybe I'm missing something. >+ // if only one character typed, only do prefix matching >+ if (trimmedSearchLength > 1) { Nit: make the comment match the block... "Only do substring matching when more than 1 character is typed" >+ // for each word, calculate word boundary weights and add word to WHERE clause of SQL >+ for (var i in searchTokens) { I think this should just be a plain "for(i=0, i < searchTokens.length; i++)" loop, to make it crystal clear that the SQL string you're building just contains inserted numbers, and not the actual search string (which would be unsafe). >+ weightCalc += "(value LIKE :tokenBegin" + i + " ESCAPE '/') + " + >+ "(value LIKE :tokenBoundary" + i + " ESCAPE '/') + "; >+ where += "AND (value LIKE :tokenContains" + i + " ESCAPE '/') "; >+ } >+ // add more weight if we have a traditional prefix match and >+ // multiply boundary bonuses by boundary weight >+ weightCalc += ":prefixWeight * (value LIKE :valuePrefix ESCAPE '/')"; >+ params.prefixWeight = this._prefixWeight; >+ weightedBonus = "* (1 + (" + this._boundaryWeight + " / 100.0) * (" + weightCalc + "))"; Set weightCalc to this value at the top of the block, so it's just built up sequentially. Shouldn't need separate "weightedBonus" and "weightCalc" vars. Should probably have a brief comment at the top of the block explaining the approach here, too. >+ } else if (trimmedSearchLength == 1) { >+ where = "AND (value LIKE :valuePrefix ESCAPE '/') "; >+ } I'd add an else{} here to explicitly deal with trimmedSearchLength == 0, so it's clear that |where| will be an empty string and that's intentional. >+ let query = "SELECT value, " + >+ "ROUND( " + >+ "(1.0 + :agedWeight / 100.0 * (firstUsed < :expiryDate)) * " + >+ "timesUsed / MAX(1.0, (lastUsed - firstUsed) / :timeGroupingSize) * " + >+ "MAX(1.0, :maxTimeGroupings - (:now - lastUsed) / :timeGroupingSize) / "+ >+ ":bucketSize "+ >+ ", 3) AS frecency, " + >+ "ROUND( " + >+ "(1.0 + :agedWeight / 100.0 * (firstUsed < :expiryDate)) * " + >+ "timesUsed / MAX(1.0, (lastUsed - firstUsed) / :timeGroupingSize) * " + >+ "MAX(1.0, :maxTimeGroupings - (:now - lastUsed) / :timeGroupingSize) / "+ >+ ":bucketSize "+ >+ weightedBonus + It seems suboptimal to be computing the frecency twice in the statement... Can statements refer to previous values? EG, "... AS frecency, ROUND(frecency + ..." If not, I'd compute the boundary bonus as an independent item, and just have the JS code set "entry.totalScore = row.frecency + row+boundaryBonus"? >+ if (trimmedSearchLength == 1) >+ stmt.params.valuePrefix = stmt.escapeStringForLIKE(searchString.trim(), "/") + "%"; >+ else if (trimmedSearchLength > 0) >+ stmt.params.valuePrefix = escapedSearch + "%"; This seems odd, because the else would contain the first condition too. Is > 1 what's meant here? What about length == 0? Add that as an explicit else block, as above? >+ if (trimmedSearchLength > 1) >+ for (var i in searchTokens) { >+ stmt.params["tokenBegin" + i] = searchTokens[i] + "%"; >+ stmt.params["tokenBoundary" + i] = "% " + searchTokens[i] + "%"; >+ stmt.params["tokenContains" + i] = "%" + searchTokens[i] + "%"; >+ } If-block needs braces around the for-block, since although there's just 1 statement in the block it's a multiline statement. >+ _calculateScore : function (entry, aSearchString, searchTokens) { >+ let weightCalc = 0; >+ // for each word, calculate word boundary weights >+ for (var token in searchTokens) { Use "for each" here too. >+ weightCalc += (entry.text.toLowerCase().indexOf(searchTokens[token]) === 0); == 0, not === 0. >+ // now add more weight if we have a traditional prefix match and >+ // multiply boundary bonuses by boundary weight >+ weightCalc += this._prefixWeight * (entry.text.toLowerCase().indexOf(aSearchString.toLowerCase()) === 0); == 0, not === 0. Will look at tests in more detail in next review pass.

Attachment #389174 - Flags: review?(dolske) → review-

Matthew N. [:MattN]

Assignee

Comment 10

•

15 years ago

Attached patch v.4 (push instead of splice reused + other review comments) (obsolete) — Details — Splinter Review

(In reply to comment #9) > (From update of attachment 389174 [details] [diff] [review]) > >+ this.log("Removing autocomplete entry '" + match + "'"); > >+ tempEntries.splice(i, 1); > > Hmmmm... > > An alternate approach to using splice() would be create a new array, and only > .push() entries that we want to leave on. I'm not sure what the cost of > splice() is, might be good to ask the JS guys. I think shaver made it fast in > FF3.0... From a quick benchmark, pushing entries into a new array is much more efficient than splicing for large arrays. Since we generally remove more entries than we keep with every keystroke it makes sense to switch to .push() > >+ let trimmedSearchLength = searchString.trim().length; > > Can we just do "searchString = searchString.trim()" here? I'm not sure having > the whitespace is useful, though maybe I'm missing something. Currently a trailing space affects the prefix match calculation so that if a user enters "fat " (trailing space) for example, existing entries that have "fat " as a prefix (ie. "fat cat") would get a bonus whereas "fattest cow" would not. > >+ let query = "SELECT value, " + > >+ "ROUND( " + > >+ "(1.0 + :agedWeight / 100.0 * (firstUsed < :expiryDate)) * " + > >+ "timesUsed / MAX(1.0, (lastUsed - firstUsed) / :timeGroupingSize) * " + > >+ "MAX(1.0, :maxTimeGroupings - (:now - lastUsed) / :timeGroupingSize) / "+ > >+ ":bucketSize "+ > >+ ", 3) AS frecency, " + > >+ "ROUND( " + > >+ "(1.0 + :agedWeight / 100.0 * (firstUsed < :expiryDate)) * " + > >+ "timesUsed / MAX(1.0, (lastUsed - firstUsed) / :timeGroupingSize) * " + > >+ "MAX(1.0, :maxTimeGroupings - (:now - lastUsed) / :timeGroupingSize) / "+ > >+ ":bucketSize "+ > >+ weightedBonus + > > It seems suboptimal to be computing the frecency twice in the statement... > > Can statements refer to previous values? EG, "... AS frecency, ROUND(frecency + > ..." Unfortunately, SELECT expressions cannot refer to aliases but I have changed the query to calculate the parts individually and just order by the product. > > If not, I'd compute the boundary bonus as an independent item, and just have > the JS code set "entry.totalScore = row.frecency + row+boundaryBonus"? To clarify, the values are multiplied: "entry.totalScore = row.frecency * row.boundaryBonus"

Attachment #389174 - Attachment is obsolete: true

Attachment #390327 - Flags: review?(dolske)

Justin Dolske [:Dolske]

Comment 11

•

15 years ago

Comment on attachment 390327 [details] [diff] [review] v.4 (push instead of splice reused + other review comments) >+ let entries = result.wrappedJSObject.entries; >+ let currentEntries = []; Nit: currentEntries isn't a good name, as it implies it's the, err, current entries (when it's really the new results being filted). Maybe newEntries or filteredEntries? >+ for each (let token in searchTokens) { >+ // Remove results that do not contain the token >+ // XXX bug 394604 -- .toLowerCase can be wrong for some intl chars >+ if (entry.text.toLowerCase().indexOf(token) < 0) { >+ continue entryLoop; >+ } >+ } A bit more compact: // Remove results that do not contain the token // XXX bug 394604 -- .toLowerCase can be wrong for some intl chars if(searchTokens.some(function (tok) entry.text.toLowerCase().indexOf(tok) < 0)) continue Eliminates the need for the entryLoop label, too. >+ // re-calculate score >+ this._calculateScore(entry, aSearchString, searchTokens); I guess no need for the comment at all now. :) >+ result.wrappedJSObject.entries.sort(this._sortBytotalScore); This is the only place the sort callback is used. Make it a function local autoCompleteSearch(), or maybe as an anonymous inline... >+ // only do substring matching when more than one character is typed >+ if (trimmedSearchLength > 1) { Move "let where, weightCalc" to right after the comment. >+ weightCalc += ":prefixWeight * (value LIKE :valuePrefix ESCAPE '/')"; >+ params.prefixWeight = this._prefixWeight; >+ weightCalc += "))"; The last weightCalc part can just go in the string 2 lines up. Also, make sure you're adding trailing spaces where needed. You're generating a final query that looks like "...blah()blah...", which is probably only working because the parser is ok with no space after a paren? >+ } else if (trimmedSearchLength == 1) { >+ where = "AND (value LIKE :valuePrefix ESCAPE '/')"; >+ } else { >+ where = ""; >+ } Explicitly set weightCalc to 1 in these two blocks, so it's clear that it's not really being used. Let's also rename weightCalc / weightedBonus to be more specific that it's the word boundary bonus. boundaryCalc / boundaryBonus, maybe? >- "timesUsed / max(1.0, (lastUsed - firstUsed) / :timeGroupingSize) * " + >- "max(1.0, :maxTimeGroupings - (:now - lastUsed) / :timeGroupingSize) * "+ >- "max(1.0, :agedWeight * (firstUsed < :expiryDate)) / " + >+ "timesUsed / MAX(1.0, (lastUsed - firstUsed) / :timeGroupingSize) * " + >+ "MAX(1.0, :maxTimeGroupings - (:now - lastUsed) / :timeGroupingSize) * "+ >+ "MAX(1.0, :agedWeight * (firstUsed < :expiryDate)) / " + Hmm, it would be better to s/max/MAX/ in your other patch which is adding these. >+ if (trimmedSearchLength > 1) { >+ stmt.params.valuePrefix = stmt.escapeStringForLIKE(searchString, "/") + "%"; >+ for (var i in searchTokens) { let is the new var! Change the other "var"s in this patch to "let"s. This is also another spot that should be a plain "for(i=0...)" loop. >+ stmt.params["tokenBegin" + i] = searchTokens[i] + "%"; >+ stmt.params["tokenBoundary" + i] = "% " + searchTokens[i] + "%"; >+ stmt.params["tokenContains" + i] = "%" + searchTokens[i] + "%"; These all need to use stmt.escapeStringForLIKE() before adding "%". >+ } else if (trimmedSearchLength == 1) { >+ stmt.params.valuePrefix = stmt.escapeStringForLIKE(searchString.trim(), "/") + "%"; >+ } else { >+ // no addional params need to be substituted into the query when the length is zero >+ } In the final else, stmt.params.valuePrefix will be null, no? Just set that outside the trimmedSearchLength checks, and I guess the last two blocks can simply go away if they're both empty. >+ _calculateScore : function (entry, aSearchString, searchTokens) { ... >+ weightCalc += this._prefixWeight * (entry.text.toLowerCase().indexOf(aSearchString.toLowerCase()) == 0); >+ entry.totalScore = Math.round(entry.frecency * (1 + (this._boundaryWeight / 100.0) * weightCalc)); Can you clean up the line wrapping here to be ~80 column sane? >+++ b/toolkit/components/satchel/test/unit/perf_autocomplete.js >+ fh = Cc["@mozilla.org/satchel/form-history;1"]. >+ getService(Ci.nsIFormHistory2); >+ fac = Cc["@mozilla.org/satchel/form-autocomplete;1"]. >+ getService(Ci.nsIFormAutoComplete); >+ prefs = Cc["@mozilla.org/preferences-service;1"]. >+ getService(Ci.nsIPrefBranch); >+ Nit: alignment Cc/getService. >+ } catch (e) { >+ throw "FAILED in test #" + testnum + " -- " + e; >+ } >+} >\ No newline at end of file Add newline. :)

Attachment #390327 - Flags: review?(dolske) → review-

Matthew N. [:MattN]

Assignee

Comment 12

•

15 years ago

Attached patch v.5 bug fix with test & address review (obsolete) — Details — Splinter Review

Attachment #390327 - Attachment is obsolete: true

Attachment #390614 - Flags: review?(dolske)

Justin Dolske [:Dolske]

Updated

•

15 years ago

Attachment #390614 - Flags: review?(dolske) → review-

Justin Dolske [:Dolske]

Comment 13

•

15 years ago

Comment on attachment 390614 [details] [diff] [review] v.5 bug fix with test & address review >+ this.log("Reusing autocomplete entry '" + entry.text + "'"); >+ this._calculateScore(entry, aSearchString, searchTokens); I think aSearchString should be the trimmed search string here. Typing "<space>foo" won't be counted as a prefix match to the saved entry "foobar". The more I think about it, the less I think trailing space (as affected by trim()) is important, and it seems more like a complication that's going to lead to bugs. I think the function argument should be aUntrimmedSearchString, and then set "let searchString = aUntrimmedSearchString.trim()". We could get a similar effect, though, by having the scoring give a bonus when the search token exactly matches a token in the saved entry's value. This then also works for and token within a multitoken search. >+ result.wrappedJSObject.entries = filteredEntries; >+ result.wrappedJSObject.entries.sort(sortBytotalScore); filteredEntries.sort(sortBytotalScore); result.wrappedJSObject.entries = filteredEntries; >+ // update searchResult to reflect if there are no matches left >+ if (entries.length == 0) >+ result.wrappedJSObject.searchResult = Ci.nsIAutoCompleteResult.RESULT_NOMATCH; This should check filteredEntries.length. Better yet, make .searchResult a getter that returns RESULT_NOMATCH or RESULT_SUCCESS depending on the length of .entries... Does .defaultIndex need updating here too? I think so. Needs to go to -1 when there's NOMATCH, and should reflect the new position in filteredEntries (if the item is added, or the current filtedEntries.length if it's not). > this.log("Creating new autocomplete search result."); > let entries = this.getAutoCompleteValues(aInputName, aSearchString); I think the trimmed search string can be passed in here now. >+ // build up the word boundary and prefix match bonus calculation >+ boundaryCalc = "(1 + (" + this._boundaryWeight + " / 100.0) * ("; No /100 scaling. >+ // for each word, calculate word boundary weights for the SELECT clause and >+ // add word to the WHERE clause of the query >+ for (let i = 0; i < searchTokens.length; i++) { >+ boundaryCalc += "(value LIKE :tokenBegin" + i + " ESCAPE '/') + " + >+ "(value LIKE :tokenBoundary" + i + " ESCAPE '/') + "; >+ where += "AND (value LIKE :tokenContains" + i + " ESCAPE '/') "; So, I've noticed while running with this patch that short search strings yield lots of irrelevant results because they're matching in the middle of the string (vs at a word boundary). For now, can we only match on word boundaries and deal with actual substring matches in a separate patch? We'll need to tweak some weights so that they're sometimes still available, but should generally be ordered below the word boundary matches. >+ _calculateScore : function (entry, aSearchString, searchTokens) { >+ let boundaryCalc = 0; Double check that this computes the same weight as the query would? Looks right to me, but just to be sure. :) Finally, can you rerun the perf numbers with the updated patch to see if they're inline with previous results?

Matthew N. [:MattN]

Assignee

Comment 14

•

15 years ago

Attached patch v.6 trim search, defaultIndex, bonuses (obsolete) — Details — Splinter Review

(In reply to comment #13) > So, I've noticed while running with this patch that short search strings yield > lots of irrelevant results because they're matching in the middle of the string > (vs at a word boundary). > > For now, can we only match on word boundaries and deal with actual substring > matches in a separate patch? We'll need to tweak some weights so that they're > sometimes still available, but should generally be ordered below the word > boundary matches. As discussed, I tweaked the weighting to move the word boundary matches higher. > >+ _calculateScore : function (entry, aSearchString, searchTokens) { > >+ let boundaryCalc = 0; > > Double check that this computes the same weight as the query would? Looks right > to me, but just to be sure. :) I tested the calculations with a large set of data and the results were the same.

Attachment #390614 - Attachment is obsolete: true

Attachment #391388 - Flags: review?(dolske)

Justin Dolske [:Dolske]

Comment 15

•

15 years ago

Comment on attachment 391388 [details] [diff] [review] v.6 trim search, defaultIndex, bonuses >+++ b/modules/libpref/src/init/all.js ... >-pref("browser.formfill.bucketSize", 5); >+pref("browser.formfill.bucketSize", 1); FTR, we should file a followup bug to either fix or remove the bucketing, since the current scheme (w/ size > 1) wasn't working well. The ideas we discussed were: - Use some kind of non-linear grouping, since the weights were spanning a large range - Order the first N results (~10?) by totalweight, and make the rest just alphabetical. >--- a/toolkit/components/satchel/src/nsFormAutoComplete.js ... >+ if (aPreviousResult && aPreviousResult.searchString.trim().length > 1 && >+ searchString.indexOf(aPreviousResult.searchString.trim()) >= 0) { > this.log("Using previous autocomplete result"); > result = aPreviousResult; >- result.wrappedJSObject.searchString = aSearchString; >+ result.wrappedJSObject.searchString = searchString; Shouldn't this be set to aUntrimmedSearchString? Ditto for when creating a new FormAutoCompleteResult. >+ this.log("Reusing autocomplete entry '" + entry.text + "' (" + entry.frecency +" / " + entry.totalScore + ")"); >+ this._calculateScore(entry, searchString, searchTokens); Clean up line wrapping, and put the log after the calculateScore() so that it shows the new weight. Still need to update .defaultIndex here, since replacing .entries will likely make it invalid. >+ if (entries.length == 0) >+ result.wrappedJSObject.defaultIndex = -1; Still need to change this to check filteredEntries.length. >+ // build up the word boundary and prefix match bonus calculation >+ boundaryCalc = "MAX(1, :boundaryWeight * ("; ... I think it flows better if you start with the prefixWeight... boundaryCalc = "MAX(1, :prefixWeight * (value LIKE :valuePrefix ESCAPE '/') "; for (...) { boundaryCalc += "+ (value LIKE ...) " + "+ (value LIKE ...) "; } boundaryCalc += ")"; Which also avoids the need for the dummy "0". > if (entries.length > 0) { >- this.searchResult = Ci.nsIAutoCompleteResult.RESULT_SUCCESS; > this.defaultIndex = 0; > } Remove braces too.

Attachment #391388 - Flags: review?(dolske) → review-

Matthew N. [:MattN]

Assignee

Comment 16

•

15 years ago

Attached patch v.7 address review comments (obsolete) — Details — Splinter Review

(In reply to comment #15) > >+++ b/modules/libpref/src/init/all.js > ... > >-pref("browser.formfill.bucketSize", 5); > >+pref("browser.formfill.bucketSize", 1); > > FTR, we should file a followup bug to either fix or remove the bucketing, since > the current scheme (w/ size > 1) wasn't working well. The ideas we discussed > were: > - Use some kind of non-linear grouping, since the weights were spanning a > large range > - Order the first N results (~10?) by totalweight, and make the rest just > alphabetical. Filed as bug 507271

Attachment #391388 - Attachment is obsolete: true

Attachment #391471 - Flags: review?(dolske)

Justin Dolske [:Dolske]

Comment 17

•

15 years ago

Comment on attachment 391471 [details] [diff] [review] v.7 address review comments Yay. I went through everything again, and just had a few more final small nits. r+ with these fixed... >+ let searchString = aUntrimmedSearchString.trim(); I think we should throw a .toLowerCase() in here too, to make sure it isn't missed elsewhere. That also avoids the need for a number of toLowerCase() calls, which means less string creation. >+ while (stmt.step()) { >+ let entry = { >+ text: stmt.row.value, >+ frecency: stmt.row.frecency, >+ totalScore: Math.round(...) >+ } >+ values.push(entry); Similarly, there are a number of places where we keep checking entry.text.toLowerCase(), so it should be faster to just go ahead and store a textLowerCase property once (again, less string creation). > stmt = this._dbCreateStatement(query, params); Nuts. Now that we're generating different queries depending on the input, it's not such a good idea to cache these statements. Add a 3rd arg to _dbCreateStatement(), and have it not cache the statement when it's set. > toolkit/components/satchel/test/unit/formhistory_1000.sqlite FWIW, I think the script you used to create this has a bug... There seem to be quite a few entries with value == "".

Attachment #391471 - Flags: review?(dolske) → review+

Justin Dolske [:Dolske]

Comment 18

•

15 years ago

(In reply to comment #17) > Nuts. Now that we're generating different queries depending on the input, it's > not such a good idea to cache these statements. Add a 3rd arg to > _dbCreateStatement(), and have it not cache the statement when it's set. Eh, never mind, you've changed my mind. There are almost always only going to be 1 or 2 token terms in the query; as the user types more we handle the filtering in JS. Creating queries with more token terms isn't very likely to happen, and is still going to be a fairly small total number of statements.

Matthew N. [:MattN]

Assignee

Comment 19

•

15 years ago

Attached patch v.8 reduce string creation — Details — Splinter Review

(In reply to comment #17) > (From update of attachment 391471 [details] [diff] [review]) > > toolkit/components/satchel/test/unit/formhistory_1000.sqlite > > FWIW, I think the script you used to create this has a bug... There seem to be > quite a few entries with value == "". I didn't set the minimum length so it probably defaults to 0 which gives the empty string. There is also no unique constraint enforced on the name/value columns to stop duplicates at the DB level.

Attachment #391471 - Attachment is obsolete: true

Justin Dolske [:Dolske]

Comment 20

•

15 years ago

Pushed http://hg.mozilla.org/mozilla-central/rev/0f04a94f8089 Woot! Great job!

Status: ASSIGNED → RESOLVED

Closed: 15 years ago

Flags: in-testsuite+

Resolution: --- → FIXED

Ed Lee :Mardak

Updated

•

15 years ago

Blocks: 510168

Tracy Walker [:tracy]

Comment 21

•

15 years ago

verified with: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2b1pre) Gecko/20090925 Namoroka/3.6b1pre

Status: RESOLVED → VERIFIED

Flags: in-litmus?

Keywords: verified1.9.2

Tracy Walker [:tracy]

Comment 22

•

11 years ago

removing in-litmus flag, it no longer exists

Flags: in-litmus?

Mark Banner (:standard8)

Updated

•

8 years ago

Depends on: 1325603

You need to log in before you can comment on or make changes to this bug.