Closed Bug 401869 Opened 17 years ago Closed 16 years ago

Allow multiple words search in Auto-complete/Location Bar

Categories

(Firefox :: Address Bar, defect, P1)

defect

Tracking

()

VERIFIED FIXED
Firefox 3 beta4

People

(Reporter: ronin.achilles, Assigned: Mardak)

References

Details

Attachments

(3 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9a9pre) Gecko/2007103023 Minefield/3.0a9pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9a9pre) Gecko/2007103023 Minefield/3.0a9pre

Since we now allow search on 'Titles', 'Bookmarks' and 'History' in the Location Bar, I think, for most optimal implementation we should also allow searching possible for multiple words.

e.g. 
Lets say a user has the following entry in his history:

http://reason.com/news/printer/122024.html [Reason Magazine - Creative Destruction vs. the New Industrial State]

Now 
i. if the user types for "reason", this history entry will show up
ii. if he types "reason magazine", again this history will show up

But
if he types "reason industrial" this entry will not show up.

I think it should.



Reproducible: Always

Steps to Reproduce:
1.
2.
3.
Flags: blocking-firefox3?
I don't think this behavior is going to change because a wildcard search is much slower and resource intensive than a string search. If you go to bookmarks -> organize bookmarks -> search: history -> then type the disjoint words the item appears.

On my system with a 20mb places file the address bar search is snappy where as the history search hangs Firefox for 2-3s. Hanging Firefox each time the user accesses the address bar is unacceptable.
This isn't a wildcard search, this is "foo AND bar" instead of "foo bar"

There's a bug on this already, and its something we should clearly do.
Flags: blocking-firefox3?
Whiteboard: DUPEME
Seth: Is there a Bug on this already. I see that you have just landed the multiple tag search but it does not take care of History or Bookmark Titles.

Besides in multiple tag search as well as multiple words search, aren't we going to allow random ordering of words? i.e. if a user has a tag called "read later"... will he not be able to search on "read later" as well as "later read"?
> Seth: Is there a Bug on this already. 

I couldn't find one, so let's use this one.  thanks for logging it and for the reminder.

> I see that you have just landed the multiple tag search but it does not take > care of History or Bookmark Titles.

right, that was just for tag searching.  But when we implement this, we can re-use some of that code.

> Besides in multiple tag search as well as multiple words search, aren't we
> going to allow random ordering of words? i.e. if a user has a tag called "read
> later"... will he not be able to search on "read later" as well as "later
> read"?

even after the landing of bug #395452, that does not work.  I'll log a spin off bug about that.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: blocking-firefox3?
OS: Windows Vista → All
Hardware: PC → All
Whiteboard: DUPEME
Version: unspecified → Trunk
> I'll log a spin off bug about that.

see bug #405320
Flags: blocking-firefox3? → blocking-firefox3+
Priority: -- → P2
(In reply to comment #4)

> > I see that you have just landed the multiple tag search but it does not take > care of History or Bookmark Titles.
> 
> right, that was just for tag searching.  But when we implement this, we can
> re-use some of that code.
> 
> > Besides in multiple tag search as well as multiple words search, aren't we
> > going to allow random ordering of words? i.e. if a user has a tag called "read
> > later"... will he not be able to search on "read later" as well as "later
> > read"?
> 
> even after the landing of bug #395452, that does not work.  I'll log a spin off
> bug about that.
> 

Just a clarification... from my initial comment above, I just wanted to understand if the implementation that you are planning will take care of the scenario where random order + random placements will be taken care of (for Titles, URLs and Tags)...

so for example if the Tag is "read this later" and a user searches for "read later" or "later read", this tag should show up, i.e. even if the word "this" has been left out. Same goes for titles. So basically the words could be placed anywhere (and not necessarily in continuous order) in the Tag name, Title or URL.

Is my understanding correct?
Attached image multiple words search
An example of multiple word search using del.icio.us direct.or
Assignee: nobody → sspitzer
Priority: P2 → P1
un-assigning seth's places/autocomplete bugs.
Assignee: moco → nobody
I think this should be marked blocking bug 405745 '[meta] improve url bar autocomplete results'.
It would be nice if this feature searched both the title and url at the same time.
E.g., "digg.com pic" would match "digg.com" in the url and "pic" in the title.
Blocks: 405745
Attached image screenshot of v1
Example of the search finding "plan" and "blog" on ".org" sites (so it doesn't match planet mozilla blog at blog.mozilla.com/planet)
Attached patch v1 (obsolete) — Splinter Review
This adds a matchAll(title, url, query) sql function that matches query terms (separated by spaces) against the title and url at the same time.

Here's some builds that have this patch (as well as those for global frecency bug 394038, adaptive learning bug 395739, as well as stopping early when we have enough results)

https://build.mozilla.org/tryserver-builds/2008-01-19_00:36-edward.lee@engineering.uiuc.edu-glob.fast.adap.book.matc/
Assignee: nobody → edilee
Status: NEW → ASSIGNED
Attachment #297971 - Flags: review?(dietrich)
One issue I've run into using these builds is typing out data uris or any really long address can lead to perf issues.

sdwilsh: On a somewhat separate issue, would implementing this as the MATCH function in mozStorageUnicodeFunctions.cpp avoid some overheads of creating my own function? The majority of the logic should be the same as what I have in v1 patch, but I'm assuming calling a StorageUnicodeFunctions::RegisterFunctions'd function is faster than what I have right now.
Yeah - there'll be lots of overhead this way.  What exactly does that function need to do?  Just see if something exists inside the string like CONTAINS type of thing?
It does the MATCH functionality that's usually for fulltext searching. Meaning it looks for all the space separated words in a string. (In this case I have to target strings, but that can be concated into 1 before calling MATCH.)

MATCH returns 1 if all plain/literal search terms are in the target.
So, why can't you build a query using a series of LIKE statements?

http://sqlite.org/lang_expr.html
You mean convert..
WHERE matchAll(title, url, 'term1 term2 term3')

into..
(title LIKE '%term1%' OR url LIKE '%term1%') AND (title LIKE '%term2%' OR url LIKE '%term2%') AND (title LIKE '%term3%' OR url LIKE '%term3%')

.. which would require dynamically generating the query based on the number of search terms?

Additionally, LIKE has unnecessary logic for processing _ and % when MATCH is just for the literal term.
Alright, so a few comments:
1) The name would probably be better if it were MATCH_ALL_TERMS_WITH_WORDS.
2) You should take a variable number of arguments, so it'll look like this:
WHERE MATCH_ALL_TERMS_WITH_WORDS(title, url, term1, term2, ...)
Yes, that means you'll need to dynamically build the query, but it goes along with how SQL functions tend to work.

If you do that, I'd be happy to include that into mozStorageUnicodeFunctions.

*but*, are you sure this is why you are taking such a perf hit?  If you can't get dtrace, I'd suggest you have someone else run it to see what's up.
Target Milestone: --- → Firefox 3 beta4
Comment on attachment 297971 [details] [diff] [review]
v1

this is obsolete given the recent AC changes, right?
Attachment #297971 - Flags: review?(dietrich)
Attached patch v2 (obsolete) — Splinter Review
(In reply to comment #19)
> (From update of attachment 297971 [details] [diff] [review])
> this is obsolete given the recent AC changes, right?
Ah indeed. I've reimplemented it completely differently based on how we're filtering results now.
Attachment #297971 - Attachment is obsolete: true
Attachment #300218 - Flags: review?(dietrich)
Attached patch v2.1Splinter Review
Now with more unit tests! :)

let gTests = [
  ["0: Match 2 terms all in url", [[0,0]]],
  ["1: Match 1 term in url and 1 term in title", [[0,0],[1,1]]],
  ["2: Match 3 terms all in title; display bookmark title if matched", [[1,1],[3,1]]],
  ["3: Match 2 terms in url and 1 in title; bookmark title didn't match", [[2,0],[3,0]]],
  ["4: Match 3 terms in url and 1 in title", [[1,1]]],
  ["5: Match nothing", []],
];
Attachment #300218 - Attachment is obsolete: true
Attachment #300258 - Flags: review?(dietrich)
Attachment #300218 - Flags: review?(dietrich)
Depends on: 414285
Comment on attachment 300258 [details] [diff] [review]
v2.1

"error: 'mCurrentSearchTokens' was not declared in this scope"

missing files, or missing hunks?
Attachment #300258 - Flags: review?(dietrich) → review-
(In reply to comment #22)
> (From update of attachment 300258 [details] [diff] [review])
> "error: 'mCurrentSearchTokens' was not declared in this scope"
> missing files, or missing hunks?
This bug depends on bug 414285 which seth has looked at already.
Blocks: 415403
Comment on attachment 300258 [details] [diff] [review]
v2.1

Rerequesting review on this patch.

I printed out times that it takes FullHistorySearch to complete its call with the default prefs (1000chunk, 10timeout, 25rows)

trunk:

b 5 ms    
u 8 ms  
g 9 ms
  9 ms
4 13 ms
0 30 ms

That means it took 5ms to get 25 results that match "b", 8ms to match "bu",.. 30ms to match "bug 40"

Compared to after applying this patch:

b 5 ms
u 7 ms
g 9 ms
  9 ms
4 14 ms
0 22 ms
  21 ms
m 22 ms                      
o 31 ms                    
z 30 ms
i 29 ms
l 31 ms
l 29 ms
a 28 ms
  29 ms
. 30 ms
o 34 ms
r 36 ms
g 34 ms
  34 ms
s 35 ms
h 37 ms
o 37 ms
w 36 ms
_ 38 ms
b 38 ms
u 36 ms
g 37 ms
  37 ms
? 37 ms
i 38 ms
d 38 ms

In the end I searched for "bug 40 mozilla .org show_bug ?id"

I checked this with multiple runs and indeed "bug 40" goes faster than trunk. This is probably because "bug 40" doesn't have to match exactly "bug 40".. it can match titles like "bug 340", so it's able to find more results faster.
Attachment #300258 - Flags: review- → review?(dietrich)
So in the case that the user only types 1 word, the speed is pretty much the same. When the user starts typing out more words, it'll take some time.. nothing is free ;) But the additional costs aren't significantly more when keeping the pages matched between the two tests the same.

Sure, if I searched for some other terms, it'll take more time because it needs to find additional pages that trunk can't find. But for this, I'm comparing the speeds trying to keep the results the same.
Blocks: 406359
Comment on attachment 300258 [details] [diff] [review]
v2.1

r=me, thanks for the extra effort in perf testing.
Attachment #300258 - Flags: review?(dietrich) → review+
Comment on attachment 300258 [details] [diff] [review]
v2.1


>+        // True if any of them match; false makes us quit the loop
>+        matchAll = bookmarkMatch ||
>+          CaseInsensitiveFindInReadable(*token, entryTitle) ||
>+          CaseInsensitiveFindInReadable(*token, entryURL);
>+      }
>+
>+      // Skip if we don't match all terms in the bookmark, title or url
>+      if (!matchAll)
>         continue;

just need to match one, not all, right? please rename the variable and fix the comment before checking in.
Comment on attachment 300258 [details] [diff] [review]
v2.1

This will need bug 414285 before landing.
Attachment #300258 - Flags: approval1.9?
Comment on attachment 300258 [details] [diff] [review]
v2.1

Oh, this bug is blocking-firefox3+ ;) No need to approval.
Attachment #300258 - Flags: approval1.9?
Checking in toolkit/components/places/src/nsNavHistoryAutoComplete.cpp;
/cvsroot/mozilla/toolkit/components/places/src/nsNavHistoryAutoComplete.cpp,v  <--  nsNavHistoryAutoComplete.cpp
new revision: 1.39; previous revision: 1.38
done
RCS file: /cvsroot/mozilla/toolkit/components/places/tests/unit/test_multi_word_search.js,v
done
Checking in toolkit/components/places/tests/unit/test_multi_word_search.js;
/cvsroot/mozilla/toolkit/components/places/tests/unit/test_multi_word_search.js,v  <--  test_multi_word_search.js
initial revision: 1.1
done
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
20080205_2206_firefox-3.0b4pre.en-US.win32.zip

if search with one word, matched word is bold and underline.
if with two or more word, it it normal font.

intended ?
See bug 415403.
Depends on: 416960
This bug is verified in Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9b4pre) Gecko/2008021504 Minefield/3.0b4pre.  The underline issue is tracked in bug 415403.
Status: RESOLVED → VERIFIED
Is this in Firefox 3 beta 3? When I type "bug show", no entries show up, while I would expect all bugzilla show_bug.cgi entries to show up (by matching on the URL).

~Grauw
This will be in beta 4; see the target milestone.
Blocks: 418920
Depends on: 420165
No longer depends on: 420165
This is marked as fixed, but when you search for "moz illa", it will not find "mozilla" (or, perhaps a better example, "cyber news" won't find "cybernetnews"). Is this the intended behavior?

Especially with long URLs where the user doesn't remember the exact name but does remember parts of the name, I can imagine this will lead the user to believe he typed in the wrong keywords.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: