401869 - Allow multiple words search in Auto-complete/Location Bar

Reporter

Description

•

17 years ago

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9a9pre) Gecko/2007103023 Minefield/3.0a9pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9a9pre) Gecko/2007103023 Minefield/3.0a9pre

Since we now allow search on 'Titles', 'Bookmarks' and 'History' in the Location Bar, I think, for most optimal implementation we should also allow searching possible for multiple words.

e.g. 
Lets say a user has the following entry in his history:

http://reason.com/news/printer/122024.html [Reason Magazine - Creative Destruction vs. the New Industrial State]

Now 
i. if the user types for "reason", this history entry will show up
ii. if he types "reason magazine", again this history will show up

But
if he types "reason industrial" this entry will not show up.

I think it should.



Reproducible: Always

Steps to Reproduce:
1.
2.
3.

ronin.achilles

Reporter

Updated

•

17 years ago

Flags: blocking-firefox3?

Kevin Brosnan

Comment 1

•

17 years ago

I don't think this behavior is going to change because a wildcard search is much slower and resource intensive than a string search. If you go to bookmarks -> organize bookmarks -> search: history -> then type the disjoint words the item appears.

On my system with a 20mb places file the address bar search is snappy where as the history search hangs Firefox for 2-3s. Hanging Firefox each time the user accesses the address bar is unacceptable.

Mike Connor [:mconnor]

Comment 2

•

17 years ago

This isn't a wildcard search, this is "foo AND bar" instead of "foo bar"

There's a bug on this already, and its something we should clearly do.

Flags: blocking-firefox3?

Whiteboard: DUPEME

ronin.achilles

Reporter

Comment 3

•

17 years ago

Seth: Is there a Bug on this already. I see that you have just landed the multiple tag search but it does not take care of History or Bookmark Titles.

Besides in multiple tag search as well as multiple words search, aren't we going to allow random ordering of words? i.e. if a user has a tag called "read later"... will he not be able to search on "read later" as well as "later read"?

(not reading, please use seth@sspitzer.org instead)

Comment 4

•

17 years ago

> Seth: Is there a Bug on this already. 

I couldn't find one, so let's use this one.  thanks for logging it and for the reminder.

> I see that you have just landed the multiple tag search but it does not take > care of History or Bookmark Titles.

right, that was just for tag searching.  But when we implement this, we can re-use some of that code.

> Besides in multiple tag search as well as multiple words search, aren't we
> going to allow random ordering of words? i.e. if a user has a tag called "read
> later"... will he not be able to search on "read later" as well as "later
> read"?

even after the landing of bug #395452, that does not work.  I'll log a spin off bug about that.

Status: UNCONFIRMED → NEW

Ever confirmed: true

Flags: blocking-firefox3?

OS: Windows Vista → All

Hardware: PC → All

Whiteboard: DUPEME

Version: unspecified → Trunk

(not reading, please use seth@sspitzer.org instead)

Comment 5

•

17 years ago

> I'll log a spin off bug about that.

see bug #405320

Mike Beltzner [:beltzner, not reading bugmail]

Updated

•

17 years ago

Flags: blocking-firefox3? → blocking-firefox3+

Priority: -- → P2

ronin.achilles

Reporter

Comment 6

•

17 years ago

(In reply to comment #4)

> > I see that you have just landed the multiple tag search but it does not take > care of History or Bookmark Titles.
> 
> right, that was just for tag searching.  But when we implement this, we can
> re-use some of that code.
> 
> > Besides in multiple tag search as well as multiple words search, aren't we
> > going to allow random ordering of words? i.e. if a user has a tag called "read
> > later"... will he not be able to search on "read later" as well as "later
> > read"?
> 
> even after the landing of bug #395452, that does not work.  I'll log a spin off
> bug about that.
> 

Just a clarification... from my initial comment above, I just wanted to understand if the implementation that you are planning will take care of the scenario where random order + random placements will be taken care of (for Titles, URLs and Tags)...

so for example if the Tag is "read this later" and a user searches for "read later" or "later read", this tag should show up, i.e. even if the word "this" has been left out. Same goes for titles. So basically the words could be placed anywhere (and not necessarily in continuous order) in the Tag name, Title or URL.

Is my understanding correct?

ronin.achilles

Reporter

Comment 7

•

17 years ago

Attached image multiple words search — Details

An example of multiple word search using del.icio.us direct.or

Mike Connor [:mconnor]

Updated

•

17 years ago

Assignee: nobody → sspitzer

Priority: P2 → P1

Dietrich Ayala (:dietrich)

Comment 8

•

17 years ago

un-assigning seth's places/autocomplete bugs.

Assignee: moco → nobody

AndrewM

Comment 9

•

17 years ago

I think this should be marked blocking bug 405745 '[meta] improve url bar autocomplete results'.

Ed Lee :Mardak

Assignee

Comment 10

•

17 years ago

It would be nice if this feature searched both the title and url at the same time.
E.g., "digg.com pic" would match "digg.com" in the url and "pic" in the title.

Blocks: 405745

Ed Lee :Mardak

Assignee

Comment 11

•

17 years ago

Attached image screenshot of v1 — Details

Example of the search finding "plan" and "blog" on ".org" sites (so it doesn't match planet mozilla blog at blog.mozilla.com/planet)

Ed Lee :Mardak

Assignee

Comment 12

•

17 years ago

Attached patch v1 (obsolete) — Details — Splinter Review

This adds a matchAll(title, url, query) sql function that matches query terms (separated by spaces) against the title and url at the same time.

Here's some builds that have this patch (as well as those for global frecency bug 394038, adaptive learning bug 395739, as well as stopping early when we have enough results)

https://build.mozilla.org/tryserver-builds/2008-01-19_00:36-edward.lee@engineering.uiuc.edu-glob.fast.adap.book.matc/

Assignee: nobody → edilee

Status: NEW → ASSIGNED

Attachment #297971 - Flags: review?(dietrich)

Ed Lee :Mardak

Assignee

Comment 13

•

17 years ago

One issue I've run into using these builds is typing out data uris or any really long address can lead to perf issues.

sdwilsh: On a somewhat separate issue, would implementing this as the MATCH function in mozStorageUnicodeFunctions.cpp avoid some overheads of creating my own function? The majority of the logic should be the same as what I have in v1 patch, but I'm assuming calling a StorageUnicodeFunctions::RegisterFunctions'd function is faster than what I have right now.

Shawn Wilsher :sdwilsh

Comment 14

•

17 years ago

Yeah - there'll be lots of overhead this way.  What exactly does that function need to do?  Just see if something exists inside the string like CONTAINS type of thing?

Ed Lee :Mardak

Assignee

Comment 15

•

17 years ago

It does the MATCH functionality that's usually for fulltext searching. Meaning it looks for all the space separated words in a string. (In this case I have to target strings, but that can be concated into 1 before calling MATCH.)

MATCH returns 1 if all plain/literal search terms are in the target.

Shawn Wilsher :sdwilsh

Comment 16

•

17 years ago

So, why can't you build a query using a series of LIKE statements?

http://sqlite.org/lang_expr.html

Ed Lee :Mardak

Assignee

Comment 17

•

17 years ago

You mean convert..
WHERE matchAll(title, url, 'term1 term2 term3')

into..
(title LIKE '%term1%' OR url LIKE '%term1%') AND (title LIKE '%term2%' OR url LIKE '%term2%') AND (title LIKE '%term3%' OR url LIKE '%term3%')

.. which would require dynamically generating the query based on the number of search terms?

Additionally, LIKE has unnecessary logic for processing _ and % when MATCH is just for the literal term.

Shawn Wilsher :sdwilsh

Comment 18

•

17 years ago

Alright, so a few comments:
1) The name would probably be better if it were MATCH_ALL_TERMS_WITH_WORDS.
2) You should take a variable number of arguments, so it'll look like this:
WHERE MATCH_ALL_TERMS_WITH_WORDS(title, url, term1, term2, ...)
Yes, that means you'll need to dynamically build the query, but it goes along with how SQL functions tend to work.

If you do that, I'd be happy to include that into mozStorageUnicodeFunctions.

*but*, are you sure this is why you are taking such a perf hit?  If you can't get dtrace, I'd suggest you have someone else run it to see what's up.

Mike Connor [:mconnor]

Updated

•

17 years ago

Target Milestone: --- → Firefox 3 beta4

Dietrich Ayala (:dietrich)

Comment 19

•

17 years ago

Comment on attachment 297971 [details] [diff] [review]
v1

this is obsolete given the recent AC changes, right?

Attachment #297971 - Flags: review?(dietrich)

Ed Lee :Mardak

Assignee

Comment 20

•

17 years ago

Attached patch v2 (obsolete) — Details — Splinter Review

(In reply to comment #19)
> (From update of attachment 297971 [details] [diff] [review])
> this is obsolete given the recent AC changes, right?
Ah indeed. I've reimplemented it completely differently based on how we're filtering results now.

Attachment #297971 - Attachment is obsolete: true

Attachment #300218 - Flags: review?(dietrich)

Ed Lee :Mardak

Assignee

Comment 21

•

17 years ago

Attached patch v2.1 — Details — Splinter Review

Now with more unit tests! :)

let gTests = [
  ["0: Match 2 terms all in url", [[0,0]]],
  ["1: Match 1 term in url and 1 term in title", [[0,0],[1,1]]],
  ["2: Match 3 terms all in title; display bookmark title if matched", [[1,1],[3,1]]],
  ["3: Match 2 terms in url and 1 in title; bookmark title didn't match", [[2,0],[3,0]]],
  ["4: Match 3 terms in url and 1 in title", [[1,1]]],
  ["5: Match nothing", []],
];

Attachment #300218 - Attachment is obsolete: true

Attachment #300258 - Flags: review?(dietrich)

Attachment #300218 - Flags: review?(dietrich)

Ed Lee :Mardak

Assignee

Updated

•

17 years ago

Depends on: 414285

Dietrich Ayala (:dietrich)

Comment 22

•

17 years ago

Comment on attachment 300258 [details] [diff] [review]
v2.1

"error: 'mCurrentSearchTokens' was not declared in this scope"

missing files, or missing hunks?

Attachment #300258 - Flags: review?(dietrich) → review-

Ed Lee :Mardak

Assignee

Comment 23

•

17 years ago

(In reply to comment #22)
> (From update of attachment 300258 [details] [diff] [review])
> "error: 'mCurrentSearchTokens' was not declared in this scope"
> missing files, or missing hunks?
This bug depends on bug 414285 which seth has looked at already.

Ed Lee :Mardak

Assignee

Updated

•

17 years ago

Blocks: 415403

Ed Lee :Mardak

Assignee

Comment 24

•

17 years ago

Comment on attachment 300258 [details] [diff] [review]
v2.1

Rerequesting review on this patch.

I printed out times that it takes FullHistorySearch to complete its call with the default prefs (1000chunk, 10timeout, 25rows)

trunk:

b 5 ms    
u 8 ms  
g 9 ms
  9 ms
4 13 ms
0 30 ms

That means it took 5ms to get 25 results that match "b", 8ms to match "bu",.. 30ms to match "bug 40"

Compared to after applying this patch:

b 5 ms
u 7 ms
g 9 ms
  9 ms
4 14 ms
0 22 ms
  21 ms
m 22 ms                      
o 31 ms                    
z 30 ms
i 29 ms
l 31 ms
l 29 ms
a 28 ms
  29 ms
. 30 ms
o 34 ms
r 36 ms
g 34 ms
  34 ms
s 35 ms
h 37 ms
o 37 ms
w 36 ms
_ 38 ms
b 38 ms
u 36 ms
g 37 ms
  37 ms
? 37 ms
i 38 ms
d 38 ms

In the end I searched for "bug 40 mozilla .org show_bug ?id"

I checked this with multiple runs and indeed "bug 40" goes faster than trunk. This is probably because "bug 40" doesn't have to match exactly "bug 40".. it can match titles like "bug 340", so it's able to find more results faster.

Attachment #300258 - Flags: review- → review?(dietrich)

Ed Lee :Mardak

Assignee

Comment 25

•

17 years ago

So in the case that the user only types 1 word, the speed is pretty much the same. When the user starts typing out more words, it'll take some time.. nothing is free ;) But the additional costs aren't significantly more when keeping the pages matched between the two tests the same.

Sure, if I searched for some other terms, it'll take more time because it needs to find additional pages that trunk can't find. But for this, I'm comparing the speeds trying to keep the results the same.

Ed Lee :Mardak

Assignee

Updated

•

17 years ago

Blocks: 406359

Dietrich Ayala (:dietrich)

Comment 26

•

17 years ago

Comment on attachment 300258 [details] [diff] [review]
v2.1

r=me, thanks for the extra effort in perf testing.

Attachment #300258 - Flags: review?(dietrich) → review+

Dietrich Ayala (:dietrich)

Comment 27

•

17 years ago

Comment on attachment 300258 [details] [diff] [review]
v2.1


>+        // True if any of them match; false makes us quit the loop
>+        matchAll = bookmarkMatch ||
>+          CaseInsensitiveFindInReadable(*token, entryTitle) ||
>+          CaseInsensitiveFindInReadable(*token, entryURL);
>+      }
>+
>+      // Skip if we don't match all terms in the bookmark, title or url
>+      if (!matchAll)
>         continue;

just need to match one, not all, right? please rename the variable and fix the comment before checking in.

Ed Lee :Mardak

Assignee

Comment 28

•

17 years ago

Comment on attachment 300258 [details] [diff] [review]
v2.1

This will need bug 414285 before landing.

Attachment #300258 - Flags: approval1.9?

Ed Lee :Mardak

Assignee

Comment 29

•

17 years ago

Comment on attachment 300258 [details] [diff] [review]
v2.1

Oh, this bug is blocking-firefox3+ ;) No need to approval.

Attachment #300258 - Flags: approval1.9?

Ed Lee :Mardak

Assignee

Comment 30

•

17 years ago

Checking in toolkit/components/places/src/nsNavHistoryAutoComplete.cpp;
/cvsroot/mozilla/toolkit/components/places/src/nsNavHistoryAutoComplete.cpp,v  <--  nsNavHistoryAutoComplete.cpp
new revision: 1.39; previous revision: 1.38
done
RCS file: /cvsroot/mozilla/toolkit/components/places/tests/unit/test_multi_word_search.js,v
done
Checking in toolkit/components/places/tests/unit/test_multi_word_search.js;
/cvsroot/mozilla/toolkit/components/places/tests/unit/test_multi_word_search.js,v  <--  test_multi_word_search.js
initial revision: 1.1
done

Status: ASSIGNED → RESOLVED

Closed: 17 years ago

Flags: in-testsuite+

Resolution: --- → FIXED

pal-moz

Comment 31

•

17 years ago

20080205_2206_firefox-3.0b4pre.en-US.win32.zip

if search with one word, matched word is bold and underline.
if with two or more word, it it normal font.

intended ?

Ed Lee :Mardak

Assignee

Comment 32

•

17 years ago

See bug 415403.

Nickolay_Ponomarev

Updated

•

17 years ago

Depends on: 416960

Tony Chung [:tchung]

Comment 33

•

17 years ago

This bug is verified in Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9b4pre) Gecko/2008021504 Minefield/3.0b4pre.  The underline issue is tracked in bug 415403.

Status: RESOLVED → VERIFIED

u81239

Comment 34

•

17 years ago

Is this in Firefox 3 beta 3? When I type "bug show", no entries show up, while I would expect all bugzilla show_bug.cgi entries to show up (by matching on the URL).

~Grauw

Dão Gottwald [:dao]

Comment 35

•

17 years ago

This will be in beta 4; see the target milestone.

Ed Lee :Mardak

Assignee

Updated

•

17 years ago

Blocks: 418920

[:Aleksej]

Updated

•

17 years ago

Depends on: 420165

[:Aleksej]

Updated

•

17 years ago

No longer depends on: 420165

Arnoud

Comment 36

•

16 years ago

This is marked as fixed, but when you search for "moz illa", it will not find "mozilla" (or, perhaps a better example, "cyber news" won't find "cybernetnews"). Is this the intended behavior?

Especially with long URLs where the user doesn't remember the exact name but does remember parts of the name, I can imagine this will lead the user to believe he typed in the wrong keywords.

multiple words search 17 years ago ronin.achilles 48.90 KB, image/png		Details
screenshot of v1 17 years ago Ed Lee :Mardak 44.59 KB, image/png		Details
v1 17 years ago Ed Lee :Mardak 7.94 KB, patch		Details \| Diff \| Splinter Review
v2 17 years ago Ed Lee :Mardak 3.00 KB, patch		Details \| Diff \| Splinter Review
v2.1 17 years ago Ed Lee :Mardak 10.93 KB, patch	dietrich : review+	Details \| Diff \| Splinter Review