Open Bug 342913 Opened 14 years ago Updated 1 year ago

[meta] Tracking: Add full-text indexing to bookmark and/or history searches

Categories

(Firefox :: Bookmarks & History, enhancement, P3)

People

(Reporter: pamg.bugs, Unassigned)

References

(Depends on 1 open bug)

Details

(Keywords: meta, parity-chrome, Whiteboard: [fxsearch])

Now that SQLite, the back-end storage for Places, is going to support
full-text indexing (http://www.sqlite.org/cvstrac/wiki?p=FullTextIndex), we can
significantly enhance Firefox's search capabilities by searching more
than titles and URLs.

Among the design questions to be considered are what to index, how and where to display the results, and what options to support.

This bug is intended to serve as both a tracking bug for individual portions of the work and a discussion point (in addition to mozilla.dev.apps.firefox) for design ideas.
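
As a rough illustration of the idea above (not the Places schema), the following sketch uses Python's built-in sqlite3 module to index page text and find a page by a word that appears in neither its title nor its URL. The table name `page_text` and its columns are hypothetical, and the sketch assumes the local SQLite build ships an FTS module (it tries FTS5, then FTS4).

```python
import sqlite3


def create_index(conn):
    """Create a full-text table, preferring FTS5 and falling back to FTS4."""
    for module in ("fts5", "fts4"):
        try:
            # page_text and its columns are hypothetical names for this sketch.
            conn.execute(
                f"CREATE VIRTUAL TABLE page_text USING {module}(url, title, body)"
            )
            return module
        except sqlite3.OperationalError:
            continue  # this module is not compiled into the local SQLite
    raise RuntimeError("this SQLite build has no FTS module compiled in")


def search(conn, query):
    """Return the URLs of all rows whose indexed text matches the query."""
    rows = conn.execute(
        "SELECT url FROM page_text WHERE page_text MATCH ? ORDER BY url",
        (query,),
    )
    return [url for (url,) in rows]


conn = sqlite3.connect(":memory:")
create_index(conn)
conn.executemany(
    "INSERT INTO page_text (url, title, body) VALUES (?, ?, ?)",
    [
        ("http://example.org/lucene", "Lucene overview",
         "Lucene uses a tokenizer for each language."),
        ("http://example.org/places", "Places storage",
         "Places keeps bookmarks and history in SQLite."),
    ],
)

# "tokenizer" appears only in the page body, not in any title or URL.
print(search(conn, "tokenizer"))
```

Whether the body text would live in the same database as the rest of Places, and how it would be kept in sync, are exactly the design questions this bug leaves open.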
Depends on: 342915
Depends on: 342916
Depends on: 342917
Blocks: 154114
Depends on: 342919
Depends on: 342920
Note http://skrul.com/blog/projects/xpclucene/ , a CLucene-based solution, which is less elegant than using SQLite but still worth mentioning.
I'm happy to see Mozilla starting to work on natural language processing.

So are we really going with SQLite? I'm not sure yet, but I'm afraid
that full-text indexing in SQLite will not work well with CJK languages.

中文と日本語では、単語が空白で区切られていません。
BothinChineseandJapanese, wordsarenotseparatedwithwhitespaces.

To solve this problem, Lucene uses "Tokenizer"s for specific languages.
http://lucene.apache.org/java/docs/api/org/apache/lucene/analysis/class-use/Analyzer.html
http://lucene.apache.org/java/docs/api/org/apache/lucene/analysis/Tokenizer.html
However, CJKAnalyzer works poorly, and I don't believe SQLite will
implement anything better than that any time soon.
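
To make the problem concrete, here is a small sketch showing that whitespace splitting returns a whole unspaced CJK phrase as one token, alongside the overlapping character-bigram fallback that, as far as I know, is roughly what CJKAnalyzer does. The function names are made up for this example, and this is not Lucene's or SQLite's actual code.

```python
def whitespace_tokens(text):
    """Split on whitespace, which is all a naive tokenizer does."""
    return text.split()


def char_bigrams(text):
    """Return overlapping two-character tokens, ignoring whitespace."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


# An unspaced phrase comes back as a single token, so a search for one
# word inside it cannot match on token boundaries.
print(whitespace_tokens("日本語の単語"))

# Bigrams make substrings findable, but many of them are not real words,
# which is why the results tend to be noisy.
print(char_bigrams("日本語の単語"))
```

A morphological analyzer would instead try to return the actual words, at the cost of needing a dictionary or a trained model for each language.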

On the other hand, I guess commercial search engines such as Google
and Yahoo use some kind of linguistic approach, especially
morphological analysis, though I don't know exactly what they are doing.
http://en.wikipedia.org/wiki/Morpheme
http://en.wikipedia.org/wiki/Conditional_random_field

Unless we are able to tokenize input properly, this feature
will never reach product-level quality. In any case, the backend
is more than just a wrapper around a SQL database. At the very least,
we should support switching tokenizers depending on the language.
I doubt any CJK tokenizer will be included with SQLite by default, but the text-indexing system is designed to allow developers to plug in custom functions.  The trick will be finding (or writing) good ones.
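
One possible shape for the language-dependent tokenizer switching discussed above is a simple registry keyed by language code, sketched below. Every name here (`TOKENIZERS`, `tokenize`, the two tokenizer functions) is hypothetical, and the tokenizing happens in application code before indexing; this is not SQLite's tokenizer plug-in interface.

```python
from typing import Callable, Dict, List

Tokenizer = Callable[[str], List[str]]


def whitespace_tokenizer(text: str) -> List[str]:
    """Default: lowercase and split on whitespace."""
    return text.lower().split()


def bigram_tokenizer(text: str) -> List[str]:
    """Crude CJK fallback: overlapping two-character tokens."""
    chars = [c for c in text if not c.isspace()]
    if len(chars) < 2:
        return ["".join(chars)] if chars else []
    return [chars[i] + chars[i + 1] for i in range(len(chars) - 1)]


# Hypothetical registry; better per-language tokenizers could be
# plugged in here without touching the callers.
TOKENIZERS: Dict[str, Tokenizer] = {
    "ja": bigram_tokenizer,
    "zh": bigram_tokenizer,
    "ko": bigram_tokenizer,
}


def tokenize(text: str, lang: str) -> List[str]:
    """Pick a tokenizer from the primary language subtag, e.g. 'zh-CN' -> 'zh'."""
    primary = lang.split("-")[0].lower()
    return TOKENIZERS.get(primary, whitespace_tokenizer)(text)


print(tokenize("Full Text Search", "en"))
print(tokenize("日本語", "ja"))
```

The hard part, as noted above, is not the switch itself but finding or writing good tokenizers to register, and deciding how a page's language is detected in the first place.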
Blocks: 374945
Depends on: 377244
Flags: blocking-firefox3?
Target Milestone: Firefox 3 → Firefox 3 alpha6
Flags: blocking-firefox3? → blocking-firefox3-
Whiteboard: [wanted-firefox3]
Version: 2.0 Branch → Trunk
Target Milestone: Firefox 3 alpha6 → Firefox 3 beta1
Target Milestone: Firefox 3 M7 → Firefox 3
Flags: wanted-firefox3+
Whiteboard: [wanted-firefox3]
Depends on: 413589
No longer depends on: 342915
Target Milestone: Firefox 3 → ---
Duplicate of this bug: 485747
Blocks: 488968
Blocks: 365992
Whiteboard: [parity-chrome]
Duplicate of this bug: 682099
I want the words I type in the address bar to match the title of the pages in my history. I believe that Firefox does that now.
I don't want them to match the text content à la Chrome, because it is too heavy and slows everything down.
But being able to search the text content of the pages in the History panel and window would be great!
Priority: -- → P3
Whiteboard: [parity-chrome] → [parity-chrome][fxsearch]
Duplicate of this bug: 1255612
Depends on: 1340487
Mass bug change to replace various 'parity' whiteboard flags with the new canonical keywords. (See bug 1443764 comment 13.)
Keywords: parity-chrome
Whiteboard: [parity-chrome][fxsearch] → [fxsearch]

(In reply to Nicolas Barbulesco from comment #7)

I want the words I type in the address bar to match the title of the pages
in my history. I believe that Firefox does that now.
I don't want them to match the text content à la Chrome, because it is too
heavy and slows everything down.
But being able to search the text content of the pages in the History panel
and window would be great!

Yes, if this is implemented, it should preferably be optional and off by default.

Summary: Tracking: Add full-text indexing to bookmark and/or history searches → [meta] Tracking: Add full-text indexing to bookmark and/or history searches