Closed Bug 377244 Opened 18 years ago Closed 8 years ago

[SoC] Places: Indexing Visited Pages

Categories

(Firefox :: Bookmarks & History, defect)

defect
Not set
normal

Tracking

()

RESOLVED INCOMPLETE

People

(Reporter: chofmann, Unassigned)

References

Details

Attachments

(1 file, 7 obsolete files)

Tracking bug for SoC project
Status: NEW → ASSIGNED
OS: Windows XP → All
Hardware: PC → All
Assignee: dietrich → vertex3d2004
Blocks: 342913
Status: ASSIGNED → NEW
Attached file [WIP] Class defintion for Tokenizer (obsolete) —
Attachment #269957 - Attachment mime type: text/x-chdr → text/plain
Attachment #269958 - Attachment mime type: text/x-c++src → text/plain
In response to SQLite developer's comments, I am changing the approach. FTS2 API is frozen now. Custom tokenizer can be added. FTS3 whose API will be similar to FTS2 will allow indexing external content. So FTS2 looks a good bet. So the following are work to do: 1) Create a token registry system 2) Speed up FTS3 to allow indexing external content. 3) Merge it back to Places.
Attached file Header file Indexer API (obsolete) —
Attachment #269957 - Attachment is obsolete: true
Attachment #269958 - Attachment is obsolete: true
Attached file Source Indexer API (obsolete) —
Attached file Test for the API (obsolete) —
It could not be xpcshell test because it is not XPCOM module, it is an internal API
Attached patch Makefile to run the test (obsolete) — Splinter Review
Depends on: 342915
Attached patch Full Text Indexing Patch (obsolete) — Splinter Review
The patch offers the following functionality: i) Index all visited pages ii) When visit expire, the page is deleted from the index iii) When user clears history, all the indices are gone Please apply patch attached to bug 342915 to make this work. The other patch is patch for FTS module which this patch uses.
Attachment #280234 - Attachment is obsolete: true
Attachment #280236 - Attachment is obsolete: true
Attachment #280237 - Attachment is obsolete: true
Attachment #280238 - Attachment is obsolete: true
Attachment #293361 - Flags: review?(dietrich)
(Two"#" got missing in the previous patch. Adding them) The patch offers the following functionality: i) Index all visited pages ii) When visit expire, the page is deleted from the index iii) When user clears history, all the indices are gone Please apply patch attached to bug 342915 to make this work. The other patch is patch for FTS module which this patch uses.
Attachment #293361 - Attachment is obsolete: true
Attachment #293363 - Flags: review?(dietrich)
Attachment #293361 - Flags: review?(dietrich)
Version: unspecified → Trunk
Comment on attachment 293363 [details] [diff] [review] Full Text Indexing Patch style: please use.. - 2 space indentation - 80 char line limits >Index: browser/components/places/content/fts.js >=================================================================== >RCS file: browser/components/places/content/fts.js >diff -N browser/components/places/content/fts.js >--- /dev/null 1 Jan 1970 00:00:00 -0000 >+++ browser/components/places/content/fts.js 16 Dec 2007 07:18:15 -0000 >@@ -0,0 +1,47 @@ >+var placesFTS = { >+ init: function() { >+ getBrowser().addEventListener("load", placesIndexer, true); >+ }, >+ >+ destroy: function() { >+ getBrowser().addEventListener("unload", placesIndexer, true); >+ }, >+ >+ handleEvent: function(aEvent) { >+ switch(aEvent.type) { >+ case "load": >+ this.init(); >+ break; >+ case "unload": >+ this.destroy(); >+ break; >+ } >+ } >+}; >+ >+var placesIndexer = { >+ init: function() { >+ if (!this.historyService) { >+ this.historyService = Components.classes["@mozilla.org/browser/nav-history-service;1"] >+ .getService(Components.interfaces.nsINavHistoryService); use PlacesUtils.history here >+ } >+ }, >+ >+ handleEvent: function(aEvent) { >+ if (aEvent.type == "load") { >+ this.init(); >+ var document = aEvent.originalTarget >+ var uri = document.documentURI; >+ if (document.body) { >+ var title = "" >+ if (document.title) { >+ title = document.title >+ } >+ this.historyService.addPageForIndexing(uri, title, document, document.body) >+ } >+ } >+ } >+}; >+ >+window.addEventListener("load", placesFTS, true); >+window.addEventListener("unload", placesFTS, true); - delayedStartup() would be a better place to initialize placesFTS - seems like you could compact these into a single object - should add this to browser-places.js, doesn't need a separate file >Index: browser/components/build/Makefile.in >=================================================================== >RCS file: /cvsroot/mozilla/browser/components/build/Makefile.in,v >retrieving revision 1.59 >diff -u -8 -p -r1.59 Makefile.in >--- browser/components/build/Makefile.in 5 Sep 2007 15:56:56 -0000 1.59 >+++ browser/components/build/Makefile.in 16 Dec 2007 07:24:43 -0000 >@@ -30,16 +30,17 @@ REQUIRES = \ > txmgr \ > chardet \ > migration \ > shellservice \ > xulapp \ > places \ > browserplaces \ > microsummaries \ >+ dom \ > $(NULL) > unused? >Index: browser/components/places/src/Makefile.in >=================================================================== >RCS file: /cvsroot/mozilla/browser/components/places/src/Makefile.in,v >retrieving revision 1.25 >diff -u -8 -p -r1.25 Makefile.in >--- browser/components/places/src/Makefile.in 27 Nov 2007 00:08:50 -0000 1.25 >+++ browser/components/places/src/Makefile.in 16 Dec 2007 07:25:42 -0000 >@@ -47,16 +47,17 @@ include $(DEPTH)/config/autoconf.mk > MODULE = browserplaces > LIBRARY_NAME = browserplaces_s > FORCE_STATIC_LIB = 1 > FORCE_USE_PIC = 1 > USE_STATIC_LIBS = 1 > > REQUIRES = \ > xpcom \ >+ dom \ > string \ ditto >Index: storage/src/mozStorageService.cpp >=================================================================== >RCS file: /cvsroot/mozilla/storage/src/mozStorageService.cpp,v >retrieving revision 1.11 >diff -u -8 -p -r1.11 mozStorageService.cpp >--- storage/src/mozStorageService.cpp 19 Jun 2007 02:22:02 -0000 1.11 >+++ storage/src/mozStorageService.cpp 16 Dec 2007 07:28:10 -0000 >@@ -84,17 +84,17 @@ mozStorageService::Init() > // The service must be initialized on the main thread. The > // InitStorageAsyncIO function creates a thread which is joined with the > // main thread during shutdown. If the thread is created from a random > // thread, we'll join to the wrong parent. > NS_ENSURE_STATE(NS_IsMainThread()); > > // this makes multiple connections to the same database share the same pager > // cache. >- sqlite3_enable_shared_cache(1); >+ //sqlite3_enable_shared_cache(1); > hrm, i know this is necessary, but this is disabled for a reason. /me looks for the bug... >+ >+ /** >+ * >+ */ >+ void addPageForIndexing(in AString aURI, in AString aTitle, >+ in nsIDOMDocumentTraversal aDocumentTraversal, >+ in nsIDOMNode aNode); >+ >+ nit: fix indent >Index: toolkit/components/places/src/Makefile.in >=================================================================== >RCS file: /cvsroot/mozilla/toolkit/components/places/src/Makefile.in,v >retrieving revision 1.26 >diff -u -8 -p -r1.26 Makefile.in >--- toolkit/components/places/src/Makefile.in 17 Jul 2007 21:08:28 -0000 1.26 >+++ toolkit/components/places/src/Makefile.in 16 Dec 2007 07:29:21 -0000 >@@ -86,16 +86,17 @@ CPPSRCS = \ > nsNavHistoryAutoComplete.cpp \ > nsNavHistoryExpire.cpp \ > nsNavHistoryQuery.cpp \ > nsNavHistoryResult.cpp \ > nsNavBookmarks.cpp \ > nsMaybeWeakPtr.cpp \ > nsMorkHistoryImporter.cpp \ > nsPlacesModule.cpp \ >+ nsFTIndexer.cpp \ > $(NULL) nit: fix indent >@@ -566,16 +571,18 @@ nsNavHistory::InitDB(PRBool *aDoImport) > // creating our statements. Some of our statements depend on these external > // tables, such as the bookmarks or favicon tables. > rv = nsNavBookmarks::InitTables(mDBConn); > NS_ENSURE_SUCCESS(rv, rv); > rv = nsFaviconService::InitTables(mDBConn); > NS_ENSURE_SUCCESS(rv, rv); > rv = nsAnnotationService::InitTables(mDBConn); > NS_ENSURE_SUCCESS(rv, rv); >+ rv = nsFTIndexer::InitTables(mDBConn); >+ NS_ENSURE_SUCCESS(rv, rv); this is worrisome. we might want to use a completely different db file for this data. 180 days of text index data might get quite large. can you please provide some data on how big 180 days might be? also, once we get talos try-servers, we can run this there and see what the alexa pageset db looks like. >@@ -2028,16 +2036,67 @@ nsNavHistory::AddVisit(nsIURI* aURI, PRT > if (obsService) > obsService->NotifyObservers(aURI, NS_LINK_VISITED_EVENT_TOPIC, nsnull); > } > > return NS_OK; > } > > >+NS_IMETHODIMP >+nsNavHistory::AddPageForIndexing(const nsAString& uri, const nsAString& title, >+ nsIDOMDocumentTraversal* aDocumentTraversal, >+ nsIDOMNode* aNode) { nit: please be consistent with prefixing parameters with "a". >- // for the most visited menu query >- // we generate a super-optimized SQL query >+ // for the most visited menu query >+ // we generate a super-optimized SQL query > if (IsHistoryMenuQuery(aQueries, aOptions, > nsINavHistoryQueryOptions::SORT_BY_VISITCOUNT_DESCENDING)) { > queryString = NS_LITERAL_CSTRING( > "SELECT h.id, h.url, h.title, h.rev_host, h.visit_count, " > "(SELECT MAX(visit_date) FROM moz_historyvisits WHERE place_id = h.id " >- " AND visit_type <> 4 AND visit_type <> 0), " >+ " AND visit_type <> 4 AND visit_type <> 0), " > "f.url, null, null " > "FROM moz_places h " > "LEFT OUTER JOIN moz_favicons f ON h.favicon_id = f.id WHERE " > "h.id IN (SELECT id FROM moz_places WHERE hidden <> 1 " >- " ORDER BY visit_count DESC LIMIT "); >+ " ORDER BY visit_count DESC LIMIT "); > queryString.AppendInt(aOptions->MaxResults()); > queryString += NS_LITERAL_CSTRING(") ORDER BY h.visit_count DESC"); > return NS_OK; > } a bunch of indent fixes
(In reply to comment #10) > use PlacesUtils.history here > - delayedStartup() would be a better place to initialize placesFTS > - seems like you could compact these into a single object > - should add this to browser-places.js, doesn't need a separate file Will do both of above > > >Index: browser/components/build/Makefile.in > >=================================================================== > >RCS file: /cvsroot/mozilla/browser/components/build/Makefile.in,v > >retrieving revision 1.59 > >diff -u -8 -p -r1.59 Makefile.in > >--- browser/components/build/Makefile.in 5 Sep 2007 15:56:56 -0000 1.59 > >+++ browser/components/build/Makefile.in 16 Dec 2007 07:24:43 -0000 > >@@ -30,16 +30,17 @@ REQUIRES = \ > > txmgr \ > > chardet \ > > migration \ > > shellservice \ > > xulapp \ > > places \ > > browserplaces \ > > microsummaries \ > >+ dom \ > > $(NULL) > > > > unused? It is used. Dont remember how exactly. Will find out how exactly and update the bug > > >Index: browser/components/places/src/Makefile.in > >=================================================================== > >RCS file: /cvsroot/mozilla/browser/components/places/src/Makefile.in,v > >retrieving revision 1.25 > >diff -u -8 -p -r1.25 Makefile.in > >--- browser/components/places/src/Makefile.in 27 Nov 2007 00:08:50 -0000 1.25 > >+++ browser/components/places/src/Makefile.in 16 Dec 2007 07:25:42 -0000 > >@@ -47,16 +47,17 @@ include $(DEPTH)/config/autoconf.mk > > MODULE = browserplaces > > LIBRARY_NAME = browserplaces_s > > FORCE_STATIC_LIB = 1 > > FORCE_USE_PIC = 1 > > USE_STATIC_LIBS = 1 > > > > REQUIRES = \ > > xpcom \ > >+ dom \ > > string \ > > ditto This is used, nsPlacesImportExportService.h includes nsINavHistoryService.h which includes nsIDOMDocumentTraversal.h, nsIDOMNode.h. Hence required > > >Index: storage/src/mozStorageService.cpp > >=================================================================== > >RCS file: /cvsroot/mozilla/storage/src/mozStorageService.cpp,v > >retrieving revision 1.11 > >diff -u -8 -p -r1.11 mozStorageService.cpp > >--- storage/src/mozStorageService.cpp 19 Jun 2007 02:22:02 -0000 1.11 > >+++ storage/src/mozStorageService.cpp 16 Dec 2007 07:28:10 -0000 > >@@ -84,17 +84,17 @@ mozStorageService::Init() > > // The service must be initialized on the main thread. The > > // InitStorageAsyncIO function creates a thread which is joined with the > > // main thread during shutdown. If the thread is created from a random > > // thread, we'll join to the wrong parent. > > NS_ENSURE_STATE(NS_IsMainThread()); > > > > // this makes multiple connections to the same database share the same pager > > // cache. > >- sqlite3_enable_shared_cache(1); > >+ //sqlite3_enable_shared_cache(1); > > > > hrm, i know this is necessary, but this is disabled for a reason. /me looks for > the bug... I commented that line. Having shared cached enable will cause FTS to fail. This is a documented thing with FTS. > >@@ -566,16 +571,18 @@ nsNavHistory::InitDB(PRBool *aDoImport) > > // creating our statements. Some of our statements depend on these external > > // tables, such as the bookmarks or favicon tables. > > rv = nsNavBookmarks::InitTables(mDBConn); > > NS_ENSURE_SUCCESS(rv, rv); > > rv = nsFaviconService::InitTables(mDBConn); > > NS_ENSURE_SUCCESS(rv, rv); > > rv = nsAnnotationService::InitTables(mDBConn); > > NS_ENSURE_SUCCESS(rv, rv); > >+ rv = nsFTIndexer::InitTables(mDBConn); > >+ NS_ENSURE_SUCCESS(rv, rv); > > this is worrisome. we might want to use a completely different db file for this > data. > > 180 days of text index data might get quite large. can you please provide some > data on how big 180 days might be? > > also, once we get talos try-servers, we can run this there and see what the > alexa pageset db looks like. I'll run some kind of simulation to find out.
(In reply to comment #11) > > >Index: browser/components/build/Makefile.in > > >=================================================================== > > >RCS file: /cvsroot/mozilla/browser/components/build/Makefile.in,v > > >retrieving revision 1.59 > > >diff -u -8 -p -r1.59 Makefile.in > > >--- browser/components/build/Makefile.in 5 Sep 2007 15:56:56 -0000 1.59 > > >+++ browser/components/build/Makefile.in 16 Dec 2007 07:24:43 -0000 > > >@@ -30,16 +30,17 @@ REQUIRES = \ > > > txmgr \ > > > chardet \ > > > migration \ > > > shellservice \ > > > xulapp \ > > > places \ > > > browserplaces \ > > > microsummaries \ > > >+ dom \ > > > $(NULL) > > > > > > > unused? > It is used. Dont remember how exactly. Will find out how exactly and update the > bug nsModule.cpp includes nsPlacesImportExportService.h which in a series of inclusion includes DOMTraversal and DOMNode
Comment on attachment 293363 [details] [diff] [review] Full Text Indexing Patch need changes in comment 11
Attachment #293363 - Flags: review?(dietrich)
Depends on: 413589
No longer depends on: 342915
And it needs to use fts3, which we started to ship thanks to bug 413589, instead of fts2.
i'm assuming this functionality will enable extensions to do a full history index like what opera does with visited pages? i've been playing around with the beta and it is very useful.
Blocks: 488968
No longer blocks: 488968
Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h". In Thunderbird 3.0b, you do that as follows: Tools | Message Filters Make sure the correct account is selected. Click "New" Conditions: Body contains places-to-b-and-h Change the action to "Delete Message". Select "Manually Run" from the dropdown at the top. Click OK. Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter. Gerv
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
Assignee: vertex3d2004 → nobody
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: