Write history full-text to .caminohistory files for Spotlight



Camino Graveyard
OS Integration
8 years ago
8 years ago


(Reporter: Smokey Ardisson (offline for a while; not following bugs - do not email), Unassigned)



(Whiteboard: p-safari)

One of the things I keep thinking about and keep meaning to file is piggy-backing on Safari's .webhistory importer to support full-text search in Spotlight for pages you've visited.

The .webhistory format is a simple plist, with full-text plain-text contents of the web page, title, and URL (see ~/Library/Caches/Metadata/Safari/History for samples).  We already know how to write bookmarks to the metadata cache (bug 292550 and bug 335163/bug 336497), and we know how to get the plain-text content of a web page (bug 374648); it should be fairly easy to put the two together (ha!).
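As a rough illustration of how small the "write a plist with text, title, and URL" step is, here is a minimal Python sketch (not Camino code). The key names, the `write_history_entry` helper, and the hash-based file naming are all assumptions made for the example; the real key names would need to be checked against a sample file and whatever importer ends up reading these.

```python
import hashlib
import plistlib
from pathlib import Path


def write_history_entry(cache_dir, url, title, page_text):
    """Write one visited page as a plist file and return the file's path."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming scheme: hash the URL so that revisiting a page
    # overwrites its existing entry instead of creating a second file.
    name = hashlib.sha1(url.encode("utf-8")).hexdigest() + ".caminohistory"

    # Assumed key names, chosen for illustration only.
    entry = {"URL": url, "Name": title, "Full Page Text": page_text}

    path = cache_dir / name
    with open(path, "wb") as f:
        plistlib.dump(entry, f)
    return path
```

Keying the file name on the URL is one way to handle the "content changed" case: rewriting the same file replaces the stale text.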

As a future enhancement to this enhancement, we could hook this cache/query up to our existing history search as a new data-source of sorts, and get full-text searches in our history search field (All, Title, Location, Page Content).

There are some issues to keep in mind, of course:

1) Perf (it's not useful to do this just at launch/quit, so we'd need to do it
   in the course of browsing, continually, perhaps in batches? Also,
   updating a .webhistory file when the content changes?)
2) Privacy
  a) Not writing .webhistory when History is set to 0 days.
  b) Depending on persistence of the .webhistory files, we might need to allow
     a hidden pref to not write .webhistory files at all, even if history is
     enabled, since we're now exporting this stuff beyond the bounds of Camino.
3) Persistence of the .webhistory files - do we delete them when the history
   item ages out of history, or do they stay for all time?  (For instance, I
   have Safari .webhistory files dating back to Fall 2007 for pages I've
   not/never visited since.)
2c) If a user at some point sets history to 0 days after previously having it set to a nonzero value, do we then delete the stored .webhistory files?
If we make sure we follow the fix outlined in bug 335163 (we need to), users can add the ~/Library/Caches/Metadata/Camino/History folder to the Spotlight Privacy list to keep it from being indexed, although that may not be enough.
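To make the privacy and persistence questions above concrete, here is one possible policy sketched in Python. It is only an assumption about how the open questions might be answered, not a decision: entries are written only when history is enabled and a hypothetical hidden pref (`export_disabled`) is off, entries older than the history window are deleted, and setting history to 0 days deletes everything. Using file modification time as the "last visited" date is also an assumption.

```python
import time
from pathlib import Path

SECONDS_PER_DAY = 86400


def should_write_entries(history_days, export_disabled=False):
    """Write entries only if history is kept and the hidden pref is off."""
    return history_days > 0 and not export_disabled


def prune_entries(cache_dir, history_days, now=None):
    """Delete entries that fall outside the history window.

    With history_days <= 0, every entry is deleted. Returns the sorted
    names of the files that were removed.
    """
    now = time.time() if now is None else now
    cutoff = now - history_days * SECONDS_PER_DAY
    removed = []
    for path in Path(cache_dir).glob("*.caminohistory"):
        # Assumption: file mtime stands in for the last-visit date.
        if history_days <= 0 or path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path.name)
    return sorted(removed)
```

Running the prune step whenever the history pref changes, and alongside normal history expiration, would cover both the aging-out case in 3) and the switch-to-0-days case in 2c).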

See also bug 549179 for issues with making the results of Finder Spotlight searches for Camino metadata actually be useful.
Note that after bug 549179, we're no longer piggy-backing off of Safari's importer, but the general principles behind this bug are still the same.
pink promoted this to a vetted project on the SoC list, so confirming (per discussion with smorgan on IRC).
Ever confirmed: true
Summary: Write history full-text to .webhistory files for Spotlight → Write history full-text to .caminohistory files for Spotlight