Closed Bug 480211 Opened 15 years ago Closed 15 years ago

Stop expiring history on every page visit

Categories

(Toolkit :: Places, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla1.9.2a1

People

(Reporter: sdwilsh, Assigned: sdwilsh)

References

()

Details

Attachments

(1 file, 8 obsolete files)

nsNavHistory::addURI expires history on every visit.  This hurts perf.

We'd like to move this flushing out of data to the same time that we do the database syncing to reduce the number of fsyncs we do as well.
Attached patch v0.1 (obsolete) — Splinter Review
oddly enough, this passes tests...
Attachment #364198 - Attachment description: v0.11 → v0.1
Attached patch v0.2 (obsolete) — Splinter Review
whoops - forgot to hg add my new idl file
Attachment #364198 - Attachment is obsolete: true
It should be noted that this doesn't use browser.history_expire_days_min.  I'm not sure if we care or not since we'll still do that in the other instances that we expire.  Feedback from dietrich and Marco appreciated here.
so we actually expire:
- onAddURI: if we can't find anything to expire we skip some adduri though, here we do a small expiration chunk
- onIdle: here we expire a larger chunk
- onQuit: we expire all orphan places

on every expiration we do:
- find visits to expire, those are all visits < expire_history_days, all visits between expire_history_days and expire_history_days_min that are above expire_sites)
- expire visits
- expire places for which we have erased visits and are now orphans

you solution will leave behind places, that means those places will be expired onQuit, causing an increase in Tshutdown. At least you should replace eraseHistory with expireHistoryParanoid, so there is a possibility it runs onIdle before we shutdown cleaning up orphan places, that would help shutdown too. Yes i think onIdle we could call expireHistoryParanoid, that will remove some work from shutdown.

Expiration of items between expire_days and expire_days_min happens only if we are over the sites limit, personally i've never seen us going over it with default 90-180 days values, so should not be a big problem, in future we should probably go back to an hard cut, users didn't find this implementation understandable.

I still see this as a big complication of our code, without any code reduction, and is only doing 1/3 of what we are doing actually.

would it be possible do the opposite? so in current impl add a timer, on timer historyExpire will build up a list of visit and place ids to be expired and passes it to dbflush. on next timed sync will simply delete those, and maybe use an expireHistoryParanoid query (maybe limited?)... that would also remove the need of onQuit places expiration since that would happen continuously, cleaning small pieces at a time.
(In reply to comment #4)
> would it be possible do the opposite? so in current impl add a timer, on timer
> historyExpire will build up a list of visit and place ids to be expired and
> passes it to dbflush. on next timed sync will simply delete those, and maybe
> use an expireHistoryParanoid query (maybe limited?)... that would also remove
> the need of onQuit places expiration since that would happen continuously,
> cleaning small pieces at a time.
That feels out of scope for 3.1 to me.
(In reply to comment #4)
> - onIdle: here we expire a larger chunk
We expire 200 here, and remove any entries in moz_places that we have orphaned at this time.

> - onQuit: we expire all orphan places
We actually only expire up to 500.


> you solution will leave behind places, that means those places will be expired
> onQuit, causing an increase in Tshutdown. At least you should replace
> eraseHistory with expireHistoryParanoid, so there is a possibility it runs
> onIdle before we shutdown cleaning up orphan places, that would help shutdown
> too. Yes i think onIdle we could call expireHistoryParanoid, that will remove
> some work from shutdown.
We can fix this by calling ExpireHistoryParanoid on idle (among other things).
Attached patch v0.3 (obsolete) — Splinter Review
I think this addresses the concerns raised so far.
Attachment #364200 - Attachment is obsolete: true
(In reply to comment #7)
> Created an attachment (id=364353) [details]
> v0.3
> 
> I think this addresses the concerns raised so far.

can you enumerate the differences, and how it address which concerns?
The difference being that we now remove orphans on idle as well as on shutdown.  We do remove half as many normal "items" on idle though so we still try and respect MAX_EXPIRE_RECORDS_ON_IDLE.
(In reply to comment #4)
> you solution will leave behind places, that means those places will be expired
> onQuit, causing an increase in Tshutdown. At least you should replace
> eraseHistory with expireHistoryParanoid, so there is a possibility it runs
> onIdle before we shutdown cleaning up orphan places, that would help shutdown
> too. Yes i think onIdle we could call expireHistoryParanoid, that will remove
> some work from shutdown.

i recommended that we only expire visits on flush, not moz_places. full expiration at flush-time is not a goal. i definitely agree with you that we should be more aggressive on expireHistoryParanoid during idle, to make up for this.

> I still see this as a big complication of our code, without any code reduction,
> and is only doing 1/3 of what we are doing actually.

the goal is not full replacement of nsNavHistoryExpire.cpp. the goal is in the bug summary: to not tie expiration to page-load. this patch consolidates these db write/delete activities into a central area, while also moving them off of the UI thread. moving this code to JS is another win IMO, and a pattern I'd like to continue for work that doesn't need to be synchronous.

> would it be possible do the opposite? so in current impl add a timer, on timer
> historyExpire will build up a list of visit and place ids to be expired and
> passes it to dbflush. on next timed sync will simply delete those, and maybe
> use an expireHistoryParanoid query (maybe limited?)... that would also remove
> the need of onQuit places expiration since that would happen continuously,
> cleaning small pieces at a time.

this approach requires a new timer, and splits the logic into two locations. i don't see how that's an improvement over the approach in this patch. it seems more complex and error prone. however, i like your idea of doing more of the expireHistoryParanoid work more often, so please file a followup for that.
(In reply to comment #10)
> this approach requires a new timer, and splits the logic into two locations. i
> don't see how that's an improvement over the approach in this patch. it seems
> more complex and error prone. however, i like your idea of doing more of the
> expireHistoryParanoid work more often, so please file a followup for that.
I've actually done that in this patch already.
Comment on attachment 364353 [details] [diff] [review]
v0.3

>diff --git a/toolkit/components/places/public/Makefile.in b/toolkit/components/places/public/Makefile.in
>--- a/toolkit/components/places/public/Makefile.in
>+++ b/toolkit/components/places/public/Makefile.in
>@@ -60,6 +60,7 @@ XPIDLSRCS += \
>   nsIDynamicContainer.idl \
>   nsITaggingService.idl  \
>   nsPIPlacesDatabase.idl \
>+  nsPIHistoryObserverNotificationsData.idl \
>   $(NULL)
> endif
> 
>diff --git a/toolkit/components/places/public/nsPIHistoryObserverNotificationsData.idl b/toolkit/components/places/public/nsPIHistoryObserverNotificationsData.idl
>new file mode 100644
>--- /dev/null
>+++ b/toolkit/components/places/public/nsPIHistoryObserverNotificationsData.idl
>@@ -0,0 +1,72 @@
>+/* -*- Mode: C++; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*-
>+ * vim: sw=2 ts=2 sts=2
>+ * ***** BEGIN LICENSE BLOCK *****
>+ * Version: MPL 1.1/GPL 2.0/LGPL 2.1
>+ *
>+ * The contents of this file are subject to the Mozilla Public License Version
>+ * 1.1 (the "License"); you may not use this file except in compliance with
>+ * the License. You may obtain a copy of the License at
>+ * http://www.mozilla.org/MPL/
>+ *
>+ * Software distributed under the License is distributed on an "AS IS" basis,
>+ * WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
>+ * for the specific language governing rights and limitations under the
>+ * License.
>+ *
>+ * The Original Code is Places code
>+ *
>+ * The Initial Developer of the Original Code is
>+ * Mozilla Corporation.
>+ * Portions created by the Initial Developer are Copyright (C) 2009
>+ * the Initial Developer. All Rights Reserved.
>+ *
>+ * Contributor(s):
>+ *   Shawn Wilsher <me@shawnwilsher.com> (Original Author)
>+ *
>+ * Alternatively, the contents of this file may be used under the terms of
>+ * either the GNU General Public License Version 2 or later (the "GPL"), or
>+ * the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
>+ * in which case the provisions of the GPL or the LGPL are applicable instead
>+ * of those above. If you wish to allow use of your version of this file only
>+ * under the terms of either the GPL or the LGPL, and not to allow others to
>+ * use your version of this file under the terms of the MPL, indicate your
>+ * decision by deleting the provisions above and replace them with the notice
>+ * and other provisions required by the GPL or the LGPL. If you do not delete
>+ * the provisions above, a recipient may use your version of this file under
>+ * the terms of any one of the MPL, the GPL or the LGPL.
>+ *
>+ * ***** END LICENSE BLOCK ***** */
>+
>+#include "nsISupports.idl"
>+
>+interface nsIURI;
>+
>+/**
>+ * This is a private interface used by Places components to pass back to the
>+ * history service to notify observers about expired pages.

"to pass back to the history service to notify..." missing something between "pass" and "back", or just grammar fail?

>+ */
>+[scriptable, uuid(1662b6da-57f7-4ca4-be47-65ba9393fe75)]
>+interface nsPIHistoryObserverNotificationsExpirationData : nsISupports

the filename doesn't match the interface name. i'd prefer they both be shorter, eg nsPIPlacesExpirationData or something like that.

>+{
>+  /**
>+   * Obtains an array of the nsIURIs to notify about.  This array has a
>+   * one-to-one mapping with the array from getVisitDates.
>+   *
>+   * @param aCount
>+   *        The number of elements in the returned array.
>+   * @returns an array of nsIURIs.
>+   */
>+  void getURIs(out unsigned long aCount,
>+                   [retval, array, size_is(aCount)] out nsIURI aURIs);

s/Obtains/Returns/

the comment could be clearer. are these URIs that have been deleted, or will be deleted? by one-to-one, do you mean that it's the same, or that they match date-to-uri?

nit: over-indented

>+
>+  /**
>+   * Obtains an array of the visit dates to notify about.  This array has a
>+   * one-to-one mapping with the array from getURIs.
>+   *
>+   * @param aCount
>+   *        The number of elements in the returned array.
>+   * @returns an array of visit dates.
>+   */
>+  void getVisitDates(out unsigned long aCount,
>+                     [retval, array, size_is(aCount)] out PRTime aDates);

ditto on clarity

to here...
patch v0.3 (expire orphans on idle) should be a good way to reduce the risk of shutdown time increases, if the risk of complicating sync code is acceptable, i'm fine with it, we have enough tests for sync.

just for completion, my idea was, on a timed sync we call a private history interface, it will give us back list of visits to be expired, we expire them with sync. historyExpire can then wait for the next sync-finished notification, and call observers (it already knows about both uris and visits).
advantages are: we only need one private method to pass visitIds, we have far less changes in sync code and we don't touch batch.
cons is that sync has to wait to receive ids to be expired, and we have the async expiration logic spread across 2 files.

Going to take a deeper look at the code now.
reminder: as said on IRC test_expiration actually passes because we still call onAddURI when expiration prefs change, see nsNavHistoryExpire::OnExpirationChanged. we should change the test that actually is wrong, ensuring tests both addURI and OnExpirationChanged paths
filed bug 480437 on test_expiration.
Attached patch v0.4 (obsolete) — Splinter Review
Less hackyness, more win.  Tests in a bit.
Attachment #364353 - Attachment is obsolete: true
OS: Mac OS X → All
Hardware: x86 → All
Attached patch v1.0 (obsolete) — Splinter Review
Now with more tests (but no cowbell)
Attachment #364422 - Attachment is obsolete: true
Attachment #364429 - Flags: review?(mak77)
Attachment #364429 - Flags: review?(dietrich)
Comment on attachment 364429 [details] [diff] [review]
v1.0


>@@ -138,16 +139,17 @@ class nsNavHistory : public nsSupportsWe
> class nsNavHistory : public nsSupportsWeakReference,
>                      public nsINavHistoryService,
>                      public nsIObserver,
>                      public nsIBrowserHistory,
>                      public nsIGlobalHistory3,
>                      public nsIDownloadHistory,
>                      public nsICharsetResolver
>                    , public nsPIPlacesDatabase
>+                   , public nsPIPlacesHistoryListenersNotifier
> #ifdef MOZ_XUL
>                      , public nsIAutoCompleteSearch,
>                      public nsIAutoCompleteSimpleResultListener

nit: fix indent, since you're there

>diff --git a/toolkit/components/places/src/nsNavHistoryExpire.cpp b/toolkit/components/places/src/nsNavHistoryExpire.cpp
>--- a/toolkit/components/places/src/nsNavHistoryExpire.cpp
>+++ b/toolkit/components/places/src/nsNavHistoryExpire.cpp
>@@ -412,16 +412,48 @@ nsNavHistoryExpire::ExpireItems(PRUint32
>     NS_WARNING("ExpireAnnotations failed.");
> 
>   rv = transaction.Commit();
>   NS_ENSURE_SUCCESS(rv, rv);
> 
>   return NS_OK;
> }
> 
>+// nsNavHistoryExpire::ExpireOrphans
>+//
>+//    Here, we try to expire aNumToExpire items that are orphans.  aNumToExpire
>+//    only limits how many moz_places we worry about.  Everything else is
>+//    completely expired.
>+
>+nsresult
>+nsNavHistoryExpire::ExpireOrphans(PRUint32 aNumToExpire)
>+{

ensure aNumToExpire is a valid value?

>diff --git a/toolkit/components/places/src/nsPlacesDBFlush.js b/toolkit/components/places/src/nsPlacesDBFlush.js
>--- a/toolkit/components/places/src/nsPlacesDBFlush.js
>+++ b/toolkit/components/places/src/nsPlacesDBFlush.js
>@@ -46,64 +46,98 @@ const Cc = Components.classes;
> const Cc = Components.classes;
> const Ci = Components.interfaces;
> const Cr = Components.results;
> const Cu = Components.utils;
> 
> const kQuitApplication = "quit-application";
> const kSyncFinished = "places-sync-finished";
> 
>-const kSyncPrefName = "syncDBTableIntervalInSecs";
>+const kSyncPrefName = "places.syncDBTableIntervalInSecs";
> const kDefaultSyncInterval = 120;
>+const kExpireDaysPrefName = "browser.history_expire_days";
>+const kDefaultExpireDays = 90;
>+
>+// The number of milliseconds in a day.
>+const kMSPerDay = 86400000;
>+
>+// The max number of entries we will flush out when we expire.
>+const kMaxExpire = 12;

talked IRL, double this for now...

>+
>+// Query Constants.  These describe some of the queries we use.
>+const kQuerySyncPlaces = 0;
>+const kQuerySyncHistoryVisits = 1;
>+const kQuerySelectExpire = 2;
>+const kQueryExpire = 3;

kinda want "Id" appended to these, since these consts aren't actually references to the queries themselves.

>   //////////////////////////////////////////////////////////////////////////////
>   //// mozIStorageStatementCallback
>+
>+  handleResult: function DBFlush_handleResult(aResultSet)
>+  {
>+    // The only results we'll ever get back is for notifying about expiration.
>+    if (!this._expiredResults)
>+      this._expiredResults = []
>+
>+    let row;
>+    while (row = aResultSet.getNextRow()) {
>+      if (row.getResultByName("hidden"))
>+        continue;
>+
>+      this._expiredResults.push({
>+        uri: this._ios.newURI(row.getResultByName("url"), null, null),
>+        visitDate: row.getResultByName("visit_date")
>+      });
>+    }
>+  },

why process here, then re-process again on completion? why not just set this._expiredResults = aResultSet

>   handleCompletion: function DBFlush_handleCompletion(aReason)
>   {
>     if (aReason == Ci.mozIStorageStatementCallback.REASON_FINISHED) {
>+      // Dispatch to history that we've finished expiring if we have results.
>+      if (this._expiredResults) {
>+        let hsn = Cc["@mozilla.org/browser/nav-history-service;1"].
>+                  getService(Ci.nsPIPlacesHistoryListenersNotifier);
>+        while (this._expiredResults.length) {
>+          let visit = this._expiredResults.shift();
>+          hsn.notifyOnPageExpired(visit.uri, visit.visitDate);
>+        }
>+
>+        // And reset it...
>+        delete this._expiredResults;
>+      }
>+

was worried about double-notifications and lots of xpconnect crossings, but since we're doing fairly small chunks of work, likely ok.

>+
>+      case kQuerySelectExpire:
>+        // Determine which entries will be flushed out entries from
>+        // moz_historyvisits when kQueryExpire runs.

an extraneous "entries" in there?

code changes look mostly ok so far, still need to go through the tests.
Attachment #364429 - Flags: review?(dietrich) → review+
Comment on attachment 364429 [details] [diff] [review]
v1.0


>diff --git a/toolkit/components/places/tests/sync/test_expire_on_flush.js b/toolkit/components/places/tests/sync/test_expire_on_flush.js
>new file mode 100644
>--- /dev/null
>+++ b/toolkit/components/places/tests/sync/test_expire_on_flush.js
>@@ -0,0 +1,104 @@
>+/* -*- Mode: C++; tab-width: 2; indent-tabs-mode: nil; c-basic-offset: 2 -*-
>+ * vim: sw=2 ts=2 sts=2 expandtab
>+ * ***** BEGIN LICENSE BLOCK *****
>+ * Version: MPL 1.1/GPL 2.0/LGPL 2.1
>+ *
>+ * The contents of this file are subject to the Mozilla Public License Version
>+ * 1.1 (the "License"); you may not use this file except in compliance with
>+ * the License. You may obtain a copy of the License at
>+ * http://www.mozilla.org/MPL/
>+ *
>+ * Software distributed under the License is distributed on an "AS IS" basis,
>+ * WITHOUT WARRANTY OF ANY KIND, either express or implied. See the License
>+ * for the specific language governing rights and limitations under the
>+ * License.
>+ *
>+ * The Original Code is mozilla.org code.
>+ *
>+ * The Initial Developer of the Original Code is
>+ * Mozilla Corporation.
>+ * Portions created by the Initial Developer are Copyright (C) 2009
>+ * the Initial Developer. All Rights Reserved.
>+ *
>+ * Contributor(s):
>+ *   Shawn Wilsher <me@shawnwilsher.com> (Original Author)
>+ *
>+ * Alternatively, the contents of this file may be used under the terms of
>+ * either the GNU General Public License Version 2 or later (the "GPL"), or
>+ * the GNU Lesser General Public License Version 2.1 or later (the "LGPL"),
>+ * in which case the provisions of the GPL or the LGPL are applicable instead
>+ * of those above. If you wish to allow use of your version of this file only
>+ * under the terms of either the GPL or the LGPL, and not to allow others to
>+ * use your version of this file under the terms of the MPL, indicate your
>+ * decision by deleting the provisions above and replace them with the notice
>+ * and other provisions required by the GPL or the LGPL. If you do not delete
>+ * the provisions above, a recipient may use your version of this file under
>+ * the terms of any one of the MPL, the GPL or the LGPL.
>+ *
>+ * ***** END LICENSE BLOCK ***** */
>+
>+var hs = Cc["@mozilla.org/browser/nav-history-service;1"].
>+         getService(Ci.nsINavHistoryService);
>+var prefs = Cc["@mozilla.org/preferences-service;1"].
>+            getService(Ci.nsIPrefBranch);
>+var os = Cc["@mozilla.org/observer-service;1"].
>+         getService(Ci.nsIObserverService);
>+
>+const TEST_URI = "http://test.com/";
>+
>+const kSyncPrefName = "places.syncDBTableIntervalInSecs";
>+const SYNC_INTERVAL = 1;
>+const kSyncFinished = "places-sync-finished";
>+
>+const kExpireDaysPrefName = "browser.history_expire_days";
>+const EXPIRE_DAYS = 1;
>+
>+var observer = {
>+  notificationRecieved: false,
>+  syncCount: 0,
>+  observe: function(aSubject, aTopic, aData) {
>+    if (aTopic == kSyncFinished) {
>+      print("observer being called");
>+      // We won't have expired on the first sync (we expire first, then sync),
>+      // so we have to wait for the second sync to check.
>+      if (this.syncCount == 0) {
>+        this.syncCount++;
>+        return;
>+      }

blech. maybe the notification data for places-sync-finished should contain identifiers for the flush actions taken?

>+
>+      // Ensure that we received a notification about expiration.
>+      do_check_true(this.notificationRecieved);
>+
>+      // remove the observer, we don't need to observe sync on quit
>+      os.removeObserver(this, kSyncFinished);
>+      // Check the visit
>+      new_test_visit_uri_event(place, TEST_URI, false, true);
>+    }
>+  }
>+}
>+os.addObserver(observer, kSyncFinished, false);
>+
>+// Used to update observer visitId
>+var historyObserver = {
>+  onPageExpired: function(aURI, aVisitTime, aWholeEntry)
>+  {
>+    do_check_true(aURI.equals(uri(TEST_URI)));
>+    observer.notificationRecieved = true;
>+    hs.removeObserver(this, false);
>+  }
>+}
>+hs.addObserver(historyObserver, false);
>+
>+var place;

for clarity, define up top before it's first referenced.

r=me otherwise. one more thing though: please do a full audit of the cases and conditions tested in test_expiration.js and make sure that cases that your new code replaces have accompanying tests.
Comment on attachment 364429 [details] [diff] [review]
v1.0

>+  /**
>+   * Calls onPageExpired on registered listeners with the history service.
>+   *
>+   * @param aURI
>+   *        The nsIURI object representing the URI of the page being expired.
>+   * @param aVisitTime
>+   *        The time, in microseconds, that the page being expired was visited.
>+   */

add a note to look at nsINavHistoryObserver

>+  return NS_OK;
>+}
>+
>+// nsPIPlacesHistoryListenersNotifier ******************************************
>+
>+NS_IMETHODIMP
>+nsNavHistory::NotifyOnPageExpired(nsIURI *aURI, PRTime aVisitTime)
>+{
>+  // XXX aWholeEntry has never been passed correctly and it has always been
>+  //     dispatched as false.

hm, looks like 
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/src/nsNavHistoryExpire.cpp#395
could dispatch aWholeEntry == true (EraseHistory sets .erased to true)

>+// nsNavHistoryExpire::ExpireOrphans
>+//
>+//    Here, we try to expire aNumToExpire items that are orphans.  aNumToExpire
>+//    only limits how many moz_places we worry about.  Everything else is
>+//    completely expired.
>+
>+nsresult
>+nsNavHistoryExpire::ExpireOrphans(PRUint32 aNumToExpire)

could you look if possible to call this onQuit instead of duplicating code?

>+// Query Constants.  These describe some of the queries we use.
>+const kQuerySyncPlaces = 0;
>+const kQuerySyncHistoryVisits = 1;
>+const kQuerySelectExpire = 2;
>+const kQueryExpire = 3;

maybe would be more clear
kQuerySelectVisitsToExpire
kQueryExpireVisits

>+  let (pb2 = this._prefs.QueryInterface(Ci.nsIPrefBranch2)) {
>+    pb2.addObserver(kSyncPrefName, this, false);
>+    pb2.addObserver(kExpireDaysPrefName, this, false);
>+  }

where are these removed?

>   //////////////////////////////////////////////////////////////////////////////
>   //// mozIStorageStatementCallback
>+
>+  handleResult: function DBFlush_handleResult(aResultSet)
>+  {
>+    // The only results we'll ever get back is for notifying about expiration.
>+    if (!this._expiredResults)
>+      this._expiredResults = []

semicolon got lost

>     if (aReason == Ci.mozIStorageStatementCallback.REASON_FINISHED) {
>+      // Dispatch to history that we've finished expiring if we have results.
>+      if (this._expiredResults) {
>+        let hsn = Cc["@mozilla.org/browser/nav-history-service;1"].
>+                  getService(Ci.nsPIPlacesHistoryListenersNotifier);

we get history service with 2 different interfaces, we get it for dbConn, on close for commitPendingStatements, and here...i guess if we could not add an history service lazy getter and make other getters simple QUeryInterface calls on that.

>   /**
>    * Finalizes all of our mozIStorageStatements so we can properly close the
>    * database.

where are the new statements finalized?

tests review follows
Comment on attachment 364429 [details] [diff] [review]
v1.0


>diff --git a/toolkit/components/places/tests/sync/test_expire_on_flush.js b/toolkit/components/places/tests/sync/test_expire_on_flush.js

call this test_expire_on_timed_flush.js please


>+var observer = {
>+  notificationRecieved: false,

typo: should not be received?

>+  syncCount: 0,

_ on virtually private property _syncCount

>+      if (this.syncCount == 0) {
>+        this.syncCount++;
>+        return;
>+      }

if (++this.syncCount == 1)
  return;

is more compact and still readable, imho better having compact tests

>+// Used to update observer visitId

wrong comment

>+var place;
>+function run_test()
>+{
>+  // First set the preference for the timer to a small value
>+  prefs.setIntPref(kSyncPrefName, SYNC_INTERVAL);
>+  prefs.setIntPref(kExpireDaysPrefName, EXPIRE_DAYS);

better setting expireDaysPrefName before sync interval...
notice that setting this will trigger expiration, if there's any visit in the db that could be expired potentially this will call you history observer.

>+
>+  // Now add the visit two days ago, in microseconds
>+  let date = Date.now() * 1000 - 2 * 24 * 60 * 60 * 1000000;

put the ms->us conversion (* 1000) out of the calculation, should be more readable.

>+  place = hs.addVisit(uri(TEST_URI), date, null, hs.TRANSITION_TYPED, false, 0);

addvisit returns a visitId, not a place, btw i'd prefer if you set a property in the observer instead of a global var.

>diff --git a/toolkit/components/places/tests/sync/test_no_expire_on_flush.js b/toolkit/components/places/tests/sync/test_no_expire_on_flush.js

call this test_no_expire_on_forced_flush.js please

my other tests comments apply to this too, i won't repeat them

>+  syncCount: 0,

as in other test

>+  // First set the preference for the timer to a small value

should not we set this to a large value so we are sure a timed sync does not run?


>+  // Now add the visit for now, in microseconds
>+  let date = Date.now() * 1000;
>+  place = hs.addVisit(uri(TEST_URI), date, null, hs.TRANSITION_TYPED, false, 0);

the visit should potentially be expirable, since we want to be sure it won't be expired
Attachment #364429 - Flags: review?(mak77) → review+
i originally thought your second test was to check on forced sync we don't expire... but after clarification on IRC, i think second test path is correct, i'd then like to see a test for my case, so "on forced sync (add bookmark) we don't expire"
(In reply to comment #18)
> ensure aNumToExpire is a valid value?
It's an unsigned 32-bit integer.  I'm not sure what an invalid value could be.

> why process here, then re-process again on completion? why not just set
> this._expiredResults = aResultSet
We can get more than one call to handleResults, so we cannot do this.

> was worried about double-notifications and lots of xpconnect crossings, but
> since we're doing fairly small chunks of work, likely ok.
The amount of xpconnect crossings wouldn't really be that bad anyway since most of our listeners are native code.

(In reply to comment #19)
> blech. maybe the notification data for places-sync-finished should contain
> identifiers for the flush actions taken?
Not sure how that would help.  Note that I have to wait for the second sync only because we expire first, and they data has not made it into our permanent tables yet from the temporary ones.  Perhaps I am misunderstanding your point though.

(In reply to comment #20)
> hm, looks like 
> http://mxr.mozilla.org/mozilla-central/source/toolkit/components/places/src/nsNavHistoryExpire.cpp#395
> could dispatch aWholeEntry == true (EraseHistory sets .erased to true)
Hurg...I misread this code in the past.  I'll have to update stuff to pass the right value here, as well as a test for this.

> could you look if possible to call this onQuit instead of duplicating code?
They both do very different things, so I think this isn't the best of ideas.

> we get history service with 2 different interfaces, we get it for dbConn, on
> close for commitPendingStatements, and here...i guess if we could not add an
> history service lazy getter and make other getters simple QUeryInterface calls
> on that.
Didn't quite go that far, but I did add a lazy getter for hsn.

> where are the new statements finalized?
_finalizeInternalStatements iterates all of _cachedStatements so as long as we add to it, we don't have to worry about it.
Attached patch v1.1 (obsolete) — Splinter Review
Forgot to add the test Marco asked for, but posting this while I work on that.
Attachment #364429 - Attachment is obsolete: true
Attachment #364558 - Flags: review?(mak77)
(In reply to comment #24)
> Forgot to add the test Marco asked for, but posting this while I work on that.
Actually, I'll argue that that test doesn't make sense to do.  The test would have to be heavily implementation dependent, and could start to pass if we suddenly if we change the order in which we do queries.  I think we'd be better suited to add a comment in nsPlacesDBFlush.js when we sync for bookmarks to indicate that we do not want to expire due to performance concerns.
Attached patch v1.2 (obsolete) — Splinter Review
Per my previous comments
Attachment #364558 - Attachment is obsolete: true
Attachment #364572 - Flags: review?(mak77)
Attachment #364572 - Flags: review?(dietrich)
Attachment #364558 - Flags: review?(mak77)
Comment on attachment 364572 [details] [diff] [review]
v1.2

>diff --git a/toolkit/components/places/src/nsNavHistory.cpp b/toolkit/components/places/src/nsNavHistory.cpp
>--- a/toolkit/components/places/src/nsNavHistory.cpp
>+++ b/toolkit/components/places/src/nsNavHistory.cpp
>@@ -90,16 +90,17 @@
> #include "mozIStorageStatement.h"
> #include "mozIStorageFunction.h"
> #include "mozStorageCID.h"
> #include "mozStorageHelper.h"
> #include "nsPlacesTriggers.h"
> #include "nsAppDirectoryServiceDefs.h"
> #include "nsIIdleService.h"
> #include "nsILivemarkService.h"
>+#include "nsPIHistoryObserverNotificationsData.h"

This header looks like a forgot from previous versions patches.

>diff --git a/toolkit/components/places/src/nsNavHistoryExpire.cpp b/toolkit/components/places/src/nsNavHistoryExpire.cpp
>--- a/toolkit/components/places/src/nsNavHistoryExpire.cpp
>+++ b/toolkit/components/places/src/nsNavHistoryExpire.cpp
>@@ -200,28 +200,19 @@ nsNavHistoryExpire::OnQuit()
>   nsCOMPtr<nsIPrefBranch> prefs(do_GetService("@mozilla.org/preferences-service;1"));
>   PRBool sanitizeOnShutdown, sanitizeHistory;
>   prefs->GetBoolPref(PREF_SANITIZE_ON_SHUTDOWN, &sanitizeOnShutdown);
>   prefs->GetBoolPref(PREF_SANITIZE_ITEM_HISTORY, &sanitizeHistory);
>   if (sanitizeHistory && sanitizeOnShutdown)
>     return;
> 
>   // vacuum up dangling items

vacuum verb looks confusing due to the fact sqlite supports such a command, the function name now is self explanatory, so i think this comment can go away (or be fixed to "expire orphan items")

>+// nsNavHistoryExpire::ExpireOrphans
>+//
>+//    Here, we try to expire aNumToExpire items that are orphans.  aNumToExpire
>+//    only limits how many moz_places we worry about.  Everything else is
>+//    completely expired.

"Everything else" is a bit too much "crowded space", reader thinks "what's everything"?

>+  this.__defineGetter__("_hsn", function() {
>+    delete this._hsn;
>+    return this._hsn = Cc["@mozilla.org/browser/nav-history-service;1"].
>+                       getService(Ci.nsPIPlacesHistoryListenersNotifier);
>+  });

i still think makes more sense lazy get history service, but no problem taking this.

>   onItemAdded: function(aItemId, aParentId, aIndex)
>   {
>-    // Sync only if we added a TYPE_BOOKMARK item
>+    // Sync only if we added a TYPE_BOOKMARK item.  Note, we don't want to run
>+    // the least amount of queries as possible here for performance reasons.

this won't apply cleanly on 1.9.1, so i really hope we can sync places code between branch and trunk before taking this (there are 2 patches that involve nsDBFlush waiting for approval IIRC

But i don't understand the comment "We don't want run least amount of queries"? should not be "We want"?

>+      this._expiredResults.push({
>+        uri: this._ios.newURI(row.getResultByName("url"), null, null),
>+        visitDate: row.getResultByName("visit_date"),
>+        wholeEntry: (row.getResultByName("visit_count") == 1)

i guess what happens if we expire a visit with visit_type 4 or 7 (that doesn't change visit_count)

>diff --git a/toolkit/components/places/tests/sync/test_expire_on_timed_flush.js b/toolkit/components/places/tests/sync/test_expire_on_timed_flush.js

>+      if (this._syncCount++ == 0)
>+        return;

even if it's correct, to me var++==0 looks more confusing than ++var==1, it's probably a personal vision though.

>+  // First set our preferences to small values.
>+  prefs.setIntPref(kExpireDaysPrefName, EXPIRE_DAYS);

i've still some fear about this, but since this will catch expiration after 3.5s, our sync expiration should happen always before.

>+  prefs.setIntPref(kSyncPrefName, SYNC_INTERVAL);
>+
>+  // Sanity check to ensure that our test uri is not already visited.
>+  do_check_false(bh.isVisited(uri(TEST_URI)));

better calling bh.removeAllPages() on test start? so we run in a clean env.

>+  // Now add the visit two days ago.
>+  let date = Date.now() - 2 * 24 * 60 * 60 * 1000; // (milliseconds)

let dateInMs would be self-explanatory

>diff --git a/toolkit/components/places/tests/sync/test_expire_on_timed_flush_not_whole_entry.js b/toolkit/components/places/tests/sync/test_expire_on_timed_flush_not_whole_entry.js
>diff --git a/toolkit/components/places/tests/sync/test_expire_on_timed_flush_obeys_date.js b/toolkit/components/places/tests/sync/test_expire_on_timed_flush_obeys_date.js

same as above
Attachment #364572 - Flags: review?(mak77) → review+
(In reply to comment #27)
> i guess what happens if we expire a visit with visit_type 4 or 7 (that doesn't
> change visit_count)
We don't notify about hidden visits, so I don't think that's a problem.

> better calling bh.removeAllPages() on test start? so we run in a clean env.
We should be getting a clean one from the head file anyway, but I'd want this check in as a sanity check anyway.
Attached patch v1.3 (obsolete) — Splinter Review
Addresses comments
Attachment #364572 - Attachment is obsolete: true
Attachment #364580 - Flags: review?(dietrich)
Attachment #364572 - Flags: review?(dietrich)
Comment on attachment 364580 [details] [diff] [review]
v1.3

r=me. some comment clarity and grammar nits below.

>+[scriptable, uuid(b96adaff-e02c-48da-a379-8af5d10e09af)]
>+interface nsPIPlacesHistoryListenersNotifier : nsISupports
>+{
>+  /**
>+   * Calls onPageExpired on registered listeners with the history service.
>+   *
>+   * @param aURI
>+   *        The nsIURI object representing the URI of the page being expired.
>+   * @param aVisitTime
>+   *        The time, in microseconds, that the page being expired was visited.
>+   * @param aWholeEntry
>+   *        Indicates if this is the last visit in entry for this URI.
>+   */
>+  void notifyOnPageExpired(in nsIURI aURI, in PRTime aVisitTime,
>+                           in boolean aWholeEntry);
>+};

"last visit in entry" ?

clearer to just do s/in entry//

>diff --git a/toolkit/components/places/src/nsNavHistoryExpire.cpp b/toolkit/components/places/src/nsNavHistoryExpire.cpp
>--- a/toolkit/components/places/src/nsNavHistoryExpire.cpp
>+++ b/toolkit/components/places/src/nsNavHistoryExpire.cpp
>@@ -199,29 +199,20 @@ nsNavHistoryExpire::OnQuit()
>   // because we be doing a full expiration on shutdown in ClearHistory()
>   nsCOMPtr<nsIPrefBranch> prefs(do_GetService("@mozilla.org/preferences-service;1"));
>   PRBool sanitizeOnShutdown, sanitizeHistory;
>   prefs->GetBoolPref(PREF_SANITIZE_ON_SHUTDOWN, &sanitizeOnShutdown);
>   prefs->GetBoolPref(PREF_SANITIZE_ITEM_HISTORY, &sanitizeHistory);
>   if (sanitizeHistory && sanitizeOnShutdown)
>     return;
> 
>-  // vacuum up dangling items
>-  rv = ExpireHistoryParanoid(connection, EXPIRATION_CAP_PLACES);
>+  // Get rid of all orphans now that we've created.
>+  rv = ExpireOrphans(EXPIRATION_CAP_PLACES);
>   if (NS_FAILED(rv))

"Get rid of all records orphaned due to expiration."

>@@ -412,16 +403,48 @@ nsNavHistoryExpire::ExpireItems(PRUint32
>     NS_WARNING("ExpireAnnotations failed.");
> 
>   rv = transaction.Commit();
>   NS_ENSURE_SUCCESS(rv, rv);
> 
>   return NS_OK;
> }
> 
>+// nsNavHistoryExpire::ExpireOrphans
>+//
>+//    Here, we try to expire aNumToExpire items that are orphans.  aNumToExpire
>+//    only limits how many moz_places we worry about.  Everything else
>+//    (favicons, annotations, and input history) is completely expired.
>+

s/Here, we try/Try/

>+nsresult
>+nsNavHistoryExpire::ExpireOrphans(PRUint32 aNumToExpire)
>+{
>+  mozIStorageConnection* connection = mHistory->GetStorageConnection();
>+  NS_ENSURE_TRUE(connection, NS_ERROR_OUT_OF_MEMORY);
>+
>+  mozStorageTransaction transaction(connection, PR_FALSE);
>+
>+  // Expire our entries
>+  nsresult rv = ExpireHistoryParanoid(connection, aNumToExpire);
>+  NS_ENSURE_SUCCESS(rv, rv);

this comment is not helpful. "ExpireHistoryParanoid" is ambiguously named, doesn't indicate whether moz_places or moz_historyvisits or both are effected. either describe the specific action taken or just remove the comment.

>+// Query Constants.  These describe some of the queries we use.
>+const kQuerySyncPlacesId = 0;
>+const kQuerySyncHistoryVisitsId = 1;
>+const kQuerySelectExpireVisitsId = 2;
>+const kQueryExpireVisitsId = 3;

some? should say what queries are not described and why


>+
>+  // Get our maximum expiration in days.

"maximum expiration" is ambiguous. would be good to indicate that the value is the maximum allowable age of history visits, in days.

>+  try {
>+    // We want to silently fail if the preference does not exist, and use a
>+    // default to fallback to.
>+    this._expireDays = this._prefs.getIntPref(kExpireDaysPrefName);
>+    if (this._expireDays <= 0)
>+      this._expireDays = kDefaultExpireDays;
>+  }
>+  catch (e) {
>+    // The preference did not exist, so use the default.
>+    this._expireDays = kDefaultExpireDays;
>   }

"we want to silently fail" is an odd way to put it. would be better to explain that get*Pref throws for non-existent prefs, hence the try/catch.

we srsly need to get myk's preferences module in core...


>   //////////////////////////////////////////////////////////////////////////////
>   //// nsPlacesDBFlush
>   _syncInterval: kDefaultSyncInterval,
> 
>   /**
>-   * Execute async statements to sync temporary places table.
>-   * @param aTableNames
>-   *        array of table names that should be synced, as moz_{TableName}_temp.
>+   * Execute async statements to flush tables with the specified queries..

s/.././
Attachment #364580 - Flags: review?(dietrich) → review+
Attached patch v1.4Splinter Review
Addresses review comments.  Ready to land.
Attachment #364580 - Attachment is obsolete: true
On a fresh profile, running Tp3, no reduction of fsyncs.  Gonna see if I can't use an existing places.sqlite...
And same results with my own places.sqlite file.  This may end up being trivial gains in the tests I've ran, but the theoretical gains are nice.  We end up reducing fsyncs on the main thread, as well as writes by a large amount.  It's possible the places.sqlite files I'm testing just aren't triggering this situation.
http://hg.mozilla.org/mozilla-central/rev/ed2854522e78
Status: ASSIGNED → RESOLVED
Closed: 15 years ago
Flags: in-testsuite+
Flags: in-litmus-
Resolution: --- → FIXED
Comment on attachment 364952 [details] [diff] [review]
v1.4

> nsNavHistory::OnIdle()
..
>   if (idleTime > EXPIRE_IDLE_TIME_IN_MSECS) {
>+    (void)mExpire.ExpireItems(MAX_EXPIRE_RECORDS_ON_IDLE / 2, &keepGoing);
>+    (void)mExpire.ExpireOrphans(MAX_EXPIRE_RECORDS_ON_IDLE / 2);
I thought expiration was happening when it's time to flush the temp table. Why are we still doing this extra idle timer check outside of the flushing code?
(In reply to comment #35)
> I thought expiration was happening when it's time to flush the temp table. Why
> are we still doing this extra idle timer check outside of the flushing code?
Because my reviewers asked me to.  Feel free to file a new bug if you think that should change.
Depends on: 482069
Blocks: 407782
You need to log in before you can comment on or make changes to this bug.