Closed Bug 741590 Opened 12 years ago Closed 12 years ago

Multiple history entries with the same URL

Categories

(Firefox for Android Graveyard :: General, defect)

ARM
Android
defect
Not set
normal

Tracking

(firefox15 verified, blocking-fennec1.0 +)

VERIFIED FIXED
Firefox 14
Tracking Status
firefox15 --- verified
blocking-fennec1.0 --- +

People

(Reporter: Margaret, Assigned: Margaret)

References

Details

Attachments

(3 files, 4 obsolete files)

Madhava and I both noticed duplicate entries in our "Top Sites", and I found that mine was caused by multiple history entries with the same URL. I'm looking into what's causing this, but even if we fix this problem, we're probably going to need to do some sort of database cleanup to merge the duplicates.

On a related note, a user can create multiple bookmarks with the same URL, and those can also appear to look like duplicate entries. Maybe the Top Sites query should just combine these, but this should be a separate bug.
I was able to reproduce this once with a new profile, but I added some logging and haven't been able to reproduce it since. Is it possible there could be some sort of race condition causing us to create a duplicate history entry?
Assignee: margaret.leibovic → nobody
(In reply to Margaret Leibovic [:margaret] from comment #0)

> On a related note, a user can create multiple bookmarks with the same URL,
> and those can also appear to look like duplicate entries. Maybe the Top
> Sites query should just combine these, but this should be a separate bug.

I filed bug 741630 about this.
blocking-fennec1.0: ? → +
Assignee: nobody → margaret.leibovic
I found that I was able to reliably reproduce this issue by trying to visit a site that's in my sync history immediately after setting up sync (while it's running in the background).

Following a conversation with rnewman on IRC, this patch attempts to solve a race condition in updateVisitedHistory by creating a updateOrInsertHistory method in BrowserContentProvider that gets executed in a transaction. This patch is missing some functionality, but I wanted to just get far enough to test if this actually fixes the problem.

Maybe I did something wrong, but this actually isn't fixing the problem, and the fact that I can reproduce this so reliably makes me think that it might not be caused by a different (and larger) concurrent database access issue. rnewman, do you have any idea what could be going on here?
Attachment #612022 - Flags: feedback?(rnewman)
Alas, I do: Bug 742120.

It's still a good idea to shift database operations in Fennec to be closer to the metal and inside a transaction, but I'll work on solving Bug 742120 before I get back to this bug :D
Migration for bad records:

* Find all non-unit sets of records with the same URI.
* For each set:
** Find most recent of each set.
** Accrete sum of visits from other records to most recent record. (Ideally this would update Sync's tables, too, but alas.)
** Mark other records as deleted.
** Bump modified time on all records in set.

Doing this entire operation in a single SQL query is left as an exercise for the reader :D
Comment on attachment 612022 [details] [diff] [review]
create updateOrInsertHistory method

Review of attachment 612022 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/db/BrowserProvider.java.in
@@ +1282,5 @@
>                          new String[] { Long.toString(ContentUris.parseId(uri)) });
>                  // fall through
>              case HISTORY: {
>                  debug("Updating history: " + uri);
> +                updated = updateOrInsertHistory(uri, values, selection, selectionArgs);

I would suggest checking a query param here (see the use of IS_SYNC) to switch between update and update-or-insert.

Some callers won't want implicit creation; they'll expect the UPDATE to fail.

@@ +1625,5 @@
> +
> +    int updateOrInsertHistory(Uri uri, ContentValues values, String selection,
> +            String[] selectionArgs) {
> +        return updateHistory(uri, values, selection, selectionArgs,
> +                true /* insert if needed */);

I'd suggest turning this around a bit: don't modify updateHistory(), and don't introduce updateExistingHistory() -- just do this in updateOrInsertHistory:

  int updated = updateHistory(...);
  if (updated > 0)
      return updated;

  // Doesn't already exist. Fix that.
  insertHistory(uri, values); 
  return 0;

Note that there's nothing transactional happening here yet.

For that you'll need to wrap the update()/insert() pair in:

http://developer.android.com/reference/android/database/sqlite/SQLiteDatabase.html#beginTransaction%28%29

::: mobile/android/base/db/LocalBrowserDB.java
@@ +229,4 @@
>  
> +        values.put(History.URL, uri);
> +        // TODO: actually update the visit count correctly
> +        values.put(History.VISITS, 1);

You probably want an "increment" option in the CP interface, such that it will *add* this visit count to the existing count. That way you don't have to query to figure out this before you update (which is a race of its own, albeit a less important one).

One way to do that is to have a separate CP URI -- mUpdateHistoryUriWithProfile, say -- that the CP understands to be "I want to merge the input values with whatever's already in the DB, or create a new row if needed".

@@ +242,2 @@
>  
> +        // TODO: re-enable this for cases where we insert a new history entry

You can define your CP URI as returning 0 if it didn't update any rows (i.e., it added one). Then check the return value of cr.update() to decide whether to truncate.
Attachment #612022 - Flags: feedback?(rnewman) → feedback+
Blocks: 742166
Comment on attachment 612022 [details] [diff] [review]
create updateOrInsertHistory method

Review of attachment 612022 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/db/LocalBrowserDB.java
@@ +223,5 @@
>                  cursor.close();
>          }
>      }
>  
>      public void updateVisitedHistory(ContentResolver cr, String uri) {

(Drive-by) If this patch is about fixing races in this code: wouldn't it be much simpler to just make updateVisitedHistory() synchronized? You'd avoid any interleaving inserts/updates from multiple callers.
(In reply to Richard Newman [:rnewman] from comment #6)
> @@ +1625,5 @@
> > +
> > +    int updateOrInsertHistory(Uri uri, ContentValues values, String selection,
> > +            String[] selectionArgs) {
> > +        return updateHistory(uri, values, selection, selectionArgs,
> > +                true /* insert if needed */);

> Note that there's nothing transactional happening here yet.
> 
> For that you'll need to wrap the update()/insert() pair in:
> 
> http://developer.android.com/reference/android/database/sqlite/
> SQLiteDatabase.html#beginTransaction%28%29

The call to updateOrInsertHistory is already inside a transaction (the transaction is created inside of update and updateOrInsertHistory is called by way of updateInTransaction).

(In reply to Lucas Rocha (:lucasr) from comment #7)
> ::: mobile/android/base/db/LocalBrowserDB.java
> @@ +223,5 @@
> >                  cursor.close();
> >          }
> >      }
> >  
> >      public void updateVisitedHistory(ContentResolver cr, String uri) {
> 
> (Drive-by) If this patch is about fixing races in this code: wouldn't it be
> much simpler to just make updateVisitedHistory() synchronized? You'd avoid
> any interleaving inserts/updates from multiple callers.

This does seem much simpler. rnewman, what do you have to say about that?
(In reply to Margaret Leibovic [:margaret] from comment #8)

> The call to updateOrInsertHistory is already inside a transaction (the
> transaction is created inside of update and updateOrInsertHistory is called
> by way of updateInTransaction).

Ah, great. I missed that. Thanks!

> This does seem much simpler. rnewman, what do you have to say about that?

Yeah, would solve the issue if Fennec were racing with itself.

But synchronizing a method in LocalBrowserDB isn't going to help with a race that occurs between the CP and Fennec. (Unless the CP calls into LocalBrowserDB?)
This patch addresses the comments. I added two query parameters for the functionally distinct parts of what I want from the new URI, but maybe they could be combined? The functionality seems correct, but I want to continue to look at it to make sure I didn't miss anything.
Attachment #612022 - Attachment is obsolete: true
Attachment #612678 - Flags: feedback?(rnewman)
Attachment #612678 - Flags: feedback?(lucasr.at.mozilla)
Attached patch test for updateOrInsertHistory (obsolete) — Splinter Review
I started working on a test for this. Testcase ideas welcome!
Attachment #612679 - Flags: feedback?(rnewman)
Attachment #612679 - Flags: feedback?(lucasr.at.mozilla)
Comment on attachment 612678 [details] [diff] [review]
create updateOrInsertHistory method (v2)

Review of attachment 612678 [details] [diff] [review]:
-----------------------------------------------------------------

Looking good to me.

::: mobile/android/base/db/BrowserContract.java.in
@@ +60,5 @@
>      public static final String PARAM_IS_SYNC = "sync";
>      public static final String PARAM_SHOW_DELETED = "show_deleted";
>      public static final String PARAM_IS_TEST = "test";
> +    public static final String PARAM_INSERT_IF_NEEDED = "insert_if_needed";
> +    public static final String PARAM_INCREMENT_VISITS = "increment_visits";

On the one hand, combining these would slightly simplify the code.

On the other hand, Sync could make use of this code path: "here's what we think this should be; make the database match it", rather than having to compute a visit count delta.

Update-or-insert is a sufficiently generic operation that I'd be glad to see it as a fundamental, separate from the specialized incrementing logic, so thumbs up from me.

::: mobile/android/base/db/BrowserProvider.java.in
@@ +1355,5 @@
>              case HISTORY: {
>                  debug("Updating history: " + uri);
> +                // Check query param to decide between updateOrInsert or update
> +                String insertIfNeeded = uri.getQueryParameter(BrowserContract.PARAM_INSERT_IF_NEEDED);
> +                if (!TextUtils.isEmpty(insertIfNeeded))

protected boolean shouldUpdateOrInsert(Uri uri) {
  String insertIfNeeded = uri.getQueryParameter(BrowserContract.PARAM_INSERT_IF_NEEDED);
  return !TextUtils.isEmpty(insertIfNeeded);
}

Modify as needed to avoid

  ?insertIfNeeded=false

misbehaving :)

@@ +1706,5 @@
> +            return updated;
> +
> +        // Insert a new entry if necessary
> +        values.put(History.VISITS, 1);
> +        values.put(History.TITLE, values.getAsString(History.URL));

These should be conditional:

if (!values.containsKey(History.VISITS)) {
  ...
}

etc.

@@ +1723,5 @@
>  
> +        final String[] historyProjection = new String[] {
> +            History._ID, // 0
> +            History.URL, // 1
> +            History.VISITS // 2

Align comments for easier scanning, plz.

@@ +1728,4 @@
>          };
>  
>          Cursor cursor = db.query(TABLE_HISTORY, historyProjection, selection,
>                  selectionArgs, null, null, null);

I have a sneaking suspicion that this method does too much work...

@@ +1747,5 @@
>                  trace("Updating history entry with ID: " + id);
>  
> +                String incrementVisits = uri.getQueryParameter(BrowserContract.PARAM_INCREMENT_VISITS);
> +                if (!TextUtils.isEmpty(incrementVisits))
> +                    values.put(History.VISITS, cursor.getLong(2) + 1);

Not + 1:

  long existing = cursor.getLong(2);
  Long additional = values.getAsLong(History.VISITS);
  values.put(History.VISITS, existing + ((additional == null) ? additional.longValue() : 1));

Test idea: you want to increment by 3 visits...
Attachment #612678 - Flags: feedback?(rnewman) → feedback+
Comment on attachment 612679 [details] [diff] [review]
test for updateOrInsertHistory

Review of attachment 612679 [details] [diff] [review]:
-----------------------------------------------------------------

Looks like the right idea. Tests for updating with multiple visits, and the things that should be guarded by containsKey, seem like wise additions.
Attachment #612679 - Flags: feedback?(rnewman) → feedback+
(In reply to Richard Newman [:rnewman] from comment #12)

> ::: mobile/android/base/db/BrowserProvider.java.in
> @@ +1355,5 @@
> >              case HISTORY: {
> >                  debug("Updating history: " + uri);
> > +                // Check query param to decide between updateOrInsert or update
> > +                String insertIfNeeded = uri.getQueryParameter(BrowserContract.PARAM_INSERT_IF_NEEDED);
> > +                if (!TextUtils.isEmpty(insertIfNeeded))
> 
> protected boolean shouldUpdateOrInsert(Uri uri) {
>   String insertIfNeeded =
> uri.getQueryParameter(BrowserContract.PARAM_INSERT_IF_NEEDED);
>   return !TextUtils.isEmpty(insertIfNeeded);
> }
> 
> Modify as needed to avoid
> 
>   ?insertIfNeeded=false
> 
> misbehaving :)

This is currently a problem with some of our other query parameters, so we should file a bug to fix that as well:
http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/db/BrowserProvider.java.in#1029. Also, looking at where these are set, it seems we use both "1"/"0" and "true"/"false" :/
Addressed comments.

(In reply to Richard Newman [:rnewman] from comment #12)

> @@ +1728,4 @@
> >          };
> >  
> >          Cursor cursor = db.query(TABLE_HISTORY, historyProjection, selection,
> >                  selectionArgs, null, null, null);
> 
> I have a sneaking suspicion that this method does too much work...

I decided this was out of scope for this bug. Feel free to file if you think there are improvements to be made!
Attachment #612678 - Attachment is obsolete: true
Attachment #612678 - Flags: feedback?(lucasr.at.mozilla)
Attachment #613016 - Flags: review?(rnewman)
Added suggested testcases. Looking at this patch after the fact, I'm thinking maybe I should break up this giant test, but it gets the job done for now.
Attachment #612679 - Attachment is obsolete: true
Attachment #612679 - Flags: feedback?(lucasr.at.mozilla)
Attachment #613018 - Flags: review?(rnewman)
I need to write a test for this. I'm basically just asking for feedback on these SQL statements:

CREATE TEMP TABLE duped_urls AS
SELECT url, SUM(visits) AS total, MAX(modified) AS latest, MAX(_id) AS winner
FROM history
GROUP BY url
HAVING count(url) > 1;

UPDATE history
SET visits = (SELECT total FROM duped_urls WHERE duped_urls.url=history.url),
    modified = (SELECT latest FROM duped_urls WHERE duped_urls.url=history.url),
    deleted = (_id <> (SELECT winner FROM duped_urls WHERE duped_urls.url=history.url))
WHERE url IN (SELECT url FROM duped_urls);

DROP TABLE duped_urls;
Attachment #613020 - Flags: feedback?(rnewman)
Comment on attachment 613020 [details] [diff] [review]
migration to combine duplicate history entries

Review of attachment 613020 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good to me. Little things inline.

::: mobile/android/base/db/BrowserProvider.java.in
@@ +806,5 @@
> +            db.execSQL("CREATE TEMP TABLE " + TABLE_DUPES + " AS" +
> +                      " SELECT " + History.URL + ", " +
> +                                  "SUM(" + History.VISITS + ") AS " + TOTAL + ", " +
> +                                  "MAX(" + History.DATE_MODIFIED + ") AS " + LATEST + ", " +
> +                                  "MAX(" + History._ID + ") AS " + WINNER +

I think we really the oldest ID with the youngest modified time, or some other more sophisticated measure. But computing that is a real bastard without the FIRST() aggregate function, so this is fine :)

Just calling that out for bug posterity.

@@ +816,5 @@
> +                                       qualifyColumn(TABLE_HISTORY, History.URL);
> +
> +            db.execSQL("UPDATE " + TABLE_HISTORY +
> +                      " SET " + History.VISITS + " = " +
> +                                    "(SELECT " + TOTAL + " FROM " + TABLE_DUPES + " WHERE " + whereClause "), " +

These FROM..WHERE.. chunks are the same for each. Lift out to `whereClause`, named `fromClause`?

@@ +821,5 @@
> +                                History.DATE_MODIFIED + " = " +
> +                                    "(SELECT " + LATEST + " FROM " + TABLE_DUPES + " WHERE " + whereClause "), " +
> +                                History.IS_DELETED + " = (" + History._ID + " <> " +
> +                                    "(SELECT " + WINNER + " FROM " + TABLE_DUPES + " WHERE " + whereClause + ")" +
> +                      " WHERE " + History.URL + " IN (SELECT " + History.URL + " FROM " + TABLE_DUPES ")");

How long does this whole shebang take for, say, 20 duplicates in a 20,000 record history?

The reason I ask is that the WHERE clause (and the subqueries for assignment) are implicit joins between tables in two databases (TEMP tables live in the temp DB).

Given this extensive lookup by URL, you might want to put a unique index on that column:

  db.execSQL("CREATE UNIQUE INDEX " + TABLE_DUPES + "_url_index ON " +
             TABLE_DUPES + " (" + History.URL + ")");

That will also make readers less uncomfortable about the un-limited inner SELECTs.

Timing part of the test for this (with enough synthetic data) will help make this determination.
Attachment #613020 - Flags: feedback?(rnewman) → feedback+
Comment on attachment 613016 [details] [diff] [review]
create updateOrInsertHistory method (v3)

Perfecto!
Attachment #613016 - Flags: review?(rnewman) → review+
Comment on attachment 613018 [details] [diff] [review]
test for updateOrInsertHistory (with moar tests!)

Review of attachment 613018 [details] [diff] [review]:
-----------------------------------------------------------------

This looks good. But boy those Robotium tests are painful to write, aren't they?

::: mobile/android/base/tests/testBrowserProvider.java.in
@@ +924,5 @@
> +            Uri updateHistoryUriWithProfile = mHistoryUri.buildUpon().
> +                appendQueryParameter("insert_if_needed", "true").
> +                appendQueryParameter("increment_visits", "true").build();
> +
> +            // Update a non-existant history entry, without specifying visits or title

s/existant/existent

@@ +930,5 @@
> +            values.put(mHistoryUrlCol, TEST_URL_1);
> +
> +            int updated = mProvider.update(updateHistoryUriWithProfile, values,
> +                                           mHistoryUrlCol + " = ?",
> +                                           new String[] { values.getAsString(mHistoryUrlCol) });

Can't you just use TEST_URL_1 here instead of values.getAsString()?
Attachment #613018 - Flags: review?(rnewman) → review+
I tested the raw SQL manually (inspecting the DB) to make sure it does the correct thing, and I exercised this code path by going through the upgrade on my device to make sure there were no problems creating these concatenated statements.

We discussed this on IRC, but for bug posterity, I tried making a ContentProviderTest to exercise this upgrade code, but trying to find a way to trigger the upgrade turned into a nightmare, so we decided to just settle for manual testing.

mfinkle is testing the performance of the raw SQL on the terrible DB we've been using for performance testing, and hopefully he'll report back that this doesn't take too long.
Attachment #613020 - Attachment is obsolete: true
Attachment #613441 - Flags: review?(rnewman)
(In reply to Margaret Leibovic [:margaret] from comment #21)

> mfinkle is testing the performance of the raw SQL on the terrible DB we've
> been using for performance testing, and hopefully he'll report back that
> this doesn't take too long.

Looks like we're good to go:

[09 17:07:17] <mfinkle> margaret, all of the sql went very fast
[09 17:07:18] <mfinkle> no delays
Comment on attachment 613441 [details] [diff] [review]
migration to combine duplicate history entries (v2)

Review of attachment 613441 [details] [diff] [review]:
-----------------------------------------------------------------

Shiny!
Attachment #613441 - Flags: review?(rnewman) → review+
Target Milestone: --- → Firefox 14
There are no longer duplicate entries in Top Sites on the latest Nightly build.

Closing bug as verified fixed on:

Firefox 15.0a1 (2012-05-23)
Device: Galaxy Nexus
OS: Android 4.0.2
Status: RESOLVED → VERIFIED
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: