Last Comment Bug 708151 - Handling of deleted records
: Handling of deleted records
Status: RESOLVED FIXED
:
Product: Firefox for Android
Classification: Client Software
Component: General (show other bugs)
: unspecified
: All All
: P1 normal (vote)
: ---
Assigned To: Lucas Rocha (:lucasr)
:
:
Mentors:
Depends on: 708331
Blocks: 695463 704490
  Show dependency treegraph
 
Reported: 2011-12-06 17:49 PST by Richard Newman [:rnewman]
Modified: 2012-01-09 11:38 PST (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
fixed
11+


Attachments
(1/3) Add _id column to images and implement missing operations (21.01 KB, patch)
2011-12-09 08:55 PST, Lucas Rocha (:lucasr)
blassey.bugs: review+
rnewman: feedback+
Details | Diff | Splinter Review
(2/3) Handle deleted records in a sync-friendly way (20.74 KB, patch)
2011-12-09 09:04 PST, Lucas Rocha (:lucasr)
blassey.bugs: review+
rnewman: feedback+
Details | Diff | Splinter Review
(3/3) Fix switch indentation in BrowserProvider (3.41 KB, patch)
2011-12-09 09:05 PST, Lucas Rocha (:lucasr)
blassey.bugs: review+
Details | Diff | Splinter Review
Change GUID generation to be more compliant with Sync (5.93 KB, patch)
2011-12-12 06:09 PST, Lucas Rocha (:lucasr)
rnewman: review+
Details | Diff | Splinter Review
Add indexes where needed in local bookmarks/history DB (3.80 KB, patch)
2011-12-12 09:01 PST, Lucas Rocha (:lucasr)
rnewman: review+
Details | Diff | Splinter Review

Description Richard Newman [:rnewman] 2011-12-06 17:49:11 PST
As far as I know, when Fennec deletes a record from its local DB it just disappears. If we perform a timestamp query, we don't get any indication of deletions.

In order to solve this, we need one of the following:

* Fennec to clear, rather than delete, the bookmark record (except for its modification time). An empty record can be counted as deleted. We can prune these after a successful sync.
* Fennec to track deletions in a separate table. We can prune deletions after some period of time, or a successful sync.
* Sync to track existing records, allowing it to compute the records that have been deleted since the last sync by "diffing".

The same applies to "clear history" -- if this just wipes the DB, Sync won't know to clear the server history. We need to know which records have been deleted.
Comment 1 Lucas Rocha (:lucasr) 2011-12-07 02:34:01 PST
Android handles that by marking records as deleted (instead of actually deleting them) when the content provider is being used by the app. When the content provider is being used by the sync process, it actually deletes the records. The content provider knows if the sync process is the caller by checking the presence of a query argument.

What I don't like about this approach is that it relies on sync process to prune deleted records. What if the user is not using sync at all? This means we'd need to know if sync is setup for the given profile in order to know if we should we should delete or mark as deleted accordingly. What do you think?
Comment 2 Richard Newman [:rnewman] 2011-12-07 10:33:48 PST
(In reply to Lucas Rocha (:lucasr) from comment #1)
> Android handles that by marking records as deleted (instead of actually
> deleting them) when the content provider is being used by the app. When the
> content provider is being used by the sync process, it actually deletes the
> records. The content provider knows if the sync process is the caller by
> checking the presence of a query argument.

Does this mean you already support doing this?

> What I don't like about this approach is that it relies on sync process to
> prune deleted records. What if the user is not using sync at all? This means
> we'd need to know if sync is setup for the given profile in order to know if
> we should we should delete or mark as deleted accordingly. What do you think?

I suggest automated pruning after a time threshold, and active pruning from Sync for improved efficiency. Users who aren't using Sync will have deleted records stick around for, say, a month. If they notice bloat, they can clear app data in Android.

See my comment on Bug 704490.

(You could also adapt this interval or pruning approach by asking AccountManager if the user has a Sync account. Not perfect.)

Alternatively, you could have a table of deletions (guid, androidID, timestamp), and we can query both tables… but you've still got to store the data *somehow*.
Comment 3 Lucas Rocha (:lucasr) 2011-12-09 08:55:44 PST
Created attachment 580436 [details] [diff] [review]
(1/3) Add _id column to images and implement missing operations
Comment 4 Lucas Rocha (:lucasr) 2011-12-09 09:04:47 PST
Created attachment 580439 [details] [diff] [review]
(2/3) Handle deleted records in a sync-friendly way

- This patch adds a mechanism to track deleted records in bookmarks, history, and images table.

- In order to see deleted records in queries, you'll have to use the show_deleted query argument in the CP uri. It doesn't return delete records by default.

- To make the CP actually delete the records on "delete" command, use the "sync" query argument in the CP uri (sync should be using this argument for all interactions with the CP).

- I added a housekeeping mechanism to slowly purge old deleted records from the local DB. The cleanup routine (see cleanupSomeOldDeletedRecords() method) is triggered every time a records is marked as deleted in a table. It removes only 5 old deleted records at a time to avoid overloading the system with one huge deletion at once. This means old records will only gradually disappear from DB. The cleanup routine removes deleted records that are 20 days old. I'm still not entirely happy with how/when we trigger the cleanup routine but given that it's removing only 5 items at a time, it shouldn't impact performance. Let me know if you have better ideas.
Comment 5 Lucas Rocha (:lucasr) 2011-12-09 09:05:24 PST
Created attachment 580440 [details] [diff] [review]
(3/3) Fix switch indentation in BrowserProvider
Comment 6 Richard Newman [:rnewman] 2011-12-09 09:40:00 PST
Comment on attachment 580436 [details] [diff] [review]
(1/3) Add _id column to images and implement missing operations

Review of attachment 580436 [details] [diff] [review]:
-----------------------------------------------------------------

f+, with notes about GUID generation and indices…

::: mobile/android/base/db/BrowserProvider.java
@@ +259,5 @@
>                      Images.THUMBNAIL + " BLOB," +
>                      Images.DATE_CREATED + " INTEGER," +
>                      Images.DATE_MODIFIED + " INTEGER," +
>                      Images.GUID + " TEXT" +
>                      ");");

You probably want to create an index on date modified, guid, URL, and favicon URL. Otherwise your favicon lookups for pages are going to be linear scans, and Sync is going to thrash the DB when we do searches.

@@ +973,2 @@
>              // Generate GUID for new image
>              values.put(Images.GUID, UUID.randomUUID().toString());

I think everyone would be happier if we used a 12-char UUID instead; Sync generates 12-char GUIDs with 72 bits of entropy that are URL-safe. For a table full of favicons, that's a non-trivial drop in space, and it saves us URL-encoding GUIDs if you follow this format.

Here's how we do it:

https://github.com/mozilla-services/android-sync/blob/develop/src/main/java/org/mozilla/android/sync/Utils.java#L39

The same is true for everywhere else you generate GUIDs, btw.
Comment 7 Richard Newman [:rnewman] 2011-12-09 09:55:09 PST
Comment on attachment 580439 [details] [diff] [review]
(2/3) Handle deleted records in a sync-friendly way

Review of attachment 580439 [details] [diff] [review]:
-----------------------------------------------------------------

Hurrah!

::: mobile/android/base/db/BrowserProvider.java
@@ +75,5 @@
> +    // Maximum age of deleted records to be cleaned up (20 days in ms)
> +    static final long MAX_AGE_OF_DELETED_RECORDS = 86400000 * 20;
> +
> +    // Number of records marked as deleted to be removed
> +    static final long REMOVE_N_DELETED_RECORDS = 5;

DELETED_RECORDS_PURGE_LIMIT? The query param that this ends up as is PARAM_LIMIT...

@@ +374,5 @@
>  
> +    private void cleanupSomeDeletedRecords(Uri fromUri, Uri targetUri, String tableName) {
> +        // we cleanup records marked as deleted that are older than a
> +        // predefined max age. It's important not be too greedy here and
> +        // remove only a few old deleted records at a time.

Add a 'maxDeletions' parameter to this method, instead of using the const directly? And perhaps make this protected or public? That way we have the ability to suggest a prune without having to delete a record…

(We could do it with plain ol' SQL, of course.)

@@ +385,5 @@
> +        Uri uriWithArgs = targetUri.buildUpon()
> +                .appendQueryParameter(BrowserContract.PARAM_PROFILE, profile)
> +                .appendQueryParameter(BrowserContract.PARAM_LIMIT, String.valueOf(REMOVE_N_DELETED_RECORDS))
> +                .appendQueryParameter(BrowserContract.PARAM_SHOW_DELETED, String.valueOf(1))
> +                .appendQueryParameter(BrowserContract.PARAM_IS_SYNC, String.valueOf(1))

Perhaps just inline "1" here instead of String.valueOf?

@@ +1045,5 @@
>          long now = System.currentTimeMillis();
>  
>          // Thumbnails update on every page load. We don't want to flood
>          // sync with meaningless last modified date. Only update modified
>          // date when favicons bits change.

Thank you thank you thank you <3
Comment 8 Richard Newman [:rnewman] 2011-12-09 10:17:05 PST
(Downside to clicking the Splinter link directly: missing questions and comments here!)

(In reply to Lucas Rocha (:lucasr) from comment #4)
> I'm still not entirely happy with how/when we trigger
> the cleanup routine but given that it's removing only 5 items at a time, it
> shouldn't impact performance. Let me know if you have better ideas.

I think hooking into delete is actually fairly reasonable. I'd be inclined to also hook into your history expiry mechanism here: that's essentially an event which shows that the user has been using their browser a bunch, and you're already doing a truncate on the history table, so seems like a good time to do other pruning.

(Places has a whole maintenance infrastructure for this kind of thing, fwiw, if you want to crib.)

And of course we can use one of Android's recommended background data handling mechanisms -- a SyncAdapter -- to help out here when Sync is enabled. We can clean up at the end of every successful sync, if the pruning API is public or if you're OK with us essentially running the same query directly, perhaps with a much bigger limit.
Comment 9 Richard Newman [:rnewman] 2011-12-09 22:25:09 PST
Putting this in the right place. I think lucasr is going to have this done by Monday.
Comment 10 Brad Lassey [:blassey] (use needinfo?) 2011-12-09 23:13:03 PST
Comment on attachment 580439 [details] [diff] [review]
(2/3) Handle deleted records in a sync-friendly way

Review of attachment 580439 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/db/BrowserProvider.java
@@ +374,5 @@
>  
> +    private void cleanupSomeDeletedRecords(Uri fromUri, Uri targetUri, String tableName) {
> +        // we cleanup records marked as deleted that are older than a
> +        // predefined max age. It's important not be too greedy here and
> +        // remove only a few old deleted records at a time.

what rnewman said
Comment 11 Brad Lassey [:blassey] (use needinfo?) 2011-12-09 23:15:07 PST
Comment on attachment 580440 [details] [diff] [review]
(3/3) Fix switch indentation in BrowserProvider

Review of attachment 580440 [details] [diff] [review]:
-----------------------------------------------------------------

please attach a patch with whitespace changes ignored (just for review purposes)
Comment 12 Lucas Rocha (:lucasr) 2011-12-12 05:02:37 PST
(In reply to Richard Newman [:rnewman] from comment #6)
> Comment on attachment 580436 [details] [diff] [review]
> (1/3) Add _id column to images and implement missing operations
> 
> Review of attachment 580436 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> f+, with notes about GUID generation and indices…
> 
> ::: mobile/android/base/db/BrowserProvider.java
> @@ +259,5 @@
> >                      Images.THUMBNAIL + " BLOB," +
> >                      Images.DATE_CREATED + " INTEGER," +
> >                      Images.DATE_MODIFIED + " INTEGER," +
> >                      Images.GUID + " TEXT" +
> >                      ");");
> 
> You probably want to create an index on date modified, guid, URL, and
> favicon URL. Otherwise your favicon lookups for pages are going to be linear
> scans, and Sync is going to thrash the DB when we do searches.

I'd prefer to add the missing indexes on a separate patch as I'll have to add indexes for the other two tables as well. I'll attach the patch here in a minute.

> @@ +973,2 @@
> >              // Generate GUID for new image
> >              values.put(Images.GUID, UUID.randomUUID().toString());
> 
> I think everyone would be happier if we used a 12-char UUID instead; Sync
> generates 12-char GUIDs with 72 bits of entropy that are URL-safe. For a
> table full of favicons, that's a non-trivial drop in space, and it saves us
> URL-encoding GUIDs if you follow this format.
> 
> Here's how we do it:
> 
> https://github.com/mozilla-services/android-sync/blob/develop/src/main/java/
> org/mozilla/android/sync/Utils.java#L39
> 
> The same is true for everywhere else you generate GUIDs, btw.

Ok, I'll submit a separate patch here changing the GUID generation code everywhere.
Comment 13 Lucas Rocha (:lucasr) 2011-12-12 05:23:12 PST
(In reply to Richard Newman [:rnewman] from comment #7)
> Comment on attachment 580439 [details] [diff] [review]
> (2/3) Handle deleted records in a sync-friendly way
> 
> Review of attachment 580439 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Hurrah!
> 
> ::: mobile/android/base/db/BrowserProvider.java
> @@ +75,5 @@
> > +    // Maximum age of deleted records to be cleaned up (20 days in ms)
> > +    static final long MAX_AGE_OF_DELETED_RECORDS = 86400000 * 20;
> > +
> > +    // Number of records marked as deleted to be removed
> > +    static final long REMOVE_N_DELETED_RECORDS = 5;
> 
> DELETED_RECORDS_PURGE_LIMIT? The query param that this ends up as is
> PARAM_LIMIT...

Renamed.
 
> @@ +374,5 @@
> >  
> > +    private void cleanupSomeDeletedRecords(Uri fromUri, Uri targetUri, String tableName) {
> > +        // we cleanup records marked as deleted that are older than a
> > +        // predefined max age. It's important not be too greedy here and
> > +        // remove only a few old deleted records at a time.
> 
> Add a 'maxDeletions' parameter to this method, instead of using the const
> directly? And perhaps make this protected or public? That way we have the
> ability to suggest a prune without having to delete a record…
> 
> (We could do it with plain ol' SQL, of course.)

Hmm, I'd prefer to do that only if/when we need this method somewhere else.

> @@ +385,5 @@
> > +        Uri uriWithArgs = targetUri.buildUpon()
> > +                .appendQueryParameter(BrowserContract.PARAM_PROFILE, profile)
> > +                .appendQueryParameter(BrowserContract.PARAM_LIMIT, String.valueOf(REMOVE_N_DELETED_RECORDS))
> > +                .appendQueryParameter(BrowserContract.PARAM_SHOW_DELETED, String.valueOf(1))
> > +                .appendQueryParameter(BrowserContract.PARAM_IS_SYNC, String.valueOf(1))
> 
> Perhaps just inline "1" here instead of String.valueOf?

Done.

> @@ +1045,5 @@
> >          long now = System.currentTimeMillis();
> >  
> >          // Thumbnails update on every page load. We don't want to flood
> >          // sync with meaningless last modified date. Only update modified
> >          // date when favicons bits change.
> 
> Thank you thank you thank you <3

:-)
Comment 14 Lucas Rocha (:lucasr) 2011-12-12 05:26:51 PST
(In reply to Richard Newman [:rnewman] from comment #8)
> (Downside to clicking the Splinter link directly: missing questions and
> comments here!)
> 
> (In reply to Lucas Rocha (:lucasr) from comment #4)
> > I'm still not entirely happy with how/when we trigger
> > the cleanup routine but given that it's removing only 5 items at a time, it
> > shouldn't impact performance. Let me know if you have better ideas.
> 
> I think hooking into delete is actually fairly reasonable. I'd be inclined
> to also hook into your history expiry mechanism here: that's essentially an
> event which shows that the user has been using their browser a bunch, and
> you're already doing a truncate on the history table, so seems like a good
> time to do other pruning.

Given that the history truncation will pretty much mark old history entries as deleted, the cleanup routine will be triggered in this case as well.

> (Places has a whole maintenance infrastructure for this kind of thing, fwiw,
> if you want to crib.)
> 
> And of course we can use one of Android's recommended background data
> handling mechanisms -- a SyncAdapter -- to help out here when Sync is
> enabled. We can clean up at the end of every successful sync, if the pruning
> API is public or if you're OK with us essentially running the same query
> directly, perhaps with a much bigger limit.

You can simply call delete in CP using with PARAM_IS_SYNC query argument. I'm expecting sync to cleanup the deleted records that have already been handled properly.
Comment 15 Lucas Rocha (:lucasr) 2011-12-12 06:09:20 PST
Created attachment 580893 [details] [diff] [review]
Change GUID generation to be more compliant with Sync
Comment 16 Lucas Rocha (:lucasr) 2011-12-12 09:01:06 PST
Created attachment 580933 [details] [diff] [review]
Add indexes where needed in local bookmarks/history DB
Comment 17 Richard Newman [:rnewman] 2011-12-12 13:40:39 PST
Comment on attachment 580893 [details] [diff] [review]
Change GUID generation to be more compliant with Sync

Review of attachment 580893 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/db/BrowserProvider.java
@@ +201,5 @@
> +    private static byte[] generateRandomBytes(int length) {
> +        byte[] bytes = new byte[length];
> +
> +        Random random = new Random(System.nanoTime());
> +        random.nextBytes(bytes);

I guess this is fine, but someone who knows the best practice for generating random numbers in Java should probably opine whether generating a new Random is best, or whether a static or per-thread Random would be better.

Can wait for a follow-up or during systemic review, though.

This isn't used for anything high-stress, so SecureRandom is likely to be overkill.
Comment 18 Richard Newman [:rnewman] 2011-12-12 13:45:38 PST
Comment on attachment 580933 [details] [diff] [review]
Add indexes where needed in local bookmarks/history DB

Review of attachment 580933 [details] [diff] [review]:
-----------------------------------------------------------------

r=blassey? :)

::: mobile/android/base/db/BrowserProvider.java
@@ +251,5 @@
>                      ");");
>  
> +            db.execSQL("CREATE INDEX bookmarks_url_index ON " + TABLE_BOOKMARKS + "("
> +                    + Bookmarks.URL + ")");
> +            db.execSQL("CREATE INDEX bookmarks_guid_index ON " + TABLE_BOOKMARKS + "("

This can/should be CREATE UNIQUE INDEX.

@@ +271,5 @@
>                      ");");
>  
> +            db.execSQL("CREATE INDEX history_url_index ON " + TABLE_HISTORY + "("
> +                    + History.URL + ")");
> +            db.execSQL("CREATE INDEX history_guid_index ON " + TABLE_HISTORY + "("

Same.

@@ +293,5 @@
>                      ");");
>  
>              db.execSQL("CREATE INDEX images_url_index ON " + TABLE_IMAGES + "("
>                      + Images.URL + ")");
> +            db.execSQL("CREATE INDEX images_guid_index ON " + TABLE_IMAGES + "("

Same.
Comment 19 Lucas Rocha (:lucasr) 2011-12-13 03:45:40 PST
(In reply to Richard Newman [:rnewman] from comment #18)
> Comment on attachment 580933 [details] [diff] [review]
> Add indexes where needed in local bookmarks/history DB
> 
> Review of attachment 580933 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> r=blassey? :)
> 
> ::: mobile/android/base/db/BrowserProvider.java
> @@ +251,5 @@
> >                      ");");
> >  
> > +            db.execSQL("CREATE INDEX bookmarks_url_index ON " + TABLE_BOOKMARKS + "("
> > +                    + Bookmarks.URL + ")");
> > +            db.execSQL("CREATE INDEX bookmarks_guid_index ON " + TABLE_BOOKMARKS + "("
> 
> This can/should be CREATE UNIQUE INDEX.

Fixed.

> @@ +271,5 @@
> >                      ");");
> >  
> > +            db.execSQL("CREATE INDEX history_url_index ON " + TABLE_HISTORY + "("
> > +                    + History.URL + ")");
> > +            db.execSQL("CREATE INDEX history_guid_index ON " + TABLE_HISTORY + "("
> 
> Same.

Fixed.

> @@ +293,5 @@
> >                      ");");
> >  
> >              db.execSQL("CREATE INDEX images_url_index ON " + TABLE_IMAGES + "("
> >                      + Images.URL + ")");
> > +            db.execSQL("CREATE INDEX images_guid_index ON " + TABLE_IMAGES + "("
> 
> Same.

Fixed.

Note You need to log in before you can comment on or make changes to this bug.