Closed Bug 1363010 Opened 7 years ago Closed 4 years ago

Implement browsingData.removeHistory WebExtension API method on android

Categories

(WebExtensions :: Android, enhancement, P5)

enhancement

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 1625233

People

(Reporter: shatur, Assigned: shatur, Mentored)

References

Details

(Whiteboard: [triaged])

Attachments

(2 files, 1 obsolete file)

This method clears the browser's history. The doc can be found at [1].

[1]. https://developer.mozilla.org/en-US/Add-ons/WebExtensions/API/browsingData/removeHistory
Blocks: 1362118
Priority: -- → P5
Whiteboard: [triaged]
Assignee: nobody → tushar.saini1285
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attached patch removeHist.patch (obsolete) — Splinter Review
Hey!!

I was not able to push patch using mozreview. As this patch is build over previous bug 1362994 patch, I think maybe that is causing the problem.

For time being please have a look this way. :)

Thanks
Attachment #8876431 - Flags: review?(s.kaspari)
Attachment #8876431 - Flags: review?(mwein)
(In reply to Tushar Saini (:shatur) from comment #1)
> I was not able to push patch using mozreview. As this patch is build over
> previous bug 1362994 patch, I think maybe that is causing the problem.

If the patches depend on each other but you only want to push one of them then you can use: "hg push -c <changeset>" - or if there are multiple patches that you want to push (but not everything in the current bookmark) then use: "hg push -r <changeset>:<changeset>" for pushing a range.
Comment on attachment 8876431 [details] [diff] [review]
removeHist.patch

@grisha: Can you review the Android pieces and especially the code modifying the database? (I'll be on PTO starting Wednesday - so hopefully you could help shatur push this over the finish line?)
Attachment #8876431 - Flags: review?(gkruglov)
Comment on attachment 8876431 [details] [diff] [review]
removeHist.patch

Review of attachment 8876431 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/java/org/mozilla/gecko/BrowserApp.java
@@ +2037,5 @@
>                  BrowserDB.from(getProfile()).clearHistory(
> +                        getContentResolver(),
> +                        message.getBoolean("clearSearchHistory", false),
> +                        message.getLong("startTime", 0),
> +                        message.getLong("endTime", System.currentTimeMillis())

I wonder if it would be cleaner to allow no values here - or use Integer.MAX_INT. It looks like it's straight-forward to use currentTimeMillis() but there are timezones and other reasons why time values are occasionally wrong/different and then this won't catch all history. :)

::: mobile/android/components/extensions/ext-browsingData.js
@@ +57,5 @@
>            return Promise.resolve({options, dataToRemove, dataRemovalPermitted});
>          },
> +        removeHistory(options) {
> +          const startTime = options.since == null ? null : options.since;
> +          const endTime = (new Date()).getTime();

It looks like you always set the endTime to the current time in all cases (or did I miss something?) - Is endTime even needed? And RemovalOptions[1] only supports "since" as a way to define a time range (no "end")?

[1] https://developer.chrome.com/extensions/browsingData#type-RemovalOptions
Attachment #8876431 - Flags: review?(s.kaspari) → feedback+
Hey!

> I wonder if it would be cleaner to allow no values here - or use
> Integer.MAX_INT. It looks like it's straight-forward to use
> currentTimeMillis() but there are timezones and other reasons why time
> values are occasionally wrong/different and then this won't catch all
> history. :)

We were using currentTimeMillis() to save history to db [1]. So, I thought it will be nice to remove history using this as end time. Should I use different approach?

Actually for our case, we don't need an end range. We can just check if our history items have time stamp greater than since. I added this range so that in future we can have more flexibility removing history elements. If we do not want it, I can modify it for our use case only.

> It looks like you always set the endTime to the current time in all cases
> (or did I miss something?) - Is endTime even needed? And RemovalOptions
> only supports "since" as a way to define a time range (no "end")?

Actually, you're right we do not need endTime here. It will be set to current time in BrowserApp.java automatically. I will update this in next patch. :)

> If the patches depend on each other but you only want to push one of them
> then you can use: "hg push -c <changeset>" - or if there are multiple
> patches that you want to push (but not everything in the current bookmark)
> then use: "hg push -r <changeset>:<changeset>" for pushing a range.

I tried to push just one changeset using "hg push -r id review", but it gave me some error. I think the reason is, the files I am trying to update does not exist in m-c currently. I will push again to mozreview and will ping you with error output. :) 

[1]. https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/db/LocalBrowserDB.java#750

Thanks!!
(In reply to Tushar Saini (:shatur) from comment #5)
> We were using currentTimeMillis() to save history to db [1]. So, I thought
> it will be nice to remove history using this as end time. Should I use
> different approach?

Yeah. In theory that's working fine. But it's not guaranteed that:

> long x = System.currentTimeMillis();
> long y = System.currentTimeMillis();
>
> assertTrue(y > x);

There are a lot of reasons why a later call /can/ return a lower value. Some examples:

* I traveled and now I'm in a timezone with an earlier time
* The phone got a new time from the network provider and updated its local time (backwards!)
* A user manually/accidentally changed the date or time back/forward
* You are using Firefox within a DeLorean


> Actually for our case, we don't need an end range. We can just check if our
> history items have time stamp greater than since. I added this range so that
> in future we can have more flexibility removing history elements. If we do
> not want it, I can modify it for our use case only.

Okay, in this case just leave it out. Adding (unused) code for later is often hard without actually knowing the future use case. In the worst case it's never used and just adds complexity - or it actually breaks and no one notices it because it's not used. And then this isn't too complex to add later. :)


> I tried to push just one changeset using "hg push -r id review", but it gave
> me some error. I think the reason is, the files I am trying to update does
> not exist in m-c currently. I will push again to mozreview and will ping you
> with error output. :) 

Yeah, if you use "-r" then you always push the whole bookmark with all unlanded changes. But with "-c" you can just push a single commit. Or with "-r id1:id2" just a range.
You can mark the old patch as obsolete now (Click on "Details").
Attachment #8876839 - Flags: review?(s.kaspari) → review?(gkruglov)
Comment on attachment 8876839 [details]
Bug 1363010 - Adding support for deletion of history items wrt time.

https://reviewboard.mozilla.org/r/148164/#review152886
Attachment #8876839 - Flags: review+
Attachment #8876431 - Attachment is obsolete: true
Attachment #8876431 - Flags: review?(mwein)
Attachment #8876431 - Flags: review?(gkruglov)
Comment on attachment 8876839 [details]
Bug 1363010 - Adding support for deletion of history items wrt time.

https://reviewboard.mozilla.org/r/148164/#review153610

We record history in two tables: history and visits. Each visit is associated with a history item, and has a timestamp. When pages are visited, we record a visit (and insert a new history item if user _never_ visited that page before).

Seeing that we can limit how much history we're deleting, I want to understand a little better what this API is actually intended to do.

From the docs: passing in a startTime will "clear only records of web pages visited after a given time". That's somewhat vague. One way to read this is that we should "delete any information about a page that was visited since time X" - that means deleting its metadata and all of its visits, regardless of when the visits themselves were made. E.g. I could have visited a page a month ago, and today, and after a call to "removeHistory since yesterday" my month-old visit is also deleted.
Another way to read this is as "for a given history item, delete any visits made since time X. if we deleted all visits, delete the history item itself".

I'm somewhat leaning towards the former, but that seems both heavy handed and not that useful - it's also what you've currently implemented.

How do other browsers handle this API - desktop, Chrome..?

::: mobile/android/modules/Sanitizer.jsm:36
(Diff revision 1)
>  
>  this.EXPORTED_SYMBOLS = ["Sanitizer"];
>  
>  function Sanitizer() {}
>  Sanitizer.prototype = {
> -  clearItem: function (aItemName)
> +  clearItem: function (aItemName, startTime)

I feel that this whole generic `clearItem` thing is a bug waiting to happen - but it's even more of a foot gun after these changes.

You've modified an external API of this object in a way that suggests that I can do think like `Sanitizer.clearItem('bookmarks', 12134513)`, and expect something other than "everything gets deleted" to happen.

Given that history's support of `startTime` is going to be an exception to the rule for the time being, and that getting this wrong could mean accidentally deleting more than was intended, I think we should throw here if `aItemName===history` and startTime !== undefined.
Attachment #8876839 - Flags: review?(gkruglov) → review-
Andy, can you or someone from your team help interpret the intended behaviour of removeHistory webextension api? See Comment 10.
Flags: needinfo?(amckay)
Flags: needinfo?(amckay) → needinfo?(bob.silverberg)
Comment on attachment 8876839 [details]
Bug 1363010 - Adding support for deletion of history items wrt time.

https://reviewboard.mozilla.org/r/148164/#review153628

The WebExtension side looks fine at this point. Clearing review until tests are written though.

::: mobile/android/components/extensions/ext-browsingData.js:60
(Diff revision 1)
>            }
>  
>            return Promise.resolve({options, dataToRemove, dataRemovalPermitted});
>          },
> +        removeHistory(options) {
> +          const startTime = options.since == null ? null : options.since;

This is equivalent to `const startTime = options.since;`

I think this should probably be `const startTime = options.since || 0;`.
Attachment #8876839 - Flags: review?(mwein)
(In reply to :Grisha Kruglov from comment #10)
> Comment on attachment 8876839 [details]
> Bug 1363010 - Implement browsingData.removeHistory WebExtension API method
> on android.
> 
> https://reviewboard.mozilla.org/r/148164/#review153610
> 
> We record history in two tables: history and visits. Each visit is
> associated with a history item, and has a timestamp. When pages are visited,
> we record a visit (and insert a new history item if user _never_ visited
> that page before).
> 
> Seeing that we can limit how much history we're deleting, I want to
> understand a little better what this API is actually intended to do.
> 
> From the docs: passing in a startTime will "clear only records of web pages
> visited after a given time". That's somewhat vague. One way to read this is
> that we should "delete any information about a page that was visited since
> time X" - that means deleting its metadata and all of its visits, regardless
> of when the visits themselves were made. E.g. I could have visited a page a
> month ago, and today, and after a call to "removeHistory since yesterday" my
> month-old visit is also deleted.
> Another way to read this is as "for a given history item, delete any visits
> made since time X. if we deleted all visits, delete the history item itself".
> 
> I'm somewhat leaning towards the former, but that seems both heavy handed
> and not that useful - it's also what you've currently implemented.
> 
> How do other browsers handle this API - desktop, Chrome..?

Our desktop API uses a pre-existing implementation of history deleting code from browser/base/content/sanitize.js [1]. This calls 
PlacesUtils.history.removeVisitsByFilter, which is implemented at [2]. That code removes any visits that fall into the date range, and then removes any pages that are orphaned after the visits were removed. In other words, it behaves like the latter interpretation you discuss above, so I think that is also how our code on Android should behave.

[1] http://searchfox.org/mozilla-central/source/browser/base/content/sanitize.js#298
[2] http://searchfox.org/mozilla-central/source/toolkit/components/places/History.jsm#900
Flags: needinfo?(bob.silverberg)
(In reply to Bob Silverberg [:bsilverberg] from comment #13)
> In other words, it
> behaves like the latter interpretation you discuss above, so I think that is
> also how our code on Android should behave.

Thank you for helping interpret this, Bob!

Tushar, that means you'll need to change your implementation quite a bit. Currently we don't have a "delete history since time X" functionality that is granular enough and concerns itself with visits, so that's going to be the crux of this bug.

First, we have two types of history visits: those that were made locally, and those that were synced. The primary use for this distinction is so that we may weight them differently while calculating your top sites (and perhaps Activity Stream highlights and other elements in the future). These flags are not synced together with history visits, and in various scenarios we might get them wrong, so I wouldn't concern myself too much with trying to draw a distinction between deleting local and remote history - besides, the API we're implementing doesn't make the distinction either.

Second, we keep materialized local and remote visit counts in the history table itself. This is done as an easy way to speed up our Top Sites queries, essentially - otherwise we'll be joining the two tables. This means that whenever you remove a visit from the visits table, you'd need to update a count in the history table. These counts are a performance optimization, and we don't try too hard to ensure they're kept in lock-step with the visits table. There are various error conditions that might arise which will cause those counts to diverge with reality.

We also keep "lastVisitedLocally" and "lastVisitedRemotely" timestamps in the history table, which should help you here. Assuming those timestamps are accurate, and kept up-to-date as user browses around and syncs, we should be able to narrow down what exactly you'd need to delete.

A naive version of this will roughly do the following:
0) begin a transaction!

1) narrow down scope of work
- select history with lastVisited>startTime

2) delete visits for the selected history items & update counts
- for each selected history, delete its visits which were visited > startTime
- figure out how many visits are left, and of which types - remote, local
-- you can do that as part of the delete query, if you perform two delete queries instead of one.
-- that is, first delete matching local visits, which will give you a deleted count. then, delete all remote visits.
-- using the deleted counts, and the current materialized counts, you may calculate new counts
-- however, keep in mind that the materialized counts might be inaccurate! it's possible that you'll get obviously wrong values (negative #), or subtly wrong numbers
-- finally, update history items with new counts and lastVisited* timestamps!

3) delete orphaned history
- delete any history items that don't have any associated visits. Perhaps you can do this purely based on counts, but keeping in mind that they might be slightly off, you might want to do this instead based on what's in the visits table
- don't forget about everything else that's associated with history, such as metadata.

4) commit your transaction

As you're implementing this, you'd want to push as much of the work into SQL queries as you can (combining multiple operations into batched 'DELETE stuff WHERE guid IN []' statements, etc), and using compiled statements if you find the need to repeat certain statements.

BrowseProvider is the obvious place to draw the API abstraction line. Pass in the required parameters from LocalBrowserProvider via a URI param, and expand upon deleteHistory and friends.

Write lots of unit tests for this! See other BrowserProvider tests for inspiration: https://dxr.mozilla.org/mozilla-central/source/mobile/android/tests/background/junit4/src/org/mozilla/gecko/db
Hey Grisha!

So, I looked into BrowserProvider and all related code which currently manages history deletion. Currently the process [which I observed, I can be wrong], is first we get HistoryGuids on basis of our selection args,[1] then we delete entries from Visits table. But we delete a large chunk of visits, by passing all History_GUIDs in one go [2] and then move to other things, ie, deleting History table elements and updating PageMetaData table.

And If I understand correctly, for our implementation we need:

a) Get all History_GUIDS where last-visited > since. [From History Table]
b) Delete visits from Visit table with selection args: History_GUID = element from our History_GUIDs list, last_visited > since & is_remote/is_local. (This is needed so that we can get counts, to update Visits_count in history table)
c) Update History table with new counts and PageMetaData table.
d) Delete orphaned history elements.

Am I on right path?

For now, I am updating current implementation to support this feature, or should it be implemented separately?  

I have made some changes regarding this, I will push it for feedback.

Thanks!!

[1]. https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/db/BrowserProvider.java#603
[2]. https://dxr.mozilla.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/db/BrowserProvider.java#606,2210,2221-2225
(In reply to Tushar Saini (:shatur) from comment #15)
> c) Update History table with new counts and PageMetaData table.

Don't forget to update the lastVisited local & remote dates.

> d) Delete orphaned history elements.

Make sure to delete in a sync-friendly way. That is, history records are not deleted, but are nulled-out and are marked as deleted. See how the delete functions differentiate deletes from the browser vs. deletes from sync - you're aiming for the browser-style deletes.

> Am I on right path?

Looks like it!

> For now, I am updating current implementation to support this feature, or
> should it be implemented separately?  

Let's do the work in this Bug, since this is going to be the primary consumer of it, and the bug already has the a lot of context for why this is being done to begin with.

> I have made some changes regarding this, I will push it for feedback.

Looking forward to it, thanks!
Comment on attachment 8876839 [details]
Bug 1363010 - Adding support for deletion of history items wrt time.

https://reviewboard.mozilla.org/r/148164/#review159016

Hey Grisha!!

This is just a vague implementation of our feature. I will refactoring and modifying a lot of things.

But I have some queries:

::: mobile/android/base/java/org/mozilla/gecko/db/BrowserProvider.java:614
(Diff revision 2)
> +                hisID = historyGUIDs.toArray(hisID);
> +                Log.e(DB_DEBUG," HISTORY_GUID"+Arrays.toString(hisID));
>  
>                  if (!isCallerSync(uri)) {
> -                    deleteVisitsForHistory(db, historyGUIDs);
> +                    Log.e(DB_DEBUG," delete visits callee");
> +                    deleteVisitsForHistory(db, uri, historyGUIDs);

Here, I want to get substring "time/date" (since) from selection arguement (ie, if we are passing one), so that we can further pass it to deleteVisitsForHistory which is required for our query. So, what will be the best way to do it?

::: mobile/android/base/java/org/mozilla/gecko/db/BrowserProvider.java:2231
(Diff revision 2)
> +            tempVisitCount = new ArrayList<Integer>();
> +            tempVisitCount.clear();
> +
> +            // Get local_visit count, to update History table
> +            // TO DO: Add since option
> +            deleted_local += db.delete(

Here, we are deleting elements from visits table in very low numbers as compare to previous method where we were deleting big chunks altogether. This really worries me about how it will effect our efficiency. Can you suggest any other method where we can do this more effeciently or this is method will do fine?

::: mobile/android/base/java/org/mozilla/gecko/db/BrowserProvider.java:2353
(Diff revision 2)
> +            uri,
> +            DBUtils.computeSQLInClause(historyTobeDeleted.size(), History.GUID),
> +            historyTobeDeleted.toArray(new String[historyTobeDeleted.size()])
> +        );
> +
> +        Log.e(DB_DEBUG, "Delete History");

I will refactor this method and update some things, so that it makes more sense. For now please overlook this.

Thanks!!
Flags: needinfo?(gkruglov)
Comment on attachment 8876839 [details]
Bug 1363010 - Adding support for deletion of history items wrt time.

Moving review to Shane because I have a lot on my plate right now - sorry for not doing this earlier.
Attachment #8876839 - Flags: review?(mwein) → review?(mixedpuppy)
Attachment #8876839 - Flags: review?(bob.silverberg)
Comment on attachment 8876839 [details]
Bug 1363010 - Adding support for deletion of history items wrt time.

https://reviewboard.mozilla.org/r/148164/#review160940

This roughly follows what you're looking to do. See my comments below on where to go from here.

::: mobile/android/base/java/org/mozilla/gecko/db/BrowserProvider.java:614
(Diff revision 2)
> +                hisID = historyGUIDs.toArray(hisID);
> +                Log.e(DB_DEBUG," HISTORY_GUID"+Arrays.toString(hisID));
>  
>                  if (!isCallerSync(uri)) {
> -                    deleteVisitsForHistory(db, historyGUIDs);
> +                    Log.e(DB_DEBUG," delete visits callee");
> +                    deleteVisitsForHistory(db, uri, historyGUIDs);

Another approach is to treat "delete SINCE" as a special snowflake case. None of the other consumers of this code rely on or need the behaviour you're implementing, and I don't foresee that need in the near future beyond these WE APIs, and so I'd be OK with passing SINCE separately as part of a URI param - as opposed to trying to support `selection` and `selectionArgs` for a general case. If that proves troublesome, we can always refactor this code down the road.

This will let you have a separate code path for the fancy SINCE deletions that involve history, which helps in de-risking testing and landing these changes.

So from my point of view, the best thing to do here is to decouple your changes from the existing flows (within reason, share code whenever feasible).

See `isCallerSync` for an example of how one might pass in a flag/parameter via the URI.

::: mobile/android/base/java/org/mozilla/gecko/db/BrowserProvider.java:2213
(Diff revision 2)
>  
>      private int deletePageMetadataForHistory(SQLiteDatabase db, ArrayList<String> historyGUIDs) {
>          return bulkDeleteByHistoryGUID(db, historyGUIDs, PageMetadata.TABLE_NAME, PageMetadata.HISTORY_GUID);
>      }
>  
> -    private int deleteVisitsForHistory(SQLiteDatabase db, ArrayList<String> historyGUIDs) {
> +    private int deleteVisitsForHistory(SQLiteDatabase db, Uri uri, ArrayList<String> historyGUIDs) {

Be very explicit about data structures you're building up as you go through the operations, and describe both what and why (especially why) in comments for each step.

Separate this out from the main execution path. This type of a delete operation is going to have only one consumer for now (the WE API), and so let's not risk regressing other consumers.

::: mobile/android/base/java/org/mozilla/gecko/db/BrowserProvider.java:2231
(Diff revision 2)
> +            tempVisitCount = new ArrayList<Integer>();
> +            tempVisitCount.clear();
> +
> +            // Get local_visit count, to update History table
> +            // TO DO: Add since option
> +            deleted_local += db.delete(

Check out how bulkDeleteByHistoryGUID works. You're looking at a very similar flow here.

::: mobile/android/base/java/org/mozilla/gecko/db/BrowserProvider.java:2239
(Diff revision 2)
> +                        null
> +                    );
> +
> +            // Get remote_visit count, to update History table
> +            // TO DO: Add since option
> +            deleted_remote = db.delete(

If you follow my advice below about getting new counts, etc directly from a single lookup, you should be able to just stick to a single DELETE operation for both local and remote visits. The only reason not to is so that you may have separate deletion counts, but you should be able to instead get all the information you need while checking for orphaned history records below.

::: mobile/android/base/java/org/mozilla/gecko/db/BrowserProvider.java:2260
(Diff revision 2)
> +
> +        // Get last_visited time stamp for History_GUID
> +        Log.e(DB_DEBUG, "Querying Visit table to get last_visited");
> +        HashMap<String, Long> lastVisited = new HashMap<String, Long>();
> +
> +        Cursor cr = db.query(

You should limit these by GUID IN [...], only looking at the visits you might have deleted above.

This will let you do two things in one query:
- look up max timestamps and new counts
- determine if you now have an orphan history GUID

Also, do a GROUP BY guid, use MAX(...) to get the latest timestamps, and use COUNT(...) to get the visit counts.

At worst, you might need to do two queries, segmenting by is_local, if you can't figure out how to combine it all into one.

Also, keep in mind that you _will_ run into unexpected cases such as "there are no visits for history X, even though timestamps on history record indicate that there must be some!". Because distributed systems are hard, and sync might/will mess things up in interesting ways. Do not make any assumtions about the data you're working with. If you do make them, make sure to 1) document them explicitely, 2) ensure they hold, and 3) have a plan for when they don't hold.

::: mobile/android/base/java/org/mozilla/gecko/db/BrowserProvider.java:2289
(Diff revision 2)
> +        cr.close();
> +
> +        // Update History Table
> +        // Refactor it!
> +        Log.e(DB_DEBUG, "Updating History table.");
> +        Cursor newCur = db.query(

I don't think this is necessary. At this point you should have all the necessary information to delete or update history records.

::: mobile/android/base/java/org/mozilla/gecko/db/BrowserProvider.java:2337
(Diff revision 2)
> +                if(remote>0){
> +                    values.put(History.REMOTE_VISITS, new_remote);
> +                }
> +                values.put(History.DATE_LAST_VISITED, lastVisited.get(his_ID));
> +
> +                final int updated = db.update(History.TABLE_NAME, values, History.GUID+"='"+his_ID+"'",null);

You're looking to build up two operation sets. One is "delete orphaned history records", and the other is "update non-orphaned history records".

Deletion should be performed in at most a few chunking delete statements, see the bulkDelete method elsewhere.

You might be looking at performing A LOT of history record updates. The fast thing to do here is to use compiled statements. See bulkInsertVisits for an example of this.
Attachment #8876839 - Flags: review-
Flags: needinfo?(gkruglov)
Attachment #8876839 - Flags: review+ → feedback+
Comment on attachment 8876839 [details]
Bug 1363010 - Adding support for deletion of history items wrt time.

https://reviewboard.mozilla.org/r/148164/#review161262

I was asked to review this, but it looks like all of the WebExtensions code was removed in the latest version of the patch. Please flag me again for review once the WebExtensions code is available to review.
Attachment #8876839 - Flags: review?(bob.silverberg) → review-
Attachment #8876839 - Flags: review?(mixedpuppy)
Hey Grisha!!

So, when  I am trying to get all data required to update our history table, I am not able to find suitable method to get everything. What I am doing, is :

1). Raw querying Visit table twice(was not able to merge them) to get counts for both remote and local visits like this:

"Select history_guid, count(is_local) as local_visits, max(date) from (select * from visits where is_local is 0) group by history_guid having history guid in ....."

"Select history_guid, count(is_local) as remote_visits, max(date) from (select * from visits where is_local is 1) group by history_guid having history guid in ....."

and storing them in ContentValues[][] and was thinking of updating the History table twice, but I am not able to understand how can I get total count correctly. I also tried join to merge these two tables but failed.

Is there a better way to query visit table for getting all these data?

Thank you!
Flags: needinfo?(gkruglov)
Hi Tushar,

Here's roughly what you're looking to do. I think Step 2 below answers your question specifically.

0) build list of history guids that have been visited in the timeframe you're interested in:

guids_list = select guid from history where date > since; 

1) delete the visits:

delete from visits where history_guid in (guids_list) and date > since;

2) figure out orphans and compute new materialized counts for the history table:

select history_guid,count(history_guid),max(date), is_local from visits where history_guid in (guids_list) group by history_guid, is_local;

Given that we have an index on history_guid, this even looks mildly efficient for smallish sets of guids:
0|0|0|SEARCH TABLE visits USING INDEX visits_history_guid_index (history_guid=?)
0|0|0|EXECUTE LIST SUBQUERY 1
0|0|0|USE TEMP B-TREE FOR GROUP BY

sample output might be:
(guid|count|latest_date|is_local)
nnJz-CHj3lHc|1|1501261271546000|0
nnJz-CHj3lHc|1|1501612967720000|1
r0rmbjs2ijVh|4|1497037302434060|0
uLubKTZGKyxT|6|1497037295202749|0

Iterate through the results, building a list of orphans and new materialized counts/date.
Orphans are all the guids in guids_list which you didn't encounter in the latest select.
At the same time build a mapping of guid->update values. A simple HashMap<String, ArrayList> will probably do the trick. Looking ahead, you'll be using these to populate a prepared sql statement, so a simple fixed sized array makes sense. Start off all values in the array at default 0, and update them as you iterate.

3) update all non-orphaned history rows with the new counts/dates.

since you'll be doing this a bunch of times, use prepared update sql statements. You're looking to update three counts, three dates (sigh...), and the modified column (with current time, so that sync picks up the changes).

4) delete all orphaned history rows. just use the regular deleteHistory method in the BrowserProvider.

For any SQL statements involving IN (list), beware of DBUtils.SQLITE_MAX_VARIABLE_NUMBER limit - so in steps (1), (2) and (4) you'll be looking to do "chunking" delete, select and delete, respectively. See bulkDeleteByHistoryGUID for an example, and you can just use that directly for steps 1 and 4.
Flags: needinfo?(gkruglov)
Comment on attachment 8876839 [details]
Bug 1363010 - Adding support for deletion of history items wrt time.

https://reviewboard.mozilla.org/r/148164/#review175982

::: mobile/android/base/java/org/mozilla/gecko/db/BrowserProvider.java:567
(Diff revision 3)
>          }
>      }
>  
> +    public Long isCallerHaveSince(Uri uri) {
> +        String since = uri.getQueryParameter("since");
> +        Log.e(DB_DEBUG, "HaveSince: " + since);

I guess those debug messages were only for testing and do not need to land?
Comment on attachment 8900716 [details]
Bug 1363010 - Implement browsingData.removeHistory WebExtension API method on android.

https://reviewboard.mozilla.org/r/172162/#review177416

This is still WIP patch and is really messed up. So for now I have cleared all review flag.

::: mobile/android/components/extensions/test/mochitest/test_ext_browsingData_history.html:48
(Diff revision 1)
> +  const db = await Sqlite.openConnection({path: DB_PATH, sharedMemoryCache: false});
> +  // Check table exists
> +  ok((db.tableExists("history")), 'history table exists.');
> +  ok((db.tableExists("visits")), 'visits table exists.');
> +
> +  let result = await db.execute("SELECT * FROM history", null, function onRow(row) {

I am trying to test some changes I made regarding deletion of history items. The problem I am facing is when I query the table again, I am not able to retrieve newest values. Its returning same old values. I have verified that the java side code is working fine. I also verified that the entries I made through test inserted into visit and history table are as expected. I logged the table at various points at java side and didn't notice anything problematic. IMHO i think the problem is with Sqlite.jsm or at test, but I am not sure. So, can anyone help, please?


Thanks
Richard, do you know why a change in the browser.db from the Java side might not be visible to SQLite.jsm? Or would you have another idea how to test/verify the deletion from JS?
Flags: needinfo?(rnewman)
(In reply to Sebastian Kaspari (:sebastian) from comment #29)

> Richard, do you know why a change in the browser.db from the Java side might
> not be visible to SQLite.jsm? Or would you have another idea how to
> test/verify the deletion from JS?

Deleting history does not delete rows from the database. We mark them as deleted sentinels so they can be uploaded by Sync.

See the block comment:

                /**
                 * Deletes from Sync are actual DELETE statements, which will cascade delete relevant visits.
                 * Fennec's deletes mark records as deleted and wipe out all information (except for GUID).
                 * Eventually, Fennec will purge history records that were marked as deleted for longer than some
                 * period of time (e.g. 20 days).
                 * See {@link SharedBrowserDatabaseProvider#cleanUpSomeDeletedRecords(Uri, String)}.
                 */


If you've implemented a SQL-based API for querying browser.db from JS (brave and a little foolhardy!), then it should be aware of deletion sentinels, and you can use that API.

If you're querying directly, you need to be aware that only visits will actually be deleted, and history items will just be partially blanked. Accessing browser.db from JS bypasses BrowserProvider, which is an enormous violation of encapsulation, and you will be playing whack-a-mole for these kinds of gaps.

Unfortunately, as browsingData.removeHistory works with visits, as Bob explained, the feature is totally incompatible with Firefox Sync. Sync doesn't have any way of representing the removal of visits. This is a problem on desktop with "Forget", too.

This patch needs very careful review to make sure that it works correctly wrt Sync, and that it doesn't suffer from concurrency or performance problems.
Flags: needinfo?(rnewman)
Comment on attachment 8876839 [details]
Bug 1363010 - Adding support for deletion of history items wrt time.

https://reviewboard.mozilla.org/r/148164/#review180708

Hey!

I've tested the following behaviour in this patch and it works as expected:

1. Delete all.
2. Delete with since, where no history items are orphaned. History table is updated as expected, with new count and dates.
3. Delete with since, where some history items are orphaned. History table is updated as expected, with new count and dates.

Now I want to test for performance but I have no idea how I can do that. I tried "adb shell setprop log.tag.SQLiteTime VERBOSE", but this didn't worked. So, how can I test for it?


Looking for feedback on how we can improve the db queries. :)

Also, I haven't added review flag for web-extension part, as test is not yet complete.

Thanks!!
Blocks: 1411651
This seems to have went off the radar, and generally we're a bit in-flux w.r.t. WebExtension API support for Android in light of the upcoming GeckoView work. If there's a strong desire to land this, I'm happy to help push this work over the line when the time comes.

(cc :ddurst for general awareness).
Comment on attachment 8876839 [details]
Bug 1363010 - Adding support for deletion of history items wrt time.

https://reviewboard.mozilla.org/r/148164/#review227282
Attachment #8876839 - Flags: review?(gkruglov)
Product: Toolkit → WebExtensions
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → DUPLICATE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: