Implement clearing of IndexedDB in browsingData API

NEW
Unassigned

Status

()

Toolkit
WebExtensions: Compatibility
P5
normal
6 months ago
26 days ago

People

(Reporter: bsilverberg, Unassigned)

Tracking

(Blocks: 1 bug)

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [browsingData]triaged)

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

6 months ago
Chrome's browsingData API allows for clearing of IndexedDB and LocalStorage for all visited sites. This has not yet been implemented in Firefox, and a conversation with Jan Varga indicates that some work on the underlying Firefox APIs will be required in order to implement this. Some bugs will be opened for that work, which will block this bug, once it is better understood.
(Reporter)

Updated

4 months ago
Assignee: bob.silverberg → nobody
Status: ASSIGNED → NEW
Blocks: 1043081
Will this tracking bug include removal of IndexedDB and LocalStorage content on a per-domain basis or should I open a separate bug for that?
Flags: needinfo?(bob.silverberg)
(Reporter)

Comment 2

4 months ago
(In reply to Johannes Pfrang [:johnp] from comment #1)
> Will this tracking bug include removal of IndexedDB and LocalStorage content
> on a per-domain basis or should I open a separate bug for that?

Bug 1340511 has already been opened for that.
Flags: needinfo?(bob.silverberg)
No longer blocks: 1043081

Comment 3

3 months ago
I almost have patches ready for this, but I've run into a stumbling block: it doesn't look like "since" support will be possible for localStorage, as the data for all hosts is all stored in one sqlite database, and there are no timestamp records in the database to work with.

Is this something we should block on, or should we simply throw some sort of "not implemented" error when users try to combine clearing localStorage with since for now, and file a follow-up bug?

Please note that I'd imagine that figuring out how to change the schema of webappstore.sqlite would require more time than I have to invest in this effort right now :(

PS: "since" support for indexedDBs is possible, since they're stored in per-host directories on the file system. As such it's not too much trouble to clear the host's IDBs if they're all newer-or-equal to the since date.
Flags: needinfo?(bob.silverberg)
(Reporter)

Comment 4

3 months ago
I believe that when Kris and I discussed this ages ago, he suggested that changes should be made to the data stores to support since, as it's possible, but just not currently available. We should probably wait for that, rather than implementing a "not quite there" solution.
Flags: needinfo?(bob.silverberg)

Comment 5

3 months ago
Ah, alright. Would it be alright to at least try landing the indexedDB support while we wait for the localStorage changes? I can't see that being a problem, and I'd prefer to not have that bitrot too much if possible.
Flags: needinfo?(bob.silverberg)
(Reporter)

Comment 6

3 months ago
Sure, that sounds fine, but we should probably split it into two bugs in that case, so we don't lose track of the localStorage case. Either this bug or a new one can be used to track the changes for indexedBD, and vice versa for localStorage. Can you please take care of that, Thomas?
Flags: needinfo?(bob.silverberg)

Updated

3 months ago
See Also: → bug 1355576

Comment 7

3 months ago
Sure, I've split out the localStorage case into bug 1355576, and will try to get a patch up for review tonight (with both yourself and :baku as reviewers, as suggested in bug 1340511).
Summary: Implement clearing of IndexedDB and LocalStorage in browsingData API → Implement clearing of IndexedDB in browsingData API
Comment hidden (mozreview-request)

Comment 9

3 months ago
mozreview-review
Comment on attachment 8857146 [details]
Bug 1333050 - add support for browsingData.removeIndexedDB and remove(indexedDB);

https://reviewboard.mozilla.org/r/129080/#review131588

::: browser/components/extensions/ext-browsingData.js:129
(Diff revision 1)
> +  yield AllStorageHostsIterator(function* (hostPath) {
> +    let idbPath = OS.Path.join(hostPath.path, "idb");
> +    let idbStat = yield OS.File.stat(idbPath);
> +    if (idbStat.isDir) {
> +      let shouldClear = false;
> +      if (options.since) {

I'm not 100% sure which logic we want here, precisely. Would we want to only clear a host's IDBs if ALL of them have been accessed since the cutoff date, or if ANY of them have been accessed since the cutoff date?

If for some reason we'd like to be finer-grained and clear on a per-DB basis (rather than a per-host basis), then we will need further platform support, as right now there is only support to clear all storages for a given host.

Comment 10

3 months ago
I'm not sure collecting origin directories outside of quota manager is the right thing to do. Every access to <profile>/storage should use a directory lock.
Take a look at https://dxr.mozilla.org/mozilla-central/source/dom/indexedDB/ActorsParent.cpp#18621

And btw, last access time is tracked in .metadata-v2 file.

Comment 11

3 months ago
Thanks, :janv. I was actually hoping to avoid pointless locking entirely, since I'm only trying to find out the list of hosts with IDBs at that stage (I actually do my call to try clearing the host through the sanctioned APIs, which I believe does the necessary locking/etc).

That said, if this is too fragile and locking is required, then I'm obviously all for it. I just don't know how to get a list of hosts with IDBs in any other way, and I'm not sure how to access the info in .metadata-v2, for that matter. Do you have any advice on how I should go about exposing such info to JS in a better way? With luck I may be able to find the time to do it.
Flags: needinfo?(jvarga)

Comment 12

3 months ago
(In reply to Thomas Wisniewski from comment #11)
> Thanks, :janv. I was actually hoping to avoid pointless locking entirely,
> since I'm only trying to find out the list of hosts with IDBs at that stage

What if the origin directory you are trying to process in JS is being deleted at the same time ?
Yeah, it seems pointless until you see intermittent failures and strange crashes :)

> (I actually do my call to try clearing the host through the sanctioned APIs,
> which I believe does the necessary locking/etc).
> 
> That said, if this is too fragile and locking is required, then I'm
> obviously all for it. I just don't know how to get a list of hosts with IDBs
> in any other way, and I'm not sure how to access the info in .metadata-v2,
> for that matter. Do you have any advice on how I should go about exposing
> such info to JS in a better way? With luck I may be able to find the time to
> do it.

I think the list should be collected in C++ as part of a new method in nsIQuotaManagerService.
Flags: needinfo?(jvarga)

Comment 13

3 months ago
>What if the origin directory you are trying to process in JS is being deleted at the same time ?

I'll defer to your wisdom :)

I had suspected that it would fail one way or the other in the event of such a race condition, and my code would end up trying to delete a DB that was already gone in the worst case.

But taking a closer look at DirectoryIterator, I doubt my code is fault-tolerant enough regardless, so I'm going to cancel review for now and try to figure out how to get this working in nsIQuotaManagerService as you suggest.

Updated

3 months ago
Attachment #8857146 - Attachment is obsolete: true
Attachment #8857146 - Flags: review?(bob.silverberg)
Attachment #8857146 - Flags: review?(amarchesini)
Comment hidden (mozreview-request)

Comment 15

3 months ago
Hey :janv, I'm not sure if you're willing to actually review this latest patch, but if you could at least take a look to see if it's what you had in mind, I'd appreciate it.

Comment 16

3 months ago
Man, I appreciate your contribution.
However, I'm afraid that this is not right direction, sorry.

You would also need to expose directory locking stuff, etc. just because you want to do processing of origin directories in JS
Is there any particular reason for that ?

I thought that a new method in nsIQuotaManagerService would just take a "since" argument.

Comment 17

3 months ago
Ah It's no big deal that I misinterpreted you, but clearly I should make 100% sure what you mean this time :)

Did you mean I should just add a clearSince method and a clearStoragesForPrincipalIfSince method? 

The second one is going to be needed soon. We need to be able to clear the iDBs for all origins that have been accessed since a given time, but also on a per-origin basis (we'll also have a "hostname" parameter in addition to a "since" parameter, in the browsingData.removeIndexedDBs API). I guess that means I'd either need to query for the origin's metadata access time, or just add an ifSince parameter to the existing method. The latter seems the easier way to go, but would you prefer something else?
Flags: needinfo?(jvarga)

Comment 18

3 months ago
Is the API and any possible future changes/improvements described somewhere ?

So options is like a dictionary, right ? It can contain one or all of these: hostname, since

And you propose these new methods:
ClearSince(since)
ClearStoragesForPrincipalIfSince(hostname, since)

I think it would be more flexible to just pass options to a new method:
ClearXXX(in jsval options)

I'm not sure about suitable naming for this right now, that can be decided later.

Also, is this only about IndexedDB ? I know that localStorage part of work can't be done at the moment, but what about DOM cache or any other storage APIs managed by quota manager ?
Flags: needinfo?(jvarga)

Comment 19

3 months ago
>is this only about IndexedDB ?

This ticket, yes. We also want to deal with localStorage to achieve closer Chrome parity in the browsingData WebExtension API, but I can't see how to support "since" for localStorage without changing the schema of the sqlite stores to include per-host access times, and I'm not sure I have the time or knowledge to do that right without very careful mentoring.

>I know that localStorage part of work can't be done at the moment, but what about DOM cache or any other storage APIs managed by quota manager ?

I'm new to the quota manager code, so I don't yet know what other types of stores it handles. The priority right now for WebExtensions is just to add iDB support for clearing iDBs with an optional "since" date, then to add a "host(s)" filter. But later on we might expose the ability to clear more browsingData, so I'm certainly not against making a more general-purpose API. Maybe:

  qms.clearStorages({
    storages: ["indexedDB", "domCache"],
    hosts: ["http://www.host1.com"],
    since: timestamp
  }, callback)

Where if all parameters are omitted it just clears everything? If that's agreeable, then I don't mind implementing it as such. The issue is that I wouldn't be sure what storage types the quota manager handles aside from the two we've mentioned already. Is there an obvious list and set of APIs I can use to handle each type, or should I limit it to just iDBs for now and require at least one type to be passed in? (And would all the storage types even be able to support "since" as a parameter?)
Flags: needinfo?(jvarga)

Comment 20

3 months ago
Gentle needinfo ping (but if you'd like more time to chew on this, then I don't mind).

Comment 21

3 months ago
(In reply to Thomas Wisniewski from comment #19)

Sorry for the delay...

> >is this only about IndexedDB ?
> 
> This ticket, yes. We also want to deal with localStorage to achieve closer
> Chrome parity in the browsingData WebExtension API, but I can't see how to
> support "since" for localStorage without changing the schema of the sqlite
> stores to include per-host access times, and I'm not sure I have the time or
> knowledge to do that right without very careful mentoring.

Yeah, supporting localStorage is not easy with the current localStorage architecture. Should be easier once we plug it into Quota Manager.

> 
> >I know that localStorage part of work can't be done at the moment, but what about DOM cache or any other storage APIs managed by quota manager ?
> 
> I'm new to the quota manager code, so I don't yet know what other types of
> stores it handles. The priority right now for WebExtensions is just to add
> iDB support for clearing iDBs with an optional "since" date, then to add a
> "host(s)" filter. But later on we might expose the ability to clear more
> browsingData, so I'm certainly not against making a more general-purpose
> API. Maybe:
> 
>   qms.clearStorages({
>     storages: ["indexedDB", "domCache"],
>     hosts: ["http://www.host1.com"],
>     since: timestamp
>   }, callback)
> 
> Where if all parameters are omitted it just clears everything? If that's
> agreeable, then I don't mind implementing it as such. The issue is that I
> wouldn't be sure what storage types the quota manager handles aside from the
> two we've mentioned already. Is there an obvious list and set of APIs I can
> use to handle each type, or should I limit it to just iDBs for now and
> require at least one type to be passed in? (And would all the storage types
> even be able to support "since" as a parameter?)

I would design it as an array of "storage types" as you suggested, but limit to IndexedDB for now.
Flags: needinfo?(jvarga)

Comment 22

3 months ago
mozreview-review
Comment on attachment 8857736 [details]
Bug 1333050 - Add a new method QuotaManagerService.clearStorages(in jsval aOptions), which is a dict with an array of storageTypes (only indexedDB supported for now), optional array of principals, and optional 'since' cutoff timestamp;

https://reviewboard.mozilla.org/r/129690/#review138474
Attachment #8857736 - Flags: review?(jvarga)

Comment 23

3 months ago
No problem on the delay. But I'm relatively new to dealing with the idl stuff, so could you give me some pointers on what the best way to implement the "in jsval" parts would be? That is, I'm not sure if learning how to do this with isObject() and JS::ForOfIterator and so on would be the way to go, or if I should just avoid all of that and pass in several array arguments instead of a jsval, like this:

  clearStorages([optional] in ACString[] aStorageTypes,
                [optional] in nsIPrincipal[] aPrincipals,
                [optional] in int32_t aSince)
Flags: needinfo?(jvarga)

Comment 24

3 months ago
Hmm. I'm indeed stuck on the jsval version of the implementation, and unsure what to do. When I create a principal like this in a mochitest:

  let ssm = SpecialPowers.Services.scriptSecurityManager;
  var principal = ssm.createCodebasePrincipalFromOrigin("http://google.ca");
  dump(principal);

The dump of the object shows that its indeed an nsIPrincipal:

  [xpconnect wrapped (nsISupports, nsIPrincipal, nsISerializable) @ 0x7fcad3ca7320 (native @ 0x7fcad4aeb520)]

However, in my C++ code the Rooted JS object doesn't seem to be unwrappable using UnwrapArg<nsIPrincipal>, just UnwrapArg<nsISupports>.

Is there a special trick to grabbing an nsIPrincipal* for such objects? I seem to be able to unwrap a plain nsIURI just fine using UnwrapArg, but its dump doesn't show support for multiple XPCOM interfaces: [xpconnect wrapped nsIURI]. Is that perhaps related?
Flags: needinfo?(jvarga)

Comment 25

3 months ago
In case a bit more context helps, this is the code I'm using to unwrap the principals in my inner loop (with some error-checking removed for clarity):

  JS::Rooted<JS::Value> temp(cx);
  iter.next(&temp, &done); // JS::ForOfIterator
  JS::RootedObject obj(cx, &temp.toObject());
  RefPtr<nsIPrincipal> principal;
  if (NS_FAILED(UnwrapArg<nsIPrincipal>(cx, obj, getter_AddRefs(principal)))) {
    ThrowErrorMessage(cx, MSG_DOES_NOT_IMPLEMENT_INTERFACE, "Argument of clearStorages", "Principal");
    return NS_ERROR_INVALID_ARG;
  }

Comment 26

3 months ago
Btw, do you really need to support an array of hosts/principals ?

Comment 27

3 months ago
Hmm. It appears that the problem I was running into is specific to the mochitest I was using (or perhaps mochitests in general). I just tried running a test from an xpcshell-test instead, and it seems able to unwrap the nsIPrincipals just fine. Hopefully it was a false alarm.


>Btw, do you really need to support an array of hosts/principals?

No, I don't suppose I need an array of them, but figure that I might as well. It seems it may be more efficient to just batch up one request, rather than making multiple calls, and the extra code to iterate an array doesn't seem all that egregious to me. I'm not against changing it though if you feel that's the better approach.
Comment hidden (mozreview-request)

Comment 29

3 months ago
Whoops, my bad, I didn't realize reviewboard would make a useless interdiff for this new patch. Just ignore that.

Comment 30

a month ago
Just another friendly status ping.

Comment 31

a month ago
Comment on attachment 8857736 [details]
Bug 1333050 - Add a new method QuotaManagerService.clearStorages(in jsval aOptions), which is a dict with an array of storageTypes (only indexedDB supported for now), optional array of principals, and optional 'since' cutoff timestamp;

Pinging another indexeddb peer who might be able to help with review.
Attachment #8857736 - Flags: review?(btseng)

Comment 32

a month ago
Sorry, but I repeated this multiple times, it seems to me that the locking is not handled in the last patch at all.
Using locking methods or even extending the locking API is quite hard, so I wouldn't try to make the new method too generic. I would just try to add the minimum you need for browsingData API.

For example, clearAllOrigins needs to exclusively lock the whole storage directory.
On the other hand, when you pass a list of origins, you need to exclusively lock only those origins.

You just can't go ahead and delete whichever files you want in <profile>/storage, they can be in use.

I can't review this, until this is addressed, sorry.

Comment 33

a month ago
mozreview-review
Comment on attachment 8857736 [details]
Bug 1333050 - Add a new method QuotaManagerService.clearStorages(in jsval aOptions), which is a dict with an array of storageTypes (only indexedDB supported for now), optional array of principals, and optional 'since' cutoff timestamp;

https://reviewboard.mozilla.org/r/129690/#review155876
Attachment #8857736 - Flags: review?(jvarga)
Attachment #8857736 - Flags: review?(btseng)

Comment 34

26 days ago
In addition to creating an API for clearing IndexedDB, Firefox should actually delete this data on exit, when the user has selected "Clear history when Firefox closes->Offline website data". Firefox 52 ESR does not clear this data, which is kind of fooling the user. Tracking possibilities galore! Tested it here https://static.raymondcamden.com/demos/2014/feb/7/index.html#/home
You need to log in before you can comment on or make changes to this bug.