Open Bug 1531276 Opened 9 months ago Updated 29 days ago

Use nsIClearDataService in browsingData

Categories

(WebExtensions :: General, enhancement, P2)

enhancement

Tracking

(Not tracked)

People

(Reporter: baku, Unassigned)

References

Details

We are trying to unifying all the site-data cleanup in one single module: nsIClearDataService. We should use it in browsingData as well.

https://searchfox.org/mozilla-central/rev/2a6f3dde00801374d3b2a704232de54a132af389/toolkit/components/cleardata/nsIClearDataService.idl

I took a quick look sometimes ago to the nsIClearDataService, and I noticed that it is already started to be used internally by Firefox Desktop's Sanitizer.jsm.

Sanitizer.jsm is what browsingData API is using internally to implement most of its features, and so it seems to me that a good strategy would be to double-check if Sanitizer.jsm is now able to cover more of what browsingData needs thanks to nsIClearDataServer.

By looking to Firefox Desktop browsingData API implementation, it seems that the following features are the only ones that are not using methods from Sanitizer.jsm:

At a first glance it seems that nsIClearDataService.idl should be able to cover clearCookies and clearPassword, but it doesn't seem to be possible to clear IndexedDB and localStorage separately, is that right?

As additional side notes:

we have two separate implementations of the browsingData API namespace:

Both these API modules uses Sanitizer.jsm internally for most of the feature implemented, and Sanitizer.jsm seems to also have two separate implementations:

At a first glance it looks that, while nsIClearDataService is already been using internally by the Firefox Desktop Sanitizer.jsm, it doesn't seem to be used by the Firefox for Android Sanitizer.jsm.

Do you know if there is any actual reason why nsIClearDataService is not being used in the Firefox for Android Sanitizer.jsm?

Flags: needinfo?(amarchesini)

Sanitizer.jsm is what browsingData API is using internally to implement most of its features, and so it seems to me that a good strategy would be to double-check if Sanitizer.jsm is now able to cover more of what browsingData needs thanks to nsIClearDataServer.

Would be great if we can remove this extra step and use nsIClearDataService directly. Note that is exposed to JS via: Services.clearData.

This is supported.

This is supported as well

We can introduce these 2.
Question: why do we want to allow the cleanup of IDB and localStorage separately?

Do you know if there is any actual reason why nsIClearDataService is not being used in the Firefox for Android Sanitizer.jsm?

Bug 1466130 - never finished.

Flags: needinfo?(amarchesini)

(In reply to Andrea Marchesini [:baku] from comment #2)

Sanitizer.jsm is what browsingData API is using internally to implement most of its features, and so it seems to me that a good strategy would be to double-check if Sanitizer.jsm is now able to cover more of what browsingData needs thanks to nsIClearDataServer.

Would be great if we can remove this extra step and use nsIClearDataService directly. Note that is exposed to JS via: Services.clearData.

Does you know if the features wrapped by Sanitizer.jsm are doing anything more than what Services.clearData already does on its own?

We can introduce these 2.
Question: why do we want to allow the cleanup of IDB and localStorage separately?

This API has been inherited by the ones supported on Chrome, and indexedDB and localStorage are two of the supported data types, and technically they could be used separately from each other by the extensions that uses this API.

Flags: needinfo?(amarchesini)

(In reply to Luca Greco [:rpl] from comment #3)

This API has been inherited by the ones supported on Chrome, and indexedDB and localStorage are two of the supported data types, and technically they could be used separately from each other by the extensions that uses this API.

I said this in https://bugzilla.mozilla.org/show_bug.cgi?id=1526246#c3 too, but I think the only reasonable course of action is to map these fine-grained deletion requests onto "site data". I think it's reasonable to separate "cookies". For a sort of meta-discussion where I expound, at length, on this, check out bug 1488583.

But for a shorter recap here:

  • Web standards promise that origins will be cleared at a "bucket" granularity for an origin. For now this means all "site data". It's very likely for a site's ServiceWorker to depend on specific IndexedDB and Cache API state and to break if only part of the data is cleared. If only part of the data is cleared, it's possible that sites will break, which is bad for them and bad for any resulting Firefox support costs, as the only answer to such a problem is "don't do that".
  • It's very dangerous for an API that deals with privacy-related things to be able to accidentally miss things. The user intent of clearing data is very clear. Because of attempting to follow this API we have bug 1526246 where user-intent is not honored. If the API gains additional distinct partitions and a WebExtension is not updated to use them, personal data will be potentially be left behind, etc.
  • It is conceivable that there will be individuals who have a specific reason to want to clear only IndexedDB, etc. as is/was the case in bug 1488583. However, it's not clear that this is a use-case we want to support for both the aforementioned support reasons and the potential risk for bugs/oversights in WebExtension code where what users of the WebExtension think is happening is not actually what's happening.
See Also: → 1526246

Does you know if the features wrapped by Sanitizer.jsm are doing anything more than what Services.clearData already does on its own?

At the moment, Sanitizer.jsm is needed just for 'clearFormData' and probably we can move that complexity into nsIClearDataService as well. Anything else is just calling Sanitizer.jsm which calls Service.clearData.deletedata() directly (with or without range).

Here the list of things which do not use Sanitizer.jsm nor nsIClearDataService:

  • clearCookies() - probably it could be ported to nsIClearDataService easily.
  • clearIndexedDB() - see the asuth's comment. I think we should just delete all the storage data.
  • clearLocalStorage() - same here.
  • clearPasswords() - this requires nsIClearDataService to remove by range. Easy task.
  • clearServiceWorkers() - this should be merged with clearIndexedDB() and clearLocalStorage().

Interesting: I would add 'storage' type, because there are many things left in the current cleanup browsingData feature. A quick list is: MediaDevices, AppCache,PredictornetworkCleaner, PushNotification, Authentication tokens/cache, Permissions, Reports...

See: https://searchfox.org/mozilla-central/rev/00f3836a87b844b5e4bc82f698c559b9966e4be2/toolkit/components/cleardata/nsIClearDataService.idl#92-195

Flags: needinfo?(amarchesini)

Haven't seen it mentioned so far so let me note that Sanitizer.jsm contains a lot of logic for doing sanitize on shutdown, manage pending sanitizations, etc. We would like to specialize that module to these tasks and avoid it being a general purpose data cleaner (because its API is just not that great otherwise and has this weird relationship with privacy.sanitize prefs).

Hence I think we should move all users that don't strictly need to be using Sanitizer.jsm off of it.

Flags: needinfo?(mconca)

If we want to move away from sanitizer.jsm and over to nsIClearDataService, it is important to maintain API compatibility. Providing extensions granular control over browser features, perhaps more than Firefox offers, is an attractive aspect of the API and allows extension developers more creativity and control. Furthermore, maintaining compatibility with Chrome is an important and explicit goal of WebExtensions - changing the browsingData API (after following the strict deprecation policy) would result in several webcompat bugs with Chrome.

Flags: needinfo?(mconca)

(In reply to Mike Conca [:mconca] from comment #7)

Providing extensions granular control over browser features, perhaps more than Firefox offers, is an attractive aspect of the API and allows extension developers more creativity and control. Furthermore, maintaining compatibility with Chrome is an important and explicit goal of WebExtensions - changing the browsingData API (after following the strict deprecation policy) would result in several webcompat bugs with Chrome.

Are you aware of a specific use-case for clearing only a subset of an origin's bucket storage that necessitate this level of "creativity and control"? My concern continues to be that this places users at risk of:

  1. Using a WebExtension that is intended to preserve privacy, but because of the design of the API and/or bugs in the webextension they are using, their intent to clear private data fails to clear private data.
  2. Breaking the website for the user as sites usually assume storage is an all-or-nothing behavior, as that is how the web works without this extension.

I think it's worth noting that Web Extensions already have a means of directly accessing storage for an origin via content scripts (or devtools) if they want to perform more targeted manipulations of data. And such targeted manipulations are likely to be much safer in regards to the 2nd concern, as there's a big difference between deleting a single IndexedDB database (or manipulating a single row, etc.) than wiping out all databases for the origin.

See Also: → 1535606

(In reply to Andrew Sutherland [:asuth] from comment #8)

Are you aware of a specific use-case for clearing only a subset of an origin's bucket storage that necessitate this level of "creativity and control"? My concern continues to be that this places users at risk of:

  1. Using a WebExtension that is intended to preserve privacy, but because of the design of the API and/or bugs in the webextension they are using, their intent to clear private data fails to clear private data.
  2. Breaking the website for the user as sites usually assume storage is an all-or-nothing behavior, as that is how the web works without this extension.

I think it's worth noting that Web Extensions already have a means of directly accessing storage for an origin via content scripts (or devtools) if they want to perform more targeted manipulations of data. And such targeted manipulations are likely to be much safer in regards to the 2nd concern, as there's a big difference between deleting a single IndexedDB database (or manipulating a single row, etc.) than wiping out all databases for the origin.

I appreciate what you are saying, Andrew, and generally don't disagree. A poorly written extension could potentially leave the user in a worse privacy situation than if they hadn't used the extension. Let me see if I can find what use cases there may be for having separate API for each storage area. I'm open to changing this behavior if it doesn't break something important.

This is starting to feel off-topic from the original bug, though. I filed bug 1535763 to track this issue separately.

See Also: → 1535763
Priority: -- → P2
You need to log in before you can comment on or make changes to this bug.