[meta] Support managing site data by first-party site when dFPI is active
Categories
(Core :: Privacy: Anti-Tracking, enhancement, P2)
Tracking
| | Tracking | Status |
|---|---|---|
| relnote-firefox | --- | 91+ |
People
(Reporter: englehardt, Unassigned)
References
(Depends on 1 open bug, Blocks 2 open bugs)
Details
(Keywords: meta)
Attachments
(3 files)
The site data manager is designed for a world where cookies and site data are defined by the origin of the resource and nothing else. This model has been a bit outdated for a while since it doesn't consider origin attributes, but the mismatch will become especially pronounced when dFPI ships.
In dFPI we have a concept of the storage that belongs to the site itself and all third-party storage that has been partitioned under that top-level site. I'm going to refer to this as the site's storage jar, for lack of a better word. Specifically this means all storage keyed off an origin from that site (i.e., scheme + eTLD+1), as well as any third-party storage that has its `partitionKey` origin attribute set to that site. As an example, `first-party.example`'s storage jar would include cookies and site data that correspond to `foo.first-party.example` as well as `tracker1.example^partitionKey=first-party.example` and `tracker2.example^partitionKey=first-party.example`.
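As a rough illustration of the storage jar definition above (not Firefox code), here is a minimal Python sketch. The `StorageEntry` type, the `in_storage_jar` helper, and the `scheme://domain` serialization of the partition key are assumptions made for this example; `base_domain` is a naive stand-in for a real eTLD+1 lookup against the Public Suffix List.

```python
from dataclasses import dataclass
from typing import Optional


def base_domain(host: str) -> str:
    """Naive eTLD+1: keep the last two labels (assumption; real code needs the Public Suffix List)."""
    return ".".join(host.split(".")[-2:])


@dataclass(frozen=True)
class StorageEntry:
    scheme: str                          # scheme of the origin that owns the data
    host: str                            # host of the origin that owns the data
    partition_key: Optional[str] = None  # top-level site for partitioned data, None if unpartitioned


def in_storage_jar(entry: StorageEntry, site_scheme: str, site_domain: str) -> bool:
    """Return True if the entry belongs to the storage jar of the given site."""
    if entry.partition_key is not None:
        # Partitioned third-party storage lives in the jar of its top-level site.
        return entry.partition_key == f"{site_scheme}://{site_domain}"
    # Unpartitioned storage belongs to the jar of its own site (scheme + eTLD+1).
    return entry.scheme == site_scheme and base_domain(entry.host) == site_domain
```

With this model, `foo.first-party.example` and `tracker1.example` partitioned under `https://first-party.example` both land in the same jar, while unpartitioned `tracker1.example` storage does not.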
Right now, the site data manager allows users to clear storage for specific origins. So a user could choose to clear storage from `foo.first-party.example` and we'd clear `foo.first-party.example`'s first-party storage, but we would not clear `tracker1.example^partitionKey=first-party.example` or `tracker2.example^partitionKey=first-party.example` and so on.
We should re-think this experience around the site storage jar concept. Rather than display a bunch of individual first-party and third-party origins we could display the actual top-level sites that a user visits and have the mechanism clear storage for all third parties partitioned under those sites (as well as all first-party origins from those sites). This is similar to what Mike Perry imagined in Bug 565965 Comment 21.
Things to consider:
1. We currently allow users to clear storage by individual origins. We'd have to move from origin to site in this new flow because third-party storage will be keyed off of site. That is, we won't know whether `tracker1.example^partitionKey=foo.example` was set on `bar.foo.example` or `sub.foo.example`. Exposing `bar.foo.example` and `sub.foo.example` as different choices doesn't make a lot of sense when we intend to also clear the partitioned third parties.
2. We should decide whether clearing a site's storage jar should clear all partitioned storage from that site in addition to the storage jar itself. Should we reach into other sites' storage jars? As an example, this would mean that clearing storage from `tracker.example` would clear all of `tracker.example`'s first-party storage, all storage matching `tracker.example^partitionKey=<any site>`, and all storage matching `<any origin>^partitionKey=tracker.example`.
(1) means that users have less control, but I suspect that's okay. Given that cookies can be scoped to eTLD+1 this seems safer than the current design.
(2) is likely what we'll do for cookie purging. Not clearing partitioned storage across storage jars allows a tracker to easily rebuild/respawn the user's identity using data stored in its partitioned storage locations.
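To make the trade-off in (2) concrete, here is a small Python sketch of which entries would be selected when clearing a site, with and without reaching into other sites' storage jars. The `Entry` tuple and `entries_to_clear` function are hypothetical names for this illustration, and sites are reduced to plain eTLD+1 strings for brevity.

```python
from typing import List, NamedTuple, Optional


class Entry(NamedTuple):
    site: str                     # eTLD+1 of the origin that owns the data
    partition_key: Optional[str]  # eTLD+1 of the top-level site, None for first-party storage


def entries_to_clear(entries: List[Entry], site: str,
                     reach_into_other_jars: bool = True) -> List[Entry]:
    """Select the entries removed when the user clears `site`."""
    selected = []
    for e in entries:
        # The site's own jar: its first-party storage plus anything partitioned under it.
        in_own_jar = (e.partition_key is None and e.site == site) or e.partition_key == site
        # The site's storage as a third party, partitioned under some top-level site.
        partitioned_elsewhere = e.site == site and e.partition_key is not None
        if in_own_jar or (reach_into_other_jars and partitioned_elsewhere):
            selected.append(e)
    return selected
```

With `reach_into_other_jars=False`, an entry such as `tracker.example^partitionKey=A.example` survives the clear, which is the respawning risk described above.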
Comment 1•4 years ago
What's the relationship between 1. Clearing cookies and site data and 2. access granted to third party domains on a website?
If I clear cookies/site data (which clears both my cookies/site data for both origin AND third parties in this future state), do third parties still have access to my tracking activity while on the site?
Comment 2•4 years ago (Reporter)
(In reply to Meridel from comment #1)
> What's the relationship between 1. Clearing cookies and site data and 2. access granted to third party domains on a website?
For what's currently implemented: when you clear cookies and site data for an origin, we will clear all storage access permissions that have been granted on the origin being cleared. So if we were to clear `facebook.com`, we'd clear any storage access permissions that were granted when the user was visiting Facebook as a first party (the grants providing access to various third parties). We would not clear storage access grants that have been provided to facebook.com on other websites.
EDIT: It looks like we were incorrect here. The cookies and site data management UI doesn't seem to impact storage access permissions at all. We can of course change that in the future.
> If I clear cookies/site data (which clears both my cookies/site data for both origin AND third parties in this future state), do third parties still have access to my tracking activity while on the site?
All third-party access grants that were previously active on the site you cleared data for will be deleted. These third parties would need to re-request storage access.
EDIT: Same as above, we were wrong in our understanding here. Storage access permissions will be untouched, so third parties will still be able to access cross-site storage after cookie clears (of course it will be a new version of the storage, since the older data was deleted).
There's a question of whether we should delete the storage access grants that have been given to the site you've cleared (i.e., grants for which the site is a third party). As an example, let's say you clear Facebook's cookies but Facebook has storage access on 10 other sites as a third party. Should we clear their access on those sites as well so they need to re-request? These access grants aren't a storage mechanism in themselves, and I can't imagine any practical attacks that could be used to re-identify the user by their previous storage access grants. It's more of a question of what the user would expect. I'm inclined to say we should clear them: it matches what we'd do for the site's third-party storage, and the user really doesn't have any other means to revoke the permissions en masse.
Comment 3•4 years ago (Reporter)
During our meeting today there was some confusion around what I meant in Comment 0 since it's quite technical. I've attached a simpler visual example. In this example the user has visited four sites as first parties (tracker.example, A.example, B.example, and C.example), and thus there are four separate storage jars. These sites loaded a variety of third parties (e.g., analytics.example, tracker.example, and video.example). Thus, these third parties have partitioned storage within the four storage jars. Note that tracker.example is visited directly as a first party and also embedded as a third party.
The diagram shows what I'm suggesting we clear when we clear cookies and site data for tracker.example.
Comment 4•4 years ago (Reporter)
We chatted a bit more on the topic of whether this UI should clear storage access permissions as well, and we think it makes sense to add that as part of the update. So when you clear facebook.com, we would clear all storage access permissions that have been granted on facebook.com as a first party (i.e., those that were granted to third parties embedded in facebook.com).
Comment 5•4 years ago (Reporter)
In summary, we've landed on the following functional changes:
1. Display storage and cookies by registrable domain (eTLD+1) rather than origin. Note that dFPI partitions by site (scheme + eTLD+1), but I don't think we want to expose scheme here (e.g., show entries for both `https://example.com` and `http://example.com`). Instead, we should display by eTLD+1. (See Comment 1)
2. Clear all cookies and storage that are associated with the eTLD+1 selected. We currently clear all storage associated with the host being cleared (ignoring any origin attributes), and we should continue to do that. In addition, we should clear storage that is partitioned under an eTLD+1 when that eTLD+1's data is cleared. Again this means we should ignore scheme when clearing things that include the eTLD+1 in their `partitionKey` attribute. For example, if the user clears `example.com` we'd clear `third-party.example^partitionKey=https://example.com` as well as `third-party.example^partitionKey=http://example.com`. (See Comment 3)
3. Clear storage access permissions. If a user clears `example.com` we should clear both the storage access permissions granted to `example.com` on other websites, as well as storage access permissions granted on `example.com` to other third parties.
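Points 2 and 3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Grant` tuple, `partition_key_domain`, `is_partitioned_under`, and `grants_to_clear` are hypothetical names, and the partition key is assumed to be written as a URL-like string as in the examples above.

```python
from typing import List, NamedTuple
from urllib.parse import urlsplit


def partition_key_domain(partition_key: str) -> str:
    """Drop the scheme from a partitionKey such as 'https://example.com'."""
    return urlsplit(partition_key).hostname or partition_key


def is_partitioned_under(partition_key: str, domain: str) -> bool:
    """Scheme-agnostic match: http and https partitions of `domain` both count."""
    return partition_key_domain(partition_key) == domain


class Grant(NamedTuple):
    first_party: str  # eTLD+1 of the site on which access was granted
    third_party: str  # eTLD+1 of the embedded party that received access


def grants_to_clear(grants: List[Grant], domain: str) -> List[Grant]:
    """Select grants given to `domain` elsewhere and grants given on `domain` to others."""
    return [g for g in grants if domain in (g.first_party, g.third_party)]
```

Matching on the domain alone means a single selection in the UI covers both the `http` and `https` partitions, which is the behavior point 2 asks for.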
Comment 6•4 years ago (Reporter)
Comment 7•4 years ago (Reporter)
Comment 8•4 years ago (Reporter)
For point (1) of Comment 5 we also need to think about how to display things in the confirmation dialog (Comment 7). We need to decide whether to display things by eTLD+1 or hostname. Note that this dialog is displayed both when a user goes through the about:preferences#privacy UI and when a user clicks "Clear Cookies and Site Data..." from the Site Information Panel (Comment 6).
The "Clear Cookies and Site Data..." button (Comment 6) clears by eTLD+1 of the site open in the active tab. This is great as it matches clearing by eTLD+1 which we'd like to move to within the main cookies and site data management UI. However, the resulting confirmation dialog (Comment 7) shows all of the hosts that will be cleared. I suggest we follow this established pattern within the main cookies and site data management UI confirmation dialog. That is, we allow users to choose what to clear by eTLD+1 and display the full set of hosts that will be cleared by that action in the confirmation dialog.
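The suggested pattern (choose by eTLD+1, confirm with the full list of hosts) could look roughly like the Python sketch below. The function `hosts_for_confirmation` is a hypothetical name, and `base_domain` is a naive last-two-labels approximation of eTLD+1 used only for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def base_domain(host: str) -> str:
    """Naive eTLD+1: keep the last two labels (assumption; real code needs the Public Suffix List)."""
    return ".".join(host.split(".")[-2:])


def hosts_for_confirmation(selected_domains: Iterable[str],
                           known_hosts: Iterable[str]) -> List[str]:
    """List every host with stored data that falls under the selected eTLD+1 entries."""
    by_domain: Dict[str, List[str]] = defaultdict(list)
    for host in sorted(known_hosts):
        by_domain[base_domain(host)].append(host)
    # Keep the order of the user's selection; hosts within a domain are sorted.
    return [host for domain in selected_domains for host in by_domain.get(domain, [])]
```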
Comment 9•3 years ago
Updating priority, since we've identified this bug as MVP follow-up.
Comment 10•3 years ago
For both this bug and Bug 1629664 we will need to support the following in the ClearDataService (given example.com is the host to be cleared):
- For clearing `example.com` 1st-party storage and all storage partitioned under it, we need either of the following:
  - Update `deleteByHost` and storage implementations to support clearing all data by host across all origin attributes, including the partition key. This might conflict with cases where we want to delete by host and empty origin attributes explicitly. Though I'd argue that we should use `deleteByPrincipal` in these cases (if possible).
  - Update `deleteByHost` and storage implementations to support clearing all data by host with a wildcard for the partitionKey in the origin attributes: `{ partitionKey: "*" }`. This is only necessary for cleaner implementations whose storage backends actually use origin attributes to key items (such as cookies).
- For clearing all storage of example.com partitioned as a third-party we can use `deleteByOriginAttributes`. It needs to support wildcards in the partitionKey for scheme and port, like `(*, example.com, *)`.
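The wildcard matching on the partitionKey could work along these lines. This Python sketch is purely illustrative: the `(scheme, base domain, port)` tuple representation and the `matches_pattern` helper are assumptions for the example and do not describe the actual `deleteByOriginAttributes` implementation.

```python
from typing import Optional, Tuple, Union

WILDCARD = "*"

# Assumed representation of a partition key: (scheme, base domain, port).
PartitionKey = Tuple[str, str, Optional[int]]
Pattern = Tuple[Union[str, int, None], ...]


def matches_pattern(key: PartitionKey, pattern: Pattern) -> bool:
    """Compare component by component; a '*' in the pattern matches any value."""
    if len(key) != len(pattern):
        return False
    return all(p == WILDCARD or p == k for p, k in zip(pattern, key))
```

A pattern of `("*", "example.com", "*")` then matches partitions of `example.com` under any scheme and port, while a fully specified pattern still matches exactly one partition.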
Alternatively, we could also iterate over the storage items in ClearDataService ourselves, like we already do for some storage backends, like the permission manager: https://searchfox.org/mozilla-central/rev/0379f315c75a2875d716b4f5e1a18bf27188f1e6/toolkit/components/cleardata/ClearDataService.jsm#881,884
However, it looks like this is not supported by all storage implementations.
Johann, does this approach sound sensible or am I missing something obvious here?
Comment 11•3 years ago
I've worked on a small prototype for the cookie cleaner, and it seems to work well. For cases where the storage is keyed by origin with origin attributes (for example derived from a principal) we're probably best off iterating over it from the ClearDataService. That's slower, but then we can inspect the partitionKey without needing to add some universal wildcard `*` support. For some of the storages we might need to add an iterator / getAll method for that.
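The iteration approach can be sketched in Python as below. `InMemoryStore` is a hypothetical stand-in for a storage backend that exposes a getAll-style method, and `clear_partitioned_under` is an invented name; neither reflects the real ClearDataService or backend APIs.

```python
from typing import Dict, List, Optional, Tuple

PartitionKey = Optional[Tuple[str, str]]  # assumed form: (scheme, base domain), or None
Key = Tuple[str, PartitionKey]            # (origin, partition key)


class InMemoryStore:
    """Hypothetical storage backend keyed by origin plus partition key."""

    def __init__(self) -> None:
        self._items: Dict[Key, str] = {}

    def put(self, origin: str, partition_key: PartitionKey, value: str) -> None:
        self._items[(origin, partition_key)] = value

    def get_all(self) -> List[Key]:
        # Return a snapshot so callers can remove entries while looping.
        return list(self._items)

    def remove(self, origin: str, partition_key: PartitionKey) -> None:
        del self._items[(origin, partition_key)]


def clear_partitioned_under(store: InMemoryStore, base_domain: str) -> int:
    """Walk every entry and remove those partitioned under `base_domain`, ignoring scheme."""
    removed = 0
    for origin, partition_key in store.get_all():
        if partition_key is not None and partition_key[1] == base_domain:
            store.remove(origin, partition_key)
            removed += 1
    return removed
```

The cost is a full scan per clear, which is the slowdown mentioned above, but the backend only needs to expose enumeration and removal.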
Comment 12•3 years ago (Reporter)
@pbz: check out https://bugzilla.mozilla.org/show_bug.cgi?id=1696632#c20 as it relates to this work.
Comment 13•3 years ago
Release Note Request (optional, but appreciated)
[Why is this notable]:
Improved site data clearing in Firefox, including changes to the data clearing UI in about:preferences.
[Affects Firefox for Android]:
No
[Suggested wording]:
Building on Total Cookie Protection, we've added more comprehensive logic for clearing cookies that prevents hidden data leaks and makes it easy for users to understand which websites are storing local information.
[Links (documentation, blog post, etc)]: Blog post WIP
I realize it's late, but maybe we also want to add this to the 91 beta release notes?