Closed Bug 1646215 Opened 4 years ago Closed 3 years ago

[meta] Support managing site data by first-party site when dFPI is active

Categories

(Core :: Privacy: Anti-Tracking, enhancement, P2)

RESOLVED FIXED
91 Branch
Tracking Status
relnote-firefox --- 91+

People

(Reporter: englehardt, Unassigned)

References

(Depends on 1 open bug, Blocks 2 open bugs)

Details

(Keywords: meta)

Attachments

(3 files)

The site data manager is designed for a world where cookies and site data are defined by the origin of the resource and nothing else. This model has been a bit outdated for a while since it doesn't consider origin attributes, but the mismatch will become especially pronounced when dFPI ships.

In dFPI we have a concept of the storage that belongs to the site itself and all third-party storage that has been partitioned under that top-level site. I'm going to refer to this as the site's storage jar, for lack of a better word. Specifically this means all storage keyed off an origin from that site (i.e., scheme + eTLD+1), as well as any third-party storage that has its partitionKey origin attribute set to that site. As an example, first-party.example's storage jar would include cookies and site data that correspond to foo.first-party.example as well as tracker1.example^partitionKey=first-party.example and tracker2.example^partitionKey=first-party.example.
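To make the storage jar definition concrete, here is a minimal Python sketch of the membership test. It is an illustration only, not Firefox code: the `StorageEntry` type, the `in_storage_jar` function, and the `base_domain` helper are hypothetical names, scheme is ignored for brevity, and `base_domain` just takes the last two labels of the host instead of consulting the Public Suffix List.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class StorageEntry:
    """Hypothetical model of one piece of stored data."""
    host: str                      # host that owns the storage
    partition_site: Optional[str]  # site from partitionKey, None if unpartitioned


def base_domain(host: str) -> str:
    """Toy stand-in for eTLD+1: keeps the last two labels.

    Assumption for illustration; a real implementation needs the
    Public Suffix List (e.g. to handle "example.co.uk").
    """
    return ".".join(host.split(".")[-2:])


def in_storage_jar(entry: StorageEntry, site: str) -> bool:
    """Return True if the entry lives in the storage jar of `site`."""
    if entry.partition_site is None:
        # Unpartitioned storage belongs to the jar of its own site.
        return base_domain(entry.host) == site
    # Partitioned storage belongs to the jar of the top-level site.
    return entry.partition_site == site


entries = [
    StorageEntry("foo.first-party.example", None),
    StorageEntry("tracker1.example", "first-party.example"),
    StorageEntry("tracker2.example", "first-party.example"),
    StorageEntry("tracker1.example", None),
]
jar = [e for e in entries if in_storage_jar(e, "first-party.example")]
```

With these inputs, `jar` holds the first three entries; the unpartitioned `tracker1.example` entry belongs to tracker1.example's own jar.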

Right now, the site data manager allows users to clear storage for specific origins. So a user could choose to clear storage from foo.first-party.example and we'd clear foo.first-party.example's first-party storage but we would not clear tracker1.example^partitionKey=first-party.example or tracker2.example^partitionKey=first-party.example and so on.

We should re-think this experience around the site storage jar concept. Rather than display a bunch of individual first-party and third-party origins we could display the actual top-level sites that a user visits and have the mechanism clear storage for all third parties partitioned under those sites (as well as all first-party origins from those sites). This is similar to what Mike Perry imagined in Bug 565965 Comment 21.

Things to consider:

  1. We currently allow users to clear storage by individual origins. We'd have to move from origin to site in this new flow because third-party storage will be keyed off of site. That is, we won't know whether tracker1.example^partitionKey=foo.example was set on bar.foo.example or sub.foo.example. Exposing bar.foo.example and sub.foo.example as different choices doesn't make a lot of sense when we intend to also clear the partitioned third parties.
  2. We should decide whether clearing a site's storage jar should clear all partitioned storage from that site in addition to the storage jar itself. Should we reach into other sites' storage jars? As an example, this would mean that clearing storage from tracker.example would clear all of tracker.example's first-party storage, all storage matching tracker.example^partitionKey=<any site> and all storage matching <any origin>^partitionKey=tracker.example.

(1) means that users have less control, but I suspect that's okay. Given that cookies can be scoped to eTLD+1 this seems safer than the current design.

(2) is likely what we'll do for cookie purging. Not clearing partitioned storage across cookie jars allows a tracker to easily re-build/re-spawn the user's identity using data stored in their partitioned storage locations.
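The broader clearing rule from consideration (2) can be sketched as a single predicate. This is a rough Python illustration under stated assumptions: storage is modelled as hypothetical `(host, partition_site)` tuples, `should_clear` and `clear_site` are made-up names, and `base_domain` is a naive last-two-labels stand-in for a real eTLD+1 lookup.

```python
from typing import List, Optional, Tuple

# (host that owns the storage, top-level site it is partitioned under or None)
Entry = Tuple[str, Optional[str]]


def base_domain(host: str) -> str:
    """Naive eTLD+1 approximation (assumption; real code needs the PSL)."""
    return ".".join(host.split(".")[-2:])


def should_clear(entry: Entry, target: str) -> bool:
    host, partition_site = entry
    # Covers target's first-party storage and target^partitionKey=<any site>.
    owned_by_target = base_domain(host) == target
    # Covers <any origin>^partitionKey=target.
    partitioned_under_target = partition_site == target
    return owned_by_target or partitioned_under_target


def clear_site(storage: List[Entry], target: str) -> List[Entry]:
    """Return the entries that survive clearing `target`."""
    return [e for e in storage if not should_clear(e, target)]


storage = [
    ("tracker.example", None),                 # first-party storage
    ("tracker.example", "a.example"),          # tracker embedded on a.example
    ("analytics.example", "tracker.example"),  # third party inside tracker's jar
    ("analytics.example", "a.example"),        # unrelated, must survive
    ("a.example", None),                       # unrelated, must survive
]
remaining = clear_site(storage, "tracker.example")
```

Only the two entries that mention neither tracker.example as owner nor as partition site remain, which is the property that stops a tracker from re-spawning an identifier out of another site's jar.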

Summary: Support managing site data by first party when dFPI is active → Support managing site data by first-party site when dFPI is active

What's the relationship between 1. Clearing cookies and site data and 2. access granted to third party domains on a website?

If I clear cookies/site data (which clears both my cookies/site data for both origin AND third parties in this future state), do third parties still have access to my tracking activity while on the site?

Flags: needinfo?(senglehardt)

(In reply to Meridel from comment #1)

What's the relationship between 1. Clearing cookies and site data and 2. access granted to third party domains on a website?

For what's currently implemented: when you clear cookies and site data for an origin, we will clear all storage access permissions that have been granted to the origin being cleared. So if we were to clear facebook.com, we'd clear any storage access permissions that were granted when the user was visiting Facebook as a first party (the grants providing access to various third parties). We would not clear storage access grants that have been provided to facebook.com on other websites.

EDIT: It looks like we were incorrect here. The cookies and site data management UI doesn't seem to impact storage access permissions at all. We can of course change that in the future.

If I clear cookies/site data (which clears both my cookies/site data for both origin AND third parties in this future state), do third parties still have access to my tracking activity while on the site?

All third-party access grants that were previously active on the site you cleared data for will be deleted. These third parties would need to re-request storage access.

EDIT same as above, we were wrong in our understanding here; storage access permissions will be untouched, so third parties will still be able to access cross-site storage after cookie clears (of course it will be a new version of the storage, since the older stuff was deleted).

There's a question of whether we should delete the storage access grants that have been given to the site you've cleared (i.e., grants for which the site is a third party). As an example let's say you clear Facebook's cookies but Facebook has storage access on 10 other sites as a third party. Should we clear their access on those sites as well so they need to re-request? These access grants aren't a storage mechanism in themselves, and I can't imagine any practical attacks that could be used to re-identify the user by their previous storage access grants. It's more of a question of what the user would expect. I'm inclined to say we should clear them: it matches what we'd do for the site's third-party storage and the user really doesn't have any other means to revoke the permissions en masse.

Flags: needinfo?(senglehardt)

During our meeting today there was some confusion around what I meant in Comment 0 since it's quite technical. I've attached a simpler visual example. In this example the user has visited four sites as first parties (tracker.example, A.example, B.example, and C.example), and thus there are four separate storage jars. These sites loaded a variety of third parties (e.g., analytics.example, tracker.example, and video.example). Thus, these third parties have partitioned storage within the four storage jars. Note that tracker.example is visited directly as a first party and also embedded as a third party.

The diagram shows what I'm suggesting we clear when we clear cookies and site data for tracker.example.

We chatted a bit more on the topic of whether this UI should clear storage access permissions as well, and we think it makes sense to add that as part of the update. So when you clear facebook.com, we would clear all storage access permissions that have been granted on facebook.com as a first party (i.e., those that were granted to third parties embedded in facebook.com).

Depends on: 1649480

In summary, we've landed on the following functional changes:

  1. Display storage and cookies by registrable domain (eTLD+1) rather than origin. Note that dFPI partitions by site (scheme + eTLD+1), but I don't think we want to expose scheme here (e.g., show entries for both https://example.com and http://example.com). Instead, we should display by eTLD+1. (See Comment 1)
  2. Clear all cookies and storage that are associated with the eTLD+1 selected. We currently clear all storage associated with the host being cleared (ignoring any origin attributes), and we should continue to do that. In addition, we should clear storage that is partitioned under an eTLD+1 when that eTLD+1's data is cleared. Again this means we should ignore scheme when clearing things that include the eTLD+1 in their partitionKey attribute. For example, if the user clears example.com we'd clear third-party.example^partitionKey=https://example.com as well as third-party.example^partitionKey=http://example.com. (See Comment 3)
  3. Clear storage access permissions. If a user clears example.com we should clear both the storage access permissions granted to example.com on other websites, as well as storage access permissions granted on example.com to other third parties.
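Points (2) and (3) above could be sketched roughly as follows. This Python example is illustrative only: it parses the `host^partitionKey=scheme://site` notation used in this bug (not Firefox's actual origin-attribute serialization), models storage access grants as hypothetical `(third_party, first_party)` pairs, and uses a naive `base_domain` helper in place of a Public Suffix List lookup.

```python
from typing import List, Optional, Tuple
from urllib.parse import urlsplit


def base_domain(host: str) -> str:
    """Naive eTLD+1 approximation (assumption; real code needs the PSL)."""
    return ".".join(host.split(".")[-2:])


def parse_key(key: str) -> Tuple[str, Optional[str]]:
    """Split 'host^partitionKey=scheme://site' into (host, partition base domain)."""
    host, sep, partition = key.partition("^partitionKey=")
    if not sep:
        return host, None
    # Drop the scheme so http:// and https:// partitions are treated alike.
    partition_host = urlsplit(partition).hostname or partition
    return host, base_domain(partition_host)


def storage_to_clear(keys: List[str], target: str) -> List[str]:
    """Storage owned by `target` (any attributes) or partitioned under it."""
    selected = []
    for key in keys:
        host, partition = parse_key(key)
        if base_domain(host) == target or partition == target:
            selected.append(key)
    return selected


def permissions_to_clear(
    grants: List[Tuple[str, str]], target: str
) -> List[Tuple[str, str]]:
    """Grants given to `target` elsewhere, plus grants given on `target`."""
    return [g for g in grants if g[0] == target or g[1] == target]


keys = [
    "www.example.com",
    "third-party.example^partitionKey=https://example.com",
    "third-party.example^partitionKey=http://example.com",
    "third-party.example^partitionKey=https://other.example",
]
grants = [
    ("example.com", "news.example"),    # example.com has access on news.example
    ("widget.example", "example.com"),  # widget.example has access on example.com
    ("widget.example", "news.example"),
]
cleared_storage = storage_to_clear(keys, "example.com")
cleared_grants = permissions_to_clear(grants, "example.com")
```

Here `cleared_storage` contains the first three keys (both schemes of the partitioned entry are matched), and `cleared_grants` contains the first two grants.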

For point (1) of Comment 5 we also need to think about how to display things in the confirmation dialog (Comment 7). We need to decide whether to display things by eTLD+1 or hostname. Note that this dialog is displayed both when a user goes through the about:preferences#privacy UI and when a user clicks "Clear Cookies and Site Data..." from the Site Information Panel (Comment 6).

The "Clear Cookies and Site Data..." button (Comment 6) clears by eTLD+1 of the site open in the active tab. This is great as it matches clearing by eTLD+1 which we'd like to move to within the main cookies and site data management UI. However, the resulting confirmation dialog (Comment 7) shows all of the hosts that will be cleared. I suggest we follow this established pattern within the main cookies and site data management UI confirmation dialog. That is, we allow users to choose what to clear by eTLD+1 and display the full set of hosts that will be cleared by that action in the confirmation dialog.
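A small Python sketch of the proposed flow, where the list shows one row per eTLD+1 and the confirmation dialog expands the selection into the full set of hosts. The function names are hypothetical and `base_domain` is a naive last-two-labels approximation rather than a real Public Suffix List lookup.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def base_domain(host: str) -> str:
    """Naive eTLD+1 approximation (assumption; real code needs the PSL)."""
    return ".".join(host.split(".")[-2:])


def group_hosts(hosts: Iterable[str]) -> Dict[str, List[str]]:
    """Map each eTLD+1 (one row in the list UI) to the hosts it covers."""
    groups = defaultdict(set)
    for host in hosts:
        groups[base_domain(host)].add(host)
    return {site: sorted(members) for site, members in groups.items()}


def hosts_for_confirmation(
    groups: Dict[str, List[str]], selected_sites: Iterable[str]
) -> List[str]:
    """Expand the selected eTLD+1 rows into the hosts shown in the dialog."""
    hosts: List[str] = []
    for site in selected_sites:
        hosts.extend(groups.get(site, []))
    return hosts


groups = group_hosts(
    ["foo.example.com", "example.com", "bar.example.com", "tracker.example"]
)
dialog_hosts = hosts_for_confirmation(groups, ["example.com"])
```

Selecting the `example.com` row would list `bar.example.com`, `example.com`, and `foo.example.com` in the confirmation dialog.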

Updating priority, since we've identified this bug as MVP follow-up.

Priority: P3 → P2

For both this bug and Bug 1629664 we will need to support the following in the ClearDataService (given example.com is the host to be cleared):

  1. For clearing example.com 1st-party storage and all storage partitioned under it, we need either of the following:
    1. Update deleteByHost and storage implementations to support clearing all data by host across all origin attributes, including the partition key. This might conflict with cases where we want to delete by host and empty origin attributes explicitly. Though I'd argue that we should use deleteByPrincipal in these cases (if possible).
    2. Update deleteByHost and storage implementations to support clearing all data by host with a wildcard for the partitionKey in the origin attributes, for example { partitionKey: "*" }. This is only necessary for cleaner implementations whose storage backends actually use origin attributes to key items (such as cookies).
  2. For clearing all storage of example.com partitioned as a third-party we can use deleteByOriginAttributes. It needs to support wildcards in the partitionKey for scheme and port, like (*, example.com, *).
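The wildcard matching described in the list above, with a pattern like (*, example.com, *), could look roughly like this. The Python sketch models the partitionKey as a hypothetical `(scheme, base domain, port)` tuple of strings; it does not reflect how any existing ClearDataService method parses or matches origin attributes.

```python
from typing import Tuple

# Hypothetical decomposed partitionKey: (scheme, base domain, port).
PartitionKey = Tuple[str, str, str]

WILDCARD = "*"


def matches_pattern(pattern: PartitionKey, key: PartitionKey) -> bool:
    """True if every pattern field is a wildcard or equals the key field."""
    return len(pattern) == len(key) and all(
        p == WILDCARD or p == k for p, k in zip(pattern, key)
    )


pattern = (WILDCARD, "example.com", WILDCARD)
keys = [
    ("https", "example.com", "443"),
    ("http", "example.com", "8080"),
    ("https", "other.example", "443"),
]
matching = [k for k in keys if matches_pattern(pattern, k)]
```

The pattern matches both example.com keys regardless of scheme and port, and skips the other.example key.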

Alternatively, we could also iterate over the storage items in ClearDataService ourselves, like we already do for some storage backends, like the permission manager: https://searchfox.org/mozilla-central/rev/0379f315c75a2875d716b4f5e1a18bf27188f1e6/toolkit/components/cleardata/ClearDataService.jsm#881,884
However, it looks like this is not supported by all storage implementations.

Johann, does this approach sound sensible or am I missing something obvious here?

Flags: needinfo?(jhofmann)

I've worked on a small prototype for the cookie cleaner. That seems to work well. For cases where the storage is keyed by origin with origin attributes (for example derived from a principal) we're probably best off iterating over it from the ClearDataService. That's slower, but then we can inspect the partitionKey without needing to add some universal wildcard * support. For some of the storages we might need to add an iterator / getAll method for that.
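The iterate-and-inspect approach might be sketched like this. Everything here is hypothetical: the `Backend` protocol with `get_all` and `remove` stands in for whatever iterator / getAll method a storage implementation would expose, and `InMemoryBackend` exists only so the example runs on its own.

```python
from typing import Iterable, List, NamedTuple, Optional, Protocol


class Entry(NamedTuple):
    origin_host: str
    partition_site: Optional[str]  # site from partitionKey, None if unpartitioned


class Backend(Protocol):
    """Minimal interface a storage backend would need for this approach."""

    def get_all(self) -> Iterable[Entry]: ...

    def remove(self, entry: Entry) -> None: ...


class InMemoryBackend:
    """Toy backend used only to exercise the sketch."""

    def __init__(self, entries: Iterable[Entry]) -> None:
        self._entries: List[Entry] = list(entries)

    def get_all(self) -> List[Entry]:
        # Return a copy so callers can remove entries while iterating.
        return list(self._entries)

    def remove(self, entry: Entry) -> None:
        self._entries.remove(entry)


def clear_partitioned_under(backend: Backend, site: str) -> int:
    """Walk every entry, inspect its partition site, remove the matches."""
    removed = 0
    for entry in backend.get_all():
        if entry.partition_site == site:
            backend.remove(entry)
            removed += 1
    return removed


backend = InMemoryBackend([
    Entry("tracker.example", "example.com"),
    Entry("video.example", "example.com"),
    Entry("tracker.example", "other.example"),
    Entry("example.com", None),
])
removed_count = clear_partitioned_under(backend, "example.com")
```

The cost is a full scan per clear, which is the "slower" trade-off mentioned above, but the backend needs no wildcard support of its own.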

@pbz: check out https://bugzilla.mozilla.org/show_bug.cgi?id=1696632#c20 as it relates to this work.

Blocks: 1696632
Keywords: meta
Summary: Support managing site data by first-party site when dFPI is active → [meta] Support managing site data by first-party site when dFPI is active
Depends on: 1705028
Depends on: 1705029
Depends on: 1705030
Depends on: 1705032
Depends on: 1705033
Depends on: 1705034
Depends on: 1705035
Depends on: 1705036
Blocks: DynamicFirstPartyIsolation
No longer blocks: dfpi-mvp-ui
Depends on: 1709621
Depends on: 1709623
Depends on: 1709624
Depends on: 1711759
Depends on: 1541885
Depends on: 1711869
Depends on: 1712170
Depends on: 1713139
Flags: needinfo?(jhofmann)
Depends on: 1714608
Depends on: 1715499
Depends on: 1715823
Depends on: 1717463
Depends on: 1717602
Depends on: 1718091
Depends on: 1719864
Depends on: 1719867
Depends on: 1720266

Release Note Request (optional, but appreciated)
[Why is this notable]:
Improved site data clearing in Firefox, including changes to the data clearing UI in about:preferences.
[Affects Firefox for Android]:
No
[Suggested wording]:
Building on Total Cookie Protection, we've added more comprehensive logic for clearing cookies that prevents hidden data leaks and makes it easy for users to understand which websites are storing local information.

[Links (documentation, blog post, etc)]: Blog post WIP

I realize it's late, but maybe we also want to add this to the 91 beta release notes?

relnote-firefox: --- → ?

This was included in the final 91.0 release notes.

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 91 Branch
See Also: → 1629667