Closed Bug 1348733 Opened 3 years ago Closed 3 years ago

List sites using quota storage or appcache in Settings of Site Data

Categories

(Firefox :: Preferences, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
Firefox 55
Tracking Status
firefox55 --- fixed

People

(Reporter: Fischer, Assigned: Fischer)

References

(Depends on 1 open bug)

Details

(Whiteboard: [storage-v1])

Attachments

(2 files)

Currently we only list sites requesting persistent storage permission. We should change that to list sites as long as using quota storage.
No longer depends on: 1312349, 1313003
Summary: List sites using quota storage → List sites using quota storage in Settings of SIte Data
Depends on: 1348660
Summary: List sites using quota storage in Settings of SIte Data → List sites using quota storage in Settings of Site Data
Comment on attachment 8852806 [details]
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data,

Hi Gijs,
Previously we retrieved and listed sites from the Permission Manager. This patch now utilizes the new getUsage[1] API from the nsIQuotaManagerService to retrieve and list sites using quota storage.

Two things to explain:
1. The new getUsage API will return an array as the result inside the onUsageResult callback. Each item of that array contains each site's origin and usage and persisted(persistent storage) status. If no site, then an empty array is returned.

2. You will see both the in-content-old and the in-content folder have the browser_advanced_siteData.js test. This is because Bug 1343682 has created the in-content-old for Preferences Reorg. The updates are exactly the same so you only have to review one of them.


[1] https://dxr.mozilla.org/mozilla-central/source/dom/quota/nsIQuotaManagerService.idl#56
[2] TRY: https://treeherder.mozilla.org/#/jobs?repo=try&revision=f81bd73d76dfa38e16c9e83a41108c4765d3ad61

Thanks
Attachment #8852806 - Flags: review?(gijskruitbosch+bugs)
Assignee: nobody → fliu
Comment on attachment 8852806 [details]
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data,

https://reviewboard.mozilla.org/r/124966/#review127624

I think the mixture of usage of origins and principals here is problematic. There needs to be more clarity about what's really required for what operation.

::: browser/components/preferences/SiteDataManager.jsm:25
(Diff revision 1)
>  
>    _diskCache: Services.cache2.diskCacheStorage(Services.loadContextInfo.default, false),
>  
>    _appCache: Cc["@mozilla.org/network/application-cache-service;1"].getService(Ci.nsIApplicationCacheService),
>  
> -  // A Map of sites using the persistent-storage API (have requested persistent-storage permission)
> +  // A Map of sites using usage in Quota Manager

English nit: "A Map of sites and their disk usage according to Quota Manager".

::: browser/components/preferences/SiteDataManager.jsm:52
(Diff revision 1)
> -    let e = Services.perms.enumerator;
> -    while (e.hasMoreElements()) {
> +            let principal = Services.scriptSecurityManager.createCodebasePrincipal(
> +              Services.io.newURI(site.origin), {});

Why are we creating principals from the origins?

Also, does this mean storage is shared between 2 instances of an origin with different origin attributes? Because that would be Bad. If you don't know, can you check with Jan?

If it's just summing up usage between different OA for the same origin, that should be fine, but even then we really don't need to create principals for our internal maps. Just the URI is enough. I only see us using a principal to remove data for an origin, but that mismatch (we got usage for a string origin, not for that specific principal) seems concerning.

Another question: will it be possible for there to be 2 quota results for the same origin, one for persistent usage and one for non-persistent usage? If so, this won't currently work with the code as-is as the keys in the map will be the same (so we'll overwrite part of the data in the map and not display its usage in the treeview).

::: browser/components/preferences/in-content-old/tests/browser_advanced_siteData.js:121
(Diff revision 1)
> -    SiteDataManager.getSites = this._originalGetSites;
> +    SiteDataManager._getQuotaUsage = this._originalGetQuotaUsage;
> +    SiteDataManager._removeQuotaUsage = this._originalRemoveQuotaUsage;
>    }
>  };
>  
>  function addPersistentStoragePerm(origin) {

Quite some, but not all, of the callsites of this were removed. Can you elaborate on why that was the case?
Attachment #8852806 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8852806 [details]
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data,

https://reviewboard.mozilla.org/r/124964/#review130334

::: browser/components/preferences/SiteDataManager.jsm:169
(Diff revision 3)
>                          usage += cache.dataSize;
>                        }
>                        list.push({
>                          usage,
> -                        status: site.status,
> -                        uri: NetUtil.newURI(origin)
> +                        origin,
> +                        host: site.principal.URI.host,

No need to create uri here. Host is already available. And origin is one specific key to one specific site. Returning host and origin is enough.
Comment on attachment 8852806 [details]
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data,

(In reply to :Gijs from comment #3)

Thanks for bringing up these questions. That reminds a lot of key points.
Please notice that because of the bug 1349051 and Pref Reorg, the test files have been split into:
- updates for the old Preferences tests is:
    in-content-old/tests/browser_advanced_siteData.js   
- updates for the new Preferences tests are:
    in-content/tests/browser_siteData.js
    in-content/tests/browser_siteData2.js
    in-content/tests/head.js

> Comment on attachment 8852806 [details]
> Bug 1348733 - List sites using quota storage in Settings of Site Data
> https://reviewboard.mozilla.org/r/124966/#review127624
> > +  // A Map of sites using usage in Quota Manager
> English nit: "A Map of sites and their disk usage according to Quota Manager".
Thanks, updated.

> ::: browser/components/preferences/SiteDataManager.jsm:52
> (Diff revision 1)
> > -    let e = Services.perms.enumerator;
> > -    while (e.hasMoreElements()) {
> > +            let principal = Services.scriptSecurityManager.createCodebasePrincipal(
> > +              Services.io.newURI(site.origin), {});
> 
> Why are we creating principals from the origins?
> 
> Also, does this mean storage is shared between 2 instances of an origin with
> different origin attributes? Because that would be Bad. If you don't know,
> can you check with Jan?
> 
Checked with the DOM team, underlying, no. Differernt origin attributes mean differernt storages.
Actually the `getUsage` api of QuotaManager would return origin with origin attribute suffix as well, for example, https://www.foo.com^userContextId=2.
Have updated to `let principal = Services.scriptSecurityManager.createCodebasePrincipalFromOrigin(site.origin);`. So we are making sure creating specific principal for specific origin including origin attribute.
And updated the test data to test origin with origin attribute suffix.
One correction was also made: call `clearStoragesForPrincipal(site.principal, null, false);`. The 3rd parameter should be false so we are clearing that specific origin; otherwise true means clearing all storages under the given origin disregarding origin attribute.

> If it's just summing up usage between different OA for the same origin, that
> should be fine, but even then we really don't need to create principals for
> our internal maps. Just the URI is enough.
> I only see us using a principal to remove data for an origin, but that mismatch (we got usage for a string
> origin, not for that specific principal) seems concerning.
> 
Yes, the current matching approach seems concerning. In fact, the matching keypoint should be host. Like, match http cache asset of https://www.abc.com/img/image.jpg with https://www.abc.com. Have updated to match based on host.

> Another question: will it be possible for there to be 2 quota results for
> the same origin, one for persistent usage and one for non-persistent usage?
> If so, this won't currently work with the code as-is as the keys in the map
> will be the same (so we'll overwrite part of the data in the map and not
> display its usage in the treeview).
> 
No. One origin only can have one status, persistent or not persistent.

> :::
> browser/components/preferences/in-content-old/tests/browser_advanced_siteData.js:121 (Diff revision 1)
> >  function addPersistentStoragePerm(origin) {
> 
> Quite some, but not all, of the callsites of this were removed. Can you
> elaborate on why that was the case?
This is because right now we switch to Quota Manager for getting sites to display, not getting from Permission Manager any more. However, some old test cases get sites and do tests on UI by adding some fake origins into Permission Manager as fake test data. That old test approach is no longer applicable so we remove it and take the mockSiteDataManager as the fake test data source.
Attachment #8852806 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8852806 [details]
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data,

https://reviewboard.mozilla.org/r/124966/#review130422

OK, I stopped reviewing without reviewing the tests because I think the abstractions here are wrong, in addition to issues with the implementation of the existing abstractions meaning the code already doesn't function as intended. The tests should have caught this, so at a minimum they should be updated to deal with that issue, too.

The main issue is that we're creating a list of usage keyed by origin, but we then display that list based on hosts, and don't use origin attributes or protocols or ports. So it's possible to have an item occur in the list multiple times for different protocols, different ports, and different origin attributes (containers). Then in other places we use URIs and principals, and worse, we're using broken string matching to convert/match one identifying type for storage to another. We should take a step back and be very careful about what is a sensible thing to key/collapse items into.

This should take into consideration both the UI and the implementation efficiencies given the APIs that the quota service and the appcache / other data sources provide. The current implementation is extremely inefficient in some places right now. Ideally we should be able to avoid that.

I see at least 3 possible options here (where I've not fully thought through the pros and cons of each, so please improve on them as necessary!), but there might be more:

1) site data manager API should abstract things per protocol+host+port combination (uri.prePath). Then we can use that as an identifier and show it in the UI (as it's user-decipherable, unlike the origin attributes which might be included in origin). The site data manager jsm should be responsible for clearing *all* the stored data for that item, from any origin attribute, for summing up the totals across OA/private browsing items, etc.
2) show a separate column and/or separate columnS for OA bits so that users can remove data from specific containers / OA items.
3) Filter out items based on the OA (so if you open preferences in a particular container, we only show items from that container - this includes the default container and private browsing as 'separate' from the other contexts!)

What do we do for cookies right now? Or for other per-OA items that we have an overall list for?

::: browser/components/preferences/SiteDataManager.jsm:97
(Diff revision 3)
>  
>    _updateAppCache() {
>      let groups = this._appCache.getGroups();
>      for (let site of this._sites.values()) {
>        for (let group of groups) {
>          let uri = Services.io.newURI(group);

If there are 100 sites, you'll construct each group's URI 100 times. Better to put the group loop outside the sites loop.

::: browser/components/preferences/SiteDataManager.jsm:98
(Diff revision 3)
>    _updateAppCache() {
>      let groups = this._appCache.getGroups();
>      for (let site of this._sites.values()) {
>        for (let group of groups) {
>          let uri = Services.io.newURI(group);
> -        if (site.perm.matchesURI(uri, true)) {
> +        if (site.principal.origin.includes(uri.host)) {

This check is wrong right now because if you had an origin of `foobar.com` and a uri.host of `bar.com`, it would match, even though they're clearly different sites. There would also be problems for identical domains but different protocols or different ports. You should be careful to check the entire origin, *potentially* minus the origin attributes (ie everything after the `^`). I would expect nsIURI and/or nsIPrincipal to have methods for doing this - we shouldn't need to resort to hand-rolling string checks.

Getting these kinds of checks wrong in this kind of code can lead to dataloss or (outside of the preferences) security issues. Please be cautious when writing it.

::: browser/components/preferences/SiteDataManager.jsm:113
(Diff revision 3)
>        if (this._sites.size) {
>          let sites = this._sites;
>          let visitor = {
>            onCacheEntryInfo(uri, idEnhance, dataSize) {
>              for (let site of sites.values()) {
> -              if (site.perm.matchesURI(uri, true)) {
> +              if (site.principal.origin.includes(uri.host)) {

Same comment as for the `_updateAppCache case`.

::: browser/components/preferences/SiteDataManager.jsm:127
(Diff revision 3)
>            },
>            onCacheEntryVisitCompleted() {
>              resolve();
>            }
>          };
>          this._diskCache.asyncVisitStorage(visitor, true);

Generally this seems very inefficient. For every disk cache entry you will loop over all sites in `this._sites`, even though you should be able to identify the right sites based on the hostnames.
Attachment #8852806 - Flags: review?(gijskruitbosch+bugs) → review-
(In reply to :Gijs from comment #8)
> Comment on attachment 8852806 [details]
> Bug 1348733 - List sites using quota storage in Settings of Site Data
> 
> https://reviewboard.mozilla.org/r/124966/#review130422
> 
> The main issue is that we're creating a list of usage keyed by origin, but
> we then display that list based on hosts, and don't use origin attributes or
> protocols or ports. So it's possible to have an item occur in the list
> multiple times for different protocols, different ports, and different
> origin attributes (containers). Then in other places we use URIs and
> principals, and worse, we're using broken string matching to convert/match
> one identifying type for storage to another. We should take a step back and
> be very careful about what is a sensible thing to key/collapse items into.
> 
One reason of not showing protocol is that the Persistent Storage Web API (navigator.storage.persist) is only allowed within secure context, please see StorageManager.webidl at [1]. Expecting only `https://` so not displaying protocol to reduce complexity of info delivered. About port, usually not seeing port in website origins and users might be unfamiliar with port so not displaying port to reduce complexity of info delivered. However, if there were some key points to consider, please bring them up, should discuss to make improvements.

As for origin attributes, told the UX about this issue already(Not yet file a follow-up bug. Will do), like how to make user know different container from the UI. It would be confusing to display https://www.foo.com^userContextId=2. No one would know userContextId=2 means work container. This is a todo item for the v2 specs as well.

[1] https://bugzilla.mozilla.org/page.cgi?id=splinter.html&bug=1286717&attachment=8854499

> This should take into consideration both the UI and the implementation
> efficiencies given the APIs that the quota service and the appcache / other
> data sources provide. The current implementation is extremely inefficient in
> some places right now. Ideally we should be able to avoid that.
> 
> I see at least 3 possible options here (where I've not fully thought through
> the pros and cons of each, so please improve on them as necessary!), but
> there might be more:
> 
> 1) site data manager API should abstract things per protocol+host+port
> combination (uri.prePath). Then we can use that as an identifier and show it
> in the UI (as it's user-decipherable, unlike the origin attributes which
> might be included in origin). The site data manager jsm should be
> responsible for clearing *all* the stored data for that item, from any
> origin attribute, for summing up the totals across OA/private browsing
> items, etc.
Http cache and appcache probably not problems because easy to retrieve(just reload) and website wouldn't be broken down without them(plus considering appcache is deprecated). IndexedDB could be a discussion topic since more likely website would store important data in IndexedDB and what should be our removal policy for IndexedDB across different container types. One of pros is that this may be relatively simple to understand; only one protocol+host+port for one site and operation is applied to all containers. One of cons may be sacrificing the separation feature provided by container.

> 2) show a separate column and/or separate columnS for OA bits so that users
> can remove data from specific containers / OA items.
This is an option for the v2 specs. Maybe could combine container color for a visual hint. And not only userContextId but also other attributes such as privateBrowsingId to think about.

> 3) Filter out items based on the OA (so if you open preferences in a
> particular container, we only show items from that container - this includes
> the default container and private browsing as 'separate' from the other
> contexts!)
> 
Currently different container type is having different color on tab, like orange for work container. So I would say to some degree this approach would give user a consistent hint that your are editing Site Data settings in Preferences under a orange-tab container so only seeing sites under orange-tab container. The issues could be that other settings in Preferences may not be container-specific and by default Preferences is opened in default container (difficult for user to open a "orange-tab" Preferences)

> What do we do for cookies right now? Or for other per-OA items that we have an overall list for?
> 
Currently not considering OA. If we were removing IndexedDB per OA base, probably the same policy could applied to cookies, because we are respecting OA. If removing *All* IndexedDB disregarding OA, then removing all cookies for all OA would be consistent. Based on the current v1 specs, probably removing *All* policy is more suitable since no proper method to distinguish container/OA in the UI right now. That is, the option 1 above, collect sites with different OA under the same uri.prePath, user only sees one item(uri.prePath) on the list. And operation on that item would take effect for all OA types. How do you think about this topic?

> ::: browser/components/preferences/SiteDataManager.jsm:98
> (Diff revision 3)
> >    _updateAppCache() {
> >      let groups = this._appCache.getGroups();
> >      for (let site of this._sites.values()) {
> >        for (let group of groups) {
> >          let uri = Services.io.newURI(group);
> > -        if (site.perm.matchesURI(uri, true)) {
> > +        if (site.principal.origin.includes(uri.host)) {
> 
> This check is wrong right now because if you had an origin of `foobar.com`
> and a uri.host of `bar.com`, it would match, even though they're clearly
> different sites. There would also be problems for identical domains but
> different protocols or different ports. You should be careful to check the
> entire origin, *potentially* minus the origin attributes (ie everything
> after the `^`). I would expect nsIURI and/or nsIPrincipal to have methods
> for doing this - we shouldn't need to resort to hand-rolling string checks.
> 
> Getting these kinds of checks wrong in this kind of code can lead to
> dataloss or (outside of the preferences) security issues. Please be cautious
> when writing it.
> 
Thanks for pointing out this matching issue, will fix it. 
The reasons of collecting and removing http cache and appcache based on host(a loose condition) are:
  - The current UI design guideline[2] in the standard suggests removing based on schemeless origin group,
  - Removing http cache and appcache is less harmful for websites. Would not make sites broken down (For appcache, it affects the offline case. But, since could not do many things without internet and appcache is deprecated, the impact probably isn't high).
  - For website of https://www.foo.com, could have http cache of http://www.foo.com/img/logo.jpg

[2] https://storage.spec.whatwg.org/#ui-guidelines
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Fischer [:Fischer] from comment #9)
> (In reply to :Gijs from comment #8)
> > Comment on attachment 8852806 [details]
> > Bug 1348733 - List sites using quota storage in Settings of Site Data
> > 
> > https://reviewboard.mozilla.org/r/124966/#review130422
> > 
> > The main issue is that we're creating a list of usage keyed by origin, but
> > we then display that list based on hosts, and don't use origin attributes or
> > protocols or ports. So it's possible to have an item occur in the list
> > multiple times for different protocols, different ports, and different
> > origin attributes (containers). Then in other places we use URIs and
> > principals, and worse, we're using broken string matching to convert/match
> > one identifying type for storage to another. We should take a step back and
> > be very careful about what is a sensible thing to key/collapse items into.
> > 
> One reason of not showing protocol is that the Persistent Storage Web API
> (navigator.storage.persist) is only allowed within secure context, please
> see StorageManager.webidl at [1].

Sure, but we're showing both persistent and non-persistent data, and insecure contexts can have non-persistent data, right? So the result is you can have two separate origins (for the secure and insecure contexts).

> Expecting only `https://` so not
> displaying protocol to reduce complexity of info delivered.

That would be fine, but then you need to do the work to unify the origins and sum up their respective quota usages.

> About port,
> usually not seeing port in website origins and users might be unfamiliar
> with port so not displaying port to reduce complexity of info delivered.
> However, if there were some key points to consider, please bring them up,
> should discuss to make improvements.

The port would only be shown if not using the default - for an nsIURI of https://example.com/, hostPort or prePath are both going to use example.com/, not example.com:443/ .

If that's still objectionable, just like in the protocol case, we need to do the work of summing up the usages.

> 
> As for origin attributes, told the UX about this issue already(Not yet file
> a follow-up bug. Will do), like how to make user know different container
> from the UI. It would be confusing to display
> https://www.foo.com^userContextId=2. No one would know userContextId=2 means
> work container. This is a todo item for the v2 specs as well.

Which, once more, would be fine, but then we need to do the work to ensure we unify the entries, and remove *all* the relevant origins (protocols, ports and OA versions) when the user elects to remove an entry.



> > 1) site data manager API should abstract things per protocol+host+port
> > combination (uri.prePath). Then we can use that as an identifier and show it
> > in the UI (as it's user-decipherable, unlike the origin attributes which
> > might be included in origin). The site data manager jsm should be
> > responsible for clearing *all* the stored data for that item, from any
> > origin attribute, for summing up the totals across OA/private browsing
> > items, etc.
> Http cache and appcache probably not problems because easy to retrieve(just
> reload) and website wouldn't be broken down without them(plus considering
> appcache is deprecated). IndexedDB could be a discussion topic since more
> likely website would store important data in IndexedDB and what should be
> our removal policy for IndexedDB across different container types. One of
> pros is that this may be relatively simple to understand; only one
> protocol+host+port for one site and operation is applied to all containers.
> One of cons may be sacrificing the separation feature provided by container.

Right, but fixing that is v2, you just said... There's no point showing 2 entries that the user can't distinguish because there's nothing from which they can tell which is which...

> > 3) Filter out items based on the OA (so if you open preferences in a
> > particular container, we only show items from that container - this includes
> > the default container and private browsing as 'separate' from the other
> > contexts!)
> > 
> Currently different container type is having different color on tab, like
> orange for work container. So I would say to some degree this approach would
> give user a consistent hint that your are editing Site Data settings in
> Preferences under a orange-tab container so only seeing sites under
> orange-tab container. The issues could be that other settings in Preferences
> may not be container-specific and by default Preferences is opened in
> default container (difficult for user to open a "orange-tab" Preferences)

Yeah, so I expect (1) is the best.

> > What do we do for cookies right now? Or for other per-OA items that we have an overall list for?
> > 
> Currently not considering OA. If we were removing IndexedDB per OA base,
> probably the same policy could applied to cookies, because we are respecting
> OA. 

I'm confused. We show individual cookies, right? What happens if we have identically named cookies for example.com in 2 containers? Do they show up separately? Can the user distinguish them somehow?

> If removing *All* IndexedDB disregarding OA, then removing all cookies
> for all OA would be consistent. Based on the current v1 specs, probably
> removing *All* policy is more suitable since no proper method to distinguish
> container/OA in the UI right now. That is, the option 1 above, collect sites
> with different OA under the same uri.prePath, user only sees one
> item(uri.prePath) on the list. And operation on that item would take effect
> for all OA types. How do you think about this topic?

This makes sense to me.

> > ::: browser/components/preferences/SiteDataManager.jsm:98
> > (Diff revision 3)
> > >    _updateAppCache() {
> > >      let groups = this._appCache.getGroups();
> > >      for (let site of this._sites.values()) {
> > >        for (let group of groups) {
> > >          let uri = Services.io.newURI(group);
> > > -        if (site.perm.matchesURI(uri, true)) {
> > > +        if (site.principal.origin.includes(uri.host)) {
> > 
> > This check is wrong right now because if you had an origin of `foobar.com`
> > and a uri.host of `bar.com`, it would match, even though they're clearly
> > different sites. There would also be problems for identical domains but
> > different protocols or different ports. You should be careful to check the
> > entire origin, *potentially* minus the origin attributes (ie everything
> > after the `^`). I would expect nsIURI and/or nsIPrincipal to have methods
> > for doing this - we shouldn't need to resort to hand-rolling string checks.
> > 
> > Getting these kinds of checks wrong in this kind of code can lead to
> > dataloss or (outside of the preferences) security issues. Please be cautious
> > when writing it.
> > 
> Thanks for pointing out this matching issue, will fix it. 
> The reasons of collecting and removing http cache and appcache based on
> host(a loose condition) are:
>   - The current UI design guideline[2] in the standard suggests removing
> based on schemeless origin group,
>   - Removing http cache and appcache is less harmful for websites. Would not
> make sites broken down (For appcache, it affects the offline case. But,
> since could not do many things without internet and appcache is deprecated,
> the impact probably isn't high).
>   - For website of https://www.foo.com, could have http cache of
> http://www.foo.com/img/logo.jpg

Yes, just using the host (and deleting things for all OA, all ports, and all protocols) is fine, but we need to actually use the host properly. We should probably keep a normalized host per entry and a list of actual origins that need to be cleared, or something.
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to :Gijs from comment #10)
> (In reply to Fischer [:Fischer] from comment #9)
> 
> > About port,
> > usually not seeing port in website origins and users might be unfamiliar
> > with port so not displaying port to reduce complexity of info delivered.
> > However, if there were some key points to consider, please bring them up,
> > should discuss to make improvements.
> 
> The port would only be shown if not using the default - for an nsIURI of
> https://example.com/, hostPort or prePath are both going to use
> example.com/, not example.com:443/ .
> 
> If that's still objectionable, just like in the protocol case, we need to do
> the work of summing up the usages.
> 
This is ok. Probably only a small group of people would see port since port is rarely seen on web pages and not really so scary.
And it would align with the option 1(per protocol+host+port).
 
> > > What do we do for cookies right now? Or for other per-OA items that we have an overall list for?
> > > 
> > Currently not considering OA. If we were removing IndexedDB per OA base,
> > probably the same policy could applied to cookies, because we are respecting
> > OA. 
> 
> I'm confused. We show individual cookies, right? What happens if we have
> identically named cookies for example.com in 2 containers? Do they show up
> separately? Can the user distinguish them somehow?
> 
We don't list individual cookies in Site Data Settings. We only list the sites under which we will remove cookies when doing removal operation.
About the Cookies dialog, as my try, suppose
 - https://www.foo.com saves a cookie of pet=dog
 - https://www.foo.com^userContextId=2 saves a cookie of pet=dog
then the Cookies dialog would list
 - www.foo.com
   - www.foo.com ,   pet
   - www.foo.com ,   pet

Which separate but unable to distinguish each other.

> > If removing *All* IndexedDB disregarding OA, then removing all cookies
> > for all OA would be consistent. Based on the current v1 specs, probably
> > removing *All* policy is more suitable since no proper method to distinguish
> > container/OA in the UI right now. That is, the option 1 above, collect sites
> > with different OA under the same uri.prePath, user only sees one
> > item(uri.prePath) on the list. And operation on that item would take effect
> > for all OA types. How do you think about this topic?
> 
> This makes sense to me.
OK, so the option 1 is most suitable for the scope of the v1 spec.

> > > ::: browser/components/preferences/SiteDataManager.jsm:98
> > > (Diff revision 3)
> > > >    _updateAppCache() {
> > > >      let groups = this._appCache.getGroups();
> > > >      for (let site of this._sites.values()) {
> > > >        for (let group of groups) {
> > > >          let uri = Services.io.newURI(group);
> > > > -        if (site.perm.matchesURI(uri, true)) {
> > > > +        if (site.principal.origin.includes(uri.host)) {
> > > 
> > > This check is wrong right now because if you had an origin of `foobar.com`
> > > and a uri.host of `bar.com`, it would match, even though they're clearly
> > > different sites. There would also be problems for identical domains but
> > > different protocols or different ports. You should be careful to check the
> > > entire origin, *potentially* minus the origin attributes (ie everything
> > > after the `^`). I would expect nsIURI and/or nsIPrincipal to have methods
> > > for doing this - we shouldn't need to resort to hand-rolling string checks.
> > > 
> > > Getting these kinds of checks wrong in this kind of code can lead to
> > > dataloss or (outside of the preferences) security issues. Please be cautious
> > > when writing it.
> > > 
> > Thanks for pointing out this matching issue, will fix it. 
> > The reasons of collecting and removing http cache and appcache based on
> > host(a loose condition) are:
> >   - The current UI design guideline[2] in the standard suggests removing
> > based on schemeless origin group,
> >   - Removing http cache and appcache is less harmful for websites. Would not
> > make sites broken down (For appcache, it affects the offline case. But,
> > since could not do many things without internet and appcache is deprecated,
> > the impact probably isn't high).
> >   - For website of https://www.foo.com, could have http cache of
> > http://www.foo.com/img/logo.jpg
> 
> Yes, just using the host (and deleting things for all OA, all ports, and all
> protocols) is fine, but we need to actually use the host properly. We should
> probably keep a normalized host per entry and a list of actual origins that
> need to be cleared, or something.
What do you mean by "keep a normalized host per entry and a list of actual origins that need to be cleared, or something."?
Would you please elaborate on this, thank you.
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Fischer [:Fischer] from comment #11)
> > > > ::: browser/components/preferences/SiteDataManager.jsm:98
> > > > (Diff revision 3)
> > > > >    _updateAppCache() {
> > > > >      let groups = this._appCache.getGroups();
> > > > >      for (let site of this._sites.values()) {
> > > > >        for (let group of groups) {
> > > > >          let uri = Services.io.newURI(group);
> > > > > -        if (site.perm.matchesURI(uri, true)) {
> > > > > +        if (site.principal.origin.includes(uri.host)) {
> > > > 
> > > > This check is wrong right now because if you had an origin of `foobar.com`
> > > > and a uri.host of `bar.com`, it would match, even though they're clearly
> > > > different sites. There would also be problems for identical domains but
> > > > different protocols or different ports. You should be careful to check the
> > > > entire origin, *potentially* minus the origin attributes (ie everything
> > > > after the `^`). I would expect nsIURI and/or nsIPrincipal to have methods
> > > > for doing this - we shouldn't need to resort to hand-rolling string checks.
> > > > 
> > > > Getting these kinds of checks wrong in this kind of code can lead to
> > > > dataloss or (outside of the preferences) security issues. Please be cautious
> > > > when writing it.
> > > > 
> > > Thanks for pointing out this matching issue, will fix it. 
> > > The reasons of collecting and removing http cache and appcache based on
> > > host(a loose condition) are:
> > >   - The current UI design guideline[2] in the standard suggests removing
> > > based on schemeless origin group,
> > >   - Removing http cache and appcache is less harmful for websites. Would not
> > > make sites broken down (For appcache, it affects the offline case. But,
> > > since could not do many things without internet and appcache is deprecated,
> > > the impact probably isn't high).
> > >   - For website of https://www.foo.com, could have http cache of
> > > http://www.foo.com/img/logo.jpg
> > 
> > Yes, just using the host (and deleting things for all OA, all ports, and all
> > protocols) is fine, but we need to actually use the host properly. We should
> > probably keep a normalized host per entry and a list of actual origins that
> > need to be cleared, or something.
> What do you mean by "keep a normalized host per entry and a list of actual
> origins that need to be cleared, or something."?
> Would you please elaborate on this, thank you.

Well, we need to show only a single thing per host(+port) in the dialog, but there might be multiple entries (for different OA and/or protocol and/or port) as far as the quota manager and other APIs are concerned. So we'll need to be able to map from the entry that the user selects to remove, and the actual items we then need to pass to the relevant APIs to remove that data/permissions.
Flags: needinfo?(gijskruitbosch+bugs)
Comment on attachment 8852806 [details]
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data,

https://reviewboard.mozilla.org/r/124966/#review131436

::: browser/components/preferences/SiteDataManager.jsm:57
(Diff revision 4)
> -    let perm = null;
> -    let status = null;
> -    let e = Services.perms.enumerator;
> -    while (e.hasMoreElements()) {
> -      perm = e.getNext();
> -      status = Services.perms.testExactPermissionFromPrincipal(perm.principal, "persistent-storage");
> +        .then(results => {
> +          for (let result of results) {
> +            let principal =
> +              Services.scriptSecurityManager.createCodebasePrincipalFromOrigin(result.origin);
> +            let uri = principal.URI;
> +            if (uri.scheme == "http" || uri.scheme == "https") {

During testing, saw "about:" protocol returned, not yet find this is by design or bug of `getUage` API (will check with the DOM team). However, no matter it is by design or a bug, we only want to display "http" and "https" so adding a check here

::: browser/components/preferences/SiteDataManager.jsm:205
(Diff revision 4)
> -    this._qms.clearStoragesForPrincipal(site.perm.principal, null, true);
> +    return new Promise(resolve => {
> +      // We are clearing *All* across OAs so need a principal without suffix here,
> +      // or the call of `clearStoragesForPrincipal` would fail.
> +      let principal = Services.scriptSecurityManager
> +                              .createCodebasePrincipalFromOrigin(site.principals[0].originNoSuffix);
> +      let request = this._qms.clearStoragesForPrincipal(principal, null, true);

As the code comment, when clearing across OA the `clearStoragesForPrincipal` function only accpet default OA [1] so using `originNoSuffix` to create principal.

[1] https://dxr.mozilla.org/mozilla-central/rev/f914d40a48009c5acd1093e9939cc0ec035696dd/dom/quota/QuotaManagerService.cpp#681

::: browser/components/preferences/siteDataSettings.js:146
(Diff revision 4)
>  
> -      let statusStrId = data.status === Ci.nsIPermissionManager.ALLOW_ACTION ? "important" : "default";
> -      let size = DownloadUtils.convertByteUnits(data.usage);
> +      let statusStrId = site.status === Ci.nsIPermissionManager.ALLOW_ACTION ? "important" : "default";
> +      let size = DownloadUtils.convertByteUnits(site.usage);
>        let item = document.createElement("richlistitem");
> -      item.setAttribute("data-origin", data.uri.spec);
> -      item.setAttribute("host", host);
> +      item.setAttribute("data-origin", origin);
> +      item.setAttribute("site", origin);

Although right now `site` and `data-origin` save the same value - origin, in case one day the data format of `site` would change, here still keep `data-origin` as key to data so that could have some flexibility.
Attachment #8856991 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8852806 [details]
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data,

Hi Gijs,

The patch has been split into 2 parts so as to let the review more focused (if, not a good way, please let me know).

Part 1: Update based on the option 1:
  - Group and list site per uri.prePath (origin without OA suffix)
  - Clear quota uasge and cookies across OAs.
  - Match htttp cache and app cache by host

Part 2: Updates for the tests.

Thank you
Attachment #8852806 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8852806 [details]
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data,

https://reviewboard.mozilla.org/r/124966/#review131898

::: browser/components/preferences/SiteDataManager.jsm:29
(Diff revision 4)
>  
> -  // A Map of sites using the persistent-storage API (have requested persistent-storage permission)
> -  // Key is site's origin.
> +  // A Map of sites and their disk usage according to Quota Manager
> +  // Key is site's origin (uri.prePath).
>    // Value is one object holding:
> -  //   - perm: persistent-storage permision; instance of nsIPermission
> -  //   - status: the permission granted/rejected status
> +  //   - principals: instances of nsIPrincipal. Because one site could save data in different container,
> +  //                 that is, there could be https://www.foo.com and https://www.foo.com^userContainerId=2

Nit: userContextId, not userContainerId

::: browser/components/preferences/SiteDataManager.jsm:108
(Diff revision 4)
> +    if (this._sites.size == 0) {
> +      return;
> +    }

Isn't it possible for there to be 0 quota managed sites but for appcache to have a non-0 number of users?

::: browser/components/preferences/SiteDataManager.jsm:116
(Diff revision 4)
> +        let p = site.principals[0];
> +        if (p.URI.host == uri.host) {

So, I realized that I was wrong about prePath - well, maybe. In particular, prePath can contain user/pass combinations (though I don't know if those are stripped by all the APIs we deal with here, let's not take the risk). You talked about schemeless origins earlier, in which case I think we should be using uri.hostPort.

This code, then, should also compare hostPort (or prePath), I think, to avoid storing the same group for multiple sites that share a host but not a protocol and/or port.

More generally, I think you should be able to remove one of the loops as follows:

```js
for (let group of groups) {
  let uri;
  try {
    uri = Services.io.newURI(group);
    uri.hostPort; // check that this doesn't throw, either.
  } catch (ex) { continue; } // ignore invalid URIs.
  let site = this._sites.get(uri.hostPort);
  if (!site) {
    // What to do if an appcache entry has no item in the _sites list?
    // This case is already not handled by the code here...
  } else {
    let cache = this._appCache.getActiveCache(group);
    site.appCacheList.push(cache);
  }
}
```

::: browser/components/preferences/SiteDataManager.jsm:139
(Diff revision 4)
> -          onCacheEntryInfo(uri, idEnhance, dataSize) {
> +        onCacheEntryInfo(uri, idEnhance, dataSize) {
> -            for (let site of sites.values()) {
> +          let site = map.get(uri.host);

Shouldn't we use uri.hostPort here, too? Why use the host only?

Also, uri.host and uri.hostPort can throw for some URLs...

::: browser/components/preferences/SiteDataManager.jsm:141
(Diff revision 4)
> +        map.set(uri.host, site);
> +      }
> -        let visitor = {
> +      let visitor = {
> -          onCacheEntryInfo(uri, idEnhance, dataSize) {
> +        onCacheEntryInfo(uri, idEnhance, dataSize) {
> -            for (let site of sites.values()) {
> -              if (site.perm.matchesURI(uri, true)) {
> +          let site = map.get(uri.host);
> +          if (site) {

What happens if there's no site but we do have a cache entry info?

::: browser/components/preferences/SiteDataManager.jsm:196
(Diff revision 4)
>                      return list;
>                    });
>    },
>  
>    _removePermission(site) {
> -    Services.perms.removePermission(site.perm);
> +    Services.perms.removeFromPrincipal(site.principals[0], "persistent-storage");

The comment about permission being shared across containers would probably be helpful here, too.

::: browser/components/preferences/SiteDataManager.jsm:206
(Diff revision 4)
> +      // We are clearing *All* across OAs so need a principal without suffix here,
> +      // or the call of `clearStoragesForPrincipal` would fail.
> +      let principal = Services.scriptSecurityManager
> +                              .createCodebasePrincipalFromOrigin(site.principals[0].originNoSuffix);
> +      let request = this._qms.clearStoragesForPrincipal(principal, null, true);
> +      request.callback = () => resolve();

You can just:

```js
request.callback = resolve;
```

right?

::: browser/components/preferences/SiteDataManager.jsm:226
(Diff revision 4)
>  
>    _removeCookie(site) {
> -    let host = site.perm.principal.URI.host;
> -    let e = Services.cookies.getCookiesFromHost(host, {});
> +    for (let principal of site.principals) {
> +      removeCookiesFromHost(principal);
> +    }
> +    function removeCookiesFromHost(principal) {

This should be called `removeCookiesFromPrincipal`.

::: browser/components/preferences/SiteDataManager.jsm:251
(Diff revision 4)
> -        this._removeQuotaUsage(site);
>          this._removeDiskCache(site);
>          this._removeAppCache(site);
>          this._removeCookie(site);
> +        promises.push(this._removeQuotaUsage(site));
>        }

So, this is guaranteed to exist, right? In what case does it not? Should we throw for that case separately?
Attachment #8852806 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8856991 [details]
Part 2: Bug 1348733 - Update tests for site data manager now that we use quota storage manager,

https://reviewboard.mozilla.org/r/128904/#review131906

I think splitting this is fine but let's figure out the code changes first as the test will need to be updated for those...
Attachment #8856991 - Flags: review?(gijskruitbosch+bugs)
(In reply to :Gijs from comment #17)
> Comment on attachment 8852806 [details]
> Part 1: Bug 1348733 - List sites using quota storage in Settings of Site Data
> 
> https://reviewboard.mozilla.org/r/124966/#review131898
> 
> ::: browser/components/preferences/SiteDataManager.jsm:29 (Diff revision 4)
> > +  //   - principals: instances of nsIPrincipal. Because one site could save data in different container,
> > +  //                 that is, there could be https://www.foo.com and https://www.foo.com^userContainerId=2
> 
> Nit: userContextId, not userContainerId
> 
Sorry for this typo. Will fix it.

> ::: browser/components/preferences/SiteDataManager.jsm:108
> (Diff revision 4)
> > +    if (this._sites.size == 0) {
> > +      return;
> > +    }
> 
> Isn't it possible for there to be 0 quota managed sites but for appcache to
> have a non-0 number of users?
> 
Yes, there could be a chance. Because appcache is deprecated and not a popular feature, here doesn't give it too much attention.
So those not listed would be removed during the clear-all operation.

> ::: browser/components/preferences/SiteDataManager.jsm:116
> (Diff revision 4)
> > +        let p = site.principals[0];
> > +        if (p.URI.host == uri.host) {
> 
> So, I realized that I was wrong about prePath - well, maybe. In particular,
> prePath can contain user/pass combinations (though I don't know if those are
> stripped by all the APIs we deal with here, let's not take the risk). You
> talked about schemeless origins earlier, in which case I think we should be
> using uri.hostPort.
> 
Maybe could use `principal.origin` as our internal site map key. 
Consider "https://user:pw@www.mozilla.com:987",
```
 var principal = Services.scriptSecurityManager
                         .createCodebasePrincipalFromOrigin("https://user:pw@www.mozilla.com:987");
```
principal.origin is https://www.mozilla.com:987
principal.URI.prePath is https://user:pw@www.mozilla.com:987


> This code, then, should also compare hostPort (or prePath), I think, to
> avoid storing the same group for multiple sites that share a host but not a
> protocol and/or port.
> 
> More generally, I think you should be able to remove one of the loops as
> follows:
> 
> ```js
> for (let group of groups) {
>   let uri;
>   try {
>     uri = Services.io.newURI(group);
>     uri.hostPort; // check that this doesn't throw, either.
>   } catch (ex) { continue; } // ignore invalid URIs.
>   let site = this._sites.get(uri.hostPort);
>   if (!site) {
>     // What to do if an appcache entry has no item in the _sites list?
>     // This case is already not handled by the code here...
>   } else {
>     let cache = this._appCache.getActiveCache(group);
>     site.appCacheList.push(cache);
>   }
> }
> ```
> 
OK, matching hostPort for appcache since, like your saying, port means different for appcache.

> ::: browser/components/preferences/SiteDataManager.jsm:139
> (Diff revision 4)
> > -          onCacheEntryInfo(uri, idEnhance, dataSize) {
> > +        onCacheEntryInfo(uri, idEnhance, dataSize) {
> > -            for (let site of sites.values()) {
> > +          let site = map.get(uri.host);
> 
> Shouldn't we use uri.hostPort here, too? Why use the host only?
> 
> Also, uri.host and uri.hostPort can throw for some URLs...
> 
Thank you for reminding this throw issue.

For the http cache case, the reason using host is http cache is really broad.
What "broad" means here is that after loading a, for example, https://www.foo.com, there could lots of cache saved, such as,
 - https://www.foo.com/css/theme.css
 - http://www.foo.com:123/shared/img/logo.jpg
 - https://www.abc.com/style/cool-black-theme.css
Ideally, clearing and summing up usage of these caches for https://www.foo.com would be good because of providing a more accurate number of usage.
However, no way to know which cache was brought by which, at least, we cannot know https://www.abc.com/style/cool-black-theme.css is brought by https://www.foo.com. And consider, maybe the same cache resource would be brought by https://www.foo.com and http://www.foo.com:123 both (Just a matter of load time). So here take one "proximity" approach of using host.


> ::: browser/components/preferences/SiteDataManager.jsm:141
> (Diff revision 4)
> > +        map.set(uri.host, site);
> > +      }
> > -        let visitor = {
> > +      let visitor = {
> > -          onCacheEntryInfo(uri, idEnhance, dataSize) {
> > +        onCacheEntryInfo(uri, idEnhance, dataSize) {
> > -            for (let site of sites.values()) {
> > -              if (site.perm.matchesURI(uri, true)) {
> > +          let site = map.get(uri.host);
> > +          if (site) {
> 
> What happens if there's no site but we do have a cache entry info?
> 
It wouldn't count. Still can be removed by clear-all function.
The site data section (storage standard) is trying to provide a "persistent" storage feature(primary persistent indexedDB) and while listing these usage site by site, also collect other cache, cookie, appcache usage relating to site.

> ::: browser/components/preferences/SiteDataManager.jsm:196
> (Diff revision 4)
> >                      return list;
> >                    });
> >    },
> >  
> >    _removePermission(site) {
> > -    Services.perms.removePermission(site.perm);
> > +    Services.perms.removeFromPrincipal(site.principals[0], "persistent-storage");
> 
> The comment about permission being shared across containers would probably
> be helpful here, too.
> 
> ::: browser/components/preferences/SiteDataManager.jsm:206
> (Diff revision 4)
> > +      // We are clearing *All* across OAs so need a principal without suffix here,
> > +      // or the call of `clearStoragesForPrincipal` would fail.
> > +      let principal = Services.scriptSecurityManager
> > +                              .createCodebasePrincipalFromOrigin(site.principals[0].originNoSuffix);
> > +      let request = this._qms.clearStoragesForPrincipal(principal, null, true);
> > +      request.callback = () => resolve();
> 
> You can just:
> 
> ```js
> request.callback = resolve;
> ```
> 
> right?
> 
Thanks, will update.

> ::: browser/components/preferences/SiteDataManager.jsm:226
> (Diff revision 4)
> >  
> >    _removeCookie(site) {
> > -    let host = site.perm.principal.URI.host;
> > -    let e = Services.cookies.getCookiesFromHost(host, {});
> > +    for (let principal of site.principals) {
> > +      removeCookiesFromHost(principal);
> > +    }
> > +    function removeCookiesFromHost(principal) {
> 
> This should be called `removeCookiesFromPrincipal`.
> 
Thanks, will update.

> ::: browser/components/preferences/SiteDataManager.jsm:251
> (Diff revision 4)
> > -        this._removeQuotaUsage(site);
> >          this._removeDiskCache(site);
> >          this._removeAppCache(site);
> >          this._removeCookie(site);
> > +        promises.push(this._removeQuotaUsage(site));
> >        }
> 
> So, this is guaranteed to exist, right? In what case does it not? Should we
> throw for that case separately?
Do you mean what if the operation of removeQuotaUsage failed?
Thank you.
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Fischer [:Fischer] from comment #19)
> (In reply to :Gijs from comment #17)
> > ::: browser/components/preferences/SiteDataManager.jsm:108
> > (Diff revision 4)
> > > +    if (this._sites.size == 0) {
> > > +      return;
> > > +    }
> > 
> > Isn't it possible for there to be 0 quota managed sites but for appcache to
> > have a non-0 number of users?
> > 
> Yes, there could be a chance. Because appcache is deprecated and not a
> popular feature, here doesn't give it too much attention.
> So those not listed would be removed during the clear-all operation.

I don't think this is a good idea. We can just remove the early return.

More generally, I think we should add sites to the map if we find appcache or http cache entries for them and they don't have their own entry based on the quota manager.

> > ::: browser/components/preferences/SiteDataManager.jsm:116
> > (Diff revision 4)
> > > +        let p = site.principals[0];
> > > +        if (p.URI.host == uri.host) {
> > 
> > So, I realized that I was wrong about prePath - well, maybe. In particular,
> > prePath can contain user/pass combinations (though I don't know if those are
> > stripped by all the APIs we deal with here, let's not take the risk). You
> > talked about schemeless origins earlier, in which case I think we should be
> > using uri.hostPort.
> > 
> Maybe could use `principal.origin` as our internal site map key. 
> Consider "https://user:pw@www.mozilla.com:987",
> ```
>  var principal = Services.scriptSecurityManager
>                         
> .createCodebasePrincipalFromOrigin("https://user:pw@www.mozilla.com:987");
> ```
> principal.origin is https://www.mozilla.com:987
> principal.URI.prePath is https://user:pw@www.mozilla.com:987

Well, that would include the protocol, and I think that if we don't display the protocol, we should have 1 entry for foo.com that covers both https://foo.com and http://foo.com . Hence my suggestion to use hostPort.

> > ::: browser/components/preferences/SiteDataManager.jsm:139
> > (Diff revision 4)
> > > -          onCacheEntryInfo(uri, idEnhance, dataSize) {
> > > +        onCacheEntryInfo(uri, idEnhance, dataSize) {
> > > -            for (let site of sites.values()) {
> > > +          let site = map.get(uri.host);
> > 
> > Shouldn't we use uri.hostPort here, too? Why use the host only?
> > 
> > Also, uri.host and uri.hostPort can throw for some URLs...
> > 
> Thank you for reminding this throw issue.
> 
> For the http cache case, the reason using host is http cache is really broad.
> What "broad" means here is that after loading a, for example,
> https://www.foo.com, there could lots of cache saved, such as,
>  - https://www.foo.com/css/theme.css
>  - http://www.foo.com:123/shared/img/logo.jpg
>  - https://www.abc.com/style/cool-black-theme.css
> Ideally, clearing and summing up usage of these caches for
> https://www.foo.com would be good because of providing a more accurate
> number of usage.
> However, no way to know which cache was brought by which, at least, we
> cannot know https://www.abc.com/style/cool-black-theme.css is brought by
> https://www.foo.com. And consider, maybe the same cache resource would be
> brought by https://www.foo.com and http://www.foo.com:123 both (Just a
> matter of load time). So here take one "proximity" approach of using host.

OK. So if we key the `sites` map by hostPort, we can query that map directly by uri.host, and if that fails, by uri.hostPort, and that should normally work (but see below).

> > ::: browser/components/preferences/SiteDataManager.jsm:141
> > (Diff revision 4)
> > > +        map.set(uri.host, site);
> > > +      }
> > > -        let visitor = {
> > > +      let visitor = {
> > > -          onCacheEntryInfo(uri, idEnhance, dataSize) {
> > > +        onCacheEntryInfo(uri, idEnhance, dataSize) {
> > > -            for (let site of sites.values()) {
> > > -              if (site.perm.matchesURI(uri, true)) {
> > > +          let site = map.get(uri.host);
> > > +          if (site) {
> > 
> > What happens if there's no site but we do have a cache entry info?
> > 
> It wouldn't count. Still can be removed by clear-all function.
> The site data section (storage standard) is trying to provide a "persistent"
> storage feature(primary persistent indexedDB) and while listing these usage
> site by site, also collect other cache, cookie, appcache usage relating to
> site.

I think here we should create a new site object and put it in the sites map instead, so that it does show up. I realize that the "clear all" function would also remove it, but I think it might mislead the user if we do store data for a domain but it does not show up in this list.

> > ::: browser/components/preferences/SiteDataManager.jsm:251
> > (Diff revision 4)
> > > -        this._removeQuotaUsage(site);
> > >          this._removeDiskCache(site);
> > >          this._removeAppCache(site);
> > >          this._removeCookie(site);
> > > +        promises.push(this._removeQuotaUsage(site));
> > >        }
> > 
> > So, this is guaranteed to exist, right? In what case does it not? Should we
> > throw for that case separately?
> Do you mean what if the operation of removeQuotaUsage failed?
> Thank you.

No, I mean we're passing in an array of strings and then we ask the sites map for an entry keyed to that string, and check if (site) { ... } and then pass `site` to all the _remove* functions. The fact that we're null-checking `site` implies that it's possible someone passes us something that's not in our list of sites. Shouldn't we throw an error in that case, or in some other way improve how we pass this data to avoid that case? It would be bad if it was possible for consumers to pass a whole list of URLs that lead to 0 data being deleted, and we wouldn't tell the consumer in any way.
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to :Gijs from comment #20)
> (In reply to Fischer [:Fischer] from comment #19)
> > (In reply to :Gijs from comment #17)
> > > ::: browser/components/preferences/SiteDataManager.jsm:108
> > > (Diff revision 4)
> > > > +    if (this._sites.size == 0) {
> > > > +      return;
> > > > +    }
> > > 
> > > Isn't it possible for there to be 0 quota managed sites but for appcache to
> > > have a non-0 number of users?
> > > 
> > Yes, there could be a chance. Because appcache is deprecated and not a
> > popular feature, here doesn't give it too much attention.
> > So those not listed would be removed during the clear-all operation.
> 
> I don't think this is a good idea. We can just remove the early return.
> 
> More generally, I think we should add sites to the map if we find appcache
> or http cache entries for them and they don't have their own entry based on
> the quota manager.
> 
> > > ::: browser/components/preferences/SiteDataManager.jsm:141 (Diff revision 4)
> > > > +        map.set(uri.host, site);
> > > > +      }
> > > > -        let visitor = {
> > > > +      let visitor = {
> > > > -          onCacheEntryInfo(uri, idEnhance, dataSize) {
> > > > +        onCacheEntryInfo(uri, idEnhance, dataSize) {
> > > > -            for (let site of sites.values()) {
> > > > -              if (site.perm.matchesURI(uri, true)) {
> > > > +          let site = map.get(uri.host);
> > > > +          if (site) {
> > > 
> > > What happens if there's no site but we do have a cache entry info?
> > > 
> > It wouldn't count. Still can be removed by clear-all function.
> > The site data section (storage standard) is trying to provide a "persistent"
> > storage feature(primary persistent indexedDB) and while listing these usage
> > site by site, also collect other cache, cookie, appcache usage relating to
> > site.
> 
> I think here we should create a new site object and put it in the sites map
> instead, so that it does show up. I realize that the "clear all" function
> would also remove it, but I think it might mislead the user if we do store
> data for a domain but it does not show up in this list.
> 
OK, do you mean that seeing this "site data" from a user's perspective?
Which means since called "site data", it would be better to treat indexedDB, localstorage, appcache, http cache all as *data*.
Maybe a site doesn't store indexedDB but stores appcache/http cache, it would be better still to display this site on the list.
In this manner, the list is consistently displaying sites storing data. The main difference is that some sites store persistent indexedDB data and some don't.


> > > ::: browser/components/preferences/SiteDataManager.jsm:251
> > > (Diff revision 4)
> > > > -        this._removeQuotaUsage(site);
> > > >          this._removeDiskCache(site);
> > > >          this._removeAppCache(site);
> > > >          this._removeCookie(site);
> > > > +        promises.push(this._removeQuotaUsage(site));
> > > >        }
> > > 
> > > So, this is guaranteed to exist, right? In what case does it not? Should we
> > > throw for that case separately?
> > Do you mean what if the operation of removeQuotaUsage failed?
> > Thank you.
> 
> No, I mean we're passing in an array of strings and then we ask the sites
> map for an entry keyed to that string, and check if (site) { ... } and then
> pass `site` to all the _remove* functions. The fact that we're null-checking
> `site` implies that it's possible someone passes us something that's not in
> our list of sites. Shouldn't we throw an error in that case, or in some
> other way improve how we pass this data to avoid that case? It would be bad
> if it was possible for consumers to pass a whole list of URLs that lead to 0
> data being deleted, and we wouldn't tell the consumer in any way.
Thanks for reminding this. A throw will be put here since SiteDataManager.jsm may be used by other scripts.
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Fischer [:Fischer] from comment #21)
> > > > ::: browser/components/preferences/SiteDataManager.jsm:141 (Diff revision 4)
> > > > > +        map.set(uri.host, site);
> > > > > +      }
> > > > > -        let visitor = {
> > > > > +      let visitor = {
> > > > > -          onCacheEntryInfo(uri, idEnhance, dataSize) {
> > > > > +        onCacheEntryInfo(uri, idEnhance, dataSize) {
> > > > > -            for (let site of sites.values()) {
> > > > > -              if (site.perm.matchesURI(uri, true)) {
> > > > > +          let site = map.get(uri.host);
> > > > > +          if (site) {
> > > > 
> > > > What happens if there's no site but we do have a cache entry info?
> > > > 
> > > It wouldn't count. Still can be removed by clear-all function.
> > > The site data section (storage standard) is trying to provide a "persistent"
> > > storage feature(primary persistent indexedDB) and while listing these usage
> > > site by site, also collect other cache, cookie, appcache usage relating to
> > > site.
> > 
> > I think here we should create a new site object and put it in the sites map
> > instead, so that it does show up. I realize that the "clear all" function
> > would also remove it, but I think it might mislead the user if we do store
> > data for a domain but it does not show up in this list.
> > 
> OK, do you mean that seeing this "site data" from a user's perspective?
> Which means since called "site data", it would be better to treat indexedDB,
> localstorage, appcache, http cache all as *data*.
> Maybe a site doesn't store indexedDB but stores appcache/http cache, it
> would be better still to display this site on the list.
> In this manner, the list is consistently displaying sites storing data. The
> main difference is that some sites store persistent indexedDB data and some
> don't.

Yes, exactly. It's possible we can simplify this code further in future if/when more types of data will be taken into account by the quota manager, but until then we should make sure that all types of data show up in here. That will also help with then removing the other lists in about:preferences, simplifying our UI.
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to :Gijs from comment #22)
> (In reply to Fischer [:Fischer] from comment #21)
> > OK, do you mean that seeing this "site data" from a user's perspective?
> > Which means since called "site data", it would be better to treat indexedDB,
> > localstorage, appcache, http cache all as *data*.
> > Maybe a site doesn't store indexedDB but stores appcache/http cache, it
> > would be better still to display this site on the list.
> > In this manner, the list is consistently displaying sites storing data. The
> > main difference is that some sites store persistent indexedDB data and some
> > don't.
> 
> Yes, exactly. It's possible we can simplify this code further in future
> if/when more types of data will be taken into account by the quota manager,
> but until then we should make sure that all types of data show up in here.
> That will also help with then removing the other lists in about:preferences,
> simplifying our UI.
Yes, this is in fact what UX is thinking next. They are planning how to approach these data-related parts to deliver a more consistent data management in Preferences.
Whiteboard: [storage-v1]
Depends on: 1356161
Gijs, it seems that while before "Offline Web Content and User Data" listed a fair number of entries, the new "Site Data" lists nothing. I believe basing it on QuotaManager would at least keep what we had there previously (or is appcache really separate from that?) and would add more over time as QuotaManager becomes more capable (e.g., gets localStorage added).

I agree that long term it would be great if we could also integrate HTTP cache and HTTP cookies in some reasonable fashion, but given that those are not origin-based and therefore have to be stored differently for web compatibility reasons, it doesn't seem good to block an obvious short term improvement (basing this on QuotaManager rather than the permission manager which is useless for this) over something that might take a long time to get right.
(And in particular, it's my understanding that if we don't land this and don't find someone to do the work you want to be done (which is a lot of design work as you need to work through all the implications of removing data with different scopes and how to present that, etc.), we'll just end up shipping what is there now, which is something that nobody wants, I hope.)
(In reply to Anne (:annevk) from comment #25)
> (And in particular, it's my understanding that if we don't land this and
> don't find someone to do the work you want to be done (which is a lot of
> design work as you need to work through all the implications of removing
> data with different scopes and how to present that, etc.), we'll just end up
> shipping what is there now, which is something that nobody wants, I hope.)

There aren't any questions here, so I'm not sure what your expectation is, but just to clarify how I'm approaching this:

1) it's behind a pref, so we don't need to ship anything. We can just pref it off if it's not ready to ship.
2) the dialog as-is already displays appcache and other items (or at least, there is extant code for this before this patch). I'm not asking for that to be added; we're changing how we list quota manager and other data, and we should be reconciling the list correctly. That doesn't seem controversial.
3) The issues I'm raising regarding duplicate principals / origins for quota manager (ie getting something like 5 items for "foo.com" that are indistinguishable to the user but really have different ports/protocols/origin attributes), and how we display them, are there irrespective of whether we include appcache and other items.

(In reply to Anne (:annevk) from comment #24)
> Gijs, it seems that while before "Offline Web Content and User Data" listed
> a fair number of entries, the new "Site Data" lists nothing.

This feels like a separate bug that just needs investigating.
Based on the con-call meeting, the specs has been changed as below:
- exclude counting and removing http cache per site
- group sites based on host across scheme, port and origin attributes
- list sites using quota usage or appcache
Summary: List sites using quota storage in Settings of Site Data → List sites using quota storage or appcache in Settings of Site Data
Comment on attachment 8852806 [details]
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data,

Sorry Gijs, cancel the r? to make some adjustment
Attachment #8852806 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8852806 [details]
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data,

https://reviewboard.mozilla.org/r/124966/#review139808

This is almost ready. Comments (mostly, but not all, nits) below. Thanks!

::: browser/components/preferences/SiteDataManager.jsm:56
(Diff revision 6)
> -        this._sites.set(perm.principal.URI.spec, {
> -          perm,
> +                site = {
> +                  persisted: false,
> -          status,
> -          quotaUsage: 0,
> +                  quotaUsage: 0,
> -          appCacheList: [],
> -          diskCacheList: []
> +                  principals: [],
> +                  appCacheList: []

Nit: trailing comma please.

::: browser/components/preferences/SiteDataManager.jsm:111
(Diff revision 6)
> +      if (!site) {
> +        site = {
> +          persisted: false,
> +          quotaUsage: 0,
> +          principals: [ principal ],
> +          appCacheList: []

Nit: trailing comma please

::: browser/components/preferences/SiteDataManager.jsm:114
(Diff revision 6)
> +          quotaUsage: 0,
> +          principals: [ principal ],
> +          appCacheList: []
> +        };
> +        this._sites.set(uri.host, site);
> +      } else if (!site.principals.find(p => p.origin == principal.origin)) {

Nit: use `some` rather than `find`

::: browser/components/preferences/SiteDataManager.jsm:123
(Diff revision 6)
> -      }
> -    });
>    },
>  
>    getTotalUsage() {
> -    return Promise.all([this._updateQuotaPromise, this._updateDiskCachePromise])
> +    return Promise.all([this._getQuotaUsagePromise])

Looks like this can stop using Promise.all? Same below.

::: browser/components/preferences/SiteDataManager.jsm:142
(Diff revision 6)
>                        let cache = null;
>                        let usage = site.quotaUsage;
>                        for (cache of site.appCacheList) {
>                          usage += cache.usage;
>                        }

Nit: omit `let cache = null;` and just `for (let cache of site.appCacheList)`

::: browser/components/preferences/SiteDataManager.jsm:160
(Diff revision 6)
>    },
>  
>    _removePermission(site) {
> -    Services.perms.removePermission(site.perm);
> +    let removals = [];
> +    for (let principal of site.principals) {
> +      let originNoSuffix = principal.originNoSuffix;

Nit: here and below:

`let {originNoSuffix} = principal;`

::: browser/components/preferences/SiteDataManager.jsm:161
(Diff revision 6)
>  
>    _removePermission(site) {
> -    Services.perms.removePermission(site.perm);
> +    let removals = [];
> +    for (let principal of site.principals) {
> +      let originNoSuffix = principal.originNoSuffix;
> +      if (removals.indexOf(originNoSuffix) >= 0) {

Nits for here and `_removeQuotaUsage`: use a set instead of an array:

`let removals = new Set()`

`if (removals.has(originNoSuffix))`

`removals.add(originNoSuffix)`

::: browser/components/preferences/SiteDataManager.jsm:234
(Diff revision 6)
>          this._removeCookie(site);
> +        promises.push(this._removeQuotaUsage(site));
> +      } else {
> +        if (promises.length > 0) {
> +          // Had removed some sites.
> +          // Let's still remeber to update sites before throwing.

Nit: remember

::: browser/components/preferences/SiteDataManager.jsm:247
(Diff revision 6)
>  
>    removeAll() {
> +    let promises = [];
>      for (let site of this._sites.values()) {
>        this._removePermission(site);
> -      this._removeQuotaUsage(site);
> +      promises.push(this._removeQuotaUsage(site));

Instead of doing this, can't we just use `this._qms.clear();` ?

::: browser/components/preferences/siteDataSettings.js:19
(Diff revision 6)
>  
>  "use strict";
>  
>  let gSiteDataSettings = {
>  
>    // Array of meatdata of sites. Each array element is object holding:

Happened to notice this when looking for context... while "meatdata" is really funny, can you correct to "metadata" please? :-)

::: browser/components/preferences/siteDataSettings.js:42
(Diff revision 6)
>      this._list = document.getElementById("sitesList");
>      this._searchBox = document.getElementById("searchBox");
>      this._prefStrBundle = document.getElementById("bundlePreferences");
>      SiteDataManager.getSites().then(sites => {
>        this._sites = sites;
> -      let sortCol = document.getElementById("hostCol");
> +      let sortCol = document.querySelector("treecol[data-isCurrentSortCol=true]");

This looks like it's fixing an unrelated bug. Can you document this in the commit message?

::: browser/components/preferences/siteDataSettings.js:164
(Diff revision 6)
> -      let origin = item.getAttribute("data-origin");
> +      let host = item.getAttribute("site");
>        for (let site of this._sites) {
> -        if (site.uri.spec === origin) {
> +        if (site.host === host) {
>            site.userAction = "remove";
>            break;
>          }
>        }

Nit:

```js
let host = item.getAttribute("site");
let siteForHost = this._sites.find(site => site.host == host);
if (siteForHost) {
  siteForHost.userAction = "remove";
}
```

::: browser/components/preferences/siteDataSettings.js:209
(Diff revision 6)
>          }
>        } else {
>          // User only removes partial sites.
>          // We will remove cookies based on base domain, say, user selects "news.foo.com" to remove.
>          // The cookies under "music.foo.com" will be removed together.
>          // We have to prmopt user about this action.

Nit: prompt

::: browser/components/preferences/siteDataSettings.js:243
(Diff revision 6)
> +          try {
> -          SiteDataManager.remove(removals);
> +            SiteDataManager.remove(removals);
> +          } catch (e) {
> +            // Hit error, maybe remove unknown site.
> +            // Let's print out the error, then proceed to close this settings dialog.
> +            // By the next time of open again, we will get sites from SiteDataManager and refresh the list.

English nit:

When we next open again we will once more get sites from the SiteDataManager and refresh the list.

::: browser/components/preferences/siteDataSettings.js:244
(Diff revision 6)
> -          SiteDataManager.remove(removals);
> +            SiteDataManager.remove(removals);
> +          } catch (e) {
> +            // Hit error, maybe remove unknown site.
> +            // Let's print out the error, then proceed to close this settings dialog.
> +            // By the next time of open again, we will get sites from SiteDataManager and refresh the list.
> +            console.error(e);

Please use Cu.reportError instead.

::: browser/components/preferences/siteDataSettings.xul:41
(Diff revision 6)
>  
>      <richlistbox id="sitesList" orient="vertical" flex="1">
>        <listheader>
> -        <treecol flex="4" width="50" label="&hostCol.label;" id="hostCol"/>
> +        <treecol flex="4" width="50" label="&siteCol.label;" id="siteCol"/>
>          <treecol flex="2" width="50" label="&statusCol.label;" id="statusCol"/>
> -        <treecol flex="1" width="50" label="&usageCol.label;" id="usageCol"/>
> +        <!-- Sorted in usage by default so user could quickly figure out usages in the 1st glance. -->

In English sorting is done "by" something, not "in" something, so consider:

"Sorted by usage so the user can quickly see which sites use the most data."


I'm assuming, at least, that we're sorting descending, ie large usage at the top? If not, maybe that's worth doing?

::: browser/locales/en-US/chrome/browser/preferences/siteDataSettings.dtd:6
(Diff revision 6)
>  <!-- This Source Code Form is subject to the terms of the Mozilla Public
>     - License, v. 2.0. If a copy of the MPL was not distributed with this
>     - file, You can obtain one at http://mozilla.org/MPL/2.0/. -->
>  
>  <!ENTITY     window.title                  "Settings - Site Data">
> -<!ENTITY     hostCol.label                 "Site">
> +<!ENTITY     settings.description          "The following websites asked to store site data in your disk. You can specify which websites are allowed to store site data. Default site data is temporary and could be deleted automatically.">

This looks like it's unused? Why are you adding it?
Attachment #8852806 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8856991 [details]
Part 2: Bug 1348733 - Update tests for site data manager now that we use quota storage manager,

https://reviewboard.mozilla.org/r/128904/#review139812

This is basically done but I'm confused about the manifest being wrong and that not causing the test to fail... do you know what's going on there? Maybe we're caching the manifest as well, and that's enough? In which case, do we really need the extra file?

::: commit-message-943b6:1
(Diff revision 3)
> +Part 2: Bug 1348733 - Update tests, r?Gijs

Please make the first line slightly more descriptive, e.g. "update tests for site data manager now that we use quota storage manager"

::: browser/components/preferences/in-content-old/tests/browser_advanced_siteData.js:547
(Diff revision 3)
> +      principal: Services.scriptSecurityManager
> +                         .createCodebasePrincipalFromOrigin("http://email.bar.com"),
> +      persisted: false
> +    },
> +  ];
> +  let fakseHosts = mockSiteDataManager.fakeSites.map(site => site.principal.URI.host);

Here and several more times below, 'fakeHosts', please (not "fakseHosts").

::: browser/components/preferences/in-content-old/tests/offline/manifest.appcache:3
(Diff revision 3)
> +CACHE MANIFEST
> +# V1
> +https://example.org/browser/browser/components/preferences/in-content/tests/offline/offline.html

Copy-paste error - this should be in-content-old ?

How come the tests pass with this path being wrong?

Wouldn't it be simpler to just use a relative path in both files (ie just "offline.html" ) ?

::: browser/components/preferences/in-content/tests/browser_siteData.js:245
(Diff revision 3)
>    is(status, Ci.nsIPermissionManager.UNKNOWN_ACTION, "Should remove permission");
>  
>    cacheUsage = yield cacheUsageGetter.get();
> -  quotaUsage = yield getQuotaUsage(TEST_ORIGIN);
> +  quotaUsage = yield getQuotaUsage(TEST_QUOTA_USAGE_ORIGIN);
>    totalUsage = yield SiteDataManager.getTotalUsage();
>    is(cacheUsage, 0, "The cahce usage should be removed");

Nit: 'cache', not 'cahce'

::: browser/components/preferences/in-content/tests/browser_siteData2.js:51
(Diff revision 3)
> +      principal: Services.scriptSecurityManager
> +                         .createCodebasePrincipalFromOrigin("http://email.bar.com"),
> +      persisted: false
> +    },
>    ];
> -  fakeOrigins.forEach(origin => addPersistentStoragePerm(origin));
> +  let fakseHosts = mockSiteDataManager.fakeSites.map(site => site.principal.URI.host);

More "fakseHosts"...
Attachment #8856991 - Flags: review?(gijskruitbosch+bugs)
Comment on attachment 8852806 [details]
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data,

https://reviewboard.mozilla.org/r/124966/#review139808

> Nit: trailing comma please.

Updated, thanks.

> Nit: trailing comma please

Updated, thanks.

> Nit: use `some` rather than `find`

Updated, thanks.

> Looks like this can stop using Promise.all? Same below.

Updated, thanks.

> Nit: omit `let cache = null;` and just `for (let cache of site.appCacheList)`

Updated, thanks.

> Nit: here and below:
> 
> `let {originNoSuffix} = principal;`

Updated, thanks.

> Nits for here and `_removeQuotaUsage`: use a set instead of an array:
> 
> `let removals = new Set()`
> 
> `if (removals.has(originNoSuffix))`
> 
> `removals.add(originNoSuffix)`

Updated, thanks.

> Nit: remember

Sorry for the typo, fixed.

> Instead of doing this, can't we just use `this._qms.clear();` ?

Consulted Jan before. Souldn't use it. The `clear` method deletes beyond what we want.
https://bugzilla.mozilla.org/show_bug.cgi?id=1312361#c9

> This looks like it's fixing an unrelated bug. Can you document this in the commit message?

Definitely, thanks.

> Nit:
> 
> ```js
> let host = item.getAttribute("site");
> let siteForHost = this._sites.find(site => site.host == host);
> if (siteForHost) {
>   siteForHost.userAction = "remove";
> }
> ```

Updated, thanks

> Nit: prompt

Sorry for the typo. fixed.

> English nit:
> 
> When we next open again we will once more get sites from the SiteDataManager and refresh the list.

Thanks for the advice, updated

> Please use Cu.reportError instead.

Thanks, updated.

> In English sorting is done "by" something, not "in" something, so consider:
> 
> "Sorted by usage so the user can quickly see which sites use the most data."
> 
> 
> I'm assuming, at least, that we're sorting descending, ie large usage at the top? If not, maybe that's worth doing?

Thanks for the English advice.
Yes, this is going to sort descending by default.

> This looks like it's unused? Why are you adding it?

The `settings.description` was removed by another bug. Because of rebasing, this bug's patch adds it back. Sorry for not noticing this. Deleted it, thanks.
Comment on attachment 8856991 [details]
Part 2: Bug 1348733 - Update tests for site data manager now that we use quota storage manager,

https://reviewboard.mozilla.org/r/128904/#review139812

> Please make the first line slightly more descriptive, e.g. "update tests for site data manager now that we use quota storage manager"

Sure, will do, thanks.

> Here and several more times below, 'fakeHosts', please (not "fakseHosts").

Will fix, thanks

> Copy-paste error - this should be in-content-old ?
> 
> How come the tests pass with this path being wrong?
> 
> Wouldn't it be simpler to just use a relative path in both files (ie just "offline.html" ) ?

Sorry for the error. Yes, simple relative path works. Will update, thanks.

Did you run `in-content/tests/browser_siteData.js` before `in-content-old/tests/browser_advanced_siteData.js`?
I was able to make a failure to run browser_advanced_siteData.js first from a fresh start.
If run browser_siteData.js first then browser_advanced_siteData.js, the `in-content/tests/offline/offline.html` would be generated in the test back-end.
And since the same `https://example.org` origin, no matter in-content or in-content-old both are fine to load that appcache resource.
During my local test runs, I was doing browser_siteData.js then browser_advanced_siteData.js so did not see failure.

> Nit: 'cache', not 'cahce'

Will fix, thanks

> More "fakseHosts"...

Sorry for these typos, will fix.
Comment on attachment 8852806 [details]
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data,

https://reviewboard.mozilla.org/r/124966/#review131446

::: browser/components/preferences/SiteDataManager.jsm:190
(Diff revisions 6 - 7)
>          principal = Services.scriptSecurityManager.createCodebasePrincipalFromOrigin(originNoSuffix);
>          let request = this._qms.clearStoragesForPrincipal(principal, null, true);
>          request.callback = resolve;
> +      }));
> -      }
> +    }
> -    });
> +    return Promise.all(promises);

Sorry. There was a logic error in the previous way. Here should use promises because we are actually clearing storages over multiple principals. Such as clearing “https://www.foo.com”, “http://www.foo.com” and “https://www.foo.com:123”, even OAs don’t matter but origin does so this is going to clear 3 different origins, which should be 3 different promises

::: browser/components/preferences/SiteDataManager.jsm:236
(Diff revisions 6 - 7)
> -          // Had removed some sites.
> -          // Let's still remeber to update sites before throwing.
> -          Promise.all(promises).then(() => this.updateSites());
> +      Promise.all(promises).then(() => this.updateSites());
> -        }
> +    }
> -        throw `SiteDataManager: removing unknown site of ${host}`;
> +    if (unknownHost) {
> +      throw `SiteDataManager: removing unknown site of ${unknownHost}`;

Rearrange a bit here so as to make it always hit promise for update before throwing and make sure no useless promise for update was generated.

::: browser/components/preferences/siteDataSettings.xul:39
(Diff revisions 6 - 7)
>      </hbox>
>      <separator class="thin"/>
>  
>      <richlistbox id="sitesList" orient="vertical" flex="1">
>        <listheader>
> -        <treecol flex="4" width="50" label="&siteCol.label;" id="siteCol"/>
> +        <treecol flex="4" width="50" label="&hostCol.label;" id="hostCol"/>

Here rollback to `hostCol` so as to reduce changes compared to the central and avoid misunderstandings. Although initially switching from `hostCol` to `siteCol` is to use a more generic naming to preserve some flexibility for the future v2 change, this also made myself kind of miss `settings.description` is useless.
Comment on attachment 8852806 [details]
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data,

https://reviewboard.mozilla.org/r/124966/#review141050

Nit in the commit message, otherwise this looks good.

::: commit-message-272ce:7
(Diff revisions 6 - 7)
>  
>  This patch does:
>  - remove codes about counting and removing http cache per site
>  - group and list sites based on host across scheme, port and origin attributes
>  - list sites using quota usage or appcache
> +- sort sites decending by usage by default

Nit: 'descending'
Attachment #8852806 - Flags: review?(gijskruitbosch+bugs) → review+
Comment on attachment 8856991 [details]
Part 2: Bug 1348733 - Update tests for site data manager now that we use quota storage manager,

https://reviewboard.mozilla.org/r/128904/#review141056

Nice, thanks!
Attachment #8856991 - Flags: review?(gijskruitbosch+bugs) → review+
Pushed by ryanvm@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/c50e480ea035
Part 1: Bug 1348733 - List sites using quota storage or appcache in Settings of Site Data, r=Gijs
https://hg.mozilla.org/integration/autoland/rev/1bd9e6b07fc0
Part 2: Bug 1348733 - Update tests for site data manager now that we use quota storage manager, r=Gijs
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/c50e480ea035
https://hg.mozilla.org/mozilla-central/rev/1bd9e6b07fc0
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 55
When I tried to clear data for facebook.com.

JavaScript error: chrome://browser/content/preferences/siteDataSettings.js, line 225: NS_ERROR_INSUFFICIENT_DOMAIN_LEVELS: Component returned failure code: 0x804b0050 (NS_ERROR_INSUFFICIENT_DOMAIN_LEVELS) [nsIEffectiveTLDService.getBaseDomainFromHost]

It can't be deleted due to js error.

I've added a patch to fix it. It seems to work now. I will open an follow up bug.
(In reply to Shawn Huang [:shawnjohnjr] from comment #47)
> When I tried to clear data for facebook.com.
> 
> JavaScript error: chrome://browser/content/preferences/siteDataSettings.js,
> line 225: NS_ERROR_INSUFFICIENT_DOMAIN_LEVELS: Component returned failure
> code: 0x804b0050 (NS_ERROR_INSUFFICIENT_DOMAIN_LEVELS)
> [nsIEffectiveTLDService.getBaseDomainFromHost]
> 
> It can't be deleted due to js error.
> 
> I've added a patch to fix it. It seems to work now. I will open an follow up
> bug.

To clarify that it's not because 'facebook.com', it's because the code scan sites and try to extract base domain.
However, one site "s3-us-west-2.amazonaws.com" in that list, and it's defined in etld_data.inc [1].

And getBaseDomainFromHost gets eTLD+1, but site "s3-us-west-2.amazonaws.com" is already in the list. So it throws NS_ERROR_INSUFFICIENT_DOMAIN_LEVELS. This exception is thrown if there were insufficient subdomain levels in the hostname to satisfy the requested aAdditionalParts value. So it expects at least one part.

[1] https://dxr.mozilla.org/mozilla-central/source/obj-x86_64-pc-linux-gnu/netwerk/dns/etld_data.inc#4709
Whiteboard: [storage-v1]
Depends on: 1450448
You need to log in before you can comment on or make changes to this bug.