Closed Bug 1573238 Opened 3 months ago Closed 2 months ago

Provide a minimal BrowserUsageTelemetry.jsm module in GeckoView (completely independent of its embedder) that implements nsIBrowserUsage

Categories

(GeckoView :: General, task, P2)

Unspecified
All

Tracking

(firefox70 wontfix, firefox71 fixed)

RESOLVED FIXED
mozilla71
Tracking Status
firefox70 --- wontfix
firefox71 --- fixed

People

(Reporter: ehsan, Assigned: ehsan)

References

(Blocks 1 open bug)

Details

Attachments

(1 file)

In bug 1573236 I'm adding some code that depends on the presence of this JSM: resource:///modules/BrowserUsageTelemetry.jsm which would implement nsIBrowserUsage as defined in dom/interfaces/base/nsIBrowserUsage.idl.

This module basically needs to export one global function named getUniqueDomainsVisitedInPast24Hours() which returns the number of unique domains that the user has visited in the past 24 hours.

GV doesn't keep browsing history. I think GV should always ask app (Fenix) to show the Storage Access user prompt and then the app implements this "getUniqueDomainsVisitedInPast24Hours" logic.

I filed a new Fenix issue ("Implement user permission prompt for ETP's Storage Access cookie API") and mentioned that Fenix should include this "getUniqueDomainsVisitedInPast24Hours" logic.

https://github.com/mozilla-mobile/fenix/issues/4916

WONTFIX because Fenix should implement this "getUniqueDomainsVisitedInPast24Hours" logic in https://github.com/mozilla-mobile/fenix/issues/4916.

Status: NEW → RESOLVED
Closed: 3 months ago
Resolution: --- → WONTFIX

(In reply to Chris Peterson [:cpeterson] from comment #1)

GV doesn't keep browsing history. I think GV should always ask app (Fenix) to show the Storage Access user prompt and then the app implements this "getUniqueDomainsVisitedInPast24Hours" logic.

That's unfortunately not workable. What Fenix wants to do here isn't relevant, this bug is about GeckoView itself.

Here is the flow of displaying storage access API prompts:

  1. Gecko gets a call from the web page to document.requestStorageAccess().
  2. It proceeds to do a number of checks to determine whether the page is allowed to call the API (e.g. whether it's a third-party frame, whether ETP is active)
  3. If those checks pass, it proceeds to do a number of checks determining if it can predetermine the result of the storage access check without needing to prompt the user.
  4. If those checks determine that we don't need to prompt, storage access is granted.
  5. Otherwise, we prompt the user. At this point the GV embedder app gets involved and displays a prompt and communicates the result back to Gecko.
  6. Gecko responds to the API call from the web page based on the results either from step 4 or 5.

It is during step 3 when Gecko tries to access resource:///modules/BrowserUsageTelemetry.jsm and call the getUniqueDomainsVisitedInPast24Hours() API. Implementing this API by GeckoView is not optional if it wants to share the heuristics in step 3 with desktop. Without this JSM module, we do not support granting storage access automatically and will fail step 3 and will display a prompt for every call to the web page visible API.

I filed a new Fenix issue ("Implement user permission prompt for ETP's Storage Access cookie API") and mentioned that Fenix should include this "getUniqueDomainsVisitedInPast24Hours" logic.

https://github.com/mozilla-mobile/fenix/issues/4916

Based on what was filed there, I think the summary of the situation here is that the GeckoView embedder is aware of the browsing history, so GeckoView itself can't provide a useful definition of getUniqueDomainsVisitedInPast24Hours(). I think that's quite fair and makes perfect sense.

My suggestion for how to handle this is a model like this:

  • We change the getUniqueDomainsVisitedInPast24Hours() API on both desktop/GV to return a promise that resolves to the number of unique domains visited in the past 24 hours.
  • On desktop we'll return Promise.resolve(result) where result is the number we're returning today.
  • On GV, we'll call back into the embedder asking them to calculate this number, and when it's available we'll resolve the promise with that.
  • We'll make sure Document::AutomaticStorageAccessCanBeGranted() is asynchronous in the parent process. It is already asynchronous in the content process so that should be quite easy.

How does this setup sound?

Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---

(In reply to :ehsan akhgari from comment #3)

(In reply to Chris Peterson [:cpeterson] from comment #1)

GV doesn't keep browsing history. I think GV should always ask app (Fenix) to show the Storage Access user prompt and then the app implements this "getUniqueDomainsVisitedInPast24Hours" logic.

That's unfortunately not workable. What Fenix wants to do here isn't relevant, this bug is about GeckoView itself.

Here is the flow of displaying storage access API prompts:

  1. Gecko gets a call from the web page to document.requestStorageAccess().
  2. It proceeds to do a number of checks to determine whether the page is allowed to call the API (e.g. whether it's a third-party frame, whether ETP is active)
  3. If those checks pass, it proceeds to do a number of checks determining if it can predetermine the result of the storage access check without needing to prompt the user.
  4. If those checks determine that we don't need to prompt, storage access is granted.
  5. Otherwise, we prompt the user. At this point the GV embedder app gets involved and displays a prompt and communicates the result back to Gecko.
  6. Gecko responds to the API call from the web page based on the results either from step 4 or 5.

It is during step 3 when Gecko tries to access resource:///modules/BrowserUsageTelemetry.jsm and call the getUniqueDomainsVisitedInPast24Hours() API. Implementing this API by GeckoView is not optional if it wants to share the heuristics in step 3 with desktop. Without this JSM module, we do not support granting storage access automatically and will fail step 3 and will display a prompt for every call to the web page visible API.

Can you share the heuristic that's being used in step 3? A searchfox link would be fine.

It seems like it may be possible to conflate steps 3-5 in the prompt request to the app. The app wouldn't have to actually prompt if it has other information that indicates it doesn't need to. If all it needs to do is check if you visited in the last 24 hours, then that's something it will be able to do.

I filed a new Fenix issue ("Implement user permission prompt for ETP's Storage Access cookie API") and mentioned that Fenix should include this "getUniqueDomainsVisitedInPast24Hours" logic.

https://github.com/mozilla-mobile/fenix/issues/4916

Based on what was filed there, I think the summary of the situation here is that the GeckoView embedder is aware of the browsing history, so GeckoView itself can't provide a useful definition of getUniqueDomainsVisitedInPast24Hours(). I think that's quite fair and makes perfect sense.

My suggestion for how to handle this is a model like this:

  • We change the getUniqueDomainsVisitedInPast24Hours() API on both desktop/GV to return a promise that resolves to the number of unique domains visited in the past 24 hours.
  • On desktop we'll return Promise.resolve(result) where result is the number we're returning today.
  • On GV, we'll call back into the embedder asking them to calculate this number, and when it's available we'll resolve the promise with that.
  • We'll make sure Document::AutomaticStorageAccessCanBeGranted() is asynchronous in the parent process. It is already asynchronous in the content process so that should be quite easy.

How does this setup sound?

I think we need to have a high bar for exposing things like this in the public API. If we don't, we'll end up with an utterly massive surface that is hard to both maintain and use. In my opinion, due to the highly specific nature and overall utility of this new API, this doesn't pass the bar. We can use a slightly different heuristic, if necessary. If we end up prompting people more often than desktop I think that is probably acceptable, but there may be some other mitigations we can employ as well.

needinfo'ing Ehsan for James' questions in comment 4 about re-implementing the requestStorageAccess auto-allow heuristics in the app's Storage Access user prompt code.

Flags: needinfo?(ehsan)
Rank: 15
Priority: -- → P2

(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) (he/him) from comment #4)

(In reply to :ehsan akhgari from comment #3)

(In reply to Chris Peterson [:cpeterson] from comment #1)

GV doesn't keep browsing history. I think GV should always ask app (Fenix) to show the Storage Access user prompt and then the app implements this "getUniqueDomainsVisitedInPast24Hours" logic.

That's unfortunately not workable. What Fenix wants to do here isn't relevant, this bug is about GeckoView itself.

Here is the flow of displaying storage access API prompts:

  1. Gecko gets a call from the web page to document.requestStorageAccess().
  2. It proceeds to do a number of checks to determine whether the page is allowed to call the API (e.g. whether it's a third-party frame, whether ETP is active)
  3. If those checks pass, it proceeds to do a number of checks determining if it can predetermine the result of the storage access check without needing to prompt the user.
  4. If those checks determine that we don't need to prompt, storage access is granted.
  5. Otherwise, we prompt the user. At this point the GV embedder app gets involved and displays a prompt and communicates the result back to Gecko.
  6. Gecko responds to the API call from the web page based on the results either from step 4 or 5.

It is during step 3 when Gecko tries to access resource:///modules/BrowserUsageTelemetry.jsm and call the getUniqueDomainsVisitedInPast24Hours() API. Implementing this API by GeckoView is not optional if it wants to share the heuristics in step 3 with desktop. Without this JSM module, we do not support granting storage access automatically and will fail step 3 and will display a prompt for every call to the web page visible API.

Can you share the heuristic that's being used in step 3? A searchfox link would be fine.

https://searchfox.org/mozilla-central/rev/250f5cc9fb8bdcbb6b23d2a06acfd48addb2f99b/dom/base/Document.cpp#15648

The main heuristic is here, the rest of that function gathers the required information.

It seems like it may be possible to conflate steps 3-5 in the prompt request to the app. The app wouldn't have to actually prompt if it has other information that indicates it doesn't need to. If all it needs to do is check if you visited in the last 24 hours, then that's something it will be able to do.

This is definitely one way of designing things, but it's an inferior choice IMO because:

  • All GeckoView embedders would need to copy the exact same code, otherwise they would have a different implementation of the Storage Access API than other Gecko based applications.
  • We would need to create an expose other hooks into the asynchronous processing model behind requestStorageAccess() for this to work, and those hooks are almost guaranteed to be more complex than what this bug was filed for.
  • Updating the behaviour of the Storage Access API would require changing code in different places, potentially different repositories with different shipping modes, etc. as opposed to only updating the mozilla-central unified code we have now. We are almost certainly going to be modifying this heuristic in the future versions of Gecko.

Is there a reason you would prefer this over what's suggested here?

I filed a new Fenix issue ("Implement user permission prompt for ETP's Storage Access cookie API") and mentioned that Fenix should include this "getUniqueDomainsVisitedInPast24Hours" logic.

https://github.com/mozilla-mobile/fenix/issues/4916

Based on what was filed there, I think the summary of the situation here is that the GeckoView embedder is aware of the browsing history, so GeckoView itself can't provide a useful definition of getUniqueDomainsVisitedInPast24Hours(). I think that's quite fair and makes perfect sense.

My suggestion for how to handle this is a model like this:

  • We change the getUniqueDomainsVisitedInPast24Hours() API on both desktop/GV to return a promise that resolves to the number of unique domains visited in the past 24 hours.
  • On desktop we'll return Promise.resolve(result) where result is the number we're returning today.
  • On GV, we'll call back into the embedder asking them to calculate this number, and when it's available we'll resolve the promise with that.
  • We'll make sure Document::AutomaticStorageAccessCanBeGranted() is asynchronous in the parent process. It is already asynchronous in the content process so that should be quite easy.

How does this setup sound?

I think we need to have a high bar for exposing things like this in the public API. If we don't, we'll end up with an utterly massive surface that is hard to both maintain and use. In my opinion, due to the highly specific nature and overall utility of this new API, this doesn't pass the bar. We can use a slightly different heuristic, if necessary. If we end up prompting people more often than desktop I think that is probably acceptable, but there may be some other mitigations we can employ as well.

Public to whom, GV embedders or to Gecko? This paragraph confuses me quite a lot. I definitely didn't intend to suggest nsIBrowserUsage or anything under the hood from its implementation should be a public GV API. My goal in designing this was to ensure that the public GV API would only be the prompting API where the prompt would be dealt with very similarly to other prompt that GV shows, so that embedders do not need to care about any of this complexity. The problem mentioned in comment 1 (GV not maintaining history) for example could be alleviated by returning 0 or some such from getUniqueDomainsVisitedInPast24Hours(). I don't think all of the possibilities here have been enumerated at all...

But reading this I'm fairly convinced that we are probably talking about two (or more?) different things here, and to be honest I'm not really sure which specific idea you're rejecting in the last paragraph, so I'm a bit lost as to where to take this discussion. So far the suggestions here (having different prompting heuristics on mobile, having embedders involved in the prompting heuristics) are all explicit design non-goals as far as I'm concerned.

May I suggest perhaps at some point myself and whoever is going to work on this plus maybe you, Chris and other folks who are needed to get together to discuss this over before settling on really drastic resolutions here?

Flags: needinfo?(ehsan) → needinfo?(snorp)

(In reply to :ehsan akhgari from comment #6)
<snip>

My suggestion for how to handle this is a model like this:

  • We change the getUniqueDomainsVisitedInPast24Hours() API on both desktop/GV to return a promise that resolves to the number of unique domains visited in the past 24 hours.
  • On desktop we'll return Promise.resolve(result) where result is the number we're returning today.
  • On GV, we'll call back into the embedder asking them to calculate this number, and when it's available we'll resolve the promise with that.
  • We'll make sure Document::AutomaticStorageAccessCanBeGranted() is asynchronous in the parent process. It is already asynchronous in the content process so that should be quite easy.

How does this setup sound?

This is the proposal I'm responding to below, in which getUniqueDomainsVisitedInPast24Hours() is delegated to the app.

I think we need to have a high bar for exposing things like this in the public API. If we don't, we'll end up with an utterly massive surface that is hard to both maintain and use. In my opinion, due to the highly specific nature and overall utility of this new API, this doesn't pass the bar. We can use a slightly different heuristic, if necessary. If we end up prompting people more often than desktop I think that is probably acceptable, but there may be some other mitigations we can employ as well.

Public to whom, GV embedders or to Gecko? This paragraph confuses me quite a lot.

By "public", I'm talking about the GeckoView API exposed to applications.

I definitely didn't intend to suggest nsIBrowserUsage or anything under the hood from its implementation should be a public GV API.

But you proposed exactly this by delegating the implementation of getUniqueDomainsVisitedInPast24Hours() to the application. I guess I'm not understanding something correctly.

My goal in designing this was to ensure that the public GV API would only be the prompting API where the prompt would be dealt with very similarly to other prompt that GV shows, so that embedders do not need to care about any of this complexity. The problem mentioned in comment 1 (GV not maintaining history) for example could be alleviated by returning 0 or some such from getUniqueDomainsVisitedInPast24Hours(). I don't think all of the possibilities here have been enumerated at all...

If GV can get by with implementing a stub nsIBrowserUsage that can just always return 0 for getUniqueDomainsVisitedInPast24Hours(), then that's fine with me.

But reading this I'm fairly convinced that we are probably talking about two (or more?) different things here, and to be honest I'm not really sure which specific idea you're rejecting in the last paragraph, so I'm a bit lost as to where to take this discussion. So far the suggestions here (having different prompting heuristics on mobile, having embedders involved in the prompting heuristics) are all explicit design non-goals as far as I'm concerned.

May I suggest perhaps at some point myself and whoever is going to work on this plus maybe you, Chris and other folks who are needed to get together to discuss this over before settling on really drastic resolutions here?

Yeah I'm happy to talk further about this. Clearly there's some kind of misunderstanding on my end.

Flags: needinfo?(snorp)
Flags: needinfo?(ehsan)

(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) (he/him) from comment #7)

(In reply to :ehsan akhgari from comment #6)
<snip>

My suggestion for how to handle this is a model like this:

  • We change the getUniqueDomainsVisitedInPast24Hours() API on both desktop/GV to return a promise that resolves to the number of unique domains visited in the past 24 hours.
  • On desktop we'll return Promise.resolve(result) where result is the number we're returning today.
  • On GV, we'll call back into the embedder asking them to calculate this number, and when it's available we'll resolve the promise with that.
  • We'll make sure Document::AutomaticStorageAccessCanBeGranted() is asynchronous in the parent process. It is already asynchronous in the content process so that should be quite easy.

How does this setup sound?

This is the proposal I'm responding to below, in which getUniqueDomainsVisitedInPast24Hours() is delegated to the app.

As in the GV embedder? That wasn't really what I had in mind when designing the Gecko side of this to be honest. I was thinking (perhaps naively) that GV (and not its embedder) would implement that part.

I think we need to have a high bar for exposing things like this in the public API. If we don't, we'll end up with an utterly massive surface that is hard to both maintain and use. In my opinion, due to the highly specific nature and overall utility of this new API, this doesn't pass the bar. We can use a slightly different heuristic, if necessary. If we end up prompting people more often than desktop I think that is probably acceptable, but there may be some other mitigations we can employ as well.

Public to whom, GV embedders or to Gecko? This paragraph confuses me quite a lot.

By "public", I'm talking about the GeckoView API exposed to applications.

I understand now. I think both of us agree that nothing here (except for the prompting part) should ideally be exposed to the GV embedder.

I definitely didn't intend to suggest nsIBrowserUsage or anything under the hood from its implementation should be a public GV API.

But you proposed exactly this by delegating the implementation of getUniqueDomainsVisitedInPast24Hours() to the application. I guess I'm not understanding something correctly.

I did no such thing. :-) I suggested delegating the implementation of getUniqueDomainsVisitedInPast24Hours() to GeckoView. In comment 1, Chris said that GV doesn't keep browsing history and then closed the bug with the same suggestion that you made again in comment 4 (for Gecko to ask the GV embedder app to figure out whether or not to show a prompt for the storage access API and for the embedder to be responsible for implementing the heuristic, including computing the number of unique domains visited in the past 24 hours, and I reopened the bug because I don't think that is a good solution for the reasons I have stated before.

I am now editing the summary of this bug to clarify without any ambiguity what this bug is talking about. At least on my side, I am not proposing involving the embedder application in the prompting procedure in any way before Gecko decides to prompt. If GV on its own cannot determine "how many domains were visited in the past 24 hours" perhaps we can figure out how to adapt the heuristic for Gecko embedders that do not access to history or something like that.

My goal in designing this was to ensure that the public GV API would only be the prompting API where the prompt would be dealt with very similarly to other prompt that GV shows, so that embedders do not need to care about any of this complexity. The problem mentioned in comment 1 (GV not maintaining history) for example could be alleviated by returning 0 or some such from getUniqueDomainsVisitedInPast24Hours(). I don't think all of the possibilities here have been enumerated at all...

If GV can get by with implementing a stub nsIBrowserUsage that can just always return 0 for getUniqueDomainsVisitedInPast24Hours(), then that's fine with me.

Great, that's probably a fine solution here.

To clarify why this is, the prompting heuristic looks at 1% of the number of the domains visited in the past 24 hours, with a minimum cap of 5 domains, in order to prevent prompts from showing up before a tracker is about to obtain tracking power over a significant portion of the user's cross-site browsing activity (that is, we do not want to allow automatic access grants over 1% of the domains). We have the dom.storage_access.max_concurrent_auto_grants which establishes the minimum cap here (set to 5 by default) so if we return 0 here the minimum cap would always take effect. That would only become inaccurate if the user has browsed more than 500 top-level eTLD's in the past 24 hours, which should be a very unlikely scenario on mobile anyway.

But reading this I'm fairly convinced that we are probably talking about two (or more?) different things here, and to be honest I'm not really sure which specific idea you're rejecting in the last paragraph, so I'm a bit lost as to where to take this discussion. So far the suggestions here (having different prompting heuristics on mobile, having embedders involved in the prompting heuristics) are all explicit design non-goals as far as I'm concerned.

May I suggest perhaps at some point myself and whoever is going to work on this plus maybe you, Chris and other folks who are needed to get together to discuss this over before settling on really drastic resolutions here?

Yeah I'm happy to talk further about this. Clearly there's some kind of misunderstanding on my end.

I think now we are both on the same page. I'm going to submit a patch based on this discussion now...

Flags: needinfo?(ehsan)
Summary: Provide a minimal BrowserUsageTelemetry.jsm module that implements nsIBrowserUsage → Provide a minimal BrowserUsageTelemetry.jsm module in GeckoView (completely independent of its embedder) that implements nsIBrowserUsage
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/746c872d6242
Provide a minimal BrowserUsageTelemetry.jsm module in GeckoView that implements nsIBrowserUsage; r=snorp
Flags: needinfo?(ehsan)
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5a2bf4210a6e
Provide a minimal BrowserUsageTelemetry.jsm module in GeckoView that implements nsIBrowserUsage; r=snorp
Status: REOPENED → RESOLVED
Closed: 3 months ago2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla71
Flags: needinfo?(ehsan)

firefox70=wontfix because we don't need to uplift BrowserUsageTelemetry.jsm to GeckoView 70 Beta.

Assignee: nobody → ehsan
You need to log in before you can comment on or make changes to this bug.