Closed Bug 1499214 Opened 1 year ago Closed 1 year ago

Unify Tracking Protection and Safe Browsing APIs

Categories

(GeckoView :: General, enhancement, P1)

51 Branch
All
Android
enhancement

Tracking

(geckoview64 wontfix, geckoview65 wontfix, firefox63 wontfix, firefox64 wontfix, firefox65 wontfix, firefox66 wontfix, firefox67 fixed)

RESOLVED FIXED
mozilla67
Tracking Status
geckoview64 --- wontfix
geckoview65 --- wontfix
firefox63 --- wontfix
firefox64 --- wontfix
firefox65 --- wontfix
firefox66 --- wontfix
firefox67 --- fixed

People

(Reporter: esawin, Assigned: esawin)

References

Details

Attachments

(3 files, 4 obsolete files)

Currently, safe browsing and tracking protection are controlled through two separate APIs and callbacks while handling similar semantics in GeckoView.
Because of this, I'm proposing to merge these APIs into one.

Ideally, this should also help us reflect the latest changes of the content blocking settings on desktop and also prepare the API for easier extension in light of future requirements like custom blocklist support.

I've sketched out the unified API draft with some in-line comments:

    // Maybe we should start moving all the constants out of
    // GeckoSession for a cleaner and more modular API.
    UriClassifier {
      // Anti-tracking
      AD_TRACKER
      ANALYTICS_TRACKER
      SOCIAL_TRACKER
      CONTENT_TRACKER
      TEST_TRACKER

      // Safe Browsing
      MALWARE_URI
      UNWANTED_URI
      HARMFUL_URI
      PHISHING_URI

      // Used for custom blocklists
      BLOCKLIST

      // Alternatively, we can move the Delegates into their respective
      // context, instead of spamming GeckoSession.
      // Delegate { onUriBlocked(..) }
    }

    // A dedicated delegate or move the callback into NavigationDelegate.
    UriClassifierDelegate {
      // For blocked page loads (safe browsing) the returned string can
      // contain the error page URI (same behaviour as with onLoadError).
      GeckoResult<String>
      onUriBlocked(GeckoSession session,
                   String uri,
                   @UriClassifier.Type int type,
                   // For BLOCKLIST, blocklist would contain the custom name.
                   @Nullable String blocklist)
    }

For settings we can have a session setting for anti-tracking (aka tracking protection) and one runtime setting for the blocklist/type selection of the classifier.

    GeckoSessionSettings {
      USE_ANTI_TRACKING
    }

    GeckoRuntimeSettings {
      blockUriTypes(@UriClassifier.Type int types,
                    ArrayList<String> blocklists)
    }

The app may want to disable the classifier for a single load request, for this we'll add a new load flag.

    GeckoSession {
      LOAD_FLAGS_BYPASS_CLASSIFIER
    }

The load flag should make it possible to construct an error page with override links (e.g., using a custom URI scheme and add the load flag on onLoadRequest).

Additionally, we could look into supporting whitelisting of URIs, which will require some Gecko changes.

I would be happy about any feedback on the current draft, so far!
Flags: needinfo?(snorp)
(In reply to Eugen Sawin [:esawin] from comment #0)
> Currently, safe browsing and tracking protection are controlled through two
> separate APIs and callbacks while handling similar semantics in GeckoView.
> Because of this, I'm proposing to merge these APIs into one.
> 
> Ideally, this should also help us reflect the latest changes of the content
> blocking settings on desktop and also prepare the API for easier extension
> in light of future requirements like custom blocklist support.
> 
> I've sketched out the unified API draft with some in-line comments:
> 
>     // Maybe we should start moving all the constants out of
>     // GeckoSession for a cleaner and more modular API.

This seems fine to me, especially if we would like to use various UrlClassifier settings with WebExecutor one day.

>     UriClassifier {
>       // Anti-tracking
>       AD_TRACKER
>       ANALYTICS_TRACKER
>       SOCIAL_TRACKER
>       CONTENT_TRACKER
>       TEST_TRACKER
> 
>       // Safe Browsing
>       MALWARE_URI
>       UNWANTED_URI
>       HARMFUL_URI
>       PHISHING_URI
> 
>       // Used for custom blocklists
>       BLOCKLIST
> 
>       // Alternatively, we can move the Delegates into their respective
>       // context, instead of spamming GeckoSession.
>       // Delegate { onUriBlocked(..) }

So using that would be something like: session.getUrlClassifier().setDelegate(new UrlClassifier.Delegate() { ... }) right?

We discussed this at one of the all-hands and I think everyone liked the idea. I tried to start implementing, but it turned out to be a big PITA (though don't remember details). Worth revisiting, probably.

>     }
> 
>     // A dedicated delegate or move the callback into NavigationDelegate.
>     UriClassifierDelegate {
>       // For blocked page loads (safe browsing) the returned string can
>       // contain the error page URI (same behaviour as with onLoadError).
>       GeckoResult<String>
>       onUriBlocked(GeckoSession session,
>                    String uri,
>                    @UriClassifier.Type int type,
>                    // For BLOCKLIST, blocklist would contain the custom name.
>                    @Nullable String blocklist)
>     }

That seems ok. I think we could also consider a UrlClassifier.BlockResult or similar which will be more easily extended in the future; we wouldn't have to add arguments.

> 
> For settings we can have a session setting for anti-tracking (aka tracking
> protection) and one runtime setting for the blocklist/type selection of the
> classifier.
> 
>     GeckoSessionSettings {
>       USE_ANTI_TRACKING
>     }

Maybe USE_CONTENT_BLOCKING? What's the right terminology now?

Also, I really wish the session settings worked like the runtime ones. I cringe at the setBoolean() stuff.

> 
>     GeckoRuntimeSettings {
>       blockUriTypes(@UriClassifier.Type int types,
>                     ArrayList<String> blocklists)
>     }
> 
> The app may want to disable the classifier for a single load request, for
> this we'll add a new load flag.
> 
>     GeckoSession {
>       LOAD_FLAGS_BYPASS_CLASSIFIER
>     }
> 
> The load flag should make it possible to construct an error page with
> override links (e.g., using a custom URI scheme and add the load flag on
> onLoadRequest).

A custom scheme is a great idea for this kind of communication that I hadn't considered. Nice!

> 
> Additionally, we could look into supporting whitelisting of URIs, which will
> require some Gecko changes.

Yeah, blah. Things that modify persistent state like that are annoying. We'll need similar stuff for managing extensions one day. I was thinking stuff like that could hang off of GeckoRuntime: runtime.getUrlClassifierConfig().addBlocklist("https://something")

> 
> I would be happy about any feedback on the current draft, so far!

Spam the slack channel like I do :)
Flags: needinfo?(snorp)
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #1)
> So using that would be something like:
> session.getUrlClassifier().setDelegate(new UrlClassifier.Delegate() { ... })
> right?

I'm not sure if we'll need an actual UriClassifier instance, we could just use it for its namespace, so:

    session.setUriClassifierDelegate(new UriClassifier.Delegate() {})

Having an actual instance would make it nicer for keeping state, if we want that.

> We discussed this at one of the all-hands and I think everyone liked the
> idea. I tried to start implementing, but it turned out to be a big PITA
> (though don't remember details). Worth revisiting, probably.

I can look into it for this case and then we can decide whether that's something we want to enforce consistently.

> That seems ok. I think we could also consider a UrlClassifier.BlockResult or
> similar which will be more easily extended in the future; we wouldn't have
> to add arguments.

Good idea!

> Maybe USE_CONTENT_BLOCKING? What's the right terminology now?

My understanding is that anti-tracking is the new term for tracking protection, where content blocking is the broader term containing anti-tracking, cookie policy, FastBlock and maybe more.
 
> Also, I really wish the session settings worked like the runtime ones. I
> cringe at the setBoolean() stuff.

Do we want to file a separate bug to refactor (good first bug)?

> Yeah, blah. Things that modify persistent state like that are annoying.
> We'll need similar stuff for managing extensions one day. I was thinking
> stuff like that could hang off of GeckoRuntime:
> runtime.getUrlClassifierConfig().addBlocklist("https://something")

Shouldn't that better go into GeckoRuntimeSettings? e.g.:

    runtime.getSettings()[.uriClassifier].addBlocklists(...)
I have a couple of WIP patches we can use to discuss some of the ideas I would like to introduce with this work.

The first patch adds the UriClassifier runtime settings as a nested setting including nested builder.

Most of the rough edges in this code can be solved through generics and reflection, so please don't look too much into the implementation details yet.

With nested settings support we could provide this kind of user experience:
    
    GeckoRuntimeSettings settings = GeckoRuntimeSettings.newBuilder()
                                        .consoleOutput(true)
                                        .uriClassifier()
                                            .categories(...)
                                            .done()
                                        .remoteDebugging(true)
                                        .build();

At the same time, we could easily support (not in this patch) modular handling of nested setting, e.g.:

    UriClassifier.Settings ucSettings = UriClassifier.Settings.newBuilder()
                                            .categories(...)
                                            .build();

    GeckoRuntimeSettings settings = GeckoRuntimeSettings.newBuilder()
                                        .consoleOutput(true)
                                        .uriClassifier(ucSettings)
                                        .build();

With this patch we would also move an API module out of GeckoSession to improve maintainability and code separation.
Attachment #9019792 - Flags: feedback?(snorp)
Attachment #9019792 - Flags: feedback?(nchen)
Currently, we only allow a single nsILoadURIDelegate handler (GeckoViewNavigationContent) which makes it difficult to share responsibilities between GV modules.

For the UriClassifier API, we will want to enable custom error pages for safe browsing errors as before, while leaving onLoadError for actual page load errors.

To make this work with the load delegate, we have to allow multiple handlers for nsILoadURIDelegate.handleLoadError, one to handle page load errors and the other to handle safe browsing "errors".
Alternatively, we could move the proxy into a shared JSM module (a JSM module shared by two Java delegate handlers).

In my proposal, we designate LoadURIDelegate as the handler proxy to call into the registered handlers.
For the current scenario, chain-calling into the registered handlers and returning the result of the handler which successfully handled the callback works because the handlers don't handle the same error codes.

To make this future-proof, we would either need a solution that does not depend on a load delegation proxy (shared JSM module solution), introduce handler priorities or handler event filtering.
Attachment #9019797 - Flags: feedback?(snorp)
Attachment #9019797 - Flags: feedback?(nchen)
The UriClassifier module based on the new mechanics for reference.
Currently, we manually maintain an array of Pref instances to allow for automatic flushing (now committing).
With the introducing of nested settings, including pref-based settings, this becomes cumbersome to support.

RuntimePrefCollection is a helper to allow easier management of (nested) pref-based settings.
Attachment #9019800 - Flags: feedback?(snorp)
Attachment #9019800 - Flags: feedback?(nchen)
Comment on attachment 9019792 [details] [diff] [review]
0001-Bug-1499214-1.0-Add-nested-UriClassifier-runtime-set.patch

Review of attachment 9019792 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good! I wouldn't worry about the nesting part though; it makes things a little complex. This looks just as fine to me:

> GeckoRuntimeSettings settings = new GeckoRuntimeSettings.Builder()
>        .consoleOutput(true)
>        .uriClassifier(new UriClassifier.Builder()
>                .categories(...)
>                .build())
>        .remoteDebugging(true)
>        .build();

Most of our users will probably be using Kotlin. If we really want nesting, we should implement that on the Kotlin level by using Kotlin's type-safe-builder idiom [1].

[1] https://kotlinlang.org/docs/reference/type-safe-builders.html
Attachment #9019792 - Flags: feedback?(nchen) → feedback+
BTW in Kotlin, I think the above would look something like this,

> val settings = GeckoRuntimeSettings.with {
>     consoleOutput(true)
>     uriClassifier {
>         categories(...)
>     }
>     remoteDebugging(true)
> }
Comment on attachment 9019797 [details] [diff] [review]
0002-Bug-1499214-2.0-Allow-multiple-handlers-for-LoadURID.patch

Review of attachment 9019797 [details] [diff] [review]:
-----------------------------------------------------------------

I don't think this will interfere with async nsILoadURIDelegate (bug 1441059) too much, but it's something to keep in mind.

::: mobile/android/modules/geckoview/LoadURIDelegate.jsm
@@ +19,5 @@
> +  _handlers: [],
> +
> +  register: function(aDocShell, aModule) {
> +    if (!this._handlers.length) {
> +      aDocShell.loadURIDelegate = this;

How would multiple different docshells be handled? Seems like the same handlers would be called even if that handler was registered with a different docshell?

@@ +21,5 @@
> +  register: function(aDocShell, aModule) {
> +    if (!this._handlers.length) {
> +      aDocShell.loadURIDelegate = this;
> +    }
> +    if (this._handlers.indexOf(aModule) >= 0) {

`this._handlers.includes(aModule)`

@@ +38,5 @@
> +
> +  // nsILoadURIDelegate.
> +  loadURI: function(...args) {
> +    let res = false;
> +    this._handlers.forEach(handler => {

This should return early if a handler already handled the call.

@@ +39,5 @@
> +  // nsILoadURIDelegate.
> +  loadURI: function(...args) {
> +    let res = false;
> +    this._handlers.forEach(handler => {
> +      res = res || typeof handler.loadURI === "function"

Add some parentheses
Attachment #9019797 - Flags: feedback?(nchen) → feedback+
Comment on attachment 9019800 [details] [diff] [review]
0004-Bug-1499214-4.0-Use-RuntimePrefCollection-to-manage-.patch

Review of attachment 9019800 [details] [diff] [review]:
-----------------------------------------------------------------

Looks reasonable
Attachment #9019800 - Flags: feedback?(nchen) → feedback+
Comment on attachment 9019792 [details] [diff] [review]
0001-Bug-1499214-1.0-Add-nested-UriClassifier-runtime-set.patch

Review of attachment 9019792 [details] [diff] [review]:
-----------------------------------------------------------------

I agree with Jim, the nesting seems like it adds complexity without gaining a whole lot in return.
Attachment #9019792 - Flags: feedback?(snorp) → feedback+
Comment on attachment 9019797 [details] [diff] [review]
0002-Bug-1499214-2.0-Allow-multiple-handlers-for-LoadURID.patch

Review of attachment 9019797 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/modules/geckoview/LoadURIDelegate.jsm
@@ +19,5 @@
> +  _handlers: [],
> +
> +  register: function(aDocShell, aModule) {
> +    if (!this._handlers.length) {
> +      aDocShell.loadURIDelegate = this;

Yeah, this seems like it'll be a problem that we need to fix.
Attachment #9019797 - Flags: feedback?(snorp) → feedback+
Attachment #9019800 - Flags: feedback?(snorp) → feedback+
64=wontfix because Eugen says this API change doesn't need to be uplifted to Beta.
Product: Firefox for Android → GeckoView
Depends on: 1515728
Attachment #9019792 - Attachment is obsolete: true
Attachment #9019797 - Attachment is obsolete: true
Attachment #9019799 - Attachment is obsolete: true
Attachment #9019800 - Attachment is obsolete: true
See Also: → 1517641
Attachment #9032786 - Attachment description: Bug 1499214 - [1.1] Add Content Blocking API. → Bug 1499214 - [1.2] Add Content Blocking API.
Attachment #9032786 - Attachment description: Bug 1499214 - [1.2] Add Content Blocking API. → Bug 1499214 - [1.3] Add Content Blocking API.
Attachment #9032786 - Attachment description: Bug 1499214 - [1.3] Add Content Blocking API. → Bug 1499214 - [1.4] Add Content Blocking API.
Attachment #9032786 - Attachment description: Bug 1499214 - [1.4] Add Content Blocking API. → Bug 1499214 - [1.5] Add Content Blocking API.
Attachment #9032786 - Attachment description: Bug 1499214 - [1.5] Add Content Blocking API. → Bug 1499214 - [1.6] Add Content Blocking API.
Pushed by esawin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/edf04064498b
[1.6] Add Content Blocking API. r=snorp,geckoview-reviewers
https://hg.mozilla.org/integration/autoland/rev/f6a3ce97d18a
[2.0] Update content blocking tests and GVE. r=snorp
Pushed by esawin@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2af45706cd19
[3.0] Fix changelog version numbers for latest commits.
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla67
Depends on: 1525155
You need to log in before you can comment on or make changes to this bug.