Closed Bug 1423229 Opened 7 years ago Closed 6 years ago

Extend GeckoView tracking protection API

Categories

(GeckoView :: General, enhancement)

51 Branch
All
Android
enhancement
Not set
normal

Tracking

(firefox60 fixed)

RESOLVED FIXED
mozilla60
Tracking Status
firefox60 --- fixed

People

(Reporter: esawin, Assigned: esawin)

References

Details

Attachments

(4 files, 13 obsolete files)

5.91 KB, patch
francois
: review+
Details | Diff | Splinter Review
6.78 KB, patch
esawin
: review+
Details | Diff | Splinter Review
14.15 KB, patch
esawin
: review+
Details | Diff | Splinter Review
3.21 KB, patch
esawin
: review+
Details | Diff | Splinter Review
Currently, we only support enabling and disabling of tracking protection (TP) through the GeckoView settings.

We would like to provide a TP API to allow for more customization and monitoring.
Initially, we want to expose the blocked trackers counter and allow setting the tracking table by tracker category.

Optionally, we want to expose individual stats on the blocked elements like its URL and tracker category.

In future, we would like to allow for per-GeckoSession configuration of the tracking table which is currently app-global.
Add "MozTrackingProtection:Blocked" event for individual element blocking callbacks. This might not be necessary, if we already have an event which coincides with aborted channel loads.
Currently, only an app-global TP configuration is supported. With the TrackingProtection class we would allow static configuration of the tracking table by tracker categories. Eventually, this could move to GeckoSession/GeckoSessionSettings, but that would require some modifications in Gecko.

The GeckoSession listener would provide a callback to track the individual blocked elements (URL and tracker category) and the total blocked elements counter. The tracker category currently defaults to Tracker.AD, but we should be able to support this with some minor modifications in Gecko.

Also this adds yet another JSM/content module pair, but that's an issue we need to solve in a general fashion.
Add TP listener to the GeckoView example app for demonstration.
Depends on: 1425075
Moving the discussion from e-mail to this bug, these are the open items that need to be addressed to fully support the TP API draft:

1. Category-based tracker list split (ad, analytic, social, content)
2. Category-based blacklist result type reporting. Currently, NS_ERROR_TRACKING_URI is the only signal we receive for blocked elements, but we would like to know the relevant category or tracker list
3. Per-GeckoSession configuration of the tracking table, which is currently tied to the Gecko-global pref

#1 is being addressed in bug 1425075.

To support #2, we could expose the classification results (nsIClassifiedChannel) through a callback - do we want a dedicated listener for this or would this work with nsIWebProgressListener?

At this point, we also need to decide whether we want to explicitly support the tracker category in Gecko or implicitly through the tracker list (name) - do we want Gecko to know about tracker "categories"?

The current plan is for GeckoView to maintain the mapping between GeckoView's tracker categories and Gecko's tracker lists and keeping this connection as loose as possible is desirable (note: currently patch 2 exposes the tracker list names, which will be refactored).

#3 may be addressed in a follow-up bug, it's not a blocking issue for the initial API implementation.
We can avoid introducing additional events if we could make the matched tracker list accessible in the onSecurityChange callback.

There might be a better way to do this, but this would be an example implementation that's sufficient to support the GeckoView TP API.
Attachment #8934566 - Attachment is obsolete: true
The GV TP module implementation based on patch 1.1.

With bug 1425075 resolved, we need to update the category-list mappings, currently all trackers are classified as either AD or CONTENT.
Attachment #8934572 - Attachment is obsolete: true
Attachment #8945088 - Flags: review?(snorp)
This is how an example use case of the new API could look like.
Attachment #8934573 - Attachment is obsolete: true
Attachment #8945090 - Flags: review?(snorp)
Comment on attachment 8945088 [details] [diff] [review]
0002-Bug-1423229-2.1-Add-GeckoView-tracking-protection-AP.patch

Jim, can you please comment on the TrackingProtection.Tracker enum, would you consider this to be a good case for an enum or should I rewrite this to use integers?

Using the enum it makes it easy to keep the mapping between the public API categories and the internal Gecko tracker lists.
Attachment #8945088 - Flags: feedback?(nchen)
Comment on attachment 8945087 [details] [diff] [review]
0001-Bug-1423229-1.1-Enable-nsIClassifiedChannel-interfac.patch

Mike, do you think there is a better way to pass the matched list around?
Attachment #8945087 - Flags: review?(mconley)
Comment on attachment 8945088 [details] [diff] [review]
0002-Bug-1423229-2.1-Add-GeckoView-tracking-protection-AP.patch

Review of attachment 8945088 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoSession.java
@@ +186,5 @@
> +                                      final String event,
> +                                      final GeckoBundle message,
> +                                      final EventCallback callback) {
> +
> +                if ("GeckoView:TrackingProtection:Blocked".equals(event)) {

"GeckoView:TrackingProtectionBlocked" for consistency

@@ +695,5 @@
> +    * Set the tracking protection callback handler.
> +    * This will replace the current handler.
> +    * @param listener An implementation of TrackingProtectionListener.
> +    */
> +    public void setTrackingProtectionListener(TrackingProtectionListener listener) {

We need a `getTrackingProtectionListener` (and a `getScrollListener` for that matter)

@@ +1663,5 @@
> +    public interface TrackingProtectionListener {
> +        /**
> +         * A tracking element has been blocked from loading.
> +         *
> +        * @param view The GeckoSession that initiated the callback.

@param session

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/TrackingProtection.java
@@ +10,5 @@
> +import java.util.HashSet;
> +
> +import android.util.Log;
> +
> +public class TrackingProtection {

Tracking protection is already in GeckoSessionSettings, so we should probably just go with per-GeckoSession TrackingProtection object, e.g. `geckoSession.getTrackingProtection().block(...)`. These would all share the same global state, so change in one is visible in another, but allows for per-GeckoSession state in the future.

@@ +11,5 @@
> +
> +import android.util.Log;
> +
> +public class TrackingProtection {
> +    public enum Tracker {

I think we should just use string constants located in TrackingProtectionListener

@@ +52,5 @@
> +    private static final String LISTS_PREF =
> +        "browser.safebrowsing.provider.mozilla.lists";
> +
> +    public static void block(final EnumSet<Tracker> trackers) {
> +        HashSet used = new HashSet<String>();

Using a HashSet seems like overkill versus using a string search or something.

@@ +53,5 @@
> +        "browser.safebrowsing.provider.mozilla.lists";
> +
> +    public static void block(final EnumSet<Tracker> trackers) {
> +        HashSet used = new HashSet<String>();
> +        String prefValue = "test-track-simple";

Use StringBuilder

@@ +62,5 @@
> +            used.add(tracker.getList());
> +            prefValue += "," + tracker.getList();
> +        }
> +
> +        Log.d(LOGTAG, "blocking " + trackers);

`if (DEBUG)` or something

@@ +63,5 @@
> +            prefValue += "," + tracker.getList();
> +        }
> +
> +        Log.d(LOGTAG, "blocking " + trackers);
> +        PrefsHelper.setPref(TRACKERS_PREF, prefValue);

Using per-GeckoSession TrackingProtection also lets us not use PrefsHelper, which I think can go into Fennec since it's not used elsewhere in GeckoView code.
Attachment #8945088 - Flags: feedback?(nchen) → feedback+
(In reply to Eugen Sawin [:esawin] from comment #8)
> Comment on attachment 8945088 [details] [diff] [review]
> 0002-Bug-1423229-2.1-Add-GeckoView-tracking-protection-AP.patch
> 
> Jim, can you please comment on the TrackingProtection.Tracker enum, would
> you consider this to be a good case for an enum or should I rewrite this to
> use integers?
> 
> Using the enum it makes it easy to keep the mapping between the public API
> categories and the internal Gecko tracker lists.

Sorry didn't see you asked me to comment on the enum specifically. I think string constants would work in this case, e.g.

> public interface TrackingProtectionListener {
>     final String CATEGORY_AD = "base";
>     final String CATEGORY_ANALYTIC = "base";
>     final String CATEGORY_SOCIAL = "base";
>     final String CATEGORY_CONTENT = "content";
>
>     void onTrackerBlocked(GeckoSession session, String uri,
>                           String category);
> }
Comment on attachment 8945088 [details] [diff] [review]
0002-Bug-1423229-2.1-Add-GeckoView-tracking-protection-AP.patch

Review of attachment 8945088 [details] [diff] [review]:
-----------------------------------------------------------------

I think I'd like to see another version with jchen's suggestions too.

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoSession.java
@@ +1659,5 @@
>          */
>          public void onScrollChanged(GeckoSession session, int scrollX, int scrollY);
>      }
>  
> +    public interface TrackingProtectionListener {

TrackingProtectionDelegate

::: mobile/android/modules/geckoview/GeckoViewTrackingProtection.jsm
@@ +14,5 @@
> +XPCOMUtils.defineLazyGetter(this, "dump", () =>
> +    Cu.import("resource://gre/modules/AndroidLog.jsm",
> +              {}).AndroidLog.d.bind(null, "ViewTrackingProtection"));
> +
> +function debug(aMsg) {

I'm kinda getting tired of us pasting this everywhere. We should just have a debug() message in the GeckoViewModule, then have a way to enable it for each one. You don't have to fix that in this patch, though.
Attachment #8945088 - Flags: review?(snorp) → review-
Attachment #8945090 - Flags: review?(snorp) → review+
(In reply to Jim Chen [:jchen] [:darchons] from comment #11)
> > public interface TrackingProtectionListener {
> >     final String CATEGORY_AD = "base";
> >     final String CATEGORY_ANALYTIC = "base";
> >     final String CATEGORY_SOCIAL = "base";
> >     final String CATEGORY_CONTENT = "content";
> >
> >     void onTrackerBlocked(GeckoSession session, String uri,
> >                           String category);
> > }

Using strings instead of the enum has the following disadvantages:
1. internal Gecko list names are exposed in a public GV API (or we need an additional explicit mapping between the public constants and the Gecko list names)
2. no type safety

Do we have strong reasons to discard Java enums in general?
(In reply to Eugen Sawin [:esawin] from comment #13)
> Using strings instead of the enum has the following disadvantages:
> 1. internal Gecko list names are exposed in a public GV API (or we need an
> additional explicit mapping between the public constants and the Gecko list
> names)

Even with enums, you still need to keep track of the internal list name inside the enum, so we don't actually gain much benefit to use enums in this case. Conversely, I don't think using constants is a disadvantage.

Using enums also results in some peculiar behavior in this case. For example, if you call block with SOCIAL and CONTENT, you'll actually get onBlocked calls with AD and CONTENT as categories, because we have no way of differentiating AD and SOCIAL at the moment. Using string constants leaves it up to the user to decide what makes sense given the string categories. Using enums forces us to have a unique list name for each enum member. 

> 2. no type safety

This is true, but I don't think it's that bad. If people make mistakes here because we don't give them an enum to use, I hope they never have to program in JS or Python :)

> Do we have strong reasons to discard Java enums in general?

They're just very cumbersome to use IMO, not to mention the extra memory/perf cost due to enums being separate classes. Android also has some precedent of preferring int/string constants over enums. We should use enums in some cases, but we should use them sparingly and in compelling situations.
(In reply to Jim Chen [:jchen] [:darchons] from comment #14)
> Even with enums, you still need to keep track of the internal list name
> inside the enum, so we don't actually gain much benefit to use enums in this
> case. Conversely, I don't think using constants is a disadvantage.

The mapping would be implicit and private through the typed enum.

> Using enums also results in some peculiar behavior in this case. For
> example, if you call block with SOCIAL and CONTENT, you'll actually get
> onBlocked calls with AD and CONTENT as categories, because we have no way of
> differentiating AD and SOCIAL at the moment. Using string constants leaves
> it up to the user to decide what makes sense given the string categories.
> Using enums forces us to have a unique list name for each enum member. 

Enum identities are distinct from their values, so in this case even for equal list names, we would get "blocking [AD, SOCIAL, ANALYTIC, CONTENT]". For the reverse case (in the callback), there is no difference between using an enum and constants. For this bug, it's not an issue because we're blocking on the tracker list splitting bug, so we would have unique values for each list before landing.

> > 2. no type safety
> 
> This is true, but I don't think it's that bad. If people make mistakes here
> because we don't give them an enum to use, I hope they never have to program
> in JS or Python :)

An API should be designed in a way that makes it difficult to be misused, type safety is a effective tool to support this.
However, if the decision on reducing/avoiding enums is final, consistency is more important, so I'll rewrite the enum to use constants.
Addressed comments.

Refactored tracker categories enum to integer constants.
Moved public API to GeckoSession.java and implementation details to TrackingProtection.java.

The per-GeckoSession configuration might be misleading until we actually support this, but this way the API won't have to change when we do.
Attachment #8945088 - Attachment is obsolete: true
Attachment #8945822 - Flags: review?(snorp)
Comment on attachment 8945822 [details] [diff] [review]
0002-Bug-1423229-2.2-Add-GeckoView-tracking-protection-AP.patch

Review of attachment 8945822 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm with nits

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoSession.java
@@ +1684,5 @@
> +         * A tracking element has been blocked from loading.
> +         *
> +        * @param session The GeckoSession that initiated the callback.
> +        * @param uri The URI of the blocked element.
> +        * @param category The tracker category of the blocked element.

Is this always a single category or can it be an OR of multiple categories?

@@ +1692,5 @@
> +
> +    /**
> +     * Enable tracking protection.
> +     * @param categories The categories of trackers that should be blocked
> +     *                   (TrackingProtectionDelegate.CATEGORY_* bitmask flag).

I think you want something like:

@param categories The categories of trackers that should be blocked. Use
                  one or more of the {@link #CATEGORY_AD CATEGORY_*} flags.

::: mobile/android/geckoview/src/main/java/org/mozilla/gecko/TrackingProtection.java
@@ +10,5 @@
> +import android.util.Log;
> +
> +import org.mozilla.gecko.GeckoSession.TrackingProtectionDelegate;
> +
> +public class TrackingProtection {

This should have 'package' visibility, right, since the public interface is in GeckoSession?
Attachment #8945822 - Flags: review?(snorp) → review+
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #17)
> Is this always a single category or can it be an OR of multiple categories?

It's always a single category, we will have a 1:1 mapping for the tracker lists.

> This should have 'package' visibility, right, since the public interface is
> in GeckoSession?

Correct, I will fix it.
Addressed comments.
Attachment #8945822 - Attachment is obsolete: true
Attachment #8945833 - Flags: review+
Rebased.
Attachment #8945090 - Attachment is obsolete: true
Attachment #8945837 - Flags: review+
(In reply to Eugen Sawin [:esawin] from comment #9)
> Comment on attachment 8945087 [details] [diff] [review]
> 0001-Bug-1423229-1.1-Enable-nsIClassifiedChannel-interfac.patch
> 
> Mike, do you think there is a better way to pass the matched list around?

Sorry for the delay - was wrestling with the chemspill. Looking.
Comment on attachment 8945087 [details] [diff] [review]
0001-Bug-1423229-1.1-Enable-nsIClassifiedChannel-interfac.patch

Review of attachment 8945087 [details] [diff] [review]:
-----------------------------------------------------------------

This seems okay, assuming that we're okay with the matchedList only being populated in onSecurityChange updates.

::: toolkit/content/browser-child.js
@@ +220,5 @@
>      let json = this._setupJSON(aWebProgress, aRequest);
>      let objects = this._setupObjects(aWebProgress, aRequest);
>  
> +    let classChannel = aRequest.QueryInterface(Ci.nsIClassifiedChannel);
> +    json.matchedList = classChannel.matchedList;

This means that matchedList is only ever going to be set onSecurityChange. For other Content: messages from browser-child.js (like LocationChange, StatusChange, etc), matchedList is going to be null. Is that okay / expected?
Attachment #8945087 - Flags: review?(mconley) → review+
(In reply to Mike Conley (:mconley) (:⚙️) from comment #22)
> This means that matchedList is only ever going to be set onSecurityChange.
> For other Content: messages from browser-child.js (like LocationChange,
> StatusChange, etc), matchedList is going to be null. Is that okay / expected?

That's expected, I only want to propagate the matched list when it's set in nsChannelClassifier::SetBlockedContent.

Also, currently we're only interested in the matchedList attribute, however we could expose all nsIClassifiedChannel attributes while we are at it, if that's the preferred way of handling this.
(In reply to Eugen Sawin [:esawin] from comment #23)
> Also, currently we're only interested in the matchedList attribute, however
> we could expose all nsIClassifiedChannel attributes while we are at it, if
> that's the preferred way of handling this.

Hm, I'm actually okay exposing only a subset. RemoteWebProgress was really just a shim so that some front-endy things could continue to work with our new multi-process overlords. I'd rather not add things that nothing ends up using. In fact, the less we add to the shims, probably the better.
Appendix to patch 2 (will be squashed) to support multiple categories per tracker.
Although we have a 1:1 mapping between tracker list and tracker category, the lists may contain redundant elements (e.g., the content tracker list seems to contain most of the ad, analytic trackers).
Attachment #8947240 - Flags: review?(snorp)
Appendix to patch 3 (will be squashed) to support multiple categories per tracker.
Attachment #8947241 - Flags: review?(snorp)
Attachment #8947239 - Flags: review?(francois) → review+
Attachment #8947240 - Flags: review?(snorp) → review+
Comment on attachment 8947241 [details] [diff] [review]
0006-Bug-1423229-3.2.1-Support-multiple-categories-per-tr.patch

Review of attachment 8947241 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/geckoview_example/src/main/java/org/mozilla/geckoview_example/GeckoViewActivity.java
@@ +336,2 @@
>  
>          private void clearCounters() {

We never call this AFAICT

@@ +341,3 @@
>          }
>  
>          private void logCounters() {

Same as above
Attachment #8947241 - Flags: review?(snorp) → review-
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #28)
> Comment on attachment 8947241 [details] [diff] [review]
> 0006-Bug-1423229-3.2.1-Support-multiple-categories-per-tr.patch
> 
> Review of attachment 8947241 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> :::
> mobile/android/geckoview_example/src/main/java/org/mozilla/geckoview_example/
> GeckoViewActivity.java
> @@ +336,2 @@
> >  
> >          private void clearCounters() {
> 
> We never call this AFAICT
> 
> @@ +341,3 @@
> >          }
> >  
> >          private void logCounters() {
> 
> Same as above

Sorry for the confusion, we call these in onPage{Start, Stop} respectively in patch 3.2.
Flags: needinfo?(snorp)
Flags: needinfo?(snorp)
Merged.
Attachment #8945833 - Attachment is obsolete: true
Attachment #8947240 - Attachment is obsolete: true
Attachment #8947947 - Flags: review+
Merged
Attachment #8945837 - Attachment is obsolete: true
Attachment #8947241 - Attachment is obsolete: true
Attachment #8947948 - Flags: review+
Pushed by esawin@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5417194f3778
[1.1] Enable nsIClassifiedChannel interface in onSecurityChange callbacks. r=mconley
https://hg.mozilla.org/integration/mozilla-inbound/rev/c86887b7f327
[2.4] Add GeckoView tracking protection API and module. r=snorp
https://hg.mozilla.org/integration/mozilla-inbound/rev/8ff1f049ac96
[3.3] Add example tracking protection listener to geckoview_example. r=snorp
https://hg.mozilla.org/integration/mozilla-inbound/rev/39d76cb8c73e
[4.0] Add support for category-based tracking lists. r=francois
Pushed by esawin@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2ee49f44e617
[6.0] Add missing file to build config on a CLOSED TREE. r=bustage
Backed out for failing browser chrome at browser/base/content/test/siteIdentity/browser_bug1045809.js and firefox functional at testing/firefox-ui/tests/functional/security/test_mixed_script_content_blocking.py TestMixedScriptContentBlocking.test_mixed_content_page and mochitest devtools at devtools/client/webconsole/test/browser_webconsole_block_mixedcontent_securityerrors.js

Push that caused the failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=39d76cb8c73ef787ded4e9b218e469514d78d974

Browser chrome failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=160153786&repo=mozilla-inbound&lineNumber=2064

Firefox Functional failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=160153786&repo=mozilla-inbound&lineNumber=2064

Mochitest devtools failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=160162262&repo=mozilla-inbound&lineNumber=3273

Backout: https://hg.mozilla.org/integration/mozilla-inbound/rev/aed5b139336a77659e4472e3639b9f92da92ace3
Flags: needinfo?(esawin)
I didn't account for aRequest == null, looks better with this patch: https://treeherder.mozilla.org/#/jobs?repo=try&revision=86e9ee5a5b45f9cbf0bd9d85053fe7a40b64dcf1
Attachment #8945087 - Attachment is obsolete: true
Flags: needinfo?(esawin)
Attachment #8948462 - Flags: review?(mconley)
Merged splinter and build fix.
Attachment #8947947 - Attachment is obsolete: true
Attachment #8948463 - Flags: review+
Comment on attachment 8948462 [details] [diff] [review]
0001-Bug-1423229-1.2-Enable-nsIClassifiedChannel-interfac.patch

Review of attachment 8948462 [details] [diff] [review]:
-----------------------------------------------------------------

Good catch, thanks.

::: toolkit/content/browser-child.js
@@ +225,5 @@
>      json.status = SecurityUI.getSSLStatusAsString();
>  
> +    json.matchedList = null;
> +    if (aRequest && aRequest instanceof Ci.nsIClassifiedChannel) {
> +      json.matchedList = aRequest.QueryInterface(Ci.nsIClassifiedChannel).matchedList;

Using instanceof effectively does QueryInterface on aRequest anyways, so you can skip the second QueryInterface, and have this be:

```js
if (aRequest && aRequest instanceof Ci.nsIClassifedChannel) {
  json.matchedList = aRequest.matchedList;
}
```
Attachment #8948462 - Flags: review?(mconley) → review+
Addressed comment.
Attachment #8948462 - Attachment is obsolete: true
Attachment #8948508 - Flags: review+
Pushed by esawin@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/67f3c10d7ec0
[1.3] Enable nsIClassifiedChannel interface in onSecurityChange callbacks. r=mconley
https://hg.mozilla.org/integration/mozilla-inbound/rev/4de96fb717c0
[2.5] Add GeckoView tracking protection API and module. r=snorp
https://hg.mozilla.org/integration/mozilla-inbound/rev/dfcb4f42a9be
[3.3] Add example tracking protection listener to geckoview_example. r=snorp
https://hg.mozilla.org/integration/mozilla-inbound/rev/fc41424ab9ce
[4.0] Add support for category-based tracking lists. r=francois
Product: Firefox for Android → GeckoView
Target Milestone: Firefox 60 → mozilla60
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: