Closed Bug 1661330 Opened 4 years ago Closed 3 years ago

Update the content blocking log to show when a resource is allowed by shim heuristics

Categories

(Core :: Privacy: Anti-Tracking, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
89 Branch
Tracking Status
firefox89 --- fixed

People

(Reporter: twisniewski, Assigned: pbz)

References

(Regression)

Details

(Keywords: regression)

Attachments

(2 files, 2 obsolete files)

Right now, ETP shims must allow network requests through content blocking so they can use extension APIs to redirect those requests to a local file bundled with Firefox (the actual shim).

Unfortunately, this results in the UI in the shield icon not knowing that the request was blocked, which causes confusion and concern for users (as reported for Fenix at https://github.com/mozilla-mobile/fenix/issues/14071).

We will need a JS-accessible way to address this from the webcompat built-in extension, presumably by allowing it to add to the content blocking log which the UI uses.

Note that it would also be best for the UI to know whether a request is shimmed versus simply blocked, so adding a flag that can later be used by the UI might be best. (This could likewise be useful for the UI to be able to reveal when an add-on is blocking a given request, in the future).

Flags: needinfo?(nhnt11)
Severity: -- → S1
Severity: S1 → S2

I dug into this a bit and I have some preliminary thoughts.

  1. The shimming code is using an experiment API to unblock() the channel if we want to shim the resource.
  2. unblock() basically causes URL classifier to react as if the resource is allow-listed.
  3. The UI uses the content blocking log to show those entries - and as far as the content blocking log knows, we allowed those resources through.

I'm not sure that making shimming a "first-class citizen" of the content blocking state is the best approach here, mostly because the shimming code lives in an add-on and I feel like it'd be a lot of work to add the right hooks into the anti-tracking flows to make this work.

Currently I'm liking the idea of instead having a separate store of shimmed resources per-site, and the UI can cross-check content blocking log entries with that list to see if an "allowed" entry was actually shimmed. This of course has its own downside, which is the addition of an extra source of truth that defines the UI state. I wouldn't be against that though, with a more comprehensive solution (i.e., giving shimming its own state flag and surfacing that information in the content blocking log) doable later. Another idea would be to expose something similar to "unblock" but specifically to tell URL classifier that the channel is still "blocked", but we are handling the actual "blocking" manually.

Dimi, can you clarify the exact meaning of a "blocked" channel? Is the URL classifier responsible not only for classification of channels but also preventing loads?

Tim, do you have any ideas on how we might be able to bridge extension code with the anti tracking backend such that the extension is able to add an additional content blocking state flag for a given resource?

Tom, what do you think about my intermediate solution where we simply store a list of shimmed resources per-site and cross-check that list before showing "allowed" items in the UI?

Flags: needinfo?(twisniewski)
Flags: needinfo?(tihuang)
Flags: needinfo?(nhnt11)
Flags: needinfo?(dlee)

Actually, I wonder if maybe we are overthinking this?

Right now, we know the addon is calling unblock(), and that it will either end up:

  • blocking the request itself, by canceling it.
  • shimming it, by redirecting the request to a webextension: URL.
  • actually leaving it unblocked, if the user has intentionally opted into that behavior.

So maybe we don't need a new API or shared data at all, if the call to unblock() can just observe what ends up happening to the unblocked request. For now, it could assume that if the request was canceled or redirected, it was blocked, otherwise it was not. Later, it and the UI could be updated to note which addon ID took over the request (so the user knows the webcompat addon is doing the deed), as well as whether it ended up blocking or shimming the request.

What do you think?

a separate store of shimmed resources per-site,

With an API like Services.cpmm.sharedData? That sounds fine to me.

Flags: needinfo?(twisniewski) → needinfo?(nhnt11)

(In reply to Nihanth Subramanya [:nhnt11] from comment #1)

Dimi, can you clarify the exact meaning of a "blocked" channel? Is the URL classifier responsible not only for classification of channels but also preventing loads?

Yes, URL Classifier is responsible for both classification and blocking (in the case of Tracking Protection).

If doing this in the UI side is not what we want or not doable, I think a viable solution here will be after calling the “unblock” API, URL Classifier still notifies this “unblock” event and the anti-tracking backend will store it in the content blocking log.
My question is, do we need “multiple flag” for this case or is simply an “unblock” state for the unblocked channel good enough in our use case.

Flags: needinfo?(dlee)

(In reply to Thomas Wisniewski [:twisniewski] from comment #2)

Actually, I wonder if maybe we are overthinking this?

Right now, we know the addon is calling unblock(), and that it will either end up:

  • blocking the request itself, by canceling it.
  • shimming it, by redirecting the request to a webextension: URL.
  • actually leaving it unblocked, if the user has intentionally opted into that behavior.

So maybe we don't need a new API or shared data at all, if the call to unblock() can just observe what ends up happening to the unblocked request. For now, it could assume that if the request was canceled or redirected, it was blocked, otherwise it was not. Later, it and the UI could be updated to note which addon ID took over the request (so the user knows the webcompat addon is doing the deed), as well as whether it ended up blocking or shimming the request.

What do you think?

I agree that tweaking unblock to either be smarter about inferring the final fate of the resource or to allow the caller to indicate whether it's really being unblocked or the blocking is getting handled externally would be good options. But Dimi is much better positioned than I to comment the complexity of such solutions.

a separate store of shimmed resources per-site,

With an API like Services.cpmm.sharedData? That sounds fine to me.

Yeah, that would work. Alternatively, a small JS module or even the gProtectionsHandler object in the browser window scope could own that information. The experiment API could set state directly or emit an observer notification that the state manager would intercept if we want further hygiene. This data doesn't really need to be cross process right? The experiment API is scoped to the parent process which is also where the UI code lives.

(In reply to Dimi Lee [:dimi][:dlee] from comment #3)

(In reply to Nihanth Subramanya [:nhnt11] from comment #1)

Dimi, can you clarify the exact meaning of a "blocked" channel? Is the URL classifier responsible not only for classification of channels but also preventing loads?

Yes, URL Classifier is responsible for both classification and blocking (in the case of Tracking Protection).

If doing this in the UI side is not what we want or not doable, I think a viable solution here will be after calling the “unblock” API, URL Classifier still notifies this “unblock” event and the anti-tracking backend will store it in the content blocking log.
My question is, do we need “multiple flag” for this case or is simply an “unblock” state for the unblocked channel good enough in our use case.

That would work! The main thing that I would want to double-check is whether the unblock API has any other consumers that rely on the current behavior. Technically we don't yet need a specific flag to explicitly distinguish shimming from blocking.

Flags: needinfo?(nhnt11)

(In reply to Thomas Wisniewski [:twisniewski] from comment #2)

So maybe we don't need a new API or shared data at all, if the call to unblock() can just observe what ends up happening to the unblocked request. For now, it could assume that if the request was canceled or redirected, it was blocked, otherwise it was not. Later, it and the UI could be updated to note which addon ID took over the request (so the user knows the webcompat addon is doing the deed), as well as whether it ended up blocking or shimming the request.

I think making the API too smart (observe the final outcome and notify result accordingly) is not a perfect solution here. It is not easy to do in URL Classifier and the API would only serve a very limited purpose.

I agree we don't need a new API, we can just reuse the current one and add a new flag to channels that is unblocked.
If we have to distinguish what may eventually happen (cancel, shimmed, just unblock), we can add different flags for each case and set it via the caller of unblock() API.

Do you think this makes sense?

(In reply to Nihanth Subramanya [:nhnt11] from comment #4)

That would work! The main thing that I would want to double-check is whether the unblock API has any other consumers that rely on the current behavior. Technically we don't yet need a specific flag to explicitly distinguish shimming from blocking.

I think ETP shim is the only consumer so far.

(In reply to Nihanth Subramanya [:nhnt11] from comment #1)

Tim, do you have any ideas on how we might be able to bridge extension code with the anti tracking backend such that the extension is able to add an additional content blocking state flag for a given resource?

I don't think there is an existing way that can allow Extention code to communicate to the anti-tracking backend directly. But, I agree with Dimi that we can add a new ContentBlockingEvent flag to tell the UI that there is an "unblocked" request that has happened. It would be good if we can identify who triggers this unblock, but the current content blocking log doesn't support this. I'd suggest that we first add a new flag for this and improve the content blocking log in a follow-up bug.

Flags: needinfo?(tihuang)
Depends on: 1662362
Depends on: 1669731
Summary: Shims need to be able to inform the UI that the requests being shimmed are still actually blocked. → Inform the protections UI when a resource is allowed by shim heuristics
Type: defect → enhancement
Assignee: nobody → pbz
Status: NEW → ASSIGNED

The Webcompat API can call channel.unblock for the shimming case, where the channel will be unblocked, but redirected to a shim. In this case show it as blocked in the protections UI since Bug 1662362.
For the case where we want to allow a tracker after user opt-in, the API can call channel.allow this will result in the content blocking log only containing STATE_LOADED_<category> (from the annotation classifier) for the affected channel / site.
I'm currently updating the protections UI to correctly handle this case. Specifically we also want to support cases where some trackers of a category were blocked and some allowed. This is currently not supported by the UI.

  • Added STATE_REPLACED_<category> flags for the content blocking categories: Tracking, SocialTracking, Cryptomining, Fingerprinting.
    These flags indicate that a channel was allowed to load, but the tracking content will be replaced by a shim.
  • Removed STATE_UNBLOCKED_TRACKING_CONTENT which was used for the shimming case before.
  • Updated annotation classifiers to not log STATE_LOADED_<category> if there is already a STATE_REPLACED_<category> message for the channel.
    This is necessary, because the UI needs to differentiate between the "replaced" and the "allowed" state.

Depends on D104697

Attachment #9202343 - Attachment description: Bug 1661330 - Updated Anti-Tracking URL classifiers to pass STATE_REPLACED_ flags for all content blocking categories. r=timhuang → Bug 1661330 - Updated Anti-Tracking URL classifiers to pass STATE_REPLACED_ flags for all content blocking categories. r=dimi!
Attachment #9202343 - Attachment is obsolete: true
Attachment #9202344 - Attachment is obsolete: true

This work is now tracked for 89 as part of shipping additional shims. We decided to split off the UI work into separate bugs (which I'll file momentarily) and to reduce the scope of this bug to the following:

Backend: Add a new flag in the content blocking log when a channel is "allowed" (in URLClassifier)
Addon: Call .allow() when a resource is allowed

No longer blocks: 1687942
Component: Interventions → Privacy: Anti-Tracking
Priority: -- → P2
Product: Web Compatibility → Core
Summary: Inform the protections UI when a resource is allowed by shim heuristics → Update the content blocking log to show when a resource is allowed by shim heuristics

Since this bug is an enhancement now, it should be marked as Depends on bug 1637329 not Regressed by it.

(In reply to Johann Hofmann [:johannh] from comment #12)

This work is now tracked for 89 as part of shipping additional shims. We decided to split off the UI work into separate bugs (which I'll file momentarily) and to reduce the scope of this bug to the following:

Backend: Add a new flag in the content blocking log when a channel is "allowed" (in URLClassifier)
Addon: Call .allow() when a resource is allowed

We need to keep GV's flags in sync with the nsIWebProgressListener flags that are being changed here -- please add/update them here: https://searchfox.org/mozilla-central/source/mobile/android/geckoview/src/main/java/org/mozilla/geckoview/ContentBlockingController.java#201 as part of this patch and flag me as a reviewer.

Attachment #9202341 - Attachment description: Bug 1661330 - Webcompat: Call channel.allow for allowing tracking resources on user opt-in. → WIP: Bug 1661330 - Webcompat: Call channel.allow for allowing tracking resources on user opt-in.
Attachment #9202341 - Attachment description: WIP: Bug 1661330 - Webcompat: Call channel.allow for allowing tracking resources on user opt-in. → Bug 1661330 - Webcompat: Call channel.allow for allowing tracking resources on user opt-in.
Attachment #9211484 - Attachment description: WIP: Bug 1661330 - Log STATE_ALLOWED_TRACKING_CONTENT to the ContentBlockingLog if a channel is allowed. r=dimi! → Bug 1661330 - Log STATE_ALLOWED_TRACKING_CONTENT to the ContentBlockingLog if a channel is allowed. r=dimi!
Attachment #9202341 - Attachment description: Bug 1661330 - Webcompat: Call channel.allow for allowing tracking resources on user opt-in. → WIP: Bug 1661330 - Webcompat: Call channel.allow for allowing tracking resources on user opt-in.
Attachment #9202341 - Attachment description: WIP: Bug 1661330 - Webcompat: Call channel.allow for allowing tracking resources on user opt-in. → Bug 1661330 - Webcompat: Call channel.allow for allowing tracking resources on user opt-in.
Attachment #9202341 - Attachment description: Bug 1661330 - Webcompat: Call channel.allow for allowing tracking resources on user opt-in. → Bug 1661330 - Webcompat: Call channel.allow for allowing tracking resources on user opt-in. r=denschub!
Pushed by pzuhlcke@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/79f70c3aeece
Webcompat: Call channel.allow for allowing tracking resources on user opt-in. r=denschub,webcompat-reviewers
https://hg.mozilla.org/integration/autoland/rev/0ed6389810d3
Log STATE_ALLOWED_TRACKING_CONTENT to the ContentBlockingLog if a channel is allowed. r=geckoview-reviewers,dimi,droeh
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → 89 Branch
Has Regression Range: --- → yes
Keywords: regression
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: