Closed Bug 1575811 Opened 4 months ago Closed 4 months ago

Disabling protections on a site makes first-party social trackers show up in the content blocking log because there's no entity list for social tracking protection

Categories

(Core :: Privacy: Anti-Tracking, defect, P1)

70 Branch
defect

Tracking


VERIFIED FIXED
mozilla70
Tracking Status
firefox70 + verified

People

(Reporter: oana.botisan, Assigned: dimi, NeedInfo)

References

(Blocks 2 open bugs)

Details

(Whiteboard: [skyline][stp])

Attachments

(1 file, 1 obsolete file)

Affected versions

  • Firefox 70.0a1

Affected platforms

  • Windows 10 x64
  • Ubuntu 18.04 x64
  • macOS 10.13

Steps to reproduce

  1. Open Facebook in two different tabs.
  2. In the first tab, click the shield icon and turn off Trackers Protection.
  3. In the second tab, click the shield icon and turn on Trackers Protection.
  4. In the first tab, click the shield icon again and observe the blocked trackers.
  5. Check the same thing in the second tab.

Expected result

    1. Social Media Trackers are displayed in the "Blocked" section.
    2. Social Media Trackers are displayed in the "Blocked" section.

Actual result

    1. Social Media Trackers are displayed in the "Blocked" section.
    2. Social Media Trackers are displayed in the "None Detected" section.

Regression range

  • I don't think this is a regression.

Additional notes

  • After the page is refreshed, Social Media Trackers are moved to the "None Detected" section.
  • This bug reproduces on multiple sites: Twitter, LinkedIn, Instagram, etc.
  • This issue also reproduces when you change the setting from about:preferences#privacy:
    1. Open Facebook and turn off Trackers Protection for that site.
    2. Go to about:preferences#privacy.
    3. In Manage Exceptions..., remove Facebook from the list and save the changes.
    4. Go back to the tab that contains Facebook.

Actual results:

  • Social Media Trackers are displayed in the "Blocked" section.


As mentioned before, we're blocking third-party trackers, so that part of the bug is invalid.

I think this boils down to:

  • Go to facebook.com
  • Content Blocking Log (seems correct, except for the "null" entry; null principal?):
{
 "https://www.facebook.com": [[32768, true, 1]],
 "https://facebook.com": [[32768, true, 1]],
 "null": [[32768, true, 1]],
 "https://cx.atdmt.com": [[32768, true, 1]],
 "https://www.google.com": [[8192, true, 1], [32768, true, 1], [536870912, true, 1]],
 "https://www.gstatic.com": [[8192, true, 1], [32768, true, 1], [536870912, true, 6]]
}
  • Add an exception for the site
  • ContentBlockingLog (seems wrong for facebook.com):
{
 "https://www.facebook.com": [[8192, true, 1], [32768, true, 1], [131072, true, 4]],
 "https://facebook.com": [[8192, true, 1], [32768, true, 1], [131072, true, 1]],
 "null": [[32768, true, 1]],
 "https://cx.atdmt.com": [[32768, true, 1]],
 "https://www.google.com": [[8192, true, 1], [32768, true, 1]],
 "https://www.gstatic.com": [[8192, true, 1]]
}
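To make the log dumps above easier to read: the content blocking log is a JSON object mapping each origin to a list of [stateFlag, applied, count] triples, and 131072 is STATE_LOADED_SOCIALTRACKING_CONTENT (per the analysis later in this bug). A minimal, unofficial sketch of scanning a log for a given flag (the helper name and trimmed sample log are illustrative, not Firefox code):

```python
import json

# Illustrative helper (not Firefox code): list the origins in a content
# blocking log whose events carry a given state flag. The log format, a JSON
# string mapping origin -> [stateFlag, applied, count] triples, matches the
# dumps in this bug.
STATE_LOADED_SOCIALTRACKING_CONTENT = 131072  # value quoted in this bug

def origins_with_flag(log_json, flag):
    log = json.loads(log_json)
    return sorted(
        origin
        for origin, events in log.items()
        if any(state & flag for state, applied, count in events)
    )

# Trimmed version of the second dump above (after adding the exception):
log_after_exception = """{
 "https://www.facebook.com": [[8192, true, 1], [32768, true, 1], [131072, true, 4]],
 "https://facebook.com": [[8192, true, 1], [32768, true, 1], [131072, true, 1]],
 "https://www.google.com": [[8192, true, 1], [32768, true, 1]]
}"""

print(origins_with_flag(log_after_exception, STATE_LOADED_SOCIALTRACKING_CONTENT))
# ['https://facebook.com', 'https://www.facebook.com']
```

Both flagged origins are the page's own first party, which is the inconsistency this bug is about.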

This might be related to bug 1576109 in some way? They're both about adding exceptions...

Baku, Ehsan, Dimi, it would be great to get some help analyzing this.

Component: Tracking Protection → Privacy: Anti-Tracking
Product: Firefox → Core
Summary: Switching between two tabs that contain the same site, confuses the detection of Social Media Trackers → Disabling protections on a site makes first-party social trackers show up in the content blocking log

Ehsan, does this appear to be the same issue as https://bugzilla.mozilla.org/show_bug.cgi?id=1576109?

Flags: needinfo?(ehsan)
Flags: needinfo?(dlee)

I haven't looked into the ContentBlocking part yet; I'll check it later.

Right now, the URL Classifier doesn't skip classifying first-party trackers for annotation features, and because first-party trackers are also not included in our tracking whitelist table, they end up annotated by the URL Classifier.

For example, "facebook.de" under top-level "facebook.com" is not annotated as a tracker because it is in the whitelist, while "facebook.com" under top-level "facebook.com" is annotated as a tracker.
So we rely on ContentBlocking to make decisions when a first-party tracker is reported.

My question is: can the URL Classifier skip reporting first-party social trackers? I think we can still fix this in the ContentBlocking code, but it may be easier (or less bug-prone) if the URL Classifier doesn't report them at all (unless we need this information).

Flags: needinfo?(dlee)
Priority: -- → P1
Whiteboard: [skyline][stp]
Assignee: nobody → dlee
Status: NEW → ASSIGNED

(In reply to Dimi Lee [:dimi][:dlee] from comment #3)

My question is: can the URL Classifier skip reporting first-party social trackers? I think we can still fix this in the ContentBlocking code, but it may be easier (or less bug-prone) if the URL Classifier doesn't report them at all (unless we need this information).

I did a quick test of not reporting first-party trackers in the URL Classifier; this issue still occurs. I'll investigate in more detail.

(In reply to Tanvi Vyas[:tanvi] from comment #2)

Ehsan, does this appear to be the same issue as https://bugzilla.mozilla.org/show_bug.cgi?id=1576109?

No, it is not.

We classify channels from https://www.facebook.com and https://facebook.com with STATE_LOADED_SOCIALTRACKING_CONTENT (131072) from here because IsAllowListed(aChannel) returns true (as expected).

The reason for this logic is that we would like things like trackers to show up as "Allowed" all the time when the page is on the content blocking allow list (aka when the user has added an exception for it). This is the exact same mechanism that makes sure normal tracker channels are annotated with STATE_LOADED_TRACKING_CONTENT in that case.

From Tanvi on Slack:

In my mind, the entity list for ETP/TP should work automatically for STP.

Looking at the code generating these lists, it seems that there is no whitelist table support for social tracking protection at all in the list generation scripts. So not only does the entity list for ETP/TP not work automatically for STP, but as far as I can tell there is no entity list for STP whatsoever. If there were, then https://www.facebook.com, https://facebook.com (and also https://www.facebook.de) wouldn't be classified in the first place, so none of what I explained above would have happened.

BTW I used this pernosco recording to debug this, feel free to use it if you'd like to inspect things further.

Flags: needinfo?(ehsan)
Flags: needinfo?(dlee)
Flags: needinfo?(amarchesini)
Summary: Disabling protections on a site makes first-party social trackers show up in the content blocking log → Disabling protections on a site makes first-party social trackers show up in the content blocking log because there's no entity list for social tracking protection

Looking at the code generating these lists, it seems that there is no whitelist table support for social tracking protection at all in the list generation scripts. So not only does the entity list for ETP/TP not work automatically for STP, but as far as I can tell there is no entity list for STP whatsoever. If there were, then https://www.facebook.com, https://facebook.com (and also https://www.facebook.de) wouldn't be classified in the first place, so none of what I explained above would have happened.

BTW I used this pernosco recording to debug this, feel free to use it if you'd like to inspect things further.

We are now using the tracking protection whitelist table, mozstd-trackwhite-digest256, as social tracking protection's whitelist table[1] (all the other tracking-related features are implemented this way, too). If this is not right, or not right for all the features, I'll file a bug.

So because "mozstd-trackwhite-digest256" is being used, facebook.de, fbcdn.net, etc. are not classified as social trackers when the top level is facebook.com.
But we still see "www.facebook.com" and "facebook.com" classified as trackers in this bug. This is because:

  1. Our whitelist generation script doesn't generate whitelist URLs for the same origin; for example, we don't have "facebook.com/?resource=facebook.com" in the whitelist table.
    I guess we expected this scenario to be filtered out by our third-party check[4].

  2. We don't call the third-party check[2] in social tracking annotation (or in any of the other annotation features) as we do in the protection features[3]. As a result, facebook.com inside facebook.com is still classified by the URL Classifier, and because it is not found in the whitelist table due to #1, "facebook.com" ends up marked as a social tracker.

And my comment 4 was wrong: after adding [2] to FeatureSocialTrackingAnnotation, this did solve the issue.

So my question is: should we remove the whitelist table from social tracking and let ContentBlocking do all the work, or should we not report first-party trackers in the URL Classifier in the first place?

[1] https://searchfox.org/mozilla-central/rev/325c1a707819602feff736f129cb36055ba6d94f/modules/libpref/init/all.js#5069
[2] https://searchfox.org/mozilla-central/rev/325c1a707819602feff736f129cb36055ba6d94f/netwerk/url-classifier/UrlClassifierFeatureTrackingProtection.cpp#95
[3] https://searchfox.org/mozilla-central/rev/325c1a707819602feff736f129cb36055ba6d94f/netwerk/url-classifier/UrlClassifierFeatureTrackingProtection.cpp
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=1189087#c1
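The two points above can be sketched with a toy model (hypothetical Python, not the actual Gecko implementation; the whitelist entry and function names are illustrative only):

```python
# Simplified model of the two interacting checks described in this comment.
# Whitelist entries pair the top-level site with the tracker, e.g.
# "www.blogger.com/?resource=youtube.com" (format taken from this bug).

def whitelist_lookup_uri(top_level_host, tracker_host):
    return f"{top_level_host}/?resource={tracker_host}"

def should_annotate(top_level_host, tracker_host, whitelist):
    # Point 2: annotation features skip the third-party check, so even a
    # first-party load falls through to the whitelist lookup...
    lookup = whitelist_lookup_uri(top_level_host, tracker_host)
    # ...and point 1: same-origin entries are never generated, so the lookup
    # misses and the first party gets annotated as a tracker.
    return lookup not in whitelist

whitelist = {"facebook.com/?resource=facebook.de"}  # illustrative entry only

print(should_annotate("facebook.com", "facebook.de", whitelist))   # False (whitelisted)
print(should_annotate("facebook.com", "facebook.com", whitelist))  # True (the bug)
```

Adding a third-party check before the lookup (return False when the hosts match) would make the first-party case consistent with the whitelisted third-party case.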

Flags: needinfo?(dlee)

Before this patch, first-party social trackers are annotated when the
first-party URL is on the allow list. This causes the ContentBlocking panel to
display first-party social trackers inconsistently when we turn ETP ON/OFF for
the site.

After this patch, first-party URLs are ignored in FeatureSocialTrackingAnnotation,
so those URLs will NOT be annotated and will not be displayed in the ContentBlocking panel.

I'll write a summary for this bug; hopefully it makes the bug easier to understand.

Issue:

On facebook.com:
When ETP for this site is OFF, some URLs are shown in ContentBlocking Panel -> Blocked -> Social Media Trackers.
When ETP for this site is ON, those URLs are not shown in any section of the ContentBlocking Panel.

Expected:

We should have consistent behavior for those URLs.
This means that whether ETP for the site is ON or OFF, either we display the URLs in the ContentBlocking panel in both cases, or we don't display them at all.

Because the URLs with the inconsistent behavior are first-party URLs (see Comment 1), I think we should NOT display them.

Root cause:

When facebook.com is added to the ALLOW_LIST (ETP for the site is OFF), all trackers are annotated, no matter whether they are third-party trackers or not (as pointed out by Ehsan, see here).

There are three types of trackers here: third-party trackers, third-party trackers in the STP whitelist, and first-party trackers.
The reason why only first-party trackers are displayed inconsistently is explained in Comment 6.

The proposed solution is that we don't classify first-party URLs in FeatureSocialTrackingAnnotation.

That idea seems fine to me, off-hand. Thank you for the thorough investigation, Dimi!

Hi Dimi,

Thanks a lot for your thorough analysis here.

I still don't understand a few issues and I was hoping perhaps you could clarify them:

  • You mentioned that STP uses the mozstd-trackwhite-digest256 table for its whitelists. ETP and TP however also use moztest-trackwhite-simple. What's additionally included in that table and is that difference important?
  • Because of the usage of mozstd-trackwhite-digest256, a whitelist URI like http://www.blogger.com/?resource=youtube.com can be looked up in that table. This means that youtube.com for example is whitelisted on blogger.com for Social Tracking Protection. Is this intended? (Based on what Tanvi said yesterday I believe this is the case, but wanted to double check.)
  • With everything that you've explained so far, I still don't understand why this patch is required. We don't need this for UrlClassifierFeatureTrackingAnnotation, so I would like to understand what makes UrlClassifierFeatureSocialTrackingAnnotation a special case and why it cannot work exactly like UrlClassifierFeatureTrackingAnnotation. I suspect that patch may be wallpapering over another bug somewhere deeper in the stack...
Flags: needinfo?(dlee)

(In reply to :ehsan akhgari from comment #10)

Hi Dimi,

Thanks a lot for your thorough analysis here.

I still don't understand a few issues and I was hoping perhaps you could clarify them:

  • You mentioned that STP uses the mozstd-trackwhite-digest256 table for its whitelists. ETP and TP however also use moztest-trackwhite-simple. What's additionally included in that table and is that difference important?

moztest-trackwhite-simple is our built-in whitelist table for testing; it doesn't affect how these features work.
Not adding 'moztest-trackwhite-simple' to STP (or fingerprinting, cryptomining, etc.) may imply that we don't have whitelist test cases for those features, or that we just test them another way. I'll take a look.

  • Because of the usage of mozstd-trackwhite-digest256, a whitelist URI like http://www.blogger.com/?resource=youtube.com can be looked up in that table. This means that youtube.com for example is whitelisted on blogger.com for Social Tracking Protection. Is this intended? (Based on what Tanvi said yesterday I believe this is the case, but wanted to double check.)

When I added 'mozstd-trackwhite-digest256' to social tracking, I assumed this was how it should work (sharing the same whitelist entries).
So yes, this is intended. But maybe this is not the right thing to do? To be honest, I didn't think about it carefully while adding that, so if there are cases where TP/ETP/STP shouldn't share the same whitelist entries, then we should definitely file a bug to use different whitelist tables for different features.

  • With everything that you've explained so far, I still don't understand why this patch is required. We don't need this for UrlClassifierFeatureTrackingAnnotation, so I would like to understand what makes UrlClassifierFeatureSocialTrackingAnnotation a special case and why it cannot work exactly like UrlClassifierFeatureTrackingAnnotation. I suspect that patch may be wallpapering over another bug somewhere deeper in the stack...

Thank you for bringing this up, I did not think of this!
So I just did a quick check and found that tracking annotation actually has the same problem.
If you set ETP OFF for facebook.com, you will see "facebook.com" in ContentBlocking Panel -> Tracking Content.

So unless there is a need to annotate first-party trackers, I think this is the right solution. Otherwise, the current behavior is quite strange; for example, fbcdn.net is not annotated because of the whitelist, while facebook.com is annotated because it is first-party.

And I just traced why we only check third-partiness in the protection features; I think we copied the behavior from when we only had tracking protection and tracking annotation[1]. So maybe there is a reason for doing it that way?

[1] https://searchfox.org/mozilla-central/rev/5ecc7e89e4fcfdc410df4a172a57cc9039011c93/netwerk/base/nsChannelClassifier.cpp#465

Flags: needinfo?(dlee) → needinfo?(ehsan)

(In reply to Dimi Lee [:dimi][:dlee] from comment #11)

(In reply to :ehsan akhgari from comment #10)

Hi Dimi,

Thanks a lot for your thorough analysis here.

I still don't understand a few issues and I was hoping perhaps you could clarify them:

  • You mentioned that STP uses the mozstd-trackwhite-digest256 table for its whitelists. ETP and TP however also use moztest-trackwhite-simple. What's additionally included in that table and is that difference important?

moztest-trackwhite-simple is our built-in whitelist table for testing; it doesn't affect how these features work.
Not adding 'moztest-trackwhite-simple' to STP (or fingerprinting, cryptomining, etc.) may imply that we don't have whitelist test cases for those features, or that we just test them another way. I'll take a look.

OK, thanks! (No rush BTW)

  • Because of the usage of mozstd-trackwhite-digest256, a whitelist URI like http://www.blogger.com/?resource=youtube.com can be looked up in that table. This means that youtube.com for example is whitelisted on blogger.com for Social Tracking Protection. Is this intended? (Based on what Tanvi said yesterday I believe this is the case, but wanted to double check.)

When I added 'mozstd-trackwhite-digest256' to social tracking, I assumed this was how it should work (sharing the same whitelist entries).
So yes, this is intended. But maybe this is not the right thing to do? To be honest, I didn't think about it carefully while adding that, so if there are cases where TP/ETP/STP shouldn't share the same whitelist entries, then we should definitely file a bug to use different whitelist tables for different features.

Hmm, I'm not sure to be honest. But that's kind of out of the scope of this bug, I was mostly trying to have a good grasp of this stuff. :-)

  • With everything that you've explained so far, I still don't understand why this patch is required. We don't need this for UrlClassifierFeatureTrackingAnnotation, so I would like to understand what makes UrlClassifierFeatureSocialTrackingAnnotation a special case and why it cannot work exactly like UrlClassifierFeatureTrackingAnnotation. I suspect that patch may be wallpapering over another bug somewhere deeper in the stack...

Thank you for bringing this up, I did not think of this!
So I just did a quick check and found that tracking annotation actually has the same problem.
If you set ETP OFF for facebook.com, you will see "facebook.com" in ContentBlocking Panel -> Tracking Content.

Aha! I was suspecting this. :-) Thanks for digging into it.

So unless there is a need to annotate first-party trackers, I think this is the right solution. Otherwise, the current behavior is quite strange; for example, fbcdn.net is not annotated because of the whitelist, while facebook.com is annotated because it is first-party.

Well, the "need" is that we have APIs that return the results of classifying first-party stuff. :-(

https://searchfox.org/mozilla-central/rev/325c1a707819602feff736f129cb36055ba6d94f/netwerk/protocol/http/nsIHttpChannel.idl#560
https://searchfox.org/mozilla-central/rev/325c1a707819602feff736f129cb36055ba6d94f/netwerk/protocol/http/nsIHttpChannel.idl#492
https://searchfox.org/mozilla-central/rev/325c1a707819602feff736f129cb36055ba6d94f/netwerk/protocol/http/nsIHttpChannel.idl#479

Now I agree with you that classifying first-party channels makes no sense. But as long as we have code that uses the results, we should keep doing it, so the full solution to this bug involves first getting rid of that code and then doing something like your patch. And if we decide that no feature ever needs first-party classifications, then perhaps we should actually remove the first-party classification code altogether (instead of just skipping over it at runtime all the time!)

The good news is that most of this should be mechanical, but there are probably tests that may be a bit tricky to port off of these APIs. I already started in bug 1577040. Please feel free to help chip away at another one on the list above. :-)

And I just traced why we only check third-partiness in the protection features; I think we copied the behavior from when we only had tracking protection and tracking annotation[1]. So maybe there is a reason for doing it that way?

[1] https://searchfox.org/mozilla-central/rev/5ecc7e89e4fcfdc410df4a172a57cc9039011c93/netwerk/base/nsChannelClassifier.cpp#465

Right, tracking protection has always acted on third-party resources only, by design. Tracking annotations were originally applied to the first-party context for the now-dead fastblock project, I think (I don't remember the exact rationale). We never really used this information for anything in practice.

BTW, separate from the above, I question the correctness of the IsAllowListed() branch of this condition. I can't imagine why we want to send the notifications irrespective of third-partiness when we're allow-listed. Do you know why this exists? Removing it surely sounds like a very cheap way to fix this bug...

Depends on: 1577040
Flags: needinfo?(ehsan)
Attachment #9088411 - Attachment is obsolete: true
Depends on: 1577114
No longer depends on: 1577114

(In reply to :ehsan akhgari from comment #12)

The good news is that most of this should be mechanical, but there are probably tests that may be a bit tricky to port off of these APIs. I already started in bug 1577040. Please feel free to help chipping away at another one of the above list. :-)

Sure, I'll work on another one.

BTW, separate from the above, I question the correctness of the IsAllowListed() branch of this condition. I can't imagine why we want to send the notifications irrespective of third-partiness when we're allow-listed. Do you know why this exists? Removing it surely sounds like a very cheap way to fix this bug...

I traced the changesets for this; we added it in Bug 1525458.
We added |IsAllowListed| to the protection features to notify the UX that a channel was classified as a tracker when content blocking is turned off for a site (we didn't clearly separate protection features and annotation features at that time, so the notification work was done in the protection features).

I think we just mistakenly added the |IsAllowListed| check to TrackingAnnotation as well.
So if I understand correctly, then yes, we can simply remove it to fix the bug!
I also filed bug 1577123 to stop classifying first-party URLs in annotation features; as you pointed out, we should do this after we entirely remove the usage of the first-party classification code.

When the URL Classifier reports that a channel is a tracker, we should only annotate
the channel when it is a third-party channel.
That means |IsAllowListed| should not be taken into consideration here.
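The eventual fix (see the autoland push below) boils down to a one-condition change in UrlClassifierCommon::AnnotateChannel. The real code is C++; this is a Python pseudocode sketch under simplified, hypothetical names:

```python
# Sketch (not the actual Gecko code) of the annotation condition before and
# after the fix described in this bug.

def should_annotate_before(is_third_party, is_allow_listed):
    # Old behavior: on an allow-listed page, everything got annotated,
    # including first-party channels. This is the source of the bug.
    return is_third_party or is_allow_listed

def should_annotate_after(is_third_party, is_allow_listed):
    # Fixed behavior: only third-party channels are annotated; the
    # allow-list status is intentionally ignored here.
    return is_third_party

# A first-party channel on an allow-listed site (ETP OFF for facebook.com):
print(should_annotate_before(False, True))  # True: facebook.com shows up as a tracker
print(should_annotate_after(False, True))   # False: consistent with ETP ON
```

Third-party trackers keep being annotated in both versions; only the first-party-on-allow-listed-page case changes.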

No longer depends on: 1577040
Pushed by dlee@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/366ce74393fc
Remove |IsAllowListed| check in UrlClassifierCommon::AnnotateChannel. r=Ehsan
Status: ASSIGNED → RESOLVED
Closed: 4 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla70

I verified the fix using the latest Nightly 70.0a1 on Windows 10 x64, macOS 10.14, and Ubuntu 18.04 x64. The issue no longer reproduces.

Status: RESOLVED → VERIFIED