Enhanced Tracking Protection breaks Facebook instant games
Categories
(Core :: Privacy: Anti-Tracking, defect)
People
(Reporter: dveditz, Assigned: dimi)
References
(Blocks 1 open bug)
Details
(Keywords: regression, Whiteboard: [rca - design error])
Attachments
(2 files, 1 obsolete file)
Enhanced Tracking Protection is breaking legit Facebook content where the "3rd" party is an affiliated domain. In at least one case the 3rd party fbsbx.com base domain is on Disconnect's list, but the actual 3rd party is a sub-domain of that (and there are a lot of such subdomains: it functions like *.github.io or *.bmoattachments.org).
Examples include https://apps.facebook.com/ppearls/ and any game at https://www.facebook.com/instantgames
Brad Hill writes:
We actually give each game its own subdomain beneath apps.fbsbx.com, which is in the Public Suffix List, because we want to sandbox each game’s content from other developers, as well as from core facebook.com content.
I wonder if this is an edge case where crossing the PSL boundary causes the “same owner” property from the disconnect list (which just includes fbsbx.com) to be lost?
Example error is:
“Request to access cookie or storage on “https://apps-383041995381848.apps.fbsbx.com/instant-bundle/1…on=477&gcgs=1&source=fbinstant-383041995381848&IsMobileWeb=0” was blocked because it came from a tracker and content blocking is enabled.”
Comment 1•5 years ago
When I run https://apps.facebook.com/ppearls/, I get the following error on the web console:
Request to access cookie or storage on “https://apps-208295966289297.apps.fbsbx.com/bundle/169676410…vcml0aG0iOiJITUFDLVNIQTI1NiIsImlzc3VlZF9hdCI6MTU2ODIxODU4NH0” was blocked because it came from a tracker and content blocking is enabled.
On about:url-classifier, a whitelist search for https://apps.facebook.com/?resource=fbsbx.com
finds the following:
cryptomining-annotation
URI: https://apps.facebook.com/?resource=fbsbx.com
List of tables: mozstd-trackwhite-digest256
cryptomining-protection
URI: https://apps.facebook.com/?resource=fbsbx.com
List of tables: mozstd-trackwhite-digest256
fingerprinting-annotation
URI: https://apps.facebook.com/?resource=fbsbx.com
List of tables: mozstd-trackwhite-digest256
fingerprinting-protection
URI: https://apps.facebook.com/?resource=fbsbx.com
List of tables: mozstd-trackwhite-digest256
socialtracking-annotation
URI: https://apps.facebook.com/?resource=fbsbx.com
List of tables: mozstd-trackwhite-digest256
socialtracking-protection
URI: https://apps.facebook.com/?resource=fbsbx.com
List of tables: mozstd-trackwhite-digest256
tracking-protection
URI: https://apps.facebook.com/?resource=fbsbx.com
List of tables: mozstd-trackwhite-digest256
tracking-annotation
URI: https://apps.facebook.com/?resource=fbsbx.com
List of tables: mozstd-trackwhite-digest256
flash-deny
URI: https://apps.facebook.com/?resource=fbsbx.com
List of tables: except-flash-digest256,except-flash-digest256
A search for https://apps.facebook.com/?resource=apps.fbsbx.com
finds the following:
flash-deny
URI: https://apps.facebook.com/?resource=apps.fbsbx.com
List of tables: except-flash-digest256,except-flash-digest256
Similarly for https://apps.facebook.com/?resource=apps-208295966289297.apps.fbsbx.com:
flash-deny
URI: https://apps.facebook.com/?resource=apps-208295966289297.apps.fbsbx.com
List of tables: except-flash-digest256,except-flash-digest256
Dimi, is this expected? Should we be collapsing the resource domain to its eTLD+1 when doing a whitelist search? Thanks!
Note here that the eTLD+1 is apps.fbsbx.com, not fbsbx.com, because apps.fbsbx.com is in the Public Suffix List.
Comment 2•5 years ago
The way that I think we could fix this bug is by:
a) Creating a new nsIUrlClassifierFeature::listType called whitelistSuffix, and modifying UrlClassifierCommon::CreatePairwiseWhiteListURI() to use the public suffix to create the whitelist URI when whitelistSuffix is passed (a minimal sketch follows below).
b) Merging those tables into mWhitelistTables.
c) Working with Disconnect to make sure apps.fbsbx.com is added to the resources key under the entry for facebook.com.
Since this will take some time, we may want to consider a skip-list fix to address this in a more timely manner for users.
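A minimal sketch of what step (a) could look like, modeled with std::string in place of nsIURI; the whitelistSuffix list type and this signature are hypothetical, not existing Gecko API:

```cpp
#include <string>

// Hypothetical sketch only: the whitelistSuffix list type and this helper
// are illustrative, not existing Gecko API. When a feature opts into
// whitelistSuffix, the pairwise whitelist key is built from the third
// party's public suffix (here apps.fbsbx.com) instead of its eTLD+1.
enum class ListType { whitelist, whitelistSuffix };

std::string CreatePairwiseWhitelistKey(const std::string& topBaseDomain,
                                       const std::string& thirdPartyETLDplus1,
                                       const std::string& thirdPartySuffix,
                                       ListType type) {
  const std::string& resource = (type == ListType::whitelistSuffix)
                                    ? thirdPartySuffix     // e.g. "apps.fbsbx.com"
                                    : thirdPartyETLDplus1; // default behavior
  return "http://" + topBaseDomain + "/?resource=" + resource;
}
```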
Comment 3•5 years ago
One relevant note here: I was wondering why we don't hit the same corner case when we go to block these resources, and then I remembered Bug 1203635. So Firefox effectively ignores the PSL boundary when checking whether to block a resource but honors it for the whitelist.
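To make the asymmetry concrete, here is a simplified standalone model (not Gecko code) of the Safe Browsing host-fragment expansion used on the blocking path; the real protocol additionally caps the expansion at the last five host components, and the host name below is hypothetical. Because no PSL lookup is involved, the bare fbsbx.com fragment is always tried against the blacklist:

```cpp
#include <iostream>
#include <string>
#include <vector>

// Simplified model: the exact host plus up to four suffixes formed by
// repeatedly dropping the leading label. No PSL boundary check.
std::vector<std::string> HostFragments(const std::string& host) {
  std::vector<std::string> fragments{host};
  std::string rest = host;
  for (int i = 0; i < 4; ++i) {
    size_t dot = rest.find('.');
    if (dot == std::string::npos) break;
    rest = rest.substr(dot + 1);
    if (rest.find('.') == std::string::npos) break;  // never try the bare TLD
    fragments.push_back(rest);
  }
  return fragments;
}

int main() {
  for (const auto& f : HostFragments("apps-123.apps.fbsbx.com")) {
    std::cout << f << "\n";  // apps-123.apps.fbsbx.com, apps.fbsbx.com, fbsbx.com
  }
}
```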
Comment 4•5 years ago
(In reply to :ehsan akhgari from comment #1)
> Dimi, is this expected? Should we be collapsing the resource domain to its eTLD+1 when doing a whitelist search? Thanks!
> Note here that the eTLD+1 is apps.fbsbx.com, not fbsbx.com, because apps.fbsbx.com is in the Public Suffix List.
Yes, this is expected.
While matching URLs, the URL Classifier only collapses the domain name to find a match. Since the whitelisted domain is put in the query string (resource=xxx), we don't collapse it. But while creating a pairwise whitelisted URL to classify, we already use eTLD+1 (in UrlClassifierCommon::CreatePairwiseWhiteListURI()). I assume what we have done is also to only include eTLD+1 in the entity list.
I want to make sure I understand correctly: the proposed solution in comment 2 is to add an eTLD domain (apps.fbsbx.com) into our entity list, and for every whitelist lookup we will then check both the eTLD and the eTLD+1. If this is correct, it means an additional whitelist lookup for every blacklisted URL (just for this apps.fbsbx.com case), and we should make sure this change doesn't have a performance impact.
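A standalone model of that pairwise whitelist key, with std::string standing in for nsIURI and the scheme handling simplified:

```cpp
#include <iostream>
#include <string>

// Simplified model of the pairwise whitelist key described above:
// "toplevel.page/?resource=third.party.basedomain". In Gecko both base
// domains come from the eTLD service, which is where the PSL entry for
// apps.fbsbx.com changes the third party's value away from fbsbx.com.
std::string MakePairwiseWhitelistKey(const std::string& topLevelBaseDomain,
                                     const std::string& thirdPartyBaseDomain) {
  return "http://" + topLevelBaseDomain + "/?resource=" + thirdPartyBaseDomain;
}

int main() {
  // The entity list entry is ".../?resource=fbsbx.com", so a key built from
  // a deeper base domain never matches it:
  std::cout << MakePairwiseWhitelistKey("facebook.com", "apps.fbsbx.com")
            << "\n";  // http://facebook.com/?resource=apps.fbsbx.com
}
```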
Comment 5•5 years ago
(In reply to Dimi Lee [:dimi][:dlee] from comment #4)
> (In reply to :ehsan akhgari from comment #1)
> > Dimi, is this expected? Should we be collapsing the resource domain to its eTLD+1 when doing a whitelist search? Thanks!
> > Note here that the eTLD+1 is apps.fbsbx.com, not fbsbx.com, because apps.fbsbx.com is in the Public Suffix List.
> Yes, this is expected.
> While matching URLs, the URL Classifier only collapses the domain name to find a match. Since the whitelisted domain is put in the query string (resource=xxx), we don't collapse it. But while creating a pairwise whitelisted URL to classify, we already use eTLD+1 (in UrlClassifierCommon::CreatePairwiseWhiteListURI()). I assume what we have done is also to only include eTLD+1 in the entity list.
Thanks for confirming.
> I want to make sure I understand correctly: the proposed solution in comment 2 is to add an eTLD domain (apps.fbsbx.com) into our entity list, and for every whitelist lookup we will then check both the eTLD and the eTLD+1. If this is correct, it means an additional whitelist lookup for every blacklisted URL (just for this apps.fbsbx.com case), and we should make sure this change doesn't have a performance impact.
Yes, that is correct. Do we have a good way to test the performance impact of this change?
Also, can you think of any alternative solutions here?
Comment 6•5 years ago
(In reply to :ehsan akhgari from comment #5)
> Yes, that is correct. Do we have a good way to test the performance impact of this change?
> Also, can you think of any alternative solutions here?
I think we could customize how we fragment a whitelisted URL.
Right now, the fragment rule follows the Safe Browsing url-hashing protocol, which is what gets called here. But for whitelisted URLs we don't really have to follow the spec, because the format of a whitelisted URL is defined by us (for example, fragmenting a whitelist URL by removing the query string is useless in our scenario).
So for the whitelisted URL "toplevel.page/?resource=third.party.basedomain", the new fragment function can produce an additional fragment in which the leading component of "third.party.basedomain" is removed (see the sketch below).
If this method works, it shouldn't impact performance much: we won't create an additional lookup, we just create additional fragments within the same whitelist URL lookup. (We can also optimize the method by removing those fragments we don't need for the whitelist.)
What do you think?
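A standalone sketch of that idea (simplified; the real fragment code operates on Gecko's lookup structures, not plain strings):

```cpp
#include <string>
#include <vector>

// Sketch of the proposed whitelist-specific fragmenting, not the actual
// LookupCache code: besides the exact key, emit one extra fragment with
// the leading label of the resource domain removed.
std::vector<std::string> WhitelistFragments(const std::string& key) {
  std::vector<std::string> fragments{key};
  const std::string marker = "/?resource=";
  size_t pos = key.find(marker);
  if (pos == std::string::npos) return fragments;
  size_t start = pos + marker.size();
  size_t dot = key.find('.', start);
  // Only strip when at least two labels remain afterwards.
  if (dot != std::string::npos && key.find('.', dot + 1) != std::string::npos) {
    fragments.push_back(key.substr(0, start) + key.substr(dot + 1));
  }
  return fragments;
}
```

With this, a lookup for "apps.facebook.com/?resource=apps.fbsbx.com" would also try "apps.facebook.com/?resource=fbsbx.com", which is the form the fbsbx.com entity-list entry produces.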
Comment 7•5 years ago
(In reply to Dimi Lee [:dimi][:dlee] from comment #6)
> (In reply to :ehsan akhgari from comment #5)
> > Yes, that is correct. Do we have a good way to test the performance impact of this change?
> > Also, can you think of any alternative solutions here?
> I think we could customize how we fragment a whitelisted URL.
> Right now, the fragment rule follows the Safe Browsing url-hashing protocol, which is what gets called here. But for whitelisted URLs we don't really have to follow the spec, because the format of a whitelisted URL is defined by us (for example, fragmenting a whitelist URL by removing the query string is useless in our scenario).
> So for the whitelisted URL "toplevel.page/?resource=third.party.basedomain", the new fragment function can produce an additional fragment in which the leading component of "third.party.basedomain" is removed.
> If this method works, it shouldn't impact performance much: we won't create an additional lookup, we just create additional fragments within the same whitelist URL lookup. (We can also optimize the method by removing those fragments we don't need for the whitelist.)
> What do you think?
That sounds good! One question though: couldn't we also encounter URLs in the form of "toplevel.page/?resource=foo.bar.baz" if the user goes to a page with a URL like that (that is, with an actual resource GET parameter)? Does that matter here?
If not, then this sounds like a great plan!
Comment 8•5 years ago
BTW, yesterday we deployed a skip-list fix for this bug, which effectively (if temporarily) fixes it with a work-around across all versions of Firefox, so we will only need to deploy a code fix for this on trunk. I filed bug 1581120 to remove the skip-list work-around once the real fix is shipped to the release channel.
Comment 9•5 years ago
(In reply to :ehsan akhgari from comment #7)
> That sounds good! One question though: couldn't we also encounter URLs in the form of "toplevel.page/?resource=foo.bar.baz" if the user goes to a page with a URL like that (that is, with an actual resource GET parameter)? Does that matter here?
> If not, then this sounds like a great plan!
No, the new fragment generation function will only be applied to whitelisted URLs produced in UrlClassifierCommon::CreatePairwiseWhiteListURI(), which means there is no "real" query string in them.
Comment 10•5 years ago
Great, sounds like a great plan then!
Comment 11•5 years ago
Managed to verify on 69.0.1, 70.0b7, and 71.0a1 (2019-09-17), but I'm leaving the QE+ flag up and tracking this for when the changes land.
As discussed with dimi on Slack, due to the nature of this bug I filed bug 1582123 for the remaining issue.
Leaving the 71 flag as it is in case something breaks again. Once the issue is fully fixed, I will come back and re-check everything.
For any change that might occur, feel free to update and revert for RC/beta builds.
Comment 12•5 years ago
Just an update: the patch is ready. If the try results look good, I'll submit the patch for review tomorrow.
Comment 13•5 years ago
Comment 14•5 years ago
This patch does the following to support matching a whitelisted URI when its domain is eTLD+1 (a rough model of the resulting flow follows below):
- add a URIType to indicate whether a URI was generated by UrlClassifierCommon::CreatePairwiseWhiteListURI(), which crafts a whitelist URL like "toplevel.page/?resource=third.party.domain"
- call LookupCache::GetLookupWhitelistFragments() if the URIType is nsIUrlClassifierFeature::pairwiseWhitelistedURI before initiating a lookup
- implement LookupCache::GetLookupWhitelistFragments(), which creates an additional fragment by removing the leading component of third.party.domain
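A rough standalone model of that flow; the type and function names mirror the patch description above, but the real Gecko types and signatures differ:

```cpp
#include <string>
#include <vector>

// Names mirror the patch description; not the actual Gecko declarations.
enum class URIType { normal, pairwiseWhitelistedURI };

// Standard Safe Browsing fragments, modeled here as just the exact key.
std::vector<std::string> GetLookupFragments(const std::string& key) {
  return {key};
}

// Whitelist fragments: the exact key plus one fragment with the leading
// label of the resource domain removed (see the sketch in comment 6).
std::vector<std::string> GetLookupWhitelistFragments(const std::string& key) {
  std::vector<std::string> out{key};
  const std::string marker = "/?resource=";
  size_t start = key.find(marker);
  if (start != std::string::npos) {
    start += marker.size();
    size_t dot = key.find('.', start);
    if (dot != std::string::npos && key.find('.', dot + 1) != std::string::npos) {
      out.push_back(key.substr(0, start) + key.substr(dot + 1));
    }
  }
  return out;
}

std::vector<std::string> FragmentsFor(const std::string& key, URIType type) {
  // Only URIs crafted by CreatePairwiseWhiteListURI() are tagged as
  // pairwiseWhitelistedURI, so a real "resource" query parameter on a
  // normal page never takes this path (the concern raised in comment 7).
  return type == URIType::pairwiseWhitelistedURI
             ? GetLookupWhitelistFragments(key)
             : GetLookupFragments(key);
}
```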
Comment 15•5 years ago
Depends on D47212
Comment 16•5 years ago
Comment 17•5 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/a1237d4bad3d
https://hg.mozilla.org/mozilla-central/rev/843690981959
Comment 18•5 years ago
Comment 19•5 years ago
Now that Bug 1587002 is fixed, I tested https://apps.facebook.com/ppearls/ by first disabling remote settings and then checking in devtools whether the apps.fbsbx.com resources were classified. As expected, these resources are no longer marked as tracking on Facebook properties (e.g., https://apps.facebook.com/ppearls/), but are marked as tracking on my test page (https://senglehardt.com/test/trackingprotection/fbsbx.html).
Comment 20•5 years ago
This bug has been identified as part of a pilot on determining root causes of blocking and dot-release drivers.
It needs a root cause set for it. Please see the list at https://docs.google.com/document/d/1FFEGsmoU8T0N8R9kk-MXWptOPtXXXRRIe4vQo3_HgMw/.
Add the root cause as a whiteboard tag in the form [rca - <cause>] and remove the rca-needed keyword.
If you have questions, please contact :tmaity.