Closed Bug 1514054 Opened 6 years ago Closed 6 years ago

HTTP requests from WebExtensions to URLs should not be blocked by tracking protection

Categories: WebExtensions :: Request Handling (defect, P2)

Tracking: Not tracked

Status: RESOLVED WONTFIX

People: Reporter: ntim; Assignee: Unassigned

Attachments: 1 file

Livemarks is affected by this; some feeds fail to be fetched because of tracking protection:

https://github.com/nt1m/livemarks/issues/38

I believe this is a valid use case, and it should not be blocked by TP.

The suggested workaround of adding the domains to manifest.json simply does not scale, since you cannot list every possible domain.
Can we please reconsider this use case? Or improve the heuristics for content blocking? Right now, this makes the add-on display "Failed to fetch feed" and leaves the livemarks folder empty.
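For reference, the suggested workaround would mean enumerating every feed host in the manifest, roughly like this (the specific hosts are only illustrative; Livemarks cannot know them ahead of time, which is why it only requests <all_urls>):

    {
      "name": "Livemarks",
      "permissions": [
        "<all_urls>",
        "https://www.reddit.com/*",
        "https://example-blog.com/*"
      ]
    }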
Flags: needinfo?(ehsan)
Flags: needinfo?(ddurst)
Hi Tim,

I'm sorry content blocking is causing issues.  I'd like to try to help improve things.  But I need to ask your help in understanding more details about what is going on behind the scenes.  I work on the anti-tracking backend these days, and from where I stand I don't know much about livemarks, web extensions, and this specific issue.  So far, all I have been able to capture from this bug report is "content blocking causes some URLs to not load somehow, probably when some extension is installed...", barely enough information to be able to tell what's going on.

May I ask for some information that would allow me to look into this in more detail?

For starters, can you please open the browser console (or web console if this issue happens in a web page) and reproduce the bug, and then paste what appears in the console here?  I assume you should see messages with "https://www.reddit.com/.rss" in them...

Also can you please tell me what content blocking settings you were testing with?  And do you mind using the "Standard" Content Blocking mode and see if the bug reproduces itself or not?

BTW, do you know what the all_urls permission in web extensions gets mapped to inside Gecko by any chance?

Thanks a lot!
Flags: needinfo?(ehsan) → needinfo?(ntim.bugs)
Theoretically an explicit domain in the host permissions for the extension is supposed to allow this to be bypassed.  all_urls won't (on purpose) provide a TP bypass for the extension.  

As Tim said, there's no way Livemarks would be able to list all potential domains a user might want to load RSS from.  I'd have to understand how Livemarks is trying to access the RSS URL -- whether it is a fetch, iframe, etc. -- but I'd guess it is a fetch/XHR.

Given bug 1509112, we have related (perhaps different) issues with extensions+TP.
(In reply to :Ehsan Akhgari from comment #2)
> Hi Tim,
> 
> I'm sorry content blocking is causing issues.  I'd like to try to help
> improve things.  But I need to ask your help in understanding more details
> about what is going on behind the scenes.  I work on the anti-tracking
> backend these days, and from where I stand I don't know much about
> livemarks, web extensions, and this specific issue.  So far, all I have been
> able to capture from this bug report is "content blocking causes some URLs
> to not load somehow, probably when some extension is installed...", barely
> enough information to be able to tell what's going on.
> 
> May I ask for some information that would allow me to look into this in more
> detail?

This is the code that tries to fetch the feed: https://github.com/nt1m/livemarks/blob/master/shared/feed-parser.js#L5-L24
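In essence it is just a cross-origin fetch from the extension's background context, something along these lines (a simplified sketch of the general shape, not the actual feed-parser.js code):

    // Simplified sketch (not the real Livemarks code): fetch an arbitrary,
    // user-supplied feed URL and parse the response as XML.
    async function getFeed(url) {
      const response = await fetch(url);
      if (!response.ok) {
        throw new Error("Failed to fetch feed");
      }
      const text = await response.text();
      return new DOMParser().parseFromString(text, "application/xml");
    }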

> 
> For starters, can you please open the browser console (or web console if
> this issue happens in a web page) and reproduce the bug, and then paste what
> appears in the console here?  I assume you should see messages with
> "https://www.reddit.com/.rss" in them...

"The resource at “https://www.reddit.com/.rss” was blocked because content blocking is enabled."

> Also can you please tell me what content blocking settings you were testing
> with?  And do you mind using the "Standard" Content Blocking mode and see if
> the bug reproduces itself or not?

It seems to break with the "Strict" mode.

> BTW, do you know what the all_urls permission in web extensions gets mapped
> to inside Gecko by any chance?

What Shane said, all_urls doesn't allow bypassing TP.
Flags: needinfo?(ntim.bugs)
Flags: needinfo?(ehsan)
(In reply to Tim Nguyen :ntim (please use needinfo?) from comment #4)
> (In reply to :Ehsan Akhgari from comment #2)
> > Hi Tim,
> > 
> > I'm sorry content blocking is causing issues.  I'd like to try to help
> > improve things.  But I need to ask your help in understanding more details
> > about what is going on behind the scenes.  I work on the anti-tracking
> > backend these days, and from where I stand I don't know much about
> > livemarks, web extensions, and this specific issue.  So far, all I have been
> > able to capture from this bug report is "content blocking causes some URLs
> > to not load somehow, probably when some extension is installed...", barely
> > enough information to be able to tell what's going on.
> > 
> > May I ask for some information that would allow me to look into this in more
> > detail?
> 
> This is the code that tries to fetch the feed:
> https://github.com/nt1m/livemarks/blob/master/shared/feed-parser.js#L5-L24
> 
> > 
> > For starters, can you please open the browser console (or web console if
> > this issue happens in a web page) and reproduce the bug, and then paste what
> > appears in the console here?  I assume you should see messages with
> > "https://www.reddit.com/.rss" in them...
> 
> "The resource at “https://www.reddit.com/.rss” was blocked because content
> blocking is enabled."
> 
> > Also can you please tell me what content blocking settings you were testing
> > with?  And do you mind using the "Standard" Content Blocking mode and see if
> > the bug reproduces itself or not?
> 
> It seems to break with the "Strict" mode.

OK, that's helpful, thanks.  "Strict" mode turns on tracking protection in normal windows, so my guess would be that this is a dupe of bug 1509112.

Just to double check, may I ask you to run Firefox with the following environment variables set and please attach the generated log (the one from the parent process in case there are multiple ones) so that I can double check that is indeed what's happening here?  Thanks!

MOZ_LOG=nsChannelClassifier:5 MOZ_LOG_FILE=log.txt
Flags: needinfo?(ehsan) → needinfo?(ntim.bugs)
Depends on: 1509112
(In reply to :Ehsan Akhgari from comment #5)
> OK, that's helpful, thanks.  "Strict" mode turns on tracking protection in
> normal windows, so my guess would be that this is a dupe of bug 1509112.

Not a duplicate of bug 1509112, since I don't explicitly include the host permissions (I can't, I'd have to include all possible domains), only <all_urls> which appears to be intentionally not affecting TP.
(In reply to Tim Nguyen :ntim (please use needinfo?) from comment #6)
> (In reply to :Ehsan Akhgari from comment #5)
> > OK, that's helpful, thanks.  "Strict" mode turns on tracking protection in
> > normal windows, so my guess would be that this is a dupe of bug 1509112.
> 
> Not a duplicate of bug 1509112, since I don't explicitly include the host
> permissions (I can't, I'd have to include all possible domains), only
> <all_urls> which appears to be intentionally not affecting TP.

Hmm, can you please explain what you mean?  (I am probably confused because of not being familiar enough with web extensions...)

That bug is asking for a way for tracking protection to allow extensions to be able to load trackers inside extension iframes basically by considering tracker loads as first-party loads rather than third-party loads, without including host permissions (if I understand things correctly there).  Is that different from your situation here?  If yes, do you mind clarifying how?
(In reply to :Ehsan Akhgari from comment #7)
> (In reply to Tim Nguyen :ntim (please use needinfo?) from comment #6)
> > (In reply to :Ehsan Akhgari from comment #5)
> > > OK, that's helpful, thanks.  "Strict" mode turns on tracking protection in
> > > normal windows, so my guess would be that this is a dupe of bug 1509112.
> > 
> > Not a duplicate of bug 1509112, since I don't explicitly include the host
> > permissions (I can't, I'd have to include all possible domains), only
> > <all_urls> which appears to be intentionally not affecting TP.
> 
> Hmm, can you please explain what you mean?  (I am probably confused because
> of not being familiar enough with web extensions...)
> 
> That bug is asking for a way for tracking protection to allow extensions to
> be able to load trackers inside extension iframes basically by considering
> tracker loads as first-party loads rather than third-party loads, without
> including host permissions (if I understand things correctly there).  Is
> that different from your situation here?  If yes, do you mind clarifying how?

From what you describe, it sounds like the same bug, but bug 1509112 has "Adding domains to permissions" in the title; in my case, the domains are not explicitly in the permissions, only <all_urls> is.
Attached file log.txt
Here are the logs.
Flags: needinfo?(ntim.bugs) → needinfo?(ehsan)
> > Not a duplicate of bug 1509112, since I don't explicitly include the host
> > permissions (I can't, I'd have to include all possible domains), only
> > <all_urls> which appears to be intentionally not affecting TP.
> 
> Hmm, can you please explain what you mean?  (I am probably confused because
> of not being familiar enough with web extensions...)

Adding discrete (non-wildcard) domains to the extension's host permissions is supposed to override tracking protection (according to https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/manifest.json/permissions#Host_permissions).

So in the case of 1509112, that's germane, because mkaply tested with specific domains in the manifest and still the iframes were treated as third-party. In the case of this bug, it's not germane because there's no way to specify which domains (since it's trying to get data from an unknowable set of "other" domains, hence the use of <all_urls> for the host permissions in the extension in this bug report); but the docs specifically say that won't work for the purposes of bypassing TP.

ntim's request here is for a change to TP (or, I suppose some solution) that can accommodate <all_urls>; 1509112 is a potential bug report (maybe?) or a request for a change to third-party determination in the context of an extension's page's iframe.

[I hope I got all that right.]
(In reply to Tim Nguyen :ntim (please use needinfo?) from comment #9)
> Created attachment 9031441 [details]
> log.txt
> 
> Here are the logs.

Thanks, I'm almost 100% sure now that this is the same as bug 1509112.

(In reply to David Durst [:ddurst] (Regression Engineering Owner for 63) from comment #10)
> > > Not a duplicate of bug 1509112, since I don't explicitly include the host
> > > permissions (I can't, I'd have to include all possible domains), only
> > > <all_urls> which appears to be intentionally not affecting TP.
> > 
> > Hmm, can you please explain what you mean?  (I am probably confused because
> > of not being familiar enough with web extensions...)
> 
> Adding discrete (non-wildcard) domains to the extension's host permissions
> is supposed to override tracking protection (according to
> https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/
> manifest.json/permissions#Host_permissions).

Can you please clarify what "host permissions" are and how they're connected with tracking protection?  I'm not really familiar with the web extensions side of things here and I'm trying to connect the dots.  On the channel classifier side, we only run this check which seems related to web extensions: <https://searchfox.org/mozilla-central/rev/49e78df13e7a505827a3a86daae9efdf827133c6/netwerk/base/nsChannelClassifier.cpp#548> is this related?

> So in the case of 1509112, that's germane, because mkaply tested with
> specific domains in the manifest and still the iframes were treated as
> third-party.

If the code I linked to above is related to host permissions, please note that it's checked *after* checking third-partiness, so just merely seeing that the iframes were treated as third-party in itself isn't too revealing.  (I mean, of course they'd be third-party, because everything is third-party to moz-extension://foo/ :-) ).

> In the case of this bug, it's not germane because there's no
> way to specify which domains (since it's trying to get data from an
> unknowable set of "other" domains, hence the use of <all_urls> for the host
> permissions in the extension in this bug report); but the docs specifically
> say that won't work for the purposes of bypassing TP.
> 
> ntim's request here is for a change to TP (or, I suppose some solution) that
> can accommodate <all_urls>; 1509112 is a potential bug report (maybe?) or a
> request for a change to third-party determination in the context of an
> extension's page's iframe.

Right.  But see https://bugzilla.mozilla.org/show_bug.cgi?id=1509112#c8 and https://bugzilla.mozilla.org/show_bug.cgi?id=1509112#c19.  It seems that in the other bug people are also asking for a way to be able to load arbitrary URLs without listing them all in the manifest, which is why I really think in both bugs we're asking to solve the same problem.  :-)
Flags: needinfo?(ehsan)
(In reply to :Ehsan Akhgari from comment #11)
> (In reply to Tim Nguyen :ntim (please use needinfo?) from comment #9)

> If the code I linked to above is related to host permissions, please note
> that it's checked *after* checking third-partiness, so just merely seeing
> that the iframes were treated as third-party in itself isn't too revealing. 
> (I mean, of course they'd be third-party, because everything is third-party
> to moz-extension://foo/ :-) ).

ahh, that explains a lot.  That code looks at any non-wildcard hosts in the extension's host permissions and does a match.  If it succeeds, the extension is supposed to be able to load it.  If that check only runs after the third-party check, it will fail in lots of use cases.

> Right.  But see https://bugzilla.mozilla.org/show_bug.cgi?id=1509112#c8 and
> https://bugzilla.mozilla.org/show_bug.cgi?id=1509112#c19.  It seems that in
> the other bug people are also asking for a way to be able to load arbitrary
> URLs without listing them all in the manifest, which is why I really think
> in both bugs we're asking to solve the same problem.  :-)

Yes, I think if the above code happened prior to, or in conjunction with, the third-party check, and we allowed wildcards, it would probably address these issues.  However, we may not want to be that lenient, perhaps moving them out of host permissions and into a specific manifest entry.
(In reply to :Ehsan Akhgari from comment #11)
> Can you please clarify what "host permissions" are and how they're connected
> with tracking protection?  I'm not really familiar with the web extensions
> side of things here and I'm trying to connect the dots.  On the channel
> classifier side, we only run this check which seems related to web
> extensions:
> <https://searchfox.org/mozilla-central/rev/
> 49e78df13e7a505827a3a86daae9efdf827133c6/netwerk/base/nsChannelClassifier.
> cpp#548> is this related?

Sure. Host permissions are declared in the manifest, and they are specifically for allowing the extension "special privileges" -- one of those permissions is precisely supposed to be bypassing tracking protection (if the domain is fully-qualified, no wildcards, doesn't work with the special value <all_urls>, etc).


> > ntim's request here is for a change to TP (or, I suppose some solution) that
> > can accommodate <all_urls>; 1509112 is a potential bug report (maybe?) or a
> > request for a change to third-party determination in the context of an
> > extension's page's iframe.
> 
> Right.  But see https://bugzilla.mozilla.org/show_bug.cgi?id=1509112#c8 and
> https://bugzilla.mozilla.org/show_bug.cgi?id=1509112#c19.  It seems that in
> the other bug people are also asking for a way to be able to load arbitrary
> URLs without listing them all in the manifest, which is why I really think
> in both bugs we're asking to solve the same problem.  :-)

Yeah, I agree that this is probably the same root cause, but we're probably looking at two different use cases that might (probably shouldn't) have the same solution? Because one is intended to mitigate TP issues for an (ideally) author-owned other domain, and the other is for a domain owned by some unrelated party.
(In reply to Shane Caraveo (:mixedpuppy) from comment #12)
> (In reply to :Ehsan Akhgari from comment #11)
> > (In reply to Tim Nguyen :ntim (please use needinfo?) from comment #9)
> 
> > If the code I linked to above is related to host permissions, please note
> > that it's checked *after* checking third-partiness, so just merely seeing
> > that the iframes were treated as third-party in itself isn't too revealing. 
> > (I mean, of course they'd be third-party, because everything is third-party
> > to moz-extension://foo/ :-) ).
> 
> ahh, that explains a lot.  That code looks at any non-wildcard hosts in the
> extension's host permissions and does a match.  If it succeeds, the extension
> is supposed to be able to load it.  If that check only runs after the
> third-party check, it will fail in lots of use cases.

As far as I can tell, that has always been the case since that extra check was added in bug 1308640...  (I see no discussion of third-partiness there, FWIW.)

> > Right.  But see https://bugzilla.mozilla.org/show_bug.cgi?id=1509112#c8 and
> > https://bugzilla.mozilla.org/show_bug.cgi?id=1509112#c19.  It seems that in
> > the other bug people are also asking for a way to be able to load arbitrary
> > URLs without listing them all in the manifest, which is why I really think
> > in both bugs we're asking to solve the same problem.  :-)
> 
> Yes, I think if the above code happened prior to, or in conjunction with,
> the third-party check, and we allowed wildcards, it would probably address
> these issues.

See this diff: <https://searchfox.org/mozilla-central/diff/c7c7bd4f51e3ca97b9576ba612c3a2ef7dbb7ba0/netwerk/base/nsChannelClassifier.cpp#153>.  The host permission checks were added when the third-party checks existed already.  I can't tell what the original intention for how they should work was, but one easy thing to check, if we have an add-on with the right host permissions in its manifest, would be to reverse this order and see if doing so would make the host permissions work.

> However, we may not want to be that lenient, perhaps moving
> them out of host permissions and into a specific manifest entry.

Yes, fair point...  But also note that I am not 100% sure if patching the channel classifier in this way is the right way to fix the TP issue, because, for example, with Enhanced Tracking Protection the same hosts would now be considered for cookie blocking, since we would determine them to be third-party.

I'm thinking perhaps a better place to patch things up may now be AntiTrackingCommon::IsOnContentBlockingAllowList().  Basically we can treat all moz-extension:// URLs (or some of them, e.g. the ones that match certain rules) as if they were exempted from all content blocking protections.  That would mean this would work for tracking protection, enhanced tracking protection, as well as anything else to be introduced in the future of this sort, without changing the definition of third-partiness of this type of content in Gecko...
(In reply to David Durst [:ddurst] (Regression Engineering Owner for 63) from comment #13)
> (In reply to :Ehsan Akhgari from comment #11)
> > Can you please clarify what "host permissions" are and how they're connected
> > with tracking protection?  I'm not really familiar with the web extensions
> > side of things here and I'm trying to connect the dots.  On the channel
> > classifier side, we only run this check which seems related to web
> > extensions:
> > <https://searchfox.org/mozilla-central/rev/
> > 49e78df13e7a505827a3a86daae9efdf827133c6/netwerk/base/nsChannelClassifier.
> > cpp#548> is this related?
> 
> Sure. Host permissions are declared in the manifest, and they are
> specifically for allowing the extension "special privileges" -- one of those
> permissions is precisely supposed to be bypassing tracking protection (if
> the domain is fully-qualified, no wildcards, doesn't work with the special
> value <all_urls>, etc).

Thanks!

Does that mean hosts on this list shouldn't ever be blocked by TP, or that hosts on this list, when embedded inside a moz-extension:// URL, shouldn't ever be blocked by TP, or something else?

> > > ntim's request here is for a change to TP (or, I suppose some solution) that
> > > can accommodate <all_urls>; 1509112 is a potential bug report (maybe?) or a
> > > request for a change to third-party determination in the context of an
> > > extension's page's iframe.
> > 
> > Right.  But see https://bugzilla.mozilla.org/show_bug.cgi?id=1509112#c8 and
> > https://bugzilla.mozilla.org/show_bug.cgi?id=1509112#c19.  It seems that in
> > the other bug people are also asking for a way to be able to load arbitrary
> > URLs without listing them all in the manifest, which is why I really think
> > in both bugs we're asking to solve the same problem.  :-)
> 
> Yeah, I agree that this is probably the same root cause, but we're probably
> looking at two different use cases that might (probably shouldn't) have the
> same solution? Because one is intended to mitigate TP issues for an
> (ideally) author-owned other domain, and the other is for a domain owned by
> some unrelated party.

Yes, I'm starting to see this now, thanks for the clarification.

So for the use case described here, do I understand this correctly?  URLs loaded under a moz-extension:// URL shouldn't be blocked by TP.

If that's the use case, what's your take on the problem described at the end of this comment? https://bugzilla.mozilla.org/show_bug.cgi?id=1458374#c4
(In reply to :Ehsan Akhgari from comment #15)
> Does that mean hosts on this list shouldn't ever be blocked by TP, or that
> hosts on this list, when embedded inside a moz-extension:// URL, shouldn't
> ever be blocked by TP, or something else?

It means that hosts on this list, when embedded inside a moz-extension:// URL, shouldn't (ever?) be blocked by TP -- but I'm not clear on the "ever" part, which I think is why we have so many open bugs around this subject at the moment. :)


> If that's the use case, what's your take on the problem described at the end
> of this comment? https://bugzilla.mozilla.org/show_bug.cgi?id=1458374#c4

Well, we're planning to remove the remote iframe in 67. Which leaves the rest of the question. I don't know. I'm sensitive to the double-standard (and the risk to about: pages), but also the complexity of mozbrowser in extension pages.

As noted in https://bugzilla.mozilla.org/show_bug.cgi?id=1458374#c7, the fact that we do have domains grouped in the shavar list (which is easier conceptually on https://github.com/mozilla-services/shavar-prod-lists/blob/master/disconnect-blacklist.json, at least to my brain) means we *could* resolve the issue by granting first-party to the "parent" (company) listed -- provided that that's somehow indicated in the extension (which perhaps should be via host permissions, but maybe not, see bug 1509112).

But that does nothing for #c0, here. So... yeah.
(In reply to David Durst [:ddurst] from comment #16)
> (In reply to :Ehsan Akhgari from comment #15)
> > Does that mean hosts on this list shouldn't ever be blocked by TP, or that
> > hosts on this list, when embedded inside a moz-extension:// URL, shouldn't
> > ever be blocked by TP, or something else?
> 
> It means that hosts on this list, when embedded inside a moz-extension://
> URL, shouldn't (ever?) be blocked by TP -- but I'm not clear on the "ever"
> part, which I think is why we have so many open bugs around this subject at
> the moment. :)

https://phabricator.services.mozilla.com/D14832 fixes the underlying reason why they were sometimes still blocked even with host permissions.  :-)

> > If that's the use case, what's your take on the problem described at the end
> > of this comment? https://bugzilla.mozilla.org/show_bug.cgi?id=1458374#c4
> 
> Well, we're planning to remove the remote iframe in 67. Which leaves the
> rest of the question. I don't know. I'm sensitive to the double-standard
> (and the risk to about: pages), but also the complexity of mozbrowser in
> extension pages.
> 
> As noted in https://bugzilla.mozilla.org/show_bug.cgi?id=1458374#c7, the
> fact that we do have domains grouped in the shavar list (which is easier
> conceptually on
> https://github.com/mozilla-services/shavar-prod-lists/blob/master/disconnect-
> blacklist.json, at least to my brain) means we *could* resolve the issue by
> granting first-party to the "parent" (company) listed -- provided that
> that's somehow indicated in the extension (which perhaps should be via host
> permissions, but maybe not, see bug 1509112).
> 
> But that does nothing for #c0, here. So... yeah.

So with bug 1509112 fixed assuming my patch there lands, the only use case we have now which is unfixed is comment 0.  For that use case we really do need a way for extensions to just be able to bypass tracking protection for any arbitrary URL (it is still unclear to me whether bypassing other anti-tracking features is needed or not, but I would assume yes.)

If we were to capture this as another permission and/or another manifest field, then we could potentially prevent extensions that override the new tab page (and things like that) from using that permission, or vice versa (prevent extensions that request that permission from overriding the new tab page).  Other similar restrictions are also conceivable.  So I think having extension authors opt into that mode explicitly will probably have some advantages for us in terms of having the ability to impose our own controls on how much such extensions can undermine our built-in privacy protections and in what contexts.
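Purely to illustrate the kind of opt-in being discussed (nothing like this exists today; the key name below is invented), such a manifest entry could look something like:

    {
      "name": "Livemarks",
      "permissions": ["<all_urls>"],
      "bypass_content_blocking": {
        "reason": "Fetches user-supplied feed URLs that may appear on the tracking protection list"
      }
    }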
(In reply to :Ehsan Akhgari from comment #17)
> So with bug 1509112 fixed assuming my patch there lands, the only use case
> we have now which is unfixed is comment 0.  For that use case we really do
> need a way for extensions to just be able to bypass tracking protection for
> any arbitrary URL (it is still unclear to me whether bypassing other
> anti-tracking features is needed or not, but I would assume yes.)

We could also fix the heuristics. There's something wrong with the heuristics, if something that's supposed to be important main page content is blocked by Content blocking.
(In reply to :ntim (low availability until February) from comment #18)
> (In reply to :Ehsan Akhgari from comment #17)
> > So with bug 1509112 fixed assuming my patch there lands, the only use case
> > we have now which is unfixed is comment 0.  For that use case we really do
> > need a way for extensions to just be able to bypass tracking protection for
> > any arbitrary URL (it is still unclear to me whether bypassing other
> > anti-tracking features is needed or not, but I would assume yes.)
> 
> We could also fix the heuristics. There's something wrong with the
> heuristics, if something that's supposed to be important main page content
> is blocked by Content blocking.

I'm not sure which heuristics you are referring to here.  At the risk of repeating myself over and over again, here is the problem, once again.  I'm going to copy the graph in https://bugzilla.mozilla.org/show_bug.cgi?id=1458374#c4 here inline:

This is the frame hierarchy in a situation where the bug in comment 0 occurs:

    +-------https://03a31e9c-d34b-6b44-b8a9-2cf918bc0987/----------+
    |                                                              |
    |    +------------https://mail.google.com------------+         |
    |    |                                               |         |
    |    |     +------(something else from Google)----+  |         |
    |    |     |                                      |  |         |
    |    |     +--------------------------------------+  |         |
    |    +-----------------------------------------------+         |
    +--------------------------------------------------------------+

Gecko, being a web engine, tries to detect third-party trackers.  The definition of third-party boils down to: is the content being loaded from the same source as what appears in the URL bar?  Normally, what appears in the URL bar is the location of the top-most content.  This all works completely fine for normal web pages.  mail.google.com would be considered a third-party and would be treated as a 3rd-party tracker (e.g. get blocked by TP).

Now with web extensions, there is no URL bar.  And the URL of the top-most frame means absolutely nothing at all.  In this case, that URL is a moz-extension URL.  We end up comparing it against https://mail.google.com for example and incorrectly think that mail.google.com is a third-party tracker.

The problem here is that from Gecko's point of view, the frame hosted inside the moz-extension frame *is* a third-party frame.  The fact that we want to consider it as something else completely subverts all of the rules we have around what constitutes first-party and what constitutes third-party.  That's what we're discussing right now, how to fix this situation in a way that keeps things working well on the Gecko side, and also makes things workable on the extensions side.

Since you have previously suggested to "fix the heuristics", I just want to be super clear that there *are* no magic heuristics that are getting in our way in the first place; it's just that there is nothing indicating that the content being loaded in the extension context in the above example (or in your case in comment 0) is being loaded in a first-party context.  That's all there is to it, really.
The feed content might be third-party, but considering it as a tracker is definitely incorrect: it is part of the webpage's main functionality. These are the heuristics I'm asking to improve. I'm definitely not saying it's easy, but they are definitely broken when they start detecting a webpage's main functionality as "tracking".
I have an idea for an alternate approach.  The essential problem here is that the live bookmark extension has no way to know that the RSS file will be blocked by TP until it makes the XHR request, so the UX is broken for the user.  What if, instead, the extension could query TP up front about whether a URL will be blocked?  If TP says yes, then the extension can disable its page/browserAction button for that page.  This would be a big UX improvement for the extension.

Yes, it would be preferable for the XHR to just work, but if we're unwilling to open up XHR requests that way, this may be a good middle-road approach.
Then also, with bug 1509112 fixed, if there are very popular feeds that are for some reason blocked by TP, the extension could list those specific feeds in host permissions.
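To illustrate the idea (this API does not exist; the namespace and method name here are hypothetical), the extension side could look roughly like:

    // Hypothetical API: ask up front whether tracking protection would block
    // a URL, and hide the page action instead of letting the fetch fail.
    async function maybeHideButton(tabId, feedUrl) {
      const wouldBlock = await browser.contentBlocking.wouldBlock(feedUrl);
      if (wouldBlock) {
        await browser.pageAction.hide(tabId);
      }
    }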
(In reply to :ntim (low availability until February) from comment #20)
> The feed content might be third-party, but considering it as a tracker is
> definitely incorrect: it is part of the webpage's main functionality.

The content you're loading being part of the main page's functionality has absolutely nothing to do with whether the content tries to track the user or not.  If you believe that it does, then you're under an incorrect impression of how web tracking works, unfortunately.

When you load a feed from companies like reddit which are known to track users' activity across the web, if the browser doesn't provide any protections by default it will do things like send any cookies that the user may have from *.reddit.com to the reddit servers.  These cookies will allow reddit to build a behavioural profile of the user, and correlate the fact that the user fetched that particular feed at that particular time at the location indicated by their IP address (and so on) with all of the information that they have previously obtained from the user (e.g. the user having browsed other pages across the web.)

So just so that we're clear on what's happening, extensions such as Livemarks are causing a privacy risk for the user through fetching content that tracks the user.  Gecko is intervening in a way that the extension doesn't expect (here by blocking the loading of the feed) and the extension breaks as a result.

You are looking at the outcome (the extension breaking) and drawing completely incorrect conclusions (that there is no tracking going on).

The correct conclusion to draw would be that tracking _is_ happening, and Firefox is correctly stopping the tracking, but that is *unexpectedly* breaking the extension.  The question now is how to fix things so that the extension author can, if they choose to do so, opt out of our privacy protections (and while doing so expose their users to the resulting privacy risk) to make their extensions work.  And ideally do so in a way that would allow Firefox to explain the trade-off to the user (e.g. explain that this extension will do things that may override some of our built-in privacy protections, to let the user make an informed decision about whether they'd like to install the extension.)

> These are the heuristics I'm asking to improve. I'm definitely not saying
> it's easy, but they are definitely broken when they start detecting a
> webpage's main functionality as "tracking".

FTR if you really want the "main functionality" here to not involve tracking users, it is *your* responsibility to write the code in the extension in a way that eliminates the ability of companies like reddit to track users.  There is nothing Firefox can do for you to achieve this automatically.

The right way to do so in the case of a feed reader would be to load the feeds in a sandboxed server environment which is proxied away from the user's machines, so that the server side endpoints you connect to (e.g. reddit.com) never end up seeing the machines of real users.  That way they can't exploit any potential cookies that may exist there, abuse the user's IP address, etc.  But that's way out of the scope of this bug.  :-)
I had the impression that websites using Facebook/Google Login were whitelisted, but that doesn't seem to be the case on Strict mode. I was going to ask to put Livemarks under the same bucket (tracking but main web page functionality), but I just realized that wouldn't actually solve anything in strict mode.
(In reply to Shane Caraveo (:mixedpuppy) from comment #21)
> I have an idea for an alternate approach.  The essential problem here is
> that the live bookmark extension has no way to know that the RSS file will
> be blocked by TP until it makes the XHR request, so the UX is broken for
> the user.  What if, instead, the extension could query TP up front about
> whether a URL will be blocked?  If TP says yes, then the extension can
> disable its page/browserAction button for that page.  This would be a big
> UX improvement for the extension.

There are a couple of ways this can be done...

One is by doing the work completely manually.  That is, download our TP list from <https://raw.githubusercontent.com/mozilla-services/shavar-prod-lists/master/disconnect-blacklist.json>, parse it as JSON, ignore things in the Content category (since we don't block anything from that list by default), and then check whether the host you're trying to connect to exists in this list.
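A rough sketch of that manual approach (this assumes the disconnect list keeps its current shape of categories mapping company entries to arrays of domains, and it only does exact host matching; a real implementation would also need to match subdomains):

    // Sketch: build a set of hosts from the shavar/disconnect list, skipping
    // the Content category, then check a feed URL's host against it.
    const LIST_URL =
      "https://raw.githubusercontent.com/mozilla-services/shavar-prod-lists/master/disconnect-blacklist.json";

    async function buildBlockedHostSet() {
      const data = await (await fetch(LIST_URL)).json();
      const hosts = new Set();
      for (const [category, entries] of Object.entries(data.categories)) {
        if (category === "Content") {
          continue; // nothing in Content is blocked by default
        }
        for (const entry of entries) {
          // entry looks like { "Company": { "https://company.com/": ["a.com", ...] } }
          for (const properties of Object.values(entry)) {
            for (const domains of Object.values(properties)) {
              if (Array.isArray(domains)) {
                domains.forEach(d => hosts.add(d));
              }
            }
          }
        }
      }
      return hosts;
    }

    function isOnList(hosts, feedUrl) {
      return hosts.has(new URL(feedUrl).hostname);
    }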

The privacy.websites.trackingProtectionMode API allows you to determine the active TP mode <https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/WebExtensions/API/privacy/websites>.
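For completeness, reading that setting looks roughly like this (it is a BrowserSetting and requires the "privacy" permission):

    // The value is one of "always", "never", or "private_browsing".
    async function getTrackingProtectionMode() {
      const setting = await browser.privacy.websites.trackingProtectionMode.get({});
      return setting.value;
    }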

The other way which would be easier to use would be to add an API to classify a URL and return a Yes/No response on whether it's on the TP list...  This will be a new WebExtension API of course so it's higher overhead to do, I believe.

Neither of these will of course allow the extension to *load* the feed!  For that reason, I still would prefer the idea in comment 17...
(In reply to :ntim (low availability until February) from comment #24)
> I had the impression that websites using Facebook/Google Login were
> whitelisted, but that doesn't seem to be the case on Strict mode. I was
> going to ask to put Livemarks under the same bucket (tracking but main web
> page functionality), but I just realized that wouldn't actually solve
> anything in strict mode.

No, Strict mode just means enabling tracking protection for normal windows, as well as private windows, without any other magic behind the scenes.  :-)

We do have some heuristics for detecting SSO logins with providers such as FB and Google which I think is what you are referring to, but those are for the *cookie blocking* part of Enhanced Tracking Protection, which is an entirely different beast altogether, completely irrelevant to this bug.  (I think maybe that's what you had in mind?)
(In reply to :Ehsan Akhgari from comment #26)
> (In reply to :ntim (low availability until February) from comment #24)
> > I had the impression that websites using Facebook/Google Login were
> > whitelisted, but that doesn't seem to be the case on Strict mode. I was
> > going to ask to put Livemarks under the same bucket (tracking but main web
> > page functionality), but I just realized that wouldn't actually solve
> > anything in strict mode.
> 
> No, Strict mode just means enabling tracking protection for normal windows,
> as well as private windows, without any other magic behind the scenes.  :-)
> 
> We do have some heuristics for detecting SSO logins with providers such as FB
> and Google which I think is what you are referring to, but those are for the
> *cookie blocking* part of Enhanced Tracking Protection, which is an entirely
> different beast altogether, completely irrelevant to this bug.  (I think
> maybe that's what you had in mind?)

I thought that TP was about protecting against tracking without fundamentally breaking webpage main content/functionality (by whitelisting things like SSO logins), but maybe that's not the case...
See Also: → 1515449
(In reply to :ntim (low availability until February) from comment #27)
> (In reply to :Ehsan Akhgari from comment #26)
> > (In reply to :ntim (low availability until February) from comment #24)
> > > I had the impression that websites using Facebook/Google Login were
> > > whitelisted, but that doesn't seem to be the case on Strict mode. I was
> > > going to ask to put Livemarks under the same bucket (tracking but main web
> > > page functionality), but I just realized that wouldn't actually solve
> > > anything in strict mode.
> > 
> > No, Strict mode just means enabling tracking protection for normal windows,
> > as well as private windows, without any other magic behind the scenes.  :-)
> > 
> > We do have some heuristics for detecting SSO logins with providers such as FB
> > and Google which I think is what you are referring to, but those are for the
> > *cookie blocking* part of Enhanced Tracking Protection, which is an entirely
> > different beast altogether, completely irrelevant to this bug.  (I think
> > maybe that's what you had in mind?)
> 
> I thought that TP was about protecting against tracking without
> fundamentally breaking webpage main content/functionality (by whitelisting
> things like SSO logins), but maybe that's not the case...

We do whitelist stuff that we can't block without breaking a lot of web pages, yes.  (We do have bugs with logins not working with TP, see the dependencies of https://bugzilla.mozilla.org/show_bug.cgi?id=1470298.  This isn't a perfect system.)
See Also: → 1502987
Flags: needinfo?(ddurst) → needinfo?(mconca)
Priority: -- → P2

The right way to support this use case, I believe, is to allow optional permissions to add specific URLs even when <all_urls> is specified in the manifest file. That feature is covered in bug 1502987 and has already been triaged as P2 specifically for tracking protection. Closing this bug report and asking that we track this issue there.
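For reference, the extension-side shape of that approach is the existing optional permissions machinery, roughly as sketched below; whether a granted origin would then also bypass tracking protection is exactly what bug 1502987 covers:

    // manifest.json would declare e.g. "optional_permissions": ["*://*/*"],
    // and the extension would request the specific feed origin when the user
    // subscribes to it.  permissions.request() must be called from a user
    // input handler (e.g. a click listener).
    async function requestFeedOrigin(feedUrl) {
      const origin = new URL(feedUrl).origin + "/*";
      return browser.permissions.request({ origins: [origin] });
    }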

Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(mconca)
Resolution: --- → WONTFIX