Closed Bug 1231543 Opened 9 years ago Closed 5 years ago

Tracking protection doesn't work on data: URLs

Categories

(Toolkit :: Safe Browsing, defect, P1)

defect

Tracking

()

RESOLVED WONTFIX
Tracking Status
firefox45 --- affected

People

(Reporter: arni2033, Assigned: dimi)

References

(Blocks 1 open bug, )

Details

Attachments

(3 files)

>>>   My Info:   Win7_64, Nightly 45, 32bit, ID 20151206030448
STR (Steps 2 and 3 are unnecessary):

1. Open "data:" url in a new private window:
>   data:text/html,<iframe src="https://www.google-analytics.com/">
2. Open "testcase 1" in private window
3. Save "testcase 1" locally and open saved .html file in private window

Result:       
 Tracking Protection doesn't block google-analytics.com in Step 1. It does in Step 2 and 3.

Expectations: 
 Tracking Protection should block all malicious resources in private windows.
It doesn't seem to go through tracking protection at all:

[Main Thread]: D/nsChannelClassifier nsChannelClassifier[7fbb3701bd60]: Classifying principal https://www.google-analytics.com/ on channel with uri https://www.google-analytics.com/
[Main Thread]: D/nsChannelClassifier nsChannelClassifier[7fbb3701bd60]: suspended channel 7fbb34583848
[URL Classifier]: D/UrlClassifierDbService Checking fragment www.google-analytics.com/
[URL Classifier]: D/UrlClassifierDbService Checking fragment google-analytics.com/
[URL Classifier]: D/UrlClassifierDbService Checking table goog-malware-shavar
[URL Classifier]: D/UrlClassifierDbService Checking table goog-unwanted-shavar
[URL Classifier]: D/UrlClassifierDbService Checking table test-malware-simple
[URL Classifier]: D/UrlClassifierDbService Checking table test-unwanted-simple
[URL Classifier]: D/UrlClassifierDbService Checking table goog-phish-shavar
[URL Classifier]: D/UrlClassifierDbService Checking table test-phish-simple
[URL Classifier]: D/UrlClassifierDbService Checking table test-block-simple
[URL Classifier]: D/UrlClassifierDbService Checking table mozplugin-block-digest256
[URL Classifier]: D/UrlClassifierDbService Checking fragment www.google-analytics.com/, hash 7FBAFEF973787AAC7764E1552B580D7EC633DB2D5787BCF56A2ECF4C17FCF1C8 (F9FEBA7F)
[URL Classifier]: D/UrlClassifierDbService Probe in goog-malware-shavar: F9FEBA7F, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in goog-unwanted-shavar: F9FEBA7F, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in test-malware-simple: F9FEBA7F, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in test-unwanted-simple: F9FEBA7F, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in goog-phish-shavar: F9FEBA7F, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in test-phish-simple: F9FEBA7F, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in test-block-simple: F9FEBA7F, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in mozplugin-block-digest256: F9FEBA7F, found 0
[URL Classifier]: D/UrlClassifierDbService Checking fragment google-analytics.com/, hash 9DF2E93276C4DED5EF0FB2FBA5698770553F1F95044DDB6DF9BE1E67E58C181E (32E9F29D)
[URL Classifier]: D/UrlClassifierDbService Probe in goog-malware-shavar: 32E9F29D, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in goog-unwanted-shavar: 32E9F29D, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in test-malware-simple: 32E9F29D, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in test-unwanted-simple: 32E9F29D, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in goog-phish-shavar: 32E9F29D, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in test-phish-simple: 32E9F29D, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in test-block-simple: 32E9F29D, found 0
[URL Classifier]: D/UrlClassifierDbService Probe in mozplugin-block-digest256: 32E9F29D, found 0
[URL Classifier]: D/UrlClassifierDbService Found 0 results.
[URL Classifier]: D/UrlClassifierDbService Found 0 results.
[URL Classifier]: D/UrlClassifierDbService query took 0ms
[Main Thread]: D/UrlClassifierDbService nsUrlClassifierLookupCallback::LookupComplete [7fbb338e6d00] 0 pending completions
[Main Thread]: D/UrlClassifierDbService nsUrlClassifierLookupCallback::HandleResults [7fbb338e6d00, 0 results]
[Main Thread]: D/nsChannelClassifier nsChannelClassifier[7fbb3701bd60]:OnClassifyComplete NS_OK (suspended channel)
[Main Thread]: D/nsChannelClassifier nsChannelClassifier::MarkEntryClassified[NS_OK] https://www.google-analytics.com/
[URL Classifier]: D/UrlClassifierDbService nsUrlClassifierDBServiceWorker::CacheMisses [7fbb442cdd80] 0
[Main Thread]: D/nsChannelClassifier nsChannelClassifier[7fbb3701bd60]: resuming channel 7fbb34583848 from OnClassifyComplete

Maybe ShouldEnableTrackingProtection() takes an early return somewhere in here:
https://dxr.mozilla.org/mozilla-central/rev/233ab21b64b5d5e9f2f16ea2d4cfb4c8b293c5c4/netwerk/base/nsChannelClassifier.cpp#69-93
Priority: -- → P5
Priority: P5 → P3
Assignee: nobody → tnguyen
This failed with :
https://dxr.mozilla.org/mozilla-central/source/dom/base/ThirdPartyUtil.cpp#81

We could not get the URI associated with a window in the data: scheme case which are using NullPrincipal.
I have to look closer to see what we can do in this case
Un-assigning for now since I don't think Thomas is actively working on it.
Assignee: tnguyen → nobody
Whiteboard: tp-leak
Whiteboard: tp-leak
(In reply to Thomas Nguyen OOO Util 20 October from comment #2)
> This failed with :
> https://dxr.mozilla.org/mozilla-central/source/dom/base/ThirdPartyUtil.cpp#81
> 
> We could not get the URI associated with a window in the data: scheme case
> which are using NullPrincipal.
> I have to look closer to see what we can do in this case

It's possible that bug 1456486 will fix this.
See Also: → 1456486
Priority: P3 → P2
Assignee: nobody → dlee
Status: NEW → ASSIGNED
Priority: P2 → P1

As mentioned in Comment 2, this is because the associated "data scheme" window is using null principal so GetTopWindowURI here [1]will fail.

I guess we can fix this by replacing GetTopWindowURI with TYPE_DOCUMENT check.

But here is my questions:

  1. Do we really want to solve this issue? It is not clear to me that what a tracker can do in this scenario because there is no "first-party" URI.
    Maybe cryptomining is one of the cases that a third-party can still harm the user?
  2. We can not generate the pairwise whitelist url because we don't have top-level window's uri, do we just skip whitelist?

[1] https://searchfox.org/mozilla-central/rev/99a2a5a955960b0e58ceade1db1f7652d9db4ba1/netwerk/url-classifier/UrlClassifierCommon.cpp#117

Hi Ehsan,
I would like to know your opinion on this bug(See Comment 5), is this something we want to fix? thank you!

Flags: needinfo?(ehsan)

(In reply to Dimi Lee [:dimi][:dlee] from comment #5)

As mentioned in Comment 2, this is because the associated "data scheme" window is using null principal so GetTopWindowURI here [1]will fail.

I guess we can fix this by replacing GetTopWindowURI with TYPE_DOCUMENT check.

What do you mean by that? I'm not sure if I understand...

I think the problem that we have here is that GetTopWindowURI() is having magic unwanted behaviour we don't want. :-) This null principal filtering business is really unhelpful to us. But then again we probably shouldn't change the behaviour of the function globally either since other consumers probably depend on this. So what to do? Let's add an aAllowNullPrincipal argument (default to false) to the function to allow callers to customize its behaviour, and pass true here.

But here is my questions:

  1. Do we really want to solve this issue? It is not clear to me that what a tracker can do in this scenario because there is no "first-party" URI.

Isn't that the data: URI? Consider the case where the tracker uses some kind of data mining to learn various information from English words found in a URI, if the first-party page is an English page coming from a data: URI, a lot of info could leak that way (unless base64 encoded).

Maybe cryptomining is one of the cases that a third-party can still harm the user?

For sure. Another case is fingerprinting, where they would use a data: URI page to capture a fingerprint of your device and phone it back home. And so on. :-)

So I would say yes we do want to fix this.

  1. We can not generate the pairwise whitelist url because we don't have top-level window's uri, do we just skip whitelist?

Good question. The above suggestion should address this by making sure we would have access to the top-level window's uri.

Flags: needinfo?(ehsan)

Ehsan, thanks for your suggestion!
Can you help take a quick look for both of the WIP patches and let me know how do you think? thanks!

(In reply to :Ehsan Akhgari (Out of office 5/10-5/21) from comment #7)

What do you mean by that? I'm not sure if I understand...

I think the problem that we have here is that GetTopWindowURI() is having magic unwanted behaviour we don't want. :-) This null principal filtering business is really unhelpful to us. But then again we probably shouldn't change the behaviour of the function globally either since other consumers probably depend on this. So what to do? Let's add an aAllowNullPrincipal argument (default to false) to the function to allow callers to customize its behaviour, and pass true here.

I didn't mean to change anything in GetTopWindowURI(), attach the WIP patch.
I also tried to use the "AllowNullPrincipal" approach. I have to add two idl methods and I feel that it is not as straightforward as I expected.

Isn't that the data: URI? Consider the case where the tracker uses some kind of data mining to learn various information from English words found in a URI, if the first-party page is an English page coming from a data: URI, a lot of info could leak that way (unless base64 encoded).

Just out of curiosity(hope this is not a too dumb question), how does third-party tracker retrieve the URI of the data scheme in this scenario? I test both Firefox and Chrome, both don't include referer in the request, is there any other way to get that?

Maybe cryptomining is one of the cases that a third-party can still harm the user?

For sure. Another case is fingerprinting, where they would use a data: URI page to capture a fingerprint of your device and phone it back home. And so on. :-)

So I would say yes we do want to fix this.

Agree! Thanks for the information :)

  1. We can not generate the pairwise whitelist url because we don't have top-level window's uri, do we just skip whitelist?

Good question. The above suggestion should address this by making sure we would have access to the top-level window's uri.

I tested the "AllowNullPrincipal" approach. URI from NullPrincial is "moz-nullprincipal {UUID}" which cause an error while generating whitelist URI because it is not a valid URI. So we still have to do additional handling with this approach :(

Flags: needinfo?(ehsan)

(In reply to Dimi Lee [:dimi][:dlee] from comment #10)

Ehsan, thanks for your suggestion!
Can you help take a quick look for both of the WIP patches and let me know how do you think? thanks!

(In reply to :Ehsan Akhgari (Out of office 5/10-5/21) from comment #7)

What do you mean by that? I'm not sure if I understand...

I think the problem that we have here is that GetTopWindowURI() is having magic unwanted behaviour we don't want. :-) This null principal filtering business is really unhelpful to us. But then again we probably shouldn't change the behaviour of the function globally either since other consumers probably depend on this. So what to do? Let's add an aAllowNullPrincipal argument (default to false) to the function to allow callers to customize its behaviour, and pass true here.

I didn't mean to change anything in GetTopWindowURI(), attach the WIP patch.
I also tried to use the "AllowNullPrincipal" approach. I have to add two idl methods and I feel that it is not as straightforward as I expected.

I think attachment 9063877 [details] [diff] [review] is the way to go! The other one looks wrong, not sure why TYPE_DOCUMENT implies top-level?

Isn't that the data: URI? Consider the case where the tracker uses some kind of data mining to learn various information from English words found in a URI, if the first-party page is an English page coming from a data: URI, a lot of info could leak that way (unless base64 encoded).

Just out of curiosity(hope this is not a too dumb question), how does third-party tracker retrieve the URI of the data scheme in this scenario? I test both Firefox and Chrome, both don't include referer in the request, is there any other way to get that?

Sure, here is an example:

// served from https://tracker.example/utilities-sdk.js, embedded in the top-level page
fetch("https://tracker.example/pixel", {credentials: "include", method: "POST", body: JSON.stringify({url: location.href, ...})});

  1. We can not generate the pairwise whitelist url because we don't have top-level window's uri, do we just skip whitelist?

Good question. The above suggestion should address this by making sure we would have access to the top-level window's uri.

I tested the "AllowNullPrincipal" approach. URI from NullPrincial is "moz-nullprincipal {UUID}" which cause an error while generating whitelist URI because it is not a valid URI. So we still have to do additional handling with this approach :(

Oh well then. :-( In practice we currently don't need that handling anyway (right?) so leaving it broken for now and deferring fixing it to a follow-up wouldn't be the end of the world.

Flags: needinfo?(ehsan)

(In reply to :Ehsan Akhgari from comment #11)

(In reply to Dimi Lee [:dimi][:dlee] from comment #10)

I didn't mean to change anything in GetTopWindowURI(), attach the WIP patch.
I also tried to use the "AllowNullPrincipal" approach. I have to add two idl methods and I feel that it is not as straightforward as I expected.

I think attachment 9063877 [details] [diff] [review] is the way to go! The other one looks wrong, not sure why TYPE_DOCUMENT implies top-level?

Hi Ehsan,
From the comment here[1], it says "Indicates a document at the top-level".
Also, while we implemented URL classifier classification by default, we also use TYPE_DOCUMENT indicates a top-level load[2], did I miss anything here?

[1] https://searchfox.org/mozilla-central/rev/952521e6164ddffa3f34bc8cfa5a81afc5b859c4/dom/base/nsIContentPolicy.idl#88
[2] https://searchfox.org/mozilla-central/rev/952521e6164ddffa3f34bc8cfa5a81afc5b859c4/netwerk/base/nsNetUtil.cpp#3039

Flags: needinfo?(ehsan)

(In reply to Dimi Lee [:dimi][:dlee] from comment #12)

(In reply to :Ehsan Akhgari from comment #11)

(In reply to Dimi Lee [:dimi][:dlee] from comment #10)

I didn't mean to change anything in GetTopWindowURI(), attach the WIP patch.
I also tried to use the "AllowNullPrincipal" approach. I have to add two idl methods and I feel that it is not as straightforward as I expected.

I think attachment 9063877 [details] [diff] [review] is the way to go! The other one looks wrong, not sure why TYPE_DOCUMENT implies top-level?

Hi Ehsan,
From the comment here[1], it says "Indicates a document at the top-level".
Also, while we implemented URL classifier classification by default, we also use TYPE_DOCUMENT indicates a top-level load[2], did I miss anything here?

Ah, right. No the mistake was mine, sorry about that!

But I still prefer the other approach, since assuming that it's OK to bail out silently for top level loads for all classifications is probably something that's going to cause subtle bugs in the long term if we add a classifier later on which needs to catch top level loads for some reason...

Flags: needinfo?(ehsan)

Hey Dimi,

While working on something completely unrelated, I saw that this null principal check is causing problems for us there too. Unlike with this case, there all that was happening was we were doing the usual third-partiness checks. That means that a solution like the one I proposed here (the "AllowNullPrincipal" approach) wouldn't work there.

So I decided to dig up why we have this check in the first place. This took me back to bug 1088183 comment 8. It turns out that this check was added because back then we couldn't serialize null principals across IPC(!!!). Now that we have no such limitation perhaps we should just remove this check, and maybe that is the right solution to this bug.

See Also: → 1558969

(In reply to :Ehsan Akhgari from comment #14)

So I decided to dig up why we have this check in the first place. This took me back to bug 1088183 comment 8. It turns out that this check was added because back then we couldn't serialize null principals across IPC(!!!). Now that we have no such limitation perhaps we should just remove this check, and maybe that is the right solution to this bug.

Hi Ehsan, Thank you very much for the information!
Agree, if the limitation can be removed, it is the right solution :)

BTW, one more note about this bug... Bug 1401895 blocked the ability of web pages to do top-level navigations to data: URIs. Therefore I think WONTFIX could be a perfectly fine resolution to this bug now, since the only top-level data: URIs that can now exist are the ones that users type in to the address bar manually, and I don't believe there is a real-life tracking risk for those URIs.

Needinfo-ing Steven for a second opinion.

Flags: needinfo?(senglehardt)
See Also: 15589691558994

Therefore I think WONTFIX could be a perfectly fine resolution to this bug now, since the only top-level data: URIs that can now exist are the ones that users type in to the address bar manually, and I don't believe there is a real-life tracking risk for those URIs.

That sounds right to me as well. I have verified that we do classify resources loaded in embedded data: documents (https://senglehardt.com/test/trackingprotection/data_urls.html).

Reading through [0] I see that sites can still open data:application/pdf as top-level navigations, which I presume will open in pdfjs. From what I understand, pdfjs does not support embedded JS or external images so this should be safe.

[0] https://blog.mozilla.org/security/2017/11/27/blocking-top-level-navigations-data-urls-firefox-59/

Flags: needinfo?(senglehardt)

Base on Comment 16 & Comment 17, WONT FIX this issue.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: