1231543 - Tracking protection doesn't work on data: URLs

arni2033

Reporter

Description

•

10 years ago

Attached file testcase 1 - Tracking protection doesn't work on data URLs.html — Details

>>> My Info: Win7_64, Nightly 45, 32bit, ID 20151206030448 STR (Steps 2 and 3 are unnecessary): 1. Open "data:" url in a new private window: > data:text/html,<iframe src="https://www.google-analytics.com/"> 2. Open "testcase 1" in private window 3. Save "testcase 1" locally and open saved .html file in private window Result: Tracking Protection doesn't block google-analytics.com in Step 1. It does in Step 2 and 3. Expectations: Tracking Protection should block all malicious resources in private windows.

François Marier [:francois]

Comment 1

•

9 years ago

It doesn't seem to go through tracking protection at all: [Main Thread]: D/nsChannelClassifier nsChannelClassifier[7fbb3701bd60]: Classifying principal https://www.google-analytics.com/ on channel with uri https://www.google-analytics.com/ [Main Thread]: D/nsChannelClassifier nsChannelClassifier[7fbb3701bd60]: suspended channel 7fbb34583848 [URL Classifier]: D/UrlClassifierDbService Checking fragment www.google-analytics.com/ [URL Classifier]: D/UrlClassifierDbService Checking fragment google-analytics.com/ [URL Classifier]: D/UrlClassifierDbService Checking table goog-malware-shavar [URL Classifier]: D/UrlClassifierDbService Checking table goog-unwanted-shavar [URL Classifier]: D/UrlClassifierDbService Checking table test-malware-simple [URL Classifier]: D/UrlClassifierDbService Checking table test-unwanted-simple [URL Classifier]: D/UrlClassifierDbService Checking table goog-phish-shavar [URL Classifier]: D/UrlClassifierDbService Checking table test-phish-simple [URL Classifier]: D/UrlClassifierDbService Checking table test-block-simple [URL Classifier]: D/UrlClassifierDbService Checking table mozplugin-block-digest256 [URL Classifier]: D/UrlClassifierDbService Checking fragment www.google-analytics.com/, hash 7FBAFEF973787AAC7764E1552B580D7EC633DB2D5787BCF56A2ECF4C17FCF1C8 (F9FEBA7F) [URL Classifier]: D/UrlClassifierDbService Probe in goog-malware-shavar: F9FEBA7F, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in goog-unwanted-shavar: F9FEBA7F, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in test-malware-simple: F9FEBA7F, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in test-unwanted-simple: F9FEBA7F, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in goog-phish-shavar: F9FEBA7F, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in test-phish-simple: F9FEBA7F, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in test-block-simple: F9FEBA7F, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in mozplugin-block-digest256: F9FEBA7F, found 0 [URL Classifier]: D/UrlClassifierDbService Checking fragment google-analytics.com/, hash 9DF2E93276C4DED5EF0FB2FBA5698770553F1F95044DDB6DF9BE1E67E58C181E (32E9F29D) [URL Classifier]: D/UrlClassifierDbService Probe in goog-malware-shavar: 32E9F29D, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in goog-unwanted-shavar: 32E9F29D, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in test-malware-simple: 32E9F29D, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in test-unwanted-simple: 32E9F29D, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in goog-phish-shavar: 32E9F29D, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in test-phish-simple: 32E9F29D, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in test-block-simple: 32E9F29D, found 0 [URL Classifier]: D/UrlClassifierDbService Probe in mozplugin-block-digest256: 32E9F29D, found 0 [URL Classifier]: D/UrlClassifierDbService Found 0 results. [URL Classifier]: D/UrlClassifierDbService Found 0 results. [URL Classifier]: D/UrlClassifierDbService query took 0ms [Main Thread]: D/UrlClassifierDbService nsUrlClassifierLookupCallback::LookupComplete [7fbb338e6d00] 0 pending completions [Main Thread]: D/UrlClassifierDbService nsUrlClassifierLookupCallback::HandleResults [7fbb338e6d00, 0 results] [Main Thread]: D/nsChannelClassifier nsChannelClassifier[7fbb3701bd60]:OnClassifyComplete NS_OK (suspended channel) [Main Thread]: D/nsChannelClassifier nsChannelClassifier::MarkEntryClassified[NS_OK] https://www.google-analytics.com/ [URL Classifier]: D/UrlClassifierDbService nsUrlClassifierDBServiceWorker::CacheMisses [7fbb442cdd80] 0 [Main Thread]: D/nsChannelClassifier nsChannelClassifier[7fbb3701bd60]: resuming channel 7fbb34583848 from OnClassifyComplete Maybe ShouldEnableTrackingProtection() takes an early return somewhere in here: https://dxr.mozilla.org/mozilla-central/rev/233ab21b64b5d5e9f2f16ea2d4cfb4c8b293c5c4/netwerk/base/nsChannelClassifier.cpp#69-93

Priority: -- → P5

François Marier [:francois]

Updated

•

9 years ago

Blocks: 1207775

François Marier [:francois]

Updated

•

9 years ago

Priority: P5 → P3

Thomas Nguyen (:tnguyen)

Updated

•

9 years ago

Assignee: nobody → tnguyen

Thomas Nguyen (:tnguyen)

Comment 2

•

9 years ago

This failed with : https://dxr.mozilla.org/mozilla-central/source/dom/base/ThirdPartyUtil.cpp#81 We could not get the URI associated with a window in the data: scheme case which are using NullPrincipal. I have to look closer to see what we can do in this case

François Marier [:francois]

Comment 3

•

9 years ago

Un-assigning for now since I don't think Thomas is actively working on it.

Assignee: tnguyen → nobody

François Marier [:francois]

Updated

•

7 years ago

Whiteboard: tp-leak

François Marier [:francois]

Updated

•

7 years ago

Whiteboard: tp-leak

François Marier [:francois]

Comment 4

•

7 years ago

(In reply to Thomas Nguyen OOO Util 20 October from comment #2) > This failed with : > https://dxr.mozilla.org/mozilla-central/source/dom/base/ThirdPartyUtil.cpp#81 > > We could not get the URI associated with a window in the data: scheme case > which are using NullPrincipal. > I have to look closer to see what we can do in this case It's possible that bug 1456486 will fix this.

Updated

•

7 years ago

Priority: P3 → P2

Dimi Lee [:dimi]

Assignee

Updated

•

7 years ago

Assignee: nobody → dlee

Status: NEW → ASSIGNED

Dimi Lee [:dimi]

Assignee

Updated

•

7 years ago

Priority: P2 → P1

Dimi Lee [:dimi]

Assignee

Comment 5

•

7 years ago

As mentioned in Comment 2, this is because the associated "data scheme" window is using null principal so GetTopWindowURI here [1]will fail.

I guess we can fix this by replacing GetTopWindowURI with TYPE_DOCUMENT check.

But here is my questions:

Do we really want to solve this issue? It is not clear to me that what a tracker can do in this scenario because there is no "first-party" URI.
Maybe cryptomining is one of the cases that a third-party can still harm the user?
We can not generate the pairwise whitelist url because we don't have top-level window's uri, do we just skip whitelist?

[1] https://searchfox.org/mozilla-central/rev/99a2a5a955960b0e58ceade1db1f7652d9db4ba1/netwerk/url-classifier/UrlClassifierCommon.cpp#117

Dimi Lee [:dimi]

Assignee

Comment 6

•

7 years ago

•

Edited

Hi Ehsan,
I would like to know your opinion on this bug(See Comment 5), is this something we want to fix? thank you!

Flags: needinfo?(ehsan)

(no longer active)

Comment 7

•

7 years ago

(In reply to Dimi Lee [:dimi][:dlee] from comment #5)

As mentioned in Comment 2, this is because the associated "data scheme" window is using null principal so GetTopWindowURI here [1]will fail.

I guess we can fix this by replacing GetTopWindowURI with TYPE_DOCUMENT check.

What do you mean by that? I'm not sure if I understand...

I think the problem that we have here is that GetTopWindowURI() is having magic unwanted behaviour we don't want. :-) This null principal filtering business is really unhelpful to us. But then again we probably shouldn't change the behaviour of the function globally either since other consumers probably depend on this. So what to do? Let's add an aAllowNullPrincipal argument (default to false) to the function to allow callers to customize its behaviour, and pass true here.

But here is my questions:

Do we really want to solve this issue? It is not clear to me that what a tracker can do in this scenario because there is no "first-party" URI.

Isn't that the data: URI? Consider the case where the tracker uses some kind of data mining to learn various information from English words found in a URI, if the first-party page is an English page coming from a data: URI, a lot of info could leak that way (unless base64 encoded).

Maybe cryptomining is one of the cases that a third-party can still harm the user?

For sure. Another case is fingerprinting, where they would use a data: URI page to capture a fingerprint of your device and phone it back home. And so on. :-)

So I would say yes we do want to fix this.

We can not generate the pairwise whitelist url because we don't have top-level window's uri, do we just skip whitelist?

Good question. The above suggestion should address this by making sure we would have access to the top-level window's uri.

Flags: needinfo?(ehsan)

Dimi Lee [:dimi]

Assignee

Comment 8

•

7 years ago

Attached patch test patch with aAllowNullPrincipal — Details — Splinter Review

Dimi Lee [:dimi]

Assignee

Comment 9

•

7 years ago

Attached patch test patch with TYPE_DOCUMENT check — Details — Splinter Review

Dimi Lee [:dimi]

Assignee

Comment 10

•

7 years ago

•

Edited

Ehsan, thanks for your suggestion!
Can you help take a quick look for both of the WIP patches and let me know how do you think? thanks!

(In reply to :Ehsan Akhgari (Out of office 5/10-5/21) from comment #7)

What do you mean by that? I'm not sure if I understand...

I think the problem that we have here is that GetTopWindowURI() is having magic unwanted behaviour we don't want. :-) This null principal filtering business is really unhelpful to us. But then again we probably shouldn't change the behaviour of the function globally either since other consumers probably depend on this. So what to do? Let's add an aAllowNullPrincipal argument (default to false) to the function to allow callers to customize its behaviour, and pass true here.

I didn't mean to change anything in GetTopWindowURI(), attach the WIP patch.
I also tried to use the "AllowNullPrincipal" approach. I have to add two idl methods and I feel that it is not as straightforward as I expected.

Isn't that the data: URI? Consider the case where the tracker uses some kind of data mining to learn various information from English words found in a URI, if the first-party page is an English page coming from a data: URI, a lot of info could leak that way (unless base64 encoded).

Just out of curiosity(hope this is not a too dumb question), how does third-party tracker retrieve the URI of the data scheme in this scenario? I test both Firefox and Chrome, both don't include referer in the request, is there any other way to get that?

Maybe cryptomining is one of the cases that a third-party can still harm the user?

For sure. Another case is fingerprinting, where they would use a data: URI page to capture a fingerprint of your device and phone it back home. And so on. :-)

So I would say yes we do want to fix this.

Agree! Thanks for the information :)

We can not generate the pairwise whitelist url because we don't have top-level window's uri, do we just skip whitelist?

Good question. The above suggestion should address this by making sure we would have access to the top-level window's uri.

I tested the "AllowNullPrincipal" approach. URI from NullPrincial is "moz-nullprincipal {UUID}" which cause an error while generating whitelist URI because it is not a valid URI. So we still have to do additional handling with this approach :(

Flags: needinfo?(ehsan)

(no longer active)

Comment 11

•

7 years ago

(In reply to Dimi Lee [:dimi][:dlee] from comment #10)

Ehsan, thanks for your suggestion!
Can you help take a quick look for both of the WIP patches and let me know how do you think? thanks!

(In reply to :Ehsan Akhgari (Out of office 5/10-5/21) from comment #7)

What do you mean by that? I'm not sure if I understand...

I think the problem that we have here is that GetTopWindowURI() is having magic unwanted behaviour we don't want. :-) This null principal filtering business is really unhelpful to us. But then again we probably shouldn't change the behaviour of the function globally either since other consumers probably depend on this. So what to do? Let's add an aAllowNullPrincipal argument (default to false) to the function to allow callers to customize its behaviour, and pass true here.

I didn't mean to change anything in GetTopWindowURI(), attach the WIP patch.
I also tried to use the "AllowNullPrincipal" approach. I have to add two idl methods and I feel that it is not as straightforward as I expected.

I think attachment 9063877 [details] [diff] [review] is the way to go! The other one looks wrong, not sure why TYPE_DOCUMENT implies top-level?

Isn't that the data: URI? Consider the case where the tracker uses some kind of data mining to learn various information from English words found in a URI, if the first-party page is an English page coming from a data: URI, a lot of info could leak that way (unless base64 encoded).

Just out of curiosity(hope this is not a too dumb question), how does third-party tracker retrieve the URI of the data scheme in this scenario? I test both Firefox and Chrome, both don't include referer in the request, is there any other way to get that?

Sure, here is an example:

// served from https://tracker.example/utilities-sdk.js, embedded in the top-level page
fetch("https://tracker.example/pixel", {credentials: "include", method: "POST", body: JSON.stringify({url: location.href, ...})});

We can not generate the pairwise whitelist url because we don't have top-level window's uri, do we just skip whitelist?

Good question. The above suggestion should address this by making sure we would have access to the top-level window's uri.

I tested the "AllowNullPrincipal" approach. URI from NullPrincial is "moz-nullprincipal {UUID}" which cause an error while generating whitelist URI because it is not a valid URI. So we still have to do additional handling with this approach :(

Oh well then. :-( In practice we currently don't need that handling anyway (right?) so leaving it broken for now and deferring fixing it to a follow-up wouldn't be the end of the world.

Flags: needinfo?(ehsan)

Dimi Lee [:dimi]

Assignee

Comment 12

•

7 years ago

(In reply to :Ehsan Akhgari from comment #11)

(In reply to Dimi Lee [:dimi][:dlee] from comment #10)

I didn't mean to change anything in GetTopWindowURI(), attach the WIP patch.
I also tried to use the "AllowNullPrincipal" approach. I have to add two idl methods and I feel that it is not as straightforward as I expected.

I think attachment 9063877 [details] [diff] [review] is the way to go! The other one looks wrong, not sure why TYPE_DOCUMENT implies top-level?

Hi Ehsan,
From the comment here[1], it says "Indicates a document at the top-level".
Also, while we implemented URL classifier classification by default, we also use TYPE_DOCUMENT indicates a top-level load[2], did I miss anything here?

[1] https://searchfox.org/mozilla-central/rev/952521e6164ddffa3f34bc8cfa5a81afc5b859c4/dom/base/nsIContentPolicy.idl#88
[2] https://searchfox.org/mozilla-central/rev/952521e6164ddffa3f34bc8cfa5a81afc5b859c4/netwerk/base/nsNetUtil.cpp#3039

Flags: needinfo?(ehsan)

(no longer active)

Comment 13

•

7 years ago

(In reply to Dimi Lee [:dimi][:dlee] from comment #12)

(In reply to :Ehsan Akhgari from comment #11)

(In reply to Dimi Lee [:dimi][:dlee] from comment #10)

I didn't mean to change anything in GetTopWindowURI(), attach the WIP patch.
I also tried to use the "AllowNullPrincipal" approach. I have to add two idl methods and I feel that it is not as straightforward as I expected.

I think attachment 9063877 [details] [diff] [review] is the way to go! The other one looks wrong, not sure why TYPE_DOCUMENT implies top-level?

Hi Ehsan,
From the comment here[1], it says "Indicates a document at the top-level".
Also, while we implemented URL classifier classification by default, we also use TYPE_DOCUMENT indicates a top-level load[2], did I miss anything here?

Ah, right. No the mistake was mine, sorry about that!

But I still prefer the other approach, since assuming that it's OK to bail out silently for top level loads for all classifications is probably something that's going to cause subtle bugs in the long term if we add a classifier later on which needs to catch top level loads for some reason...

Flags: needinfo?(ehsan)

(no longer active)

Comment 14

•

7 years ago

Hey Dimi,

While working on something completely unrelated, I saw that this null principal check is causing problems for us there too. Unlike with this case, there all that was happening was we were doing the usual third-partiness checks. That means that a solution like the one I proposed here (the "AllowNullPrincipal" approach) wouldn't work there.

So I decided to dig up why we have this check in the first place. This took me back to bug 1088183 comment 8. It turns out that this check was added because back then we couldn't serialize null principals across IPC(!!!). Now that we have no such limitation perhaps we should just remove this check, and maybe that is the right solution to this bug.

(no longer active)

Updated

•

7 years ago

Comment 15

•

7 years ago

(In reply to :Ehsan Akhgari from comment #14)

So I decided to dig up why we have this check in the first place. This took me back to bug 1088183 comment 8. It turns out that this check was added because back then we couldn't serialize null principals across IPC(!!!). Now that we have no such limitation perhaps we should just remove this check, and maybe that is the right solution to this bug.

Hi Ehsan, Thank you very much for the information!
Agree, if the limitation can be removed, it is the right solution :)

(no longer active)

Comment 16

•

7 years ago

BTW, one more note about this bug... Bug 1401895 blocked the ability of web pages to do top-level navigations to data: URIs. Therefore I think WONTFIX could be a perfectly fine resolution to this bug now, since the only top-level data: URIs that can now exist are the ones that users type in to the address bar manually, and I don't believe there is a real-life tracking risk for those URIs.

Needinfo-ing Steven for a second opinion.

Flags: needinfo?(senglehardt)

Comment 17

•

6 years ago

Therefore I think WONTFIX could be a perfectly fine resolution to this bug now, since the only top-level data: URIs that can now exist are the ones that users type in to the address bar manually, and I don't believe there is a real-life tracking risk for those URIs.

That sounds right to me as well. I have verified that we do classify resources loaded in embedded data: documents (https://senglehardt.com/test/trackingprotection/data_urls.html).

Reading through [0] I see that sites can still open data:application/pdf as top-level navigations, which I presume will open in pdfjs. From what I understand, pdfjs does not support embedded JS or external images so this should be safe.

[0] https://blog.mozilla.org/security/2017/11/27/blocking-top-level-navigations-data-urls-firefox-59/

Flags: needinfo?(senglehardt)

Dimi Lee [:dimi]

Assignee

Comment 18

•

6 years ago

Base on Comment 16 & Comment 17, WONT FIX this issue.

Status: ASSIGNED → RESOLVED

Closed: 6 years ago

Resolution: --- → WONTFIX

testcase 1 - Tracking protection doesn't work on data URLs.html 10 years ago arni2033 48 bytes, text/html		Details
test patch with aAllowNullPrincipal 7 years ago Dimi Lee [:dimi] 8.06 KB, patch		Details \| Diff \| Splinter Review
test patch with TYPE_DOCUMENT check 7 years ago Dimi Lee [:dimi] 2.42 KB, patch		Details \| Diff \| Splinter Review