Open Bug 1437626 Opened 2 years ago Updated 16 days ago

GetTopWindowURI should return the URI that a service worker is registered to instead of null

Categories

(Toolkit :: Safe Browsing, enhancement, P1)

enhancement

Tracking

()

ASSIGNED
Tracking Status
firefox60 --- affected

People

(Reporter: erahm, Assigned: dimi)

References

(Blocks 1 open bug)

Details

(Whiteboard: [necko-triaged] tp-leak)

Attachments

(3 files)

+++ This bug was initially created as a clone of Bug #1384638 +++

In bug 1384638 we silenced a rather verbose warning, but we think there's still an issue.

From bug 1384638, comment 6:

> > Similar to bug 1408410 comment 5, I found that when a request is made from a
> > service worker, GetTopWindowURI() returns fail.
> > In this case, maybe we should make GetTopWindowURI() return the uri the
> > service worker is registered for?
> > 
> > What do you think, Francois?
> 
> That sounds right to me. If a Google service worker wants to load Google
> Analytics, that should be treated as a first-party load. If another
> organization wants to load it then we should consider it a third-party load.
Does this mean:

a) nsIChannel's that the service worker produced by doing something like fetch()?
b) nsIChannel's being intercepted by a service worker with a synthetic response?
ni? on comment 1
Flags: needinfo?(erahm)
(In reply to Valentin Gosu [:valentin] from comment #2)
> ni? on comment 1

François?
Flags: needinfo?(erahm) → needinfo?(francois)
I chatted with Kershaw about this before Austin, but I don't remember the details.

I have a vague recollection that there was a service worker hole and another non-service worker one, but I unfortunately didn't write any of it down since he was totally on top of this and already had solutions in mind.
Flags: needinfo?(francois)
ni? Jason to figure out how we want to prioritize this in terms of upcoming work (it's unclear to me), and adjust/assign as necessary.
Flags: needinfo?(jduell.mcbugs)
Whiteboard: [necko-triaged]
I re-read comment 0 and I think I might understand a bit better.  It sounds like this is for a `fetch()` initiated from a service worker thread.  So (a) from comment 1.

I think maybe the best fix here would be to change HttpBaseChannel::GetTopWindowURI() to handle requests from worker clients:

https://searchfox.org/mozilla-central/rev/056a4057575029f58eb69ae89bc7b6c3c11e2e28/netwerk/protocol/http/HttpBaseChannel.cpp#2319

The concept of "top window URI" is kind of weird now that we have contexts that don't associate to a single top window.  For example, service workers are not associated with any window directly.  Shared workers can be associated with many windows and we probably treat whichever one attached last as "top" today.

Anyway, if GetTopWindowURI() gets a nullptr back from GetTopWindowForChannel(), I propose that we make it do something like:

  Maybe<ClientInfo> client = mLoadInfo->GetClientInfo();
  if (client.isSome()) {
    return NS_NewURI(aTopWindowURI, client.ref().URL(), nullptr);
  }

Or maybe use the principal origin without any suffix instead of the URL() since we can have blob URL workers, etc.  We could also restrict this check based on client type (window/worker/etc) if desired.

This does not do any kind of "find the top" operation, but things like shared worker and service worker don't really have a concept of "top".
I guess we should consider, however, what should happen in the case where:

1. Top window is origin "foo"
2. Nested frame is origin "bar"
3. Nested frame talks to ServiceWorker on origin "bar" using a FetchEvent, postMessage, etc.
4. ServiceWorker on origin "bar" initiates a fetch()

Or replace the ServiceWorker here with a SharedWorker.
The overall goal here is to prevent a workers from loading trackers unless the page/worker and the tracker belong to the same party. To do that, we compare origins and we group related origins together (e.g. google.com, google.ca and google-analytics.com are considered the same party).

For example, this is the behavior I would expect in these two cases:

1. The worker on worker.com is associated with a single window at URL example.com:
   (a) It should be forbidden to make network requests to tracker.com if it's on the TP list.
   (b) It should be allowed to make network requests to worker.com even if it's on the TP list.
   (c) It should be allowed to make network requests to example.com even if it's on the TP list.

2. The worker on worker.com is not associated with a window:
  (a) It should also be forbidden to make network requests to tracker.com if it's on the TP list.
  (b) It should also be allowed to make network requests to worker.com even if it's on the TP list.
  (c) It should be forbidden to make network requests to example.com if it's on the TP list.

In order to do the origin comparison, we need to know what the "topmost" URL is. If the worker is tied to a single window, then I guess it should be that window URL. If it's not tied to a window, then we should probably use the URL of the worker itself.

For workers that are associated with multiple windows then we should probably use the URL of the worker.

Does that cover all of the cases?
(In reply to François Marier [:francois] from comment #8)
> The overall goal here is to prevent a workers from loading trackers unless
> the page/worker and the tracker belong to the same party. To do that, we
> compare origins and we group related origins together (e.g. google.com,
> google.ca and google-analytics.com are considered the same party).

Are we explicitly not dealing with the situation where origins can communicate from nested contexts to top level contexts here?

For example,

1) example.com is a top level window and has a nested iframe worker.com
2) worker.com is also open as a top level window
3) top level worker.com window loads a tracker.com script
4) nested worker.com iframe communicates data to top level worker.com window using BroadcastChannel/SharedWorker/etc to exfiltrate tracking data

In this case you could argue that opening a top level worker.com window is opt'ing in to the "top level" behavior for the origin by the user.  But consider how this plays out if we replace "top level window" with "worker without an associated window".

1) example.com is a top level window and has a nested iframe worker.com
2) nested worker.com iframe registers a service worker
3) worker.com service worker loads a tracker.com script because it does not have an explicit associated window and is considered "top level" itself
4) nested worker.com iframe communicates with the worker.com service worker using BroadcastChannel/postMessage/FetchEvent/etc to exfiltrate tracking data

Also consider the case where a service worker is used to load a tracking script and mask the tracking script URL from the outer page:

1) example.com is a top level window and has a nested iframe worker.com
2) nested worker.com iframe registers a service worker
3) nested worker.com iframe loads a worker.com script with a particular URL
4) worker.com service worker receives a FetchEvent for the script load, but instead loads a tracker.com script
5) worker.com service worker extracts data from the tracker.com script Response and creates a new synthetic Response, thus masking its URL
6) nested worker.com effectively loads the tracking.com script content

I feel like maybe we should have some heuristic like:

a) If there are any top level windows of a given origin, then the service worker is considered top level as well.  Otherwise the service worker is considered not top level.
b) A SharedWorker is only considered top level if one of the clients attached to it is a top level window.

Expressing will probably require a different API from GetTopWindowURI(), though.

What do you think?
Flags: needinfo?(francois)
The other, more holistic, way of dealing with this is to implement a partitioning scheme like safari's "intelligent tracking protection".  We have first-party isolation, but that doesn't have the stuff that auto-promotes to first-party for sites the user has explicitly interacted with in the past.
Sorry, the partitioning helps here because it prevents all the communication and interactions I mention in comment 9.  In general those are all based on same-origin principal checks.
I think you're right about TopWindowURI being insufficient.

Ignoring for a moment the details of what we replace TopWindowURI with, here is what I would expect to see.
Flags: needinfo?(francois)
Attached image case1.png
(In reply to Ben Kelly [:bkelly] from comment #9)
> (In reply to François Marier [:francois] from comment #8)
> 1) example.com is a top level window and has a nested iframe worker.com
> 2) worker.com is also open as a top level window
> 3) top level worker.com window loads a tracker.com script
> 4) nested worker.com iframe communicates data to top level worker.com window
> using BroadcastChannel/SharedWorker/etc to exfiltrate tracking data

Here, the top-level worker.com page would be unable to load the tracker.com script in the first place because worker.com and tracker.com are not same-origin. Therefore there would be nothing to send over the BroadcastChannel since the fetch()/xhr() would fail.
Attached image case2.png
(In reply to Ben Kelly [:bkelly] from comment #9)
> 1) example.com is a top level window and has a nested iframe worker.com
> 2) nested worker.com iframe registers a service worker
> 3) worker.com service worker loads a tracker.com script because it does not
> have an explicit associated window and is considered "top level" itself
> 4) nested worker.com iframe communicates with the worker.com service worker
> using BroadcastChannel/postMessage/FetchEvent/etc to exfiltrate tracking data

Here we would look at the service worker's "origin" (worker.com) and see that it's not the same as tracker.com and so we would block that load.
Attached image case3.png
(In reply to Ben Kelly [:bkelly] from comment #9)
> 1) example.com is a top level window and has a nested iframe worker.com
> 2) nested worker.com iframe registers a service worker
> 3) nested worker.com iframe loads a worker.com script with a particular URL
> 4) worker.com service worker receives a FetchEvent for the script load, but
> instead loads a tracker.com script
> 5) worker.com service worker extracts data from the tracker.com script
> Response and creates a new synthetic Response, thus masking its URL
> 6) nested worker.com effectively loads the tracking.com script content

This one is pretty interesting because the worker.com iframe would be allowed to load a tracker.com script if it is "same-origin" with the top-level page (example.com). For example, a Facebook game in an iframe on Facebook.com can load Facebook's tracking scripts.

So perhaps in this case the origin comparison should be between:

1. worker.com and tracker.com (as in case #2)
2. example.com and tracker.com

and we let the load proceed if either of these succeeds.
So it sounds like we need to change the check to something like:

    if isServiceWorkerLoad {
        return isSameOrigin(workerOrigin, resourceUri) or isSameOrigin(registeringWindowUri, resourceUri)
    } else {
        return isSameOrigin(toplevelwindowURI, resourceUri)
    }

Can a service worker "service" more than one origin (e.g. example.com and example.net) or is it tied to example.com?

From the name, it sounds like SharedWorkers might be able to "service" more than one origin. In that case, we probably would need to be stricter and only check isSameOrigin(workerOrigin, resourceUri).
This may help:
https://www.w3.org/TR/service-workers/#origin-restriction

They can be imported, this area seems to consist of some flux.

:asuth, thoughts?
Flags: needinfo?(bugmail)
(In reply to François Marier [:francois] from comment #16)
> Can a service worker "service" more than one origin (e.g. example.com and
> example.net) or is it tied to example.com?

ServiceWorkers are strictly same-origin.  The only way they can interact with other origins is:
- via importScripts() as defined at https://w3c.github.io/ServiceWorker/#importscripts
- via fetch(), obeying the rules or fetch and CORS.
- via cross-origin MessagePorts passed to it via postMessage from a same-origin source that received the port via cross-origin window.postMessage() in the first place.

> From the name, it sounds like SharedWorkers might be able to "service" more
> than one origin. In that case, we probably would need to be stricter and
> only check isSameOrigin(workerOrigin, resourceUri).

No, SharedWorkers are also strictly same-origin.  "Shared" comes from the ability of multiple same-origin pages to create and reference a named SharedWorker that is a single thread they all can talk to with a single global scope.  When all the pages go away, the SharedWorker goes away.  (The spec does allow the SharedWorker to slightly out-live the last page if we expect a navigation and subsequent page logic will "create" the SharedWorker again, but we'd expect to hold the SharedWorker suspended/frozen and unable to run script if that's the case.  We similarly handle the bfcache state of existence.)

I'm going to address the other points in a separate comment.

(In reply to Marion Daly [:mdaly] from comment #17)
> This may help:
> https://www.w3.org/TR/service-workers/#origin-restriction

Note: It's most preferable to link to the living spec documents in most cases.  The versioned spec's and TRs usually only exist because snapshots are needed for Intellectual Property Rights purposes (or drama related to WHATWG/W3C).  The canonical spec link for SWs is https://w3c.github.io/ServiceWorker/.
(In reply to François Marier [:francois] from comment #15)
> This one is pretty interesting because the worker.com iframe would be
> allowed to load a tracker.com script if it is "same-origin" with the
> top-level page (example.com). For example, a Facebook game in an iframe on
> Facebook.com can load Facebook's tracking scripts.

I think this use case (the comment 7 "1c" carve-out) and the idealized goal of ServiceWorker adoption not breaking things means that if we wanted maximum consistency and don't want to use firstPartyDomain OriginAttribute partitioning whenever tracking protection is involved, we would:
- Allow ServiceWorkers to perform mode="no-cors" cross-origin fetch requests that are otherwise forbidden by tracking protection.  These return opaque filtered responses (https://fetch.spec.whatwg.org/#concept-filtered-response-opaque) that can be persisted to the Cache API's storage and respondWith()ed to internal fetch events, but that's it.  mode="cors" fetch()es would be forbidden per normal tracking protection rules.
- Make sure tracking protection applies its logic when the SW does respondWith(opaqueResponseMaybeAllowedByTrackingProtection).
- Have Tracking Protection consider only ServiceWorker and SharedWorker origins for any non-opaque requests.  Whether or not the ServiceWorker was controlling an iframed window would not matter.  Likewise, whether the SharedWorker had at least one iframed window parent/owner would not matter.

This would provide consistency for the following ServiceWorker scenarios:
a) Pass-through ServiceWorker that issues a fetch() for every fetch() event.
b) Cache-on-request ServiceWorkers that check the Cache API storage then hit network if it's not already there.
c) A ServiceWorker that downloads all of its resources as part of its "install" stage, storing all the results in Cache API.
In particular, this would be consistent even if the ServiceWorker is initially registered and installed when not in use as a 3rd-party iframe.


This wouldn't do anything to address the :bkelly's concern that a (non-firstPartyDomain OriginAttributes-partitioned) iframe window could totally talk with its non-iframed top-level siblings via BroadcastChannel and the Clients API's postMessage() affordances.  Or that it could persist things to the Cache API and IndexedDB, as well as read things out of them, thereby using IndexedDB as a dead-drop between iframe invocations, etc.  That's really inescapable without origin partitioning.
Flags: needinfo?(bugmail)
(In reply to Andrew Sutherland [:asuth] from comment #19)
> - Allow ServiceWorkers to perform mode="no-cors" cross-origin fetch requests
> that are otherwise forbidden by tracking protection.  These return opaque
> filtered responses
> (https://fetch.spec.whatwg.org/#concept-filtered-response-opaque) that can
> be persisted to the Cache API's storage and respondWith()ed to internal
> fetch events, but that's it.  mode="cors" fetch()es would be forbidden per
> normal tracking protection rules.

Wouldn't that allow SW to do the equivalent of "tracking pixels" by fetching an opaque resource with a unique identifier in the query string?
(In reply to François Marier [:francois] from comment #20)
> Wouldn't that allow SW to do the equivalent of "tracking pixels" by fetching
> an opaque resource with a unique identifier in the query string?

Yes.

I'm really not clear on what the threat model for tracking protection is[1].  It seems what we're doing is making it logistically harder for 1st-party sites to enable 3rd-party tracking by:
a) requiring the 1st-party to host the tracking scripts on their own origin because we'll block the loads from tracker 3rd-party origins
b) making "exfiltration" of tracking information harder by requiring it to either be routed through the 1st-party origin or use persistent storage or side-channels like BroadcastChannel to move the information to 3rd-party origin.

What I propose does magnify the "b" problem of data exfiltration, as you say.

It seems like there are really only 2 sane-ish choices here if we're not okay with that:

1) Implement double-keying via firstPartyDomain/similar so that "b" goes away.  The Facebook game situation is addressed by the double-keying created two separate ServiceWorkers in two separate OriginAttributes universes:
 - "some-game.tld^firstPartyDomain=facebook.com" where the tracking script fetches are allowed (and we can allow on the basis of the firstPartyDomain OriginAttribute) both by the initial fetch() from the SW, as well as for respondWith() in response to the iframe document's fetch event.
 - "some-game.tld^firstPartyDomain=facebook.com" where the tracking script fetches get denied.

2) Always block ServiceWorker access to third-party (non-same-origin) tracking resources so that we're consistent no matter what iframes are involved.  This means the game-iframe tracking scripts will fail to load on a Facebook top-level page if a ServiceWorker is used, but would succeed if a ServiceWorker was not used.


There's also a 3rd over-complicated hack that's only partially insane:
 
3) Modify my above proposal so that:
  - When considering tracking protection for a ServiceWorker we consider the top-level windows of all potential clients (both controlled and uncontrolled), accumulating a set of allowed 3rd party origins.
  - Require all ServiceWorker fetches to satisfy tracking protection using that set of 3rd party origins.
  - Also require that anything that is a tracking script/endpoint be a "no-cors" opaque request, so even if it succeeds right now for the SW and gets persisted to the Cache API, it will still be rejected when attempting to be handed off to windows that don't have the top-level URI requirement satisfied.
  - The insane part, because it's possible the cache API could latch a failed tracking script state that breaks the game... we cause the ServiceWorker update-check algorithm to treat a change in top-level URI as triggering an update of the ServiceWorker.  So if the SW installed where the top-level URI wasn't Facebook, but then the SW is used in a situation where the controlled document's top-level URI is Facebook, we would act like one of the script files had changed when the function event fires the update event.


Less sane options would include:
- Disable ServiceWorkers for the game-iframe that is not subject to tracking protection constraints for the top-level window.
- Some kind of hack where in the game-iframe case tracking protection logic notices that the page is making a fetch request that would normally be forbidden, and then decide to bypass the ServiceWorker and go straight to the network.  Or fail-over to the network.
- Let the ServiceWorker fetch()es of/to the tracking endpoints sometimes fail and sometimes succeed depending on whether there are any controlled or uncontrolled clients whose top-level windows correspond to the tracking origin.  These results may then becomes persisted, causing a debugging nightmare for anyone who gets involved.


I would personally suggest we go with option 2 for now and push really hard on double-keying/option 1, because it's the only approach that can't be gamed and is great for privacy for many reasons.  I'd also suggest we wait for :bkelly to come back and chime in :)

1: I checked the following docs after search found them, and I also read their linked papers/docs, but didn't really find an attack/defense tree or anything like that:
- https://developer.mozilla.org/en-US/Firefox/Privacy/Tracking_Protection
- https://wiki.mozilla.org/Security/Tracking_protection
- https://support.mozilla.org/en-US/kb/tracking-protection
(In reply to François Marier [:francois] from comment #13)
> (In reply to Ben Kelly [:bkelly] from comment #9)
> > (In reply to François Marier [:francois] from comment #8)
> > 1) example.com is a top level window and has a nested iframe worker.com
> > 2) worker.com is also open as a top level window
> > 3) top level worker.com window loads a tracker.com script
> > 4) nested worker.com iframe communicates data to top level worker.com window
> > using BroadcastChannel/SharedWorker/etc to exfiltrate tracking data
> 
> Here, the top-level worker.com page would be unable to load the tracker.com
> script in the first place because worker.com and tracker.com are not
> same-origin. Therefore there would be nothing to send over the
> BroadcastChannel since the fetch()/xhr() would fail.

Hmm.  I think maybe I was also lumping in 3rd party cookie blocking into my scenarios.  But this is why we block service workers when cookies are disabled.  BroadcastChannel is not blocked in that scenario yet, though.
So, I've tried to catch up on this.  I'm still paging stuff back in, but with I am kind of leaning towards the strict same-origin check for tracking protection as suggested in comment 16.  It seems the most conservative in terms of privacy protection to start.

Fortunately service workers are not available for private browsing windows, so we only have to deal with the profile-wide tracking protection setting.

I do think we should probably also consider if there is telemetry we can add to better understand what is happening in the wild.

Since service workers are same-origin with their registering client, I think comment 16 just devolves to:

    if isServiceWorkerLoad or isSharedWorkerLoad {
        return isSameOrigin(workerOrigin, resourceUri)
    } else {
        return isSameOrigin(toplevelwindowURI, resourceUri)
    }

Francois, does anyone on your time have time to work on this?  I think this change is small, but writing tests and telemetry might take additional time.  In particular, I'm not sure we have good experience writing the kind of telemetry that would be useful for analysis of this kind of problem.
Flags: needinfo?(francois)
(In reply to Ben Kelly [Part-time, not reviewing][:bkelly] from comment #23)
> Since service workers are same-origin with their registering client, I think
> comment 16 just devolves to:
> 
>     if isServiceWorkerLoad or isSharedWorkerLoad {
>         return isSameOrigin(workerOrigin, resourceUri)
>     } else {
>         return isSameOrigin(toplevelwindowURI, resourceUri)
>     }

That sounds good to me. Thank you both for pointing out my flawed assumptions about workers.

> Francois, does anyone on your time have time to work on this?  I think this
> change is small, but writing tests and telemetry might take additional time.
> In particular, I'm not sure we have good experience writing the kind of
> telemetry that would be useful for analysis of this kind of problem.

We're a bit short-staffed at the moment due to the Taipei layoffs so I don't expect we'll be able to take care of this before Q3.
Flags: needinfo?(francois)
Francois, do you know where the code in comment 16 logically resides in the tree?  Is it in necko or script security manager, etc?
Flags: needinfo?(francois)
It seems like everything eventually ends up calling nsChannelClassifier::ShouldEnableTrackingProtectionInternal() now. So I think the place where we'd need to add the new logic from comment 23 would be here: https://searchfox.org/mozilla-central/rev/de5c4376b89c3603341d226a65acea12f8851ec5/netwerk/base/nsChannelClassifier.cpp#342-346

If we rename `topWinURI` to something like `topLevelURI` and set it to the worker origin then it will avoid the early return and end up being checked by ThirdPartyUtil::IsThirdPartyURI(): https://searchfox.org/mozilla-central/rev/de5c4376b89c3603341d226a65acea12f8851ec5/netwerk/base/nsChannelClassifier.cpp#366

It's not entirely clear to me how a worker-initiated channel will behave in ThirdPartyUtil::IsThirdPartyChannel() though:
https://searchfox.org/mozilla-central/rev/56274d0a016a6833e5da7770ce70580b6f5abb21/dom/base/ThirdPartyUtil.cpp#181
Flags: needinfo?(francois)
Whiteboard: [necko-triaged] → [necko-triaged] tp-leak
See Also: → 1479134
Flags: needinfo?(jduell.mcbugs) → needinfo?(dd.mozilla)
Francois, this should be fix in nsChannelClassifier. Do you know who could take it and how important is it?
Flags: needinfo?(dd.mozilla) → needinfo?(francois)
I think we'll have to do that as part of bug 1207775. I'll reassign it to the Safe Browsing component and track it there.
Component: Networking → Safe Browsing
Flags: needinfo?(francois)
Product: Core → Toolkit
Assignee: nobody → dlee
Status: NEW → ASSIGNED
Priority: P2 → P1

Besides the issue that we want to fix by implementing the solution in Comment 23, I found another one is that we don't get correct "useTrackingProtection" attribute in LoadContext[1] for channels triggered by ServiceWorker.(I tested Fetch in ServiceWorker script)

The reason is that in Bug 1322576, we change the behavior from using a global tracking protection preference to setting it through individual docshell[2]. But in the case of ServiceWorker, the LoadGroup is not from docshell, it is empty[3][4], so the "useTrackingProtection" attribute is always false. So I guess we need to get the information from docshell somehow while creating service worker?

Hi Andrew,
I am not sure how to fix this in service worker, do you have any suggestion?
If the comment above is not clear, please let me know and I'll provide more information, thanks!

[1] https://searchfox.org/mozilla-central/rev/94c6b5f06d2464f6780a52f32e917d25ddc30d6b/docshell/base/nsILoadContext.idl#93
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1322576#c2
[3] https://searchfox.org/mozilla-central/rev/94c6b5f06d2464f6780a52f32e917d25ddc30d6b/dom/serviceworkers/ServiceWorkerPrivate.cpp#1692
[4] https://searchfox.org/mozilla-central/rev/94c6b5f06d2464f6780a52f32e917d25ddc30d6b/dom/workers/WorkerLoadInfo.cpp#371

Flags: needinfo?(bugmail)

Is useTrackingProtection something that could live on CookieSettings? It seems like that's the right place and already has the right semantics and life-cycle.

Flags: needinfo?(bugmail) → needinfo?(amarchesini)

Yes, it makes sense. The main difference is that nsICookieSettings is not exposed by the docShell, but by the document.
I don't have time to work on it right now. Dimi, is it something you would like to take?

Flags: needinfo?(amarchesini) → needinfo?(dlee)

(In reply to Andrea Marchesini [:baku] from comment #31)

Yes, it makes sense. The main difference is that nsICookieSettings is not exposed by the docShell, but by the document.
I don't have time to work on it right now. Dimi, is it something you would like to take?

Yes, I'll work on it :)

Flags: needinfo?(dlee)
See Also: → 1593083
You need to log in before you can comment on or make changes to this bug.