Open Bug 1319839 Opened 8 years ago Updated 2 years ago

[FirstPartyIsolation] If you sign in to Gmail, you'll be automatically signed in when you visit YouTube

Categories

(Core :: DOM: Security, defect, P3)

People

(Reporter: cynthiatang, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [tor][domsecurity-backlog1][dfpi-ok])

This issue can be reproduced in Tor browser and Firefox with FPI.

Preference setting:
 1. privacy.firstparty.isolate;true
 2. network.predictor.enable-prefetch;false
 3. network.predictor.enabled;false

Steps:
 1. Launch Firefox browser
 2. Visit gmail.com
 3. Sign in with an existing Gmail ID and password
 4. Open a new tab
 5. Visit youtube.com

Actual result:
 - If you sign in to your Google Account on another Google service, you'll be automatically signed in when you visit YouTube.
 - Tor Browser:      https://youtu.be/Rlz8frlK-Y0 
 - Firefox with FPI: https://youtu.be/k_aW7TL0Iyk  

Expected result:
 - Wait for developers to confirm the expected result.

Firefox version: 53.0a1 (2016-11-22) (64-bit)
Is the reason that gmail.com redirects to accounts.google.com, where you log in?  The login cookie is then probably available to all Google sites.  (Just a theory)
(In reply to Honza Bambas (:mayhemer) from comment #1)
> It's the reason that gmail.com redirs to accounts.google.com where you log
> in?  The login cookie is then probably available to all google sites.  (Just
> a theory)

Even so, accounts.google.com and www.youtube.com are different domains, so First Party Isolation should be able to separate the login sessions of these two sites.
We have to investigate the mechanism behind the scenes and see whether we can do anything about it.
Priority: -- → P1
It turns out that gmail redirects to several sites when you log in. The sequence is:
302 accounts.google.com (password)
302 mail.google.com
302 account.youtube.com (with Set-Cookie header)
302 accounts.google.fr
302 mail.google.com
302 mail.google.com
200 mail.google.com

(See image at https://trac.torproject.org/projects/tor/ticket/20754#comment:2)
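
To make the effect of this sequence concrete, here is a minimal Python sketch of the observed behaviour, in which every top-level hop is treated as its own first party. The helper first_party() is a naive stand-in for eTLD+1 (it just keeps the last two labels), and the sets_cookie flags and function names are assumptions for illustration, not Firefox code.

```python
from collections import defaultdict


def first_party(host):
    """Naive stand-in for eTLD+1: keep the last two labels (wrong for e.g. co.uk)."""
    return ".".join(host.split(".")[-2:])


# Redirect chain from the comment above: (status, host, sets_cookie).
CHAIN = [
    (302, "accounts.google.com", False),
    (302, "mail.google.com", False),
    (302, "account.youtube.com", True),
    (302, "accounts.google.fr", False),
    (302, "mail.google.com", False),
    (302, "mail.google.com", False),
    (200, "mail.google.com", False),
]


def store_cookies_per_hop(chain):
    """Key each cookie by the first party of the hop that set it."""
    jar = defaultdict(set)  # first-party domain -> hosts that stored a cookie
    for _status, host, sets_cookie in chain:
        if sets_cookie:
            jar[first_party(host)].add(host)
    return dict(jar)


jar = store_cookies_per_hop(CHAIN)
print(jar)
```

In this model the cookie lands in the youtube.com jar, which is the same jar a later visit to youtube.com in a new tab would read from.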
One possible solution with redirects would be to assign to each 30x response the first-party of the final 200 response. So, in this example, the first-party for the youtube.com cookie would be google.com (note that first parties don't include the subdomain).

The problem with this approach is in the implementation -- how do we assign a first-party to a Cookie when we don't know the final destination domain yet? We could assign the Cookie a temporary first-party string nonce (such as a UUID). Then, after the final domain is known, we would re-assign the Cookie the final first-party domain. But we would also need to apply this to other stateful data, including the HTTP cache, the Authorization header, OCSP, SSL session cache, HSTS, etc., so it could get complicated.
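
A rough Python sketch of the nonce idea, covering cookies only. The class NonceKeyedJar, its methods, and the naive first_party() helper are hypothetical names for illustration; the other stateful data mentioned above (HTTP cache, HSTS, and so on) is not modelled.

```python
import uuid


def first_party(host):
    """Naive stand-in for eTLD+1: keep the last two labels (wrong for e.g. co.uk)."""
    return ".".join(host.split(".")[-2:])


class NonceKeyedJar:
    """Cookie store keyed by (first-party key, cookie host)."""

    def __init__(self):
        self._store = {}

    def begin_navigation(self):
        # Temporary first-party key used until the final destination is known.
        return "pending-" + uuid.uuid4().hex

    def set_cookie(self, key, host, value):
        self._store[(key, host)] = value

    def commit(self, nonce, final_host):
        """Re-key everything stored under the nonce to the final first party."""
        final_key = first_party(final_host)
        for key, host in list(self._store):
            if key == nonce:
                self._store[(final_key, host)] = self._store.pop((key, host))
        return final_key

    def get(self, key, host):
        return self._store.get((key, host))


jar = NonceKeyedJar()
nonce = jar.begin_navigation()
jar.set_cookie(nonce, "account.youtube.com", "SID=abc")  # set during a 302 hop
final_key = jar.commit(nonce, "mail.google.com")         # the final 200 response
```

After commit(), the cookie from the youtube.com hop is only reachable under the google.com first party, so a later visit to youtube.com in the URL bar would not see it.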
Another possible solution is to assign each 30x response the first-party of the referring domain (whether or not the referer is provided to content). So if I log in to accounts.google.com, then all of the subsequent redirects get google.com as a first-party. The final 200 response would obtain its first-party domain from the URL bar domain. This approach seems like a much easier thing to implement. For Google (or other search sites) an internal redirect when the user clicks on a search result would be assigned google.com as the first-party, which is reasonable.

Both approaches (this one in this comment and comment 4) treat only those domains that are visible to the user in the URL bar as "first-party" domains. Domains in redirects, like other third-party domains, are not visible in the URL bar and therefore should not be considered to be "first-party".
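
For illustration, the referring-domain rule could be sketched in Python as below. The function assign_first_parties() and the naive first_party() helper are hypothetical. Note that the sketch labels responses once their status codes are known; it says nothing about how the key would be chosen while a channel is still loading.

```python
def first_party(host):
    """Naive stand-in for eTLD+1: keep the last two labels (wrong for e.g. co.uk)."""
    return ".".join(host.split(".")[-2:])


def assign_first_parties(referring_host, chain):
    """chain is a list of (status, host); returns a list of (host, first-party key)."""
    referring_key = first_party(referring_host)
    assigned = []
    for status, host in chain:
        if 300 <= status < 400:
            # Redirect hops inherit the first party of the referring domain.
            assigned.append((host, referring_key))
        else:
            # The final response takes the URL bar domain as first party.
            assigned.append((host, first_party(host)))
    return assigned


# Search-result click: an internal Google redirect, then the target site.
result = assign_first_parties(
    "www.google.com",
    [(302, "www.google.com"), (200, "example.org")],
)
```

Here the internal redirect is keyed to google.com and only the final page is keyed to example.org, matching the search-result case described above.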
See Also: → 1309800
(In reply to Arthur Edelstein [:arthuredelstein] from comment #5)
> Another possible solution is to assign each 30x response the first-party of
> the referring domain (whether or not the referer is provided to content). So
> if I log in to accounts.google.com, then all of the subsequent redirects get
> google.com as a first-party. The final 200 response would obtain its
> first-party domain from the URL bar domain. This approach seems like a much
> easier thing to implement. For Google (or other search sites) an internal
> redirect when the user clicks on a search result would be assigned
> google.com as the first-party, which is reasonable.
> 
> Both approaches (this one in this comment and comment 4) treat only those
> domains that are visible to the user in the URL bar as "first-party"
> domains. Domains in redirects, like other third-party domains, are not
> visible in the URL bar and therefore should not be considered to be
> "first-party".

I think this approach still has the same problem as the first solution that you mentioned in comment 4. You wouldn't know whether the next load will be a 200 response or a 30x response, and you have to decide your originAttributes before the http channel starts loading. But the choice of the first party domain depends on the result of the load, so we don't know which first party domain we should use here.

A hacky solution is that maybe we can make the 'set-cookie' header of the 30x responses use the first party of the referring domain only.
Is there a clear definition of the "first party domain" or can it be stated here now regarding any type of navigation, like normal cross-origin link click and also middle click ("sub" tab), xhr, window.open(), window.location assignment and any other considerable way of navigation?  I also don't know what exactly to expect from the user point of view - where from is the FPD taken?  How does it change when I browse in one tab, and how when I browse in multiple tabs?

If that is more clearly defined, we can start thinking of how to solve the redirect problem.  Or at least for me it will be clearer, because I'd really like to help you here, guys :)
(In reply to Honza Bambas (:mayhemer) from comment #7)
> If that is more clearly defined, we can start thinking of how to solve the
> redirect problem.  Or at least for me it will be clearer, because I'd really
> like to help you here, guys :)

Honza, thank you so much for your support!
(In reply to Honza Bambas (:mayhemer) from comment #7)
> Is there a clear definition of the "first party domain" or can it be stated
> here now regarding any type of navigation, like normal cross-origin link
> click and also middle click ("sub" tab), xhr, window.open(), window.location
> assignment and any other considerable way of navigation?  I also don't know
> what exactly to expect from the user point of view - where from is the FPD
> taken?  How does it change when I browse in one tab, and how when I browse
> in multiple tabs?

Georg or Arthur, could you help to answer Honza's questions?

(meanwhile I am going to lower the priority of this bug since we don't even know whether we can/should resolve it or not)
Flags: needinfo?(gk)
Flags: needinfo?(arthuredelstein)
Priority: P1 → P3
To quote from our design document:

"First party isolation means that all identifier sources and browser state are scoped (isolated) using the URL bar domain"

There is a bit more context in section 4.5 of said document (including what a UI reflecting this could look like): https://www.torproject.org/projects/torbrowser/design/#identifier-linkability

The general idea behind this is that the domain in the URL bar represents the site the user wanted to interact with, be it via a click on a link or by typing it into the location bar. This model gets complicated when redirects enter the picture, but I think it might be a solid start in exploring ideas to cope with that if one assumes a user wanted to interact with the domain that did the first redirecting.

Hope that helps. Let us know if things are still unclear.
Flags: needinfo?(gk)
(In reply to Honza Bambas (:mayhemer) [off till Jan 4] from comment #7)
> Is there a clear definition of the "first party domain" or can it be stated
> here now regarding any type of navigation, like normal cross-origin link
> click and also middle click ("sub" tab), xhr, window.open(), window.location
> assignment and any other considerable way of navigation? 

This is a very good question and it is a place where the first-party isolation model is not totally clean. We know that if a user clicks on a link in a page at A.com that sends them to a page on B.com, it's easy for that link to contain tracking information. I don't see a general way to prevent this sort of information transfer, and undoubtedly we want to allow it in some cases. But this sort of tracking information is associated with a one-time click -- it doesn't entail back-and-forth communication between the two sites or long-term data storage in the browser that is accessible to both A.com and B.com.

> I also don't know
> what exactly to expect from the user point of view - where from is the FPD
> taken?  How does it change when I browse in one tab, and how when I browse
> in multiple tabs?

As Georg explains, FPD is based on the URL bar, which is visible to the user (third party domains are not in general visible to the naive user). So two tabs with the same URL bar domain can share data, but a single tab that is first sent to A.com and then subsequently sent to B.com (by, in the simple case, entering B.com in the URL bar) cannot share data.
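
As a small illustration of this rule, here is a Python sketch of storage keyed by the URL bar domain. The IsolatedStorage class, the naive first_party() helper, and the example hosts are assumptions for illustration only.

```python
def first_party(host):
    """Naive stand-in for eTLD+1: keep the last two labels (wrong for e.g. co.uk)."""
    return ".".join(host.split(".")[-2:])


class IsolatedStorage:
    """Browser state keyed by (URL bar first party, origin host, name)."""

    def __init__(self):
        self._data = {}

    def write(self, url_bar_host, origin_host, name, value):
        self._data[(first_party(url_bar_host), origin_host, name)] = value

    def read(self, url_bar_host, origin_host, name):
        return self._data.get((first_party(url_bar_host), origin_host, name))


storage = IsolatedStorage()
# A third party embedded on A.com stores an identifier.
storage.write("a.com", "tracker.net", "id", "123")

# Another tab with the same URL bar domain sees the same data...
same_site = storage.read("www.a.com", "tracker.net", "id")
# ...but the same third party embedded on B.com does not.
other_site = storage.read("b.com", "tracker.net", "id")
```

The tab that moves from A.com to B.com switches keys when the URL bar domain changes, which is why the two visits cannot share stored data.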

> If that is more clearly defined, we can start thinking of how to solve the
> redirect problem.  Or at least for me it will be clearer, because I'd really
> like to help you here, guys :)

Thanks, Honza! :) I think redirects are a special case here, because they are currently behaving as a first party, but are invisible to the user because they don't appear in the URL bar. So, for example, in comment 3, the youtube.com redirect is treated as a first party, which means if the user later opens a separate tab and enters youtube.com in the URL bar, the user will already be logged in.
Flags: needinfo?(arthuredelstein)
(In reply to Tim Huang[:timhuang] from comment #6)
> (In reply to Arthur Edelstein [:arthuredelstein] from comment #5)
> > Another possible solution is to assign each 30x response the first-party of
> > the referring domain (whether or not the referer is provided to content). So
> > if I log in to accounts.google.com, then all of the subsequent redirects get
> > google.com as a first-party. The final 200 response would obtain its
> > first-party domain from the URL bar domain. This approach seems like a much
> > easier thing to implement. For Google (or other search sites) an internal
> > redirect when the user clicks on a search result would be assigned
> > google.com as the first-party, which is reasonable.
> > 
> > Both approaches (this one in this comment and comment 4) treat only those
> > domains that are visible to the user in the URL bar as "first-party"
> > domains. Domains in redirects, like other third-party domains, are not
> > visible in the URL bar and therefore should not be considered to be
> > "first-party".
> 
> I think this approach still has the same problem of the first solution that
> you mentioned in comment 4. You wouldn't know whether the next load will be
> a 200 response or a 30x response. And you have to decide your
> originAttributes before the loading of the http channel. But, the decision
> of the first party domain relies on the loading result. So, we don't know
> which first party domain we should use here.

That is a very good point.

> A hacky solution is that maybe we can make the 'set-cookie' header of the
> 30x responses use the first party of the referring domain only.

That solution is not a bad idea, I think.

But we still have the problem that in any first-party request, we send a Cookie header if available (as well as an Authorization header). So for example, if we visit and login to youtube.com first, and then subsequently visit Google (entering google.com in the URL bar), a redirect like the one in comment 3 links the two accounts. I'm not sure how to solve this.
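
The remaining leak can be sketched in Python as follows. The sketch applies the Set-Cookie hack on the response side but still keys the request side by the hop's own domain; the Jar class, follow_redirect(), and the cookie values are hypothetical.

```python
def first_party(host):
    """Naive stand-in for eTLD+1: keep the last two labels (wrong for e.g. co.uk)."""
    return ".".join(host.split(".")[-2:])


class Jar:
    def __init__(self):
        self.cookies = {}  # (first-party key, cookie domain) -> value


def follow_redirect(jar, referring_host, hop_host, set_cookie=None):
    """Model one 30x hop; return the Cookie value sent with the request, if any."""
    hop_domain = first_party(hop_host)
    # Request side: the hop is a top-level load, so it is keyed by its own domain.
    sent = jar.cookies.get((hop_domain, hop_domain))
    # Response side (the proposed hack): key Set-Cookie by the referring domain.
    if set_cookie is not None:
        jar.cookies[(first_party(referring_host), hop_domain)] = set_cookie
    return sent


jar = Jar()
# The user logged in at youtube.com earlier (URL bar domain youtube.com).
jar.cookies[("youtube.com", "youtube.com")] = "yt-session"
# Later, a Google login redirects through youtube.com.
sent = follow_redirect(jar, "accounts.google.com", "account.youtube.com",
                       set_cookie="google-sid")
```

In this model the new cookie is isolated under google.com, but the existing youtube.com session cookie is still sent on the redirect hop, so the server can link the two accounts.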
Whiteboard: [tor][domsecurity-active] → [tor][domsecurity-backlog1]
Hello. As a simple user interested in tracking protection, and especially in first party isolation, I think what "first party" means is "the website the user thinks they are looking at".  I know that's not a very good description, but the user doesn't want tracking information to pass on a redirect.

I have been thinking about this for a few years and I think the best solution to this is to let the user decide if and when we should switch the first party context.

- In case of youtube, I want my session to be scoped with the first party context of youtube. if I log in, I want the log-in form to be scoped with the youtube first party context, even if the login form itself is not hosted at youtube.com.

- On the contrary, if I leave google.com from a search to a cool website, I want my first party context to change there from google.com to the web domain.

Those two use cases are identical from a technical point of view (a webpage is redirecting to another one), but the expected behaviour is different, and I believe letting the user know what happens is a good thing in that case. The only solution I can imagine is the following:

Perform all redirects in the context of the original domain. After the redirect, show the page with a banner that says something like "You are viewing mail.google.com in the context of youtube.com. Anything you do can be traced back to youtube.com." with a button "Disconnect"

The disconnect button will perform a simple reload of the page with the first party context of mail.google.com instead.

If instead we want to scope to the new domain by default, I see no option other than to first perform a request to mail.google.com with the first party of youtube.com, to ensure the isolation, and then, if there is a 200 response, perform the same request again with the new first party context of mail.google.com.

In that case I also suggest showing a banner on the page with "Firefox isolated the current page from youtube.com to prevent tracking" and an option to "click here to view this page in the context of youtube.com". In the case of login forms, you'd have to do this, and you could possibly choose to "always open mail.google.com in the context of youtube.com when following a youtube.com link".

In any case, for true tracking protection, I see no option other than to always make HTTP requests using the original first party domain and, in case of a first party switch, perform the same request again. This is the cost of privacy.
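
A minimal Python sketch of this double-request idea, assuming a caller-supplied fetch(host, first_party_key) callable that returns an HTTP status. Both navigate() and the naive first_party() helper are hypothetical, and the banner UI is not modelled.

```python
def first_party(host):
    """Naive stand-in for eTLD+1: keep the last two labels (wrong for e.g. co.uk)."""
    return ".".join(host.split(".")[-2:])


def navigate(origin_host, target_host, fetch):
    """Return the list of (host, first-party key) requests that were made."""
    made = []
    origin_key = first_party(origin_host)
    # First request: always in the context of the original first party.
    status = fetch(target_host, origin_key)
    made.append((target_host, origin_key))
    target_key = first_party(target_host)
    # On a 200 in a different first party, repeat the request in the new context.
    if status == 200 and target_key != origin_key:
        fetch(target_host, target_key)
        made.append((target_host, target_key))
    return made


def always_ok(host, key):
    """Fake network layer for the example: every request succeeds."""
    return 200


requests_made = navigate("www.youtube.com", "mail.google.com", always_ok)
```

The cross-site navigation costs two requests, while a navigation that stays within one first party costs only one.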
A while ago Mike Perry pointed me to Apple's Intelligent Tracking Prevention 2.0 (https://webkit.org/blog/8311/intelligent-tracking-prevention-2-0/) which has, among many other interesting things, the following feature:

"ITP 2.0 has the ability to detect when a domain is solely used as a “first party bounce tracker,” meaning that it is never used as a third party content provider but tracks the user purely through navigational redirects."

They prevent tracking via bouncing redirects that way. It would be interesting for us to investigate whether that is something we could/should use as well. After digging a bit, it seems this is implemented in the open source WebKit part. E.g. https://bugs.webkit.org/show_bug.cgi?id=182664 seems to be relevant, as does the whole Resource Load Statistics piece.
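
To illustrate the quoted definition only, here is a toy Python heuristic that flags domains seen exclusively in navigational redirects. This is not WebKit's implementation; the function name, the role labels, and the example domains are assumptions.

```python
def bounce_tracker_candidates(observations):
    """observations: iterable of (domain, role) pairs, where role is one of
    "redirect", "third_party" or "url_bar". Returns a sorted list of domains
    whose only observed role is "redirect"."""
    roles = {}
    for domain, role in observations:
        roles.setdefault(domain, set()).add(role)
    return sorted(d for d, seen in roles.items() if seen == {"redirect"})


observations = [
    ("bounce.example", "redirect"),
    ("bounce.example", "redirect"),
    ("cdn.example", "third_party"),
    ("cdn.example", "redirect"),
    ("news.example", "url_bar"),
]
candidates = bounce_tracker_candidates(observations)
```

A domain that also serves third-party content or is visited directly is not flagged in this sketch, since it is not used purely for navigational redirects.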
Whiteboard: [tor][domsecurity-backlog1] → [tor][domsecurity-backlog1][dfpi-ok]
Severity: normal → S3