Closed Bug 1471755 Opened 6 years ago Closed 6 years ago

Data leak: don't send HTTP-Referer without consent, have a UI switch for default referer policy

Categories

(Firefox :: Settings UI, enhancement)

60 Branch
enhancement
Not set
normal

Tracking

()

RESOLVED WONTFIX

People

(Reporter: robert, Unassigned)

Details

(Keywords: privacy)

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Firefox/60.0
Build ID: 20180503143129

Steps to reproduce:

According to the recommendation of the standard RFC 7231 Section 5.5.2, Firefox does not send a Referer field when:

1) the referring resource is a local "file" or "data" URI,
2) an unsecured HTTP request is used and the referring page was received with a secure protocol (HTTPS).
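
The two conditions above can be sketched as a small predicate. This is only an illustration, not Firefox's actual code: the function name `shouldSendReferer` is hypothetical, and it looks at URL schemes only.

```javascript
// Hypothetical helper: decides whether a Referer field would be sent at all
// under the two conditions listed above. Not Firefox's real implementation.
function shouldSendReferer(referringUrl, targetUrl) {
  const from = new URL(referringUrl);
  const to = new URL(targetUrl);

  // 1) A local "file" or "data" URI is never used as a referrer.
  if (from.protocol === "file:" || from.protocol === "data:") {
    return false;
  }

  // 2) No referrer when the referring page was received over HTTPS and the
  //    new request is unsecured HTTP.
  if (from.protocol === "https:" && to.protocol === "http:") {
    return false;
  }

  return true;
}
```

Under this sketch, the HTTPS to HTTPS case described below still returns `true`, which is the behavior this report objects to.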

I argue that Firefox should be stricter than the standard and should also not send an HTTP Referer field unless the visitor has given informed consent to sharing their most recently visited webpage (which may be considered personal data).

If the HTTP Referer field is personal data, the GDPR may apply to EU citizens, and a legal basis (see https://gdpr-info.eu/art-6-gdpr/) must be identified; consent is one such basis.


Actual results:

When I browse from one HTTPS site to another HTTPS site, a Referer field containing the HTTPS URL is sent along with the request.


Expected results:

The request should only contain a referer field if I consented.

Consent could be implemented in a manner similar to DNT. Maybe that setting can be repurposed: if I wish not to be tracked, no Referer field should be sent either.
I'm tweaking the summary a bit because as worded it's not a bug but intended behavior.
Severity: normal → enhancement
Component: Untriaged → Networking: HTTP
Keywords: privacy
OS: Unspecified → All
Product: Firefox → Core
Hardware: Unspecified → All
Summary: Data leak: HTTP-Referer sent without consent → Data leak: don't send HTTP-Referer without consent
We already have a preference for this (network.http.referer.defaultPolicy), but the default we ship doesn't prevent sending the referrer. If we decide to have a visible preference for this, it's a Firefox bug, not a Necko bug.
Component: Networking: HTTP → Preferences
Product: Core → Firefox
Summary: Data leak: don't send HTTP-Referer without consent → Data leak: don't send HTTP-Referer without consent, have a UI switch for default referer policy
(In reply to Honza Bambas (:mayhemer) from comment #2)
> We already have a preference for this (network.http.referer.defaultPolicy)

https://dxr.mozilla.org/mozilla-release/source/modules/libpref/init/all.js#1613
// Set the default Referrer Policy; to be used unless overriden by the site
// 0=no-referrer, 1=same-origin, 2=strict-origin-when-cross-origin,
// 3=no-referrer-when-downgrade
pref("network.http.referer.defaultPolicy", 3);

Which of the above values sends the referrer only upon user consent? For that matter, how is consent determined?
The last paragraph of comment 0 makes it sound like the request is for "Send websites a 'Do not track' signal…: Always" to also set network.http.referer.defaultPolicy = 0.
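
As a rough illustration of what the four pref values quoted above mean in practice, here is a sketch of the Referer value each one would produce. The names `DEFAULT_POLICIES` and `refererFor` are hypothetical, and the downgrade check is simplified to "https to http"; this is not the Necko implementation.

```javascript
// Policy names indexed by pref value, as listed in the all.js comment.
const DEFAULT_POLICIES = [
  "no-referrer",
  "same-origin",
  "strict-origin-when-cross-origin",
  "no-referrer-when-downgrade",
];

// Hypothetical helper: returns the Referer value that would be sent for a
// given pref value, or null when no referrer is sent.
function refererFor(prefValue, referringUrl, targetUrl) {
  const policy = DEFAULT_POLICIES[prefValue];
  const from = new URL(referringUrl);
  const to = new URL(targetUrl);

  // A referrer never carries the fragment or embedded credentials.
  from.hash = "";
  from.username = "";
  from.password = "";

  const sameOrigin = from.origin === to.origin;
  // Simplifying assumption: only https -> http counts as a downgrade.
  const downgrade = from.protocol === "https:" && to.protocol === "http:";

  switch (policy) {
    case "no-referrer":
      return null;
    case "same-origin":
      return sameOrigin ? from.href : null;
    case "strict-origin-when-cross-origin":
      if (sameOrigin) return from.href;
      return downgrade ? null : from.origin + "/";
    case "no-referrer-when-downgrade":
      return downgrade ? null : from.href;
    default:
      throw new RangeError("unknown pref value: " + prefValue);
  }
}
```

None of the four values takes user consent into account; they differ only in when a referrer is sent and how much of the URL it contains. Value 0 is the one that never sends anything.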
(In reply to robert from comment #0)
> If the HTTP referer field is personal data

This doesn't seem like a valid assumption. That is, it may contain personal data if the website put it there - or it may not. Most HTTP URLs seem unlikely to contain personal data according to the letter of GDPR (ie most URLs don't tend to contain people's names, or other identifying data *that the third party could use to identify this person*).

If a webpage on foo.com loads a script/image/whatever from foo-cdn.net, it's going to send the referrer (unless referrer-policy dictates otherwise). Not sending it may break fetching the image/script on which foo.com depends. That is, sending the referrer might be necessary for the functionality of the website: as a random example, some image hosting sites are notorious for relying on the referrer header to stop the same hosted image from being embedded indefinitely on third-party sites ("hotlinking"). Additionally, sharing the supposed personal data from foo.com with foo-cdn.net would only be problematic if they were different (legal) entities, something which we can't know from the domain names alone.

In other words, there are technical as well as legal problems with this line of thinking. From a GDPR perspective, given that it is not possible for the browser to determine (a) if the referrer contains personal information or (b) if the party it is being sent to is a legal third party or not, responsibility must surely lie with the website to set the correct referrer policy. 
In a sense, it could be argued that this bug report is like telling the postal service to erase the "expediteur" or "sender" information from letters sent from A to B. "A" could easily send the letter without this information, and it is quite obviously up to A to do so, not up to the postal service.

Similarly, this is why websites have started showing more detailed in-page cookie/ads "choices" modal popups in the last few months. They need to ask for informed consent, and Firefox can't just block all cookies because some cookies may contain personal data. The referrer policy issue seems very similar - it may (where it involves what GDPR considers "personal data") need work on the side of the website, but the best the browser can do is implement "referrer-policy" correctly.


Orthogonal to all of that, the current about:config option clearly doesn't override sites' specified policies, so in that sense it doesn't do what comment #0 asks and it's not "just" a question of exposing it in the UI.

Exposing it in the UI when it doesn't do what people expect seems like a recipe for user confusion and possible liability ("I thought you weren't going to send this information!").

Both with and without the option overriding the website's policy, there is a non-negligible risk of web breakage that will be hard for users to understand or link to the choice they made not to send *any* referrer information. Only very few users would be willing to live with this and/or technical enough to understand that this might be an issue, so the best trade-off seems to be implementing this in an add-on (which you could do by modifying responses from websites to add/"upgrade to" stricter referrer policies), as then you could also keep a list of exceptions and so on.
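
A minimal sketch of what such an add-on could look like, assuming a WebExtension that rewrites the Referrer-Policy response header. The helper name `upgradeReferrerPolicy`, the strictness ranking, and the exception host are all hypothetical; the Referrer Policy specification does not define a total order between policies, so the ranking below is just one plausible choice.

```javascript
// Assumed strictness ranking (higher = sends less). Illustration only.
const STRICTNESS = new Map([
  ["unsafe-url", 0],
  ["no-referrer-when-downgrade", 1],
  ["origin-when-cross-origin", 2],
  ["origin", 3],
  ["strict-origin-when-cross-origin", 4],
  ["strict-origin", 5],
  ["same-origin", 6],
  ["no-referrer", 7],
]);

// Hypothetical helper: returns response headers whose Referrer-Policy is at
// least as strict as `floor`, unless `host` is on the exception list.
function upgradeReferrerPolicy(headers, host, floor, exceptions) {
  if (exceptions.has(host)) return headers;

  const isPolicy = (h) => h.name.toLowerCase() === "referrer-policy";
  // The header may carry a comma-separated fallback list; the last
  // recognised token is the one that applies.
  const tokens = headers
    .filter(isPolicy)
    .flatMap((h) => h.value.split(","))
    .map((t) => t.trim().toLowerCase())
    .filter((t) => STRICTNESS.has(t));
  const sitePolicy = tokens.length ? tokens[tokens.length - 1] : null;

  if (sitePolicy !== null && STRICTNESS.get(sitePolicy) >= STRICTNESS.get(floor)) {
    return headers; // the site is already strict enough
  }
  return headers
    .filter((h) => !isPolicy(h))
    .concat({ name: "Referrer-Policy", value: floor });
}

// In a WebExtension this could be wired to webRequest (needs the
// "webRequest" and "webRequestBlocking" permissions). Guarded so the sketch
// also runs outside a browser.
if (typeof browser !== "undefined" && browser.webRequest) {
  const exceptions = new Set(["images.example"]); // hypothetical exception
  browser.webRequest.onHeadersReceived.addListener(
    (details) => ({
      responseHeaders: upgradeReferrerPolicy(
        details.responseHeaders,
        new URL(details.url).hostname,
        "same-origin",
        exceptions
      ),
    }),
    { urls: ["<all_urls>"] },
    ["blocking", "responseHeaders"]
  );
}
```

Policies set through a `<meta name="referrer">` element or `referrerpolicy` attributes would not be covered by rewriting the header alone, so this is a starting point rather than a complete solution.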

As a result, I think the correct resolution is WONTFIX. Mike, can you give me a second opinion and close this bug if you agree?
Flags: needinfo?(mconley)
I think you've made a solid case, Gijs. We might want to pass this over to someone from the Privacy and Security team in case they had anything else to add, but I agree this is starting to look like a WONTFIX.
Flags: needinfo?(mconley) → needinfo?(jhofmann)
Dear Gijs,

thank you for chiming in and for mentioning the referrer policies that website developers have at hand.

I will address your points in order:


1) Is a visited link personal data in the sense of the GDPR?

Certainly, many website URLs are not personal data. However, some may contain personal data in the form of usernames etc. Note that even if the URL does not contain a username, it can be personal data. Imagine a service producing personal invitation links (for Flickr albums or Google docs) or a link shortener. If a person can be singled out because the link has been given only to one person, the link identifies that person. If such a link is shared with e.g. three people, I would argue it still remains personal data, as the chance of guessing one out of very few remains rather high.

2) If the referrer link is personal data, who has to take measures?

The data controller has to take measures. It is not evident who is the data controller. Things get more complicated when the referring website is not subject to the GDPR and the referred website is subject to the GDPR. I just deleted my reasoning as I believe this is a very complex legal question. The data controller could be the website, the browser, or the browser user. Most users do not really understand the HTTP protocol. So I find it difficult to give them all the responsibility for their privacy. In my first report I assumed that the browser is the controller, i.e. the people offering the browser.

3) HTTP referrer is like a sender of a postcard

I disagree. The sender has much transparency about what information is obviously part of the letter (envelope). One can assume explicit informed consent of the sender.

The sending of the HTTP referrer is not explicit and not transparent.

4) Websites stop functioning without referrer

Of course, user expectations must be managed. However, if those functionalities cannot be achieved in a legally compliant manner, then they can no longer be offered.

A solution to those may become clearer as GDPR case law emerges. Until then, I suggest adopting a stronger privacy setting rather than a weaker one. That's why people choose Firefox. Drawing attention to this issue by introducing a UI element giving control(lership) to the user is an improvement in my opinion.
(In reply to robert from comment #6)
> 1) Is a visited link personal data in the sense of the GDPR?
> 
> Certainly, many website URLs are not personal data. However, some may contain
> personal data in the form of usernames etc. Note that even if the URL does
> not contain a username, it can be personal data. Imagine a service producing
> personal invitation links (for Flickr albums or Google docs) or a link
> shortener. If a person can be singled out because the link has been given
> only to one person, the link identifies that person. If such a link is
> shared with e.g. three people, I would argue it still remains personal data,
> as the chance of guessing one out of very few remains rather high.

I already said that there may or may not be personal data in a URL. The problem is that the browser cannot know whether a URL contains personal information or not. Even for a human being this can take work (ie reading and interpreting source code generating the URL, which may be obfuscated or server-side and thus not accessible) to determine, e.g. when such data is encoded.

> 2) If the referrer link is personal data, who has to take measures?
> 
> The data controller has to take measures. It is not evident who is the data
> controller. Things get more complicated when the referring website is not
> subject to the GDPR and the referred website is subject to the GDPR. I just
> deleted my reasoning as I believe this is a very complex legal question.
> The data controller could be the website, the browser, or the browser user.
> Most users do not really understand the HTTP protocol. So I find it
> difficult to give them all the responsibility for their privacy. In my first
> report I assumed that the browser is the controller, i.e. the people
> offering the browser.

Given that the creator of the URL (which may or may not have such data) is not the browser but the website, I don't see why you think this would be the browser. The people making the browser, as an organisation, have no access to the data of individual users of the software so cannot possibly be the controller. See also e.g. https://www.kuppingercole.com/blog/kuppinger/is-your-software-gdpr-compliant-is-that-the-right-question .


> 3) HTTP referrer is like a sender of a postcard
> 
> I disagree. The sender has much transparency about what information is
> obviously part of the letter (envelope). One can assume explicit informed
> consent of the sender.

For handwritten postcards, yes. But a lot of postal mail is not sent that way. I get post from charities that I give money to. So the postman knows some of the charities I give money to. Is that a problem? Etc.

> The sending of the HTTP referrer is not explicit and not transparent.

I would argue that the presence or absence of the referrer is more transparent than the content (and/or "personal-data-ness") of the URL, as pointed out above. :-)

Most users can't distinguish "data:application/javascript,/*www.google.com*//*now steal all your data*/" from "https://www.google.com/" as a URL, never mind tell whether a blob of seemingly random alphanumerics is hiding a personal identifier or is just a random token, or, even if they could, whether that token was *necessary* for the functioning of the website.

> 4) Websites stop functioning without referrer
> 
> Of course, user expectations must be managed. However, if those
> functionalities cannot be achieved in a legally compliant manner, then they
> can no longer be offered.

That would be the responsibility of the data controller, which isn't Mozilla or the Firefox browser. We can't just block all the referrers because *a few* might contain personal data, also breaking other (compliant) websites that just use it internally.

> A solution to those may become clearer as GDPR case law emerges. Until
> then, I suggest adopting a stronger privacy setting rather than a weaker
> one. That's why people choose Firefox. Drawing attention to this issue by
> introducing a UI element giving control(lership) to the user is an
> improvement in my opinion.

If we added user prompting for everything that impacted your privacy you would no longer be able to use the browser - there are too many issues (what if the IP address of the website changed? Or its TLS cert chain? Or the ownership of the business? Should we tell you about 3rd-party scripts included in payment pages ( https://hackernoon.com/im-harvesting-credit-card-numbers-and-passwords-from-your-site-here-s-how-9a8cb347c5b5 )?). The browser is there to make decisions on behalf of the user, based on what the user tells it - not to wait and ask the user endless questions, or overload them with information.



I'm pretty convinced that the GDPR angle here is a dead end. That doesn't mean we can't do anything, but "turn off all referrers until the user opts in" isn't a viable strategy. Basically what you described is something like:

If users opt in to TP (tracking protection) or DNT, then stop sending referer information except where necessary/desirable.

The hard part is figuring out "where necessary/desirable", and how to give users informed control over that. Just asking them every time isn't workable (pages regularly have 30-100 subrequests for scripts, styles, frames etc., can you imagine dealing with that number of prompts?), nor is breaking the referrer header for everything and making users manually turn it back on for all requests from/to sites that "need" it (where it'd also be unclear if that would result in a different end-state than the status quo - ie wouldn't users just turn it back on for loads of sites, thus making them still leak data on those sites for some of the requests where the referrer *isn't* strictly necessary?). It's not even clear if a per-site whitelist would have to be from outgoing or to incoming sites (ie "never/always allow sending google.com URLs in the referrer" vs. "never/always allow sending any referrer information to evil.com").

So if we want to do anything here (unrelated to GDPR), then I think we need some answers for some of those questions, ie how to offer users informed control, and/or how the browser should determine when to send and when not to send the referrer (e.g. only for toplevel requests, or only to sites the user has visited before / bookmarked, or only send toplevel domains but no path/querystring information, or...?). Do you have more specific ideas about this, and how we could strike a balance between privacy and usability here?
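
To make the open questions above a bit more concrete, here is a sketch that combines some of the options floated (per-site lists in either direction, top-level requests only, origin without path/query string). Everything in it is hypothetical: `decideReferer`, the shape of `request` and `rules`, and the host names are made up for illustration and do not describe any existing Firefox feature.

```javascript
// Hypothetical decision function; returns the Referer value to send, or
// null to send none.
function decideReferer(request, rules) {
  const from = new URL(request.referringUrl);
  const to = new URL(request.targetUrl);

  // Per-site lists can work in either direction:
  // "never send URLs of this site" vs. "never send anything to this site".
  if (rules.neverSendFrom.has(from.hostname)) return null;
  if (rules.neverSendTo.has(to.hostname)) return null;

  // Option: only send a referrer for top-level requests.
  if (rules.topLevelOnly && !request.isTopLevel) return null;

  // Option: send only the origin, without path or query string.
  if (rules.originOnly) return from.origin + "/";

  from.hash = ""; // a referrer never includes the fragment
  return from.href;
}

// Example rule set (made-up host names).
const exampleRules = {
  neverSendFrom: new Set(["private.example"]),
  neverSendTo: new Set(["evil.example"]),
  topLevelOnly: true,
  originOnly: true,
};
```

Even in this small form, each rule needs a default and a way for the user to understand and change it, which is the usability problem described above.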
Flags: needinfo?(robert)
I agree with Gijs about the GDPR thing. Since GDPR we've had a bunch of bugs where reporters claim that their pet peeve surely falls under GDPR and thus we must implement their proposed solution. For GDPR related issues I believe/hope our legal department is in control of the situation and I don't want to set a precedent of acting on claims of GDPR non-compliance without legal advice.

With that said, giving users more control or even more sensible defaults (beyond bug 587523) on referrers might be something worth pursuing. Tanvi, do you agree (should we track this somewhere) or is this a WONTFIX for now?

Also CC'ing Steve on the chance his research made him have an opinion on referrers and Luke who did the previous referrer thing.
Flags: needinfo?(jhofmann) → needinfo?(tanvi)
In Bug 1461824 I will be investigating the impact of stripping referrers from resources on the tracking protection list. Something like that may strike a balance between limiting site breakage and preserving user privacy from trackers.

I think it's safe to WONTFIX this as I don't see a viable way to obtain meaningful consent for referrer headers. Even if we could, I don't think that's the approach we should take.
That sounds great, thanks Steven. Closing this one then.
Status: UNCONFIRMED → RESOLVED
Closed: 6 years ago
Flags: needinfo?(tanvi)
Resolution: --- → WONTFIX
Flags: needinfo?(robert)