
[Shield] Pref Flip Study: Trusted Recursive Resolver

Status: ASSIGNED
Product: Shield
Component: Shield Study
Opened: 8 days ago
Last modified: a day ago
Reporter: bagder
Assigned to: bagder
Tracking flags: not tracked
Whiteboard: [trr]

(Assignee)

Description

8 days ago
Basic description of experiment: TRR is a separate and parallel way to resolve host names in the browser, and the implementation allows for several different operational modes. We want to enable TRR in “shadow mode”, meaning that Firefox resolves all host names using both the original native resolver mechanism and DNS-over-HTTPS (DOH), but the DOH results are discarded and used only for measurement and telemetry. For this experiment, we would use a Cloudflare-hosted server.

What is the preference we will be changing? network.trr.mode = 4, and network.trr.uri = “https://dns.cloudflare.com/.well-known/dns”

What are the branches of the study and what values should each branch be set to? Two branches: one using TRR, one not. (the one ‘not’ might actually be the control - it would have default prefs. Not sure of shield nomenclature.)

What percentage of users do you want in each branch? 50/50
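
The two branches and the 50/50 split described above can be sketched as follows. This is a minimal illustration, not how Shield itself enrolls clients: the branch names, the `study_name` value, and the hash-based bucketing are all assumptions made for the example. Only the two pref values are taken from the study description.

```python
import hashlib

# Pref overrides per branch. The TRR values come from the study description;
# the control branch applies no overrides and keeps whatever defaults ship.
# The branch names themselves are hypothetical.
BRANCHES = {
    "control": {},
    "trr-shadow": {
        "network.trr.mode": 4,
        "network.trr.uri": "https://dns.cloudflare.com/.well-known/dns",
    },
}


def assign_branch(client_id: str, study_name: str = "trr-shadow-study") -> str:
    """Deterministically map a client to one of two equally sized branches."""
    digest = hashlib.sha256(f"{study_name}:{client_id}".encode("utf-8")).digest()
    # Use the first 8 bytes as an integer; even/odd splits clients roughly 50/50.
    bucket = int.from_bytes(digest[:8], "big") % 2
    return "trr-shadow" if bucket else "control"


def prefs_for(client_id: str) -> dict:
    """Return the pref overrides the client's branch would receive."""
    return BRANCHES[assign_branch(client_id)]
```

Hashing the client identifier rather than drawing a random number keeps a client in the same branch for the whole study, which matters when timings are compared across several days.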

What Channels and locales do you intend to ship to? Nightly

What is your intended go live date and how long will the study run? 7 days (?)
Are there specific criteria for participants? We want a random distribution so that we can assume both branches are sufficiently similar, user-wise. Being able to break the data down by very rough locale would be interesting, as internet topology will impact performance.

What is the main effect you are looking for and what data will you use to make these decisions? We will look at resolver timings, connection error rates and HTTP response code changes.

Who is the owner of the data analysis for this study? Daniel Stenberg + Patrick McManus

Will this experiment require uplift? No

QA Status of your code: Green, yellow, red. Your code should be QA’d to ensure that changing the preference values has the intended effect you are looking for and does not cause obvious regressions to Firefox. All experiments must pass QA. Depending on the channel/population size a dev QA may be accepted.

Do you plan on surveying users at the end of the study? No.

Link to any relevant google docs / Drive files that describe the project. Links to prior art if it exists:



Details Section for Analysis
For each telemetry probe to be analysed in the study, find it here to determine the following:
Name of probes
DNS_LOOKUP_DISPOSITION
DNS_NATIVE_LOOKUP_TIME
DNS_TRR_RACE
DNS_LOOKUP_ALGORITHM
DNS_TRR_LOOKUP_TIME
DNS_BLACKLIST_COUNT
DNS_TRR_BLACKLISTED
DNS_CLEANUP_AGE
IPV4_AND_IPV6_ADDRESS_CONNECTIVITY
HTTP_RESPONSE_STATUS_CODE
Associated bugzilla thread URL:
e.g. https://bugzilla.mozilla.org/show_bug.cgi?id=
(Assignee)

Updated

8 days ago
Flags: needinfo?(mgrimes)
Flags: needinfo?(isegall)

Comment 1

7 days ago
> For this experiment, we would use a Cloudflare-hosted server.

This will result in sending all DNS lookups in the study to Cloudflare. Given that even Mozilla does not collect this level of browsing data via telemetry/studies, I do not think it is reasonable to send this data to a third party.
Flags: needinfo?(mgrimes)

Comment 2

I think we shouldn't run this study in the proposed form.

Sending information about what is browsed to an off-path party will erode trust in Mozilla due to people getting upset about privacy-sensitive information (what they browse where "they" is identified by IP address and "what" by host name) getting sent to an off-path party without explicit consent.

The policy agreements we have in place with the off-path party won't remove this negative effect, since the way people are known to react to this kind of thing isn't in our power to negotiate: people will react to this as a matter of what technically got sent and not as a matter of what the recipient promised not to do. (A browser sending information about what is browsed to an off-path party is the quintessential browser privacy no-no.)

(By off-path party, I mean a party that isn't *by necessity* on the network path between the user's computer and the site the user browses. The site can use third-party trackers or infrastructure providers, but that's an action on the site's part--not on the browser's part.)

The problem could be addressed in two ways:

 1) To study things like end-to-end reachability or round-trip time, Firefox could perform queries for a set of pre-defined names (to remove any correlation with what the user actually browses). This kind of thing has successful precedent as part of TLS 1.3 handshake studies.

 2) To study things under realistic use, we should obtain an explicit opt in specifically for sending DNS queries to Cloudflare. (We should do this even if it introduces a potential bias in the data.)
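
Option 1 can be sketched as a small timing harness. This is only an illustration under stated assumptions: the probe names are hypothetical placeholders, and the resolver is passed in as a function so the same loop could time either a native lookup or some other resolver. Nothing here reflects Firefox's actual telemetry code.

```python
import socket
import time
from typing import Callable, Dict, Iterable, Optional

# Hypothetical fixed probe names; a real study would pick names it controls,
# so that the queries say nothing about what the user actually browses.
PROBE_NAMES = ("probe-a.example.com", "probe-b.example.com")


def native_resolve(name: str) -> None:
    """Resolve a name with the operating system's resolver."""
    socket.getaddrinfo(name, 443)


def time_lookups(
    names: Iterable[str],
    resolve: Callable[[str], None] = native_resolve,
) -> Dict[str, Optional[float]]:
    """Return the lookup time in milliseconds per name, or None on failure."""
    timings: Dict[str, Optional[float]] = {}
    for name in names:
        start = time.perf_counter()
        try:
            resolve(name)
        except OSError:
            # socket.gaierror is a subclass of OSError; record the failure.
            timings[name] = None
            continue
        timings[name] = (time.perf_counter() - start) * 1000.0
    return timings
```

Calling `time_lookups(PROBE_NAMES)` would time the native resolver for the fixed names only, which gives reachability and latency data without any link to browsing history.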

Comment 3

Are DoH requests made from private browsing sessions too?
Flags: needinfo?(daniel)
(Assignee)

Comment 4

5 days ago
Yes, DOH requests are made in PB as well. DOH is after all by design likely to be more private and secure than native resolves.

- Like "native" resolves, name resolve results in PB are cached separately from non-PB resolves
- The TRR blacklist is never stored on disk for PB sessions
Flags: needinfo?(daniel)

Comment 5

5 days ago
Science review: R+, though I have some questions about the analysis that we can discuss when I have access to the phd
Flags: needinfo?(isegall)

Comment 6

These DoH requests definitely leak private browsing history to the (3rd-party) DoH provider.

Are the PB-cached DoH resolve results cleared when the user exits private browsing?
Flags: needinfo?(daniel)

Comment 7

Based on our risk matrix I'm adding a few more folks for sign off.
Flags: needinfo?(merwin)
Flags: needinfo?(nnguyen)
Flags: needinfo?(dcamp)

Comment 8

Since this is happening in Nightly, I think this is fine to launch.
Flags: needinfo?(nnguyen)
(Assignee)

Comment 9

5 days ago
(In reply to Luke Crouch [:groovecoder] from comment #6)
> These DoH requests definitely leak private browsing history to the
> (3rd-party) DoH provider.

Name resolving means asking a 3rd party (in all typical cases). It is often your ISP and it is often Google's DNS (8.8.8.8) or similar. In the DOH case it is *also* a 3rd party, that is correct. Probably not the same 3rd party though.

(and for this study, we're suggesting we leak to both 3rd parties for the purpose of getting data and metrics on how it fares)

Name resolving leaks info to 3rd parties. Both DOH and ordinary native resolving do.

> Are the PB-cached DoH resolve results cleared when the user exits private
> browsing?

DOH-resolved names in PB are treated exactly the same as natively resolved names in PB in the cache. They're kept separate from the normal DNS cache (and they're only stored in memory). The DNS cache in fact works the same way, DOH or not.

I believe PB-resolved DNS cache entries are not cleared on "PB exit". The DNS cache isn't really aware of the concept of entering or exiting the mode.
Flags: needinfo?(daniel)

Comment 10

I also support this time-limited Nightly trial.
Flags: needinfo?(dcamp)

Comment 11

5 days ago
favors-a-study
(In reply to Daniel Stenberg [:bagder] from comment #9)
> Name resolving leaks info to 3rd parties. Both DOH and ordinary native resolving do.

Short feedback on this:
The user's DNS provider (even if it's the regular ISP) is the user's decision.

https://www.mozilla.org/en-US/privacy/firefox/#telemetry
It may be illegal in the EU to process parts of browsing data without further consent. By agreeing to basic telemetry, a Nightly user does not expect domains from their browsing activity to be transmitted to a host defined by Mozilla. That's far more than a search request being routed to a search engine.

You have https://addons.mozilla.org/en-US/firefox/addon/firefox-pioneer/ ("specially marked SHIELD studies") for these types of experiments. https://support.mozilla.org/en-US/kb/about-firefox-pioneer

Comment 12

Tagging Selena too, my support is contingent on her go-ahead.
Flags: needinfo?(sdeckelmann)

Comment 13

(In reply to Daniel Stenberg [:bagder] from comment #9)
> I believe PB-resolved DNS cache entries are not cleared on "PB exit". The
> DNS cache isn't really aware of the concept of entering or exiting the mode.

Actually we do clear the DNS cache when the last PB is closed.
https://searchfox.org/mozilla-central/rev/877c99c523a054419ec964d4dfb3f0cadac9d497/netwerk/dns/nsDNSService2.cpp#1129-1130,1133
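
The behaviour discussed in this exchange (PB entries kept apart from normal entries, held in memory only, and cleared when the last PB context closes) can be sketched as below. The class and method names are hypothetical and this is not the code in nsDNSService2.cpp. In particular, the thread does not say whether the real code drops only private entries or the whole cache; dropping only private entries is an assumption of this sketch.

```python
from typing import Dict, List, Optional, Tuple


class DnsCache:
    """In-memory cache that keeps private-browsing entries separate."""

    def __init__(self) -> None:
        # Key is (host, is_private), so PB and non-PB lookups never share entries.
        self._entries: Dict[Tuple[str, bool], List[str]] = {}
        self._open_private_contexts = 0

    def put(self, host: str, addresses: List[str], is_private: bool) -> None:
        self._entries[(host, is_private)] = list(addresses)

    def get(self, host: str, is_private: bool) -> Optional[List[str]]:
        return self._entries.get((host, is_private))

    def private_context_opened(self) -> None:
        self._open_private_contexts += 1

    def private_context_closed(self) -> None:
        self._open_private_contexts = max(0, self._open_private_contexts - 1)
        if self._open_private_contexts == 0:
            # Last PB context is gone: drop every private entry, keep the rest.
            self._entries = {
                key: value for key, value in self._entries.items() if not key[1]
            }
```

Note that the key carries no information about which resolver produced the answer, matching the statement that the cache works the same way with or without DOH.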
(Assignee)

Comment 14

5 days ago
(In reply to Valentin Gosu [:valentin] from comment #13)

> Actually we do clear the DNS cache when the last PB is closed.

Ah, yes there it is! Thanks for correcting me!

Comment 15

Thanks for the info. Since this does not leak any additional Private Browsing history to local adversaries, it seems fine to me. The new additional 3rd party receiving name resolution requests is still concerning. But I like the random_padding feature of DoH for some extra privacy. Are we planning to use the random_padding feature?

Comment 16

4 days ago
I am signing off, conditional on Selena giving the final go ahead. This experiment is testing a feature that could add valuable privacy and security protections for our users. We have in the past used Nightly for tests that have similar privacy properties.

We do need to be transparent about this and to communicate clearly what prospective privacy and security benefits this feature will provide. We are working on a blog post that will add that transparency.
Flags: needinfo?(merwin)
(Assignee)

Comment 17

4 days ago
(In reply to Luke Crouch [:groovecoder] from comment #15)

> I like the random_padding feature of DoH for some extra privacy. Are we
> planning to use the random_padding feature?

Are you referring to HTTP/2 padding here? DOH itself is a very basic protocol that mostly just encapsulates DNS packets as they look over UDP, but sends them over HTTP/2 and HTTPS instead; it doesn't have any particular padding specified. I don't think we use padding in HTTP/2 (even if we of course support receiving it), but I'm not the expert on that.
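
To make "DNS packets as they look over UDP" concrete, here is a minimal sketch that builds such a packet for a single A-record question and encodes it as unpadded base64url text. The function names are made up for this example. How the bytes are attached to the HTTPS request (method, parameter name, content type) is defined by the DOH draft and deliberately not shown here.

```python
import base64
import struct


def build_dns_query(hostname: str, qtype: int = 1, query_id: int = 0) -> bytes:
    """Build a DNS query in the binary wire format that is also used over UDP."""
    # Header: ID, flags (0x0100 = recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Question name: each label is prefixed by its length, ending with a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A record) and QCLASS (1 = IN).
    return header + qname + struct.pack("!HH", qtype, 1)


def to_base64url(message: bytes) -> str:
    """Encode the message as unpadded base64url, a form that is safe inside a URL."""
    return base64.urlsafe_b64encode(message).rstrip(b"=").decode("ascii")
```

The packet contains no padding field of its own, which is consistent with the point above: any padding would have to come from the HTTP/2 or TLS layer.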

Comment 18

(In reply to Merwin from comment #16)
> We have in the past used Nightly for tests that
> have similar privacy properties.
> 

Can you point to examples and how we communicated privacy leaking for those?

(In reply to Valentin Gosu [:valentin] from comment #13)
> (In reply to Daniel Stenberg [:bagder] from comment #9)
> > I believe PB-resolved DNS cache entries are not cleared on "PB exit". The
> > DNS cache isn't really aware of the concept of entering or exiting the mode.
> 
> Actually we do clear the DNS cache when the last PB is closed.
> https://searchfox.org/mozilla-central/rev/877c99c523a054419ec964d4dfb3f0cadac9d497/netwerk/dns/nsDNSService2.cpp#1129-1130,1133

The point here is to NOT SEND DNS requests to an additional 3rd party in PB, not to ensure we don't keep them persistent.

Does the study ensure that?


In general, I think the main concern here is to make sure people explicitly opt in to this study, since this study (unlike other studies we run; correct me if I'm wrong) leaks private data to a 3rd party of our choice.


(Note that this TRR study made me turn off all studies on all my Nightly profiles)
(Assignee)

Comment 19

4 days ago
(In reply to Honza Bambas (:mayhemer) from comment #18)

> The point here is to NOT SEND DNS requests to an additional 3rd party in PB,
> not to ensure we don't keep them persistent.
> 
> Does the study ensure that?

I responded to a question about the currently implemented functionality, and it does not make any difference in Private Browsing.

The T in TRR stands for Trusted: this resolver mechanism is designed for you to select a trusted resolver to send your requests to, and thus avoid the privacy-leaking and vulnerable plain-text resolves that are otherwise done natively. In such a setup I don't think it makes a lot of sense to use the non-trusted resolver when you switch to PB. Do you?

If something in the existing implementation should be changed to make a study better or more likely to be acceptable, that's a slightly different question but I'm listening and I certainly personally am not against modifications for that purpose.

As the discussion goes however, I don't think changing behavior during PB would make a very big difference.

Comment 20

3 days ago
PB is by design only private on the user's end, not on the web end. How could it be?
To have true privacy regarding browsing, the user would need to use Tor or similar technologies.

I'm in favor of this study.

Comment 21

3 days ago
> PB is by design only private on the user's end, not on the web end. How could it be?

Private Browsing enables Tracking Protection by default, which is supposed to keep the user safe(r) from third-parties.

In the context of this study, at least, Cloudflare is a third party that some of us don't necessarily want to trust with our browsing history [1]. As roc pointed out on the mailing list, a recent Mozilla blog post says:

> The headlines speak for themselves: Up to 50 million Facebook users had their information used by Cambridge Analytica, a private company, *without their knowledge or consent*. That’s not okay.
> [...]
> At Mozilla, our approach to data is simple: no surprises, and user choice is critical.

Do you really feel that opt-out transmission to a third-party of the users' browsing history is a good way to improve their trust in Mozilla? The community and media provided quite some push-back against recent initiatives like Pocket, the RAPPOR collection of browsing history proposal, Cliqz, and others.

This does respect https://wiki.mozilla.org/Firefox/Data_Collection#Data_Collection_Categories, but I think running Nightly isn't a free pass for Mozilla to silently collect such information. And that's without mentioning the tendency of the "Allow Studies" check box to get re-enabled by itself [2].

All we're asking for is making this an opt-in instead.

[1] While I respect their work, they had a major data leak in the past
[2] Bug #1425663

Comment 22

(In reply to i from comment #20)
> PB is by design only private on the user's end, not on the web end.

Note: the design goals of Private Browsing were updated at the beginning of this year to include protecting the private session data from online tracking. [1] E.g., enabling Tracking Protection, and stripping path information from 3rd-party requests. [2]

[1] https://wiki.mozilla.org/Private_Browsing
[2] https://blog.mozilla.org/security/2018/01/31/preventing-data-leaks-by-stripping-path-information-in-http-referrers/

> How could
> it be?
> To have true privacy regarding browsing, the user would need to use Tor or
> similar technologies.

Bugzilla comments of a single study are a poor channel for defining "true privacy". :) Suffice it to say - Private Browsing should try to match users' expectations as much as possible, not necessarily aim for "true privacy".

> 
> I'm in favor of this study.

I'm also in favor of this study.

Comment 23

(In reply to Daniel Stenberg [:bagder] from comment #17)
> (In reply to Luke Crouch [:groovecoder] from comment #15)
> 
> > I like the random_padding feature of DoH for some extra privacy. Are we
> > planning to use the random_padding feature?
> 
> Are you referring to HTTP/2 padding here? DOH itself, which really is a very
> basic protocol which mostly just encapsulates DNS packets as they look over
> UDP but over HTTP/2 and HTTPS instead, doesn't have any particular padding
> specified. 

Oh, sorry ... I was looking at the random_padding parameter of https://developers.google.com/speed/public-dns/docs/dns-over-https. But I suppose that's a Google-specific implementation.

Does our implementation try to use any HTTP/2 padding to mitigate traffic analysis of the DNS queries? Seems like it could potentially be a nice bit of extra protection?

Comment 24

wrt padding in h2 or tls:

it turns out that padding is ridiculously hard to do in a way that's effective, at least in the general case. It would be an interesting thing we could analyze for DNS (and because h2 and tls define it well, we can make at least half the change unilaterally). It's possible the nature of DNS makes it more suitable for this than general web traffic is.

could you maybe open a bug/request on the topic? We don't need to discuss it in the shield bug.

But we should pursue that separately as resources accommodate; whatever the merits and demerits of TRR as a whole might be, TRR is already a strict improvement against the passive-observer adversary.
(Assignee)

Comment 25

3 days ago
(In reply to Luke Crouch [:groovecoder] from comment #23)

> Oh, sorry ... I was looking at the random_padding parameter of ... But I
> suppose that's a Google-specific implementation.

Correct, that's a different protocol. The DNS-over-HTTPS protocol we are working on here is documented here: https://tools.ietf.org/html/draft-ietf-doh-dns-over-https-04

Comment 26

a day ago
Could you not at least run a Mozilla server for that, instead of using an (untrusted) third party?

I agree that leaking domains to a third party is not what you want to do (and yes, it can erode trust in Mozilla; remember the Mr. Robot thing, which did happen. I know, this is different, but I am just saying…)
Especially when others stumble upon this bug and now complain. (Hint: it has already happened, German text: https://www.kuketz-blog.de/firefox-nightly-uebermittlung-von-besuchten-domains/)