Open Bug 1873171 Opened 1 year ago Updated 18 days ago

Support OpenPGP key lookup privacy

Categories

(MailNews Core :: Security: OpenPGP, enhancement)

Tracking

(Not tracked)

ASSIGNED

People

(Reporter: KaiE, Assigned: KaiE)

Attachments

(1 file)

As of today, whenever Thunderbird downloads a key from a public directory, it will use a direct connection to the keyserver, or to the WKD service.

This can allow the key directories, or observers on the network, to learn which keys the user is interested in.

The risk will further increase when automatically refreshing the local cache of keys (which is something we should support in the future), as it allows observers to learn about the user's social graph.

We should allow the use of helper gateway services, which make it more difficult to spy on the user.

I suggest introducing an OHTTP-like service to implement this enhancement.

As a first step towards this solution, I have deployed experimental relay/gateway servers, which don't use HPKE (as suggested by OHTTP), but rather use OpenPGP cryptography. This was simpler and quicker for me to do.

(The server side code, and sample bash commands for accessing them, can be found at https://github.com/kaie/openpgp-key-gateway/ )

I have implemented code for Thunderbird to use those servers.

I suggest that we start by offering this as an optional service for Thunderbird users, which needs to be enabled using a hidden preference.

Ideally we should work on migrating this to a real OHTTP service and use Mozilla/Thunderbird provided server infrastructure.

If gateway servers for specific keyservers become available, those could be used in addition.

Assignee: nobody → kaie
Status: NEW → ASSIGNED

Here are optional configuration parameters, which use an experimental onion gateway:

pref("mail.openpgp.keyserver.cloak.url", "https://kuix.de/koof/v1/forward.php?");
pref("mail.openpgp.wkd.cloak.url", "https://kuix.de/wkdf/forward.php?");
pref("mail.openpgp.keyserver.cloak.pubkey", "mDMEZY+CnhYJKwYBBAHaRw8BAQdAMyraZxilvBnrW8kcy/MhAzNuPQL2zUedGEQ45tMb1Jm0PnBncGtleTRqcmJ4eDd6Y2E2YnpncmgybWhwajN3bmlzMmdobWY3aXdtYW8za3phN2Rid2RkcnlkLm9uaW9uiJYEExYIAD4WIQS6baW0c1Qcwkbw5fey/WCzUjnC7wUCZY+CngIbAwUJBaOagAULCQgHAgYVCgkICwIEFgIDAQIeAQIXgAAKCRCy/WCzUjnC7wWkAP97mqpyVkzL11VLt6azW1vhFiIMObni/JVtayQX/0Z2pQEAtySis2kP2NdNMGhekRBee5uuEradmlAaP2pggpwdIAC4OARlj4KeEgorBgEEAZdVAQUBAQdAN46xPu2u/wSD/I5EmeTlVDfvrj1UDH/Zt+7MYrCV8WIDAQgHiH4EGBYIACYWIQS6baW0c1Qcwkbw5fey/WCzUjnC7wUCZY+CngIbDAUJBaOagAAKCRCy/WCzUjnC72QpAP95kuRmdRng+kPXPTcyMtODQ+aGcoPKmFMdy0j4O5l4hQEA2FqMQFKNU4h/KbI2s+q5H++QLgMDF0dMzIJiBsylvgM");
pref("mail.openpgp.wkd.cloak.pubkey", "mDMEZY+CnhYJKwYBBAHaRw8BAQdAMyraZxilvBnrW8kcy/MhAzNuPQL2zUedGEQ45tMb1Jm0PnBncGtleTRqcmJ4eDd6Y2E2YnpncmgybWhwajN3bmlzMmdobWY3aXdtYW8za3phN2Rid2RkcnlkLm9uaW9uiJYEExYIAD4WIQS6baW0c1Qcwkbw5fey/WCzUjnC7wUCZY+CngIbAwUJBaOagAULCQgHAgYVCgkICwIEFgIDAQIeAQIXgAAKCRCy/WCzUjnC7wWkAP97mqpyVkzL11VLt6azW1vhFiIMObni/JVtayQX/0Z2pQEAtySis2kP2NdNMGhekRBee5uuEradmlAaP2pggpwdIAC4OARlj4KeEgorBgEEAZdVAQUBAQdAN46xPu2u/wSD/I5EmeTlVDfvrj1UDH/Zt+7MYrCV8WIDAQgHiH4EGBYIACYWIQS6baW0c1Qcwkbw5fey/WCzUjnC7wUCZY+CngIbDAUJBaOagAAKCRCy/WCzUjnC72QpAP95kuRmdRng+kPXPTcyMtODQ+aGcoPKmFMdy0j4O5l4hQEA2FqMQFKNU4h/KbI2s+q5H++QLgMDF0dMzIJiBsylvgM");

Stunning idea and work, thank you so much for your extremely professional work, Kai.

(In reply to Kai Engert (:KaiE:) from comment #0)

This can allow the key directories, or observers on the network, to learn which keys the user is interested in.

If someone is worried about this, why not use a VPN? Or at least a trusted proxy server for http requests?
Or simply use another key server if they do not like the default?

Hi Magnus, I think it isn't just that some people should worry about it. I think that everyone needs to be worried about this. This lookup privacy should be a default.

A VPN would simply move the place where listening could be done.
Also, it's hard to set up a VPN or a trusted proxy server; not everyone can get that done easily.
This should be a default, not something that requires manual protection.

This isn't a problem that I'm making up. At the last OpenPGP summit, there was agreement that this is a problem.

It's useful to implement functionality that protects the privacy of users, without requiring them to act on their own, and this technology reduces the need to have to trust someone. If we can arrange for the relay and the gateway to be operated by different entities, it's a big improvement.

You kind of need to trust some server anyway, because some server is making the requests and they could always be listened to there - perhaps with slightly less context.

Can you explain why I'd want to worry about it?
In general it would require a very significant effort to actively monitor and guess the https traffic I send out. If I successfully send the message, that still goes through an smtp server and we must assume the recipient list is in many cases directly fed to surveillance. If not there, then one of the recipients' receiving servers might leak the list to powerful parties.

Magnus, have you read the concept of Oblivious HTTP?

client -> relay -> gateway -> target

The client encrypts a request for the gateway.

The gateway will execute the request, but the gateway doesn't know the origin of the request.
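To make the split of knowledge concrete, here is a minimal sketch of the client -> relay -> gateway flow, runnable in Node.js. It is an illustration only: RSA-OAEP from Node's `crypto` module stands in for the real encryption (OHTTP uses HPKE, the experimental servers use OpenPGP), and the function names, IP addresses and the lookup string are all made up for this example.

```javascript
const crypto = require("node:crypto");

// Gateway key pair. RSA-OAEP is only a stand-in for HPKE or OpenPGP here.
const gatewayKeys = crypto.generateKeyPairSync("rsa", { modulusLength: 2048 });

// Client: encrypt the lookup so that only the gateway can read it.
function clientBuildRequest(clientIp, lookup) {
  const sealed = crypto.publicEncrypt(
    gatewayKeys.publicKey,
    Buffer.from(lookup, "utf8")
  );
  return { sourceIp: clientIp, sealed };
}

// Relay: sees the client's IP and an opaque blob, then forwards the blob
// under its own IP. It has no key, so the lookup stays unknown to it.
function relayForward(incoming, relayIp) {
  const relayObserved = { clientIp: incoming.sourceIp, lookup: null };
  const outgoing = { sourceIp: relayIp, sealed: incoming.sealed };
  return { relayObserved, outgoing };
}

// Gateway: can decrypt the lookup, but only sees the relay as the sender.
function gatewayHandle(incoming) {
  const lookup = crypto
    .privateDecrypt(gatewayKeys.privateKey, incoming.sealed)
    .toString("utf8");
  return { gatewayObserved: { sourceIp: incoming.sourceIp, lookup } };
}

// Example run with documentation-range IP addresses.
const request = clientBuildRequest("198.51.100.7", "bob@example.org");
const { relayObserved, outgoing } = relayForward(request, "203.0.113.1");
const { gatewayObserved } = gatewayHandle(outgoing);
console.log(relayObserved, gatewayObserved);
```

In this toy model the relay ends up knowing who asked but not what was asked, and the gateway knows what was asked but only sees the relay's address. That separation only holds if relay and gateway do not share their logs.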

(In reply to Magnus Melin [:mkmelin] from comment #8)

Can you explain why I'd want to worry about it?

If Thunderbird routinely refreshes all keys that you have (which is something that I want to enable in the future),
then a listener on the network can passively learn about the group of people you are having encrypted communications with.

(In reply to Kai Engert (:KaiE:) from comment #9)

The gateway will execute the request, but the gateway doesn't know the origin of the request.

It has one less IP, but could still easily guess which chunks of requests go together.
And as things currently stand, relay and gateway would be run by the same entity, so there's no reason it couldn't correlate IPs later if it wanted to. Sure, find another party then... I don't think this is easy in practice.

then a listener on the network can passively learn about the group of people you are having encrypted communications with.

But why would I care? As explained in comment 33, I think it's safe to assume any address lists, encrypted mail or not, are already known to state-like actors. Who's really behind a certain email might not be directly known.

(In reply to Magnus Melin [:mkmelin] from comment #11)

(In reply to Kai Engert (:KaiE:) from comment #9)

The gateway will execute the request, but the gateway doesn't know the origin of the request.

It has one less IP, but could still easily guess which chunks of requests go together.

It isn't as easy.

If relay and gateway run in different geographic regions, it requires a powerful adversary with global observation ability to correlate those packets. Sure, if there's very low traffic, it's still possible. But it's much harder to do, it won't be done by any random actor who uses the opportunity of spying on something that's easy to do, and even for a global adversary, it might be too resource intensive to be worthwhile to do for all packets.

Also, by introducing the gateway, we are also introducing the concept of "padding". Because public keys differ in size, the size of a direct response can give a rather direct indication of which key is being transported; padding makes that difficult. Padding can be used (and is used by my sample deployment) to bring response packets to rounded sizes, e.g. by making sure all chunks are multiples of 20 KB.

Assuming there is sufficient load between gateway and target, it won't be easy to tell which response from the target belongs to which response going back to the relay and client.
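As an illustration of the padding idea, here is a minimal sketch. It assumes that "20 KB" means 20 * 1024 bytes and that padding is done with trailing zero bytes; the helper names `paddedLength` and `padPayload` are hypothetical and are not taken from the gateway code or the Thunderbird patch.

```javascript
// Assumed block size: 20 KB taken as 20 * 1024 bytes.
const BLOCK_SIZE = 20 * 1024;

// Round a payload length up to the next multiple of the block size.
// An empty payload still occupies one full block.
function paddedLength(payloadLength, blockSize = BLOCK_SIZE) {
  const blocks = Math.max(1, Math.ceil(payloadLength / blockSize));
  return blocks * blockSize;
}

// Copy the payload into a zero-filled buffer of the padded length.
// A real format would also need to carry the true payload length
// so that the receiver can strip the padding again.
function padPayload(payload, blockSize = BLOCK_SIZE) {
  const padded = new Uint8Array(paddedLength(payload.length, blockSize));
  padded.set(payload);
  return padded;
}

// Two keys of different sizes produce responses of identical length.
const smallKey = new Uint8Array(900);
const largeKey = new Uint8Array(7000);
console.log(padPayload(smallKey).length === padPayload(largeKey).length);
```

With this scheme an observer who only sees lengths can distinguish responses by block count, but not by exact key size.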

And as things currently stand, relay and gateway would be run by the same entity, so there's no reason it couldn't correlate IPs later if it wanted to. Sure, find another party then... I don't think this is easy in practice.

then a listener on the network can passively learn about the group of people you are having encrypted communications with.

But why would I care? As explained in comment 33, I think it's safe to assume any address lists, encrypted mail or not, are already known to state-like actors. Who's really behind a certain email might not be directly known.

I disagree with your very broad statement "any address lists".

Sure, maybe some social graphs are already known.

But you cannot claim that this covers everyone.
It isn't safe to assume that ALL address lists are already known to state actors.

We can construct scenarios in which it isn't as you say.
Would you like me to describe example scenarios?

Magnus, it's fine if you personally don't care. But I don't think everyone is careless in the same way. And I personally care about exposing people to that risk. I feel responsible for not unnecessarily exposing users to this additional risk, which I would do by enabling key refreshing without any layer of cloaking.

I think it is necessary to have some level of cloaking, as a precondition to enabling key refreshing by default for everyone, as an additional safety layer, for those who need it.

For example, Alice might have sent an email to Bob once in the past. That's why she has Bob's key in her key store. Alice might have used an email provider in Bob's country. The providers in Alice's own country might have no idea that Alice and Bob were interacting in the past. They might even still be interacting, and the providers might still have no idea.

(In reply to Magnus Melin [:mkmelin] from comment #5)

why not use a [...] trusted proxy server for http requests?

The http proxy code in TB is currently broken and results in unexpected behavior. At least in the past, using the proxy seems to have had nearly zero priority. I ended up no longer trying to use it.
See e.g. bug 1735349 and bug 1734992

So IMHO this problem should be solved cleanly and generically as a built-in feature, and not via the broken user proxy preference.

Blocks: 1873567

I support this work. Metadata leakage is a real problem for everyone, even though most people have no idea about the risks (or how to defend against them). Having a path to reduced metadata leakage by default is exactly the sort of way that Thunderbird should be leading the ecosystem. Thank you for doing it, Kai!

I would be even happier if thunderbird were able to use a standardized OHTTP architecture, where the gateway is operated under different administrative control from the relay (and, preferably, under the same administrative control as the origin, since that is what lets the end user rely on the contents). Several engineer years worth of labor went into making the OHTTP architecture do sensible things against plausible adversaries, and it would probably be better to stand on the shoulders of that work than to have to replicate it.

(In reply to Kai Engert (:KaiE:) from comment #1)

As a first step towards this solution, I have deployed experimental relay/gateway servers, which don't use HPKE (as suggested by OHTTP), but rather use OpenPGP cryptography. This was simpler and quicker for me to do.

How does this impact using a standard OHTTP setup? If we want to implement it, we should implement it per spec.
I would also say that a first step before landing should be having a confirmed plan to actually get a real OHTTP setup running.

(In reply to Arvidt from comment #14)

The http proxy code in TB is currently broken and results in unexpected behavior.

AFAIK it's working. It may (depending on prefs) fall back to direct connection if proxy didn't work for certain cases.

(In reply to Kai Engert (:KaiE:) from comment #0)

As of today, whenever Thunderbird downloads a key from a public directory, it will use a direct connection to the keyserver, or to the WKD service.

This can allow the key directories, or observers on the network, to learn which keys the user is interested in.

Given ECH and other improvements in the area, I think it's really only the keyserver that could learn anything, even in case of a powerful attacker.

I don't see the case for using this for WKD, as those are individual servers. Their collection ability is thus pretty limited to start with.

Given ECH and other improvements in the area, I think it's really only the keyserver that could learn anything

ECH is not yet in wide deployment -- i don't think it's being offered for any keyserver that i'm aware of; even if it were, ECH only hides the server identity within the anonymity set of all names bound to that specific IP address. Given that keys.openpgp.org is the only name i'm aware of being hosted at 37.218.245.50, that's not much of a win. But let's put ECH and the network adversaries aside and talk about the risks of the keyserver learning something.

The keyserver, or a WKD server for a significant domain (e.g. proton.me), can learn quite a bit about patterns of access and metadata linkage from the metadata available to it. This is not something that end users understand how to characterize well, or to defend against. Kai's work here can help establish reasonable policies and practices for metadata-averse key distribution mechanisms, which in turn helps the operators of those key distribution mechanisms resist requests for them to turn over whatever trackable data they could have access to, because they can lock themselves out of the data.

Attachment #9371253 - Attachment description: Bug 1873171 - Initial support for OpenPGP key lookup privacy. r=mkmelin → Bug 1873171 - Initial support for OpenPGP key lookup privacy.
Attachment #9371253 - Attachment description: Bug 1873171 - Initial support for OpenPGP key lookup privacy. → WIP: Bug 1873171 - Initial support for OpenPGP key lookup privacy.

One year later, I would like to make another attempt to move forward with an initial implementation of this project.

The new functionality would be disabled by default and completely optional, so it shouldn't harm anyone to have this functionality built in.

I believe it would be a useful enhancement. I don't agree with most of Magnus' concerns, and I'm thankful for the explanations that Daniel has provided.

Attachment #9371253 - Attachment description: WIP: Bug 1873171 - Initial support for OpenPGP key lookup privacy. → Bug 1873171 - Initial support for OpenPGP key lookup privacy. r=mkmelin

I've posted the suggestion to add this functionality to Thunderbird on the planning mailing list:
https://thunderbird.topicbox.com/groups/planning/T578d845b3908cf45/implementing-automatic-openpgp-key-refresh-with-privacy-protections

Let's use that list to discuss further general concerns or thoughts on the suggestion, because bugzilla should be reserved mostly for discussions concerning the implementation.

No longer blocks: 1873567
See Also: → 1873567