Closed Bug 1370217 Opened 3 years ago Closed 7 months ago

Option to override/turn off Content-Language header

Categories

(Thunderbird :: Message Compose Window, enhancement)

52 Branch
enhancement
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 73.0

People

(Reporter: mozilla, Assigned: segfault)

Details

(Keywords: good-first-bug)

Attachments

(1 file, 3 obsolete files)

User Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:45.9) Gecko/20100101 Goanna/3.2 Firefox/45.9 PaleMoon/27.3.0
Build ID: 20161109284099

Steps to reproduce:

Content-Language is a new header that has been forcibly added to outgoing messages since Thunderbird 52. It contains information about the user’s dictionaries, a minor piece of metadata that can be used for fingerprinting.

RFC 3282, on which this feature is based, warns about a potential privacy breach in its “Security Considerations” section. Nonetheless, there is no option to override or turn off the Content-Language header.
We could add an option to turn it off.

https://tools.ietf.org/html/rfc3282
4. Security Considerations

   The only security issue that has been raised with language tags since
   the publication of RFC 1766, which stated that "Security issues are
   believed to be irrelevant to this memo", is a concern with language
   ranges used in content negotiation - that they may be used to infer
   the nationality of the sender, and thus identify potential targets
   for surveillance.

Personally, I don't quite understand the issue. When sending an e-mail, heaps of headers are sent, usually including your e-mail address and name. Your mail provider's MTA also adds routing headers, so you can follow the path of the e-mail. That the e-mail was written using a particular dictionary is not much more identifying than the information already sent.

What am I missing?
Severity: normal → enhancement
Heaps of headers are modified by privacy-aware email providers to reduce metadata leaks.
In the example below, you can see that the IP is cloaked and the timestamp is converted to UTC+0.

Received: from Unknown (ml01.unseen.is [82.221.106.185])
	by mt03.unseen.is (Postfix) with ESMTPSA id 78725542DC1
	for <xxx@xxx.xxx>; Tue,  6 Jun 2017 03:08:09 +0000 (GMT)
From: zzz@unseen.is
To: xxx@xxx.xxx
Subject: Test
Message-Id: <20170606060757.e60f2bf7720f90535b25e3ec@unseen.is>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Date: Tue,  6 Jun 2017 03:08:09 +0000 (GMT)

The user can specify any name or omit it, and the same goes for the mail agent, but the dictionary metadata is outside the user's control.
Some recipients would wonder on receiving a letter from Iceland, written in English, but with a dual English-Russian dictionary in use.
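
To make the concern concrete, here is a minimal Python sketch (not Thunderbird code) of how any recipient or intermediate server could read the header with the standard `email` module. The sample message and its `en-US, ru` value are hypothetical, made up to mirror the dual-dictionary scenario above; RFC 3282 allows a comma-separated list of language tags.

```python
from email import message_from_string
from typing import List

# Hypothetical message; the Content-Language value is an assumption
# chosen for illustration, not captured from a real Thunderbird mail.
RAW_MESSAGE = (
    "From: zzz@unseen.is\n"
    "To: xxx@xxx.xxx\n"
    "Subject: Test\n"
    "Mime-Version: 1.0\n"
    "Content-Type: text/plain; charset=UTF-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "Content-Language: en-US, ru\n"
    "\n"
    "Hello\n"
)


def content_languages(raw: str) -> List[str]:
    """Return the language tags listed in Content-Language (empty if absent)."""
    msg = message_from_string(raw)
    value = msg.get("Content-Language", "")
    return [tag.strip() for tag in value.split(",") if tag.strip()]


print(content_languages(RAW_MESSAGE))  # prints ['en-US', 'ru']
```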
As I said, we can add an option to turn it off.
Please do, sir.
 
Perhaps it would be better to allow overriding Content-Language, as is done with general.useragent.override, rather than just turning it on or off.

Let’s say the new option is called “general.contentlang.override”. Then if it is
- not set (default), Content-Language is inserted as it is now,
- set to blank, Content-Language is not inserted at all,
- set to a custom string (e.g. “en-US”, regardless of the dictionary used), Content-Language is inserted with that custom string.
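
The three cases in this proposal can be sketched as a small Python function. This only illustrates the suggested semantics and is not Thunderbird code; the function name and parameters are hypothetical, and `None` stands in for "pref not set".

```python
from typing import Optional


def content_language_value(override: Optional[str], dictionary_lang: str) -> Optional[str]:
    """Return the Content-Language value to send, or None to omit the header.

    override is None   -> pref not set: use the spellcheck dictionary language
    override is blank  -> pref set to blank: do not send the header
    override otherwise -> send the custom string as-is
    """
    if override is None:
        return dictionary_lang
    if override.strip() == "":
        return None
    return override


print(content_language_value(None, "ru"))     # prints ru
print(content_language_value("", "ru"))       # prints None
print(content_language_value("en-US", "ru"))  # prints en-US
```

Whether the preference system can actually tell "not set" apart from "blank" is a separate question that this sketch does not answer.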

What do you think? Any ETA?
I don't think we can distinguish between "not set" and "blank".

ETA would be a few versions down the track. If I did it soon, it would go into TB 55 beta, due out after next Monday, and then hit TB 52.3 six weeks later. No promises made. This is a small tweak with little risk.
> I don't think we can distinguish between "not set" and "blank".

Why? 
I’m looking at how the general.useragent.override option is implemented:
- a default installation has no such string in about:config,
- the user is able to create one, either blank (then the header is not inserted at all) or filled with whatever they like.
You mean "space", right? If you've done the research already, kindly paste the line number here. So far I can only see:
https://dxr.mozilla.org/comm-central/rev/cad53f061da634a16ea75887558301b77f65745d/mozilla/dom/base/Navigator.cpp#1881

I'd have to go looking for the space processing.
Any updates on this?
Not really. We're an open-source, contributor-driven project, so a patch is welcome. That would be about a three-line change:

1) Define the new preference in mailnews.js
2) Retrieve it and act on it when the header is written out. I'd have to look where that goes.
   As is so often the case, the research here will take longer than the implementation.
Component: Untriaged → General
Status: UNCONFIRMED → NEW
Component: General → Message Compose Window
Ever confirmed: true
Keywords: good-first-bug

This is a patch created by Tails to replace part of the functionality previously provided by TorBirdy, i.e. to avoid leaking the user's dictionary in the Content-Language header by overriding it with "en-US".

You need to ask for review or the patch will be ignored.

> You need to ask for review or the patch will be ignored.

Ok... so could you review it? Or what do I have to do to find a reviewer?

I re-read the "Getting reviews" section of https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/How_to_Submit_a_Patch. It says "To request a review, you will need to specify one or more usernames either when you submit the patch, or afterward in the UI", and that I should see the Mozilla Phabricator User Guide for details. There I could only find information about using moz-phab, and I would like to avoid learning how to use a new tool just to ask for a review, if possible.

I don't know where I should specify the usernames. So I will just try it here: jorgk-bmo or mkmelin, could one of you review this please?

Attachment #9115194 - Flags: review?(mkmelin+mozilla)

Attachment #9115194 [details] [diff] - Flags: review?(mkmelin+mozilla@iki.fi)

Thanks Jörg. I'll keep in mind for the future that I have to set the review flag to ? in the attachment.

Comment on attachment 9115194 [details] [diff] [review]
Avoid-spellchecking-language-disclosure-in-Content-Language-header.patch

Coming to think of this, the approach isn't right. We store the language in the message so when you save a draft (or template?) and later edit it again, you get the spellcheck language restored. Just storing en-US all the time will destroy that function.

You should check bug 1169184 where this feature was added and then make sure the language is not sent out if your preference is set, maybe around here:
https://hg.mozilla.org/comm-central/rev/5855f51dead5#l6.12
Saving a draft also goes via the send pipeline, so you need to be careful.
Attachment #9115194 - Flags: review-
Comment on attachment 9115194 [details] [diff] [review]
Avoid-spellchecking-language-disclosure-in-Content-Language-header.patch

Review of attachment 9115194 [details] [diff] [review]:
-----------------------------------------------------------------

Agreed it would be better to just avoid setting the header if a pref is set.
For the patch, please make the summary start with "Bug 1370217 -" and also add the name of the pref you add to the message
Attachment #9115194 - Flags: review?(mkmelin+mozilla)
Assignee: nobody → segfault
Status: NEW → ASSIGNED

> Coming to think of this, the approach isn't right. We store the language in the message so when you save a draft (or template?) and later edit it again, you get the spellcheck language restored. Just storing en-US all the time will destroy that function.

But storing the language in a draft that is sent to the IMAP server is also a metadata leak which we want to prevent. And if the draft is stored locally, the content language isn't preserved anyway.

> Agreed it would be better to just avoid setting the header if a pref is set.

Since it's clear from the headers that the mail was sent via Thunderbird, and Thunderbird always sets the Content-Language header unless this option is set, the fact that it's missing is also a metadata leak which we might want to avoid.

I could also add an option to not set the header at all (once I find out how; skipping the line Jörg pointed me to doesn't work, as the header is still being set), but I would prefer to also have an option to override it with en-US (or maybe some other configurable value).

(In reply to segfault from comment #18)

> But storing the language in a draft that is sent to the IMAP server is also a metadata leak which we want to prevent.

Your mail provider already has identifying information, like your e-mail address. In case of MFA, they have your phone number, and if it's a paid service, they have your name, address and credit card number. What are you trying to hide from your (trusted?) e-mail provider? That an e-mail is sent from your account? If you don't trust them, then worse things can happen to you. Like they don't deliver mail to you, they send it to the wrong recipient, or they take part in man in the middle attacks.

> And if the draft is stored locally, the content language isn't preserved anyway.

In fact, it is, I've just tested that. Overwriting with en-US is really counterproductive, not everyone has an en-US version of TB installed. Say someone has en-US and es-ES dictionaries installed and writes a draft in Spanish. You'll then restore en-US for him/her. Sorry, that's 100% unacceptable. As Magnus said, if at all, don't write the header, preferably only when sending.

> Since it's clear from the headers that the mail was sent via Thunderbird, and Thunderbird always sets the Content-Language header unless this option is set, the fact that it's missing is also a metadata leak which we might want to avoid.

M$ Outlook has sent the header for decades, only that it was mostly wrong. There's also a User-Agent header that TB sends, or have you suppressed that in some other way?

(In reply to Jorg K (GMT+1) (PTO to 5th Jan 2020, sporadically reading bugmail) from comment #19)

> (In reply to segfault from comment #18)

> > But storing the language in a draft that is sent to the IMAP server is also a metadata leak which we want to prevent.

> Your mail provider already has identifying information, like your e-mail address. In case of MFA, they have your phone number, and if it's a paid service, they have your name, address and credit card number. What are you trying to hide from your (trusted?) e-mail provider? That an e-mail is sent from your account? If you don't trust them, then worse things can happen to you. Like they don't deliver mail to you, they send it to the wrong recipient, or they take part in man in the middle attacks.

This option is supposed to replace behavior of TorBirdy, which is an extension for using Thunderbird with Tor. The concept of Tor is to hide the user's identity, even from the servers they are connecting to. That is why we want to prevent metadata, which could be used to fingerprint the user, from leaking to those servers.

> > And if the draft is stored locally, the content language isn't preserved anyway.

> In fact, it is, I've just tested that. Overwriting with en-US is really counterproductive, not everyone has an en-US version of TB installed.

I also tested it, twice now. Here is what I do (tested with Thunderbird 68.2.2 on Debian sid/unstable and Thunderbird built from current tip):

  1. Click "Write" to create a new message
  2. Click the language button in the bottom right corner of the message window and change the language to a non-en-US one. Write some text in that language.
  3. Click "Save" to save as draft. I have configured the drafts folder to "Local Folders" for this account.
  4. Close the message.
  5. Open the message in the local drafts folder, click "Edit".

The result is that the message is opened with language set back to en-US.

> Say someone has en-US and es-ES dictionary installed and writes a draft in Spanish. You'll then restore en-US for him/her. Sorry, that's 100% unacceptable. As Magnus said, if at all, don't write the header, preferably only when sending.

OK, I found a way to prevent the Content-Language header from being set at all: setting the content language to an empty string. This way, a user who has their default dictionary set to Spanish will have their drafts re-opened with Spanish instead of en-US. Is that good enough?

> > Since it's clear from the headers that the mail was sent via Thunderbird, and Thunderbird always sets the Content-Language header unless this option is set, the fact that it's missing is also a metadata leak which we might want to avoid.

> M$ Outlook has sent the header for decades, only that it was mostly wrong. There's also a User-Agent header that TB sends, or have you suppressed that in some other way?

Yes, in fact TorBirdy clears the User-Agent header via pref("general.useragent.override", "");. My concern is that if we're not doing a good enough job of hiding the fact that the mail was sent via Thunderbird, then the absence of a Content-Language header reveals that the user has enabled this very option.

Attachment #9115194 - Attachment is obsolete: true
Attachment #9116021 - Flags: review?(mkmelin+mozilla)

Hmm, re. steps 1-5: If I repeat them in TB 68.3.1, I get the non-en-US language and non-default language restored, so it works as designed. What's going on with you?

Yes, I guess that setting the language to an empty string will result in the header not being written. That destroys "edit draft", which for some reason is already not working for you. That may be good enough. I won't use that option. Magnus, what do you think?

Wow, a 300+KB patch. Something has gone wrong there. I'd prefer to call your pref mail.suppress_content_language.

(accidentally committed leftover files from a merge conflict, sorry, I'm not used to mercurial)

Attachment #9116021 - Attachment is obsolete: true
Attachment #9116021 - Flags: review?(mkmelin+mozilla)
Attachment #9116024 - Flags: review?(mkmelin+mozilla)

As I said: I'd prefer to call your pref mail.suppress_content_language.

Renamed the pref to mail.suppress_content_language.

Attachment #9116024 - Attachment is obsolete: true
Attachment #9116024 - Flags: review?(mkmelin+mozilla)
Attachment #9116025 - Flags: review?(mkmelin+mozilla)
Comment on attachment 9116025 [details] [diff] [review]
Bug-1370217-Add-pref-mail.suppress_content_language.patch

Review of attachment 9116025 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. r=mkmelin
I'll add a test for this as well

Sent to try now: https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=c52c8c09b81cde41fce4a5e9dfeb8d92ae356077
Attachment #9116025 - Flags: review?(mkmelin+mozilla) → review+

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/9edcc31370ac
Add pref mail.suppress_content_language that when set prevents setting Content-Language header. r=mkmelin
https://hg.mozilla.org/comm-central/rev/cfa51f6ced45
test for mail.suppress_content_language. r=me

Status: ASSIGNED → RESOLVED
Closed: 7 months ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 73.0