Closed Bug 1464667 Opened Last year Closed Last year

Suppress <plaintext> in email messages

Categories

(MailNews Core :: MIME, enhancement)

enhancement
Not set

Tracking

(thunderbird_esr52 fixed, thunderbird_esr60 fixed, thunderbird60 fixed, thunderbird61 wontfix, thunderbird62 fixed)

RESOLVED FIXED
Thunderbird 62.0
Tracking Status
thunderbird_esr52 --- fixed
thunderbird_esr60 --- fixed
thunderbird60 --- fixed
thunderbird61 --- wontfix
thunderbird62 --- fixed

People

(Reporter: mkmelin, Assigned: jorgk)

Details

Attachments

(5 files, 6 obsolete files)

MDN says firefox doesn't support <plaintext> - https://developer.mozilla.org/en-US/docs/Web/HTML/Element/plaintext

... but it seems it does. 

It's not treated as <pre>, but really treated as <plaintext>, which has no closing tag(!!). This makes for some pretty interesting and unexpected dom trees in cases when you hope the parser would make sense of the garbage you throw at it. Like bug 1464056 comment 18 - where for some cases Thunderbird is pulling two doms together and displaying them in one.

I'd suggest actually removing support for <plaintext> since at least according to mdn no browser supports it.
OK, change the tag to <pre> or remove it and everything that follows? It's easily doable like MimeInlineTextHTML_insert_lang_div() introduced in bug 1462895.

Do you want to take the bug or should I?
Flags: needinfo?(mkmelin+mozilla)
Attached patch 1464667-plaintext.patch (obsolete) — Splinter Review
This works nicely on my test message. Is that what you had in mind?
Flags: needinfo?(mkmelin+mozilla)
Attachment #8980975 - Flags: feedback?(mkmelin+mozilla)
Wow, I got it all wrong. This is a Code::DOM bug. In any case, with the MIME tweak suggested here, we're cooking with gas ;-)

We can other move this to a new bug, or move this bug here to Mailnews and file another Core::DOM bug.
Comment on attachment 8980975 [details] [diff] [review]
1464667-plaintext.patch

Review of attachment 8980975 [details] [diff] [review]:
-----------------------------------------------------------------

It should just ignore the element, so replace it with nothing. Also need to account for attributes. But let's to it in another bug. 

But I'd like to hear from DOM folks about removing the support from DOM. If this something they agree to we can just take that patch for our branch too, at least 60.
Attachment #8980975 - Flags: feedback?(mkmelin+mozilla) → feedback-
Attachment #8980975 - Attachment is obsolete: true
Magnus, you misspelled the closing tag, but even with </plaintext> that closing tag would be ignored.
Maybe Boris can answer this or find someone who can.
Flags: needinfo?(bzbarsky)
MDN's compat table there is just wrong.  <plaintext> is supported by Firefox, Chrome, Safari, and Edge, as you could tell if you had just tested it instead of asking me to test it...  And there are sites that use it to ensure that nothing after that is parsed as markup, so removing it would likely introduce XSS bugs in sites.  See also https://html.spec.whatwg.org/multipage/parsing.html#plaintext-state and the 'A start tag whose tag name is "plaintext"' bit in https://html.spec.whatwg.org/multipage/parsing.html#parsing-main-inbody

I filed https://github.com/mdn/browser-compat-data/issues/2168 to get the compat table fixed to reflect reality.
Status: NEW → RESOLVED
Closed: Last year
Flags: needinfo?(bzbarsky)
Resolution: --- → INVALID
Boris, many thanks for the advice and testing. Let's take the bug to mailnews.
Status: RESOLVED → REOPENED
Component: DOM: Core & HTML → MIME
Product: Core → MailNews Core
Resolution: INVALID → ---
Summary: remove support for <plaintext> → Suppress <plaintext> in email messages
Status: REOPENED → NEW
Attachment #8980975 - Attachment is obsolete: false
(In reply to Magnus Melin from comment #4)
> It should just ignore the element, so replace it with nothing.
That can be done in a loop.

> Also need to account for attributes.
You mean: detect <plaintext attr=xxx>?

> But let's to it in another bug. 
I recycled this one, so we have the "DOM answer" included.
Attached patch 1464667-plaintext.patch (v2) (obsolete) — Splinter Review
OK, now replacing <plaintext> tags with <plaintextignored>.

While working on this, I noticed bug 1462895 comment #43. That slipped through the review :-(
Assignee: nobody → jorgk
Status: NEW → ASSIGNED
Attachment #8984305 - Flags: review?(mkmelin+mozilla)
Attached patch 1464667-plaintext.patch (v2b) (obsolete) — Splinter Review
Fixed comment.
Attachment #8984305 - Attachment is obsolete: true
Attachment #8984305 - Flags: review?(mkmelin+mozilla)
Attachment #8984306 - Flags: review?(mkmelin+mozilla)
Comment on attachment 8984306 [details] [diff] [review]
1464667-plaintext.patch (v2b)

Review of attachment 8984306 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/mime/src/mimethtm.cpp
@@ +223,5 @@
>      message.Insert(NS_LITERAL_CSTRING("</div>"), index);
>  }
> +
> +/*
> + * The following function replaces <plaintext> tags with <plaintextignored>.

Could you please make it e.g. <xx-plaintext> instead. 
Not that we do anything with it but that would keep the dom following standards. (Non-standard elements should use dash in their names)

@@ +238,5 @@
> +
> +  // Make sure we have a <body> before we start.
> +  if (message.Find("<body>") == kNotFound)
> +    return;
> +

there's an extra blank line here

@@ +247,5 @@
> +    message.Insert("Ignored", index+10);
> +    index += 17;
> +  }
> +  index = 0;
> +  while ((index = message.Find("</plaintext", /* ignoreCase = */ false, index)) != kNotFound) {

I think we shouldn't bother with the closing tag, since that do not exit. Which is the whole reason we do this replace.
(In reply to Magnus Melin from comment #12)
> Could you please make it e.g. <xx-plaintext> instead. 
> Not that we do anything with it but that would keep the dom following
> standards. (Non-standard elements should use dash in their names)
Why xx-? Can we make it x-plaintext, like x-western or x-unicode?

> I think we shouldn't bother with the closing tag, since that do not exit.
Well, if you look at the selialised HTML returned from Gecko, it *dose* have the closing tag. So for symmetry we should replace it as well.
Attached patch 1464667-plaintext.patch (v3) (obsolete) — Splinter Review
Attachment #8980975 - Attachment is obsolete: true
Attachment #8984306 - Attachment is obsolete: true
Attachment #8984306 - Flags: review?(mkmelin+mozilla)
Attachment #8984367 - Flags: review?(mkmelin+mozilla)
Comment on attachment 8984367 [details] [diff] [review]
1464667-plaintext.patch (v3)

Review of attachment 8984367 [details] [diff] [review]:
-----------------------------------------------------------------

Great, r=mkmelin
Attachment #8984367 - Flags: review?(mkmelin+mozilla) → review+
Actually, my testcase was off and now that I retested it doesn't work. I'll have to rebuild with tip and see if it works.
Attachment #8984367 - Flags: review+
Attachment #8984367 - Flags: review?(mkmelin+mozilla)
Comment on attachment 8984367 [details] [diff] [review]
1464667-plaintext.patch (v3)

Review of attachment 8984367 [details] [diff] [review]:
-----------------------------------------------------------------

Ah yes it does work after all. It's only that replies don't go by this code path, so there you can still hide things easily
Attachment #8984367 - Flags: review?(mkmelin+mozilla) → review+
Attached patch 1464667-plaintext.patch (v4) (obsolete) — Splinter Review
You're right, compose wasn't covered. Now it is. I duplicated the code since the strings and parameters are different.
Attachment #8984367 - Attachment is obsolete: true
Attachment #8984702 - Flags: review?(mkmelin+mozilla)
Comment on attachment 8984702 [details] [diff] [review]
1464667-plaintext.patch (v4)

Review of attachment 8984702 [details] [diff] [review]:
-----------------------------------------------------------------

This works for replies too, but it doesn't yet cover forward.
Attached patch 1464667-plaintext.patch (v5) (obsolete) — Splinter Review
Now with forward.
Attachment #8984702 - Attachment is obsolete: true
Attachment #8984702 - Flags: review?(mkmelin+mozilla)
Attachment #8984763 - Flags: review?(mkmelin+mozilla)
Comment on attachment 8984763 [details] [diff] [review]
1464667-plaintext.patch (v5)

Review of attachment 8984763 [details] [diff] [review]:
-----------------------------------------------------------------

All working now, thx! r=mkmelin

::: mailnews/compose/src/nsMsgCompose.cpp
@@ +601,5 @@
> +remove_plaintext_tag(nsString &body)
> +{
> +  // Make sure we have a <body> before we start.
> +  if (body.Find("<body>", /* ignoreCase = */ true) == kNotFound)
> +    return;

was there a reason for this check?
for forward it's not quite clear it would always have a <body>. Seems it does have one now, but if there's no reason to check for it I'd remove that part so it doesn't break in the future.
Attachment #8984763 - Flags: review?(mkmelin+mozilla) → review+
I guess the check was copied from MimeInlineTextHTML_insert_lang_div() where we need and have a body to insert our "language div".

In this case here, I can drop the body check.
Removed search for body tag where not needed.
Attachment #8984763 - Attachment is obsolete: true
Attachment #8984871 - Flags: review+
1. There should be no need to do this in the HTMLSanitizer class, because if the sanitizer works correctly, it should already strip such funny tags, given that it works based on a whitelist. If <plaintext> passes the sanitizer, there's something badly wrong with the sanitizer. Could you please verify that <plaintext> is stripped even without the patch?

2. I don't like all this string messing. Please make sure you're not introducing buffer overflows :). The ignoreCase oversight is a good example of what can go wrong when you do search&replace on existing HTML documents for security. It doesn't work well. Inserting a <div> or other HTML tag is something else, you don't run into the same problems.

3. For the HTMLParsed class, we have a full DOM tree, so it could be removed this way.

4. Personally, I would rather strip the <plaintext> tag altogether than insert the "x-".

5. Also, like Magnus said, replacing </plaintext is unnecessary and is extra code.

> index += 12;

6. NIT: I'd use simply index++. Safer in case we to replace it with "" later.

Ben
Attachment #8984871 - Flags: review-
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/fca92b0b0692
Remove plaintext tag from HTML. r=mkmelin
Status: ASSIGNED → RESOLVED
Closed: Last yearLast year
Resolution: --- → FIXED
Ben, that comment came to late.
Target Milestone: --- → Thunderbird 62.0
Magnus, what do you want to do here? I suggest uplifting this.

(In reply to Ben Bucksch (:BenB) from comment #24)
> 1. There should be no need to do this in the HTMLSanitizer class, because if
> the sanitizer works correctly, it should already strip such funny tags,
> given that it works based on a whitelist. If <plaintext> passes the
> sanitizer, there's something badly wrong with the sanitizer. Could you
> please verify that <plaintext> is stripped even without the patch?
Yes, already working without the patch. I can remove the call.

> 2. I don't like all this string messing. Please make sure you're not
> introducing buffer overflows :). The ignoreCase oversight is a good example
> of what can go wrong when you do search&replace on existing HTML documents
> for security. It doesn't work well. Inserting a <div> or other HTML tag is
> something else, you don't run into the same problems.
There was no oversight. The previous code worked since anything returned from Gecko has lowercase tags. I just added the ignoreCase wholesale just in case that ever changes.

> 3. For the HTMLParsed class, we have a full DOM tree, so it could be removed
> this way.
So you suggest to implement this twice??

> 5. Also, like Magnus said, replacing </plaintext is unnecessary and is extra
> code.
Well, as I said, it comes out of the parser, so we might as well replace it for symmetry.
Patch to remove the unnecessary call.
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/3a1addee73fc
Remove unnecessary call to MimeInlineTextHTML_remove_plaintext_tag() from mimethsa.cpp. r=me DONTBUILD
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/1509c555770b
Follow-up: Optimisation: only look for closing tag if opening tag is found. r=me DONTBUILD
OK, here I've folded the three patches to apply to beta.
Attachment #8984945 - Flags: review?(mkmelin+mozilla)
> > 3. For the HTMLParsed class, we have a full DOM tree, so it could be removed
> > this way.

> So you suggest to implement this twice??

No, I was suggesting to use DOM manipulation instead of string manipulation.
Yes, but in compose you don't have a DOM. Besides, I fail to see that traversing the DOM recursively looking for <plaintext> and then moving its children to a different parent would be any faster than a string search and inserting of two characters.
Comment on attachment 8984945 [details] [diff] [review]
1464667-combined-patch-for-beta.patch

Review of attachment 8984945 [details] [diff] [review]:
-----------------------------------------------------------------

Didn't retest but it all looks good to me. r=mkmelin
Attachment #8984945 - Flags: review?(mkmelin+mozilla) → review+
You need to log in before you can comment on or make changes to this bug.