Closed Bug 964024 Opened 10 years ago Closed 9 years ago

mozTXTToHTMLConv should never convert content inside <style>, <script>, <head> tags

Categories

(MailNews Core :: Backend, defect)

defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED
Thunderbird 38.0

People

(Reporter: realRaven, Assigned: mkmelin)

References

Details

Attachments

(1 file, 1 obsolete file)

When you create a rule like this in a style tag in the <head> section:

<style type="text/css">
body > p, body > div {
 font-family: Georgia;
 font-size: 12pt;
 color: #790dd8;
}
</style>

the child selectors '>' are replaced with &gt; when sending which invalidates the rules. text in <style> should NOT be parsed like text in any other tags to htmlencode > characters.
I made a test case using the <style> element within the body:

<style>
​ /* green */
​ p > tt { background-image: linear-gradient(to bottom, rgb(230, 240, 163) 0%, rgb(210, 230, 56) 50%, rgb(195, 216, 37) 51%, rgb(219, 240, 67) 100%) !important; border-radius: 0.3em; display: inline-block; border: 1px solid rgb(187, 187, 187); padding: 2px 4px; font-size: 10pt ! important;
​background-color: rgb(230,240,163);
​}
​ /* orange */
​ p tt { background-image: linear-gradient(to bottom, rgba(254,213,178,1) 0%,rgba(241,140,51,1) 46%,rgba(235,113,7,1) 51%,rgba(251,168,95,1) 100%);
​background-color: rgb(254,213,178);
​border-radius: 0.3em; display: inline-block; border: 1px solid rgb(187, 187, 187); padding: 2px 4px; font-size: 10pt ! important;
​}
​</style>

resulting Fixed Width text within a <p> are green while composing. 

Example

<p>this is a <tt>test</tt>.</p>

Once you save the email and preview it in "drafts" they become orange. AFAIK there is no workaround. (?)
A very old problem.
IMO style tags should be handled differently from regular text, but they are not.
See bug 234210 and bug 229122
(In reply to Joe Sabash from comment #2)
> A very old problem.
> IMO style tags should be handled differently from regular text, but they are
> not.
> See bug 234210 and bug 229122

Interesting, these are old indeed. I think any type of html parsing should be disabled when in a <style> or <script> tag. (I don't really use or would want to use script in emails but style is something I use on a daily basis.)

I think this is called as part of  nsMsgComposeAndSend::GetBodyFromEditor

http://mxr.mozilla.org/comm-central/source/mailnews/compose/src/nsMsgSend.cpp#1572

and mangles the complete body of the email (It seems to also affect <head>)

http://mxr.mozilla.org/comm-central/source/mozilla/netwerk/streamconv/converters/mozTXTToHTMLConv.cpp#1214

It is tricky to fix as it (from what I can see) is a very primitive method which generally unescapes all text between <tag></tag> pairs, which obviously affects <style> tags as well. However there is an exception with <a> tags so it might be as easy as extending this to <style> as well. I will see if I can test a patch...
Attached patch bug_964024.patch (obsolete) — Splinter Review
I would also like to request approval‑comm‑beta but it might be too early for this. Don't know what the process is here.

Test case:
create a new email and insert the following HTML:

<style type="text/css">
p > tt {
  color: #dd0000 !important;
  font-weight: bold;
}
</style>
<p>This is a <tt>test</tt> for </p>

---
then save and send. test should remain red. (in TT release the > will be turned to &gt; invalidating the rule
Attachment #8372875 - Flags: review?(neil)
(In reply to Axel Grude)
> When you create a rule like this in a style tag in the <head> section:
> 
> <style type="text/css">
> body > p, body > div {
>  font-family: Georgia;
>  font-size: 12pt;
>  color: #790dd8;
> }
> </style>

The easy workaround is to comment out the style block. (This is supported for compatibility with older browsers.)

<style type="text/css"><!--
body > p, body > div {
 font-family: Georgia;
 font-size: 12pt;
 color: #790dd8;
}
--></style>
Comment on attachment 8372875 [details] [diff] [review]
bug_964024.patch

Well, this looks at least as sane as the existing code.

Notes:
I'm not a peer for this code, so I can only give you f+.

This only does <style>, presumably because nobody needs to use <script>, <noscript> or <noframes> in email?

Your new code copies the mistake the existing code does of not checking for whitespace or ">" after the tag. In particular, <abbr> confuses it...

>+	  else if ((Substring(aInString, i + 1, 5)).LowerCaseEqualsASCII("style"))
>+           // if style tag, skip until </style>
Nit: tabs crept in here, please configure your editor to use spaces for indentation.
Attachment #8372875 - Flags: review?(neil) → feedback+
Yeah we should probably also skip over script, noscript, noframes - and, importantly: head. This bug is basically also fixing bug 234210 i think!

As for tests it should be very similar to http://mxr.mozilla.org/comm-central/source/mozilla/netwerk/test/unit/test_mozTXTToHTMLConv.js
Blocks: 234210
(In reply to Magnus Melin from comment #7)
> Yeah we should probably also skip over script, noscript, noframes - and,
> importantly: head. This bug is basically also fixing bug 234210 i think!
> 
Ok, I will take <head> <frames> <noframes> as well if it is in scope for this bug. Not sure if I even would want to support script in emails?

> As for tests it should be very similar to
> http://mxr.mozilla.org/comm-central/source/mozilla/netwerk/test/unit/
> test_mozTXTToHTMLConv.js
I am not quite familiar with the concept of scripting texts, how are they triggered and do they need to be submitted with the patch?
(In reply to neil@parkwaycc.co.uk from comment #6)
> Comment on attachment 8372875 [details] [diff] [review]
> bug_964024.patch
> 
> Well, this looks at least as sane as the existing code.
> 
> Notes:
> I'm not a peer for this code, so I can only give you f+.

who is? Sorry this is only my second real patch so I will need some guidance what the proper etiquette is.

> 
> This only does <style>, presumably because nobody needs to use <script>,
> <noscript> or <noframes> in email?
> 
> Your new code copies the mistake the existing code does of not checking for
> whitespace or ">" after the tag. In particular, <abbr> confuses it...
> 
I want to fix this and add support for other tags (head, frame, noframes) so I would like to break out the skipping of multi-character tokens into a private function. Something like

private bool skipSection(nsString& aInString, nsString tagName, int& offset, int length);

(I might need some help with the proper parameter types) and then use it within the loop like this:

else if (skipSection(aInString, 'head', i, lengthOfInString)) { ; }
else if (skipSection(aInString, 'style', i, lengthOfInString)) { ; }
else if (skipSection(aInString, 'script', i, lengthOfInString)) { ; }

etc. skipSection would return true if it found the tag and increase i byref accordingly for use byth eouter for loop.

As regards comment #7 Looking at noframes / frames I am not whether they should actually be omitted as they usually contain encodeable stuff...
(In reply to Axel Grude [:realRaven] from comment #8)
> (In reply to Magnus Melin from comment #7)
> > Yeah we should probably also skip over script, noscript, noframes - and,
> > importantly: head. This bug is basically also fixing bug 234210 i think!
> > 
> Ok, I will take <head> <frames> <noframes> as well if it is in scope for
> this bug. Not sure if I even would want to support script in emails?

The module you're touching isn't only about mail but a general module. Script is disabled by the mail content policy anyways, so what you change here doesn't matter.

Same for the frames/noframes. Why waste effort to do things there when it could only go wrong?

> > As for tests it should be very similar to
> > http://mxr.mozilla.org/comm-central/source/mozilla/netwerk/test/unit/
> > test_mozTXTToHTMLConv.js
> I am not quite familiar with the concept of scripting texts, how are they
> triggered and do they need to be submitted with the patch?

Every push to the repository triggers test runs (as shown up on https://tbpl.mozilla.org/). And yes, you should include it in the patch.

So, you write the new test file, hg add it, then add it in mozilla/netwerk/test/unit/xpcshell.ini.
At this point you need a rebuild on some level to get the xpcshell.ini change to get noticed.

The syntax of running the single test has been in flux. Currently for comm-central it should be like this (run from the your objdir)
make -C mozilla xpcshell-tests TEST_PATH=netwerk/test/unit/test_mozTXTToHTMLConv.js

You could ask BenB to review as I think he originally wrote it. (Or then some Necko peer.)
Axel: are you working on this?
(In reply to Magnus Melin from comment #11)
> Axel: are you working on this?

Just having a look now at this again. I might need some help functionalising as I said in Comment 9 as my C++ is very rusty!
(In reply to Magnus Melin from comment #11)
> Axel: are you working on this?

Magnus: I had written some code addressing this but didn't have time to write a test file. I was just reminded of it by Mark who owns the smartTemplate4 addon which I do devlopement work for. Unfortunately I haven't got an awful lot of time to make it 100% generic or fix prvious problems with the code (as pointed out by Neil) as my C++ is just not up to scratch, but it  handles the cases >, frames and noframes IIRC.

Can you help me getting the code landed?
Sure, please attach what you have.
^^^
Flags: needinfo?(axelg)
Magnus Ping
Although neil's workaround suggestion in comment 5 appears to work fine, it is not only counter-intuitive, but a real pain to edit into the composition.
I think the only thing Axel has more to offer is the remarks in comment 9
I think if we can get this fixed for the style tag alone, that fix would be sufficient, or just the dead section.
Flags: needinfo?(axelg)
Having thought about it more, the relevant tags should be <style>, <script>, <head>. frame stuff is irrelevant.
Assignee: nobody → mkmelin+mozilla
Status: NEW → ASSIGNED
Component: Message Compose Window → Backend
OS: Windows 7 → All
Product: Thunderbird → MailNews Core
Hardware: x86_64 → All
Summary: ON send child selector is replaced with &gt; → mozTXTToHTMLConv should never convert content inside <style>, <script>, <head> tags
Attachment #8372875 - Attachment is obsolete: true
Attachment #8540332 - Flags: review?(honzab.moz)
Attachment #8540332 - Flags: feedback?(neil)
Attachment #8540332 - Flags: feedback?(ben.bucksch)
Attachment #8540332 - Flags: feedback?(neil) → feedback+
honza: review ping
Comment on attachment 8540332 [details] [diff] [review]
bug964024_scanHTML_converts_style.patch

Review of attachment 8540332 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/streamconv/converters/mozTXTToHTMLConv.cpp
@@ +1252,5 @@
> +      }
> +      else if (Substring(aInString, i + 1, 5).LowerCaseEqualsASCII("style"))
> +           // if style tag, skip until </style>
> +      {
> +        i = aInString.Find("</style>", true, i);

when this is part of e.g. a content rule or a just in a comment, we will cut there and not at the actual end of the style tag.

could we use our existing html parser to sanitize for us rather?
It's true there can be edge cases not covered, but those should be rare in comparison. 

I don't know what you're proposing, but mozITXTToHTMLConv works with strings only, so a sanitation step would have to happen before calling this code.
Comment on attachment 8540332 [details] [diff] [review]
bug964024_scanHTML_converts_style.patch

Review of attachment 8540332 [details] [diff] [review]:
-----------------------------------------------------------------

Dropping r? until answered.
Attachment #8540332 - Flags: review?(honzab.moz)
Comment on attachment 8540332 [details] [diff] [review]
bug964024_scanHTML_converts_style.patch

Review of attachment 8540332 [details] [diff] [review]:
-----------------------------------------------------------------

Like I said earlier, such sanitation steps have to be done prior to calling mozITXTToHTMLConv. 

For thunderbird we kind of do that, the string to converted is obtained from the m_editor->OutputToString
I just re-tested: if you for instance put a css comment /* test </style> */ the output (that you see) in the editor is broken, since it's not valid html. (Well, broken as in you see the rest of the style rules as <style> ends there.)
Attachment #8540332 - Flags: review?(honzab.moz)
(In reply to Magnus Melin from comment #23)
> Comment on attachment 8540332 [details] [diff] [review]
> bug964024_scanHTML_converts_style.patch
> 
> Review of attachment 8540332 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Like I said earlier, such sanitation steps have to be done prior to calling
> mozITXTToHTMLConv. 
> 
> For thunderbird we kind of do that, the string to converted is obtained from
> the m_editor->OutputToString
> I just re-tested: if you for instance put a css comment /* test </style> */
> the output (that you see) in the editor is broken, since it's not valid
> html. (Well, broken as in you see the rest of the style rules as <style>
> ends there.)

Not sure I follow the comment.  You support my concern or disprove it?

BTW, would be great to rewrite this whole with bug 1024056...
Comment on attachment 8540332 [details] [diff] [review]
bug964024_scanHTML_converts_style.patch

Review of attachment 8540332 [details] [diff] [review]:
-----------------------------------------------------------------

I can't say I'm super happy about this patch, but the code itself is already bad.  So, let's take it in and see what breaks.

r=honzab with the comments addressed.

::: netwerk/streamconv/converters/mozTXTToHTMLConv.cpp
@@ +1268,5 @@
> +        else
> +          i += 9;
> +      }
> +      else if (Substring(aInString, i + 1, 4).LowerCaseEqualsASCII("head") &&
> +               (aInString.CharAt(i + 5) == ' ' || aInString.CharAt(i + 5) == '>'))

this nice check should be in the upper two as well.
Attachment #8540332 - Flags: review?(honzab.moz) → review+
Thx! https://hg.mozilla.org/integration/mozilla-inbound/rev/a8c47d8abcba
Flags: in-testsuite+
Target Milestone: --- → Thunderbird 38.0
No longer blocks: 234210
Attachment #8540332 - Flags: feedback?(ben.bucksch)
https://hg.mozilla.org/mozilla-central/rev/a8c47d8abcba
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
(In reply to Honza Bambas (:mayhemer) from comment #24)
> Not sure I follow the comment.  You support my concern or disprove it?

I agreed it may be a (minor) problem for invalid input, but, that the api is built on string input so I don't see it feasible to do sanitation at this point. For thunderbird we use the serialized dom which is valid, so bad input is not really an issue there.
Tested for style in head and body in Daily 20150202030227
Thanks Magnus for taking this on.
(now I have to brush up on my rusty CSS skills)
Status: RESOLVED → VERIFIED
Depends on: 1274602
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: