Closed Bug 1462895 Opened 6 years ago Closed 6 years ago

In "original HTML" and "simple HTML" view the body of the message is placed outside <div class=moz-text-html>

Categories

(MailNews Core :: MIME, defect)

defect
Not set
major

Tracking

(thunderbird_esr52 fixed, thunderbird_esr60 fixed, thunderbird60 fixed, thunderbird61 wontfix, thunderbird62 fixed)

RESOLVED FIXED
Thunderbird 62.0
Tracking Status
thunderbird_esr52 --- fixed
thunderbird_esr60 --- fixed
thunderbird60 --- fixed
thunderbird61 --- wontfix
thunderbird62 --- fixed

People

(Reporter: jorgk-bmo, Assigned: jorgk-bmo)

References

Details

(Keywords: regression)

Attachments

(1 file, 7 obsolete files)

Attached image simple HTML not in div tag.png (obsolete) —
Working on bug 1419417 we noticed that the message body of messages when viewed in "Simple HTML" view is placed outside the <div class=moz-text-html> tag. This is wrong and as a consequence, CSS is not applied properly to the body, including margins and spacing.
Attached patch Move </div> to end (for trunk) (obsolete) — Splinter Review
Assignee: nobody → ben.bucksch
Status: NEW → ASSIGNED
Comment on attachment 8979189 [details] [diff] [review]
Move </div> to end (for trunk)

Review of attachment 8979189 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for the research, I'll try it after lunch.

::: mailnews/mime/src/mimethsa.cpp
@@ +120,1 @@
>                               resultCStr.BeginWriting(),

Nit: Indentation a little questionable here, no?
(just add linebreaks in commit message)
Attachment #8979189 - Attachment is obsolete: true
> Nit: Indentation a little questionable here, no?

That's existing code and formatting. It was there before.

Please consider that it look me 1.5 hours (!) of work just to create these patches in a form that you like. Not coding, just to make the diffs for you in the form that you requested. Each change, patch and review round is very expensive. I do not want to make such changes for trivial, significant matters like whitespace. Esp. for existing code.

As we discussed on chat, we can revisit and fix the style of the entire libmime module. But after discussion, and as a separate, single commit for the entire module.

Wikipedia "Atomic commit": "...becomes especially important when making format changes to the source code. If format and functional changes are combined it becomes very difficult to identify useful changes."
Ben, in C-C and M-C we adjust indentation if the surrounding code changes, some recent examples:
https://hg.mozilla.org/comm-central/rev/f378ff5bbc894ff9863dec27d27f99947726f096#l3.12
https://hg.mozilla.org/mozilla-central/rev/c58f0b4dd849#l143.180

That's independent of how much time you spent fighting with HG.

So there are a few options:
1) I make the change.
2) You make the change.
3) It's technically an r- and we can't move forward.
4) We keep discussing the issue.

Please authorise 1) or make 2) yourself.
Sadly this doesn't work at all. I tested this with attachment 8975458 [details] from bug 1419417 and now the body doesn't show any more at all.

I fixed the indentation I requested and also the commit message. I would immensely appreciate not having to make those changes again.
Attachment #8979193 - Attachment is obsolete: true
Attachment #8979203 - Flags: review-
Richard, looking at this again, class moz-text-html doesn't appear to have any CSS. So while the DOM is wrong, even if it were right, no CSS would apply. What am I missing?
Flags: needinfo?(richard.marti)
Checked with the test message and without a patch from bug 1419417 and the standard CSS (html.css) is applied. But with "Original HTML" I see only the text "Please answer to this email ..." and no attachment lines. With "Simple HTML" I see no text but the two attachment lines.

Checking with inspector I find the message text in "Simple HTML" nowhere.

Do I need a patch from bug 1419417 for this testing? If yes, which should I use.
Flags: needinfo?(richard.marti)
Richard, it's all very confusing right now. I just wanted to confirm that no CSS is defined for moz-text-html.
No special CSS is applied except the default html.css.
Attached file multi-html.eml (obsolete) —
The patch works OK for a simple HTML mail, but this message here doesn't display anything but horizontal separators. Now the patch isn't right.
Now that bug 1419417 has landed, the HTML is placed incorrectly in the DOM for the "original HTML" and "simple HTML" views. We should fix that since it shows some logic error somewhere in MIME processing.

Ben, will you be continuing to work on this bug? If so, please try your patch on the message in attachment 8979247 [details] and attachment 8979258 [details]. Or do a try run with your patch. If not, please unassign yourself.
Summary: In "simple HTML" view the body of the message is placed outside <div class=moz-text-html> → In "original HTML" and "simple HTML" view the body of the message is placed outside <div class=moz-text-html>
Also, if you look at bug 1429491 comment #17 you will see that inserting stuff before the original HTML of the message is absolutely fatal since all the HTML we add leads to all e-mail being rendered in quirks mode. Is we want to correct this, we must stop inserting anything.

So maybe instead of deliberating how to get the closing </div> into position, we should scrap the <div> tag altogether, at least for "original" HTML.

I forgot the NI: Ben, you'll be continuing here?
Flags: needinfo?(ben.bucksch)
We'll remove that <div> in bug 1429491, so no action here.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Flags: needinfo?(ben.bucksch)
Resolution: --- → WONTFIX
Attached patch call-superclass-eof-later.patch (obsolete) — Splinter Review
For the record: This is Ben's suggestion to address the issue. It's is the difference between versions 4 and 5 of his patches from bug 1419417. Version 4 was landed since version 5 doesn't work. Some messages are not rendered correctly at all, like multi-html.eml, others, like a simple multipart/related (HTML+embedded image) crash in debug mode on assert. Tests also crash on assert.
I think this is generally the correct approach, yes. I might be wrong, but there might simply be a detail problem that's causing this. Could you look into the asserts, what's going on?
No I won't. You caused this regression in bug 1419417, so please fix it.
OK, seems like I was wrong, then.
Thanks for finding this.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Severity: normal → critical
Status: REOPENED → ASSIGNED
Here are a few ideas how to fix this:
* In HTMLParsed (which is the main problem we have), you have a DOM tree. You can add the <div> to the DOM
  after parsing and before serializing. That allows to add it in the proper place in the HTML document.
* You could try to add a ParseEnd function to TextHTML base class and move the closing </div> there
  instead of the ParseEOF(). I don't know whether that would work.
* You remove the <div> from the TextHTML superclass and put it in the HTMLParsed and HTMLSanitized subclassed.
  Not nice, but a quick fix and should work, because you control the order there.
Assignee: ben.bucksch → nobody
Severity: critical → major
Status: ASSIGNED → NEW
Hint: If you can part with the <div>, there is code in attachment 8980767 [details] [diff] [review] that will apply the lang attribute directly to the body or html tags. That code is tested and working. Alternatively a hack along the lines of
https://dxr.mozilla.org/comm-central/rev/51678c5b0b3edd827c57a70f1c5baf02d44acead/mailnews/mime/src/mimethpl.cpp#65
might do the trick.

I'll leave it to the assignee to figure something out.
Attachment #8980843 - Attachment is obsolete: true
> If you can part with the <div>, there is code in attachment 8980767 [details] [diff] [review] [diff] [review] that will
> apply the lang attribute directly to the body or html tags. That code is tested and working.

FWIW, I had rejected that patch because of the stylesheet changes, which break a lot of stuff, not because of the lang attribute, which was a good idea. Aside from the stylesheet, I think that patch is neat and a good start.

Instead of putting the lang into the <body>, you could probably simply inject a <div> between the <body> and its child elements. That's what I meant with point 1 in comment 21. That would be the best possible fix.
(In reply to Ben Bucksch (:BenB) from comment #23)
> Instead of putting the lang into the <body>, you could probably ...
You mean "..., one could probably ...".

Looking forward to reviewing the patch. Thanks for taking the bug ;-)
Oh, I missed that, you unassigned yourself. That's a pity, so who's going to fix *your* regression now?
Assignee: nobody → jorgk
Status: NEW → ASSIGNED
Attachment #8979203 - Attachment is obsolete: true
Attached patch 1462895-fix-language-div.patch (obsolete) — Splinter Review
Note that looking for <body> doesn't work since we need to cater for
  <body text="#000000" bgcolor="#FFFFFF">

This took 30 minutes to write, less time than the discussion around it :-(
Attachment #8979013 - Attachment is obsolete: true
Attachment #8979247 - Attachment is obsolete: true
Attachment #8980881 - Flags: review?(mkmelin+mozilla)
Attachment #8980881 - Flags: review?(acelists)
Comment on attachment 8980881 [details] [diff] [review]
1462895-fix-language-div.patch

Review of attachment 8980881 [details] [diff] [review]:
-----------------------------------------------------------------

::: mailnews/mime/src/mimethtm.cpp
@@ +186,5 @@
> +  if (obj->options->format_out != nsMimeOutput::nsMimeMessageBodyDisplay &&
> +      obj->options->format_out != nsMimeOutput::nsMimeMessagePrintOutput)
> +    return;
> +
> +  // Insert <div class="moz-text-html" lang="..."> for the following two purposes:

but why not put the class="moz-text-html" lang="..." directly on the body?

@@ +193,5 @@
> +  //    which font to use.
> +  // Make sure we have a <body> before we start.
> +  int32_t index = message.Find("<body");
> +  if (index == kNotFound)
> +    return;

shouldn't this at least warn?
|message| is assumed to be a serialized document, and should have a body

@@ +216,5 @@
> +  }
> +
> +  index = message.RFind("</body>");
> +  if (index != kNotFound)
> +    message.Insert(NS_LITERAL_CSTRING("</div>"), index);

can you explain what the logic is here?

when do we not have a </body>? should it warn (or similar)?
(In reply to Magnus Melin from comment #27)
> > +  // Insert <div class="moz-text-html" lang="..."> for the following two purposes:
> but why not put the class="moz-text-html" lang="..." directly on the body?
Ben wanted a div. So we're maintaining existing behaviour. To give the body a class is a strange idea from a web developer point of view, no? I'd rather not do that.

> shouldn't this at least warn?
Who would care?

> |message| is assumed to be a serialized document, and should have a body
Maybe not, if the message was empty? I don't know, but nsIDocument::GetBodyElement() can return null, so better safe than sorry.

> > +  index = message.RFind("</body>");
> > +  if (index != kNotFound)
> > +    message.Insert(NS_LITERAL_CSTRING("</div>"), index);
> can you explain what the logic is here?
Umm, we look for the last </body> tag in the serialised output and insert a </div> before it, just like we inserted a <div> after the <body> tag. Looks pretty clear to me.

> when do we not have a </body>? should it warn (or similar)?
Again, no one will see the warning.
(In reply to Jorg K (GMT+2) (bustage-fix and urgent reviews only) from comment #28)
> To give the body a class is a strange idea from a web developer point of view, no? 

I don't see anything strange with that. Semantically its' more correct since you don't mess with the original intent. You can even easily be breaking css rules of the sender's since you modify the document to add a <div>. (Though it might not be that important, since <style> is pretty unsupported in mail clients.)

 
> > shouldn't this at least warn?
> Who would care?

Someone wondering why the message disappeared? :)
Anyway, please at least document it very clearly that <body> must be included.
Comment on attachment 8980881 [details] [diff] [review]
1462895-fix-language-div.patch

Review of attachment 8980881 [details] [diff] [review]:
-----------------------------------------------------------------

I'll let you decide how much of the comments you want to address. r=mkmelin
Attachment #8980881 - Flags: review?(mkmelin+mozilla)
Attachment #8980881 - Flags: review?(acelists)
Attachment #8980881 - Flags: review+
Other MIME classes add <div>s, for example:

<body>
<div class="moz-text-plain" wrap="true" graphical-quote="true" style="font-family: -moz-fixed; font-size: 14px;" lang="x-unicode">

Please file a follow-up to correct the lot.
Addressed some comments ;-)
Attachment #8980881 - Attachment is obsolete: true
Attachment #8980970 - Flags: review+
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/a4a57af137c4
fix insertion of <div class="moz-text-html" lang="...">. r=mkmelin
Status: ASSIGNED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 62.0
Thanks, Jörg, for picking this up.

As to the patch here, I think this string replacement here is too fragile. I'm particularly worried about cases where we might not have any body, or 2 body elements.

We have a DOM tree, why not use that to insert the <div> properly? I think your attachment 8980767 [details] [diff] [review] was a good start, see comment 23.
Duh, edit conflict, you landed just in the moment while I was typing my comment.
> Other MIME classes add <div>s, for example:
> <body>
> <div class="moz-text-plain" wrap="true" graphical-quote="true" style="font-family: -moz-fixed; font-size: 14px;" lang="x-unicode">
> Please file a follow-up to correct the lot.

Please CC me, if you file bugs. But the above <div> looks correct to me. If you modify that, you are likely to break some things that you don't even know exist.
(In reply to Ben Bucksch (:BenB) from comment #34)
> As to the patch here, I think this string replacement here is too fragile.
> I'm particularly worried about cases where we might not have any body, or 2
> body elements.
For the sanitised case we don't have a DOM, so we need string manipulation anyway (of which MIME does heaps already).
Shuffling all the content of the body into a div is a lot of work on a DOM level, but only a few lines in string manipulation. The string passed in is always result of a Gecko serialisation or sanitisation, so if should have one or no body element. If there is no body, we obviously don't have to insert anything.
 
> We have a DOM tree, why not use that to insert the <div> properly?
See above, a lot of manipulation inserting a <div> and then shuffling the <body>'s content into it.
> For the sanitised case we don't have a DOM

Yes. You'd need the string messing in that case. But at least the normal mode would be saved.

> > We have a DOM tree, why not use that to insert the <div> properly?

> See above

Where above? Which comment?

Just create a new DIV, iterate over body children, move them to the <div>, then put the <div> underneath <body>.

Anyway, that was a suggestion to write this in a clean way. Your patch fixes the bug, that's nice.

I simply worry that there are cases where the HTML string manipulation fails, e.g. when there is no body or more than 1 body.
(In reply to Ben Bucksch (:BenB) from comment #39)
> > See above
> Where above? Which comment?
Same comment, just the paragraph above.

> Just create a new DIV, iterate over body children, move them to the <div>,
> then put the <div> underneath <body>.
Exactly. And how error prone is that and how many CPU cycles does this burn compared to a simple string insertion?
That's much safer than the string messing, and it's a simple DOM manipulation.

Anyway... If your fix works in all cases (it appears it does), it's OK. Thanks for the fix.
Pushed by mozilla@jorgk.com:
https://hg.mozilla.org/comm-central/rev/e48189835da7
Follow-up: Use FindChar() instead of Find(). r=me
https://hg.mozilla.org/releases/comm-beta/rev/1ca7373fac5894d1075289ca65972eba2f217e48

This was actually quite fatal:
Find(string, ignoreCase, offset) was called with Find(string, offset) and no one noticed :-(
Using FindChar(char, offset) is actually right now.
So in effect, the <div> got inserted after the <html> tag, not the <body> tag, but apparently Gecko didn't mind.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: