Closed Bug 1429491 Opened 6 years ago Closed 2 years ago

A html message isn't rendered correctly because Thunderbird always renders HTML in quirks mode

Categories

(Thunderbird :: Message Reader UI, enhancement)

52 Branch
enhancement
Not set
normal

Tracking

(thunderbird_esr102 wontfix, thunderbird104 wontfix)

RESOLVED FIXED
105 Branch
Tracking Status
thunderbird_esr102 --- wontfix
thunderbird104 --- wontfix

People

(Reporter: u38342, Assigned: mkmelin)

References

Details

Attachments

(8 files, 6 obsolete files)

170.87 KB, message/rfc822
Details
171.00 KB, message/rfc822
Details
149.26 KB, text/html
Details
182.96 KB, text/html
Details
382.20 KB, text/html
Details
86.52 KB, text/html
Details
48 bytes, text/x-phabricator-request
Details | Review
48 bytes, text/x-phabricator-request
Details | Review
This must be a rendering processing.
Attached page isn't displayed correctly.   I subscribe to this email daily, but every time it fails to render.
Attached file 2nd test case
I've taken one of the messages, extracted the HTML part and decoded the quoted-printable. The result is attached. It displays fine in Firefox.

It's a bit of a mystery why the same HTML is not rendered as part of an e-mail.

Neither viewing nor forwarding nor editing "as new" works. In fact, if I edit as new and get the HTML of the message, it's broken, too.
Here's the same test case HTML, only that the HTML is formatted. It displays the same.
I took the HTML from the TB compose window after an "edit as new" and formatted it the same way. It's hugely different.

Boris, sorry to trouble you with this. Any idea on how to tackle this? I'm just puzzled. Unless something is going wrong in the MIME processing, I don't understand why the HTML is so different.

However, when displaying the page, I get a message "Waiting for api.hubapi.com". In TB I see "Loading Message...". So there's something fishy here. I hope it's not Meltdown ;-)
Flags: needinfo?(bzbarsky)
Attachment 8941964 [details] is in quirks mode.  Attachment 8941961 [details] is in standards mode.  If I add "<!DOCTYPE html>" to the top of attachment 8941964 [details], it starts to render fine (or at least like attachment 8941961 [details]).
Flags: needinfo?(bzbarsky)
That was a quick answer :-)

So TB does quirks mode? Or interprets the HTML in the message which had <!DOCTYPE html ...> in quirks mode anyway? Where do we set that? Or maybe I should ask what we'd need to do so the doctype is respected.
> So TB does quirks mode? 

Just like Gecko.  Depends on the doctype, I'd think.

> Or interprets the HTML in the message which had <!DOCTYPE html ...> in quirks mode anyway?

It shouldn't, I would think, but maybe that's happening somehow...

> Or maybe I should ask what we'd need to do so the doctype is respected.

Should happen automatically if you parse it with the HTML parser.

You could check what actual string you end up feeding to the parser.  You could also check whether nsHTMLDocument::SetCompatibilityMode is getting called and if so with what values and what stacks.
Jorg, is someone looking into my questions from comment 7?
Flags: needinfo?(jorgk)
Boris, thanks for following up, this bug has moved down on my to-do list :-(. So currently no one is looking.
Flags: needinfo?(jorgk)
Also please check out the sample .eml attached to bug #1439922.
Thanks for the sample attached to the other bug. I can confirm that it is displayed incorrectly in TB, but when you extract the HTML from the message, that HTML is displayed correctly in FF.

If I remove the first line of the document
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
to trigger "quirks mode", the page is equally badly displayed in FF (attached here).

So bug 1439922 has the same cause as this bug here that TB shows the document in quirks mode despite the <!DOCTYPE html ...>.
OK, I can probably take a look at this if someone gives me clear steps to reproduce.  I have a debug build of Thunderbird with no mail account set up (and a requirement that I not set one up, fwiw).  I have the eml files in this bug.  Now what do I do?
There are two examples attached here and also one in bug 1439922.

You need a local folder to import the messages (drag them onto the folder) and without any account set up, you don't get local folders. But there's a trick. Start TB with -p and create a new profile. Press cancel on the account creation. Enable the menu, and then go to Tools > Account settings. Bottom left, Account Actions, "Add Feed Account". That gives you an empty RSS feed account with no subscriptions. Also Local Folders will be created, so you can drag your messages to Trash or create a test folder.

Let me know if this is unclear. Thanks as always for your help!
> drag them onto the folder

Any other way?  Dragging in my case is pretty guaranteed to not work because I'm forwarding this browser over SSH from a different machine...
Hmm, OK, do as I said to create Local Folders. Then move the sample e-mails (say xx.eml and yy.eml) into Local Folders. Rename them to lose the file extension (say xx and yy), then edit with a text editor and add
  From - Mon Jan 08 01:53:45 2018
as the first line. That's the mbox delimiter. Then open TB, navigate to "Local Folders" and you'll see the folders xx and yy, both with a single message that renders badly.

Well, TB hacking 101 ;-) Let me know if that doesn't work, I'm usually on IRC.
Thanks, that gives me enough to work with.  So here's what we end up feeding into the HTML parser:

(rr) p mInputStream->mReadState.mReadCursor
$37 = 0x7fcb8dfe1000 "<html>\r\n<head>\r\n<title>Good Bacteria Can Cause Changes in Your Genes; Taste Receptor Dogma Challenged; More</title>\r\n<link rel=\"important stylesheet\" href=\"chrome://messagebody/skin/messageBody.css\">\r\n</head>\r\n<body>\r\n<div class=\"moz-text-html\"  lang=\"x-unicode\"><!DOCTYPE html PUBLIC \"-//W3C//DTD XHTML 1.0 Strict//EN\" \"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd\"><!-- start coded_template: i"...

Note the stuff that got inserted before the original HTML and in particular before the original doctype.  That stuff is coming from the mime machinery in Thunderbird.  for example, the <div class="moz-text-html"> is definitely coming from MimeInlineTextHTML_parse_begin in mailnews/mime/src/mimethtm.cpp....

Anyway, the "<html>" without doctype comes from nsMimeHtmlDisplayEmitter::EndHeader with this stack:

#3  0x00007fcbc6795d15 in nsMimeBaseEmitter::UtilityWriteCRLF(char const*) (this=0x7fcba4c1d100, buf=0x7fcbcfe34c45 "<html>") at ../../../../mailnews/mime/emitters/nsMimeBaseEmitter.cpp:408
#4  0x00007fcbc67997b6 in nsMimeHtmlDisplayEmitter::EndHeader(nsTSubstring<char> const&) (this=0x7fcba4c1d100, name=...) at ../../../../mailnews/mime/emitters/nsMimeHtmlEmitter.cpp:333
#5  0x00007fcbc6774136 in mimeEmitterEndHeader(MimeDisplayOptions*, MimeObject*) (opt=0x7fcba54bcc00, obj=0x7fcba6cfea00) at ../../../../mailnews/mime/src/mimemoz2.cpp:1833
#6  0x00007fcbc677a720 in MimeMessage_write_headers_html(MimeObject*) (obj=0x7fcba6cfea00)
    at ../../../../mailnews/mime/src/mimemsg.cpp:821
#7  0x00007fcbc67797d6 in MimeMessage_close_headers(MimeObject*) (obj=0x7fcba6cfea00)
    at ../../../../mailnews/mime/src/mimemsg.cpp:439
#8  0x00007fcbc677912a in MimeMessage_parse_line(char const*, int32_t, MimeObject*) (aLine=0x7fcb8dca0c00 "\n\n", aLength=1, obj=0x7fcba6cfea00) at ../../../../mailnews/mime/src/mimemsg.cpp:257
#9  0x00007fcbc674ff1d in convert_and_send_buffer(char*, int, bool, int32_t (*)(char*, uint32_t, void*), void*) (buf=0x7fcb8dca0c00 "\n\n", length=1, convert_newlines_p=true, per_line_fn=
    0x7fcbc6778be9 <MimeMessage_parse_line(char const*, int32_t, MimeObject*)>, closure=0x7fcba6cfea00)
    at ../../../../mailnews/mime/src/mimebuf.cpp:152
#10 0x00007fcbc67501db in mime_LineBuffer(char const*, int32_t, char**, int32_t*, uint32_t*, bool, int32_t (*)(char*, uint32_t, void*), void*) (net_buffer=0x7fcb8d7079a5 "\r\n------=_Part_1887964_1851278787.1515586927164\r\nContent-Type: text/plain; charset=utf-8\r\nContent-Transfer-Encoding: quoted-printable\r\n\r\nCells are crowded with macromolecules, which limits the diffusion of protei=\r\nns, especially in prokaryotic cells without active transport in the cytopla=\r\nsm.\r\n\r\nGuide to Selecting In Vitro Test Systems for Pre-Clinical Studies (http://e=\r\nmail.labxmediagroup.co"..., net_buffer_size=168434, bufferP=0x7fcba6cfea38, buffer_sizeP=0x7fcba6cfea48, buffer_fpP=0x7fcba6cfea50, convert_newlines_p=true, per_line_fn=
    0x7fcbc6778be9 <MimeMessage_parse_line(char const*, int32_t, MimeObject*)>, closure=0x7fcba6cfea00)
    at ../../../../mailnews/mime/src/mimebuf.cpp:238

This happens way before we actually start reading any of the HTML data.  That happens with this stack:

#0  0x00007fcbc6ad0a2c in NS_NewLocalFileInputStream(nsIInputStream**, nsIFile*, int, int, int) (result=0x7fcb8dfd0268, file=0x7fcb8dfcfe80, ioFlags=-1, perm=-1, behaviorFlags=0)
    at /home/bzbarsky/mozilla/comm-central/mozilla/netwerk/base/nsNetUtil.cpp:113
#1  0x00007fcbc6780894 in MimePartBufferRead(MimePartBufferData*, int (*)(char const*, int, void*), void*) (data=0x7fcb8dfd0250, read_fn=0x7fcbc676c014 <MimeLeaf_parse_buffer(char const*, int32_t, MimeObject*)>, closure=0x7fcb9e5899d0) at ../../../../mailnews/mime/src/mimepbuf.cpp:267
#2  0x00007fcbc676d1f4 in MimeMultipartAlternative_display_cached_part(MimeObject*, MimeHeaders*, MimePartBufferData*, bool) (obj=0x7fcb8dfd6c00, hdrs=0x7fcb8df63f80, buffer=0x7fcb8dfd0250, do_display=true)
    at ../../../../mailnews/mime/src/mimemalt.cpp:554
#3  0x00007fcbc676c77c in MimeMultipartAlternative_flush_children(MimeObject*, bool, priority_t) (obj=0x7fcb8dfd6c00, finished=true, next_priority=PRIORITY_UNDISPLAYABLE) at ../../../../mailnews/mime/src/mimemalt.cpp:248
#4  0x00007fcbc676c849 in MimeMultipartAlternative_parse_eof(MimeObject*, bool) (obj=0x7fcb8dfd6c00, abort_p=false) at ../../../../mailnews/mime/src/mimemalt.cpp:271
#5  0x00007fcbc6753e75 in MimeContainer_parse_eof(MimeObject*, bool) (object=0x7fcba6cfea00, abort_p=false)
    at ../../../../mailnews/mime/src/mimecont.cpp:108
#6  0x00007fcbc6779d17 in MimeMessage_parse_eof(MimeObject*, bool) (obj=0x7fcba6cfea00, abort_p=false)
    at ../../../../mailnews/mime/src/mimemsg.cpp:565

when we finish with the MIME-parsing process.  :(

I haven't thought of a sane fix for this yet.  I suspect putting all mail in standards mode will cause problems as well.  But at the time when we're doing nsMimeHtmlDisplayEmitter::EndHeader we have no idea whether we should be in standards mode or not....
Summary: A html message isn't rendered correctly (release 52.5.2) → A html message isn't rendered correctly (release 52.5.2) because Thunderbird always renders HTML in quirks mode
Attached patch 1429491-fix-quirks.patch (obsolete) — Splinter Review
Just since I was working on MIME I remembered this bug. I've taken all the MIME prelude out, the HTML now looks pretty original in the Dev Tools inspector, yet, it still seems to be rendered in quirks mode. Tested with attachment 8941493 [details].

Boris, do you have further cycles for this?
Attachment #8979313 - Flags: feedback?(bzbarsky)
The dev tools inspectors is not really going to tell you anything about what the parser is getting.

You want to breakpoint in nsHtml5StreamParser::DoDataAvailable and examine the incoming data.  I might be able to do that sometime, but I have no idea when.  Given that you already have a tree with that patch built, it might be simpler for you to do it.
Flags: needinfo?(jorgk)
I prefer a print to using the debugger:
for (uint32_t i = 0; i<aLength;i++) printf("%c", aBuffer[i]);

I see this:
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<!-- start coded_template: id:4652387406 path:Custom/email/responsive_email/TN_Simple_Email_Template_TN_Styles.html -->

I know where the first line comes from, let me kill it as well.
Flags: needinfo?(jorgk)
Dupe of bug 335499 I think.
Exactly. You suggested a patch in attachment 696972 [details] [diff] [review]. But that always switched to standards mode, which Boris didn't like either, see last paragraph of comment #17.

How about we remove the MIME junk? Let me refresh my patch in a minute.
Attached patch 1429491-fix-quirks.patch (v2) (obsolete) — Splinter Review
With this patch, the test message displays fine. It removes three layers of MIME junk:

1) <link rel="important stylesheet" href="chrome://messagebody/skin/messageBody.css">
2) <meta http-equiv="content-type" content="text/html; charset=...">
3) <div class="moz-text-html"  lang="...">

1) Is most likely necessary or smileys won't work, Richard?
2) Is nonsense with about 99.999% probability. Until today, that
   was http-equiv="Context-Type" (note Conte*x*t) and a no-op.
   We fixed the typo but it's still unnecessary. See bug 1463133.
3) Is nonsense with about 99.999% probability. The value of lang is set to
   a writing system name, so we see "x-western" or "x-unicode".

Boris, would you happen to know whether those values have any positive effect? I removed the lang attribute experimentally in bug 1463133 and it didn't make a difference.
Attachment #8979358 - Flags: feedback?(richard.marti)
Attachment #8979358 - Flags: feedback?(mkmelin+mozilla)
Comment on attachment 8979358 [details] [diff] [review]
1429491-fix-quirks.patch (v2)

Review of attachment 8979358 [details] [diff] [review]:
-----------------------------------------------------------------

messageBody.css is needed yeah, so we can add our out styles

content-type - yes remove that junk

I'm not sure about the lang stuff. Maybe junk, but it's a bit hard to tell. I'd test in particular with chinese/japanese mails and see if it makes any difference.
Attachment #8979358 - Flags: feedback?(richard.marti)
Attachment #8979358 - Flags: feedback?(mkmelin+mozilla)
(In reply to Magnus Melin from comment #25)
> messageBody.css is needed yeah, so we can add our out styles
But there must be a smarter way to inject this without destroying the original message.
Attachment #8979313 - Attachment is obsolete: true
Attachment #8979313 - Flags: feedback?(bzbarsky)
If you want styles you have to inject the styles into the document. The usual way would be to do it in javascript. This is just hardcoding that... 

Anyway, this code does so much ugly stuff it's quite questionable if it's useful to fix a wart or two.
(In reply to Magnus Melin from comment #27)
> If you want styles you have to inject the styles into the document. The
> usual way would be to do it in javascript. This is just hardcoding that...
Yes, we've done this quite a bit in JS during overlay removal.

> Anyway, this code does so much ugly stuff it's quite questionable if it's
> useful to fix a wart or two.
I don't understand the comment. We need to remove the three warts to get proper HTML display. Two will hopefully go in bug 1463133. So we worry about the third one here.
2) <meta http-equiv="content-type" content="text/html; charset=...">
3) <div class="moz-text-html"  lang="...">
have been removed in bug 1463133, so now we need to worry about
1) <link rel="important stylesheet" href="chrome://messagebody/skin/messageBody.css">
which comes from nsMimeHtmlEmitter.cpp.

I think that needs to go and be replaced with CSS injection using createProcessingInstruction(). The problem is where to put that call.
Probably not createProcessingInstruction.
But couldn't you put the code here, when you have the document https://dxr.mozilla.org/comm-central/rev/18881dd127e3b0c0d3f97390c9094e309d4dd9c1/mailnews/mime/src/mimeTextHTMLParsed.cpp#96

Create a <link rel="important stylesheet" href="chrome://messagebody/skin/messageBody.css"> node and insert that node into document head.
Good idea!
Magnus, you want to take the bug since you've already worked on bug 335499?
Before I try for hours, maybe Boris can give us a hint how to best achieve this.

Here's the story. Our MIME engine parses the HTML message part and serialises it out here:
https://dxr.mozilla.org/comm-central/rev/18881dd127e3b0c0d3f97390c9094e309d4dd9c1/mailnews/mime/src/mimeTextHTMLParsed.cpp#88
BTW, mozilla::dom::DOMParser::CreateWithoutGlobal() has just become very important ;-)

What's the quickest way to find the document head? And how to insert a link? I've just looked and we usually create nodes in the editor with |htmlEditor->CreateElementWithDefaults();|, but that doesn't apply here.
Flags: needinfo?(bzbarsky)
This should work:

    nsCOMPtr<nsIDocument> document = parser->ParseFromString(...);
    RefPtr<Element> link = document->CreateElem(NS_LITERAL_STRING("link"),
                                                nullptr, kNameSpaceID_XHTML);
    link->SetAttr(kNameSpaceID_None, nsGkAtoms::rel, 
                  NS_LITERAL_STRING("important stylesheet"), true);
    link->SetAttr(kNameSpaceID_None, nsGkAtoms::href, 
                  NS_LITERAL_STRING("chrome://messagebody/skin/messageBody.css"),
                  true);
    RefPtr<Element> head = document->GetHeadElement();
    head->AppendChildTo(link, true);
Flags: needinfo?(bzbarsky)
Alternately, of course, you could do:

  nsCOMPtr<nsIDocument> document = parser->ParseFromString(...);
  RefPtr<Element> head = document->GetHeadElement();
  head->InsertAdjacentHTML(NS_LITERAL_STRING("beforeend"),
                           NS_LITERAL_STRING("<link rel='stuff' href='whatever'>"),
                           rv2);

and then handle rv2 failing.
Thanks, Boris!

This works nicely for "normal HTML" view, but things like smileys are now missing from "simple HTML" view since the stylesheet was removed for all HTML views.

I'll have to see how to put that back for the simple view.
Attachment #8979358 - Attachment is obsolete: true
Attached patch 1429491-fix-quirks.patch (v4) (obsolete) — Splinter Review
This works nicely.

Boris, I don't know how much interest you have in this bug, but since you even did some debugging, you must be interested ;-) - So f?bz in case you want to look over the code, which is mostly your own, or try it.
Attachment #8980109 - Attachment is obsolete: true
Attachment #8980110 - Flags: review?(mkmelin+mozilla)
Attachment #8980110 - Flags: feedback?(bzbarsky)
Comment on attachment 8980110 [details] [diff] [review]
1429491-fix-quirks.patch (v4)

>+      12);

Would probably be better to use a named literal string and use its .Length() here.
Attachment #8980110 - Flags: feedback?(bzbarsky) → feedback+
Attached patch 1429491-fix-quirks.patch (v4b) (obsolete) — Splinter Review
OK, now using NS_NAMED_LITERAL_CSTRING().
Attachment #8980110 - Attachment is obsolete: true
Attachment #8980110 - Flags: review?(mkmelin+mozilla)
Attachment #8980118 - Flags: review?(mkmelin+mozilla)
Rebased for bug 1464391.
Attachment #8980571 - Flags: review?(mkmelin+mozilla)
Assignee: nobody → jorgk
Status: NEW → ASSIGNED
Attachment #8980118 - Attachment is obsolete: true
Attachment #8980118 - Flags: review?(mkmelin+mozilla)
Attachment #8980571 - Attachment is obsolete: true
Attachment #8980571 - Flags: review?(mkmelin+mozilla)
Let's fix this as part of bug 1464391.
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
After lots of discussion, quirks mode will be with us for a while longer.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Assignee: jorgk → nobody
Status: REOPENED → NEW

I think we need to join the club:

https://www.emailonacid.com/blog/article/email-development/12_things_you_must_know_when_developing_for_gmail_and_gmail_mobile_apps-2/#doctype

If you specify a DOCTYPE other than HTML5 in your email, you’ll find that it won’t render the same way in Gmail as it does in a browser or in an email client that respects your DOCTYPE. This is because Gmail renders all emails using the HTML5 DOCTYPE.
This is an issue that is not exclusive to Gmail. Many other email clients force HTML5, including Yahoo! Mail, Outlook.com, and Yandex on Mobile and Desktop Webmail; Inbox and Yahoo! Mail on iOS; and Inbox on Android. Apple Mail and Outlook both support whatever DOCTYPE you want to use, but since most other email clients support only HTML5, it’s best to just stick with the HTML5 DOCTYPE for your emails.

Blocks: 1744944
OS: Windows 10 → All
Hardware: x86 → All
Summary: A html message isn't rendered correctly (release 52.5.2) because Thunderbird always renders HTML in quirks mode → A html message isn't rendered correctly because Thunderbird always renders HTML in quirks mode
Assignee: nobody → mkmelin+mozilla
Status: NEW → ASSIGNED
Attachment #9288992 - Attachment description: Bug 1429491 - Use standards mode for emails. r=#thunderbird-reviewers → Bug 1429491 - Use standards mode for emails. r=darktrojan
Target Milestone: --- → 105 Branch

Pushed by geoff@darktrojan.net:
https://hg.mozilla.org/comm-central/rev/46ddd8651c31
Use standards mode for emails. r=darktrojan

Status: ASSIGNED → RESOLVED
Closed: 6 years ago2 years ago
Resolution: --- → FIXED

Pushed by mkmelin@iki.fi:
https://hg.mozilla.org/comm-central/rev/3ef4b8ad53a4
follow-up - Fix some broken tests. r=mkmelin

See Also: → 1857161
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: