Closed Bug 528736 Opened 15 years ago Closed 6 years ago

problem displaying multipart mixed with multipart alternative part

Categories

(MailNews Core :: MIME, defect)

x86
Windows NT
defect
Not set
normal

Tracking

(blocking-thunderbird3.1 -)

RESOLVED WORKSFORME
Tracking Status
blocking-thunderbird3.1 --- -

People

(Reporter: Bienvenu, Unassigned)

References

Details

(Keywords: regression, testcase)

Attachments

(2 files)

The attached message displays as entire message source instead of correct html part. This is trunk only - TB 2.0 and 3.0rc1 all display the html part correctly.
viewing message as simple html or plain text also works fine.
Assignee: nobody → bienvenu
Flags: blocking-thunderbird3.1+
Tested with next Tb trunk build. 
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.3a1pre) Gecko/20091114 Shredder/3.1a1pre

(A) With attached test mail, View/Message Body As/Original HTML.
Next error in Error Console. Mail source is displayed.
> Warning: Expected declaration but found '3D'.  Skipped to next declaration.
> Source File: mailbox:///C|/wada/@@@/Mail1/bug-528736.txt?number=0 Line: 0

(B) Commenting out of <META> tag for http-equiv=Content-Type.
> <!--
> <META content=3Dtext/html;charset=3Diso-8859-1 =
> http-equiv=3DContent-Type>
> -->
No problem occurs. HTML is rendered as expected.

Problem like next?
(1) charset check of <META> http-equiv=Content-Type seems to be executed without decode of quoted-printable.
(2) If error occurs during charset check of <META> tag, phenomenon similart to Bug 506504 occurs.
Note: If charset of <META> is processed normally, Bug 513249 occurrs. See also Bug 508946 Comment #1.
Could it be that meta tag detection is causing a reload, confusing MIME?
Is there an easy way to turn it off that I could test?
> Could it be that meta tag detection is causing a reload

Only if something is really broken, since there's a Content-Type header for the same type right there.

What's the regression range?  That would help more than anything else in identifying what changed...
We're definitely reloading the message.

>	msglocal.dll!nsMailboxProtocol::LoadUrl(nsIURI * aURL=0x081f09f0, nsISupports * aConsumer=0x00000000)  Line 436	C++
 	msgbsutl.dll!nsMsgProtocol::AsyncOpen(nsIStreamListener * listener=0x068ff990, nsISupports * ctxt=0x00000000)  Line 602 + 0x1f bytes	C++
 	docshell.dll!nsURILoader::OpenURI(nsIChannel * channel=0x09655f34, int aIsContentPreferred=0, nsIInterfaceRequestor * aWindowContext=0x06ba4248)  Line 840 + 0x19 bytes	C++
 	docshell.dll!nsDocShell::DoChannelLoad(nsIChannel * aChannel=0x09655f34, nsIURILoader * aURILoader=0x044b0bc8, int aBypassClassifier=0)  Line 8516 + 0x41 bytes	C++
 	docshell.dll!nsDocShell::DoURILoad(nsIURI * aURI=0x081f09f0, nsIURI * aReferrerURI=0x00000000, int aSendReferrer=1, nsISupports * aOwner=0x076b1fa8, const char * aTypeHint=0x001ce864, nsIInputStream * aPostData=0x00000000, nsIInputStream * aHeadersData=0x00000000, int aFirstParty=1, nsIDocShell * * aDocShell=0x00000000, nsIRequest * * aRequest=0x001ce7dc, int aIsNewWindowTarget=0, int aBypassClassifier=0, int aForceAllowCookies=0)  Line 8362 + 0x29 bytes	C++
 	docshell.dll!nsDocShell::InternalLoad(nsIURI * aURI=0x081f09f0, nsIURI * aReferrer=0x00000000, nsISupports * aOwner=0x076b1fa8, unsigned int aFlags=0, const wchar_t * aWindowTarget=0x00000000, const char * aTypeHint=0x001ce864, nsIInputStream * aPostData=0x00000000, nsIInputStream * aHeadersData=0x00000000, unsigned int aLoadType=67108866, nsISHEntry * aSHEntry=0x00000000, int aFirstParty=1, nsIDocShell * * aDocShell=0x00000000, nsIRequest * * aRequest=0x00000000)  Line 8046 + 0x8c bytes	C++
 	docshell.dll!nsDocShell::Reload(unsigned int aReloadFlags=1024)  Line 3941 + 0x62 bytes	C++
 	docshell.dll!nsDocShell::ReloadDocument(const char * aCharset=0x001cec2c, int aSource=10)  Line 11006 + 0x17 bytes	C++
 	chardet.dll!nsObserverBase::NotifyDocShell(nsISupports * aDocShell=0x06ba42c8, nsISupports * aChannel=0x0882ab54, const char * charset=0x001cec2c, int source=10)  Line 81 + 0x2a bytes	C++
 	chardet.dll!nsMetaCharsetObserver::Notify(nsISupports * aDocShell=0x06ba42c8, nsISupports * aChannel=0x0882ab54, const nsTArray<nsString> * keys=0x001ceedc, const nsTArray<nsString> * values=0x001ceee8)  Line 305 + 0x28 bytes	C++
 	chardet.dll!nsMetaCharsetObserver::Notify(nsISupports * aDocShell=0x06ba42c8, nsISupports * aChannel=0x0882ab54, const wchar_t * aTag=0x650c52e8, const nsTArray<nsString> * keys=0x001ceedc, const nsTArray<nsString> * values=0x001ceee8, const unsigned int aFlags=0)  Line 151 + 0x1e bytes	C++
 	htmlpars.dll!nsObserverEntry::Notify(nsIParserNode * aNode=0x07e50478, nsIParser * aParser=0x0c297068, nsISupports * aDocShell=0x06ba42c8, const unsigned int aFlags=0)  Line 1105 + 0x43 bytes	C++
 	gklayout.dll!HTMLContentSink::NotifyTagObservers(nsIParserNode * aNode=0x07e50478)  Line 2610	C++
 	htmlpars.dll!CNavDTD::WillHandleStartTag(CToken * aToken=0x118c3418, nsHTMLTag aTag=eHTMLTag_meta, nsIParserNode & aNode={...})  Line 1086 + 0x20 bytes	C++
 	htmlpars.dll!CNavDTD::HandleStartToken(CToken * aToken=0x118c3418)  Line 1286 + 0x14 bytes	C++
 	htmlpars.dll!CNavDTD::HandleToken(CToken * aToken=0x118c3418)  Line 714 + 0xc bytes	C++
 	htmlpars.dll!CNavDTD::BuildModel(nsITokenizer * aTokenizer=0x07bc1dc8, int aCanInterrupt=1, int aCountLines=1, const nsCString * __formal=0x0c297104)  Line 301 + 0xc bytes	C++
 	htmlpars.dll!nsParser::BuildModel()  Line 2427 + 0x5f bytes	C++
 	htmlpars.dll!nsParser::ResumeParse(int allowIteration=1, int aIsFinalChunk=0, int aCanInterrupt=1)  Line 2325 + 0xe bytes	C++
 	htmlpars.dll!nsParser::OnDataAvailable(nsIRequest * request=0x0882ab54, nsISupports * aContext=0x088b6510, nsIInputStream * pIStream=0x078229c8, unsigned int sourceOffset=0, unsigned int aLength=1300)  Line 2955 + 0x17 bytes	C++
 	docshell.dll!nsDocumentOpenInfo::OnDataAvailable(nsIRequest * request=0x0882ab54, nsISupports * aCtxt=0x088b6510, nsIInputStream * inStr=0x078229c8, unsigned int sourceOffset=0, unsigned int count=1300)  Line 306 + 0x30 bytes	C++
 	emitter.dll!nsMimeBaseEmitter::Complete()  Line 954 + 0x31 bytes	C++
OK.  Can you dig into why?  It sure sounds like _someone_ is screwing up: there should be no reload here, if the channel impl is not broken.

Is the change from branch that we started doing the reload?  Or that we started doing it wrong?
(In reply to comment #7)

> 
> Is the change from branch that we started doing the reload?  Or that we started
> doing it wrong?

Definitely the former, but perhaps we're also doing it wrong. In any case, 3.0, based on 1.9.1.5, did not do the reload at all.
OK.  When did we start doing the reload (in terms of trunk nightlies)?

Alternately, why is the "do we already have this charset" test failing?  What does nsDocument::TryChannelCharset do in this case?
I don't know when this started failing - it could have been a very long while ago since I only very recently started running trunk builds.

Because aChannel->GetContentCharset returns an empty string, GetPreferred fails, and TryChannelCharset returns false.

Our channel implementation doesn't implement SetContentCharset (i.e., it just generates a warning), but it doesn't get called either, so implementing it wouldn't seem to make a difference. I can try to figure out why it's not getting called.
for some reason, libmime's nsMimeBaseEmitter::UpdateCharacterSet doesn't set the charset on the channel if it's us-ascii, utf-8, or iso-8859-1, which is perhaps why SetContentCharset isn't getting called on our channel.
> I don't know when this started failing

I was suggesting using nightlies to figure out, if possible.

> Because aChannel->GetContentCharset returns an empty string

That seems like a bug in this case, no?  There's a clearly listed charset in the MIME data...

> doesn't set the charset on the channel if it's us-ascii, utf-8, or iso-8859-1

That seems pretty bizarre to me.

When the meta reload happens, what's the old charset in this case?  Note the "did the charset change?" checks in nsMetaCharsetObserver::Notify.
And of course we might want a separate bug on the reload not working correctly...
(In reply to comment #12)
> > I don't know when this started failing
> 
> I was suggesting using nightlies to figure out, if possible.

Perhaps QA could help with that...
> 
> 
> That seems like a bug in this case, no?  There's a clearly listed charset in
> the MIME data...

Yeah, it all seems wrong on the mailnews side. I've fixed the weirdness in mailnews but it doesn't fix the problem - I'm still digging into that.

> When the meta reload happens, what's the old charset in this case?  Note the
> "did the charset change?" checks in nsMetaCharsetObserver::Notify.

I'll look.
The reason fixing the mailnews code to set the content charset didn't help is that the call to TryChannelCharset happens before libmime figures out the charset and sets it on the channel. I'll go try to figure out what the old charset was, though I suspect it's just empty.
It's not likely to be empty, since we'd fall back to the default document charset if nothing else.  And that should be ISO-8859-1....

How can the call to TryChannelCharset happen before the libmime thing?  Does libmime set the charset on the channel after OnStartRequest?  That's pretty broken, if so.
It was UTF-8, though that must have been a fallback because we haven't set any charset on the channel at that point. 

Libmime doesn't know the charset of the channel until it has parsed that message header. How can we know the charset before we've started parsing the message?
> It was UTF-8

Curious.  With what charset source (you might have to look up the stack for that)?

> How can we know the charset before we've started parsing the message?

You can't, but why is mailnews calling OnStartRequest on the consumer of that part before it has the information OnStartRequest promises to have?

What kind of channel is the nsDocument getting?  Is it a part channel, or something that mailnews synthesizes?
We don't have part channels per se - our channel is the msg protocol object, i.e., one per message loaded. I'll try to see where the OnStartRequested notification is getting generated for the consumer of the message.

kCharsetFromMetaPrescan is the source. And now I remember a patch that changed this. I'll go look for that.

I realize that our channels are not like other channels, and I'd really like to address that for 3.1. Is there some good documentation about how our channels should behave?
> kCharsetFromMetaPrescan is the source.

Of the utf-8?  That seems... odd.

> Is there some good documentation about how our channels should behave?

nsIChannel.idl is pretty good, last I checked... if you have any questions that aren't answered there or on mdc, please let me know and I'll document.
(In reply to comment #20)

> Of the utf-8?  That seems... odd.
From what I could tell poking around the code, yeah.

the patch up for review in bug 505072 fixes this bug, and one thing it tweaks is SetHintCharacterSetSource to kCharsetFromMetaTag from kCharsetFromMetaPrescan.
nsIChannel.idl doesn't say anything about what information OnStartRequest promises, from what I can tell, and the MDC doc seems auto-generated from the idl.

There seems to be a little confusion about the fact that we're displaying a mime part without having a different channel - we're basically emitting just the html for the message we're displaying, and in that case, that happens to be what's in the html part of the message. But it's not a separate fetch request to display that part, unlike, say, an image attachment, which does generate a separate request and a separate channel.
> doesn't say anything about what information OnStartRequest promises

Hmm, indeed.  The basic promise is that all metadata is available at that point and that in particular contentCharset is set correctly.  We should definitely update the docs.  I'll try to go through and do that.

> we're displaying a mime part without having a different channel

Oh, hmm.  So you send the OnStartRequest when you start parsing the outermost MIME envelope?  But OnStartRequest promises a correct contentType as well, and if _that_ didn't work right in your code the HTML wouldn't render at all.  You should basically be setting the charset wherever it is you set the type.
> the patch up for review in bug 505072 fixes this bug

Urgh.  That messes directly with the document viewer?  <sigh>

It'd fix this bug because after that call the <meta> would be ignored, but what charset would it set for this case?
(In reply to comment #23)

> 
> Oh, hmm.  So you send the OnStartRequest when you start parsing the outermost
> MIME envelope?  But OnStartRequest promises a correct contentType as well, and
> if _that_ didn't work right in your code the HTML wouldn't render at all.  You
> should basically be setting the charset wherever it is you set the type.

We know we're generating html for the message no matter what, because that's the mime emitter we're using. I'm not sure who generates the OnStartRequest that the document is seeing; it's possible we could figure out a way of delaying it.
Oh, and if the same channel object represents both the the whole message and the HTML part, then that explains why a reload shows the symptoms of this bug, at least.
> that's the mime emitter we're using.

Hmm.  Does this emitter pass through the raw bytes from the MIME part?  Or do they become PRUnichar somewhere in this pipeline?

What happens if the MIME part is SVG instead of HTML?
(In reply to comment #26)
> Oh, and if the same channel object represents both the the whole message and
> the HTML part, then that explains why a reload shows the symptoms of this bug,
> at least.

Reload creates a whole new channel, but the decision to emit html is controlled by options on the uri, iirc.

I doubt that we just pass through the raw bytes from the mime part; it's very difficult to get libmime to do that.

If the mime part were SVG, I'm not sure what we'd do. In the case of multipart alternative, we look for the highest info content that we know how to display.
> Reload creates a whole new channel, but the decision to emit html is controlled
> by options on the uri, iirc.

I see.  But the reload should be using the same nsIURI as the old channel, I'd think...

> I doubt that we just pass through the raw bytes from the mime part

The reason I asked is because if the bytes go through PRUnichar before becoming bytes again, then the relevant charset is the one used at that point, not whatever's in the headers or meta tag...

> we look for the highest info content that we know how to display.

Where "we" == Gecko, or mailews?  I suggested SVG quite deliberately, since Gecko does in fact know how to render SVG documents...
(In reply to comment #29)
> > Reload creates a whole new channel, but the decision to emit html is controlled
> > by options on the uri, iirc.
> 
> I see.  But the reload should be using the same nsIURI as the old channel, I'd
> think...
Yes, it should. 

> 
> Where "we" == Gecko, or mailews?  I suggested SVG quite deliberately, since
> Gecko does in fact know how to render SVG documents...

we, libmime - it finds the mime class for the part and determines if it is displayable inline - MimeMultipartAlternative_display_part_p
libmime (specifically, nsStreamConverter.cpp) does seem to have a mechanism for postponing the OnStartRequest. I'll see if using it in this situation fixes this bug.
Status: NEW → ASSIGNED
Checked with next builds on which patch for Bug 505072 is landed. Problem of this bug couldn't be reproduced.
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.3a1pre) Gecko/20091118 Shredder/3.1a1pre
> Build identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; rv:1.9.3a1pre) Gecko/20091118 SeaMonkey/2.1a1pre

Remaining issue around inrernal load is Bug 506504.
Bz, I wonder if you would mind looking at this to see if you think it's the right thing to do. I had to revert part of the patch in  bug 505072 in order for the test case to break and then work with this patch.
Attachment #413489 - Flags: superreview?(bzbarsky)
Attachment #413489 - Flags: review?(bugzilla)
I don't know why the old code was ignoring charset changes - it seems unlikely it would be the right thing to d.
Seems ok, though that |PL_strcasestr(cBegin, "charset=")| in the context makes me want to cry... :(
Attachment #413489 - Flags: superreview?(bzbarsky) → superreview+
blocking-thunderbird3.1: --- → rc1+
Flags: blocking-thunderbird3.1+
Standard8, iirc, you said over irc that you couldn't reproduce this issue - the reason is that I needed to partially backout an other fix to recreate the issue, per #c33, in particular, I think the first chunk, the change to nsMessenger.cpp. in #c24, bz says he thinks we should fix this bug; however, I'm not sure if we have a real world issue that we can demonstrate is fixed by this patch. I'm going to take this off the 3.1 blocker list for that reason...
blocking-thunderbird3.1: rc1+ → -
It may well break various extensions to not follow the necko API contract here.  Whether such extensions exist is a separate question, of course.
Attachment #413489 - Flags: review?(bugzilla) → review+
xref bug 506504 and bug 572886 to this bug for ease of tracking and search.
Blocks: 506504, 572886
FYI.
> Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.3a6pre) Gecko/20100617 Shredder/3.2a1pre
With above trunk build, double internal reload was observed even with simple text/html mail during test for bug 572886.
  Simple text/html mail instead of text/html part in multipart/xxx. 
  charset in meta http-equiv="Content-Type" != charset in Content-Type: header
In this case, Bug 505072 doesn't occur(already fixed), but this bug occurs, then phenomenon of Bug 506504 is observed.
Problem is not observed with simple text/plain mail of same charset.
  charset in meta http-equiv="Content-Type" == charset in Content-Type: header
Wrote condition wrongly. Sorry for spam.
- Simple text/html mail instead of text/html part in multipart/xxx 
- charset in meta http-equiv="Content-Type" != charset in Content-Type: header
- other charset of meta http-equiv="Content-Type" exist(multiple meta in html)
FYI.
bug 572886 and problem I wrote in comment #39 can't be observed with html5.enable=false.
Probably worth filing an HTML parser, then...
No longer blocks: 506504
Jorg, another testcase - I didn't test it.  Note comment 36
Assignee: mozilla → nobody
Status: ASSIGNED → NEW
Keywords: testcase
There is no problem with the message attached here. The patch has four hunks, three of which are already in the code base, the one in mailnews/mime/src/nsStreamConverter.cpp is not.

There is no need for action here. Closing this Resolved / WFM since I can't do the research now how and when this got fixed.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: