Open Bug 230423 Opened 21 years ago Updated 2 years ago

bold/italic don't stop at right place, in HTML message when rendered in "as plain text" mode

Categories

(MailNews Core :: MIME, defect)

x86
Windows 2000
defect

Tracking

(Not tracked)

REOPENED

People

(Reporter: agile.bowl7038, Unassigned)

References

Details

(Keywords: testcase)

Attachments

(7 files)

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6b) Gecko/20031208
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6b) Gecko/20031208

Occasionally an email I receive from a portal doesn't display correctly when I
view the message.  I have View | Message Body As | Plain Text set, so I do not
expect to see any HTML tags when viewing the message.

The snippet that doesn't render correctly looks like the following when viewing
the message source:

			  <span class="boldmsg">
			  SPONSORED MESSAGE</span><br>

<b>FREE MCSE 2003 Certification Training & Register to WIN a 42” PLASMA T.V.
Monitor!!!</b><br><br> 



<b>MCSE 2003</b> is one of the most in demand certifications.

"is one" and beyond is bolded when viewing the message as plain text.  I
wouldn't expect this.

This also makes the entire message somewhat difficult to read.




Reproducible: Always

Steps to Reproduce:
1. Create a message with the body that starts in the "additional information"
area.  Also, feel free to email me if you want one of the messages to play with.
2. View | Message Body As | Plain text
3. View the message.

Actual Results:  
99% of the entire message is in bold.
Reproducible in Mozilla 1.5.

Expected Results:  
No HTML markup at all.
If Message Body As | Simple HTML is set, the message is rendered as expected.

<html>
<head>
   <title>Dice.com - Job Alert</title>
   <STYLE TYPE="text/css" MEDIA=screen,print>
   <!--
	   body	{
				   background-color: #ffffff;
				   color: Black;
				   font : normal 14 Verdana, Arial, Helvetica, sans-serif;
			   }
	   td	  {
				   color : Black;
				   font : normal 14 Verdana, Arial, Helvetica, sans-serif;
			   }
	   option  {
				   font : normal 14 Verdana, Arial, Helvetica, sans-serif;
			   }
	   select  {
				   font : normal 14 Verdana, Arial, Helvetica, sans-serif;
			   }
	   A:link{color: #0000cc;}
	   A:active{color: #3333ff;}
	   A:visited{color: #000066;}
	   A:hover{color: #3333ff;}
	   .boldmsg	{
				   background-color: #ffffff;
				   color: Black;
				   font : bold 14 Verdana, Arial, Helvetica, sans-serif, bold;
			   }
	   .textmsg	{
				   background-color: #ffffff;
				   color: Black;
				   font : normal 14 Verdana, Arial, Helvetica, sans-serif;
			   }
   -->
   </STYLE>
</head>
<body topmargin="0" leftmargin="0" text="#000000"
   marginheight="0" marginwidth="0"
   style="font : normal 12 Verdana, Arial, Helvetica, sans-serif">
  <table width="600" cellpadding="0" cellspacing="0" border="0">
	<tr>
	  <td align="left" valign="top">
		&nbsp;
	  </td>
	  <td align="left" valign="top">
		<table width="100%" cellpadding="3" cellspacing="0" border="0">

		  <tr>
			<td width="10" valign="top">
			  <img src="http://seeker.dice.com/assets/images/arrowhead_dark.gif"  border="0">
			</td>
			<td width="100%">
			  <span class="boldmsg">
			  SPONSORED MESSAGE</span><br>

<b>FREE MCSE 2003 Certification Training & Register to WIN a 42” PLASMA T.V.
Monitor!!!</b><br><br> 



<b>MCSE 2003</b> is one of the most in demand certifications.
<b>Raises</b>...<b>Promotions</b>...and <b>Job Security</b> are only a few of
the benefits training can bring to those who are certified!  <b>PLUS</b> –
Register to <b>WIN a 42” PLASMA T.V. Monitor!</b>  Free training CD for taking
our Skills Assessment test.  Offer valid in the United States & Caribbean only.<br> 


<a
href="http://ad.doubleclick.net/clk;6937166;7836278;e?http://www.learnkeydirect.com/assessment.asp?code=dce1054"
target="_blank">FREE MCSE 2003 CD for the first 50 respondents</a>
<br>
			 <br>
			</td>
		  </tr>
Could you please attach a message that exhibits this symptom to this bug?

What does the message look like when you display it *as* HTML?
Test message that demonstrates the bug when View | Message Body As | is set to
Plain Text.  The message will have continuous bolding on the second line.
Attachment created.  When I view these types of messages that have this problem
as HTML (Simple), the message gets laid out as I'd expect it to.

It's just with the "Plain Text" setting that the bolding occurs and gets stuck.

Let me know if you can reproduce this.  
Alright, I've played around with this.  What I see is not how you described the 
problem: "HTML tags" are not being rendered.  I see the href's from <a> tags 
being rendered as <http://mozilla.org>.  This in fact is expected in the 
HTML->plain conversion; it's how the links, which are important content within 
the mail, are maintained.

I also see that text which was designated <b>bold</b> (or styled bold) in the 
HTML being rendered as *bold* in the plain text.  Again, this is expected.  The 
asterisks used to generate the *bold* text are placed where you'd expect them to 
be; there are some rendering issues for certain texts (bug 206298), but 
generally speaking, this is done correctly as well.

However, there is an error in the mail program's display of the structured plain 
text, for a certain condition:
  1) the HTML contains a table
  2) a cell in the table contains bold text, which terminates with the end of
     another embedded tag:
       <td>some text <b>some bold text and break<br></b> more text </td>
                                                ~~~~~~~~
       <td>some text <b>some bold text <a>a link</a></b> more text </td>
                                                ~~~~~~~~
       <td>some text <b>some bold text <i>italics</i></b> more text </td>
                                                 ~~~~~~~~
  3) some other cell (or cells) contains bold text.

In this case, all the table text between the first bold cell and the last one is 
rendered in bold.  Any links that exist in this range are also rendered in bold 
text (not as hotlinks).  This problem also occurs for <i>italic text</i> that's 
converted to /italic text/, and <u>underlined</u> to _underlined_ -- that is, 
all the "structured" forms.
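To make the conversion described above concrete, here is a minimal sketch of an HTML -> structured-text step, assuming only the three mappings named in this comment (<b> to *, <i> to /, <u> to _). It is a toy built on Python's html.parser, not the Mozilla serializer; the class and function names are made up for illustration, and it ignores tables, links and whitespace handling entirely.

```python
from html.parser import HTMLParser

# Marker characters for the "structured" forms named in the comment above.
MARKERS = {"b": "*", "i": "/", "u": "_"}


class StructuredTextSketch(HTMLParser):
    """Toy HTML -> structured plain text converter (illustration only)."""

    def __init__(self):
        super().__init__()
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag in MARKERS:
            self.out.append(MARKERS[tag])  # opening marker
        elif tag == "br":
            self.out.append("\n")          # a line break becomes a newline

    def handle_endtag(self, tag):
        if tag in MARKERS:
            self.out.append(MARKERS[tag])  # closing marker

    def handle_data(self, data):
        self.out.append(data)              # text passes through unchanged


def to_structured_text(html):
    parser = StructuredTextSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)
```

Fed the third pattern from the list above, this sketch returns "some text *some bold text /italics/* more text ". Fed the first pattern, the <br> inside the <b> puts the closing asterisk on the following line. Whether the real converter places it there is not something this sketch can show.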

This condition is not as rare as one might expect: many of the firms sending out 
HTML mail use tables and formatting willy-nilly.  I get weekly mail from 
Ticketmaster and it exhibits this same symptom viewed As Plain Text.

Note that this problem can be seen in the plain-text conversion of HTML 
attachments, as well as in messages composed as HTML.  I will add some small 
HTML attachments to this bug as testcases.

Also note: if the plain-text is copied and pasted into a plain-text message, 
that message will render the structured attributes as expected (excepting the 
problems from bug 206298).

xref bug 18012, but this bug is only about this problem with the bold text 
persisting where it shouldn't.

This problem replicated with Windows 2000, 1.5 Final and 1.7a-0108.
Status: UNCONFIRMED → NEW
Ever confirmed: true
OS: Windows XP → Windows 2000
Summary: HTML tags render when Plain Text set → HTML message, as plain text, incorrectly persists 'structured' (bold, italic) attributes
Attached file testcase 1
Attach this file (and the following ones) to an email message.  Send (or save
as Draft).  View the message As Plain, noting how the attachment is rendered.
(Note: attachment should be "Disposition=inline" and the program should be
configured with View|Attachments Inline checked.)
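For anyone who would rather script these steps, here is a minimal sketch using Python's standard email package. The HTML is only a stand-in modeled on the pattern from comment 4, not the contents of the attached testcases, and the function name, subject and filename are made up for illustration.

```python
from email.message import EmailMessage

# Hypothetical stand-in for a testcase file: a table with a bold run that
# ends right after another tag, plus a second cell containing bold text.
TESTCASE_HTML = """<html><body><table>
<tr><td>some text <b>some bold text and break<br></b> more text </td></tr>
<tr><td>another cell with <b>bold text</b> in it</td></tr>
</table></body></html>
"""


def build_test_message(html=TESTCASE_HTML, filename="testcase.html"):
    """Build a message whose HTML testcase is attached with an inline disposition."""
    msg = EmailMessage()
    msg["Subject"] = "bug 230423 testcase (sketch)"
    msg.set_content("See the inline HTML attachment.")
    # disposition="inline" corresponds to the "Disposition=inline" note above.
    msg.add_attachment(html, subtype="html", disposition="inline",
                       filename=filename)
    return msg
```

Writing bytes(build_test_message()) to a .eml file gives a message that can be opened and then viewed As Plain Text, with View|Attachments Inline checked as noted above.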
Attached file testcase 2
Attached file testcase 3
Attached file testcase 4
Attached file testcase 5
Yeah, I've seen this bug, too, and it makes the msg look bad. The problem is
that for some reason, the bolding starts, but does not end.

> When I view these types of messages that have this problem
> as HTML (Simple), the message gets laid out as I'd expect it to.

Exactly, that's the "official" workaround. :-)

BTW: Is there any concrete reason why you can't just use the Simple HTML mode?
Product: Browser → Seamonkey
Assignee: sspitzer → mail
*** Bug 224802 has been marked as a duplicate of this bug. ***
The Science email alerts are prominent examples of this annoying bug. I attached one as a testcase. The problem is clearly visible after the heading "Bones of Contention, and Dirty Too?".
Assignee: mail → nobody
QA Contact: esther → message-display
Moving to Core, this can still be seen in trunk builds of TB and SM.
Component: MailNews: Message Display → Backend
Product: SeaMonkey → MailNews Core
QA Contact: message-display → backend
Blocks: 289611
Component: Backend → MIME
QA Contact: backend → mime
Keywords: testcase
(In reply to comment #13)
> Moving to Core, this can still be seen in trunk builds of TB and SM.

Mike, your observation in comment #4 is spot on. The behaviour comes from converter.html2txt.structs=true (the default), which supports "structured text". (For text/plain messages, mail.display_struct is used instead; it also defaults to true.)
> http://edmullen.net/Mozilla/moz_stext.php
> http://kb.mozillazine.org/Mail_and_news_settings
So if this bug claims that Tb's behaviour is wrong, I think it is INVALID: html2txt.structs=true (and mail.display_struct=true) is the default, and Tb works as designed. (Note: mail.send_struct defaults to false.)

Why is this bug NEW, and still NEW? (NEW == a flaw exists in the code of Tb or MailNews Core.)
Is this bug requesting html2txt.structs=false (and mail.display_struct=false) as the default?
Or is it requesting a UI enhancement for the "structured text" setting? (A Tb version of Bug 199137?)
If this bug is a request to default html2txt.structs to false, I'll vote for it, because I believe the number of (a) << the number of (b), as seen in this bug.
  (a) People who are happy because html2txt.structs=true is the default.
  (b) People who are unhappy because html2txt.structs=false is not the default.
> If this bug is request for default of html2txt.structs=false

No, it's not. It's just a bug in the code.
(In reply to comment #16)
> > If this bug is request for default of html2txt.structs=false
> No, it's not. It's just a bug in the code.

For inconsistent or funny html2txt.structure conversion, and/or conversion which is different from user's normal expectation?
I think you misunderstood this bug. It's only when you view an HTML message, and use the "View | As Plaintext" mode. This particular feature was always just a goody, and it's discouraged.
====== Use "View | Simple HTML" instead, see comment 16. =======

The bug described here is a dup of bug 122876.

There's no point to argue to remove whole features just because they have bugs. Rather, fix the bugs. Or wait for somebody to do it.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → DUPLICATE
Summary: HTML message, as plain text, incorrectly persists 'structured' (bold, italic) attributes → bold/italic don't stop at right place, in HTML message when rendered in "as plain text" mode
Not a dupe.  Look at the test cases.

Bucksch, don't you ever presume to close one of my bugs again.
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
It *is* a dup, unless you can show how bug 122876 doesn't cause it.

This is just another instance of the problem filed in bug 122876: HTML is converted to plaintext, the table is turned into spaces and newlines. The bold tags get turned into * . Then we try to make the *starred* words bold again, *but that doesn't work when there's a linebreak between start and end*, like here. That's what bug 122876 is about, and I think that's what this bug is about as well. Therefore, DUP.
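
The failure mode claimed here can be pictured with a toy recognizer that only accepts a *starred* phrase when both asterisks sit on the same line. This is a sketch under that assumption, not the actual Mozilla code; the regular expression and the function name are made up for illustration.

```python
import re

# Toy rule: an opening "*" followed by a non-space character, a run of text
# with no "*" and no newline, then a closing "*" preceded by a non-space.
BOLD = re.compile(r"\*(?=\S)([^*\n]+?)(?<=\S)\*")


def render_bold(text):
    """Wrap same-line *starred* phrases in <b>...</b>; leave the rest alone."""
    return BOLD.sub(r"<b>\1</b>", text)
```

With this rule, "a *bold* word" becomes "a <b>bold</b> word", while "a *bold\nphrase* here" comes back unchanged because the newline separates the two markers. The toy only models a phrase that fails to be recognized; it does not by itself reproduce bold text that starts and never stops.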

> don't you ever presume to close one of my bugs again.

It's not "your bug": You are neither filer nor owner.
And no need for hostility.
I understand that the linebreak issue would *prevent* text from being displayed bold even though it is enclosed in asterisks. But does this also explain why bold print etc. sometimes start in the middle of the text and are never turned off again until the end of the mail?

And why is the plaintext output from the html2txt converter rendered differently than exactly the same plaintext when it appears in an original text/plain message?
> does this also explain why bold print etc. sometimes start in the
> middle of the text and are never turned off again until the end of the mail?

Yes, that's part of the same bug, IIRC, or at least closely related.

> And why is the plaintext output from the html2txt converter rendered
> differently than exactly the same plaintext when it appears in an original
> text/plain message?

It's not rendered differently, it's the same code.
I've seen this bug recently and I wrote a comment somewhere that this happened even for a number of single words on a single row.
(In reply to Thomas D. from comment #23)
> I've seen this bug recently and I wrote a comment somewhere that this
> happened even for a number of single words on a single row.

Or maybe the reverse case of this bug, structured plain text rendered with HTML formatting.
Severity: normal → S3