Closed Bug 131889 Opened 22 years ago Closed 22 years ago

Edit as New menu inserts extra http-equiv charset lines into mail body & head elements

Categories

(MailNews Core :: Internationalization, defect)

x86
Windows 2000
defect
Not set
normal

Tracking

(Not tracked)

VERIFIED FIXED
mozilla1.1alpha

People

(Reporter: momoi, Assigned: bugzilla)

References

Details

(Keywords: intl, topembed-, Whiteboard: have fix [ish1+])

Attachments

(3 files, 4 obsolete files)

** Observed with 2002-03-12 Win32 trunk build **

I will attch a mailbox file with 4 test cases. One of the test cases is not
consistent but others seem to be consistent.

1. Case 1: Meta equiva charset line is inserted within the <head> element.
   This is a normal case.
   Sent with no special setting (options | format auto-detect)

2. Case 2:  Meta equiva charset line is NOT inserted within the <head> element.
   This is a normal case.

   Sent with options | format | HTML only option ON.

3. Case 3: a longer message but normal. Sent with no special setting (options |
format auto-detect)

4. Case 4: a longer message with a duplicated charset lines in the <head>
    element and within the <body> area.

   This was created by taking the received message in step 3 and using
   Message | Edit as new ..  menu.

It seems that the charset line gets inserted again and again. We should fix 
this problem.

I have not been able to consistently reproduce case 2. We need help
from others on this.
Place this file in your local mail folder and view the sources.
nabeta1 and topembed. This problem will affect embedded client's
mail program badly.
Keywords: nsbeta1, topembed
Keywords: intl
So do we want to put http-equiv charset or not?
Status: NEW → ASSIGNED
>nabeta1 and topembed. This problem will affect embedded client's mail 
>program badly.
I don't understand here, you use Mozilla to generate these messages and you
think some other software cannot handle these message, is that right?
1. in what case the message is illegal in MIME or HTML ? 
IS this bug file against mail sending or receiving?
If all the message we send are legal, then you should file bug against that
receiving software, because other mailer may choose to send them these kind of
message to break them if they don't fix it regardless mozilla change the sending
format or not. 
I don't think HTML or MIME require consistency of meta tag
What is the real issue here ?

I think we should nsbeta1- this one
> 1. in what case the message is illegal in MIME or HTML ? 

We are inserting an extra <Meta HTTP-equiv ...> charset line
in the body as well as the <head> ... </head>.
This is illegal HTML. We need to fix this. This is a Mozilla-side
problem --> Case 4 above. Study Case 4 please before you make your
conclusion.

> IS this bug file against mail sending or receiving?

This is filed against Mail sending by Mozilla. It IS OUR bug.

> I don't think HTML or MIME require consistency of meta tag

This comment applies to Case 2 only. You may be right but
Mozilla should be consistent in generating its own
HTML code. I concede that this is not as critical as Case 4.

> What is the real issue here ?

The big issue is Case 4. We are sending illegal HTML. We should
stop doing this. We will probably cause a problem to some mail
programs which have to interpret our bad HTML.

> I think we should nsbeta1- this one

I disagree. We must fix at least Case 4.It is a bad bug and
is likely to cause a problem for other mail programs.

Case 2, I will try to track it down as time allows me. But it is
not a good thing, either, for us to be incoistent this way. It could
be a symptom of something else that might cause us a problem.

Note: The test msgs are all generated by Mozilla. This is our problem.
>3. Case 3: a longer message but normal. Sent with no special setting 
>(options | format auto-detect)

>4. Case 4: a longer message with a duplicated charset lines in the <head>
>element and within the <body> area.
>
>   This was created by taking the received message in step 3 and using
>   Message | Edit as new ..  menu.
how do you create case 4. what does "taking" mean here ? copy and paste ?
Please specify step-to-step reproduce procedure. Thanks.

>The big issue is Case 4. We are sending illegal HTML. 
Is that ILLEGAL html ? or invalid tag in html? I think it is invalid to have
META in the body but not illegal. 
> >   This was created by taking the received message in step 3 and using
>   Message | Edit as new ..  menu.
> how do you create case 4. what does "taking" mean here ? copy and paste ?
> Please specify step-to-step reproduce procedure. Thanks.

Please let mail QAs reproduce this problem. It should be easy to use
the menu item. If you like, here are the steps:

1. Send yourseld a HTML message with 30 lines of text or more in the body.
2. Receive it and view it. Now choose Message | Edit Message as New
   menu. Add some comments to this and send it out.
3. Receive it and view the source. You will see <Meta HTTP-Equiv...>
   line in the body as well as in the head element.
i tested this a little bit and here is what i found:
- in case you Edit message as new (Kat's message or compose your own) and add
some text on top of the existing text then NO  <Meta HTTP-Equiv...> is inserted
in the header= meaning the  <Meta HTTP-Equiv...> is inserted once in the body (
normal)
- in case you Edit as new inserting some text inside the exsiting lines then
you'll get a second  <Meta HTTP-Equiv...> in the head ( not normal...)
i'll attach the Kat's test case with a line added on top of his text and you'll
see that no  second <Meta HTTP-Equiv...> exist on the page source
Since the main problem here is that of inserting extra invalid
Meta Equiv charset lines into body and head areas, I changed 
the summary line. The incosistent insertion will be filed as a
new bug when I can isolate the condition that leads to the 
problem.
Summary: Inconsistent insertion of http-equiv charset line into mail body → Edit as New menu inserts extra http-equiv charset lines into mail body & head elements
<title> is also duplicated.
I am getting different results the first composed mail and others (Edit as new
to the same message).

The very first mail I compose after I launch.

<html>
<head>
<title></title>
</head>
<body>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-2022-JP">
<title></title>

The second mail.

<html>
<head>
<meta http-equiv="Content-Type"
 content="text/html;charset=ISO-2022-JP">
<title></title>
</head>
<body>
<meta http-equiv="Content-Type" content="text/html;charset=ISO-2022-JP">
<title></title>


So far, I cannot reproduce this when I set "mail.compose.max_recycled_windows" to 0.
cc to ducarroz, sspitzer. For the recycle windows feature, is HEAD section
always cleared per each message send?
Blocks: 132191
nsbeta1- per i18n triage
Keywords: nsbeta1nsbeta1-
embed triage: topembed-, not critical for immediate embedding needs
Keywords: topembedtopembed-
Reassign to ducarroz.
Assignee: nhotta → ducarroz
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.1alpha
nominate nsbeta1
Keywords: nsbeta1-nsbeta1
the difference with recycle compose window is gone since a while. The apparent
problem is that when you edit as new, mozilla insert the raw data of the
original message as the body of a new message wich result of duplication of the
head inside the message body. Probably an editor matter!
jfd: you insert it all into the body?  That sounds very, very, VERY wrong.
http-equiv and others tags are in the <head> and should not be added to the
<body>.  Perhaps you should be doing something like Composer does when it
converts from html source --> Normal view via its tabs.
I am not sure at 100% yet. But if I correctly understand, I should cut the
header and keep only the body part! But I need to preserve the charset and
potentially some other attribute like background colors and image. Let me take a
deeper look...
I can confirm that in the case of edit draft/template/new, we insert the whole
HTML of the original message into editor. I don't know yet for forward inline
which use the same code.

I thing the right thing to do is to extract the charset from the META tag and
set it into the compose field (it's not the case currently, editing a japanese
message give a compose window with the charset set to US-ASCII) and keep only
the body tag and it's content. Editor knows how to deal with the body tag and we
have special code in message compose to take care of the body's attribute.
Attached patch Proposed fix, v1 (obsolete) — Splinter Review
The solution I have choosen is to remove the <head> tag before inserting HTML
into the editor. I did not find the need to fetch the charset set in the META
tag (inside the <head>) has we already get the charset really used in mime.

In the case where the user select a japanese charset, type couple english word,
save the message, close it and then later re-edit it, the charset will have
been reset to US_ASCII as japanese wasn't needed. Naoki, let me know if it's ok
with you?
I think I understand JFD's logic in simply removing the <head> region.
I'd just like to point out that if you wanted to use the new head content 
instead of original and avoid duplication, you could use 
nsIHTMLEditor::ReplaceHeadContentsWithHTML().
That's what we use in Composer to get around the problem of having <head>
content when using InsertHTML(). See RebuildDocumentFromSource() in 
nsHTMLEditor.cpp for details.
Comment on attachment 96248 [details] [diff] [review]
Proposed fix, v1

I'm concerned (but maybe I misunderstand the bug).  If I receive a message that
has Shif-JIS encoding and then I edit as new, shouldn't I retain the shif-jis
encoding?  It seems wrong to me to delete the existing head in favor of
whatever the editor has in it.	Perhaps I don't understand some key things
about mail though.
if the data of the message really contains Shif-JIS, the charset in the
content-type header would have been set to Shif-JIS. In that case we will
correctly set the charset for editing.

Now, if we really need to preserve the characters set used during editing, I
would have to sniff the META tag. I18n folk, please advice...
Attached patch Proposed fix, v2 (obsolete) — Splinter Review
This time, instead of just cuttng the header tag, I save the header content and
re-insert it using the editor function
nsIHTMLEditor::ReplaceHeadContentsWithHTML as suggested by cmanske.
Attachment #96248 - Attachment is obsolete: true
Whiteboard: have fix
Comment on attachment 96760 [details] [diff] [review]
Proposed fix, v2

looks good to me - r=varada
Attachment #96760 - Flags: review+
Comment on attachment 96760 [details] [diff] [review]
Proposed fix, v2

ok, I'll admit I haven't read alecf's latest string document in a month or
two...

Is this a memory leak or is nsCRT::strndup not allocating anything new?

+	     PRUnichar* headData = new PRUnichar[end - start - 6];
+	     if (headData)
+	     {
+	       headData = nsCRT::strndup(&data[start + 6], end - start - 6);
good catch Katy, strdup does allocate the needed memory, not like strcpy.
Attachment #96760 - Flags: needs-work+
Attached patch Proposed fix, v3 (obsolete) — Splinter Review
I fix the memory leak in the new code as well in the old code I did indeed
copy.
Attachment #96760 - Attachment is obsolete: true
Comment on attachment 96861 [details] [diff] [review]
Proposed fix, v3

your patch uses these api:
+	   aEditorShell->ReplaceHeadContentsWithHTML(headContent.get());
+	 aEditorShell->InsertSource(aBuf.get());

editorshell is deprecated and will be going away.  Since you already have an
editor, can you use the editor api instead?  Thanks!
If I want to stop using nsIEditorShell, I would need to have an nsIHTMLEditor
object and what I have is an nsIEditor object in hand. If nsIEditorShell is
going away in the future, I'll have to change a lot of code which is behind the
scoop of this bug.

At that point, see that I will be gone for 3 weeks at the end of this week, I
would rather limit the patch to te strict minimum. I'll file a new bug for
replacing nsIEditorShell in msg compose.
I filled bug 164967 for the replacement of nsIEditorShell in message compose
Comment on attachment 96861 [details] [diff] [review]
Proposed fix, v3

r=cavin. It would be nice if  you can use #define for the constant 6 (stelen of
"<head>" I think) because it's used multiple times.
Attachment #96861 - Flags: review+
Attached patch Proposed fix, v4 (obsolete) — Splinter Review
I simplified the arithmetic...
Attachment #96861 - Attachment is obsolete: true
Comment on attachment 96904 [details] [diff] [review]
Proposed fix, v4

carrying over R=cavin
Attachment #96904 - Flags: review+
Comment on attachment 96904 [details] [diff] [review]
Proposed fix, v4

+	     bodyAttributes.Adopt(nsCRT::strndup(&data[start], end - star));

this can't compile, can it? star should be start, shouldn't it?
oops, I've attached the wrong patch. You are right, it should be start and not star.
Attached patch Proposed fix, v5Splinter Review
Same than before but without the typo!
Attachment #96904 - Attachment is obsolete: true
Comment on attachment 96918 [details] [diff] [review]
Proposed fix, v5

Carry over R=cavin
Attachment #96918 - Flags: review+
Comment on attachment 96918 [details] [diff] [review]
Proposed fix, v5

sr=bienvenu
Attachment #96918 - Flags: superreview+
Fixe checked in the trunk
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
*** Bug 153761 has been marked as a duplicate of this bug. ***
Blocks: 157673
Whiteboard: have fix → have fix [ish1+]
Depends on: 180372
No longer blocks: 157673
Using the test case in comment #7 I was able to reproduce the problem with the
7.0 rtm build so that I could verify the problem is fixed with the 12-6-2002
trunk builds on winxp, mac osx and linux. Verified.  
Status: RESOLVED → VERIFIED
This fix caused a preformance regression: bug 184550
Product: MailNews → Core
Product: Core → MailNews Core
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: