Closed Bug 50821 Opened 24 years ago Closed 24 years ago

Extraneous DOCTYPE declaration inserted into HTML files

Categories

(MailNews Core :: Backend, defect, P3)

x86
Windows NT

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: momoi, Assigned: akkzilla)

References

Details

(Whiteboard: [nsbeta3+])

Attachments

(2 files)

** Observed with 8/25/2000 Win32 build with an HTML file
   created by NS6 Beta2. **

This problem was first filed by marina in the NS internal bug database
since it was discovered during a localization testing process.

Here is a series of steps to reproduce the problem.

1. Using the latest build, create a JPN HTML message, pasting in some data
   from the Netscape Japan Home Page. (I don't think it has to be
   a JPN msg, but this will show you the problem at its severest.)
2. Attach the file which I will be uploading next. This file
   was created by Netscape PR2/Mozilla M17.
3. Send it and then receive and try to display both the 
   body text and the attachment. 
4. Result: You can see the body text but not the attachment
           content.


Additional data:

It turns out M17/PR2 inserts the following line into HTML files
it creates:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">


Apparently we are having a problem when this line is in the attachment.

As an experiment, you can take the same PR2 generated HTML
file I will be attaching below and delete the above line
and then follow the same steps above and you will be able
to display the attachment. 

You can also use an ISO-8859-1 HTML file created by M17 and
get a similar result.
The following is what is generated by 8/25/2000 Win32 M18 build:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head><!-- This page was created by the Gecko output system. -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
....
....

This looks strange with 2 DOCTYPE definitions but this type
of HTML file does not cause a problem.
You can't view this type of attachment inline but 
you can either save or open it via the attachment icon.
I don't really understand the original bug, but is it caused by the extra 3.2/EN
doctype?

The output system, for some reason, adds 
<!-- This page was created by the Gecko output system. -->
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">
if you don't set the flag OutputNoDoctype when generating the output.

Whatever reason there was for these comments is lost in the mists of antiquity,
and I think they've outlived their usefulness.  The editor can create its own
doctypes now, and the output system (unlike the editor) doesn't know enough to
be able to choose the right doctype.  I think we should remove the flag and the
code that generates the headers, and rely on the editor to set the doctype.

If anyone agrees (Beth?) and if it would fix this bug, reassign the bug to me
and argue for nsbeta3+.  Removing the extra comment and doctype is very trivial
-- ten minutes' work.  (You can test whether it would work by making sure the
OutputNoDoctype is false in nsHTMLContentSinkStream.cpp, or just removing the
code that outputs the headers there.)
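
For readers following along, here is a rough Python sketch of the behavior described in this comment: the header comment and 3.2 doctype are emitted unless a "no doctype" flag is set. This is not the code in nsHTMLContentSinkStream.cpp; the names OutputFlags, NO_DOCTYPE and open_head are hypothetical stand-ins, and only the two header lines are taken from the output quoted earlier in this bug.

```python
from enum import Flag, auto


class OutputFlags(Flag):
    """Hypothetical output flags; NO_DOCTYPE stands in for OutputNoDoctype."""
    NONE = 0
    NO_DOCTYPE = auto()


# The two lines the output system is reported to add inside <head>.
HEADER = (
    "<!-- This page was created by the Gecko output system. -->\n"
    '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">\n'
)


def open_head(flags: OutputFlags) -> str:
    """Return the text written when the (hypothetical) serializer opens <head>."""
    out = "<head>"
    # The header is suppressed only when the caller explicitly sets the flag.
    if not (flags & OutputFlags.NO_DOCTYPE):
        out += HEADER
    return out
```

Under this sketch, removing the flag and the header code amounts to open_head always returning just "<head>", leaving the choice of doctype to the caller (here, the editor).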
No, it seems to be caused by the PR2/M17 doctype:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

The current one which includes:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2//EN">

actually prevents this problem from happening.

Since we can display a saved HTML file with the Browser
and this problem happens only when such a file
is attached to a message, this seems to be a mail
problem. If you eliminate the duplicate doctype, then
we will see this problem again.  

The reason I copied akkana is because of the duplicate doctype
in the current builds.

Can anyone in the Editor group tell us how often we might
find a line like:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

on web pages or in documents created by HTML editors?

If this is common, we will have a problem displaying
such an attachment to mail. If so, we need to get this considered
seriously for beta3. 
It is called the DOCTYPE statement, which is required for HTML files. It is used 
to define the level of DTD that was used to generate the page. Most HTML editors 
insert the doctype statement, so you should expect to see that quite often.
OK. Nominating for beta3, then.
This problem will probably affect all HTML attachments 
with this DOCTYPE declaration.
Keywords: nsbeta3
Kat -- I didn't catch that there are 2 doctype statements being inserted. This 
is wrong. The doctype within the HEAD is invalid and should not be there. I 
believe Harish needs to be involved with this bug -- adding him to the list and 
sending him a sidenote.
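
One way to check a generated file for this condition is to list every DOCTYPE declaration and note whether it appears before the first tag. The following is a minimal sketch using Python's standard html.parser module; the names DoctypeAudit and audit_doctypes are made up for illustration and are not part of any Mozilla tool.

```python
from html.parser import HTMLParser


class DoctypeAudit(HTMLParser):
    """Record each DOCTYPE and whether it precedes the first start tag."""

    def __init__(self):
        super().__init__()
        self.seen_tag = False
        self.doctypes = []  # list of (declaration text, is_before_first_tag)

    def handle_starttag(self, tag, attrs):
        self.seen_tag = True

    def handle_decl(self, decl):
        # handle_decl receives the text between "<!" and ">".
        if decl.lower().startswith("doctype"):
            self.doctypes.append((decl, not self.seen_tag))


def audit_doctypes(html):
    """Return the DOCTYPE declarations found in html, in document order."""
    parser = DoctypeAudit()
    parser.feed(html)
    parser.close()
    return parser.doctypes
```

Run against the M18 output quoted above, this should report two declarations: the 4.01 Transitional one before any tag, and the 3.2 one after <html> and <head> have already opened.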
The extra doctype and comment in the head are the ones I was talking about that
are coming from the output system.  It would be trivial to kill that code (my
preference) or to move it so that it's outside the head, if that's more correct
(but then there would still be two doctypes, and I don't think it ought to be the
output system's job to insert doctypes since it doesn't have enough information
to choose the right one).
Akkana -- can you kill that output then? That indeed would be the right thing to do.
Note please that there are 2 aspects to this bug.

1. Editor is generating incorrect DOCTYPE within <HEAD> tags.
2. An attachment which contains the "correct" DOCTYPE causes a display problem, and the incorrect
   DOCTYPE seems to be acting as a buffer for this problem. In PR2, the DOCTYPE was correctly generated
   and so HTML pages generated by PR2 will cause the display problem when attached.

That is, if you fix 1, it will expose the problem in 2 more prominently. But since the correct DOCTYPE
will be generated by other HTML editors, this is a problem that must be faced by the mail
component anyway.
If someone wants to separate these 2 issues into 2 bugs, you are welcome to.
*** Bug 51320 has been marked as a duplicate of this bug. ***
I didn't realize this bug was assigned to me. Akkana, should this be re-assigned
to you to implement the changes you proposed?
Whiteboard: [b3 new owner?]
Assignee: mscott → akkana
reassign
CC'ed mscott again.
plus
Whiteboard: [b3 new owner?] → [nsbeta3+]
This bug is now dedicated to the DOCTYPE problem in the
editor. Changed the summary line accordingly.
The mail display problem will be filed in a new bug.
Summary: NS6 PR2 generated HTML files cause a display problem when attached to a message → Extraneous DOCTYPE declaration inserted into HTML files
I've attached my diffs.  Can someone review them, please?  This should stop the
output system from including the comment and 3.2 doctype in the head.  

Meanwhile, I noticed this: The editor already has code (in editor.js) to put a
doctype as follows:
-//W3C//DTD HTML 4.01 Transitional//EN
if it doesn't see a doctype already in the file.  Is this the code that's
causing the other problem (aside from the second 3.2 doctype in the head, which
my patch fixes)?  Perhaps the JS code needs to be smarter about detecting the
language somehow (how?)
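
To make the behavior described here concrete, this is a minimal Python sketch of "add the 4.01 Transitional doctype only if the file does not already have one". It is not the code in editor.js; ensure_doctype and DEFAULT_DOCTYPE are hypothetical names, and the check is a plain case-insensitive text search, which is an assumption about how the detection might work.

```python
import re

# The doctype the editor is described as adding by default.
DEFAULT_DOCTYPE = '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">'

# Matches the start of any DOCTYPE declaration, regardless of case.
_DOCTYPE_RE = re.compile(r"<!DOCTYPE\b", re.IGNORECASE)


def ensure_doctype(html: str) -> str:
    """Prepend the default doctype unless the document already declares one."""
    if _DOCTYPE_RE.search(html):
        return html
    return DEFAULT_DOCTYPE + "\n" + html
```

Note that a check like this searches the whole file, so a stray doctype inside the head (such as the 3.2 one from the output system) would also count as "already present".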
Status: NEW → ASSIGNED
Summary: Extraneous DOCTYPE declaration inserted into HTML files → NS6 PR2 generated HTML files cause a display problem when attached to a message
Target Milestone: --- → M18
Midair collision blew away someone else's change to the summary, so I'm putting
it back.
Summary: NS6 PR2 generated HTML files cause a display problem when attached to a message → Extraneous DOCTYPE declaration inserted into HTML files
Fixed this part -- the output system will no longer insert the bogus 3.2 doctype
in the head, or the accompanying comment.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Verifying this bug.
The output system no longer inserts the bogus 3.2 doctype in the head or
the accompanying comment.
Status: RESOLVED → VERIFIED
Product: MailNews → Core
Product: Core → MailNews Core