Closed Bug 236686 Opened 20 years ago Closed 14 years ago

Save Page As... complete produces page with bad layout

Categories

(Core :: DOM: HTML Parser, defect)

Hardware: x86
OS: All
Type: defect
Priority: Not set
Severity: major

Tracking

()

RESOLVED FIXED

People

(Reporter: bugmozillabloborg.20.junkymail, Unassigned)

References

()

Details

(Whiteboard: [fixed by the HTML5 parser])

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.6) Gecko/20040113

When you go to this CVS manual page and choose Save Page As... (Web Page,
complete), the files get saved. However, when you open them, the layout is all
messed up. The saved page should look like the actual CVS page.

Reproducible: Always
Steps to Reproduce:
1. Go to the URL http://www.cvshome.org/docs/manual/cvs-1.11.13/cvs_1.html#SEC1
2. Choose Save Page As... web page, complete
3. Open the saved page and compare to the original page

Actual Results:  
saved page has messed up layout

Expected Results:  
saved page has same layout
When loaded from the web site there is no DOCTYPE.
When Mozilla saves the page it adds:
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html401/loose.dtd">

Which means the document will be displayed in Quirks mode when loaded from
the site and in Full standards mode when loaded from disk.
In the markup there is a bad comment:
<!-- Nav_Column-- start left side column for logo and Nav Bar --> 

Mozilla parses comments differently in the two layout modes; see the last item at
http://www.mozilla.org/docs/web-developer/quirks/quirklist.html

So both layouts are in fact correct.
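
To illustrate why the bad comment above can end in different places, here is a minimal sketch of the two comment-parsing styles. It is a simplified model, not Gecko's actual code: the function names are hypothetical, the quirks rule is assumed to be "end at the first -->", and the strict rule is assumed to be the SGML-style one where each "--" toggles comment state and ">" only closes the declaration outside a comment.

```python
def quirks_comment_end(text, start=0):
    """Quirks-style (assumed): the comment ends at the first '-->' after '<!--'.

    Returns the index just past the comment, or -1 if it never closes.
    """
    end = text.find("-->", start + 4)
    return -1 if end == -1 else end + 3


def strict_comment_end(text, start=0):
    """SGML-style (assumed): after '<!', every '--' toggles comment state,
    and '>' closes the declaration only while outside a comment.

    Returns the index just past the declaration, or -1 if it never closes.
    """
    i = start + 2  # skip '<!'
    in_comment = False
    while i < len(text):
        if text.startswith("--", i):
            in_comment = not in_comment
            i += 2
        elif text[i] == ">" and not in_comment:
            return i + 1
        else:
            i += 1
    return -1


bad = "<!-- Nav_Column-- start left side column for logo and Nav Bar -->"
page = bad + "<p>a -- b</p>"

# Quirks-style parsing stops at the end of the bad comment;
# strict-style parsing swallows the markup that follows it.
print(quirks_comment_end(page), strict_comment_end(page))
```

In this model the stray "--" after Nav_Column leaves the strict parser inside a comment when it reaches the closing ">", so following markup is consumed until the dashes balance again.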
OS: Windows XP → All
It seems the bug is that Mozilla adds the DOCTYPE. If you remove the DOCTYPE
from the file that is saved, the page opens up fine.

Why does Mozilla add a DOCTYPE if one doesn't exist in the source page? It
obviously adds it wrong.
Maybe we should add a DOCTYPE that preserves the layout mode?
Assignee: general → file-handling
Component: Browser-General → File Handling
QA Contact: general → ian
-> embedding apis, this is a webbrowserpersist issue
Assignee: file-handling → adamlock
Component: File Handling → Embedding: APIs
Not all is so simple.  About 200 lines down the page we have:

<!-- Servlet-Specific template --><!-- Wrap Servlet-Specific Help -->
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html401/loose.dtd">
<html>

So we DO have a doctype node in the DOM (see DOM inspector).  And naturally the
doctype node is the first child of the document, since that's the only place
doctype nodes are allowed.

Perhaps the parser should not be tokenizing this as a doctype if it occurs
halfway down the page?  Tokenize it as a comment or marked section instead?  Or
perhaps the HTML content sink should just treat this as a comment in
AddDocTypeDecl() if we already have an open body or something along those lines?
Tokenizing that as a doctype seems wrong to me, so I'm all for that suggestion.
Assuming *I* don't have to change the html-parser to actually do this, of course :)
Choess, thoughts?
Um, weird. The standards mode detection happens before tokenization, right? So
while having it detected and stuck at the top of the document may be
undesirable, fixing that doesn't seem to me like it would fix the mode breakage.
The mode breakage only happens if you save by serializing the DOM to a file,
then load that file.  At which point the saved file starts with a doctype, of
course.  See comment 1.
This is an automated message, with ID "auto-resolve01".

This bug has had no comments for a long time. Statistically, we have found that
bug reports that have not been confirmed by a second user after three months are
highly unlikely to be the source of a fix to the code.

While your input is very important to us, our resources are limited and so we
are asking for your help in focussing our efforts. If you can still reproduce
this problem in the latest version of the product (see below for how to obtain a
copy) or, for feature requests, if it's not present in the latest version and
you still believe we should implement it, please visit the URL of this bug
(given at the top of this mail) and add a comment to that effect, giving more
reproduction information if you have it.

If it is not a problem any longer, you need take no action. If this bug is not
changed in any way in the next two weeks, it will be automatically resolved.
Thank you for your help in this matter.

The latest beta releases can be obtained from:
Firefox:     http://www.mozilla.org/projects/firefox/
Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html
Seamonkey:   http://www.mozilla.org/projects/seamonkey/
This bug has been automatically resolved after a period of inactivity (see above
comment). If anyone thinks this is incorrect, they should feel free to reopen it.
Status: UNCONFIRMED → RESOLVED
Closed: 19 years ago
Resolution: --- → EXPIRED
Status: RESOLVED → UNCONFIRMED
Resolution: EXPIRED → ---
Comments 5-8 seem to indicate a parser issue...
Marking NEW so it doesn't expire again.
Assignee: adamlock → mrbkap
Status: UNCONFIRMED → NEW
Component: Embedding: APIs → HTML: Parser
Ever confirmed: true
QA Contact: ian → parser
The tricky part here is, what do we do with the DTD if we don't insert it as a
DOCTYPE? Perhaps we could simply drop it entirely?
Yes, I think dropping it completely is perfectly reasonable.
Assignee: mrbkap → nobody
The HTML5 parser fixes late doctype handling (late doctypes are discarded), late doctype sniffing (it sniffs without a buffer limit), and the handling of -- inside comments.
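
As a rough sketch of what discarding a late doctype means: a doctype token is only honored before any other content has been seen, and is dropped otherwise. The token shapes and the function name below are hypothetical, and the real HTML5 tree builder has many more insertion modes (it also stays in the initial mode across leading whitespace, which this sketch does not model).

```python
def build_document(tokens):
    """Return (doctype, kept_tokens) for a stream of (kind, data) pairs.

    kind is one of "doctype", "comment", "start_tag", "text" (hypothetical
    token shapes). A doctype is accepted only while still in the initial
    mode; any later doctype is silently dropped.
    """
    doctype = None
    kept = []
    in_initial_mode = True
    for kind, data in tokens:
        if kind == "doctype":
            if in_initial_mode:
                doctype = data
                in_initial_mode = False  # a second doctype is late too
            continue  # late doctype: discarded, never reaches the DOM
        if kind != "comment":
            in_initial_mode = False  # real content ends the initial mode
        kept.append((kind, data))
    return doctype, kept


# Shape of the page from this bug: content first, doctype much later.
late = [
    ("start_tag", "html"),
    ("start_tag", "body"),
    ("comment", " Servlet-Specific template "),
    ("doctype", "html PUBLIC -//W3C//DTD HTML 4.01 Transitional//EN"),
    ("start_tag", "html"),
]
doctype, kept = build_document(late)
print(doctype, len(kept))
```

With the late doctype dropped, the DOM has no doctype node, so serializing it for Save Page As no longer puts a doctype at the top of the saved file.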
Status: NEW → RESOLVED
Closed: 19 years ago → 14 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by the HTML5 parser]