Closed
Bug 118251
Opened 23 years ago
Closed 23 years ago
whitespace formatting should not be reworked
Categories
(Core :: DOM: Serializers, defect)
Tracking
()
RESOLVED
FIXED
Future
People
(Reporter: jmd, Assigned: harishd)
References
Details
Attachments
(2 files, 1 obsolete file)
2.22 KB,
text/plain
|
Details | |
9.01 KB,
patch
|
kinmoz
:
review+
kinmoz
:
superreview+
|
Details | Diff | Splinter Review |
I believe our goal for saving as "complete" (poor name, as it redoes links, etc) should be to modify the original document as little as necessary. When I save my personal page, I notice it removes the <?xml version> at the top, and compresses the doctype, html, head, and title tags all into a single row. Additionally, the </head> tag is compressed onto the line with the last <link> on it. This makes the layout quite ugly, and very difficult to notice some of the tags. Expected: whitespace formating would not be reworked. Example is attached, as long lines can't be reproduced in a bugzilla comment.
Reporter | ||
Comment 2•23 years ago
|
||
Yep, all those newlines before <body> is verbatim.
Reporter | ||
Comment 3•23 years ago
|
||
Notice it also affects </body></html>
I am not sure what I can do about this. The XPFE is telling webbrowserpersist to fixup and save the DOM document and it does so by feeding the loaded DOM tree into the document encoder for saving. It is very likely that any of the original formatting will be gone when it comes out the other end. Regarding the <?xml> tag being stripped. This could be an issue with webbrowserpersist as it stands. Internally it always tells the encoder to save out as text/html and this may be causing the wrong encoder to be invoked, causing odd nodes such as this to be stripped. I will changed the title of this bug to cover that issue until I know for sure.
Summary: xml/doctype/html/head whitespace compressed in save as: complete web page → Webrowserpersist hardcodes text/html for saving DOM documents
Comment 5•23 years ago
|
||
Adam, are you saying that the _DOM_ is saved? So if the page contains document.write() calls, the script is saved _and_ the written content is saved? So when loaded again the written content is present twice?
Reporter | ||
Comment 6•23 years ago
|
||
Actually, that exact thing happened to me the other day (on a differant page) and I didn't think anything of it at the time. Yep, all document.write contents was presented twice. I had 2 ad banners and 2 webtrends page counter/spy things.
Boris, yes I am. The double writing thing issue for embedding covered by bug 115328, but in Mozilla whatever code is telling the webbrowserpersist object to save a DOM needs to pass it the original DOM one not the modified one.
Comment 8•23 years ago
|
||
Bug 118792 filed.
Bug 110135 is checked in. It should now be possible to specify encoding flags and a mime type in the call to saveDocument to control how a document is saved.
Reporter | ||
Comment 10•23 years ago
|
||
FWIW, after Bens Save check-in last night, loading the saved page via: file and saving again removes most of the extra blank lines. Still some places where the formating changes though.
Comment 11•23 years ago
|
||
Webbrowserpersist still uses text/html as the default document encoder but allows the caller to specify any content type they like.
Target Milestone: --- → Future
Updated•23 years ago
|
Keywords: helpwanted
Comment 12•23 years ago
|
||
A combination of tests to figure out the correct content type for xml, html, xhtml * QI for nsIDOMHTMLDocument * Examine doctype to see if is named "html" and whether the public ID contains the substring "xhtml" I am not sure how text documents are represented as a DOM, but if there is an obvious test (e.g. check topmost node is a text node) then it should be possible to handle that too.
Comment 13•23 years ago
|
||
How about using the nsIDOMNSDocument::contentType property? That should be somewhat more reliable than looking at the contents... Text documents have the following DOM: <html><body><pre>text</pre></body></html>
Comment 14•23 years ago
|
||
This patch uses nsIDOMNSDocument to obtain the content type of the document. It then uses that content type to instantiate the proper encoder or falls back on text/html if no encoder for the content type exists I have noticed that even with the correct content type XML documents still don't come out properly (e.g. missing <?xml version="1.0"?> at the top) so there is likely an issue in the document encoder itself. But the persist object is now doing the right thing.
Comment 17•23 years ago
|
||
Comment on attachment 69048 [details] [diff] [review] Updated patch after recent upload changes sr=kin@netscape.com with brade's comment addressed. Also, this: + nsAutoString defaultContentType; + defaultContentType.Assign(NS_LITERAL_STRING("text/html")); can be collapsed into a single init: + nsAutoString defaultContentType(NS_LITERAL_STRING("text/html"));
Attachment #69048 -
Flags: superreview+
Attachment #69048 -
Flags: review+
Comment 18•23 years ago
|
||
> + nsAutoString defaultContentType(NS_LITERAL_STRING("text/html"));
Or better yet:
NS_NAMED_LITERAL_STRING(defaultContentType, "text/html");
Which avoids a string copy and all that.
Comment 19•23 years ago
|
||
Fix is in
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
Comment 20•23 years ago
|
||
over to embed qa for testing [load-balancing]...
QA Contact: sairuh → mdunn
Reporter | ||
Comment 21•23 years ago
|
||
If this fix is actually checked in, it does nothing to clear up what I originally reported. Save as output in todays linux build is exactly the same as in attachment 1 [details] [diff] [review].
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 22•23 years ago
|
||
Fix is definitely checked in but I believe the XML encoder is busted. See comment 14
Comment 23•23 years ago
|
||
Jeremy, is your personal page served as text/html or with some XML mimetype?
Reporter | ||
Comment 24•23 years ago
|
||
Content-Type: text/html <?xml version="1.0" encoding="US-ASCII"?> <meta http-equiv="Content-Type" content="text/html; charset=US-ASCII" /> When I used an XML content-type for the header at least, three of the six browsers I tested wouldn't load it. Save-as shouldn't be dependent on this at all, as far as the user is concerned.
Comment 25•23 years ago
|
||
> Save-as shouldn't be dependent on this at all, as far as the user is concerned.
Save HTML only is not.
Save Full page is, only because it saves the DOM. The <? ?> syntax is
XML-specific and thus does not get encoded. It's not even in the HTML DOM (not
being valid HTML), so it could not _possibly_ get encoded. For XML pages, PIs
_are_ in the DOM.
Comment 26•23 years ago
|
||
This is a dom->text conversion issue
Assignee: adamlock → harishd
Status: REOPENED → NEW
Component: File Handling → DOM to Text Conversion
QA Contact: mdunn → sujay
Summary: Webrowserpersist hardcodes text/html for saving DOM documents → whitespace formatting should not be reworked
Comment 27•23 years ago
|
||
Opened bug 127300 to specifically cover problems with the XML serializer. I think this bug should be closed. It is probably an issue for Ben to decide whether we should be using saveDocument to save XML files (except XHTML) since there is nothing to fix up in them anyway. The saveURI method might be a better choice.
Status: NEW → RESOLVED
Closed: 23 years ago → 23 years ago
Resolution: --- → FIXED
Comment 28•22 years ago
|
||
Reporter: please verify and mark verified-fixed...thanks!
Reporter | ||
Comment 29•22 years ago
|
||
I don't know what to verify. I still get MAJOR whitespace changes when saving my page. If this bug morphed into something else, then whoever morphed it needs to verify. My problem still exsists with 2002052109 Linux 1.0 branch.
You need to log in
before you can comment on or make changes to this bug.
Description
•