Closed Bug 120556 Opened 23 years ago Closed 22 years ago

Save as "Web page, complete" invalidates XHTML by omitting end tags for empty elements

Categories

(Core :: DOM: HTML Parser, defect, P2)

x86
Windows 2000
defect

Tracking

()

RESOLVED WONTFIX
mozilla1.0

People

(Reporter: w.b.garvelink, Assigned: hjtoi-bugzilla)

References

(Blocks 1 open bug)

Details

(Keywords: dataloss, xhtml)

Attachments

(4 files)

(Build id: 2002011703)

After saving a valid XHTML 1.0 document as "Web page, complete", the document no
longer validates. Ending-slashes for self-closing tags such as <meta/> and
<link/> are omitted.
->Parser
Assignee: asa → harishd
Component: Browser-General → Parser
QA Contact: doronr → moied
Should this really be parser?  It sounds like webbrowserpersist is screwing up
the DOM-to-text conversion...
It might save & validate correctly if webbrowserpersist were told to save it as
xhtml. At present webbrowserpersist is not told to save with any content type, so
it defaults to text/html and as such probably fouls it up.

I believe the fix would be to determine the content type of the document - html, xml,
xhtml being the main ones - so the appropriate encoder can be used.
Shouldn't this bug go to Ben so he can tell webbrowserpersist what content type
to use?
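A minimal sketch of the suggestion above, choosing an encoder from the document's content type. This is illustrative Python, not Gecko code: `pick_serializer` and `XML_TYPES` are hypothetical names, and the set of XML types is an assumption rather than the list webbrowserpersist actually uses.

```python
# Hypothetical helper; the names and the type list are assumptions for illustration.
XML_TYPES = {"text/xml", "application/xml", "application/xhtml+xml"}


def pick_serializer(content_type):
    """Return "xml" or "html" for a MIME type string, ignoring parameters."""
    # Drop parameters such as "; charset=utf-8" and normalize case.
    base = content_type.split(";", 1)[0].strip().lower()
    if base in XML_TYPES or base.endswith("+xml"):
        return "xml"
    # Anything else falls back to the HTML encoder, mirroring the
    # text/html default described in this comment.
    return "html"
```

Under this sketch a document reported as application/xhtml+xml would go to the XML encoder, while one reported as text/html would still lose the '/' on empty elements.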
And it messes up any &entity; too. Confirming...
Status: UNCONFIRMED → NEW
Ever confirmed: true
Blocks: 115634
If the XHTML document is served as text/html then short-hand-tags will lose '/'.
However, if the document is served as text/xml ( and other flavors of xml ) then
the '/' |should| be retained. If that's not happening then it's a bug in the DOM
serializer.
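The difference between the two serializations can be reproduced outside Mozilla. The sketch below uses Python's standard `xml.etree.ElementTree`, which is not the Gecko serializer; it is only meant to show how an HTML-mode serializer drops the '/' on an empty element while an XML-mode serializer keeps it.

```python
import xml.etree.ElementTree as ET

# A single empty element, as it might appear in an XHTML <head>.
meta = ET.Element("meta", {"charset": "utf-8"})

# XML mode keeps the element self-closed.
as_xml = ET.tostring(meta, encoding="unicode", method="xml")

# HTML mode treats <meta> as a void element and writes no closing slash.
as_html = ET.tostring(meta, encoding="unicode", method="html")

print(as_xml)
print(as_html)
```

The HTML-mode output is valid HTML but no longer well-formed XML, which is the symptom reported in this bug.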

Reassigning bug to Tanu for further investigation.
Assignee: harishd → tmutreja
Adam's comment#3 is exactly what I find here. The default content type of 
webbrowserpersist i.e. "text/html" is being passed to nsDocumentEncoder and 
causing the bug. So it should not be a DOM-to-text serialization bug.

Harish, reassigning to you. Please assign it to the right person and change the 
component accordingly.  
Assignee: tmutreja → harishd
The webbrowserpersist object has changed recently. If you pass in nsnull for the
content type it will ask the DOM for it via its nsIDOMNSDocument method. So in
the case of XHTML, it should return "application/xhtml+xml" and the correct
serializer will be invoked.

You should verify with a recent build that "text/html" is being used because it
shouldn't be unless the server is mistakenly returning it as the content type.

Even if the XML serializer is invoked, there is still bug 127300 to deal with.
The serializer is not generating valid XML, let alone XHTML.
*** Bug 132205 has been marked as a duplicate of this bug. ***
I think heikki is working on something similar to this. 

--> heikki
Assignee: harishd → heikki
Keywords: dataloss, xhtml
Priority: -- → P2
Target Milestone: --- → mozilla1.0
This seems to be working, marking wfm.

There are still issues, though: it is not possible to just save the XML page. It
always tries to save as complete. Anybody have any idea how to start unwinding
that? Also the save as box only says save XML or XHTML (it does not indicate
that it is actually save as complete, and there are no other options).

Another thing is that when saving, I hit an assertion every time. The assertion happens
in nsStandardURL::SetUserPass:
 NS_ASSERTION(mHost.mLen >= 0, "uninitialized");
which is called from nsWebBrowserPersist::FixupNodeAttribute() like this:

            fileAsURI->SetUserPass(NS_LITERAL_CSTRING(""));

and the stack is:

nsDebug::Assertion(const char * 0x01559fd8, const char * 0x01559fc8, const char
* 0x01559f98, int 1107) line 291 + 13 bytes
nsStandardURL::SetUserPass(nsStandardURL * const 0x046a16c8, const nsACString &
{...}) line 1107 + 35 bytes
nsWebBrowserPersist::FixupNodeAttribute(nsIDOMNode * 0x03559a04, const char *
0x0145aae0) line 2172 + 55 bytes
nsWebBrowserPersist::CloneNodeWithFixedUpURIAttributes(nsIDOMNode * 0x036509bc,
nsIDOMNode * * 0x0012f5dc) line 2020
nsEncoderNodeFixup::FixupNode(nsEncoderNodeFixup * const 0x03832208, nsIDOMNode
* 0x036509bc, nsIDOMNode * * 0x0012f5dc) line 2680 + 19 bytes
nsDocumentEncoder::SerializeNodeStart(nsIDOMNode * 0x036509bc, int 0, int -1,
nsAString & {???}) line 299 + 66 bytes
nsDocumentEncoder::SerializeToStringRecursive(nsIDOMNode * 0x036509bc, nsAString
& {???}) line 380 + 20 bytes
nsDocumentEncoder::SerializeToStringRecursive(nsIDOMNode * 0x036b4d28, nsAString
& {???}) line 401 + 21 bytes
nsDocumentEncoder::SerializeToStringRecursive(nsIDOMNode * 0x03599c20, nsAString
& {???}) line 401 + 21 bytes
nsDocumentEncoder::SerializeToStringRecursive(nsIDOMNode * 0x037ac9e4, nsAString
& {???}) line 401 + 21 bytes
nsDocumentEncoder::EncodeToString(nsDocumentEncoder * const 0x03559490,
nsAString & {???}) line 939 + 21 bytes
nsDocumentEncoder::EncodeToStream(nsDocumentEncoder * const 0x03559490,
nsIOutputStream * 0x046a1d20) line 978 + 19 bytes
nsWebBrowserPersist::SaveDocumentWithFixup(nsIDocument * 0x037ac9e0,
nsIDocumentEncoderNodeFixup * 0x03832208, nsIURI * 0x038317e8, int 1, const char
* 0x0012fbe4, const nsString & {""}, unsigned int 0) line 2384 + 44 bytes
nsWebBrowserPersist::SaveDocuments() line 1411 + 87 bytes
nsWebBrowserPersist::OnStopRequest(nsWebBrowserPersist * const 0x036223f8,
nsIRequest * 0x03832290, nsISupports * 0x00000000, unsigned int 0) line 586 + 11
bytes
nsHttpChannel::OnStopRequest(nsHttpChannel * const 0x03832294, nsIRequest *
0x03558f8c, nsISupports * 0x00000000, unsigned int 0) line 2812
nsOnStopRequestEvent::HandleEvent() line 213
nsARequestObserverEvent::HandlePLEvent(PLEvent * 0x046bafdc) line 116
PL_HandleEvent(PLEvent * 0x046bafdc) line 596 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x011f4d90) line 526 + 9 bytes
_md_EventReceiverProc(HWND__ * 0x008006fc, unsigned int 49455, unsigned int 0,
long 18828688) line 1077 + 9 bytes

Adding darin in case this assertion is something we need to be concerned about.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → WORKSFORME
My guess is an nsILocalFile datapath is being supplied to the persist object so
it assumes you want to save complete. For XML the datapath should probably be
nsnull & XHTML should offer the same choices as HTML.
Darin has a bug on the assertion for SetUserPass.  I'm sorry I don't have the
bug # handy.
yeah, the assertion is not critical.
*** Bug 152517 has been marked as a duplicate of this bug. ***
*** Bug 163131 has been marked as a duplicate of this bug. ***
Summary: Save as "Web page, complete" invalidates XHTML → Save as "Web page, complete" invalidates XHTML by omitting end tags for empty elements
This is definitely not working for me in Firebird 0.7.  I saved several pages
which were marked as:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

and it removed all the /> leaving the page as invalid XHTML.

Since the last entry for this bug is two years old, I'd like to reopen the bug
because it certainly was never worksforme in Firebird 0.7.
Richard, make sure the mime type of your document is an XML mime type and not
HTML. If you have the file on local disc with suffix .xml or .xhtml then it
should be so. 

If your document is served with the HTML mime type, this is the wrong bug. Open
a new one, component: DOM to Text Conversion.

If your document mime type is an XML mime type, and you see the bug in Firefox
but not in Mozilla, open a new bug, Product: Firefox, Component: General.

Only if your document has an XML mime type and you see this in the Mozilla
(Suite) browser, reopen this bug.

Thanks.
*** Bug 248198 has been marked as a duplicate of this bug. ***
*** Bug 253626 has been marked as a duplicate of this bug. ***
*** Bug 264418 has been marked as a duplicate of this bug. ***
*** Bug 278692 has been marked as a duplicate of this bug. ***
*** Bug 205264 has been marked as a duplicate of this bug. ***
*** Bug 281580 has been marked as a duplicate of this bug. ***
*** Bug 292702 has been marked as a duplicate of this bug. ***
*** Bug 308930 has been marked as a duplicate of this bug. ***
Blocks: 310903
*** Bug 321965 has been marked as a duplicate of this bug. ***
*** Bug 336399 has been marked as a duplicate of this bug. ***
It is interesting that the choice of filename would make the difference. Should not meta http-equiv="Content-Type" content="text/xhtml" be enough?

I tried the following on Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.10) Gecko/0000000000 Fedora/2.0.0.10-3.fc8 Firefox/2.0.0.10 and indeed, for a filename ending in .html, even when using meta content-type, it would come out wrong when saved, but for .xhtml it would come out right:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=utf-8" />

There is a different problem now, though: if I have anything that looks like broken HTML in a <pre> environment, then only the .html file is displayed; the .xhtml file instead complains:

XML Parsing Error: not well-formed
Location: http://woot/test.xhtml
Line Number 63, Column 1:
...

Not exactly what I wanted.
When I wanted to fix a typo in one of my pages online, this bug turned the perfectly valid XHTML code I had parked online to be published somewhere else later (I don't know how it was served, but the validator didn't complain, and it doesn't change the matter anyway) into something that had loads of errors to be fixed.

As a quick half fix I propose changing the text "Web Page, complete" into something else like "full Web Page 'snapshot'" and next to that I'd also like "(X)HTML file as received from server".

The user doesn't care much about whether Firefox is right or not, but cares about clarity and predictability, aka usability. Also if http://validator.w3.org thinks files are OK, while Firefox doesn't agree, please file a bug with the validator or Firefox.
(In reply to comment #22)
> Richard, make sure the mime type of your document is an XML mime type and not
> HTML. If you have the file on local disc with suffix .xml or .xhtml then it
> should be so. 
> 
> If your document is served with the HTML mime type, this is the wrong bug. Open
> a new one, component: DOM to Text Conversion.

I would vote for such a bug. Per the XHTML standards, text/html can be used for XHTML, which should be serialized as XML, even if text/html may be parsed as HTML.

(In reply to comment #34)
> It is interesting that the choice of filename would make the difference. Should
> not meta http-equiv="Content-Type" content="text/xhtml" be enough?

text/xhtml doesn't exist, I think you're thinking of application/xhtm+xml.
And note that the meta http-equiv does not have precedence over the media type actually served by the HTTP server.
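A simplified model of that precedence rule, as a sketch only: `effective_media_type` is a hypothetical helper, and real browsers apply more nuanced rules to `meta http-equiv` than this fallback suggests.

```python
def effective_media_type(http_header, meta_value):
    """Return the media type to act on, or None if neither source gives one.

    Hypothetical helper: the HTTP Content-Type header, when present, wins;
    the meta http-equiv value is consulted only as a fallback.
    """
    for candidate in (http_header, meta_value):
        if candidate:
            # Drop parameters such as "; charset=utf-8" and normalize case.
            return candidate.split(";", 1)[0].strip().lower()
    return None
```

With this model, a page served as text/html stays text/html regardless of what its meta element claims, which matches the behaviour described for the .html file in comment #34.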
(In reply to comment #36)
> text/xhtml doesn't exist, I think you're thinking of application/xhtm+xml.

application/xhtml+xml
apologies for the typo, and for the noise.
This bug is not WORKSFORME, as can be seen, among other things, from the number of duplicates, even recent ones; and it applies to current trunk, as can be seen from its duplicate, bug 696508.

However, the exact same problem was RESOLVED WONTFIX by Boris Zbarsky in bug 696508 comment #8.
Resolution: WORKSFORME → WONTFIX
It certainly is not a WORKSFORME here - I opened bug 696508 having failed to spot this bug report.

I'm not that impressed by RESOLVED WONTFIX either - the software does not work as it should and the developers do not want to fix the problem.

Guess it is time to consider my other, non-Mozilla, browsing options.
See Also: → 1627092