Save as "Web page, complete" invalidates XHTML by omitting end tags for empty elements

RESOLVED WONTFIX

Status

Core
HTML: Parser
P2
normal
RESOLVED WONTFIX
16 years ago
6 years ago

People

(Reporter: w.b.garvelink, Assigned: Heikki Toivonen (remove -bugzilla when emailing directly))

Tracking

(Blocks: 1 bug, {dataloss, xhtml})

Trunk
mozilla1.0
x86
Windows 2000
dataloss, xhtml

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(4 attachments)

(Reporter)

Description

16 years ago
(Build id: 2002011703)

After saving a valid XHTML 1.0 document as "Web page, complete", the document no
longer validates. Ending-slashes for self-closing tags such as <meta/> and
<link/> are omitted.
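
The symptom can be reproduced outside the browser. The sketch below is not Mozilla's serializer: it uses Python's standard `xml.etree.ElementTree`, whose `tostring` function accepts `method="html"` or `method="xml"`, to show how the same element tree loses its trailing slashes under HTML output rules. The element and attribute values, and the `is_well_formed` helper, are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Build a small <head> with two empty elements (illustrative values).
head = ET.Element("head")
ET.SubElement(head, "meta", {"name": "author", "content": "example"})
ET.SubElement(head, "link", {"rel": "stylesheet", "href": "style.css"})

# HTML output rules: empty elements are written as a bare start tag.
as_html = ET.tostring(head, encoding="unicode", method="html")

# XML output rules: empty elements are written in self-closing form.
as_xml = ET.tostring(head, encoding="unicode", method="xml")


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(as_html)
print(as_xml)
print(is_well_formed(as_html), is_well_formed(as_xml))
```

Only the XML form can be fed back to an XML parser, which is the practical meaning of "no longer validates" here.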
->Parser
Assignee: asa → harishd
Component: Browser-General → Parser
QA Contact: doronr → moied
Should this really be parser?  It sounds like webbrowserpersist is screwing up
the DOM-to-text conversion...

Comment 3

16 years ago
It might save & validate correctly if webbrowserpersist were told to save it as
xhtml. At present webbrowserpersist is not told to save with any content type, so
it defaults to text/html and as such probably fouls it up.

I believe the fix would be to determine the content type of the document (html,
xml, and xhtml being the main ones) so the appropriate encoder can be used.
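
The fix proposed in this comment amounts to a dispatch on the document's content type. A minimal sketch of that idea follows, using Python's `xml.etree.ElementTree` rather than Mozilla's encoders. The helper names `is_xml_type` and `serialize_for`, and the set of types treated as XML, are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Assumed set of XML media types for this sketch; any type with a
# "+xml" suffix is also treated as XML.
XML_TYPES = {"text/xml", "application/xml"}


def is_xml_type(content_type: str) -> bool:
    """Classify a media type, ignoring parameters such as charset."""
    essence = content_type.split(";", 1)[0].strip().lower()
    return essence in XML_TYPES or essence.endswith("+xml")


def serialize_for(content_type: str, root: ET.Element) -> str:
    """Pick the output method from the content type (hypothetical helper)."""
    method = "xml" if is_xml_type(content_type) else "html"
    return ET.tostring(root, encoding="unicode", method=method)


br = ET.Element("br")
print(serialize_for("text/html", br))
print(serialize_for("application/xhtml+xml; charset=utf-8", br))
```

With this shape, a caller that always passes `text/html` gets HTML output no matter what the document really is, which matches the behaviour described above.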

Comment 4

16 years ago
Shouldn't this bug go to Ben so he can tell webbrowserpersist what content type
to use?

Comment 5

16 years ago
And it messes up any &entity; too. Confirming...
Status: UNCONFIRMED → NEW
Ever confirmed: true

Updated

16 years ago
Blocks: 115634

Comment 6

16 years ago
If the XHTML document is served as text/html then shorthand tags will lose the '/'.
However, if the document is served as text/xml ( and other flavors of xml ) then
the '/' |should| be retained. If that's not happening then it's a bug in the DOM
serializer.

Reassigning bug to Tanu for further investigation.
Assignee: harishd → tmutreja

Comment 7

16 years ago
Adam's comment #3 is exactly what I find here. The default content type of
webbrowserpersist, i.e. "text/html", is being passed to nsDocumentEncoder and
causing the bug. So it should not be a DOM-to-text serialization bug.

Harish, reassigning to you. Please assign it to the right person and change the 
component accordingly.  
Assignee: tmutreja → harishd

Comment 8

16 years ago
The webbrowserpersist object has changed recently. If you pass in nsnull for the
content type it will ask the DOM for it via its nsIDOMNSDocument method. So in
the case of XHTML, it should return "application/xhtml+xml" and the correct
serializer will be invoked.

You should verify with a recent build that "text/html" is being used because it
shouldn't be unless the server is mistakenly returning it as the content type.

Even if the XML serializer is invoked, there is still bug 127300 to deal with.
The serializer is not generating valid XML, let alone XHTML.
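
The lookup described in this comment, where a null content type makes the persist object ask the document, can be sketched as a simple fallback. The `Document` class and `resolve_content_type` function are hypothetical stand-ins written for this illustration; they are not the real webbrowserpersist or `nsIDOMNSDocument` interfaces.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Document:
    """Hypothetical stand-in for a loaded document."""
    content_type: str


def resolve_content_type(requested: Optional[str], doc: Document) -> str:
    """Use the caller's content type if given, otherwise ask the document."""
    if requested is not None:
        return requested
    return doc.content_type


xhtml_doc = Document(content_type="application/xhtml+xml")

# No explicit type: the document's own type is used.
print(resolve_content_type(None, xhtml_doc))

# An explicit type overrides the document, which is how a hard-coded
# "text/html" default would reproduce the behaviour reported here.
print(resolve_content_type("text/html", xhtml_doc))
```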

Comment 9

16 years ago
*** Bug 132205 has been marked as a duplicate of this bug. ***

Comment 10

16 years ago
I think heikki is working on something similar to this. 

--> heikki
Assignee: harishd → heikki
Keywords: dataloss, xhtml
Priority: -- → P2
Target Milestone: --- → mozilla1.0
Created attachment 78275 [details]
XHTML testcase
Created attachment 78276 [details]
XML testcase

Comment 13

16 years ago
This seems to be working, marking wfm.

There are still issues, though: it is not possible to just save the XML page. It
always tries to save as complete. Anybody have any idea how to start unwinding
that? Also the save as box only says save XML or XHTML (it does not indicate
that it is actually save as complete, and there are no other options).

Another thing is that when saving, I hit an assertion every time. The assertion
happens in nsStandardURL::SetUserPass:
 NS_ASSERTION(mHost.mLen >= 0, "uninitialized");
which is called from nsWebBrowserPersist::FixupNodeAttribute() like this:

            fileAsURI->SetUserPass(NS_LITERAL_CSTRING(""));

and the stack is:

nsDebug::Assertion(const char * 0x01559fd8, const char * 0x01559fc8, const char
* 0x01559f98, int 1107) line 291 + 13 bytes
nsStandardURL::SetUserPass(nsStandardURL * const 0x046a16c8, const nsACString &
{...}) line 1107 + 35 bytes
nsWebBrowserPersist::FixupNodeAttribute(nsIDOMNode * 0x03559a04, const char *
0x0145aae0) line 2172 + 55 bytes
nsWebBrowserPersist::CloneNodeWithFixedUpURIAttributes(nsIDOMNode * 0x036509bc,
nsIDOMNode * * 0x0012f5dc) line 2020
nsEncoderNodeFixup::FixupNode(nsEncoderNodeFixup * const 0x03832208, nsIDOMNode
* 0x036509bc, nsIDOMNode * * 0x0012f5dc) line 2680 + 19 bytes
nsDocumentEncoder::SerializeNodeStart(nsIDOMNode * 0x036509bc, int 0, int -1,
nsAString & {???}) line 299 + 66 bytes
nsDocumentEncoder::SerializeToStringRecursive(nsIDOMNode * 0x036509bc, nsAString
& {???}) line 380 + 20 bytes
nsDocumentEncoder::SerializeToStringRecursive(nsIDOMNode * 0x036b4d28, nsAString
& {???}) line 401 + 21 bytes
nsDocumentEncoder::SerializeToStringRecursive(nsIDOMNode * 0x03599c20, nsAString
& {???}) line 401 + 21 bytes
nsDocumentEncoder::SerializeToStringRecursive(nsIDOMNode * 0x037ac9e4, nsAString
& {???}) line 401 + 21 bytes
nsDocumentEncoder::EncodeToString(nsDocumentEncoder * const 0x03559490,
nsAString & {???}) line 939 + 21 bytes
nsDocumentEncoder::EncodeToStream(nsDocumentEncoder * const 0x03559490,
nsIOutputStream * 0x046a1d20) line 978 + 19 bytes
nsWebBrowserPersist::SaveDocumentWithFixup(nsIDocument * 0x037ac9e0,
nsIDocumentEncoderNodeFixup * 0x03832208, nsIURI * 0x038317e8, int 1, const char
* 0x0012fbe4, const nsString & {""}, unsigned int 0) line 2384 + 44 bytes
nsWebBrowserPersist::SaveDocuments() line 1411 + 87 bytes
nsWebBrowserPersist::OnStopRequest(nsWebBrowserPersist * const 0x036223f8,
nsIRequest * 0x03832290, nsISupports * 0x00000000, unsigned int 0) line 586 + 11
bytes
nsHttpChannel::OnStopRequest(nsHttpChannel * const 0x03832294, nsIRequest *
0x03558f8c, nsISupports * 0x00000000, unsigned int 0) line 2812
nsOnStopRequestEvent::HandleEvent() line 213
nsARequestObserverEvent::HandlePLEvent(PLEvent * 0x046bafdc) line 116
PL_HandleEvent(PLEvent * 0x046bafdc) line 596 + 10 bytes
PL_ProcessPendingEvents(PLEventQueue * 0x011f4d90) line 526 + 9 bytes
_md_EventReceiverProc(HWND__ * 0x008006fc, unsigned int 49455, unsigned int 0,
long 18828688) line 1077 + 9 bytes

Adding darin in case this assertion is something we need to be concerned about.
Status: NEW → RESOLVED
Last Resolved: 16 years ago
Resolution: --- → WORKSFORME

Comment 14

16 years ago
My guess is an nsILocalFile datapath is being supplied to the persist object so
it assumes you want to save complete. For XML the datapath should probably be
nsnull & XHTML should offer the same choices as HTML.

Comment 15

16 years ago
Darin has a bug on the assertion for SetUserPass.  I'm sorry I don't have the
bug # handy.

Comment 16

16 years ago
yeah, the assertion is not critical.
Created attachment 78398 [details]
txt
Created attachment 78399 [details]
xul
*** Bug 152517 has been marked as a duplicate of this bug. ***

Comment 20

16 years ago
*** Bug 163131 has been marked as a duplicate of this bug. ***

Updated

16 years ago
Summary: Save as "Web page, complete" invalidates XHTML → Save as "Web page, complete" invalidates XHTML by omitting end tags for empty elements

Comment 21

14 years ago
This is definitely not working for me in Firebird 0.7.  I saved several pages
which were marked as:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">

and it removed all the /> leaving the page as invalid XHTML.

Since the last entry for this bug is two years old, I'd like to reopen the bug
because it certainly was never worksforme in Firebird 0.7.

Comment 22

14 years ago
Richard, make sure the mime type of your document is an XML mime type and not
HTML. If you have the file on local disc with suffix .xml or .xhtml then it
should be so. 

If your document is served with the HTML mime type, this is the wrong bug. Open
a new one, component: DOM to Text Conversion.

If your document mime type is an XML mime type, and you see the bug in Firefox
but not in Mozilla, open a new bug, Product: Firefox, Component: General.

Only if your document has an XML mime type and you see this in the Mozilla
(Suite) browser, reopen this bug.

Thanks.

Comment 23

14 years ago
*** Bug 248198 has been marked as a duplicate of this bug. ***

Comment 24

14 years ago
*** Bug 253626 has been marked as a duplicate of this bug. ***

Comment 25

13 years ago
*** Bug 264418 has been marked as a duplicate of this bug. ***
*** Bug 278692 has been marked as a duplicate of this bug. ***

Comment 27

13 years ago
*** Bug 205264 has been marked as a duplicate of this bug. ***

Comment 28

13 years ago
*** Bug 281580 has been marked as a duplicate of this bug. ***

Comment 29

13 years ago
*** Bug 292702 has been marked as a duplicate of this bug. ***

Comment 30

12 years ago
*** Bug 308930 has been marked as a duplicate of this bug. ***

Updated

12 years ago
Blocks: 310903
*** Bug 321965 has been marked as a duplicate of this bug. ***

Comment 32

12 years ago
*** Bug 336399 has been marked as a duplicate of this bug. ***

Updated

10 years ago
Duplicate of this bug: 393043

Comment 34

10 years ago
It is interesting that the choice of filename would make the difference. Should not meta http-equiv="Content-Type" content="text/xhtml" be enough?

I tried the following on Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.8.1.10) Gecko/0000000000 Fedora/2.0.0.10-3.fc8 Firefox/2.0.0.10. Indeed, for a filename ending in .html the saved file came out wrong even with the meta content-type, but for .xhtml it came out right:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/xhtml;charset=utf-8" />

There is a different problem now, though: if I have anything that looks like broken HTML in a <pre> element, then only the .html file is displayed; the .xhtml file instead complains:

XML Parsing Error: not well-formed
Location: http://woot/test.xhtml
Line Number 63, Column 1:
...

Not exactly what I wanted.
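
The difference reported here follows from XML parsers being strict where HTML parsers are forgiving. A small sketch with Python's standard library (not Gecko's parsers) shows the same split: `html.parser.HTMLParser` accepts an unescaped `<` inside `<pre>`, while `xml.etree.ElementTree` rejects it as not well-formed. The sample snippet and the helper names `TextCollector` and `xml_error` are invented for the example.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# A <pre> block containing an unescaped "<", as might appear in pasted code.
snippet = "<pre>if (a < b) return;</pre>"


class TextCollector(HTMLParser):
    """Collects the character data seen by the lenient HTML parser."""

    def __init__(self) -> None:
        super().__init__()
        self.text = []

    def handle_data(self, data: str) -> None:
        self.text.append(data)


def xml_error(markup: str) -> str:
    """Return the XML parser's error message, or "" if well-formed."""
    try:
        ET.fromstring(markup)
        return ""
    except ET.ParseError as exc:
        return str(exc)


collector = TextCollector()
collector.feed(snippet)
collector.close()

print("HTML parser saw:", "".join(collector.text))
print("XML parser said:", xml_error(snippet) or "ok")

# Escaping the character makes the same content acceptable to both parsers.
print("Escaped form:", xml_error("<pre>if (a &lt; b) return;</pre>") or "ok")
```

So a page that only displays under the HTML parser was never well-formed XML, and serving or naming it as XHTML exposes that.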

Comment 35

10 years ago
When I wanted to fix a typo in one of my online pages, this bug turned perfectly valid XHTML, which I had parked online to be published somewhere else later, into something with loads of errors to fix. (I don't know how it was served, but the validator didn't complain, and it doesn't change the matter anyway.)

As a quick half-fix I propose changing the text "Web Page, complete" to something like "full Web Page 'snapshot'", and next to that I'd also like an option "(X)HTML file as received from server".

The user doesn't care much about whether Firefox is right or not, but cares about clarity and predictability, i.e. usability. Also, if http://validator.w3.org thinks files are OK while Firefox doesn't agree, please file a bug with the validator or Firefox.

Comment 36

10 years ago
(In reply to comment #22)
> Richard, make sure the mime type of your document is an XML mime type and not
> HTML. If you have the file on local disc with suffix .xml or .xhtml then it
> should be so. 
> 
> If your document is served with the HTML mime type, this is the wrong bug. Open
> a new one, component: DOM to Text Conversion.

I would vote for such a bug. Per the XHTML standards, text/html can be used for XHTML, which should be serialized as XML even if text/html may be parsed as HTML.

(In reply to comment #34)
> It is interesting that the choice of filename would make the difference. Should
> not meta http-equiv="Content-Type" content="text/xhtml" be enough?

text/xhtml doesn't exist, I think you're thinking of application/xhtm+xml.
And note that the meta http-equiv does not have precedence over the media type actually served by the HTTP server.
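
The precedence rule in this comment can be made concrete with a small check. This is a simplified model, not browser code: the function names are hypothetical, and it only compares the type named in the HTTP `Content-Type` header with the one declared in `<meta http-equiv>`, treating the header as authoritative.

```python
from typing import Optional, Tuple


def media_type_essence(value: str) -> str:
    """Strip parameters such as charset and normalise case."""
    return value.split(";", 1)[0].strip().lower()


def check_declared_type(http_header: str,
                        meta_http_equiv: Optional[str]) -> Tuple[str, bool]:
    """Return (effective type, whether the meta declaration disagrees).

    The effective type always comes from the HTTP header; the meta value
    is only compared against it, never used to override it.
    """
    effective = media_type_essence(http_header)
    declared = media_type_essence(meta_http_equiv) if meta_http_equiv else None
    return effective, declared is not None and declared != effective


# A page that declares an XHTML type in <meta> but is served as text/html.
print(check_declared_type("text/html; charset=utf-8",
                          "application/xhtml+xml; charset=utf-8"))

# Header and meta agree.
print(check_declared_type("application/xhtml+xml",
                          "application/xhtml+xml; charset=utf-8"))
```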

Comment 37

10 years ago
(In reply to comment #36)
> text/xhtml doesn't exist, I think you're thinking of application/xhtm+xml.

application/xhtml+xml
apologies for the typo, and for the noise.
Duplicate of this bug: 430534
Duplicate of this bug: 696508
This bug is not WORKSFORME, as can be seen from, among other things, the number of duplicates, even recent ones; and it applies to current trunk, as shown by its duplicate, bug 696508.

However, the exact same problem was RESOLVED WONTFIX by Boris Zbarsky in bug 696508 comment #8.
Resolution: WORKSFORME → WONTFIX

Comment 41

6 years ago
It certainly is not a WORKSFORME here - I opened bug 696508 having failed to spot this bug report.

I'm not that impressed by RESOLVED WONTFIX either - the software does not work as it should and the developers do not want to fix the problem.

Guess it is time to consider my other, non-Mozilla, browsing options.