Closed Bug 541937 Opened 10 years ago Closed 10 years ago

XMLSerializer: link tags always serialized empty (child nodes are appended as siblings)

Categories

(Core :: Serializers, defect, critical)

defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
blocking1.9.2 --- needed
status1.9.2 --- .7-fixed

People

(Reporter: thomas.comiotto, Assigned: laurent)

References

Details

(Keywords: regression, verified1.9.2)

Attachments

(4 files, 2 obsolete files)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; de-DE; rv:1.9.2) Gecko/20100115 Firefox/3.6
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; de-DE; rv:1.9.2) Gecko/20100115 Firefox/3.6

As of Firefox 3.6 xhtml link tags are serialized empty, no matter what the document mime-type is.    

This breaks my application that depends on correct xslt serialization.   

Example:

<xsl:template match="foo">
    <link>
      <xsl:attribute name="href">value</xsl:attribute>
    </link>
</xsl:template>

This is now serialized as:

<link/>
<xsl:attribute name="href">value</xsl:attribute>
 



Reproducible: Always

Steps to Reproduce:
1. Run the attached html testcase: 
Serialization of: 
<link xmlns="http://www.w3.org/1999/xhtml"><!-- child nodes --><content/></link>
Actual Results:  
<link xmlns="http://www.w3.org/1999/xhtml" />
<!-- child nodes --><content/>

Expected Results:  
<link xmlns="http://www.w3.org/1999/xhtml"><!-- child nodes --><content/></link>
Attached file XMLSerializer Testcase
Component: General → Extension Compatibility
Component: Extension Compatibility → XSLT
Product: Firefox → Core
QA Contact: general → xslt
No it's not an xslt bug. It breaks my extension which is depending on correct serialisation of xslt documents. The serializer makes the assumption that xhtml link tags are always empty - no matter what type of document contains them. 

It also url-encodes the href attributes. I'll file enother bug for the hasty auto url-encoding.
Component: XSLT → XML
Just to make it clear, did this work in firefox 3.5? I.e. is this a regression?

If so, it could be a regression from bug 500937?
Blocks: 500937
Keywords: regression
Yes it worked in firefox 3.5.
I'd reopen bug 500937. I think the idea was to introduce a set of constructors for xml/xhtml/html serialization but that never got implemented AFAICS. 

Also, the changes are undocumented. I'd expect some notes under
 
https://developer.mozilla.org/en/XMLSerializer
https://developer.mozilla.org/en/Firefox_3.6_for_developers 

possibly with example code for xml serialization only.
The test case is broken. It depends on the rest of the page getting parsed before setTimeout fires.

(In reply to comment #5)
> I'd reopen bug 500937.

That seems too hasty here. A better fix is to make the XHTML serializer allow children on elements that are void elements in text/html and aren't permitted to have children in valid XHTML.

I'm curious: What's the use case for putting children on XHTML elements that can't validly have child elements?

> I think the idea was to introduce a set of constructors
> for xml/xhtml/html serialization but that never got implemented AFAICS. 

It got implemented but we decided not to land it.
(In reply to comment #6)
> I'm curious: What's the use case for putting children on XHTML elements that
> can't validly have child elements?

Oops. Sorry. I saw the use case now.
QA Contact: xslt → xml
Is the XHTML serializer's behavior by design or an accidental side effect of sharing code with the HTML serializer?
Since it is an XML document used in the test case, the XML serializer is called, not the XHTML serializer.

I'm going to investigate on it.
(In reply to comment #9)
> Since it is an XML document used in the test case, the XML serializer is
> called, not the XHTML serializer.

Bug 500937 changed that.
sorry, you're right, I didn't remember this enhancement :-)

I found the responsible code :

- http://mxr.mozilla.org/mozilla1.9.2/source/content/base/src/nsXHTMLContentSerializer.cpp#478
- http://mxr.mozilla.org/mozilla1.9.2/source/content/base/src/nsXHTMLContentSerializer.cpp#623

I think the fix is to check the childNodes of elements, or at least, to check if there are child elements.
Status: UNCONFIRMED → NEW
Component: XML → Serializers
Ever confirmed: true
QA Contact: xml → dom-to-text
Status: NEW → ASSIGNED
I'm preparing a patch
Assignee: nobody → laurent
Attached patch the patch (obsolete) — Splinter Review
Attachment #425222 - Flags: review?(hsivonen)
Thanks! 

Should I file another bug for the unwanted url-encoding of href attributes (e.g. in xstl or other xml based template languages the href attribute might contain  an evaluated expression -> href="{$url}") or could a fix be added to this patch?
no, create a new bug please.
Disclaimer: I'm not officially qualified to review anything.

>+PRBool
>+nsXHTMLContentSerializer::IsEmptyElement(nsIContent * aContent) {

It might be less confusing to name this method "HasChildren", because "empty element" in the (X)HTML context is often considered to have a special meaning (what's now "void element" in HTML5).

>+
>+    PRUint32 i, childCount = aContent->GetChildCount();

The code in this method seems to be indented too far.

>+
>+    for (i = 0; i < childCount; ++i) {
>+
>+      nsIContent* child = aContent->GetChildAt(i);
>+
>+      if (!child->IsNodeOfType(nsINode::eTEXT))
>+        return PR_FALSE;
>+
>+      if (!child->TextIsOnlyWhitespace())
>+        return PR_FALSE;

What's the motivation of moving whitespace around? If the document has been parsed from text/html or from text/html-compatible XML, there will be no whitespace. OTOH, for text/html-incompatible uses, such as round-tripping XSLT programs, it seems odd to move the space around.
Attached patch patch v2 (obsolete) — Splinter Review
Addressed Henri's comments. I close the tag immediately only if there are no children or just empty text nodes.
Attachment #425222 - Attachment is obsolete: true
Attachment #428394 - Flags: superreview?(jonas)
Attachment #428394 - Flags: review?(Olli.Pettay)
Attachment #425222 - Flags: review?(hsivonen)
Comment on attachment 428394 [details] [diff] [review]
patch v2

>+    nsIParserService* parserService = nsContentUtils::GetParserService();
>+  
>+    if (parserService) {
>+      PRBool isContainer;
>+      parserService->IsContainer(parserService->HTMLAtomTagToId(aName),
>+                               isContainer);
Align isContainer under parserService->HTMLAtomTagToId


Could you add tests at least for <link></link> and <link /> and <link> </link>,
and also a test where you after parsing the document remove all the text from
the textnode of <link> </link> and then serialize the document.
Attachment #428394 - Flags: review?(Olli.Pettay) → review+
Attached patch patch v2;1Splinter Review
I added the tests.
Attachment #428394 - Attachment is obsolete: true
Attachment #430735 - Flags: superreview?(bzbarsky)
Attachment #428394 - Flags: superreview?(jonas)
Comment on attachment 430735 [details] [diff] [review]
patch v2;1

1)  Add a test for the HTML version of this code too, to make sure that when serializing as HTML we still skip those kids of <link>?

2)  Why bother with the check for empty textnodes?  Because there's no way to represent them in the serialization?  I think it would be just as simple to look at the child count of the element and skip the self-closing thing if the child count is nonzero.

sr=bzbarsky either way.
Attachment #430735 - Flags: superreview?(bzbarsky) → superreview+
Blocks: 422403
checked in the trunk http://hg.mozilla.org/mozilla-central/rev/ee481095f7c2 . I added tests.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Attachment #430735 - Flags: approval1.9.2.3?
Cool, thanks. What about Bug 547667 ?
fixed orange tests on windows http://hg.mozilla.org/mozilla-central/rev/694082ea95bf
The test case made the HTML5 parser go back to orange due to doctype case differences and due to white space differences after </body>. Here's a fix.
Attachment #435153 - Flags: review?(laurent)
Comment on attachment 435153 [details] [diff] [review]
Fix HTML5 parser orange

ok, thanks Henri.
Attachment #435153 - Flags: review?(laurent) → review+
Thanks for the r. Pushed the fix for the test:
http://hg.mozilla.org/mozilla-central/rev/c12c247d4018
(In reply to comment #22)
> Cool, thanks. What about Bug 547667 ?

Does bug 547667 need to be landed with the patch on this bug, or are they separate issues?
they are separate issues.
Comment on attachment 430735 [details] [diff] [review]
patch v2;1

This patch fixes a regression in gecko 1.9.2 when serializing XHTML+XSLT, so it should be applied in the 1.9.2 branch
Attachment #430735 - Flags: approval1.9.2.4? → approval1.9.2.5?
blocking1.9.2: --- → ?
Can we get a combined patch please, to ensure that the test fix lands with the original patch?
blocking1.9.2: ? → needed
Attachment #445411 - Flags: approval1.9.2.5?
Attachment #430735 - Flags: approval1.9.2.5?
Comment on attachment 445411 [details] [diff] [review]
Combined patch for the 1.9.2 branch

a=beltzner for mozilla-1.9.2 default
Attachment #445411 - Flags: approval1.9.2.5? → approval1.9.2.5+
Attachment #445411 - Flags: approval1.9.2.5+ → approval1.9.2.6+
patch checked in mozilla-1.9.2 http://hg.mozilla.org/releases/mozilla-1.9.2/rev/f848db43b669  (gecko 1.9.2.6)
Verified for 1.9.2.7 with manual testcase above using build 1 (Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.2.7) Gecko/20100701 Firefox/3.6.7 (.NET CLR 3.5.30729)). Checked in tests appear to be passing.
Keywords: verified1.9.2
Duplicate of this bug: 579667
You need to log in before you can comment on or make changes to this bug.