Closed Bug 226815 Opened 21 years ago Closed 21 years ago

xsl:output method='text' should not escape using entities

Status:

VERIFIED INVALID

People

(Reporter: manos_lists, Assigned: peterv)

Details

Attachments

(5 files)

test XML file to be used as the source document 21 years ago Manos Batsis 242 bytes, text/xml		Details
test XSLT file to demonstrate the problem 21 years ago Manos Batsis 786 bytes, text/xml		Details
what the result should be like 21 years ago Manos Batsis 190 bytes, text/plain		Details
what the result is (which is wrong) 21 years ago Manos Batsis 318 bytes, text/plain		Details
modified Axel's example 21 years ago Manos Batsis 1.09 KB, text/html		Details

Manos Batsis

Reporter

Description

•

21 years ago

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5) Gecko/20031007
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5) Gecko/20031007

The text output method should be supported properly without escaping by
transformiix or the Moz glue/serialization  code according to [1]:

[[[
The text output method outputs the result tree by outputting the string-value of
every text node in the result tree in document order without any escaping.
]]] 

Additionally, no HTML elements like <html><head></head><body> or <pre
id="transformiixResult"> should be generated to wrap the result since the output
method is text.

[1] http://www.w3.org/TR/xslt#section-Text-Output-Method

Reproducible: Always

Steps to Reproduce:
1. In an XSLT stylesheet, set the output method to 'text' and output characters
 like '<' or '&' that are XML-escaped using the predefined XML entities.

Actual Results:  
In your result, note that the characters are still escaped in the form of
entities. Additionally, the result is wrapped in an XML root element which
itself is wrapped inside HTML elements

Expected Results:  
Instead, the (rendered/serialized) result should only include the unescaped text
generated by the stylesheet without wrapping it in HTML tags.
I think this is a major bug for XSLT as the text output method cannot be used
with expected results.

Manos Batsis

Reporter

Comment 1

•

21 years ago

Attached file test XML file to be used as the source document — Details

Manos Batsis

Reporter

Comment 2

•

21 years ago

Attached file test XSLT file to demonstrate the problem — Details

Manos Batsis

Reporter

Comment 3

•

21 years ago

Attached file what the result should be like — Details

Manos Batsis

Reporter

Comment 4

•

21 years ago

Attached file what the result is (which is wrong) — Details

Axel Hecht

Updated

•

21 years ago

Attachment #136348 - Attachment mime type: text/xsl → text/xml

Axel Hecht

Comment 5

•

21 years ago

Tested on 1.4 and trunk: INVALID.
The generated result does indeed look like attachment 136349; the result in
attachment 136350 is the HTML serialisation of the generated document.
This wrapping document is consistent with what Mozilla does for text documents:
open attachment 136349, select all, and view the selection source.
If you use the DOM Inspector on the result of the transformation, you will
see that the text node in the document actually does have '<' as required.

Status: UNCONFIRMED → RESOLVED

Closed: 21 years ago

Resolution: --- → INVALID

Manos Batsis

Reporter

Comment 6

•

21 years ago

Hi Axel,

The document inspector does show '<' instead of '&lt;' but if I ctrl+A and
context-select "view selection source" I see entities. Also, if I use the JS
interface the result contains entities as well, so I guess the DOM inspector
just renders entities as it renders markup in general...

Axel Hecht

Comment 7

•

21 years ago

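// Create two empty documents and load the stylesheet and the source into them.
var xsl = document.implementation.createDocument('', '', null);
var xml = document.implementation.createDocument('', '', null);
xsl.load("file:///C:/Sources/QATests/test.xsl");
xml.load("file:///C:/Sources/QATests/test.xml");
// Run the transformation and read the value of the resulting text node.
var p = new XSLTProcessor();
p.importStylesheet(xsl);
var f = p.transformToFragment(xml, document);
f.firstChild.nodeValue
returns: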

		<documentElement attribute="value">
		<element attribute="value">
		<element>
		<element attribute="value">
		<element>
		<element attribute="value">
		<element>
		<documentElement>

I have no idea what you're doing. (Note that I skipped the async loading stuff)

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 8

•

21 years ago

Manos: It's the 'view selection source' code that replaces '<' with '&lt;'. Do
the same thing with the 'what the result should be like' attachment. The
DOM-inspector shows the nodevalue just as it's in the DOM.

transformiix displays textresult the same way that mozilla displays textfiles.
Which is a reasonable thing to do IMHO.

What we should possibly do is to set the mime-type of the result-document to
'text/plain' since that is done for textdocuments
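
The distinction between the stored value and its source view can be illustrated with a minimal sketch. The `escapeXmlText` helper and the stand-in `textNode` object are hypothetical and are not Mozilla's code; they only assume that any XML or HTML source view has to escape markup characters found in a text node.

```javascript
// Hypothetical helper: escapes text the way a serializer has to when it
// writes out a text node. This is an illustration, not Mozilla's code.
function escapeXmlText(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Stand-in for the text node produced by the transformation.
const textNode = { nodeValue: '<element attribute="value">' };

// What the DOM holds, and what the DOM Inspector shows.
console.log(textNode.nodeValue);

// What a source view of the same node has to emit.
console.log(escapeXmlText(textNode.nodeValue));
```

The node value never contains entities; they only appear once something turns the node back into markup.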

Manos Batsis

Reporter

Comment 9

•

21 years ago

Attached file modified Axel's example — Details

Right, the MIME type should depend on the xsl:output method.

I still insist that not everything is quite right, though. Axel's example uses
nodeValue, which will not return entities anyway AFAIK. I created an attachment
that prints both the nodeValue and a string serialization of the transformation
result (using an XMLSerializer): the latter is full of entities.

Peter Van der Beken [:peterv]

Assignee

Comment 10

•

21 years ago

Well doh, XMLSerializer needs to translate to entities, otherwise the resulting
string wouldn't contain valid XML! I have no idea why you would expect this not
to happen, this works as advertized. What exactly are you trying to do?

Manos Batsis

Reporter

Comment 11

•

21 years ago

> XMLSerializer needs to translate to entities, otherwise 
> the resulting string wouldn't contain valid XML!

If the transformation result we feed in the serializer was produced by a
stylesheet, with the output method set to text, it's not supposed to be valid
XML in the first place.

In this case, however, it actually is. But the serializer, in its attempt to
"make sure" the output is well-formed XML, ruins the actual markup, which is
not the smartest thing to do, that's all. IMHO the XML serializer should either
refuse to process the transformation result (which is not XML in the first
place, per the xsl:output method) or not try to be smart about escaping it.

I'm not saying the implementation does not work as advertized! The reason I use
XMLSerializer is that I don't know the output method of the XSLT in advance...

Axel Hecht

Comment 12

•

21 years ago

VERIFIED INVALID.
There is *no* way to make an XML serializer take a DOM node and not encode the
'<' characters that occur in it. You either do that or insert a CDATA section,
which is as far away from what you want as the entity stuff.
The DOM does not have text documents, period. So, as we need a DOM to display
stuff, there is no way to get around this.
As finding out the output method is trivial (at least for text, which has to be
explicitly set), there is no real problem here in the first place.
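
One way to find out the output method is sketched below, assuming the stylesheet has already been loaded into a DOM document. `getOutputMethod` is a hypothetical helper name; it only looks at xsl:output elements in the given document and ignores anything pulled in through xsl:import or xsl:include.

```javascript
const XSLT_NS = "http://www.w3.org/1999/XSL/Transform";

// Hypothetical helper: returns the first non-empty method attribute found on
// an xsl:output element of the stylesheet document, or null when none is set
// (in which case the processor falls back to its default, xml or html).
function getOutputMethod(xslDoc) {
  const outputs = xslDoc.getElementsByTagNameNS(XSLT_NS, "output");
  for (let i = 0; i < outputs.length; i++) {
    const method = outputs[i].getAttribute("method");
    if (method) {
      return method;
    }
  }
  return null;
}
```

With the stylesheet from comment 7 this could be called as `getOutputMethod(xsl)`; a return value of `"text"` signals that the result should be read as a string rather than serialized.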

Status: RESOLVED → VERIFIED

Peter Van der Beken [:peterv]

Assignee

Comment 13

•

21 years ago

The XMLSerializer takes a DOM Node as input and outputs it as serialized XML, it
does not take a string so I don't understand why you'd expect it to not
translate to entities.
Furthermore, as Axel points out, the DOM doesn't have "text documents" so what
you want will never work in the DOM. There's a reason why we convert the result
of the transformation for output="text" into a textnode with a pre element (with
id transformiixResult) as its parent, it's so you can detect this situation
through the DOM and do the reasonable thing: access the result through
.nodeValue. I don't think there's anything to solve here, you just need to adapt
your code to the fact that you're using a non-serializing processor.
We don't support section 16 in Mozilla, we don't have to and we can't anyway. We
do try to support what we can or adapt it to a non-serializing model. Giving you
access to the resulting string through a nodeValue of a textNode is one of those
adaptations. The alternative is removing support for output="text".
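
The detection described above could look roughly like the following sketch. `resultToString` is a hypothetical helper, and it assumes the result document has exactly the shape described in this comment (a text node whose parent is a pre element with id transformiixResult); that wrapper may differ in other Gecko versions.

```javascript
// Hypothetical helper: returns the transformation result as a string.
// `resultDoc` is the document produced by the transformation; `serializer`
// is any object with a serializeToString(node) method, such as an
// XMLSerializer instance.
function resultToString(resultDoc, serializer) {
  const pre = resultDoc.getElementById("transformiixResult");
  if (pre) {
    // Text output: read the unescaped string straight from the text node.
    return pre.firstChild ? pre.firstChild.nodeValue : "";
  }
  // xml or html output: serializing (and therefore escaping) is correct here.
  return serializer.serializeToString(resultDoc);
}
```

In a browser this might be used as `resultToString(p.transformToDocument(xml), new XMLSerializer())`, which avoids having to know the stylesheet's output method in advance.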

Jonas Sicking (:sicking) No longer reading bugmail consistently

Comment 14

•

21 years ago

Manos: you still haven't provided an example where text files and text output
differ. What are you trying to do that fails?

Categories

(Core :: XSLT, defect)
