Closed
Bug 226786
Opened 22 years ago
Closed 22 years ago
Blanks And New Line Characters Get Stripped From CDATA when prettyprinting
Categories
(Core :: XML, defect)
RESOLVED
WORKSFORME
People
(Reporter: vgendler, Assigned: hjtoi-bugzilla)
Attachments
(4 files)
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5) Gecko/20031007
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.5) Gecko/20031007
Blanks and newline characters get stripped from XML CDATA sections.
Also, the keyword CDATA and the symbols "[" and "]" are not shown.
Reproducible: Always
Steps to Reproduce:
1. See the attached pictures for Mozilla and IE and sample XML file.
Actual Results:
Incorrect picture in Mozilla
Expected Results:
Correct picture in IE
Updated•22 years ago
Summary: Blanks And New Line Characters Get Stripped From CDATA → Blanks And New Line Characters Get Stripped From CDATA when prettyprinting
Updated•22 years ago
Attachment #136328 -
Attachment mime type: text/plain → text/xml
Last change was:
bz-vacation@mit.edu changed:
What                          |Removed     |Added
----------------------------------------------------------------------------
Attachment #136328 mime type  |text/plain  |text/xml
PLEASE DO NOT CHANGE!!!!!!!!!!!!!!!!!!
THIS BUG REPORTS THAT Mozilla DOES NOT OUTPUT XML FILES CORRECTLY, SO
THIS CHANGE PRODUCES WRONG EXAMPLE OUTPUT.
---------- I will attach this XML file again as text/plain ------------
This is more or less by design. There is nothing in any spec that says that
newlines and whitespace in CDATA sections are more important than newlines and
whitespace in other textnodes. So why should we preserve that whitespace in one
case but not the other?
Status: UNCONFIRMED → RESOLVED
Closed: 22 years ago
Resolution: --- → INVALID
By design??????????????
CDATA contains DATA, and each space and newline character belongs to the data
and has meaning.
See IE, XMLSpy, ... ANY XML editor.
If this is "by design" then the design is wrong.
Couple of examples.
1. CDATA contains EJB QL statement (J2EE), for instance
SELECT Object(p) FROM schema WHERE schema.attr AS p LIKE 'abc de%'
In Mozilla we see
SELECT Object(p) FROM schema WHERE schema.attr AS p LIKE 'abc de%'
And this is wrong!!!
2. CDATA contains code snippet written in Python language.
Not only will we see a complete mess, but also WRONG code, because
in Python the indentation of lines has syntactic meaning.
I am now reading the J2EE 1.4 documentation, which has XML code as links.
These links show WRONG code - I have to use MSIE.
I can give you hundreds of examples why it is wrong.
You can reformat XML code and still have a well-formed XML document,
even a valid one (if there is a DTD or XML schema), but the CDATA contents have
nothing to do with XML code. It is data and as such should be preserved as it is.
As stated, the whitespace in normal textnodes can be just as important as
whitespace in cdata-sections, so if we should preserve 'formatting' in one we
should in the other. You can put python or images or whatever in textnodes
without using cdata sections.
Reporter
Comment 10•22 years ago
Up to you guys, up to you.
Anyway, the last attempt.
In Additional Comment #6 From Jonas Sicking 2003-11-26 09:36 you wrote
"There is nothing in any spec that says that
newlines and whitespace in CDATA sections are more important than newlines and
whitespace in other textnodes."
I am not talking about what is MORE or LESS important.
I said that CDATA sections must be output preserving ALL characters in them.
Here is an excerpt from W3C XML specification
http://www.w3.org/TR/REC-xml
======================================================
Section 3.3.3 Attribute-Value Normalization
If the attribute type is not CDATA, then the XML processor must further process
the normalized attribute value by discarding any leading and trailing space
(#x20) characters, and by replacing sequences of space (#x20) characters by a
single space (#x20) character.
======================================================
Here we clearly see that CDATA MUST NOT be processed.
I think that by marking this issue as "RESOLVED INVALID" you prevent other
developers from expressing their opinions regarding this issue.
Whitespace in cdata sections is preserved in the DOM, so we're not breaking any
specs here; we're simply using a css-style that doesn't do what you want it to
do. Note that the xml-spec says the same thing about whitespace in textnodes as
in cdata sections: it must be preserved.
If you were to write an application that used the data from the XML (through DOM
or any other method) you would find that the whitespace is there.
As stated, cdata-sections and other text should not be treated differently when
it comes to whitespace. Whitespace is equally important in both. It could be
argued that we should have some mode in the prettyprint that preserves all
whitespace, but then IMHO we should do that for both textnodes and cdata-sections.
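The claim that whitespace survives in the DOM can be checked outside the browser. A minimal sketch, assuming Python's standard xml.dom.minidom as a stand-in for Mozilla's DOM (the sample document and the names TEXT, SAMPLE, plain_node and cdata_node are hypothetical and not part of this bug's attachments):

```python
from xml.dom import Node, minidom

# Hypothetical content with runs of spaces and a line break.
TEXT = "LIKE 'abc    de%'\n    indented line"

# The same characters appear once as ordinary element text and once
# inside a CDATA section.
SAMPLE = f"<root><plain>{TEXT}</plain><cdata><![CDATA[{TEXT}]]></cdata></root>"

doc = minidom.parseString(SAMPLE)
plain_node = doc.getElementsByTagName("plain")[0].firstChild
cdata_node = doc.getElementsByTagName("cdata")[0].firstChild

# The node types differ (text node vs. CDATA section node) ...
print(plain_node.nodeType == Node.TEXT_NODE)
print(cdata_node.nodeType == Node.CDATA_SECTION_NODE)

# ... but both nodes carry the whitespace unchanged.
print(repr(plain_node.data))
print(repr(cdata_node.data))
```

Under these assumptions both nodes hold identical data, which suggests that any loss of whitespace happens at display time rather than at parse time.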
Reporter
Comment 12•22 years ago
You wrote:
Additional Comment #11 From Jonas Sicking 2003-11-27 22:02
"we're not breaking any specs here, we're simply using a css-style that doesn't
do what you want it to do"
NO! Nobody wants to have this. It is not only me - check W3C specs, check ANY
XML book.
css-style is not enough - in the css you can define the WINGDINGS font. Right? It
will create a very funky document. Try this too.
"Note that the xml-spec says the same thing about whitespace in textnodes as
in cdata sections, they must be preserved." - SO! Preserve them!
"If you were to write an application that used the data from the XML (through DOM
or any other method) you will find that the whitespace is there." - Right!
But I also want TO SEE WHAT I GET! This is what I see in IE but unfortunately
NOT in Mozilla. The same I see in ANY XML editor. Do you want people to abandon
Mozilla and use IE?
"As stated, cdata-sections and other text should not be treated differently when
it comes to whitespace." - Dead wrong, and W3C (and ALL XML books) state this,
as I quoted in the previous message.
If you want to HURT Mozilla and make it unusable (for XML), go ahead, do it.
Reporter
Comment 13•22 years ago
Some additions.
I have checked many books about this subject matter. All of them state the same
as described in W3C's XML specification.
Here is an example from
"J2EE™ Web Services" By Richard Monson-Haefel
Addison Wesley ISBN: 0-321-14618-2
2.1.2.5 CDATA Section
A CDATA section allows you to mark a section of text as LITERAL so that it will
NOT be PARSED for tags and SYMBOLS, but will instead be considered just a string
of characters.
As we see again, a CDATA section must be treated differently. It is not the same
as other text nodes. Actually, an XML document is a TEXT document and as such
ALL its elements are text. It does not mean that all of them must be treated
in the same way. One more example - comments - they must not be parsed either.
It is similar to the HTML <pre></pre> tag - Mozilla, of course, does not parse
the text in this tag.
You have only found quotes that say that whitespace in cdata should be
preserved, nothing that says that it doesn't need to be in textnodes, so my
statement still holds true: textnodes and cdata-sections are no different when
it comes to whitespace; it needs to be preserved in both.
Reporter
Comment 15•22 years ago
Message Additional Comment #14 From Jonas Sicking 2003-11-28 14:46
You said "You have only found quotes that say that whitespace in cdata should be
preserved, nothing that says that it doesn't need to in textnodes"
First of all, an XML document is a text document; that is,
it contains TEXT, only TEXT, nothing but the TEXT. I do not understand what
you mean by "textnodes". Everything is text there.
Next, the W3C's XML specification clearly says that CDATA must not be parsed.
All XML book authors say the same, the same is implemented in all XML editors,
and the same is, of course, implemented in IE.
Once again, CDATA is different from other "textnodes", as the XML spec says.
Mozilla also does not show the CDATA markup together with "[" and "]".
Many Control Centers for various J2EE application servers are implemented
as browser applications. All of them have screens for editing component
descriptors, which are XML documents. Such an implementation of CDATA
makes it impossible to present these descriptors for viewing and editing,
as I showed in the example of EJB QL for EJB.
Assignee
Comment 16•22 years ago
Vladimir, what Jonas is referring to as nodes are DOM nodes, I believe. When
Mozilla parses the XML, it is treated per the XML spec. But then it will be put
into the DOM, whose rules differ slightly from XML. Then the XML is transformed
with XSLT, which cannot preserve all constructs in the original XML (for example,
CDATA sections and namespace declarations). Finally the result DOM is displayed
with CSS, whose rules are again different.
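Why the CDATA markers disappear in such a pipeline can be illustrated with any data model that, like the XPath data model used by XSLT, has no separate node type for CDATA sections. A rough sketch using Python's xml.etree.ElementTree as a stand-in (an assumption for illustration only; it is not the XSLT engine Mozilla uses, and the sample document is hypothetical):

```python
import xml.etree.ElementTree as ET

# Hypothetical input: a query wrapped in a CDATA section.
source = "<query><![CDATA[SELECT  a\n    FROM b]]></query>"

elem = ET.fromstring(source)

# The characters, including whitespace, are kept as plain element text.
text = elem.text
print(repr(text))

# Serializing the tree again shows that the "<![CDATA[" and "]]>"
# markers are gone: the data model has no place to remember them.
serialized = ET.tostring(elem, encoding="unicode")
print(serialized)
```

The content survives while the CDATA wrapper does not, which matches the two separate symptoms reported here: missing whitespace is a display issue, while the missing CDATA keyword and brackets come from the transformation step.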
But, supposing the result DOM still has the whitespace you'd need, could we
change the CSS style to preserve whitespace? Does anybody see any problems with
that? We could even have an alternate CSS stylesheet for that case.
Reopening while we discuss that.
Status: RESOLVED → UNCONFIRMED
Resolution: INVALID → ---
Yeah, I'd be fine with adding more alternative stylesheets that preserve
whitespace. We'd have to add some capabilities to the xslt-engine if we want to
be able to preserve stuff just for cdata-sections though, which I'm less sure I
want to do.
Reporter
Comment 18•22 years ago
Correct, Heikki. I can add that in this version of Mozilla we are not
able to read the latest J2EE specification. Unfortunately we can do this in IE.
As much as I hate IE and love Mozilla, I have no other choice but to use damn IE
regardless of all the "fors" and "againsts".
Do we want this? I do NOT.
Assignee
Comment 19•22 years ago
What do you know - the Monospace alternative stylesheet we supply already
contains the rule to preserve whitespace
(http://lxr.mozilla.org/seamonkey/source/content/xml/document/resources/XMLMonoPrint.css).
Vladimir, see if this is enough for you: when you open an unstyled XML document
(like the first attachment in this bug), select the Monospace alternative
stylesheet (View > Use style > Monospace). Does it now look like you wanted?
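The difference between the default pretty-print style and a whitespace-preserving one can be modeled roughly as follows. This is a simplified sketch, assuming the Monospace stylesheet's rule behaves like CSS `white-space: pre` (the comment above only says it "preserves whitespace"); the functions render_normal and render_pre and the sample query are hypothetical:

```python
import re


def render_normal(text: str) -> str:
    """Simplified model of CSS `white-space: normal`: every run of
    spaces, tabs and line breaks collapses to a single space."""
    return re.sub(r"[ \t\r\n]+", " ", text).strip()


def render_pre(text: str) -> str:
    """Simplified model of CSS `white-space: pre`: text is shown as-is."""
    return text


# Hypothetical CDATA content, loosely modeled on the EJB QL example
# given earlier in this bug.
query = "SELECT Object(p)\n  FROM schema AS p\n  WHERE p.attr LIKE 'abc    de%'"

print(render_normal(query))  # line breaks and the run of spaces are collapsed
print(render_pre(query))     # line breaks and the run of spaces are kept
```

In both cases the underlying string is identical; only the rendering rule decides whether the reader sees the original spacing.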
Reporter
Comment 20•22 years ago
Almost, Heikki!!! I mean this is enough to see the correct contents of
CDATA sections. Should the keyword CDATA and all the brackets "[", "]"
also be preserved?
Thank you sir!
Assignee
Comment 21•22 years ago
Yes, they should be preserved, but at the moment they are not. The reason is
that we do the pretty printed view via XSLT transformation and as far as I
know XSLT will not be able to preserve them. We would need an extension in our
XSLT engine to handle them. There are also other things that we lose because of
XSLT, but they are all covered by bug 175946.
Closing this as worksforme.
Status: UNCONFIRMED → RESOLVED
Closed: 22 years ago → 22 years ago
Resolution: --- → WORKSFORME
Reporter
Comment 22•22 years ago
One more thing: maybe make this style the default for XML.
Once again, thank you very much!
Comment 23•20 years ago
*** Bug 300593 has been marked as a duplicate of this bug. ***