Closed Bug 305242 Opened 19 years ago Closed 14 years ago

MSDN code examples in CDATA not rendered

Categories

(Core :: DOM: HTML Parser, defect)

Type: defect
Priority: Not set
Severity: major

RESOLVED FIXED

People

(Reporter: jwatte-mozilla, Unassigned)

Details

(Whiteboard: [fixed by the HTML5 parser])

User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.10) Gecko/20050716 Firefox/1.0.6

Go to MSDN. Read up on Standard Annotations and Semantics for DirectX9 Effects.
Realize that the code samples don't render in Firefox. They render in Internet
Explorer. The View Frame Source command on the right-hand frame shows that the
code in question is CDATA encapsulated, like so:

<![CDATA[><pre>
int VariableName : SasGlobal
<
    SasVersion 
    [OptionalAnnotations]
>;
</pre><![CDATA[]]&gt;]]>

It also shows that Microsoft likes generating verbose HTML, but that's beside
the point...


Reproducible: Always

Steps to Reproduce:
1. Run Firefox
2. Browse to the link
3. Notice that code samples aren't displayed

Actual Results:  
Text is displayed, but not code samples.

Expected Results:  
The code samples are displayed inline with the text.
That's some... interesting markup, and interesting results. Judging by "view
selection source" we parse that into

<!--[CDATA[><pre>
int VariableName : SasGlobal
<
    SasVersion 
    [OptionalAnnotations]-->;
<!--[CDATA[]]-->

which is rather odd. IE seems to just display it as a <pre>, which is also
rather odd. From my understanding of named CDATA sections, the proper display
would be to show this literally:

><pre> int VariableName : SasGlobal < SasVersion [OptionalAnnotations] >;
</pre><![CDATA[]]&gt;

which is an odd thing to want displayed.
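
For reference, a minimal sketch of what a conforming XML parser makes of that markup, using Python's standard `xml.etree.ElementTree`. The `<root>` wrapper and the `SAMPLE` name are assumptions added only so the fragment parses as a document; this says nothing about what an HTML parser should do.

```python
import xml.etree.ElementTree as ET

# The MSDN fragment quoted above (trailing whitespace dropped).
SAMPLE = """<![CDATA[><pre>
int VariableName : SasGlobal
<
    SasVersion
    [OptionalAnnotations]
>;
</pre><![CDATA[]]&gt;]]>"""

# Hypothetical wrapper element so the fragment is a well-formed document.
root = ET.fromstring("<root>" + SAMPLE + "</root>")

# Everything up to the first "]]>" is character data, so the <pre> tags
# and the inner "<![CDATA[]]&gt;" come back as literal text, not markup.
text = root.text
print(text)
```

Under XML rules the whole section is one text node with no child elements, which matches the literal display described above.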
Assignee: nobody → parser
Component: General → HTML: Parser
OS: Windows XP → All
Product: Firefox → Core
QA Contact: general → mrbkap
Hardware: PC → All
Version: unspecified → Trunk
I agree that it looks odd, but it should be displayed!

It may be that the CDATA in turn gets fed back into the page using some DHTML
trick, and thus they include the formatting; I haven't analyzed it that deeply.
The decision was made a long time ago to treat explicit CDATA sections like
these in HTML as comments (<!-- -->) instead of text. IIRC HTML 4.01 is silent
on what to do with them. Do we want to reverse direction and display them as
text? <![CDATA[]]> is one of those things that never really saw the light of day
anyway in HTML UAs.

Ian, any opinion here?
The HTML 4.01 specification strongly implies that marked CDATA sections are to be supported. 

Section 4, Conformance: requirements and recommendations (http://www.w3.org/TR/html401/conform.html), says "An HTML document is an SGML document that meets the constraints of this specification."

B.3.5 Marked Sections, (http://www.w3.org/TR/html401/appendix/notes.html), says "SGML also defines the use of marked sections for CDATA content", giving an example using <![CDATA[.... This section is in "Appendix B: Performance, Implementation, and Design Notes", "B.3 SGML implementation notes". Please note the implication that some UA's are known to not implement this. 

See also the previous subsection, "B.3.3 SGML features with limited support", which contains the ambiguous "SGML systems conforming to [ISO8879] are expected to recognize a number of features that aren't widely supported by HTML user agents. We recommend that authors avoid using all of these features." It does not clearly specify which features, however, and it does not relieve UAs of the obligation to support them.

Presumably, the XML spec is a reasonable guide for marked CDATA: "Within a CDATA section, only the ]]> string is recognized as markup." (see http://www.w3.org/TR/xml11/#sec-cdata-sect).

This bug needs a clear description of exactly what IE does with CDATA sections.
IE doesn't put actual CDATA sections into the DOM at all: <![CDATA[ foo ]]> will disappear entirely. However, the presence of a ">" in a CDATA section's content causes them to treat it as an odd sort of comment instead: the <![C is treated as the comment start delimiter, and the > and the preceding two characters are treated as the comment end delimiter. Thus <![CDATA[> produces the same DOM as <!--DAT--> and <![CDATA[ <bar> ]]> produces the same DOM as <!--DATA[ <b--> ]]>.
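
To make that rule concrete, here is a rough sketch that rewrites markup the way the paragraph above describes. It models only the description in this comment, not IE's actual parser; the function name `ie_legacy_rewrite` and the handling of an unterminated `<![C` are assumptions made for illustration.

```python
def ie_legacy_rewrite(markup: str) -> str:
    """Rewrite '<![C...' runs into the comments described above."""
    out = []
    pos = 0
    while True:
        start = markup.find("<![C", pos)
        if start == -1:
            out.append(markup[pos:])
            break
        out.append(markup[pos:start])
        data_start = start + 4  # skip the "<![C" start delimiter
        gt = markup.find(">", data_start)
        if gt == -1:
            # Assumption: with no ">" at all, leave the rest untouched.
            out.append(markup[start:])
            break
        if markup.startswith("<![CDATA[", start) and markup[gt - 2:gt] == "]]":
            # First ">" is the real "]]>" terminator: no node at all.
            pass
        else:
            # The ">" and the two characters before it end the comment.
            data = markup[data_start:max(data_start, gt - 2)]
            out.append("<!--" + data + "-->")
        pos = gt + 1
    return "".join(out)


print(ie_legacy_rewrite("<![CDATA[>"))           # <!--DAT-->
print(ie_legacy_rewrite("<![CDATA[ <bar> ]]>"))  # <!--DATA[ <b--> ]]>
```

The two calls reproduce the equivalences given above, and a plain `<![CDATA[ foo ]]>` comes back as an empty string.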

In the case of that MSDN article, their DOM from

<![CDATA[><pre>
int VariableName : SasGlobal
<
    SasVersion 
    [OptionalAnnotations]
>;
</pre><![CDATA[]]&gt;]]>

is

#comment: DAT
<PRE>
|__ #text: int VariableName ...

Please tell me we're not going to implement that. If we're going to follow someone else's display of CDATA sections in HTML, Opera's "><pre> int VariableName : SasGlobal < SasVersion [OptionalAnnotations] >;
</pre><![CDATA[]]&gt;" seems vastly more defensible.
(In reply to comment #7)
> Please tell me we're not going to implement that.

We're not going to implement that. I think this bug should be about whether or not <![CDATA[]]> sections should show their content as text or if we should treat them as comments (which we do now, though they're kind of weird).
I agree that this bug should be about the correct handling of CDATA. IE's behavior is worth noting, but not emulating.

(Duplicate bug 334821 has a test case attached, and references XPATH to support correct CDATA markup handling.)
(In reply to comment #9)
> I agree that this bug should be about the correct handling of CDATA. IE's
> behavior is worth noting, but not emulating.
> 
> (Duplicate bug 334821 has a test case attached, and references XPATH to support
> correct CDATA markup handling.)

I'm not sure it's the same. This bug is about how CDATA sections get rendered (if at all), bug 334821 is about XPath accessing non-rendered documents. It might be the same limitation we hit, but I do not know.
This bug is about tag-soup parsing.  It's not related to bug 334821.
CDATA should be passed directly to the output. 

This: <![CDATA[ <foo>bar</foo> ]]> 
Should display on screen as: <foo>bar</foo>

This is certainly what should happen in the case of an XHTML document -- see REC-xml 2.7: http://www.w3.org/TR/2000/REC-xml-20001006#sec-cdata-sect

e.g. http://interreality.org/~reed/tmp/cdata-xhtml.html .

For HTML 4... probably the same thing I guess.
Also, could someone change the title of this bug? It's not really about MSDN and its weird code; Firefox skips the contents of CDATA in all pages.
(Aha, the weird extra <![CDATA[ that's inserted at the end of the real CDATA in the code given above is to get around a CDATA parsing bug in IE6!)
> This is certainly what should happen in the case of an XHTML document

Not relevant here.

What _is_ relevant is what happens for HTML.  Please bring this up in the HTML Working Group; this needs to be specified.
The WHATWG spec now defines this in detail, in a way that is compatible with what IE does (and what Safari does).
Status: UNCONFIRMED → NEW
Ever confirmed: true
Ian, can you post a link to this definition?
So basically what Blake didn't want to do in comment 8?
Pretty much. You don't get the corrupted comment data that IE has, though.
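
A rough illustration of the difference, as far as I understand the WHATWG tokenizer: in HTML content, `<![CDATA[` starts a bogus comment whose data begins with `[CDATA[` and runs up to the next `>`. The helper name `html5_bogus_comment_rewrite` is made up for this sketch; it rewrites a string with a regular expression and is not the real tokenizer (it ignores, for instance, a section that never sees a `>`).

```python
import re

# "<![CDATA[" followed by everything up to (not including) the next ">".
_CDATA_OPEN = re.compile(r"<!\[CDATA\[([^>]*)>")


def html5_bogus_comment_rewrite(markup: str) -> str:
    """Show each '<![CDATA[...>' run as the comment an HTML parser would build."""
    return _CDATA_OPEN.sub(
        lambda m: "<!--[CDATA[" + m.group(1) + "-->", markup
    )


print(html5_bogus_comment_rewrite("<![CDATA[ foo ]]>"))
# <!--[CDATA[ foo ]]-->
print(html5_bogus_comment_rewrite("<![CDATA[><pre>x</pre>"))
# <!--[CDATA[--><pre>x</pre>
```

In the second call the `<pre>` survives as real markup after a short comment, which is why the MSDN code samples show up; the comment data is `[CDATA[` rather than IE's truncated `DAT`.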
I think Firefox should show CDATA contents as text, not treat them as a comment. As far as I understand, this is also what the W3C specifications mean. And it would make including XML/HTML/code samples considerably easier.
Jaan, that would break websites.
Boris, why? The purpose of CDATA is to provide a span of the document where normal tag parsing is not done, so you can put in text with angle brackets and such without having to escape them as entities. It would only break a website that uses CDATA sections but for some strange reason relies on the fact that Firefox skips them. I would venture that there are no such websites on the web :)
Things that rely on IE's handling of CDATA (and they exist, I assure you) would break.  Of course they're broken now too.  I'm not saying we shouldn't fix this bug, just that "the purpose of CDATA" is no more compatible with tag-soup parsing than the purpose of SGML NET syntax is.
Assignee: parser → nobody
QA Contact: mrbkap → parser
Depends on: html5-parsing
The test cases are gone, but I verified with Hixie's Live DOM Viewer that this is fixed for the example in comment 7.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by the HTML5 parser]