Closed Bug 392975 Opened 17 years ago Closed 17 years ago

textarea displays CDATA tags - inconsistent with how CDATA is handled by script tag

Categories

(Core :: DOM: HTML Parser, defect)

1.8 Branch
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 27403

People

(Reporter: gus.heck, Unassigned)

Details

(Keywords: testcase)

Attachments

(3 files)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6

This is not a dup of bug 310928. The <textarea> tag handles CDATA sections in a manner that is inconsistent with their handling elsewhere. The following:

<textarea><![CDATA[ <some> things that <look/> like markup & junk ]]> other stuff that is &lt;escaped&gt; with entity references </textarea>

is a valid XHTML 1.0 fragment, but if the CDATA tags are removed it is not. Unfortunately, when this code is rendered in the browser, the <![CDATA[ and ]]> are displayed in the textarea. Thus, when the form is submitted, the CDATA start and end tags are sent along as form data. However, the &gt; and the &lt; are sent as < >.

The XHTML spec does not address the use of CDATA in textarea directly, but it does speak specifically about script and style (http://www.w3.org/TR/xhtml1/#h-4.8). Style and script are defined in the DTD as:

<!-- style info, which may include CDATA sections -->
<!ELEMENT style (#PCDATA)>
<!-- script statements, which may include CDATA sections -->
<!ELEMENT script (#PCDATA)>

The definition for textarea is quite similar:

<!ELEMENT textarea (#PCDATA)> <!-- multi-line text field -->

The string '<![CDATA[' does not constitute legal JavaScript and it does not constitute legal CSS. Since neither the JavaScript engine nor the CSS engine complains about the use of CDATA tags, it follows that the CDATA start and end tags are removed at parse time. Why shouldn't textarea do the same?

Finally, there is a clear violation of the DOM spec (http://www.w3.org/TR/REC-DOM-Level-1/level-one-core.html#ID-E067D597). If you create such a textarea and display it in the DOM Inspector, it does NOT create a node for the CDATA section as specified.

(As a side note, there is possibly another bug... CDATA seems to be treated as a comment inside pre tags; one might look generally into how CDATA is handled.)

Reproducible: Always

Steps to Reproduce:
1. Write a page that has a CDATA section inside a textarea
2. View the page

Actual Results: CDATA tag is displayed to the user.

Expected Results: CDATA should be parsed and removed in the same fashion as in script and style tags.
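The expected result described in the report can be sketched in a few lines. This is only an illustration of what an XML parser would hand over as character data for the textarea content: CDATA sections taken literally with their markers removed, and entity references decoded everywhere else. The names `decodeEntities` and `xmlCharacterData` are hypothetical, only the five predefined XML entities are handled, and this is not how Gecko's parser is implemented.

```javascript
// The five entities predefined by XML; numeric references are not handled here.
const PREDEFINED = { lt: "<", gt: ">", amp: "&", quot: '"', apos: "'" };

// Decode predefined entity references in ordinary (non-CDATA) text.
function decodeEntities(text) {
  return text.replace(/&(lt|gt|amp|quot|apos);/g, (_, name) => PREDEFINED[name]);
}

// Hypothetical helper: return the character data an XML parser would produce
// for element content that may contain CDATA sections.
function xmlCharacterData(raw) {
  const cdata = /<!\[CDATA\[([\s\S]*?)\]\]>/g;
  let result = "";
  let last = 0;
  let match;
  while ((match = cdata.exec(raw)) !== null) {
    // Text before the section is ordinary character data: decode entities.
    result += decodeEntities(raw.slice(last, match.index));
    // Text inside the section is literal: markers dropped, nothing decoded.
    result += match[1];
    last = cdata.lastIndex;
  }
  return result + decodeEntities(raw.slice(last));
}

const sample =
  "<![CDATA[ <some> things that <look/> like markup & junk ]]>" +
  " other stuff that is &lt;escaped&gt; with entity references ";
console.log(xmlCharacterData(sample));
```

Under these assumptions the CDATA markers never reach the form field, which is the behavior the report asks for.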
Component: General → HTML: Parser
Product: Firefox → Core
QA Contact: general → parser
Version: unspecified → 1.8 Branch
Attached file Testcase #1
This example works for me, Firefox 2.0.0.6 and trunk on Linux.
-> INCOMPLETE
Status: UNCONFIRMED → RESOLVED
Closed: 17 years ago
Resolution: --- → INCOMPLETE
Perhaps your test case works because the browser is in "clean up the mess" mode. The test you posted has the following validation errors:

line 12 column 9 - Error: required attribute "rows" not specified
line 12 column 9 - Error: required attribute "cols" not specified
line 12 column 9 - Error: document type does not allow element "textarea" here; missing one of "p", "h1", "h2", "h3", "h4", "h5", "h6", "div", "pre", "address", "fieldset", "ins", "del" start-tag

Attaching a test case that is valid HTML and displays the problem.
Status: RESOLVED → UNCONFIRMED
Resolution: INCOMPLETE → ---
Oh and FWIW I'm now still seeing it in Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.8) Gecko/20071008 Firefox/2.0.0.8
If you serve your XHTML content with a "text/html" MIME type then it will be parsed as HTML, not XHTML. If you want to use CDATA sections you should use an XML MIME type, for example "application/xhtml+xml". See bug 27403 comment 24.
Status: UNCONFIRMED → RESOLVED
Closed: 17 years ago
Keywords: testcase
Resolution: --- → DUPLICATE
Could we try living in the real world? application/xhtml+xml is not a MIME type that anyone can serve on anything beyond a toy/internal/specialized web site. I wish it were not so, but that is the current state of the web. If you do serve application/xhtml+xml to the world, every IE user (sadly 70-80% of all users) will get a file download box instead of your page (as shown in the attached screen shot). The current state means that you can't write a valid XHTML page that contains a textarea pre-populated with data containing < > & etc. and expect it to work in Firefox. Your choices are broken, invalid XHTML that works, or XHTML that Firefox can't render.
Status: RESOLVED → UNCONFIRMED
Resolution: DUPLICATE → ---
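For reference, the original report notes that &lt; and &gt; inside the textarea are decoded and submitted as < and >, even when the page is served as text/html. Assuming that behavior, one workaround that avoids CDATA altogether is to entity-escape the pre-populated value on the server. The sketch below is illustrative only; `escapeForTextarea` and `renderTextarea` are hypothetical names, and `name` is assumed to be a trusted identifier that needs no escaping.

```javascript
// Escape the three characters that cannot appear literally in #PCDATA.
// The ampersand must be replaced first so later replacements are not re-escaped.
function escapeForTextarea(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Hypothetical helper: build a textarea with the rows and cols attributes
// that the XHTML 1.0 DTD requires.
function renderTextarea(name, value) {
  return (
    '<textarea name="' + name + '" rows="4" cols="40">' +
    escapeForTextarea(value) +
    "</textarea>"
  );
}

console.log(renderTextarea("notes", "<some> things that <look/> like markup & junk"));
```

The escaped form contains no CDATA markers, so it reads the same whether the document is parsed as HTML or as XML.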
Note that the attachment I submitted, which is served as text/html, is valid according to the W3C validator:

http://validator.w3.org/check?uri=https%3A%2F%2Fbugzilla.mozilla.org%2Fattachment.cgi%3Fid%3D287360&charset=%28detect+automatically%29&doctype=Inline&group=0

So either their own validator is wrong, or the spec which (allegedly, I haven't found the reference yet) says CDATA is not allowed based on the MIME type the document is served under is perhaps being misinterpreted.

To me it seems very odd that the validity of a document should be dependent on the metadata accompanying that document, as claimed in bug 27403 and bug 82829. If you write the file to disk, does it become valid or invalid depending on whether or not your program guesses the MIME type correctly? That doesn't make much sense.

Try this experiment on for size. Take a PDF, remove .pdf, point Acrobat Reader at it... it still works. Granted, the OS can't figure out that Acrobat Reader is the right program to display the file, but the ability of the reader to display the file is not dependent on metadata encoded in the file name.
I can't even find the term CDATA in this: http://www.w3.org/TR/2002/NOTE-xhtml-media-types-20020801/ And searching http://www.w3.org/TR/xhtml1/ for both "CDATA" and "text/html" I find nothing. I'm not sure I believe bug 27403 comment 24. Just where in the XHTML spec is this prohibition on CDATA when served as text/html?
Gus Heck, can you please attach an example which works (i.e. doesn't show <![CDATA[) on IE6 and IE7? Your attachment 287360 shows <![CDATA[ on both IE6 and IE7.
No, I can't. IE doesn't handle it properly either. I never claimed that this was an IE parity bug. The way bugs get fixed in IE is that things work in Firefox and people make fun of IE for it. Eventually they get embarrassed and fix it. :) application/xhtml+xml works in Firefox, but the failure in IE is so incredibly bad that only very savvy users would even have a chance of comprehending the problem. Some weirdness in a textarea that can be worked around... that one could at least put up on a site and blame IE... I do this sort of thing on my personal pages with great glee... to promote Firefox and make IE look bad :). I did this with CSS fixed positioning, for example. IE users got a separate, less pretty but functional style sheet. Firefox/Mozilla users got the cool version of the site.

I am not sure why so many people hate doctype sniffing... it's no different than what most programs do if they support older file formats (*.doc could be from Wordpad, or almost any previous version of Word). So the file's extension is metadata about what program to use to interpret it, but it still leaves the flavor of file to the program. Why should reading files off the web be different than reading them from disk? When files are on disk programs look for a header that tells them what flavor the file is and then render correctly. If nobody's supposed to look at it, why even have a Doctype?
(PS: replying out of order)

(In reply to comment #10)
> of file to the program. Why should reading files off the web be different than
> reading them from disk? When files are on disk programs look for a header that
> tells them what flavor the file is and then render correctly. If nobody's

(I am not sure; somebody may correct me if I am wrong.) I think both IE and Firefox render a file from local disk after finding its MIME type by looking up the value of "Content Type" in the Windows registry for that extension. For example, for *.xhtml files it is under "HKEY_CLASSES_ROOT\.xhtml".

> I am not sure why so many people hate doctype sniffing... it's no different

The problem with doctype sniffing is this:

Case 1)
=======
Say I have a link to a JPG image file, and when somebody clicks the link I want the image to be offered as a download rather than shown automatically in the web browser. If there is no content sniffing, I can send the image file with a Content-Type of application/x-download and the browser prompts the user to download it. But if there is content sniffing, the browser automatically identifies it as a JPG file and renders it directly, which is not what I wanted.

Case 2)
=======
Say I have a website for teaching HTML and I want to show the actual HTML of an example page to the visitor. I can put up a link to the original source, served with a Content-Type of text/plain. Firefox will show the HTML source code of the page. But a browser with content sniffing (IE used to sniff; now I am not sure) will render it as an HTML page, defeating the purpose of the link.

Now let me come to the topic!!!

"Sending XHTML as text/html Considered Harmful"

Why? This is what I was asked to read; please see
http://www.hixie.ch/advocacy/xhtml

You can also read
http://www.w3.org/TR/xhtml-media-types/

It says in the "Abstract":

In summary, ......
..... and the use of 'text/html' SHOULD be
limited to HTML-compatible XHTML 1.0 documents.
................

and <![CDATA[ is not HTML-compatible.

> supposed to look at it, why even have a Doctype?

This is my opinion:

1. If content is served as "application/xhtml+xml", treat it as XHTML.
2. If content is served as "text/html", start reading it as HTML, but if inside <html><head> we see <meta http-equiv="Content-Type" content="application/xhtml+xml" />, switch the parsing mode to XHTML and re-parse the content, which will identify the !DOCTYPE declaration correctly.

I assume this should solve all issues with XHTML vs HTML.

I also want Firefox to display an error icon on the status bar if any parsing error occurs. That would allow the QA person for a web site to argue with the web developer to fix it. Firefox does not report parsing errors for HTML now, and JS/CSS errors are displayed only in the JS Console, not in some place easily visible to an average QA tester.

So from the above you will see that my opinion is closer to yours. But I am only a Firefox user, not a mozilla.org employee or somebody who pays mozilla.org. So we really don't have a voice!!! And http://bugzilla.mozilla.org/ is only an internal tool of mozilla.org which, by the mercy of mozilla.org, they share with the public like me and you.

Here is the "Bugzilla Etiquette", which we are supposed to read if we want to participate:
https://bugzilla.mozilla.org/page.cgi?id=etiquette.html

> people make fun of IE for it. Eventually they get embarrassed and fix it. :)
> application/xhtml+xml works in Firefox, but the failure in IE is so incredibly

SOLUTION TO YOUR PROBLEM !!!!

I assume you are having the problem when you serve pages dynamically. So go to
http://web-sniffer.net/?url=http://mozilla.org/
with IE6/IE7 as well as Firefox and look under "HTTP Request Header".

For Firefox you will see:
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8[CRLF]

For IE you will see:
Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, application/x-shockwave-flash, application/xaml+xml, application/vnd.ms-xpsdocument, application/x-ms-xbap, application/x-ms-application, */*[CRLF]

So you see that for Firefox there is
1. application/xhtml+xml
2. application/xml
but not for IE (don't confuse these with application/xaml+xml in IE, which is XAML, not XHTML).

Now what you need to do is simple!!! Before you start generating page content, get the "Accept" request header. Write a small function which takes the "Accept" header and returns the proper MIME type. Use that MIME type to set the response header "Content-Type".

This is the logic for finding the proper MIME type:

    function getProperMimeType(strAcceptHeader) {
      // Prefer the most specific XML type that the client advertises.
      if (strAcceptHeader.includes("application/xhtml+xml")) return "application/xhtml+xml";
      if (strAcceptHeader.includes("application/xml")) return "application/xml";
      if (strAcceptHeader.includes("text/xml")) return "text/xml";
      // Clients that advertise no XML type (IE6/IE7) get plain HTML.
      return "text/html";
    }

This will make Firefox get the XHTML page as "application/xhtml+xml" and IE6/7 receive the page as "text/html". Once IE6/7 start supporting "application/xhtml+xml", they will automatically start getting the XHTML page as "application/xhtml+xml" without you making any change.

Hope this helped!!! So can we mark this bug closed?
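A substring check treats an entry such as "application/xhtml+xml;q=0" as acceptable even though q=0 means the client refuses that type. The following is a minimal sketch of a q-value-aware variant, not a complete implementation of HTTP content negotiation. The function names `parseAccept` and `chooseContentType` are hypothetical, and wildcards such as */* are deliberately ignored because IE sends */* while being unable to render application/xhtml+xml.

```javascript
// Parse an Accept header into a list of { type, q } entries.
function parseAccept(header) {
  return header
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .map((part) => {
      const [type, ...params] = part.split(";").map((s) => s.trim());
      let q = 1; // a missing q parameter means full preference
      for (const param of params) {
        const [key, value] = param.split("=").map((s) => s.trim());
        if (key.toLowerCase() === "q") {
          const parsed = parseFloat(value);
          if (!Number.isNaN(parsed)) q = parsed;
        }
      }
      return { type: type.toLowerCase(), q };
    });
}

// Hypothetical helper: serve XHTML as XML only to clients that list the type
// explicitly with a non-zero q value; everyone else gets text/html.
function chooseContentType(acceptHeader) {
  const entries = parseAccept(acceptHeader || "");
  const xhtml = entries.find((entry) => entry.type === "application/xhtml+xml");
  if (xhtml && xhtml.q > 0) return "application/xhtml+xml";
  return "text/html";
}

const firefox = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
console.log(chooseContentType(firefox));
```

A response selected this way would normally also carry a "Vary: Accept" header, so that caches keep the two variants apart.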
(In reply to comment #8)
> I'm not sure I believe bug 27403 comment 24.

Hmm, you don't believe what Ian "Hixie" Hickson says about HTML? Note the Editor of this document and then change your mind...
http://www.whatwg.org/specs/web-apps/current-work/

I recommend you read http://www.hixie.ch/advocacy/xhtml

Please don't reopen this bug again.
Status: UNCONFIRMED → RESOLVED
Closed: 17 years ago
Resolution: --- → DUPLICATE
OK, it's not a discussion forum, and I don't want to discuss further, but your comment about Ian Hickson, and the idea that his role in the HTML 5 effort exempts him from critical examination, is *intolerable*.

> Hmm, you don't believe what Ian "Hixie" Hickson says about HTML?

Sorry, but I don't deify anyone. If the document they refer to doesn't support their argument in my opinion, I will disagree with them... even if they are the pope.

While I'm here I might as well share the rest of my thoughts...

> it says in "Abstract"
> In summary, ......
> ..... and the use of 'text/html' SHOULD be
> limited to HTML-compatible XHTML 1.0 documents.
> ................
>
> and <![CDATA[ is not HTML-compatible

This argument occurred to me, but it didn't make sense, because <![CDATA[ blocks are not part of the "language" of either an XML or HTML document. They are a processing-level construct. ALL XML documents, regardless of their doctype, must support them; therefore, it is something that should be transparent to every XML language. In essence it is metadata about the enclosed data, not part of the XML language.

Imagine you were to have a blob of simple alphanumeric PCDATA (say English text) and then in a second version you needlessly enclosed a chunk of the middle of it in CDATA. The idea that this should result in 3 nodes in the second case seems logically wrong. Certainly the browser (or other client) shouldn't be rendering the two copies differently. From the perspective of the XML language (XHTML in this case) it's all one chunk of text between an opening and closing tag. In other words, the code that interprets (and renders) XHTML (as opposed to that which parses it) should be entirely unaware of CDATA or not CDATA.

Finally, I was grateful to be reminded of content negotiation... It's something I had forgotten about... Several years ago I was quite enamored of it, but my web-server administration time is minuscule these days. In fact, the case I was dealing with was software written by others, with a few minor edits to customize some pages and attempt to make them validate. Rewriting the guts to send the right content type based on Accept headers is not going to happen. It was proxied through an Apache instance, so munging it there might work.

On the cases against doctype sniffing above:

Case 1... Valid in the world as it exists. Though I think it exists entirely because the language (HTML/XHTML) has no way to express the instruction to the browser. This is really where the instruction to not display it (or to display it in another frame, etc.) should live, not at the transport level.

Case 2... Again, HTML or XHTML should be giving the author a means to say things like "show this content raw" (if such is an intended capability of an HTML client to begin with). Changing the MIME type to not match is (always IMHO) a hack. (Sadly, hacks are currently required in some cases.)

Both cases essentially argue that since HTML/XHTML doesn't let me tell the browser what handling I want, I should use HTTP headers that mismatch the content to get around that. These are practical workarounds in today's world, but something that ought to be eliminated in the future.
<http://lists.w3.org/Archives/Public/www-html/2000Sep/0024.html> is what Hixie was referring to, and the email it is a reply to covers the problems with sniffing.