Closed Bug 552137 Opened 15 years ago Closed 15 years ago

[HTML5] Normal characters dropped in innerHTML setter when surrounded by U+0000

Categories

(Core :: DOM: HTML Parser, defect)

x86
Windows 7
defect
Not set
normal

RESOLVED DUPLICATE of bug 566280

People

(Reporter: sroussey, Unassigned)

Details

Attachments

(1 file)

User-Agent: Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6 (.NET CLR 3.5.30729)
Build Identifier:
When parsing, a text node that starts with a null character has that character removed.
Reproducible: Always
Steps to Reproduce:
In an empty doc like <html><body></body></html> try this:
  document.body.innerHTML="\0asdf\0"
  results in: document.body.textContent.length == 5
while
  document.body.textContent="\0asdf\0"
  results in: document.body.textContent.length == 6
Firebug bug reference: http://code.google.com/p/fbug/issues/detail?id=2917
With which parser? If it's only happening with the non-HTML5 parser, I don't think we care...
Component: General → HTML: Parser
QA Contact: general → parser
Summary: Text Node can't start with null → Text Node can't start with null when parsed as part of innerHTML
I haven't done anything to enable the HTML5 parser, and Firebug certainly isn't running using it, as far as I know.
Component: HTML: Parser → General
For the HTML5 parser: document.body.innerHTML="\0asdf\0" results in: document.body.textContent.length == 2 which is worse...
Component: General → HTML: Parser
Check the results with the attached simple HTML and script, which shows the rendering result and the escaped content value.
The values below are obj.textContent.length / escape(obj.textContent) / escape(obj.innerHTML).
HTML5 is disabled.
  innerHTML="\0asdf\0"   => 5 / asdf%uFFFD / asdf%uFFFD
  textContent="\0asdf\0" => 6 / %00asdf%00 / %00asdf%00
HTML5 is enabled.
  innerHTML="\0asdf\0"   => 2 / %uFFFD%uFFFD / %uFFFD%uFFFD
  textContent="\0asdf\0" => 6 / %00asdf%00 / %00asdf%00
HTML5 is disabled: The parser appears to discard the first 0x00 and replace the last 0x00 with U+FFFD.
HTML5 is enabled: Parser replaces 0x00+asd by U+FFFD and f+0x00 by U+FFFD?
What is correct handling of 0x00 in HTML source?
If a script uses obj.innerHTML to put a special binary value like "\0" in a text node, I think that can be called wrong use or misuse of the DOM property by the script.
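The reporting format used above can be reproduced without a DOM. The following is a minimal sketch, not the attached test case: `summarize` is a hypothetical helper name, and the input strings are simply the values reported in this comment, not something the sketch obtains from a parser.

```javascript
// Hypothetical helper (not taken from the attachment): formats a string the
// way the results above are reported, as "length / escape(value)".
function summarize(value) {
  // escape() is a legacy global function; it encodes U+0000 as %00 and
  // code units above 0xFF as %uXXXX.
  return value.length + " / " + escape(value);
}

// What textContent="\0asdf\0" is reported to store (both NULs kept).
console.log(summarize("\0asdf\0"));

// What innerHTML="\0asdf\0" is reported to produce with HTML5 disabled.
console.log(summarize("asdf\uFFFD"));

// What innerHTML="\0asdf\0" is reported to produce with HTML5 enabled.
console.log(summarize("\uFFFD\uFFFD"));
```

Comparing the escaped form rather than the rendered text is what makes the difference between a dropped U+0000 and one replaced by U+FFFD visible.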
Attachment #432503 - Attachment mime type: text/html → text/html;charset=iso-8859-1
(In reply to comment #4)
> HTML5 is enabled:
> Parser replaces 0x00+asd by U+FFFD and f+0x00 by U+FFFD?
That's odd.
> What is correct handling of 0x00 in HTML source?
The correct handling per HTML5 is replacing U+0000 with U+FFFD. The rationale is defense in depth. If the HTML5 parser dropped \0 but naive blacklist-based intermediate security enforcers didn't, an attacker could insert \0 to bypass the blacklists. (Whitelists would be better, of course, hence defense in depth as opposed to just defense.)
See also http://www.w3.org/Bugs/Public/show_bug.cgi?id=9096
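The defense-in-depth argument can be illustrated with a small sketch. Everything here is hypothetical: `naiveBlacklistAllows` stands in for a naive intermediate filter, and `dropNulls` / `replaceNulls` are simplified models of the two parser behaviors being compared, not the actual parser code.

```javascript
// Hypothetical naive blacklist: rejects any input containing "<script".
function naiveBlacklistAllows(input) {
  return !input.toLowerCase().includes("<script");
}

// Model A (the unsafe option): a parser that silently drops U+0000.
function dropNulls(input) {
  return input.replace(/\u0000/g, "");
}

// Model B: a parser that replaces U+0000 with U+FFFD, as described above.
function replaceNulls(input) {
  return input.replace(/\u0000/g, "\uFFFD");
}

// The attacker splits the blacklisted token with a NUL character.
const payload = "<scr\u0000ipt>alert(1)</script>";

// The filter sees no "<script" substring, so it lets the payload through.
const passesFilter = naiveBlacklistAllows(payload);

// Dropping the NUL reassembles the blacklisted token after the filter ran.
const afterDrop = dropNulls(payload);

// Replacing the NUL keeps the token broken.
const afterReplace = replaceNulls(payload);

console.log(passesFilter, afterDrop.includes("<script"), afterReplace.includes("<script"));
```

In this model, only the dropping parser turns a filtered-and-accepted string back into a script tag, which is the bypass the comment describes.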
So is there a way in html to display the code marker for \u0000 the way we can in JS via document.body.textContent="\0" ?
(In reply to comment #6)
<p>&#0;asdf&#0;</p> == <p>&#x0000;asdf&#x0000;</p> == obj.textContent="\0asdf\0";, if HTML5 is disabled.
If HTML5 is enabled, &#0; / &#x0000; was converted to U+FFFD.
Escaped values: (checked with Fx 3.6.0 on MS Win-XP SP3)
  HTML5 is disabled : 6 / %00asdf%00 / %00asdf%00
  HTML5 is enabled  : 6 / %uFFFDasdf%uFFFD / %uFFFDasdf%uFFFD
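The HTML5-enabled result above can be modeled with a short sketch. `decodeNumericReferences` is a hypothetical helper that handles only `&#N;` and `&#xH;` forms and maps a reference to code point 0 onto U+FFFD; it is an illustration of the observed behavior, not the parser's implementation, and it ignores the other special cases a real HTML parser handles.

```javascript
// Hypothetical, simplified decoder for numeric character references.
// A reference to code point 0 yields U+FFFD, matching the HTML5-enabled
// result reported above.
function decodeNumericReferences(text) {
  return text.replace(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    if (codePoint === 0) {
      return "\uFFFD"; // NUL is never emitted; use the replacement character
    }
    if (codePoint > 0x10FFFF) {
      return match; // outside the Unicode range: left untouched in this sketch
    }
    return String.fromCodePoint(codePoint);
  });
}

const decoded = decodeNumericReferences("&#0;asdf&#x0000;");
console.log(decoded.length + " / " + escape(decoded));
```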
Status: UNCONFIRMED → NEW
Ever confirmed: true
Summary: Text Node can't start with null when parsed as part of innerHTML → [HTML5] Normal characters dropped in innerHTML setter when surrounded by U+0000
Duplicating forward, because the newer bug has a fix.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → DUPLICATE
> So is there a way in html to display the code marker for \u0000 the way we can
> in JS via document.body.textContent="\0" ?
No. Why would you need it?