Closed
Bug 474670
Opened 15 years ago
Closed 14 years ago
Unterminated HTML entities sometimes rendered
Categories
(Core :: DOM: HTML Parser, defect)
RESOLVED
FIXED
People
(Reporter: volkmarkostka, Unassigned)
Details
(Keywords: regression, Whiteboard: [fixed by the HTML5 parser])
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20090121 Minefield/3.2a1pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20090121 Minefield/3.2a1pre

If you put a string like "<a href="http://go/here?id=1&lang=0">here</a>" inside a textarea, the unterminated &lang entity is rendered. The output is "<a href="http://go/here?id=1<=0">here</a>". This also happens if you use the HTML editor of the "WebDeveloper Toolbar".

First described here: http://forums.mozillazine.org/viewtopic.php?f=25&t=1043535

This seriously impacts CMS systems.

Reproducible: Always

Steps to Reproduce:
Load this HTML page:

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>Untitled Document</title>
</head>
<body>
<form action="test.html">
<textarea name="text" cols="80"><a href="http://go/here?id=1&lang=0">here</a></textarea>
</form>
</body>
</html>

Actual Results:
<a href="http://go/here?id=1<=0">here</a>

Expected Results:
<a href="http://go/here?id=1&lang=0">here</a>
Comment 1•15 years ago
This has changed in the last few months of 2005.
Keywords: regression
Product: Firefox → Core
QA Contact: general → general
Version: unspecified → Trunk
Comment 2•15 years ago (Reporter)
2005? That old?
Comment 3•15 years ago
Yeah. Regression range is http://bonsai.mozilla.org/cvsquery.cgi?module=PhoenixTinderbox&date=explicit&mindate=2005-11-03+03%3A00&maxdate=2005-11-03+14%3A00 I think it could be bug 312104.
Blocks: 312104
Status: UNCONFIRMED → NEW
Component: General → HTML: Parser
Ever confirmed: true
QA Contact: general → parser
Comment 4•15 years ago
I claim this is a dup of bug 155047. That makes it surprising that there's a regression range in 2005, though.
Comment 5•15 years ago (Reporter)
Yep. Sounds very much like bug 155047, but the context seems different.
Comment 6•15 years ago
How is the context different? Both bugs are about the parsing of <a href>.
Comment 7•15 years ago
See bug 278404 (duped to bug 155047) for Boris Zbarsky's explanation of character entities in SGML-based HTML, written for Dan (the opener of bug 278404) and stupid me.

The history looks to be:
1) Initially, it worked as designed (as bug 155047 and its many dups say).
2) It broke between 2004-11-10 and 2004-12-19 (perhaps by bug 88952) for the "character entity delimited by a space" case in <textarea>. => Bug 312104 was opened and fixed in 2005.

Note: Since the Target Milestone of bug 312104 is mozilla1.9alpha1, it still occurs on Sm 1.x.

This bug's actual result is evidence that bug 312104 was fixed correctly.
Comment 8•15 years ago (Reporter)
To comment 6: The difference in context is that the link in the textarea is displayed as text, not rendered as a link. For me, the question is whether the content of a textarea is part of the DOM or not. If not, the content should not be interpreted at all.
Comment 10•15 years ago
Per HTML5, textarea is an RCDATA element. RCDATA elements can contain character references. That being said, the HTML5 parser will not treat "&lang" (without a semicolon) as a named character reference.
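
To illustrate the point about "&lang", here is a minimal sketch using Python's standard html.unescape, which is documented to follow the HTML5 rules for character references in text. It is only an approximation of what a browser does for RCDATA content, not Gecko's implementation; the variable names are made up for this example.

```python
import html

# The textarea content from the original report.
raw = '<a href="http://go/here?id=1&lang=0">here</a>'

# "&lang" has no trailing semicolon and is not one of the legacy names
# that HTML5 accepts unterminated, so the text comes back unchanged.
decoded = html.unescape(raw)
print(decoded == raw)

# With the semicolon, "&lang;" is a named character reference (U+27E8).
terminated = html.unescape("id=1&lang;=0")
print(terminated == "id=1\u27e8=0")
```

Both comparisons print True, which matches the expected result in the report: the unterminated "&lang" stays literal.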
Updated•15 years ago
Depends on: html5-parsing
Comment 11•15 years ago (Reporter)
Additional info for trunk: If HTML5 is enabled then the display is correct.
Updated•14 years ago
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by the HTML5 parser]
Comment 12•13 years ago
Apologies for bumping this old bug, but I didn't want to log a new bug as I am pretty confident the issue I am seeing is a regression of this one.

The code in the original submission seems to be clear of this problem, but I can still reproduce the bug using different entity names, for example &lt instead of &lang.

There appears to be some confusion about how this should work (there are several older bugs here about it: 222193, 155047). Apparently in ye olde days it was valid SGML, but this appears to now be at odds with the HTML5 spec ( http://www.w3.org/TR/html5/syntax.html#character-references ) which states: "The name must be one that is terminated by a U+003B SEMICOLON character (;)", which seems pretty unequivocal to me :)

The following code reproduces it for me:

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>Untitled Document</title>
</head>
<body>
<form action="test.html">
<textarea name="text" cols="80"><a href="http://go/here?id=1&lt=0">here</a></textarea>
</form>
</body>
</html>

Actual results:
<a href="http://go/here?id=1<=0">here</a>

Expected results:
<a href="http://go/here?id=1&lt=0">here</a>
Comment 13•13 years ago
(In reply to David Harrison from comment #12)
> this appears to now be at odds with the HTML5
> spec ( http://www.w3.org/TR/html5/syntax.html#character-references ) which
> states: "The name must be one that is terminated by a U+003B SEMICOLON
> character (;)", which seems pretty unequivocal to me :)

That text states a requirement for writing HTML, not for consuming it. The rules for consuming character references are at http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#consume-a-character-reference

When <a href="http://go/here?id=1&lt=0">here</a> occurs as markup, "http://go/here?id=1&lt=0" is part of an attribute value and &lt= does not tokenize as a character reference. The rules for attribute values and non-attribute value text are different.

In your example, <a href="http://go/here?id=1&lt=0">here</a> is text in a textarea, so the attribute value rules don't apply, so &lt tokenizes as a character reference. In textarea text, you need to escape & as &amp;.
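
The difference between the two contexts can be sketched roughly as follows. This is a simplified illustration, not the spec algorithm or Gecko's code: LEGACY_NAMES is a hypothetical four-entry subset of the names that may appear without a semicolon, and decode_refs is a made-up helper. The assumption encoded here is the attribute-value exception from the tokenization rules: an unterminated reference followed by "=" or an ASCII alphanumeric is left alone inside an attribute value.

```python
import html
import re

# Hypothetical subset of the names HTML accepts without a trailing ";".
LEGACY_NAMES = {"lt": "<", "gt": ">", "amp": "&", "quot": '"'}
_REF = re.compile(r"&(" + "|".join(LEGACY_NAMES) + r")(;?)")


def decode_refs(text: str, in_attribute: bool) -> str:
    """Decode the legacy references above, with the attribute-value exception."""

    def replace(match: re.Match) -> str:
        name, semicolon = match.group(1), match.group(2)
        if in_attribute and not semicolon:
            # Look at the character right after the unterminated name.
            next_char = text[match.end():match.end() + 1]
            if next_char == "=" or (next_char.isascii() and next_char.isalnum()):
                return match.group(0)  # left as literal text
        return LEGACY_NAMES[name]

    return _REF.sub(replace, text)


url = "http://go/here?id=1&lt=0"

# As an attribute value, "&lt=" is not decoded.
as_attribute = decode_refs(url, in_attribute=True)

# As textarea (or body) text, "&lt" is decoded to "<".
as_text = decode_refs(url, in_attribute=False)

# Escaping "&" as "&amp;" when writing the textarea content avoids the problem.
escaped = html.escape(url, quote=False)
round_trip = html.unescape(escaped)

print(as_attribute)
print(as_text)
print(escaped)
print(round_trip == url)
```

For content generated by a CMS, the practical takeaway is the last step: escape the text before placing it inside the textarea, so that the parser's decoding restores the original string.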
Comment 14•13 years ago
Henri,

Thanks for the clarification on consuming/writing. I'll have a read of that now.

I should also point out that the conversion of &lt also takes place outside of TEXTAREAs, e.g.:

<!DOCTYPE html>
<html>
<body>
Here's a test &lt
</body>
</html>
Comment 15•13 years ago
(In reply to David Harrison from comment #14)
> I should also point out that the conversion of &lt also takes place outside
> of TEXTAREAs - e.g.,
>
> <!DOCTYPE html>
> <html>
> <body>
> Here's a test &lt
> </body>
> </html>

That's expected, since &lt is outside an attribute value here, too.