Closed
Bug 474670
Opened 16 years ago
Closed 15 years ago
Unterminated HTML entities sometimes rendered
Categories
(Core :: DOM: HTML Parser, defect)
RESOLVED
FIXED
People
(Reporter: volkmarkostka, Unassigned)
References
Details
(Keywords: regression, Whiteboard: [fixed by the HTML5 parser])
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20090121 Minefield/3.2a1pre
Build Identifier: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.9.2a1pre) Gecko/20090121 Minefield/3.2a1pre
If you put a string like "<a href="http://go/here?id=1&lang=0">here</a>" inside a textarea the &lang unterminated entity is rendered. The output is "<a href="http://go/here?id=1<=0">here</a>".
This also happens if you use the HTML editor of the "WebDeveloper Toolbar".
First described here: http://forums.mozillazine.org/viewtopic.php?f=25&t=1043535
This seriously impacts CMS systems.
Reproducible: Always
Steps to Reproduce:
Load this HTML page:
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>Untitled Document</title>
</head>
<body>
<form action="test.html">
<textarea name="text" cols="80"><a href="http://go/here?id=1&lang=0">here</a></textarea>
</form>
</body>
</html>
Actual Results:
<a href="http://go/here?id=1<=0">here</a>
Expected Results:
<a href="http://go/here?id=1&lang=0">here</a>
Comment 1•16 years ago
This has changed in the last few months of 2005.
Keywords: regression
Product: Firefox → Core
QA Contact: general → general
Version: unspecified → Trunk
Comment 2•16 years ago (Reporter)
2005? That old?
Comment 3•16 years ago
Yeah. Regression range is
http://bonsai.mozilla.org/cvsquery.cgi?module=PhoenixTinderbox&date=explicit&mindate=2005-11-03+03%3A00&maxdate=2005-11-03+14%3A00
I think it could be bug 312104.
Blocks: 312104
Status: UNCONFIRMED → NEW
Component: General → HTML: Parser
Ever confirmed: true
QA Contact: general → parser
Comment 4•16 years ago
I claim this is a dup of bug 155047. That makes it surprising that there's a regression range in 2005, though.
Comment 5•16 years ago (Reporter)
Yep. Sounds very much like bug 155047, but the context seems different.
Comment 6•16 years ago
How is the context different? Both bugs are about the parsing of <a href>.
Comment 7•16 years ago
See Bug 278404 (DUP'ed to Bug 155047) for a lecture by Boris Zbarsky on "Character Entity of HTML (based on SGML)", given for Dan (opener of Bug 278404) and stupid me.
The history appears to be:
1) Initially, it worked as designed (as Bug 155047 and many DUPs say).
2) It was broken between 2004-11-10 and 2004-12-19 (perhaps by bug 88952)
for a "Character Entity delimited by a space" in <textarea>.
=> Bug 312104 was opened and fixed in 2005.
Note:
Since its Target Milestone is mozilla1.9alpha1, Bug 312104 still occurs on Sm 1.x.
This bug's actual result is evidence that Bug 312104 was fixed correctly.
Comment 8•16 years ago (Reporter)
To comment 6:
The difference in the context is that the link in the textarea is displayed, not rendered.
For me the question is whether the content of a textarea is part of the DOM or not. If not, the content should not be interpreted at all.
Comment 10•16 years ago
Per HTML5, textarea is an RCDATA element, and RCDATA elements can contain character references.
That being said, the HTML5 parser will not treat "&lang" (without a semicolon) as a named character reference.
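The behaviour for "&lang" can be checked outside the browser. This is only a minimal sketch, assuming Python's standard `html.unescape`, which is documented to follow the HTML5 rules for character references in text. It is not Gecko's parser, and the sample strings are the ones from this bug plus a terminated variant added for comparison:

```python
import html

# Textarea content from the original report: "&lang" has no semicolon,
# and "lang" is not one of the legacy names that may omit it.
unterminated = '<a href="http://go/here?id=1&lang=0">here</a>'

# The same content with the reference explicitly terminated (added here
# for comparison; it does not appear in the report).
terminated = '<a href="http://go/here?id=1&lang;=0">here</a>'

decoded_unterminated = html.unescape(unterminated)
decoded_terminated = html.unescape(terminated)

print(decoded_unterminated)  # left exactly as written
print(decoded_terminated)    # "&lang;" replaced by U+27E8
```

Only the terminated form is decoded, which matches the expected result in the original report.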
Updated•15 years ago
Depends on: html5-parsing
Comment 11•15 years ago (Reporter)
Additional info for trunk:
If HTML5 is enabled then the display is correct.
Updated•15 years ago
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by the HTML5 parser]
Comment 12•13 years ago
Apologies for bumping this old bug, but I didn't want to log a new bug as I am pretty confident the issue I am seeing is a regression of this one.
The code in the original submission seems to be clear of this problem, but I can still reproduce the bug using different entity tags - for example, &lt instead of &lang.
There appears to be some confusion about how this should work (there are several older bugs here about it : 222193, 155047) - apparently in ye olde days it was valid SGML, but this appears to now be at odds with the HTML5 spec ( http://www.w3.org/TR/html5/syntax.html#character-references ) which states: "The name must be one that is terminated by a U+003B SEMICOLON character (;)", which seems pretty unequivocal to me :)
The following code reproduces it for me:
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<title>Untitled Document</title>
</head>
<body>
<form action="test.html">
<textarea name="text" cols="80"><a href="http://go/here?id=1&lt=0">here</a></textarea>
</form>
</body>
</html>
Actual results:
<a href="http://go/here?id=1<=0">here</a>
Expected results:
<a href="http://go/here?id=1&lt=0">here</a>
Comment 13•13 years ago
(In reply to David Harrison from comment #12)
> this appears to now be at odds with the HTML5
> spec ( http://www.w3.org/TR/html5/syntax.html#character-references ) which
> states: "The name must be one that is terminated by a U+003B SEMICOLON
> character (;)", which seems pretty unequivocal to me :)
That text states a requirement for writing HTML, not for consuming it. The rules for consuming character references are at http://www.whatwg.org/specs/web-apps/current-work/multipage/tokenization.html#consume-a-character-reference
When <a href="http://go/here?id=1&lt=0">here</a> occurs as markup, "http://go/here?id=1&lt=0" is part of an attribute value and &lt= does not tokenize as a character reference.
The rules for attribute values and non-attribute value text are different. In your example, <a href="http://go/here?id=1&lt=0">here</a> is text in a textarea, so the attribute value rules don't apply, so &lt tokenizes as a character reference.
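The difference between the two contexts can be sketched roughly as follows. This is a simplified illustration, not the tokenizer from the spec or from Gecko: `decode_named` and its `in_attribute` flag are hypothetical names, only named references are handled, and the name table is taken from Python's `html.entities.html5`.

```python
import re
from html.entities import html5  # names with ";" plus legacy names without it

# "&" followed by an alphanumeric name and an optional ";".
_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*)(;?)")


def decode_named(text: str, in_attribute: bool) -> str:
    """Decode named character references in text or in an attribute value."""

    def replace(match: re.Match) -> str:
        name, semicolon = match.group(1), match.group(2)
        if semicolon and name + ";" in html5:
            return html5[name + ";"]
        # No exact terminated match: look for the longest legacy name
        # (one allowed without ";") at the start of the name.
        for end in range(len(name), 0, -1):
            if name[:end] in html5:
                tail = name[end:] + semicolon + text[match.end():match.end() + 1]
                following = tail[:1]
                # Attribute values only: leave the text alone when the
                # unterminated name is followed by "=" or an alphanumeric.
                if in_attribute and (
                    following == "="
                    or (following.isascii() and following.isalnum())
                ):
                    return match.group(0)
                return html5[name[:end]] + name[end:] + semicolon
        return match.group(0)

    return _REF.sub(replace, text)


url = "http://go/here?id=1&lt=0"
print(decode_named(url, in_attribute=True))   # http://go/here?id=1&lt=0
print(decode_named(url, in_attribute=False))  # http://go/here?id=1<=0
```

The same string therefore survives as an attribute value but is changed when it appears as textarea text.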
In textarea text, you need to escape & as &amp;.
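For a CMS that generates the textarea content on the server, the escaping step might look like the sketch below. It assumes the content is produced in Python and uses the standard `html.escape`; the variable names are illustrative only.

```python
import html

# Markup that should appear literally inside a <textarea>.
literal = '<a href="http://go/here?id=1&lt=0">here</a>'

# Escape "&", "<" and ">" for use as element text; quote=False leaves
# the double quotes alone, since this is not going into an attribute.
escaped = html.escape(literal, quote=False)
print(escaped)

# Decoding the escaped text gives back the original markup.
round_trip = html.unescape(escaped)
print(round_trip == literal)
```

Because every "&" is written as "&amp;", the parser no longer sees "&lt" as the start of a character reference.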
Comment 14•13 years ago
Henri,
Thanks for the clarification on consuming/writing. I'll have a read of that now.
I should also point out that the conversion of &lt also takes place outside of TEXTAREAs - e.g.,
<!DOCTYPE html>
<html>
<body>
Here's a test &lt
</body>
</html>
Comment 15•13 years ago
(In reply to David Harrison from comment #14)
> I should also point out that the conversion of &lt also takes place outside
> of TEXTAREAs - e.g.,
>
> <!DOCTYPE html>
> <html>
> <body>
> Here's a test &lt
> </body>
> </html>
That's expected, since &lt is outside an attribute value here, too.