Closed
Bug 153777
Opened 23 years ago
Closed 16 years ago
Entities should not be parsed in HTML ID attribute values
Categories
(Core :: DOM: HTML Parser, defect, P3)
Tracking
Status: RESOLVED WONTFIX
Target Milestone: Future
People
(Reporter: bugmail, Unassigned)
Details
(Keywords: testcase)
Attachments
(2 files, 1 obsolete file)
Character references in ID and class attribute values should not be interpreted
to correspond to their decoded counterparts in style data, since such references
are not allowed in ID and class attribute values.
That is, an element whose ID is written with a character reference for "D"
(e.g. <p id="&#68;">) should not get style data from a statement with the
selector "#D", and likewise for classes.
Scratch the part about classes; character references are allowed in those. My
mistake.
Summary: Character references in IDs and classes should not be interpreted for style purposes → Character references in IDs should not be interpreted for style purposes
Attachment #88880 - Attachment is obsolete: true
I believe the CLASS part of this bug is invalid, since
http://www.w3.org/TR/html4/struct/global.html#h-7.5.2
says that class is a cdata-list, whose definition explicitly includes character
entity handling, and the definition of CLASS makes no exception.
The issue with IDs is valid but is a bug in the HTML parser, not the style system.
Assignee: dbaron → harishd
Component: Style System → Parser
QA Contact: ian → moied
Summary: Character references in IDs should not be interpreted for style purposes → Entities should not be handled in HTML ID attributes
I submitted the previous comment over a mid-air collision since I wanted to
override the changes. Confirming bug, since the collision handling didn't do
that right the first time.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 6•23 years ago
The bug will be hard to fix, unfortunately. Our attribute parsing more or less
assumes that all attributes are CDATA, so at the point where the tokenizer
actually expands entities in attributes, it doesn't know an ID from its own right
elbow. This might change when harishd tries to remove the tokenizer.
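
As an illustration of the behaviour described in this comment, here is a minimal sketch using Python's standard html.parser module rather than Mozilla's tokenizer, so it is only an analogy and not the code under discussion. The class name AttrCollector and the sample markup are made up for the example. Like the tokenizer described above, this parser expands character references in every attribute value before it knows which attribute it is looking at:

```python
from html.parser import HTMLParser


class AttrCollector(HTMLParser):
    """Hypothetical helper: records (tag, attribute name, value) triples."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # By the time this callback runs, character references in the
        # attribute values have already been replaced, for every attribute.
        for name, value in attrs:
            self.seen.append((tag, name, value))


collector = AttrCollector()
collector.feed('<p id="&#68;" class="&#68;" title="&#68;">text</p>')
collector.close()

for tag, name, value in collector.seen:
    print(tag, name, repr(value))
```

All three attributes come out as 'D'; the id value gets no special treatment, which is the situation this bug asks to change.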
Keywords: testcase
Status: NEW → ASSIGNED
Priority: -- → P3
Target Milestone: --- → Future
Also reproduced using Win32/2003010408. Setting All/All.
OS: MacOS X → All
Hardware: Macintosh → All
Summary: Entities should not be handled in HTML ID attributes → Entities should not be parsed in HTML ID attribute values
Christopher, is there a bug about harishd's change you refer to in your comment 6?
Comment 9•22 years ago
Bug 105138, but it's very blue-sky at the moment; harishd has essentially placed
the parser in maintenance mode (he's working mostly on SOAP/P3P/WSDL), and it's
rather regression-prone, to say the least. There might be an opportunity for
this in 1.5 alpha, but I don't know who'd do the work...
Comment 10•22 years ago
Chris: I did land a patch, some time last year, to reduce heap overhead caused
by the parser nodes (bug 177994); however, I never got the time to spend on
tokens. That said, eliminating tokens is nontrivial and I dare not do it :(
Comment 11•20 years ago
Comment 12•20 years ago
According to
http://www.w3.org/TR/html4/types.html#type-id
IDs should all match the regexp:
[a-zA-Z][a-zA-Z0-9:._-]*
I believe this is relevant, as another side effect of parsing the ID attribute
as CDATA is that it allows other illegal ID forms such as
id="42"
id="Div 1"
id="item#3"
None of these are matched by CSS selectors, but all are accessible via
JavaScript. (See the attached example in comment 11.)
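
A quick way to check candidate values against the pattern quoted above is sketched below in Python. The function name is_html4_id and the sample values other than the three from this comment are assumptions made for the example; the regular expression itself is the one quoted from the HTML 4 type definition.

```python
import re

# Pattern quoted in the comment above (HTML 4 ID token).
HTML4_ID = re.compile(r"[a-zA-Z][a-zA-Z0-9:._-]*")


def is_html4_id(value: str) -> bool:
    """Return True only if the whole value matches the HTML 4 ID pattern."""
    return HTML4_ID.fullmatch(value) is not None


for candidate in ["D", "section-1.2", "42", "Div 1", "item#3"]:
    print(repr(candidate), is_html4_id(candidate))
```

The first two values pass; "42" fails because it starts with a digit, "Div 1" because of the space, and "item#3" because "#" is not in the allowed set.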
Updated•16 years ago
Assignee: harishd → nobody
Status: ASSIGNED → NEW
QA Contact: moied → parser
Comment 13•16 years ago
I'm taking the liberty of marking this WONTFIX per HTML5.
Status: NEW → RESOLVED
Closed: 16 years ago
Resolution: --- → WONTFIX