Closed
Bug 20253
Opened 26 years ago
Closed 26 years ago
when parsing html, eHTMLTag_entity GetText includes ; but not &
Categories
(Core :: DOM: HTML Parser, defect, P3)
Tracking
()
VERIFIED
FIXED
M13
People
(Reporter: akkzilla, Assigned: akkzilla)
Details
When parsing XIF to the nsHTMLContentSinkStream, when we get an eHTMLTag_entity
tag in AddLeaf at line 996, GetText() returns the inner part of the entity (e.g.
"lt"). But when we're parsing html, GetText for an eHTMLTag_entity includes the
semicolon. This causes us to get double semicolons in html output that was
originally parsed from html (e.g. in automated tests, or in a stream converter).
To see this, build in htmlparser/tests/outsinks, add a printf or set a
breakpoint to see what GetText is returning, then go to dist/bin and run:
TestOutput -i text/html -o text/html -f 0 OutTestData/simple.html
I don't see a way to hack in a temporary workaround, because
nsHTMLContentSinkStream doesn't know whether it's being called from parsing XIF
or HTML; it needs to be able to rely on the result being consistent either way.
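To make the symptom concrete, here is a minimal sketch of how the double semicolon arises. It assumes the sink writes the entity as "&" + GetText() + ";"; the function name EmitEntityOld is hypothetical and is not the actual nsHTMLContentSinkStream code.

```cpp
#include <string>

// Hypothetical stand-in for the sink's entity output, assuming it always
// adds both the leading '&' and the trailing ';' around the token text.
std::string EmitEntityOld(const std::string& tokenText) {
    return "&" + tokenText + ";";
}

// XIF path:  token text is "lt"  -> output has one semicolon.
// HTML path: token text is "lt;" -> output has two semicolons.
```

With the XIF-style text "lt" this yields a well-formed entity, but with the HTML-style text "lt;" the appended semicolon is doubled.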
This isn't a bug in the parser. When the HTML file is loaded, the entity is
stored as it was seen, so the semicolon may or may not be present. We don't
strip the semicolons. The XIF buffer that is provided to the XIFDTD, and
subsequently to the nsHTMLContentSinkStream, has had the semicolons stripped
from the entities. There's nothing I can do about that. The semicolons need to
come out of the content model if they were present when we read the file.
Updated • 26 years ago (Assignee)
Status: NEW → RESOLVED
Closed: 26 years ago
Resolution: --- → FIXED
Comment 2 • 26 years ago (Assignee)
Rick and I discussed this: turns out that the ; isn't actually required, and
isn't always there, so the parser includes it when it is there. The &, on the
other hand, is required, so the parser doesn't bother to include it. We've
changed the nsXIFDTD to include the semicolon like the CNavDTD does, and removed
the inclusion of the semicolon from the sink.
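The resulting convention can be sketched as follows: the token text never carries the "&" but does carry the ";" whenever the source had one, so the sink only prepends the ampersand. The name EmitEntity is hypothetical and this is an illustration of the convention, not the code that was checked in.

```cpp
#include <string>

// Hypothetical sketch of the sink after the change: the DTD (CNavDTD or
// nsXIFDTD) is assumed to hand over text that already ends in ';' when the
// semicolon was present, so the sink adds only the required '&'.
std::string EmitEntity(const std::string& tokenText) {
    return "&" + tokenText;
}
```

Under this convention an entity written without a semicolon in the source (text "nbsp") round-trips without one, and an entity written with one (text "lt;") is emitted with exactly one.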
Updated • 26 years ago (Assignee)
Status: RESOLVED → VERIFIED
Comment 3 • 26 years ago (Assignee)
QA: you can't verify this with a release build, and no one else cares, so I'll
mark it verified.