Closed Bug 608408 Opened 14 years ago Closed 14 years ago

innerHTML round-trip failure with "<a><table><a>"

Categories

(Core :: DOM: HTML Parser, defect)

x86
macOS
defect
Not set
normal

RESOLVED WONTFIX

People

(Reporter: jruderman, Unassigned)

References

Details

(Keywords: testcase)

Attachments

(1 file)

Attached file testcase
When given "<a><table><a>", the parser generates an insane tree:
  <a><a></a><table></table></a>

This is difficult to serialize as text/html, because "<a><a>" will not parse as nested 'a' elements.

I claim this is a bug in the parser, not the serializer: the parser should not generate nested 'a' elements in this case.
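
To make the round-trip failure concrete, here is a minimal Python sketch. It is a toy model, not the HTML5 tree builder: the names Node, serialize, ToyTreeBuilder and toy_parse are hypothetical, the reported tree is built by hand rather than parsed, and the builder models only one rule (a new 'a' start tag closes an already-open 'a'). Scoping, foster parenting, text and attributes are all ignored.

```python
from html.parser import HTMLParser


class Node:
    """Minimal element node: a tag name plus child elements (no text, no attributes)."""

    def __init__(self, tag, children=None):
        self.tag = tag
        self.children = list(children or [])


def serialize(node):
    """Serialize the children of `node`, roughly what innerHTML returns."""
    return "".join(
        f"<{child.tag}>{serialize(child)}</{child.tag}>" for child in node.children
    )


class ToyTreeBuilder(HTMLParser):
    """Toy tree builder (assumption: models only 'a new <a> closes an open <a>')."""

    def __init__(self):
        super().__init__()
        self.root = Node("#root")
        self.stack = [self.root]  # stack of open elements; root is never popped

    def _is_open(self, tag):
        return any(n.tag == tag for n in self.stack[1:])

    def _close(self, tag):
        # Pop open elements up to and including the nearest one named `tag`.
        while self.stack.pop().tag != tag:
            pass

    def handle_starttag(self, tag, attrs):
        if tag == "a" and self._is_open("a"):
            self._close("a")  # simplified stand-in for the real algorithm
        node = Node(tag)
        self.stack[-1].children.append(node)
        self.stack.append(node)

    def handle_endtag(self, tag):
        if self._is_open(tag):
            self._close(tag)
        # A stray end tag with no matching open element is ignored.


def toy_parse(markup):
    builder = ToyTreeBuilder()
    builder.feed(markup)
    builder.close()
    return builder.root


# Tree reported for "<a><table><a>": the inner 'a' sits before the table,
# both inside the outer 'a'. Built by hand here, not produced by toy_parse.
reported = Node("#root", [Node("a", [Node("a"), Node("table")])])
html = serialize(reported)
reparsed = serialize(toy_parse(html))
print(html)      # <a><a></a><table></table></a>
print(reparsed)  # <a></a><a></a><table></table>
```

Under this simplified rule, the reparsed markup has the two 'a' elements as siblings followed by the table, so the serialized text no longer describes the original tree.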
Component: DOM → HTML: Parser
QA Contact: general → parser
This is what the HTML5 spec currently requires the parser to do, as far as I can tell from reading it. The <table> means that an <a> inside it can happen without closing the outer <a>, but then the adoption agency algorithm kicks in.

I'm not sure it's worth messing with the parsing here for this edge case, though...

Note that in HTML it's trivial to create non-round-trippable situations in all sorts of ways (in the sense that serializing and reparsing the DOM won't behave like the original DOM).  Unfortunately, these situations are required for web compat.
So from what I understand, what Jesse has been detecting is when a DOM *outputted from the parser* can't be round-tripped through a serializer.

I'm not sure how common it is for that to happen; so far I think this is the only legitimate such case found.

I agree that it's not important in and of itself to fix this edge case. In particular, if it turns out that there are lots more like it, then we should simply stop running this test.

But if this is the only case when this can happen, or at least one of very few, then it might be worth fixing or whitelisting these cases.
The spec even has an example about this now:
"In the non-conforming stream <a href="a">a<table><a href="b">b</table>x, the first a element would be closed upon seeing the second one, and the "x" character would be inside a link to "b", not to "a". This is despite the fact that the outer a element is not in table scope (meaning that a regular </a> end tag at the start of the table wouldn't close the outer a element). The result is that the two a elements are indirectly nested inside each other — non-conforming markup will often result in non-conforming DOMs when parsed."

Resolving as WONTFIX in the "product behaves as intended" sense.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → WONTFIX