Closed Bug 1541822 Opened 6 years ago Closed 4 years ago

Import upstream patch that makes doctype name null to pass html5lib tokenizer tests

Categories

(Core :: DOM: HTML Parser, task, P3)

RESOLVED FIXED
93 Branch
Tracking Status
firefox93 --- fixed

People

(Reporter: hsivonen, Assigned: hsivonen)

Attachments

(2 files)

<!DOCTYPE> and similar report the tag name as the empty string. Should be null.

Severity: trivial → normal
Priority: -- → P3

Now I don't understand why the spec and the html5lib test suite expect null when the name is not nullable in the DOM spec: https://dom.spec.whatwg.org/#documenttype

Skipping https://github.com/validator/htmlparser/commit/3be25a0e44adda338c99bcc85ae9b6167522bc75 for the moment.

Sadly, the specified interface between the tokenizer and the tree builder differs for doctype tokens from what the DOM exposes. The html5lib test suite tests the tokenizer interface as specified. The tokenizer interface isn't exposed to the Web.

Taking the upstream patch here and ensuring that there are no Web-exposed changes is easier for everyone than either keeping the branch divergence between Gecko and the validator or getting the spec, the test suite, and all other projects relying on the test suite changed.

Type: defect → task
Summary: Doctype without tag name results in empty string tag name instead of null tag name → Import upstream patch that makes doctype name null to pass html5lib tokenizer tests

This change ensures that the tokenizer sets the doctype name to null
when the doctype name is missing in the input source.

Without this change, the doctype name is set to the empty string,
which doesn’t conform to the requirements in the HTML spec and
causes us to fail 9 tests in the html5lib-tests suite.
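
For illustration only, the distinction the patch draws can be sketched in a few lines of Python. This is not the htmlparser code: `doctype_name` is a hypothetical helper that ignores public and system identifiers, EOF handling, and most of the real tokenizer's states. It only shows "missing" (`None`) being kept distinct from the empty string.

```python
from typing import Optional

# ASCII whitespace as the tokenizer treats it: tab, LF, FF, CR, space.
HTML_WHITESPACE = "\t\n\f\r "


def doctype_name(source: str) -> Optional[str]:
    """Return the doctype name, or None when the name is missing.

    Hypothetical, heavily simplified helper: it only looks at the text
    between "<!DOCTYPE" and the first whitespace or ">" after the name.
    """
    prefix = "<!doctype"
    if source[:len(prefix)].lower() != prefix:
        raise ValueError("input does not start with a doctype")

    # Skip whitespace between the keyword and the name.
    rest = source[len(prefix):].lstrip(HTML_WHITESPACE)

    chars = []
    for ch in rest:
        if ch in HTML_WHITESPACE or ch == ">":
            break
        # Only ASCII letters are lowercased; other characters pass through.
        chars.append(ch.lower() if ch.isascii() else ch)

    # No name characters at all means "missing", reported as None,
    # not as the empty string.
    return "".join(chars) if chars else None
```

With this sketch, `doctype_name("<!DOCTYPE>")` yields `None`, while `doctype_name("<!DOCTYPE HTML>")` yields `"html"`.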

Relates to https://github.com/validator/htmlparser/issues/35

Unfortunately, the HTML tokenizer can emit a doctype token with a missing doctype name, and the html5lib
test suite tests this even though the doctype name is not nullable in the DOM. This test checks that
the empty string still appears in the DOM.
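
As a rough sketch of why nothing Web-exposed changes: the tree builder is the place where a missing name from the tokenizer becomes the empty string on the DocumentType node. The names below (`DocumentTypeNode`, `append_doctype`) are made up for this example and do not correspond to Gecko or htmlparser classes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DocumentTypeNode:
    # In the DOM these are plain strings and are never null.
    name: str
    public_id: str
    system_id: str


def append_doctype(
    name: Optional[str],
    public_id: Optional[str] = None,
    system_id: Optional[str] = None,
) -> DocumentTypeNode:
    """Build the DOM-side node from tokenizer-side values.

    Hypothetical helper: None stands for "missing" on the tokenizer
    side and is mapped to the empty string on the DOM side.
    """
    return DocumentTypeNode(
        name=name if name is not None else "",
        public_id=public_id if public_id is not None else "",
        system_id=system_id if system_id is not None else "",
    )
```

Under these assumptions, `append_doctype(None).name` is `""`, which is the DOM-visible behavior the added test is meant to pin down.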

Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
Pushed by hsivonen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/49feef3910bf
Ensure doctype name is set to null when missing r=smaug
https://hg.mozilla.org/integration/autoland/rev/9ba4529104b9
test - Check that null doctype name from the tokenizer shows up as the empty string in the DOM. r=smaug
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/30085 for changes under testing/web-platform/tests
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 93 Branch
Upstream PR was closed without merging
Upstream PR merged by moz-wptsync-bot