Open Bug 1490606 Opened 6 years ago Updated 2 years ago

Warn when parsing void or self-closing custom elements

Categories

(Core :: DOM: HTML Parser, enhancement, P3)

enhancement

Tracking

()

People

(Reporter: vporof, Unassigned)

References

(Blocks 2 open bugs)

Details

Current behaviour when encountering malformed custom element tags is unspecced. There’s ample discussion and various reasons as to why, some of them being security and reducing complexity[0][1]. Parsing these void or self-closing custom elements is treated similarly to other regular HTML elements (that is, differently to contentless[2] exceptions), whereby these tags are considered never closing and thus parenting their subsequent textual siblings. Custom elements are often used in conjunction with templating[3], either through tagged template literals or otherwise some other form of programmatically created template elements. When it comes to HTML parsing, common high profile examples of this are with Polymer[4], LitHTML[5] or HyperHTML[6]. All of this is at odds with the status-quo of defining and using composable components in virtual-dom powered frameworks, where XML-like parsing is expected. This can make the transition to using Web Components, Polymer, LitHTML, HyperHTML etc. more surprising to developers unfamiliar with the intricacies of XML vs. HTML parsing. Consider the following two examples, where Foo and Bar are components: React: ``` class Parent extends React.Component { render() { return ( <div> <MyFoo /> <MyBar /> </div> ); } } ``` Polymer: ``` class Parent extends PolymerElement { static get template() { return html` <div> <my-foo /> <my-bar /> </div> `; } } ``` In the Polymer example, the Bar component is parented under the Foo component’s shadow root, regardless of whether or not slots are used. In the React example, Foo and Bar are siblings, regardless of whether or not children props are used. Needless to say that the output is structurally sufficiently different to be a source of frustration. I suggest having console warnings in place for helping out in these situations. Console output is not specced and would help. I wonder if there are any concerns about performance for HTML parsing for regular web content, as parsing errors or quirks are extensively handled but not actively reported. Perhaps view-source tools are a good place for this instead? It’s not obvious to me how ergonomic that'd be, but it should be possible to add annotations in the devtools debugger for programmatically created template elements encountering void or self closing custom elements tags.
 One additional area where this could help us directly is in transitioning browser.xml to HTML. There’s plenty of places where we use custom elements and assume XML parsing, e.g.[7]. Thoughts? [0] https://github.com/w3c/webcomponents/issues/624 [1] https://github.com/whatwg/html/issues/919#issuecomment-276329905 [2] https://www.w3.org/TR/2011/WD-html-markup-20110113/syntax.html#void-elements [3] https://developer.mozilla.org/en-US/docs/Web/HTML/Element/template [4] https://www.polymer-project.org/3.0/docs/devguide/dom-template#dommodule [5] https://github.com/Polymer/lit-html [6] https://github.com/WebReflection/hyperHTML [7] https://searchfox.org/mozilla-central/source/browser/base/content/urlbarBindings.xml#48
(In reply to Victor Porof [:vporof][:vp] from comment #0) > Current behaviour when encountering malformed custom element tags is > unspecced. How do you mean? Surely the situation is specced to be exactly what you observe with Polymer. > Thoughts? The parser has the capability to report this error in View Source, but View Source isn't really relevant for HTML snippets embedded in JS. https://searchfox.org/mozilla-central/source/parser/html/nsHtml5TreeBuilderCppSupplement.h#1865 It would probably be easier to expose all tree builder errors in the console than pick and choose which ones to expose. For perf reasons, we should probably do the reporting only if the console is open before the parse.
(In reply to Henri Sivonen (:hsivonen) from comment #1) > (In reply to Victor Porof [:vporof][:vp] from comment #0) > > Current behaviour when encountering malformed custom element tags is > > unspecced. > > How do you mean? Surely the situation is specced to be exactly what you > observe with Polymer. That's right, parsing of void or self-closing custom elements isn't handled by the HTML parser in the same way XML would, and that's by design. > > > Thoughts? > > The parser has the capability to report this error in View Source, but View > Source isn't really relevant for HTML snippets embedded in JS. > https://searchfox.org/mozilla-central/source/parser/html/ > nsHtml5TreeBuilderCppSupplement.h#1865 > > It would probably be easier to expose all tree builder errors in the console > than pick and choose which ones to expose. For perf reasons, we should > probably do the reporting only if the console is open before the parse. That's great, so it looks like all of this stuff is already handled, it's just a matter of surfacing this stuff in devtools one way or another. Any suggestions on how to do that in a performance-sensitive manner?
(In reply to Victor Porof [:vporof][:vp] from comment #2) > Any suggestions on how to do that in a performance-sensitive manner? We need some synchronous static method for querying if error reporting is wanted. The chrome process should report to content processes when the console is opened and closed so that the method can respond synchronously in content processes. We should add a boolean mReportErrors to nsHtml5TreeBuilder. We should set that based on the query in the first sentence and also when mViewSource is non-null. Then in parser/html/nsHtml5TreeBuilderCppSupplement.h every instance of if (MOZ_UNLIKELY(mViewSource)) { mViewSource->AddErrorToCurrentRun("errFoo"); } should be replaced with MaybeReportError("errFoo"); where the method MaybeReportError is void nsHtml5TreeBuilder::MaybeReportError(cont char* aErrorKey) { if (MOZ_UNLIKELY(mReportErrors)) { if (mBuilder) { mBuilder->ReportError(aErrorKey); } else if (MOZ_UNLIKELY(mViewSource)) { mViewSource->AddErrorToCurrentRun(aErrorKey); } else { nsHtml5TreeOperation* treeOp = mOpQueue.AppendElement(mozilla::fallible); if (MOZ_UNLIKELY(!treeOp)) { MarkAsBrokenAndRequestSuspensionWithoutBuilder(NS_ERROR_OUT_OF_MEMORY); return nullptr; } treeOp->Init(eTreeOpReportError, aErrorKey, tokenizer->getLineNumber()); } } } Where mBuilder->ReportError() is new main-thread-callable method that looks up the error message by localization key aErrorKey and reports the message to the console and eTreeOpReportError is a new tree op that once executed on the main thread reports the message identified by the localization key aErrorKey to the console with an associated line number.
Thanks for the ample feedback Henri!
Is there work to be done here?
Flags: needinfo?(vporof)
Yes, but there's no one to champion it yet. Is there a preference for moving this bug outside of this component? Work does involve changes to the HTML parser, but otherwise DevTools: Console might be appropriate.
Flags: needinfo?(vporof)
I'm happy to leave this here in the Parser component until we know what we're going to do.
Priority: -- → P3
hyperHTML permits all elements to be self‑closing by using regex replace to generate closing tags for all non‑void elements. See this gist for a detailed comparison of hyperHTML and Polymer’s lit‑html: https://gist.github.com/WebReflection/fadcc419f5ccaae92bc167d8ff5c611b
Parsing differences between all of these templating literals is going to generate even more confusion, unless there's some future where nuances are normalised.
See Also: → 1493964
Blocks: 1493964

Hello Victor,
I would be happy to work on this issue. Would request you to give me a gist of the conversation and maybe suggest steps to solve this issue. Am a new contributor. Helps to be clear that I understand everything and can then approach it effectively.

(In reply to Karan Sapolia from comment #10)

Would request you to give me a gist of the conversation and maybe suggest steps to solve this issue.

Comment 3 describes the parser-side change needed.

See
https://searchfox.org/mozilla-central/source/parser/html/nsHtml5TreeBuilderCppSupplement.h#1193
https://searchfox.org/mozilla-central/source/parser/html/nsHtml5TreeOperation.cpp#966
https://searchfox.org/mozilla-central/source/parser/html/nsHtml5TreeOpExecutor.cpp#809
for an example of an existing code path for getting an error report from the parser to the console.

(In reply to Karan Sapolia from comment #10)

Hello Victor,
I would be happy to work on this issue. Would request you to give me a gist of the conversation and maybe suggest steps to solve this issue. Am a new contributor. Helps to be clear that I understand everything and can then approach it effectively.

Awesome stuff, thanks for the interest Karan!

This bug has all the context, especially in comment 0 and 3. Once this is fixed, we can better show these warnings in the front-end, with some appropriate UI (bug 1493964).

Let me know if you need any help setting up your development environment.

Thank you Henri and Victor. I have the dev environment setup. Let me go through the comments and links and reach out to you with questions and patch drafts. Are you both available on IRC too?

(In reply to Karan Sapolia from comment #13)

Thank you Henri and Victor. I have the dev environment setup. Let me go through the comments and links and reach out to you with questions and patch drafts. Are you both available on IRC too?

I'm usually on both slack and IRC. Feel free to ping in this bug as well.

Assignee: nobody → sapoliakaran
Status: NEW → ASSIGNED

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: sapoliakaran → nobody
Status: ASSIGNED → NEW
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.