Closed Bug 483155 Opened 11 years ago Closed 2 years ago

[HTML5] Make HTML5 pre-interned tokens carry node creation function pointers

Categories

(Core :: DOM: HTML Parser, enhancement, P5)

Other Branch
enhancement

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: hsivonen, Assigned: hsivonen)

References

Details

Attachments

(3 files, 1 obsolete file)

Node creation could be optimized by making the pre-interned well-known tokens come with function pointers to functions for creating the right node types for each token is HTML/MathML/SVG modes.
Priority: -- → P5
Blocks: 1347643
Assignee: nobody → hsivonen
Status: NEW → ASSIGNED
This patch rather obviously eliminates nsHTMLTags::CaseSensitiveLookupTag from the path of the HTML parser creating element nodes, but so far I've failed to convince myself that this makes things faster overall.
Attached file Benchmark
(In reply to Henri Sivonen (:hsivonen) from comment #2)
> but so far I've
> failed to convince myself that this makes things faster overall.

This benchmark (adapted from the one bholley attached to bug 1347525) convinces me that the patch here is worthwhile.
(The benchmark result shows up in Web Console.)
Attached patch Java changes (obsolete) — Splinter Review
Attachment #8895259 - Flags: review?(bugs)
What kind of improvements do you see with the patch?

(I wonder how to review this massive change)
(In reply to Olli Pettay [:smaug] from comment #7)
> What kind of improvements do you see with the patch?

I see the best one of the 200 iterations of the attached test case improve by around 5%. (It makes sense to look at the best run and not the average, because the best run is the one with the least noise, and the run as a whole is quite noisy.)

> (I wonder how to review this massive change)

I suggest looking at it by considering these subparts:
 1) ElementName.java self-generation code reading the function pointer lists
 2) The HTML parser code propagating the function pointers
 3) The refactoring around giving the HTML and SVG creation function pointers a type that can be referred to from the parser
 4) The refactoring of nsHtml5TreeOperation to split element creation into one method for each of HTML, SVG and MathML
 5) The refactoring of the Tor SVG and MathML disabling code
Comment on attachment 8894846 [details]
Bug 483155 - Put content creator function pointers onto nsHtml5ElementName.

https://reviewboard.mozilla.org/r/166010/#review171812

::: parser/html/nsHtml5Highlighter.cpp:84
(Diff revision 2)
>    mOpQueue.AppendElement()->Init(nsGkAtoms::html, EmptyString(), EmptyString());
>  
>    mOpQueue.AppendElement()->Init(STANDARDS_MODE);
>  
> -  nsIContent** root = CreateElement(nsGkAtoms::html, nullptr, nullptr);
> +  nsIContent** root =
> +    CreateElement(nsGkAtoms::html, nullptr, nullptr, NS_NewHTMLSharedElement);

Some comment why NS_NewHTMLSharedElement is used here

::: parser/html/nsHtml5Highlighter.cpp:88
(Diff revision 2)
> -  nsIContent** root = CreateElement(nsGkAtoms::html, nullptr, nullptr);
> +  nsIContent** root =
> +    CreateElement(nsGkAtoms::html, nullptr, nullptr, NS_NewHTMLSharedElement);
>    mOpQueue.AppendElement()->Init(eTreeOpAppendToDocument, root);
>    mStack.AppendElement(root);
>  
> -  Push(nsGkAtoms::head, nullptr);
> +  Push(nsGkAtoms::head, nullptr, NS_NewHTMLSharedElement);

ditto

::: parser/html/nsHtml5StackNode.cpp:136
(Diff revision 2)
>    this->node = node;
>    this->attributes = nullptr;
>    this->refcount = 1;
>    MOZ_ASSERT(elementName->isInterned(),
>               "Don't use this constructor for custom elements.");
> +  this->htmlCreator = nullptr;

why null creator when ns is xhtml?

::: parser/html/nsHtml5StackNode.cpp:170
(Diff revision 2)
>    this->popName = popName;
>    this->ns = kNameSpaceID_XHTML;
>    this->node = node;
>    this->attributes = nullptr;
>    this->refcount = 1;
> +  this->htmlCreator = nullptr;

Why null creator when ns is after all xhtml.

::: parser/html/nsHtml5Tokenizer.cpp:1811
(Diff revision 2)
>            }
>            if (hi < lo) {
>              NS_HTML5_BREAK(outer);
>            }
>            appendCharRefBuf(c);
> +          continue;

What has this to do with this bug?

::: parser/html/nsHtml5Tokenizer.cpp:4360
(Diff revision 2)
>            }
>            loloop_end: ;
>            if (hi < lo) {
>              NS_HTML5_BREAK(outer);
>            }
> +          continue;

What has this to do with this bug?

::: parser/html/nsHtml5TreeBuilder.cpp:3941
(Diff revision 2)
>                                                     entry->ns,
>                                                     entry->name,
>                                                     clone,
>                                                     entry->popName,
> -                                                   entry->attributes);
> +                                                   entry->attributes,
> +                                                   entry->getHtmlCreator());

Why html creator and not svg creator? It is not obvious  when reading this code.

::: parser/html/nsHtml5TreeBuilder.cpp:4160
(Diff revision 2)
>    nsIContentHandle* currentNode = stack[currentPtr]->node;
> -  nsIContentHandle* elt =
> -    createElement(kNameSpaceID_XHTML, nsGkAtoms::head, attributes, currentNode);
> +  nsIContentHandle* elt = createElement(kNameSpaceID_XHTML,
> +                                        nsGkAtoms::head,
> +                                        attributes,
> +                                        currentNode,
> +                                        htmlCreator(NS_NewHTMLSharedElement));

Some comment about NS_NewHTMLSharedElement usage here, please

::: parser/html/nsHtml5TreeBuilder.cpp:4620
(Diff revision 2)
>                           node->ns,
>                           node->name,
>                           node->node,
>                           node->popName,
> -                         node->attributes->cloneAttributes(nullptr));
> +                         node->attributes->cloneAttributes(nullptr),
> +                         node->getHtmlCreator());

Why we care about only html creator and not svg creator?

::: parser/html/nsHtml5TreeBuilder.cpp:4638
(Diff revision 2)
>                           node->ns,
>                           node->name,
>                           node->node,
>                           node->popName,
> -                         nullptr);
> +                         nullptr,
> +                         node->getHtmlCreator());

ditto

::: parser/html/nsHtml5TreeBuilder.cpp:4724
(Diff revision 2)
>          node->ns,
>          nsHtml5Portability::newLocalFromLocal(node->name, interner),
>          node->node,
>          nsHtml5Portability::newLocalFromLocal(node->popName, interner),
> -        node->attributes->cloneAttributes(nullptr));
> +        node->attributes->cloneAttributes(nullptr),
> +        node->getHtmlCreator());

I don't understand this. Why we html creator? Why can't there be svg creator?

::: parser/html/nsHtml5TreeBuilder.cpp:4741
(Diff revision 2)
>          node->ns,
>          nsHtml5Portability::newLocalFromLocal(node->name, interner),
>          node->node,
>          nsHtml5Portability::newLocalFromLocal(node->popName, interner),
> -        nullptr);
> +        nullptr,
> +        node->getHtmlCreator());

Same here.

::: parser/html/nsHtml5TreeBuilderCppSupplement.h:392
(Diff revision 2)
>  
>  nsIContentHandle*
>  nsHtml5TreeBuilder::createHtmlElementSetAsRoot(nsHtml5HtmlAttributes* aAttributes)
>  {
> -  nsIContentHandle* content =
> -    createElement(kNameSpaceID_XHTML, nsGkAtoms::html, aAttributes, nullptr);
> +  nsHtml5ContentCreatorFunction creator;
> +  creator.html = NS_NewHTMLSharedElement;

This could use a comment why NS_NewHTMLSharedElement is used here.
(just some hint that shared element is used for <html>)

::: parser/html/nsHtml5TreeOperation.h:354
(Diff revision 2)
> +                               : eTreeOpCreateSVGElementNotNetwork;
> +        mFour.svgCreator = aCreator.svg;
> +      } else {
> +        MOZ_ASSERT(aNamespace == kNameSpaceID_MathML);
> +        mOpCode = eTreeOpCreateMathMLElement;
> +        mFour.htmlCreator = nullptr;

Why we need to set htmlCreator to null here? Why not also svgCreator? Or shouldn't it be ok to not set neither.

::: parser/htmlparser/nsHTMLTagList.h:19
(Diff revision 2)
>    HTML_TAG macro in useful ways through the magic of C preprocessing.
> +  Additionally, it is consumed by the self-regeneration code in
> +  ElementName.java from which nsHtml5ElementName.cpp/h is translated.
> +
> +  If you edit this list, you need to re-run ElementName.java
> +  self-regeneration and the HTML parser Java to C++ translation.

Could add some hint how to do self-regeneration and translation. Like, where is that all documented.
Attachment #8894846 - Flags: review?(bugs) → review-
Comment on attachment 8895259 [details] [diff] [review]
Java changes

Not sure why I need to look at this if I look at the .cpp code.
Attachment #8895259 - Flags: review?(bugs)
Some of the review comments are possibly silly, but the code really could be more self-documenting.
(In reply to Olli Pettay [:smaug] from comment #10)
> Comment on attachment 8895259 [details] [diff] [review]
> Java changes
> 
> Not sure why I need to look at this if I look at the .cpp code.

There are some translator tweaks that don't show up in the other patch (but their output does).
Comment on attachment 8894846 [details]
Bug 483155 - Put content creator function pointers onto nsHtml5ElementName.

https://reviewboard.mozilla.org/r/166010/#review171812

> Some comment why NS_NewHTMLSharedElement is used here

Added comment.

> ditto

Added comment.

> why null creator when ns is xhtml?

This code path is not used for formatting elements and we only need to save the creator for formatting elements. Comment added on the Java side. Replicating the comment in C++ is bug 559547.

> Why null creator when ns is after all xhtml.

This code path is not used for formatting elements and we only need to save the creator for formatting elements. Comment added on the Java side. Replicating the comment in C++ is bug 559547.

> What has this to do with this bug?

It's an artifact of a clang-format update. Looks like ./mach clang-format with our new clang-format likes to insert explicit continues at those points.

> What has this to do with this bug?

clang-format again.

> Why html creator and not svg creator? It is not obvious  when reading this code.

This code clones only HTML formatting elements.

> Some comment about NS_NewHTMLSharedElement usage here, please

Added comment on the Java side.

> Why we care about only html creator and not svg creator?

Because we only need to be able to clone formatting elements based on stack node. (There's a comment on the Java side.)

> ditto

Same reason.

> I don't understand this. Why we html creator? Why can't there be svg creator?

It's only needed on stack nodes for HTML formatting elements.

> Same here.

It's only needed on stack nodes for HTML formatting elements.

> This could use a comment why NS_NewHTMLSharedElement is used here.
> (just some hint that shared element is used for <html>)

Added comment.

> Why we need to set htmlCreator to null here? Why not also svgCreator? Or shouldn't it be ok to not set neither.

We don't *need* to, but I thought zeroing `mFour` would look nicer in a debugger than having garbage there. Removed.

> Could add some hint how to do self-regeneration and translation. Like, where is that all documented.

Added documentation in parser/html/java/README.txt and made the comment point to it.
Attachment #8895259 - Attachment is obsolete: true
Attachment #8895768 - Flags: review?(bugs)
Comment on attachment 8895768 [details] [diff] [review]
Java changes (incl. translator tweaks)

Looks reasonable. I don't pretend my review here is the best possible, but couldn't see anything obviously bad.

All the translation and self-regeneration stuff is a bit weird.
Attachment #8895768 - Flags: review?(bugs) → review+
Comment on attachment 8894846 [details]
Bug 483155 - Put content creator function pointers onto nsHtml5ElementName.

https://reviewboard.mozilla.org/r/166010/#review172522

::: dom/svg/SVGTagList.h:13
(Diff revision 3)
>  
>    This file contains the list of all SVG tags.
>  
>    It is designed to be used as inline input to SVGElementFactory.cpp
> -  *only* through the magic of C preprocessing.
> +  through the magic of C preprocessing.
> +  

Some extra space here?

::: parser/html/java/README.txt:91
(Diff revision 3)
> +# Save.
> +cd ../.. # Back to parser/html/java/
> +make translate
> +cd ../../..
> +./mach clang-format
> +

Thanks for these comments.
I wonder if some of this could be automated, this all feels rather hacky.

Not about this bug though.
And luckily new elements or attributes are added only very rarely.
Attachment #8894846 - Flags: review?(bugs) → review+
Comment on attachment 8894846 [details]
Bug 483155 - Put content creator function pointers onto nsHtml5ElementName.

https://reviewboard.mozilla.org/r/166010/#review172522

> Some extra space here?

Fixed.
Pushed by hsivonen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5b706cd241cb
Put content creator function pointers onto nsHtml5ElementName. r=smaug
https://hg.mozilla.org/projects/htmlparser/rev/291d3f8d5e1b3261a7fcc4ad656f8d3451545c2e
Mozilla bug 483155 - Put Gecko content creator function pointers on ElementName. r=smaug.
This is causing valgrind test failures, afaict because mSVGEnabled and mMathMLEnabled are never initialized, so SVGEnabled()/MathMLEnabled() are reading random memory:

[task 2017-08-11T07:03:57.670972Z] 07:03:57     INFO -   1:47.71 TEST-UNEXPECTED-FAIL | valgrind-test | Conditional jump or move depends on uninitialised value(s) at SVGEnabled / NS_NewElement / nsXMLContentSink::CreateElement / nsXMLContentSink::HandleStartElement
[task 2017-08-11T07:03:57.671218Z] 07:03:57     INFO -   1:47.71 ==35231== Conditional jump or move depends on uninitialised value(s)
[task 2017-08-11T07:03:57.671448Z] 07:03:57     INFO -   1:47.71 ==35231==    at 0xE7D92ED: SVGEnabled (nsNodeInfoManager.h:126)
[task 2017-08-11T07:03:57.671758Z] 07:03:57     INFO -   1:47.71 ==35231==    by 0xE7D92ED: NS_NewElement(mozilla::dom::Element**, already_AddRefed<mozilla::dom::NodeInfo>&&, mozilla::dom::FromParser, nsAString const*) (nsNameSpaceManager.cpp:205)
Flags: needinfo?(hsivonen)
Oops. I forgot to refactor the initialization after I moved that stuff from document to nodeinfomanager after the last try run. :-( I'll develop a follow-up.
Turns out landing a follow-up on autoland is hard. Waiting for a sheriff to do a backout before I can fix this on autoland.
Flags: needinfo?(hsivonen)
Pushed by hsivonen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/048ed01b19e0
Put content creator function pointers onto nsHtml5ElementName. r=smaug
https://hg.mozilla.org/mozilla-central/rev/048ed01b19e0
Status: ASSIGNED → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
You need to log in before you can comment on or make changes to this bug.