Closed Bug 1355441 Opened 7 years ago Closed 7 years ago

Don't allocate HTML5StackNodes all the time from heap

Categories

(Core :: DOM: HTML Parser, enhancement, P1)

50 Branch
enhancement

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: smaug, Assigned: wchen)

References

Details

Attachments

(3 files, 2 obsolete files)

      No description provided.
Blocks: 1347525
This was discussed in some other bug, but
Blocks: 934607
Blocks: 1355472
Priority: -- → P1
On the other bug, I said the we could limit the total number of nodes per tree builder instance. That's not actually correct. So instead, we should probably do something like ask a page of memory from jemalloc at a time and have a custom allocator that can handle a list of such pages.

What I said about using a null member of the stack node as an indicator of whether a given node is live or not (so as to not need a separate bitmap of free slots) holds, though.
Assignee: nobody → wchen
This patch updates the Java TreeBuilder code to reuse StackNode. In places where we created a new instance of StackNode, we now call into a helper function which checks for a StackNode that currently is currently unused before allocating a new one. I decided not to switch to a custom allocator in C++ since I'm not convinced the extra complexity would be worth the additional gains (if any) and it's straight forward to do on the Java side. This patch solves the main problem of allocating a StackNode for every node in the tree, we only need to allocate up to the max depth of the tree.

If you still feel that we should switch to a custom allocator in C++ then I can update the patch to use one.
Attachment #8862279 - Flags: review?(hsivonen)
Comment on attachment 8862279 [details] [diff] [review]
Reuse StackNode in TreeBuilder to avoid malloc

Review of attachment 8862279 [details] [diff] [review]:
-----------------------------------------------------------------

Yeah, this seems like a better approach than doing effectively the same thing in C++ only. Not marking r+ only due to the observation below that stack nodes don't really need to point to their owner tree builder.

::: src/nu/validator/htmlparser/impl/StackNode.java
@@ +28,5 @@
>  import nu.validator.htmlparser.annotation.NsUri;
>  
>  final class StackNode<T> {
> +    // The tree builder that owns this stack node.
> +    final TreeBuilder<T> treeBuilder;

This seems unnecessary bloat in the stack nodes. I suggest removing this field, making release() return idxInTreeBuilder when refcount goes to zero (or -1 otherwise) and then wrapping the calls to release() with the sort of code that's in notifyUnusedStackNode in this patch.
Attachment #8862279 - Flags: review?(hsivonen)
Attachment #8862279 - Attachment is obsolete: true
Attachment #8863574 - Flags: review?(hsivonen)
This patch contains only the manual change I made to the CPP side. I didn't include the translated code because the Java repository doesn't translate to what we have in m-c, perhaps you still have some local changes that need to be pushed.
Attachment #8863575 - Flags: review?(hsivonen)
Attachment #8863574 - Flags: review?(hsivonen) → review+
Comment on attachment 8863575 [details] [diff] [review]
Cpp changes - Reuse StackNode in TreeBuilder to avoid malloc.

Review of attachment 8863575 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for fixing this!
Attachment #8863575 - Flags: review?(hsivonen) → review+
Pushed by wchen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/2e7615b554ee
Reuse StackNode in HTML parser TreeBuilder to avoid malloc. r=hsivonen
Bug 1355472 seems to cause quite some use of getUnusedStackNode
(In reply to Olli Pettay [:smaug] from comment #10)
> Bug 1355472 seems to cause quite some use of getUnusedStackNode

Seeing one call per start tag to getUnusedStackNode is expected. Are you seeing usage that spends substantially more time there than almost all the time returning fast on the third line without looping?
Flags: needinfo?(bugs)
Backout by ihsiao@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/cbd10998876e
Backed out changeset 2e7615b554ee for crashing at nsHtml5TreeBuilder::getUnusedStackNode() intermittent failures
(In reply to Henri Sivonen (:hsivonen) from comment #11)
> (In reply to Olli Pettay [:smaug] from comment #10)
> > Bug 1355472 seems to cause quite some use of getUnusedStackNode
> 
> Seeing one call per start tag to getUnusedStackNode is expected. Are you
> seeing usage that spends substantially more time there than almost all the
> time returning fast on the third line without looping?

I was seeing some allocation happening in getUnusedStackNode. But perhaps that is then expected?
Flags: needinfo?(bugs)
(In reply to Olli Pettay [:smaug] from comment #14)
> (In reply to Henri Sivonen (:hsivonen) from comment #11)
> > (In reply to Olli Pettay [:smaug] from comment #10)
> > > Bug 1355472 seems to cause quite some use of getUnusedStackNode
> > 
> > Seeing one call per start tag to getUnusedStackNode is expected. Are you
> > seeing usage that spends substantially more time there than almost all the
> > time returning fast on the third line without looping?
> 
> I was seeing some allocation happening in getUnusedStackNode. But perhaps
> that is then expected?

The expected pattern is:

 * One call to getUnusedStackNode for each logically new node (roughly per one start tag).
 * As many actual heap allocations for nsHtml5StackNodes as is the maximum element nesting depth in the document. (Plus a few in case of misnested inline markup.)
 * getUnusedStackNode most often returning without performing a heap allocation. A heap allocation should happen when a new nesting maximum (during the parse) is being reached.
 * getUnusedStackNode most often returning without looping. Looping should mainly arise from misnested inline markup causing the recycling not to purely match the stack push/pop.
The intermittent failures were due to a race between the tree builder and speculations that were holding on to stack nodes via a snapshot. When speculations were removed, it would release stack nodes from a diffrent thread which calls into the TreeBuilder and that code isn't thread safe.

This patch makes it so that stack nodes created for the snapshot are not owned by a TreeBuilder and manages its own lifecycle so that it doesn't need to communicate with the TreeBuilder.
Attachment #8863574 - Attachment is obsolete: true
Flags: needinfo?(wchen)
Attachment #8866968 - Flags: review?(hsivonen)
Attached patch v2 diff v3Splinter Review
intradiff between v2 and v3
Attachment #8866968 - Flags: review?(hsivonen) → review+
Pushed by hsivonen@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/d8c78b2b736b
Reuse StackNode in TreeBuilder to avoid malloc. r=hsivonen.
Pushed without "checkin-needed" while wchen is away so that we can measure Quantum Flow progress sooner.
https://hg.mozilla.org/mozilla-central/rev/d8c78b2b736b
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: