nsHtml5TreeBuilder::startTokenization takes significant amount time when parsing small innerHTML strings

RESOLVED FIXED

Status

()

Core
HTML: Parser
RESOLVED FIXED
4 years ago
9 months ago

People

(Reporter: smaug, Unassigned)

Tracking

(Blocks: 1 bug)

Trunk
x86_64
Linux
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [qf:p3])

Attachments

(2 attachments)

(Reporter)

Description

4 years ago
http://mozilla.pettay.fi/moztests/innerhtmltest.html is a testcase I've taken from somewhere. That will be a lot faster once we have the don't-parse-short-none-tag-soup, but
if that data had a tag, it would be slow again.
nsHtml5TreeBuilder::startTokenization is about 10% of the time currently, 
and mostly because it allocates lots of stuff.

We should be able cache most of the data structures the method creates.
(Reporter)

Updated

3 years ago
Blocks: 1097376
Whiteboard: [qf]

Comment 1

11 months ago
Olli, is this still an issue?
Flags: needinfo?(bugs)
(Reporter)

Comment 2

11 months ago
Created attachment 8855002 [details]
testcase

(This test doesn't get the fast path, since this has an element to parse.)

On linux we seem to be a lot faster than Chrome (312 vs 451), but we could also do significantly better too. 
nsHtml5TreeBuilder::startTokenization is very malloc heavy.
7% of the innerHTML in this case, all that is just malloc.
Overall of the innerHTML malloc is 18%

I was chatting today with hsivonen about reusing nsHtml5StackNodes. If we did that everywhere in parser, that should reduce malloc/free usage.
Allocating nsHtml5StackNodes show up in other profiles.

So, I guess this particular bug isn't very high priority, but I'd like to see better nsHtml5StackNodes usage.
Flags: needinfo?(bugs) → needinfo?(hsivonen)
(Reporter)

Comment 3

11 months ago
I guess jArray usage also causes malloc usage. Could we use AutoTArray?
Or if autoJArray is like AutoTArray, then that?
(In reply to Olli Pettay [:smaug] (review queue closed. Please ask reviews from other DOM peers) from comment #2)
> I was chatting today with hsivonen about reusing nsHtml5StackNodes. If we
> did that everywhere in parser, that should reduce malloc/free usage.
> Allocating nsHtml5StackNodes show up in other profiles.

Let's make each tree builder allocate a chunk of memory sizeof(nsHtml5StackNode) * MAX_REFLOW_DEPTH and a free bitmap for it and allocate the stack nodes out of that. This way, the nodes can still be held behind pointers in two places, etc, and, thanks to the free bitmap, be freed out of order. It'll be mostly in order, though, so searching the the bitmap linearly when allocating shouldn't be too terrible, especially if the allocator remembers where the search ended the last time.

(In reply to Olli Pettay [:smaug] (review queue closed. Please ask reviews from other DOM peers) from comment #3)
> I guess jArray usage also causes malloc usage.

Where in the parser is this enough of a problem?

> Could we use AutoTArray?

No, it doesn't have the right semantics for passing it around. (The point of jArray is that it groups a length to a pointer so that it's almost as cheap to pass as a pointer, but unlike a bare pointer, the length comes along.)

> Or if autoJArray is like AutoTArray, then that?

Auto in autoJArray is about auto-deleting the buffer, not about stack-allocating the buffer.
Flags: needinfo?(hsivonen)
(In reply to Henri Sivonen (:hsivonen) from comment #4)
> (In reply to Olli Pettay [:smaug] (review queue closed. Please ask reviews
> from other DOM peers) from comment #2)
> > I was chatting today with hsivonen about reusing nsHtml5StackNodes. If we
> > did that everywhere in parser, that should reduce malloc/free usage.
> > Allocating nsHtml5StackNodes show up in other profiles.
> 
> Let's make each tree builder allocate a chunk of memory
> sizeof(nsHtml5StackNode) * MAX_REFLOW_DEPTH and a free bitmap for it and
> allocate the stack nodes out of that.

No need for a free bitmap. If upon free we null out a member of nsHtml5StackNode that's never null when live, we don't need a separate bitmap.
Whiteboard: [qf] → [qf:p3]
On further thought, a fixed-size block might not work after all. Anyway, a custom allocator for these should be feasible.
(In reply to Henri Sivonen (:hsivonen) from comment #4)
> > Could we use AutoTArray?
> 
> No, it doesn't have the right semantics for passing it around.

We could, however, use AutoTArray in the case jArray doesn't get passed around: nsHtml5HtmlAttributes doesn't really change that often, so there isn't a great benefit from keeping it in sync with Java by the means of the translator. If we moved that particular class to be untranslated, we could eliminate the most commonly-allocated multiple jArrays with a single AutoTArray whose storage would get allocated as part of the malloc for nsHtml5HtmlAttributes itself.

Let's do that.
(Reporter)

Updated

11 months ago
Depends on: 1355441
(Reporter)

Comment 8

11 months ago
Bug 1355441 filed about nsHtml5StackNodes
(Reporter)

Comment 9

9 months ago
Let's close this. If there is another testcase where we're slow in innerHTML parsing, better to file a new bug.
Status: NEW → RESOLVED
Last Resolved: 9 months ago
Resolution: --- → FIXED
(Reporter)

Comment 10

9 months ago
Created attachment 8872948 [details]
test v2 (large innerHTML)

I was just testing something related to innerHTML and modified the test.
You need to log in before you can comment on or make changes to this bug.