Closed Bug 1807017 Opened 2 years ago Closed 11 months ago

Consider to reuse the memory when nsHtml5Tokenizer::end() is called

Categories

(Core :: DOM: HTML Parser, enhancement)

enhancement

Tracking

()

RESOLVED FIXED
126 Branch
Tracking Status
firefox126 --- fixed

People

(Reporter: smaug, Assigned: smaug)

References

(Blocks 1 open bug)

Details

(Whiteboard: [sp3])

Attachments

(2 files)

https://searchfox.org/mozilla-central/rev/e6b709df9b93858364f02ab89f40d78762693db8/parser/html/nsHtml5Tokenizer.cpp#4998 shows up a bit in some speedometer 2 profiles, at least in VanillaJS. We seem to be deleting the autoJArray.arr. Could you reuse that memory somehow?

This is for innerHTML/DOMParser and not for parse from the network, right? And specifically for nsHtml5Tokenizer::strBuf, right?

For innerHTML/DOMParser it would be feasible to reuse that memory. The risk is that strBuf stretches to the worst case of the parser invocation, so if we unconditionally reuse the memory, a parse that needs an exceptionally long buffer would cause a memory leak-like situation with the long buffer sticking around for the rest of the process lifetime.

So we'd probably need some mechanism to delete an overlong buffer anyway. If we just put a magic threshold in nsHtml5Tokenizer::end, the delete would still show up on profiles if benchmark caused allocation above the threshold in a loop.

The fancy way would probably be to have an idle task delete the buffer if it's longer than some magic constant.

This is for innerHTML, yes.

Whiteboard: [sp3]

This is starting to show up quite a lot in some cases.

Could we do something simple now and keep the buffer around say whenever it is <= 1024 ?
It is though often quite a lot larger, up to 65536. Clearing using an idle runnable might be better.

I'd need to re-learn how to add some very Gecko specific stuff into .java, so perhaps you could write a small patch?

Flags: needinfo?(hsivonen)

I guess I can do this on nsHtml5StringParser level, no need to touch java side at all.

Flags: needinfo?(hsivonen)

Hopefully this isn't too complicated, but I wanted to make sure the runnable doesn't keep any other objects alive and thus
it needed to have Disconnect to clear the raw pointer.
The basic idea is to cache the previous buffer until we have idle time.
This doesn't seem to affect performance enough to show up in numbers, but based on the profiles this does
affect execution by reducing the overhead coming from nsHtml5Tokenizer::end()

Assignee: nobody → smaug
Status: NEW → ASSIGNED
Pushed by opettay@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a317563d8729 Consider to reuse the memory when nsHtml5Tokenizer::end() is called, r=hsivonen
Pushed by opettay@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/2ce47a20dd08 Consider to reuse the memory when nsHtml5Tokenizer::end() is called, r=hsivonen
Status: ASSIGNED → RESOLVED
Closed: 11 months ago
Resolution: --- → FIXED
Target Milestone: --- → 126 Branch
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: