Created attachment 811950 [details]
Testcase

All this test does is:

this.hiddenPlayground.innerHTML = ...;

Numbers:
Chrome 31:    1280 ms
Safari 6.0.5: 1292 ms
Nightly 27:   1764 ms

bz: sorry for flooding you with needinfo? requests, but I've filed bugs on most of the Peacekeeper tests now except some of the DOM/jQuery ones.
Hmm, we used to be faster than others with innerHTML, at some point. But I guess that depends on what is being parsed.
I don't see anything special in the profile -- just many things taking a bit too much time: *Element::SetAttr and stuff under it, and the HTML parser also seems to be quite malloc-happy. The nsHtml5HtmlAttributes dtor takes significant time, and nsHtml5TreeBuilder::appendToCurrentNodeAndPushElementMayFoster tends to malloc quite a bit. Some object recycling might make sense.
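The object-recycling idea mentioned above can be sketched as a small free-list pool: instead of deleting an attribute holder after each element and mallocing a fresh one for the next, holders go back into a pool and get reused. Everything below is illustrative -- AttributeHolder and HolderPool are invented stand-ins, not Gecko API:

```cpp
#include <memory>
#include <vector>

// Hypothetical stand-in for an nsHtml5HtmlAttributes-style object; the real
// class is much larger, but the recycling pattern is the same.
struct AttributeHolder {
  std::vector<int> attrs;
  // Drop contents but keep the allocated capacity for reuse.
  void Clear() { attrs.clear(); }
};

// A tiny free-list recycler: Release() parks a holder instead of freeing it,
// and Acquire() hands it back out, so steady-state parsing stops hitting
// malloc/free once the pool warms up.
class HolderPool {
 public:
  std::unique_ptr<AttributeHolder> Acquire() {
    if (mFree.empty()) {
      return std::make_unique<AttributeHolder>();
    }
    std::unique_ptr<AttributeHolder> holder = std::move(mFree.back());
    mFree.pop_back();
    holder->Clear();
    return holder;
  }

  void Release(std::unique_ptr<AttributeHolder> aHolder) {
    mFree.push_back(std::move(aHolder));
  }

  size_t FreeCount() const { return mFree.size(); }

 private:
  std::vector<std::unique_ptr<AttributeHolder>> mFree;
};
```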
Yeah... at least on Mac, free is 20% of the time here. malloc is 15%; realloc is 4%.

Olli, how insane would it be to have an innerHTML cache mapping strings to DOM fragments that we could just clone? I wonder how often it would get hit in non-benchmark situations...
Flags: needinfo?(bzbarsky) → needinfo?(bugs)
Henri, I'll need some help figuring out how to optimize the parser, given that the Java-to-C++ translation isn't quite working at the moment.
Caching DOM fragments... I wonder how to make the hashtable lookup fast if the strings are very similar. Perhaps just cache 10 fragments or something? I'll try that, but the parser still needs some optimizing.
A super simple cache gives 50% speed up here, but it is not clear to me whether it is useful outside benchmarks. It might be. Trying to find some innerHTML heavy code.
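The "super simple cache" being experimented with here might look roughly like the sketch below: a fixed-capacity table (10 entries, as suggested above) mapping the markup string to a previously built result, handing out a clone on a hit, with FIFO eviction. Fragment, InnerHTMLCache, and the linear scan are all invented for illustration; this is not the actual patch.

```cpp
#include <array>
#include <optional>
#include <string>
#include <utility>

// Placeholder for a real parsed DOM fragment; cloning a real fragment is
// what makes the cache pay off versus reparsing.
struct Fragment {
  std::string markup;
  Fragment Clone() const { return Fragment{markup}; }
};

// Fixed-size innerHTML cache: linear scan over at most 10 entries, FIFO
// eviction. With so few entries, a linear string compare can beat a real
// hashtable, sidestepping the "similar strings" hashing concern above.
class InnerHTMLCache {
 public:
  static constexpr size_t kCapacity = 10;

  std::optional<Fragment> Lookup(const std::string& aMarkup) {
    for (size_t i = 0; i < mSize; ++i) {
      if (mEntries[i].first == aMarkup) {
        ++mHits;
        return mEntries[i].second.Clone();
      }
    }
    return std::nullopt;
  }

  void Insert(const std::string& aMarkup, Fragment aFragment) {
    mEntries[mNext] = {aMarkup, std::move(aFragment)};
    mNext = (mNext + 1) % kCapacity;
    if (mSize < kCapacity) ++mSize;
  }

  size_t Hits() const { return mHits; }

 private:
  std::array<std::pair<std::string, Fragment>, kCapacity> mEntries;
  size_t mNext = 0;
  size_t mSize = 0;
  size_t mHits = 0;
};
```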
Created attachment 815178 [details] [diff] [review]
WIP cache

A simple cache. Not sure we want this kind of thing, but maybe it wouldn't cause harm.

https://tbpl.mozilla.org/?tree=Try&rev=1ef2088d9c9f
Created attachment 815500 [details] [diff] [review]
innerHTML cache

https://tbpl.mozilla.org/?tree=Try&rev=4c9bd8fb7db6

I was thinking of moving this to the parser level, but that becomes tricky because notification differs depending on whether we're caching or not.
Attachment #815178 - Attachment is obsolete: true
Created attachment 815628 [details] [diff] [review]
v3

https://tbpl.mozilla.org/?tree=Try&rev=3e162e5f5460
Attachment #815500 - Attachment is obsolete: true
Hmm, that caching is hard, since things like img and video start loading and may trigger events.
(In reply to Olli Pettay [:smaug] from comment #10)
> Hmm, that caching is hard, since things like img and video start loading and
> may trigger events.

Is it not possible to detect these cases? Otherwise we should work on optimizing the HTML parser.
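The detection question raised here could be approached by rejecting markup that mentions elements whose mere insertion has side effects (network loads, media events). The sketch below is deliberately naive -- a real check would inspect parsed tokens rather than substring-scan raw text, and the tag list is only illustrative:

```cpp
#include <algorithm>
#include <array>
#include <cctype>
#include <string>

// Returns false if the markup appears to contain an element whose insertion
// alone triggers loads or events, making a cached clone observably different
// from a fresh parse. Illustrative only: real detection would happen on
// tokens, and this substring scan has false positives (e.g. "<img" in text).
bool IsCacheSafe(const std::string& aMarkup) {
  std::string lower(aMarkup);
  std::transform(lower.begin(), lower.end(), lower.begin(),
                 [](unsigned char c) { return std::tolower(c); });
  static const std::array<const char*, 5> kUnsafeTags = {
      "<img", "<video", "<audio", "<iframe", "<script"};
  for (const char* tag : kUnsafeTags) {
    if (lower.find(tag) != std::string::npos) {
      return false;
    }
  }
  return true;
}
```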
We should definitely optimize the HTML parser in any case. I still need to figure out a sane way to do that, since the current Java->C++ translation makes coding harder than usual.
(In reply to Olli Pettay [:smaug] from comment #12)
> We should definitely optimize the HTML parser in any case.
> I still need to figure out a sane way to do that, since the current
> Java->C++ translation makes coding harder than usual.

If you mean recycling the attribute holders, what makes it hard is the reuse of the off-the-main-thread code on the main thread -- not the Java stuff. The Java stuff *already* has the capability to reuse attribute holders.

In general, we could call malloc a lot less for innerHTML if we had a distinct main-thread-oriented HTML parser similar to the one initially landed before Firefox 3.6. The Java code supports pluggable tree builders. However, to avoid virtual calls, that abstraction is flattened away in C++. If we want to plug in a tree-building back end that doesn't generate tree ops but actually builds the tree right away, we could either:

* Use the translator infrastructure to generate a second copy of the parser and avoid virtual calls.
* Introduce virtual calls on the C++ side and have two subclasses of the tree builder.
That is to say, I think adding some sort of attribute holder reuse to the off-the-main-thread architecture would be much less of a win than getting rid of the entire off-the-main-thread architecture for innerHTML and DOMParser. It wouldn't even be particularly hard. What we need to decide is whether we want to avoid adding a bunch of frequent virtual calls or whether we want to avoid the code size cost of having a second copy of the parser core.
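The virtual-call option described above can be sketched as follows: the (single) parser core writes through an abstract sink, with one subclass queuing tree ops for the off-the-main-thread path and another building nodes immediately for innerHTML. All names here are invented and the real tree builder API differs; this only illustrates the shape of the trade-off.

```cpp
#include <string>
#include <vector>

// Abstract tree-building back end. The cost of this design is one virtual
// call per tree operation; the alternative (translating a second copy of
// the parser core) avoids that call at the price of code size.
struct TreeSink {
  virtual ~TreeSink() = default;
  virtual void AppendElement(const std::string& aName) = 0;
};

// Off-the-main-thread path: record a tree op to replay on the main thread.
struct OpQueueSink : TreeSink {
  std::vector<std::string> ops;
  void AppendElement(const std::string& aName) override {
    ops.push_back("append:" + aName);
  }
};

// Main-thread innerHTML path: "build" the node right away (modeled here as
// a flat list of created elements).
struct DirectSink : TreeSink {
  std::vector<std::string> built;
  void AppendElement(const std::string& aName) override {
    built.push_back(aName);
  }
};

// Stand-in for the parser core: it only ever talks to the abstract sink,
// so both back ends share one copy of the parsing logic.
void ParseInto(TreeSink& aSink) {
  aSink.AppendElement("div");
  aSink.AppendElement("span");
}
```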
(In reply to Henri Sivonen (:hsivonen) from comment #13)
> If you mean recycling the attribute holders, what makes it hard is the reuse
> of the off-the-main-thread code on the main thread -- not the Java stuff.
> The Java stuff *already* has the capability to reuse attribute holders.

I mean doing low-level optimizations in general, so that we can, for example, reduce the use of malloc/free everywhere.
(In reply to Olli Pettay [:smaug] from comment #15)
> I mean doing low-level optimizations in general, so that we can, for example,
> reduce the use of malloc/free everywhere.

But even before we get to that level of optimization, we should first get the off-the-main-thread machinery out of the way. It's the main cause of malloc/free in the parser. If we get rid of a lot of malloc/free, low-level optimization might not make sense.
On Linux I get ~1450 ms in both Nightly and Chromium 32.
On Windows I see the following. The variance figures are just ballpark estimates.

Chrome 32.0.1700.102:    1043 ms +/- 10 ms
Chrome 33.0.1750.46:     1004 ms +/- 5 ms
Chrome 34.0.1809.0:       962 ms +/- 5 ms
Firefox 29.0 2014-01-29:  882 ms +/- 20 ms
I'd say this is fixed, but it will get better still once bug 959150 lands.
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED