Closed
Bug 386769
Opened 17 years ago
Closed 16 years ago
Make setting innerHTML faster by caching innerHTML's parser and content sink
Categories
(Core :: DOM: Core & HTML, defect, P1)
Tracking
()
RESOLVED
FIXED
mozilla1.9beta4
People
(Reporter: mtschrep, Assigned: bent.mozilla)
References
(Blocks 1 open bug, )
Details
(Keywords: perf)
Attachments
(5 files, 2 obsolete files)
483 bytes,
text/html
|
Details | |
243.84 KB,
application/zip
|
Details | |
1.71 KB,
text/html
|
Details | |
1.82 KB,
text/html
|
Details | |
33.38 KB,
patch
|
peterv
:
superreview+
|
Details | Diff | Splinter Review |
While running the uBenchmarks at the URL above I get the following results: IE7: DOM speed 511ms Minefield(2007070304): DOM speed 482ms Safari 3 beta: DOM speed 54ms The DOM speed benchmark is just a tight loop of: document.getElementById('speed_dom').innerHTML = "Testing... ("+i+"/1000)"; faster in Safari Entire test is below. function DOMSpeed() { var startTime = new Date() ; for( var i = 0; i <= 1000; i++ ) { document.getElementById('speed_dom').innerHTML = "Testing... ("+i+"/1000)"; } var endTime = new Date(); document.getElementById('speed_dom').innerHTML = (endTime - startTime); document.getElementById('speed_total').innerHTML = eval(document.getElementById('speed_total').innerHTML) + (endTime - startTime); }
I suspect this isn't getElementById performance but rather performance of mutating the DOM, in this case by setting .innerHTML. Mutating the DOM causes us to reflow the document, it seems like safari is much better in this testcase at batching those reflows.
Comment 2•17 years ago
|
||
Comment 3•17 years ago
|
||
I can attach a profile if people care, but basic breakdown is: 80% total time spent under nsGenericHTMLElementTearoff::SetInnerHTML 0.7% total time spent under nsHTMLDocument::GetElementById 3% total time spent in security checks. The rest is XPConnect/DOM/JS overhead, basically. As far as the 80% above goes, with all percentages listed as percentages of total time on this testcase throughout the rest of this comment: 21% clearing the old kids of the div 27% parsing the new kids 29% appending the new kids Within those 7% constructing frames, etc when content is removed 4% tearing down frames, etc when content is removed Both of those numbers include all the reflow-related stuff here, which is limited to reflow events; reflow is async, so it never happens during the timing of this testcase. More interesting: 23% firing and handling of mutation events under RemoveChildAt and InsertBefore. Why are we even firing those? Oh, and under parsing, about a third of the time is spent creating the parser, etc. The other two thirds are spent under nsParser::ParseFragment, with a fifth of that being scanner creation, and the rest tokenization, content sink stuff, CNavDTD, and all that mess. So the real issues here, as far as I can tell, are that actually parsing stuff is kinda expensive and for some reason we're firing mutation events. At least in both my Firefox and Seamonkey local builds...
Comment 4•17 years ago
|
||
Oh, and just to make sure we're on the same page, content GetElementByID code is 0.7% of this testcase, as I said. There's some overhead on top of that, of course, but the real time is not spent there. This bug could use a better summary. Of course this site could use better tests... ;)
Comment 5•17 years ago
|
||
I filed bug 386802 on the mutation event issue. See further analysis there.
Updated•17 years ago
|
Comment 6•17 years ago
|
||
Results vary a lot but after fixing bug 386802, uBenchmarks DOM speed dropped from ~270 to 160 on this machine.
Another test http://stevenlevithan.com/demo/replaceHtml.html gives 200x better speed for IE6 Also tested on another fater PC, with IE7 and Firefox. IE7 was way faster than Firefox Result ====== Firefox:-- Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9a8pre) Gecko/2007091405 Minefield/3.0a8pre 100 elements... innerHTML (destroy only): 56ms innerHTML (create only): 10ms innerHTML (destroy & create): 74ms replaceHtml (destroy only): 12ms (4.7x faster) replaceHtml (create only): 17ms (1.7x slower) replaceHtml (destroy & create): 19ms (3.9x faster) Done. 1000 elements... innerHTML (destroy only): 769ms innerHTML (create only): 100ms innerHTML (destroy & create): 1415ms replaceHtml (destroy only): 23ms (33.4x faster) replaceHtml (create only): 145ms (1.4x slower) replaceHtml (destroy & create): 147ms (9.6x faster) Done. 2500 elements... innerHTML (destroy only): 3441ms innerHTML (create only): 506ms innerHTML (destroy & create): 3928ms replaceHtml (destroy only): 53ms (64.9x faster) replaceHtml (create only): 388ms (1.3x faster) replaceHtml (destroy & create): 349ms (11.3x faster) Done. IE 6 100 elements... innerHTML (destroy only): 0ms innerHTML (create only): 0ms innerHTML (destroy & create): 0ms replaceHtml (destroy only): 0ms (~ same speed) replaceHtml (create only): 0ms (~ same speed) replaceHtml (destroy & create): 0ms (~ same speed) Done. 1000 elements... innerHTML (destroy only): 0ms innerHTML (create only): 16ms innerHTML (destroy & create): 31ms replaceHtml (destroy only): 0ms (~ same speed) replaceHtml (create only): 15ms (~ same speed) replaceHtml (destroy & create): 31ms (~ same speed) Done. 2500 elements... innerHTML (destroy only): 16ms innerHTML (create only): 63ms innerHTML (destroy & create): 78ms replaceHtml (destroy only): 16ms (~ same speed) replaceHtml (create only): 62ms (~ same speed) replaceHtml (destroy & create): 78ms (~ same speed) Done.
Reporter | ||
Comment 8•17 years ago
|
||
Safari 3.0.0.4 is down to 6 or 7 on the testcase - Minefield is at 78 or so. Cane we profile again to see if there is anything else left to hit?
Flags: blocking1.9+
Priority: -- → P2
Comment 9•17 years ago
|
||
This is from a Nov 12 build, but I doubt much has changed since. The testcase was the "standalone testcase" I attached, with the loop size increased to 10,000. Executive summary: Total hit count: 137032 Hits under nsGenericHTMLElement::SetInnerHTML: 90697 Hits under nsHTMLDocument::GetElementById: 1142 Hits under nsWindowSH::GetProperty: 5606 Hits under nsHTMLDocumentSH::NewResolve: 4627 Hits under IsWrapperSameOrigin (XOW): 5666 (most from nsWindowSH::GetProperty) Hits under XPCCallContext::XPCCallContext: 3626 (some under IsWrapperSameOrigin) Breaking down the SetInnerHTML part (which is less than 70% of the profile anyway): 51661 nsContentUtils::CreateContextualFragment 20265 nsHTMLDivElement::AppendChild 12211 nsContentUtils::SetNodeTextContent 2519 mozAutoSubtreeModified::~mozAutoSubtreeModified() The SetNodeTextContent is clearing out the old content. Time here is about 10% HasMutationListeners, 80% PresShell::ContentRemoved, 10% other gunk. The AppendChild is putting in the new kid. About 75% of the time here is the ContentAppended notification, but there's a lot of other stuff happening too... The CreateContextualFragment is about 60% nsParser::ParseFragment, a whole bunch of QI and parser creation, some nsCOMPtr::~nsCOMPtr (destroying the parser, I bet), building the tag stack (1811 hits under nsHTMLDivElement::GetNodeName), etc. There's string stuff, but I think sicking's recent change changed that a lot so worth recheckign with it. Under ParseFragment, there's the usual content model construction gunk. So things to jump out at me: * CreateContextualFragment uses nsIDOM* APIs when perhaps it should use our internal ones. Thing like GetParentNode() are just not called for here! * Lots of time is spent on xpconnect/js/security checks, as usual * It might maybe be worth recycling HTML parsers instead of just allocating/deallocating them a lot. * It might be nice if the parser parsed directly into the target node instead of into a document fragment as it does now... Most of that is small peanuts, though, and doesn't get us to where we should be. Over here Opera is about 8x faster than we are on that testcase (so about equivalent to Safari). :(
Comment 10•17 years ago
|
||
I wonder. If in the testcase you add: document.body.offsetWidth; right after the innerHTML set, how does that affect Safari? It slows us down by a factor of 2 or so (not unexpected), but seems to have next to no effect on Opera....
Reporter | ||
Comment 11•17 years ago
|
||
No difference on Safari with that change...
So it sounds like this is 30% security checks, this is a known problem and affects anything DOM related :( Though recently I think jst and brendan has been talking about a fix that might help. Another thing that seems to be biting us are the notifications going out. I probably made this part worse by a good bit when I simplified the fragment-insertion stuff :(. Hmm.. now that we no longer send mutation-events for aNotify=false, wouldn't it be trivial to simply insert the whole fragment in one swoop, notify once, and then send mutation-events if needed for all inserted nodes? There should be no way that script can execute while we're inserting the nodes, right? FWIW, i fixed nsContentUtils::CreateContextualFragment to use nsINode APIs in bug 403549, but I doubt that makes a noticeable difference.
Comment 13•17 years ago
|
||
It's about 5-6% security checks. Not sure where the 30% number came from in comment 12. In this case the fragment only has one child node, so I doubt that your change affected notifications for this testcase much. If we had a whole bunch of HTML in that innerHTML then it would matter, of course. I do think we've mostly killed off the BindToTree execution of script, so I believe you're correct: we could insert, notify once, and then fire mutation events. I'll apply the patch for bug 403549 and reprofile. But I agree that it probably doesn't help that much.
Reporter | ||
Comment 14•17 years ago
|
||
(In reply to comment #13) > It's about 5-6% security checks. Not sure where the 30% number came from in > comment 12. > Even at 30% there is still a factor of 5-6 in perf loss here v.s. Safari or Opera. Where's the rest of the time going?
Reporter | ||
Comment 15•17 years ago
|
||
Results on new testcase: Trunk (11/15/07): innerHTML only: 84 innerHTML + style: 149 innerHTML concat: 158 innerHTML copy from other div: 189 Safari 3.0.0.4: innerHTML only: 6 innerHTML + style: 35 innerHTML concat: 466 innerHTML copy from other div: 105 Opera 9.5b1 innerHTML only: 32 innerHTML + style: 54 innerHTML concat: 101 innerHTML copy from other div: 56 I factored out getElementByID from the main loop. The only difference in these four tests is the in that loop: 1) elem.innerHTML = "Testing... ("+i+"/1000)"; 2) elem.style.top = i + 'px'; elem.style.left = i + 'px'; elem.innerHTML = "Testing... ("+i+"/1000)"; 3) elem.innerHTML = elem.innerHTML + i; 4) elem.innerHTML = "Testing... ("+i+"/1000)"; elem2.innerHTML = elem.innerHTML; Is Safari doing something to special case the first test? As soon as you touch style attributes the difference drops to 3-5x (still bad). If you append things get much closer.
Comment 16•17 years ago
|
||
i feel attachment 288941 [details]
(Testing a couple of different variants for innerHTML mutations)
is not testing it properly, as it is only assigning text content to .innerHTML
So I modified it to assign HTML
and found IE wins!!!
Comment 17•17 years ago
|
||
We've spent a fair amount of time aggressively optimizing setting of style.top and style.left.... Seriously, our implementation of innerHTML is pretty heavyweight: create a new HTML parser, run stuff through it, build a DOM inside a document fragment, insert the DocumentFragment. Frankly, for typical innerHTML use out there (outside of synthetic benchmarks) that's pretty much fine from what I've seen; I have yet to see a bug report with a real site that's being slow because of innerHTML (whereas .style.* stuff is all over).
Comment 18•17 years ago
|
||
> We've spent a fair amount of time aggressively optimizing setting of style.top
I should have said "repeated setting". We do as little work as possible at set time and put it all off until later, basically.
Comment 19•17 years ago
|
||
Besides synthetic benchmarks, Ajax libraries do set innerHTML in preference to constructing DOM nodes, from what I've read. Maybe jresig has data. It would be great to have data on frequence of innerHTML sets, with simple text vs. markup in the string broken down. /be
Comment 20•17 years ago
|
||
jQuery uses innerHTML really aggressively: it's used as a means of HTML String to DOM Node serialization (which we then use to inject into a document at a random location). We've felt the speed effects of innerHTML significantly (although, to be fair, Internet Explorer is usually much worse in this respect). So, yes, any speed improvements to innerHTML will absolutely be felt (improving the speed of virtually all DOM manipulation methods).
Assignee: nobody → jonas
Comment 21•17 years ago
|
||
(In reply to comment #20) > respect). So, yes, any speed improvements to innerHTML will absolutely be felt > (improving the speed of virtually all DOM manipulation methods). You can consider replaceHtml method mentioned at http://stevenlevithan.com/demo/replaceHtml.html Or if you want to retain original node, I have alternate method that is almost fast function replaceHtml (el, html) { var oldEl = typeof el === "string" ? document.getElementById(el) : el; /*@cc_on // Pure innerHTML is slightly faster in IE oldEl.innerHTML = html; return oldEl; @*/ var next = oldEl.nextSibling; var parent = oldEl.parentNode; parent.removeChild(oldEl); oldEl.innerHTML = html; parent.insertBefore(oldEl,next); return oldEl; };
Assignee | ||
Comment 22•17 years ago
|
||
This patch caches the parser and content sink per document as we discussed.
Comment 23•17 years ago
|
||
Don't you need to add that parser to cycle collection? Do parsers even participate in cycle collection right now? Also, what if GetCachedParser() returned an already_AddRefed and returned forget() on the nsCOMPtr (and documented all this in the interface)? That way you wouldn't have to manually set the cached parser to null: as soon as you get it, you own it.
Comment 24•17 years ago
|
||
(In reply to comment #23) > Don't you need to add that parser to cycle collection? It should be. Fragmentcontentsink owns the document and parser keeps fragmentcontentsink alive and document keeps parser alive... Btw, patches are easier to review when created with at least -8 context.
Comment 25•17 years ago
|
||
And I suggest using XPCOM_MEM_LEAK_LOG when hacking things like this.
Comment 26•17 years ago
|
||
Comment on attachment 299043 [details] [diff] [review] Cache innerHTML's parser and content sink - In nsIDocument.h: + virtual nsIParser* GetCachedParser() { + return nsnull; + } I like bz's idea about making this return an already_AddRefed<nsIParser> (and making the HTML doc one use forget() on the nsCOMPtr. Also, I'd suggest renaming this to maybe GetFragmentParser(), as this parser is specifically for fragment parsing alone, not to be confused with mParser. - In nsContentUtils::CreateContextualFragment(): + // See if the document has a cached parser. nsHTMLDocument is the only one ... has a cached fragment parser. - In nsIParser.h: + virtual void Reset() { + // Do nothing. + } Could this be pure virtual so you don't need an empty implementation here? - In nsParser::Finalize(): This could get confusing in the future MMgc world where objects have finalizers. Maybe name this Cleanup() instead? - In nsParser.h: - virtual ~nsParser(); + virtual ~nsParser() { Finalize(); } I'd leave this in the cpp file, since it's virtual it won't ever be inlined anyways. And yeah, mCachedParser (or mFragmentParser) needs to be added to cycle collection, just like mParser is. With that fixed, this looks good.
Attachment #299043 -
Flags: review?(jst) → review-
Assignee | ||
Comment 27•16 years ago
|
||
Updated to fix all of jst's comments. The cycle collector stuff went a little deeper than I would have liked, but it doesn't leak anything anymore.
Attachment #299043 -
Attachment is obsolete: true
Attachment #300140 -
Flags: review?(jst)
Assignee | ||
Updated•16 years ago
|
Attachment #300140 -
Flags: review?(peterv)
Comment 28•16 years ago
|
||
You don't need to add nsExpatDriver to cycle collection, like you added CNavDTD?
Comment 29•16 years ago
|
||
Comment on attachment 300140 [details] [diff] [review] Cache innerHTML's parser and content sink, v2 r=jst
Attachment #300140 -
Flags: review?(jst) → review+
Comment 30•16 years ago
|
||
Comment on attachment 300140 [details] [diff] [review] Cache innerHTML's parser and content sink, v2 Add nsExpatDriver, like bz said, which means you also need to traverse mTokenizer in the CParserContexts.
Attachment #300140 -
Flags: review?(peterv) → superreview+
Assignee | ||
Comment 31•16 years ago
|
||
Ok, added that.
Attachment #300140 -
Attachment is obsolete: true
Attachment #300456 -
Flags: superreview?
Assignee | ||
Updated•16 years ago
|
Attachment #300456 -
Flags: superreview? → superreview?(peterv)
Assignee | ||
Updated•16 years ago
|
Priority: P2 → P1
Target Milestone: --- → mozilla1.9beta3
Updated•16 years ago
|
Attachment #300456 -
Flags: superreview?(peterv) → superreview+
Assignee | ||
Comment 33•16 years ago
|
||
Fixed.
Status: ASSIGNED → RESOLVED
Closed: 16 years ago
Resolution: --- → FIXED
Comment 34•16 years ago
|
||
Is this really "fixed" in the "Now faster than Safari" sense? I sort of doubt it. For performance bugs it's really a good idea to do the work in dependencies filed on particular optimizations so that the main bug can continue to be used as a tracking bug... :(
Assignee | ||
Comment 35•16 years ago
|
||
I can move the patch into another bug and reopen this if you'd like?
Comment 36•16 years ago
|
||
Probably better to get a clean tracking bug filed at this point... this bug is about 15 comments past that point. :(
Updated•16 years ago
|
Updated•5 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•