Closed
Bug 922681
Opened 11 years ago
Closed 11 years ago
Consider a SetInnerHTML fast-path for text with no markup
Categories
(Core :: DOM: Core & HTML, defect)
Tracking
()
RESOLVED
FIXED
mozilla29
People
(Reporter: bzbarsky, Assigned: jandem)
References
Details
(Whiteboard: [qa-])
Attachments
(2 files, 2 obsolete files)
10.72 KB,
patch
|
bzbarsky
:
review+
|
Details | Diff | Splinter Review |
4.42 KB,
patch
|
bzbarsky
:
review+
|
Details | Diff | Splinter Review |
See bug 922016 comment 2. Basically, if the input string does not contain any of '<', '&', '\r', '\0', then it's plaintext and we don't have to spin up an HTML parser at all. The thing is, checking for those chars itself takes time, so we may want to only do this for short strings. Especially because the longer a string is the more likely it is to have markup in it. I thought we had a bug on this already, but I don't see it...
Reporter | ||
Comment 1•11 years ago
|
||
Oh, Henri, where does the "depends on the element" bit come in?
Flags: needinfo?(hsivonen)
Comment 2•11 years ago
|
||
I was thinking of <pre>, but it looks like I was wrong.
Flags: needinfo?(hsivonen)
Assignee | ||
Comment 3•11 years ago
|
||
This seems to work on some cases I tried. Please review carefully because I don't really understand everything that's going on in these methods.
Assignee | ||
Comment 4•11 years ago
|
||
Comment on attachment 825258 [details] [diff] [review] Patch Requesting feedback? from hsivonen as suggested by smaug :)
Attachment #825258 -
Flags: feedback?(hsivonen)
Updated•11 years ago
|
Attachment #825258 -
Flags: feedback?(hsivonen) → feedback+
Assignee | ||
Comment 5•11 years ago
|
||
Some numbers for the testcase in bug 922016: Safari: 1180 ms FF new: 1250 ms Chrome: 1550 ms FF old: 1950 ms So this wins about 35% and we're faster than Chrome and not that much slower than Safari. We can win more by optimizing other parts of that benchmark like the |clear.style.clear = "both"|.
How much does the patch slow down the cases when there is markup. http://mozilla.pettay.fi/moztests/innerhtml_nonsimple_nostyle_test.html for example, though might be good to write a worst case test too.
Assignee | ||
Comment 7•11 years ago
|
||
(In reply to Olli Pettay [:smaug] from comment #6) > How much does the patch slow down the cases when there is markup. > http://mozilla.pettay.fi/moztests/innerhtml_nonsimple_nostyle_test.html for > example, I don't see much of a difference: 12.2-12.5 ms both with and without the patch. It makes sense though because the first character is < so we will stop scanning immediately. > though might be good to write a worst case test too. A worst-case test would look like the one below: string with length < 100 and near the end a markup character. Unfortunately the numbers are very noisy but if I look at the lowest number in 10 runs it's probably 3-4% slower or so. But again, it's hard to say much due to the high variance :( <script> function test() { var el = document.getElementById("foo"); var t = new Date; for (var i=0; i<200000; i++) el.innerHTML = "0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456&"; el.innerHTML = (new Date - t) + "ms"; } window.onload = test; </script> <div id="foo"></div>
Assignee | ||
Comment 8•11 years ago
|
||
bz: review ping, in case you forgot about this.
Reporter | ||
Comment 9•11 years ago
|
||
I didn't; I'm just still somewhat backlogged. Should get to this today.
Reporter | ||
Comment 10•11 years ago
|
||
Comment on attachment 825258 [details] [diff] [review] Patch >+ nsAString::const_iterator start, end; >+ aStr.BeginReading(start); >+ aStr.EndReading(end); Clearer to just do: const PRUnichar* start = aStr.BeginReading(); const PRUnichar* end = aStr.EndReading(); Please document that you're not using FindCharInSet because null is one of the chars you want to search for. > + if (doc->IsHTML() && aInnerHTML.Length() < 100 && !ContainsMarkup(aInnerHTML)) { Why do we need the IsHTML() test there? Please document the reason, because it's very non-obvious to me why that test is needed. Also, document that we limit to short strings to avoid ContainsMarkup taking too long. And if there was a principled way to pick "100" document that, else document that the choice was pretty much gut feeling. >+ aError = nsContentUtils::SetNodeTextContent(target, aInnerHTML, false); This could also be: target->FragmentOrElement::SetTextContentInternal(aInnerHTML, aError); (fully qualified to avoid the virtual dispatch). Not sure which is uglier. The non-ugly version would be: target->SetTextContent(aInnerHTML, aError); but that has an extra virtual call... I guess explicit SetNodeTextContent is ok for now. r=me with the nits.
Attachment #825258 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 11•11 years ago
|
||
I added the doc->IsHtml() test just to be conservative but if you also don't see a good reason for it we may as well remove it. This patch does that and addresses your other nits. Requesting review just to make sure you're okay with this change.
Attachment #825258 -
Attachment is obsolete: true
Attachment #830832 -
Flags: review?(bzbarsky)
Reporter | ||
Comment 12•11 years ago
|
||
Comment on attachment 830832 [details] [diff] [review] Patch v2 r=me
Attachment #830832 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 13•11 years ago
|
||
Hm this patch is failing this M5 test: parser/htmlparser/tests/mochitest/test_html5_tree_construction_part2.html https://tbpl.mozilla.org/?tree=Try&rev=955c00609173 Will investigate today or tomorrow..
Reporter | ||
Comment 14•11 years ago
|
||
Hmm. I wonder whether there's weirdness with setting innerHTML on tbody or whatnot (does that imply the tr/td tags?).
Assignee | ||
Comment 15•11 years ago
|
||
Here's a minimal testcase. With my patch context.innerHTML is empty and the test alerts FAIL: function f() { var context = document.createElementNS("http://www.w3.org/1999/xhtml", "html"); context.innerHTML = ""; var pass = (context.innerHTML == "<head></head><body></body>"); alert(pass ? "PASS" : "FAIL"); } f(); Boris, do you know what's going on?
Flags: needinfo?(bzbarsky)
Reporter | ||
Comment 16•11 years ago
|
||
Yes, I do, unfortunately. <body> and <head> have optional start and end tags in HTML. If you look at http://www.whatwg.org/specs/web-apps/current-work/multipage/the-end.html#html-fragment-parsing-algorithm step 4 substep 6 lands us in http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#reset-the-insertion-mode-appropriately which does various special-casing bits. In the particular case of <html> we end up in the "before head" insertion mode, and in the "data state" (from step 4 substep 1 of the fragment parsing algorithm) in this case. Then in the data state, when EOF is encountered we get an end-of-file token. Looking at http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#the-before-head-insertion-mode an EOF falls into the "Anything else" clause, which will create a <head> element and switch to the "in head" insertion mode. Then we land in http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#parsing-main-inhead and fall into "Anything else" (still with our EOF token), which moves us to "after head" mode. Then we go to http://www.whatwg.org/specs/web-apps/current-work/multipage/tree-construction.html#the-after-head-insertion-mode and again hit "Anything else" and create a <body> element move to the "in body" mode. Anyway, long story short we probably need to not do this optimization in cases when the context element is anything that would lead to an insertion mode other than "in body" per http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing.html#reset-the-insertion-mode-appropriately Unfortunately, there's quite a long list of things that have that behavior. I wonder whether it's worth just using a boolean flag on nodeinfo (or on the context element, since we have better flag infrastructure there) to keep track of this stuff. Or alternately to only fast-path innerHTML sets on a whitelist of elements that are commonly hit. :(
Flags: needinfo?(hsivonen)
Flags: needinfo?(bzbarsky)
Flags: needinfo?(bugs)
Comment 17•11 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #16) > Unfortunately, there's quite a long list of things that have that behavior. > I wonder whether it's worth just using a boolean flag on nodeinfo (or on the > context element, since we have better flag infrastructure there) to keep > track of this stuff. Makes sense for me. I forgot to test those. I just tested <pre> as a potential special case earlier.
Flags: needinfo?(hsivonen)
Assignee | ||
Comment 18•11 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #16) > Anyway, long story short we probably need to not do this optimization in > cases when the context element is anything that would lead to an insertion > mode other than "in body" per > http://www.whatwg.org/specs/web-apps/current-work/multipage/parsing. > html#reset-the-insertion-mode-appropriately This patch sets the ElementHasWeirdParserInsertionMode flag on the elements mentioned there. This fixes the minimal testcase in comment 15. For some elements like <td> this may be overzealous: the "in cell" insertion mode should only behave != "in body" for start/end tags (right?), and we filter these out already. Not setting the flag may be confusing or break other uses of it in the future. What do you think? Also, where can I do this for the colgroup element? This was the only element I couldn't find :)
Attachment #830832 -
Attachment is obsolete: true
Attachment #8348339 -
Flags: review?(bzbarsky)
Reporter | ||
Comment 19•11 years ago
|
||
Comment on attachment 8348339 [details] [diff] [review] Patch v3 You need to set the flag in HTMLTableColElement as well, at least for the colgroup case. r=me with that fixed and a test added that would have caught this issue.
Attachment #8348339 -
Flags: review?(bzbarsky) → review+
Reporter | ||
Updated•11 years ago
|
Flags: needinfo?(bugs)
Assignee | ||
Comment 20•11 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #19) > r=me with that fixed and a test added that would have caught this issue. The following test caught it (see comment 13): parser/htmlparser/tests/mochitest/test_html5_tree_construction_part2.html Or should I add a mochitest for the testcase in comment 15 based on that one?
Flags: needinfo?(bzbarsky)
Assignee | ||
Comment 21•11 years ago
|
||
Added the call to HTMLTableColElement and pushed: https://hg.mozilla.org/integration/mozilla-inbound/rev/86969d38372b Green on Linux64/OS X now: https://tbpl.mozilla.org/?tree=Try&rev=27f2ebefc1ad
Flags: needinfo?(bzbarsky)
Reporter | ||
Comment 22•11 years ago
|
||
> The following test caught it (see comment 13):
No, I mean specifically for <colgroup> elements. Or did that test catch the missing flag on those too?
Flags: needinfo?(jdemooij)
Assignee | ||
Comment 23•11 years ago
|
||
(In reply to Boris Zbarsky [:bz] from comment #22) > No, I mean specifically for <colgroup> elements. Or did that test catch the > missing flag on those too? Sorry I misunderstood your comment. Let me try to write a test for <colgroup>.
Reporter | ||
Comment 24•11 years ago
|
||
Ideally we'd have tests for all of these element types, actually.
Assignee | ||
Comment 25•11 years ago
|
||
Because we filter out strings with "<" etc, most of these tests pass even if I comment out the HasWeirdParserInsertionMode() check, but some of them fail.
Attachment #8348989 -
Flags: review?(bzbarsky)
Flags: needinfo?(jdemooij)
Comment 26•11 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/86969d38372b
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla29
Reporter | ||
Comment 27•11 years ago
|
||
Comment on attachment 8348989 [details] [diff] [review] Testcase r=me
Attachment #8348989 -
Flags: review?(bzbarsky) → review+
Assignee | ||
Comment 28•11 years ago
|
||
Tests: https://hg.mozilla.org/integration/mozilla-inbound/rev/2ca8e48896c4
Updated•11 years ago
|
Whiteboard: [qa-]
Updated•5 years ago
|
Component: DOM → DOM: Core & HTML
You need to log in
before you can comment on or make changes to this bug.
Description
•