Closed Bug 1420161 Opened 2 years ago Closed 2 years ago

testcase showing Firefox slower than Chrome

Categories

(Core :: Layout: Text and Fonts, enhancement)

enhancement
Not set

Tracking

()

RESOLVED FIXED
mozilla59
Tracking Status
firefox59 --- fixed

People

(Reporter: pascalc, Assigned: jfkthame)

References

Details

(Keywords: nightly-community, perf, Whiteboard: [qf])

Attachments

(4 files)

While working on community support for Quantum launch, I talked with a developper that mentionned that we were slower than chrome on an application he wrote and that we were a bit slower with 57 than we used to be in 56.

(in French):
https://www.nextinpact.com/news/105581-firefox-57-quantum-disponible-performances-fonctionnalites-et-ambitions.htm#/comment/5957803


The developper nicely created a reduced testcase exposing our performance problem that I am attaching to this bug. You can see in the HTML comment the performance he is getting:
   // Chrome 64	: 240 ms
   // FF 56 		: 340 ms
   // FF 59 		: 380 ms

I can confirm on my own machine that there is a performance gap between Nightly 59 and Chrome 64, although the numbers vary with each load of the page, I am getting results aroung 400ms on Nightly vs 300ms on Chrome 64.

I am tentatively filing this bug in Core::Dom since the Javascript included in this page manipulates child nodes but I don't know what part of the code is the cause of the slowness.
Thanks for the test case. This may be a dupe of bug 843936 but I'm not sure.
Component: DOM → Layout: View Rendering
See Also: → 843936
Whiteboard: [perf] → [qf]
Largely text layout
Component: Layout: View Rendering → Layout: Text
Something like 78% is reflow, of which quite a bit is text handling, and 16% is styling (Stylo).
Flags: needinfo?(jfkthame)
Flags: needinfo?(cam)
Each time the script takes one of the <p> children from the div-to-split <div>, it leaves behind the surrounding whitespace (although that whitespace will be collapsed in rendering). So every time round the loop, the text content in the DOM for that original <div> changes, so it gets marked as dirty; and when we call newElem.offsetHeight, that flushes layout. And the <div id="div-to-split"> has an ever-increasing number of whitespace-only textframe children before its remaining <p> children. This keeps its reflow expensive every time round the loop.

Modifying the testcase to replace

 <div id="div-to-split">
    <p>Text</p>
    <p>Text</p>
    <p>Text</p>
    <p>Text</p>
    ...etc

with

 <div id="div-to-split">
    <p>Text</p><!--
    --><p>Text</p><!--
    --><p>Text</p><!--
    --><p>Text</p><!--
    ...etc

so that there's no extra whitespace between the <p> children suddenly makes Firefox *faster* than Chrome (whose performance is more-or-less unchanged) instead of slower.
Flags: needinfo?(jfkthame)
Could we fix the case where we're slow?
This appears to improve our performance on the example here by about 20% on my system, which still doesn't bring us fully up to Chrome but at least it's something.... there's still a substantial difference between the testcases with/without whitespace, though, so perhaps there is scope for further improvement at another level.
A second thing we can do here, after adding the optimization to ReflowText, is to put a similar early-return in TrimTrailingWhiteSpace so that it doesn't do any work in a case like this where the frame's content was entirely trimmable space, and ReflowText already determined this. With this patch, I get another 15-20% improvement (on top of the improvement from the first patch) on the testcase here, which should bring us fairly close to Chrome's result. So far, tryserver seems to think it's OK to do this without breaking tests: https://treeherder.mozilla.org/#/jobs?repo=try&revision=7650b37e0f7a22bcc54439df09bd34b24d858fac.
(In reply to Olli Pettay [:smaug] from comment #5)
> Could we fix the case where we're slow?

The two patches above make it quite a bit less slow, though it's still slower than the version without whitespace.

Something that would give us a big win here, though, would be if we could merge adjacent text nodes in the DOM when the intervening node is removed. Is there some way that could be done?

Just as a test, I tried hacking nsINode::doRemoveChildAt to check whether the node being removed is between two text nodes, and if so, append the content of the following one to the preceding, and delete the following node. This by itself (without the other patches above) gives me a speedup of about 45% (seriously!) on this testcase, or about 20% extra if used on top of the above patches. But (as you would no doubt expect) it causes various test failures, because arbitrary hacking at the DOM tree like that isn't really legitimate: https://treeherder.mozilla.org/#/jobs?repo=try&revision=25a3caa83f236d0171cb1216320df16827143842.

So is there any way we (i.e. someone who knows what they're doing with DOM stuff, unlike me...) could make this actually work without breaking various stuff, or is it fundamentally impossible?
Flags: needinfo?(bugs)
Comment on attachment 8931650 [details] [diff] [review]
Add an early return to nsTextFrame::ReflowText for the case where nothing was left after we've trimmed whitespace

Mats, does this seem a reasonable "fast-path" to add? Offhand, I don't see that it should break anything.
Attachment #8931650 - Flags: review?(mats)
Attachment #8931686 - Flags: review?(mats)
(In reply to Jonathan Kew (:jfkthame) from comment #8)
> (In reply to Olli Pettay [:smaug] from comment #5)
> > Could we fix the case where we're slow?
> 
> The two patches above make it quite a bit less slow, though it's still
> slower than the version without whitespace.
> 
> Something that would give us a big win here, though, would be if we could
> merge adjacent text nodes in the DOM when the intervening node is removed.
> Is there some way that could be done?
That would be against DOM/HTML specs.
Flags: needinfo?(bugs)
But we could in theory try some hacks. If the page doesn't have MutationObservers, nor Mutation Event listeners, and only the parent and sibling nodes have pointer to some TextNode, then we could, I guess, do some merging.
Though, even in that case we'd need to store the expected DOM tree in some form and create it on demand. Sounds very complicated.

Could layout handle consecutive text nodes somehow differently, like do some sort of merge there?
I'm not sure, offhand; sounds like a possible followup bug. I see that nsCSSFrameConstructor does try to avoid creating frames for text nodes that are collapsible whitespace at a line boundary (see nsCSSFrameConstructor::ConstructFramesFromItem) when possible. That avoids creating frames for the whitespace when we initially reflow the testcase here; but it doesn't apply, apparently, when we're reconstructing frames after the script has modified the content.
Comment on attachment 8931650 [details] [diff] [review]
Add an early return to nsTextFrame::ReflowText for the case where nothing was left after we've trimmed whitespace

This looks reasonable to me, although as noted in part 2, it does
lead to not having text runs in more cases.  We should handle that
gracefully in general though.

I noted while reading the code:
https://searchfox.org/mozilla-central/rev/45a3df4e6b8f653b0103d18d97c34dd666706358/layout/generic/nsTextFrame.cpp#9494-9502
We've had some problems before with rendering an editing caret
in zero-height text frames.  It seems like this patch might
perhaps make that worse?  (I don't recall exactly what the state
of those issues are ATM, perhaps we've already solved that somehow?)

nit: s/whitespace-skipping/skipping whitespace/ seems like it
would read better? and perhaps "trimming" instead of "skipping"
since that's the term we use nearby.

I think you should merge part 2 with this patch before landing,
since it looks like it's fixing a regression from this patch.
Attachment #8931650 - Flags: review?(mats) → review+
Attachment #8931686 - Flags: review?(mats) → review+
(In reply to Jonathan Kew (:jfkthame) from comment #13)
> I'm not sure, offhand; sounds like a possible followup bug. I see that
> nsCSSFrameConstructor does try to avoid creating frames for text nodes that
> are collapsible whitespace at a line boundary (see
> nsCSSFrameConstructor::ConstructFramesFromItem) when possible

Right, but we only do that in the ContentAppended case since it requires
this chunk:
https://searchfox.org/mozilla-central/rev/45a3df4e6b8f653b0103d18d97c34dd666706358/layout/base/nsCSSFrameConstructor.cpp#7761-7775

Also, we currently only handle one such text node, not consecutive ones.

This patch enables the same optimization for ContentRangeInserted and also
handles consecutive ignorable text nodes.  Both are required to handle
the testcase at hand.  This works as expected - it suppresses frames for
all the left-over text nodes in the first <div>.

However, it actually makes performance significantly worse!
I haven't looked into it in detail, but I'm guessing that destroying
the text frames is slower than just leaving them in.
Pushed by jkew@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/68733f09f56e
Add early-return codepaths to nsTextFrame::ReflowText and TrimTrailingWhiteSpace to optimize processing of frames containing only trimmable whitespace. r=mats
(In reply to Mats Palmgren (:mats) from comment #14)
> I think you should merge part 2 with this patch before landing,
> since it looks like it's fixing a regression from this patch.

AFAIK, part 1 didn't result in any actual regression (TrimTrailingWhitespace would still work as expected, as it would create the textrun for itself if needed); part 2 just provides an additional optimization. But I merged the two parts anyway, as they're closely related and both tiny patches.
(In reply to Olli Pettay [:smaug] from comment #3)
> Something like 78% is reflow, of which quite a bit is text handling, and 16%
> is styling (Stylo).

Just FTR, my local profiling (with the nsTextFrame optimizations applied) now shows about 54% reflow, 26% styling. The reflow is now primarily block & inline layout (lots of time in nsBlockFrame::ReflowDirtyLines), but relatively little is left down at the nsTextFrame level.

On average, I see roughly the same perf on Nightly (with the patches) and Chrome now, though the Chrome figure has a lot more variability than ours. (Sometimes they beat us by as much as 15% or so, but other times they're an almost equal amount slower. Perhaps they're more affected by exactly how caching works out at some level.)
Olli asked me to take a look to see if there was something to improve in the style-system side of things... Will take a look tomorrow.
Flags: needinfo?(emilio)
https://hg.mozilla.org/mozilla-central/rev/68733f09f56e
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla59
I don't see anything particularly outstanding... There's a bit of overhead here and there that we can look into, but not trivial fixes.
Flags: needinfo?(emilio)
Assignee: nobody → jfkthame
Flags: needinfo?(cam)
You need to log in before you can comment on or make changes to this bug.