Closed Bug 1393098 Opened 7 years ago Closed 7 years ago

Investigate if nsTextFrame::CharacterDataChanged could return early if the frame is already dirty

Categories

(Core :: Layout, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Performance Impact high
Tracking Status
firefox57 --- fixed

People

(Reporter: smaug, Assigned: dholbert)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

When profiling the second testcase for https://bugzilla.mozilla.org/show_bug.cgi?id=1346723 I see nsTextFrame::CharacterDataChanged and PresShell::FrameNeedsReflow being called quite a bit.
Looks like we end up calling PresShell::FrameNeedsReflow for already dirty text frames.
Could that be optimized?
Whiteboard: [qf]
In a profile I'm looking at, nsTextFrame::CharacterDataChanged takes about 4% of input.value = "new value" time.
I was going to slightly-question whether this is just pathological behavior, but it looks like speedometer does something like this (based on bug 1346723 comment 0), so I suppose we have to care.
This is definitely not pathological. I would assume this to show up in any <input>/<textarea> value change, including typing. And anyhow, affects bug 1346723 rather badly (after some fixes this is up to 5.4% of value setting).
Assignee: nobody → dholbert
Status: NEW → ASSIGNED
(In reply to Olli Pettay [:smaug] from comment #0)
> Looks like we end up calling PresShell::FrameNeedsReflow for already dirty
> text frames. Could that be optimized?

It might seem like we could skip the FrameNeedsReflow() call if the frame is already dirty, but unfortunately it's not quite that simple.

 - The FrameNeedsReflow() call in question here does *more* than marking the frame dirty -- it also marks all ancestors' intrinsic sizes as needing to be recalculated (due to the nsIPresShell::eStyleChange argument that it passes in).
 - And there are other codepaths that set the frame's dirty bit *without* making that change to the ancestors (e.g. InvalidateFrameDueToGlyphsChanged calls FrameNeedsReflow with the [misleadingly-named] nsIPresShell::eResize argument, and that call does *not* invalidate ancestors' intrinsic sizes).
 - So: the frame's dirty bit is *not* sufficient to tell us whether it'd be safe to skip our FrameNeedsReflow(...,eStyleChange,...) call.

We can add another dedicated bit to tell us whether we've made this exact same FrameNeedsReflow(...,eStyleChange,...) call before, though (which we set in this function, and then check on repeated calls, and clear on reflow).  I'll be working along those lines here.

One slightly-tricky factor is that we don't have any spare nsFrameStateBits for text frames. We might be able to squeeze in a bool:1 member variable, though -- this doesn't seem to bloat the size of nsTextFrame on my machine. If it bloats the struct on other configurations, we may need to be sneakier (or just buy ourselves some more frame state bits).
Perhaps we can steal the least significant bit of mNextContinuation?  :-/  It's gross but AFAICT the object doesn't have any room for additional bits sadly...
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #5)
> Perhaps we can steal the least significant bit of mNextContinuation?  :-/ 
> It's gross but AFAICT the object doesn't have any room for additional bits
> sadly...

Not specifically in nsTextFrame, but there are still some free bits in the nsIFrame base; I assume that's what dholbert is considering.

http://searchfox.org/mozilla-central/rev/5696c3e525fc8222674eed6a562f5fcbe804c4c7/layout/generic/nsIFrame.h#4176
Can we do some optimizations based on the fact that the textframe is in native anonymous content, inside <input> (or <textarea>)?

After some other fixes, this is now 4.7% of input.value setting.
Of that 4.7% 64% seems to be FrameNeedsReflow call.
Whiteboard: [qf] → [qf:p1]
(In reply to Daniel Holbert [:dholbert] from comment #4)
> We might be able to squeeze in a bool:1 member variable,
> though -- this doesn't seem to bloat the size of nsTextFrame on my machine.
> If it bloats the struct on other configurations, we may need to be sneakier
> (or just buy ourselves some more frame state bits).

It looks like this is fine, actually -- no special hackery needed.

I ran a debugging patch through Try, to prints sizeof(nsTextFrame), and it produced the same results before/after adding a bool:1 member to that class, on *nearly* all platforms. The only platform that showed an increase was 32-bit linux builds.  But I think the change is minimal enough (and our 32-bit linux usage is low enough, particularly given increasing 64-bit adoption) that we don't need to agonize about saving bits here and there for specifically just that one platform.

(I don't have any data for android, because my printf didn't show up in the output or the logcat there, but I don't think that's worth agonizing over either.)

Platform:                  unmodified  with bool:1  Increase
==========                 ==========  ===========  ========
 Linux opt                 88          92           +4 bytes
 Linux x64 opt             136         136          0
 OS X 10.10 opt            136         136          0
 Windows 7 opt             96          96           0
 Windows 7 pgo             96          96           0
 Windows 10 x64 opt        136         136          0
 Windows 10 x64 pgo        136         136          0
 windows7-32-stylo opt     96          96           0
 windows10-64-stylo opt    136         136          0

The Try runs I used were:
unpatched: https://treeherder.mozilla.org/#/jobs?repo=try&revision=bee0f961a25e48f28ebcd9f7ff3f728da1df667a
patched:   https://treeherder.mozilla.org/#/jobs?repo=try&revision=dd253e1f5e9c87d9c13d1514c86e4a797693cac7
Comment on attachment 8902902 [details]
Bug 1393098 part 0: Refactor logic (add helper bool & reduce GetParent calls), in nsTextFrame::CharacterDataChanged.

https://reviewboard.mozilla.org/r/174616/#review179716

::: layout/generic/nsTextFrame.cpp:4874
(Diff revision 1)
> -        lastDirtiedFrame->GetParent() != textFrame->GetParent()) {
> +    bool hasSameParentAsLastDirtied = !lastDirtiedFrame ||
> +      lastDirtiedFrame->GetParent() != textFrame->GetParent();
> +
> +    if (!hasSameParentAsLastDirtied) {

Sorry, this logic is backwards!  The boolean condition really represents "hasDIFFERENTParent" not "hasSameParent".  Just noticed when running the build locally (text frames weren't painting in some cases) :)

I'll fix & repost.
Attachment #8902902 - Flags: review?(jfkthame)
Attachment #8902903 - Flags: review?(jfkthame)
To validate that this patch helps, I profiled the pageload of attachment 8848015 [details] (the testcase mentioned in comment 0 here) in two local opt builds -- one unpatched, one patched (but otherwise the same).

Unpatched build:
 http://bit.ly/2vL3Wsg (11ms in CharacterDataChanged)
 http://bit.ly/2vKIT9b (12.4ms in CharacterDataChanged)

Patched build:
 http://bit.ly/2wjObfw (5.2ms in CharacterDataChanged)
 http://bit.ly/2wk03yg (5.6ms in CharacterDataChanged)

So this does indeed seem to be helping -- it shaves off ~5ms (50%) from the CharacterDataChanged time in this testcase.
Looks great -- good to see this does indeed give a significant improvement. However, I would really prefer us to consider using one of the free bits in nsIFrame for this purpose.

Currently, nsTextFrame ends with three 32-bit members, which means that in a 64-bit build it will have 4 bytes of trailing padding, and so the added bool:1 doesn't increase the size. But in a 32-bit build, the addition of this single bit will make the class 4 bytes bigger. In either 32- or 64-bit builds, however, we have the 11-bit gap in nsIFrame as noted in comment 6, and so we can define an extra bool:1 there at no cost.

I realize that defining a bool flag in nsIFrame that is only for use by nsTextFrame is less "clean" than defining it in nsTextFrame itself, but IMO it's worth this minor hack to save 4 bytes per textframe on Android, as memory on low-end devices can be quite constrained already.
(In reply to Jonathan Kew (:jfkthame) from comment #18)
> In a 32-bit build, the addition of
> this single bit will make the class 4 bytes bigger.

I worried about that too, but I tested (see comment 8) and AFAICT there's no change on Win32 (though you're right for Linux32, and I don't know about Android).

Having said that: I'm sure this bit *would* be responsible for a growth on all platforms, if we happened to add another nscoord member (or any 32-bit member) to nsTextFrame.  So I share some of your uneasiness.

> In either 32- or 64-bit
> builds, however, we have the 11-bit gap in nsIFrame as noted in comment 6,
> and so we can define an extra bool:1 there at no cost.
> I realize that defining a bool flag in nsIFrame that is only for use by
> nsTextFrame is less "clean" than defining it in nsTextFrame itself, but IMO
> it's worth this minor hack to save 4 bytes per textframe on Android, as
> memory on low-end devices can be quite constrained already.

Fair enough, OK -- I'll do that.
(In reply to Daniel Holbert [:dholbert] from comment #19)
> (In reply to Jonathan Kew (:jfkthame) from comment #18)
> > In a 32-bit build, the addition of
> > this single bit will make the class 4 bytes bigger.
> 
> I worried about that too, but I tested (see comment 8) and AFAICT there's no
> change on Win32

Yes, I noticed that and was curious about the difference between Linux32 and Win32 (though not curious enough to really investigate!).... I guess MSVC must be doing something a bit different, perhaps resulting in 8-byte alignment even on 32-bit?

> (though you're right for Linux32, and I don't know about
> Android).

I didn't actually test but I'm moderately confident Android would be like Linux32 here.

> Fair enough, OK -- I'll do that.

Thanks!
Interdiff for moving the bit to nsIFrame: https://reviewboard.mozilla.org/r/174620/diff/4-5/

(Ignore the nsTextFrame color-related stuff in the middle -- those aren't changes in my patch, it's just code that changed underneath me between patch versions. It should only show up in the interdiff, not in the actual patch, because that's how MozReview is.)
Flags: needinfo?(jfkthame)
Comment on attachment 8902902 [details]
Bug 1393098 part 0: Refactor logic (add helper bool & reduce GetParent calls), in nsTextFrame::CharacterDataChanged.

https://reviewboard.mozilla.org/r/174616/#review180080
Attachment #8902902 - Flags: review?(jfkthame) → review+
Comment on attachment 8902903 [details]
Bug 1393098 part 1: Adjust nsTextFrame::CharacterDataChanged to skip redundant requests for reflow, via a new boolean member-var.

https://reviewboard.mozilla.org/r/174620/#review180082

LGTM, thanks!
Attachment #8902903 - Flags: review?(jfkthame) → review+
Pushed by dholbert@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/007b471778c3
part 0: Refactor logic (add helper bool & reduce GetParent calls), in nsTextFrame::CharacterDataChanged. r=jfkthame
https://hg.mozilla.org/integration/autoland/rev/9c13f2c5b951
part 1: Adjust nsTextFrame::CharacterDataChanged to skip redundant requests for reflow, via a new boolean member-var. r=jfkthame
https://hg.mozilla.org/mozilla-central/rev/007b471778c3
https://hg.mozilla.org/mozilla-central/rev/9c13f2c5b951
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Thanks. I see this dropped from 4-5% to 1.7% of input.value setter.
Flags: needinfo?(jfkthame)
Performance Impact: --- → P1
Whiteboard: [qf:p1]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: