Closed Bug 1381157 Opened 7 years ago Closed 7 years ago

The "mContent->GetPrimaryFrame() == this" check in nsIFrame::HasOpacityInternal is not very cache friendly

Categories

(Core :: Web Painting, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: mstange, Assigned: neerja)

References

(Blocks 1 open bug)

Details

Attachments

(2 files, 1 obsolete file)

A cache line wastage profile of one frame of displaylist building on https://jrmuizel.github.io/implementation-tests/dl-test.html shows that HasOpacityInternal is responsible for loading a lot of unused memory into the processor's L3 cache. This takes up memory bandwidth that could be used for other things during display list building.

The profiles are here:
 - read_bytes: https://perfht.ml/2tcc7BG
 - wasted_bytes: https://perfht.ml/2tbHz33

Note that these profiles are inverted and have a search filter applied.

See bug 1379694 comment 4 about how to interpret these profiles.

The profiles show that HasOpacityInternal causes us to read 528KB from memory, of which 400KB are never accessed (75% unused). This is because we read in 64 byte chunks (cache lines) and the data that we need within each of those chunks is a lot less than that.

The elements in the testcase don't have opacity (or opacity animations) applied to them.

The cache miss is in the line that does mContent->GetPrimaryFrame() == this:

http://searchfox.org/mozilla-central/rev/cbd628b085ac809bf5a536109e6288aa91cbdff0/layout/generic/nsFrame.cpp#1358
>  EffectSet* effects =
>    aEffectSet ? aEffectSet : EffectSet::GetEffectSet(this);
>  return (mContent && mContent->GetPrimaryFrame() == this &&
>          nsLayoutUtils::HasAnimationOfProperty(effects, eCSSProperty_opacity));

I think we could skip the first checks if we had a bit on the frame that says "This frame is the primary frame of mContent". Then we wouldn't need to look inside mContent for the first two checks.
But the HasAnimationOfProperty check might still need to look inside mContent.
Blocks: 1379694
Assignee: nobody → npancholi
The try run for this patch looks ok. There are some failures but they seem to be unrelated. 
https://treeherder.mozilla.org/#/jobs?repo=try&revision=46488fa197368c9a616849939f8ac86352e1a3ca&selectedJob=124630350
Thanks! This seems to help a bit; in my testcase it reduces the number of bytes read per display list building from 5.3MB to 5.2MB (100 KB reduction). It also eliminates nsIFrame::HasOpacityInternal entirely from the cache line wastage high scores. However, in its place we now have the function nsStyleUserInterface::GetEffectivePointerEvents, which accesses the frame's content in order to check whether the content has a parent:

> uint8_t
> nsStyleUserInterface::GetEffectivePointerEvents(nsIFrame* aFrame) const
> {
>   if (aFrame->GetContent() && !aFrame->GetContent()->GetParent()) {

So it looks like this is going to be a bit of a whack-a-mole game until we've eradicated all accesses to the frame's content, at least in the common case. And it's hard to know how many more instances of this problem are left, because the cache grind tool only gives us the location of the *first* access of the relevant piece of memory, it doesn't tell us what other places access the same piece of memory while it's in the cache.

But I think this patch is probably worth taking on its own.


I have two review comments:

The NS_PRECONDITION in nsIContent::SetPrimaryFrame makes it look like the following cases are allowed:

mPrimaryFrame == aFrame
mPrimaryFrame == nullptr && aFrame != nullptr
mPrimaryFrame != nullptr && aFrame == nullptr

In other words, an nsIContent can lose its current primary frame; mPrimaryFrame can be set to null again. In that case, this patch would leave the that frame's mIsPrimaryFrame flag at true, but it would need to be set to false.
It's possible that an nsIContent only loses its current primary frame if that frame is in the process of being destroyed, and in that case, the frame's mIsPrimaryFrame value wouldn't matter. But I don't know if that's really the case, and we probably shouldn't rely on it.

So I think instead of creating a static method nsIFrame::SetPrimaryFrame, you could move your code into nsIContent::SetPrimaryFrame, and add a public method nsIFrame::SetIsPrimaryFrame(bool).

Then nsIContent::SetPrimaryFrame would call mPrimaryFrame->SetIsPrimaryFrame(false) on its old primary frame (if it has one), and ->SetIsPrimaryFrame(true) on its new primary frame (if it's being set to something non-null). That way you also wouldn't have to change any of the existing callers to nsIContent::SetPrimaryFrame.

The declaration of the mIsPrimaryFrame field can move down to after mIsWrapperBoxNeedingRestyle, where the comment "There is an 11-bit gap left here." is. And that comment will need to be adjusted.
(In reply to Markus Stange [:mstange] from comment #4)
> The declaration of the mIsPrimaryFrame field can move down to after
> mIsWrapperBoxNeedingRestyle, where the comment "There is an 11-bit gap left
> here." is.

With this change, the number of bytes read during display list building goes down even further, to 5.11MB.
(In reply to Markus Stange [:mstange] from comment #4)
> So it looks like this is going to be a bit of a whack-a-mole game until
> we've eradicated all accesses to the frame's content, at least in the common
> case.

After disabling the access in nsStyleUserInterface::GetEffectivePointerEvents, there's only one more location that bubbles up, and that's in FindElementBackground where we check whether the frame's content is a body tag. With that also disabled, the number of bytes read during display list construction on the testcase goes down to 4.94MB (~7% savings).
As my next step, I'm going to apply these patches to a clean build so that I can check whether display list construction *time* is also reduced by 7%.
(In reply to Markus Stange [:mstange] from comment #4)

Thanks Markus! I made all the changes that you pointed out except this one -
 
> So I think instead of creating a static method nsIFrame::SetPrimaryFrame,
> you could move your code into nsIContent::SetPrimaryFrame, and add a public
> method nsIFrame::SetIsPrimaryFrame(bool).

The reason I had to use a static function is that nsIContent does not include nsIFrame.h but only forward declares it so I can't call nsIFrame::SetIsPrimaryFrame from it. I did try including nsIFrame.h in nsIContent.h just to see if it builds and it doesn't but I'm guessing this is undesirable to do anyway. 

The try run for this patch is:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=29a14474727e74f746050148a2c50e96cf57b105
(In reply to Neerja Pancholi[:neerja] from comment #8)
> The reason I had to use a static function is that nsIContent does not
> include nsIFrame.h but only forward declares it so I can't call
> nsIFrame::SetIsPrimaryFrame from it. I did try including nsIFrame.h in
> nsIContent.h just to see if it builds and it doesn't but I'm guessing this
> is undesirable to do anyway.

I agree that including nsIFrame.h in nsIContent.h is undesirable. I don't have a real opinion on how to implement this. I'm forwarding the review to bz who implemented SetPrimaryFrame.
Oh, he's on vacation until the 26th. I think we should wait for him.
(In reply to Markus Stange [:mstange] from comment #10)
> Oh, he's on vacation until the 26th. I think we should wait for him.

Since bz seems busy with vacation backlog, do you have suggestions for any other reviewers?
Attachment #8899913 - Flags: review?(mstange) → review?(mats)
Comment on attachment 8899913 [details]
Bug 1381157 - Cache 'mContent->GetPrimaryFrame == this' with a flag on nsIFrame and replace these calls to GetPrimaryFrame.

https://reviewboard.mozilla.org/r/170938/#review181536

I agree with mstange's comment about nsIFrame::SetPrimaryFrame.
I think we can move nsIContent::SetPrimaryFrame to a .cpp file and include nsIFrame.h there.

Also, all uses of 'mIsPrimaryFrame' to check the state should be replaced with
a call to IsPrimaryFrame(), and mIsPrimaryFrame should be private.

typo: "( mIsPrimaryFrame"

Your "we must recheck (IsInUncomposedDoc() || IsInShadowTree())" comment
is a bit confusing.  Why exactly can't you use IsPrimaryFrame() there?
(the bug have no comments that mentions "assert" when I searched for it)
Attachment #8899913 - Flags: review?(mats) → review-
I've addressed my nits above to speed up the process in case we want to
take this for v57.

Neerja, please let me know if I made any mistakes.
Attachment #8899913 - Attachment is obsolete: true
Attachment #8907756 - Flags: review+
Attachment #8907756 - Flags: feedback?(npancholi)
Attached file assertion stacks
I've investigated the assertion we would get from changing:
-  bool isPrimaryFrame = (mContent && mContent->GetPrimaryFrame() == this);
+  bool isPrimaryFrame = IsPrimaryFrame();
in nsIFrame::Destroy.

The assertion is: dom/base/FragmentOrElement.cpp:322
322       MOZ_ASSERT(IsInUncomposedDoc() || IsInShadowTree(), "This will end badly!");
That's the first stack in this attachment.

Backtracking from that, it seems this is the primary frame for a scrollbar-
button; the second stack is for the SetIsPrimaryFrame on this frame.

Going forward from that, it seems IsInUncomposedDoc() becomes false
when we call UnbindFromTree on this anonymous content.
This is the third stack.

So, I figured that perhaps we could just relax the assertion to:
"!aFrame || IsInUncomposedDoc() || IsInShadowTree()"
but then I get another assertion on shutdown instead:
in nsINode::~nsINode at dom/base/nsINode.cpp:157
157       MOZ_ASSERT(mSubtreeRoot == this, "Didn't restore state properly?");
(rr) p this->mSubtreeRoot
$1 = (nsINode *) 0x0
and that was nulled out by our call from SetIsPrimaryFrame(nullptr)
from nsFrame::DestroyFrom.  This call appears to me to be "correct",
but apparently we don't do that currently.  The value we overwrite at
that point is indeed the 'this' content, which was set by
nsINode::SetSubtreeRootPointer in UnbindFromTree, and the value we
overwrite there is indeed our primary frame.

I suspect we can't call SetPrimaryFrame(nullptr) before the assignment
in SetSubtreeRootPointer because that might cause us to not destroy
the frame?  (I didn't try).  We could probably call
GetPrimaryFrame()->SetIsPrimaryFrame(false) there though.

bz, thoughts on any of this?
(I'm inclined to take the above patch as is)
Flags: needinfo?(bzbarsky)
s/SetIsPrimaryFrame(nullptr)/SetPrimaryFrame(nullptr)/
(In reply to Mats Palmgren (:mats) from comment #13)
> Created attachment 8907756 [details] [diff] [review]
> Neerja's patch with nits fixed
> 
> I've addressed my nits above to speed up the process in case we want to
> take this for v57.
> 
> Neerja, please let me know if I made any mistakes.

I apologize for the delay in fixing these nits and thanks for taking the time to do this! When I tried to fix this, I was concerned about the correct place to put the implementation for SetPrimaryFrame and I will defer to what you think is right.
Attachment #8907756 - Flags: feedback?(npancholi)
So the key part is that mPrimaryFrame on elements is part of a union with mSubtreeRoot.  If we're in the tree, mPrimaryFrame is the part of the union that's in use, and points to the primary frame.  If we're not in the tree, mSubtreeRoot is in use and points to the subtree root.

So in normal operation, on node removal we will:

1) Call ContentRemoved, tearing down frames, etc.
2) UnbindFromTree

in that order.  The UnbindFromTree call sets mSubtreeRoot, if aNullParent || !mParent->IsInShadowTree().

This means that normally nsFrame::DestroyFrom runs when mPrimaryFrame is in fact the primary frame (if any).

The problem you're running into is that for scrollbars we reverse the order.  We're talking about frame-created anonymous content here.  It gets unbound via ScrollFrameHelper::Destroy calling mOuter->DestroyAnonymousContent which unbinds the anon content for the scrollbars.  But nsHTMLScrollFrame::DestroyFrom and nsXULScrollFrame::DestroyFrom both call ScrollFrameHelper::Destroy _before_ they call their superclass DestroyFrom method, and its the latter that tears down the frames for the scrollbar bits.

So the right fix would be to change the scrollframe destruction ordering such that we grab the list of anon content we need to unbind, then do superclass destroy, then unbind the anon content.

Furthermore, we should make similar changes at all other nsIFrame::DestroyAnonymousContent callsites.

In practice, this is probably followup fodder.  Taking the attached patch with the comment in nsFrame::DestroyFrom adjusted to document the real reason the check is the way it is, with a link to the followup bug, etc, is fine by me.  But please do make sure the comment gets updated.  This has nothing to do with the IsInUncomposedDoc() checks per se and more to do with the fact that native anon content teardown doesn't follow the usual sequence and hence puts things in a bad state that we then have to work around (and IsInUncomposedDoc() acts as the workaround).
Oh, the right place to put SetPrimaryFrame is probably in dom/base/nsIContentInlines.h.  You should be able to include nsIFrame.h there.
Flags: needinfo?(bzbarsky)
OK, I filed bug 1400618 on the Destroy/UnbindFromTree order for NACs.
https://hg.mozilla.org/mozilla-central/rev/97a042be82e5
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: