Add a nsIFrame::mClass field with the concrete frame class ID to devirtualize various frame methods

RESOLVED FIXED in Firefox 55

Status

()

Core
Layout
P2
normal
RESOLVED FIXED
3 months ago
3 months ago

People

(Reporter: mats, Assigned: mats)

Tracking

(Blocks: 4 bugs, {perf})

Trunk
mozilla55
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox55 fixed)

Details

Attachments

(4 attachments)

(Assignee)

Description

3 months ago
In bug 1360241 we added mType to devirtualize GetType(), which is great.
In bug 1362886 we tried to devirtualize IsLeaf() by piggybacking a
table lookup based on this field, which did not end well.  We could
simply add a bit for that, but there are other things we want to
devirtualize like IsFrameOfType, do_QueryFrame etc and storing what
is mostly static class data on every frame for each of those seems
wasteful.

In this bug I will replace mType with a mClass field (also a byte) to
store the unique nsQueryFrame::FrameIID which every concrete frame class
has (it's used for allocating it from the pres shell arena).
This can be used to devirtualize all these methods using a table-based
lookup.

I'll refactor Type() to use this field so that it becomes
"return sLayoutFrameTypes[mClass]".
(Assignee)

Updated

3 months ago
Keywords: perf
(Assignee)

Updated

3 months ago
Blocks: 1364813
(Assignee)

Updated

3 months ago
Blocks: 1364815
(Assignee)

Comment 1

3 months ago
Created attachment 8867979 [details] [diff] [review]
part 1 - Make every concrete frame class be a NS_DECL_QUERYFRAME_TARGET.

Mostly just boilerplate changes.  The relevant change is to nsFrame.h.
Attachment #8867979 - Flags: review?(jfkthame)
(Assignee)

Comment 2

3 months ago
Created attachment 8867980 [details] [diff] [review]
part 2 - Add a nsIFrame::mClass field and propagate the concrete class' value up the ctor chain.

The interesting changes are in nsIFrame.h, nsFrame.* and nsQueryFrame.h.
I'm adding a nsQueryFrame::ClassID enum class and a nsIFrame::mClass
field of that type.  I'm also adding a kClass static const to every
concrete frame class, and then propagating that value up the ctor
chain to nsIFrame.  I've also added an assertion in nFrame::Init:
  MOZ_ASSERT(nsQueryFrame::FrameIID(mClass) == GetFrameId());
which should catch any errors.

(I'm intentionally keeping mType for now, it'll be removed in part 3)
Attachment #8867980 - Flags: review?(jfkthame)
(Assignee)

Comment 3

3 months ago
Created attachment 8867982 [details] [diff] [review]
part 3 - Implement Type() by indexing a static array using the mClass field.  Remove the nsIFrame::mType field.

This adds the Type() values for each concrete frame class in nsFrameIdList.h
so that we can generate an array with those values to look them up
in Type(), which becomes "return sLayoutFrameTypes[mClass]".

Fwiw, I did a Try run that asserted that mType == sLayoutFrameTypes[mClass]
before I removed that field and it passed all regression tests.
Attachment #8867982 - Flags: review?(jfkthame)
(Assignee)

Comment 4

3 months ago
Created attachment 8867983 [details] [diff] [review]
part 4 - Remove the now unused LayoutFrameType values from the ctors.
Attachment #8867983 - Flags: review?(jfkthame)
(Assignee)

Comment 5

3 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=b64e4893b79105c0556e0a64c3c89eefe1298b4a
This sounds reasonable, just wondered if you have been able to see any impact on profiles or is it all so low-level that it gets lost in the noise?
(Assignee)

Comment 7

3 months ago
I suspect that each of these suggested improvements (see depends list) is probably
too small to move the needle on something like Tp5.  I've seen them show up in
perf logs, and I think others (including Ehsan) have too; usually in the ~0.1%
range though.  I do think it's worth optimizing though, since it's pretty easy
to do and even small improvements eventually adds up.
I'll try to do a talos Try run with all of them when I get the chance...
How does the cost of looking up a global variable compare to the cost of a virtual function call?  (Does this vary by OS, depending on how the adjustments for loading libraries at different base addresses are handled?)
Comment on attachment 8867979 [details] [diff] [review]
part 1 - Make every concrete frame class be a NS_DECL_QUERYFRAME_TARGET.

Review of attachment 8867979 [details] [diff] [review]:
-----------------------------------------------------------------

I confess I didn't individually check all the boilerplate changes!
Attachment #8867979 - Flags: review?(jfkthame) → review+
Comment on attachment 8867980 [details] [diff] [review]
part 2 - Add a nsIFrame::mClass field and propagate the concrete class' value up the ctor chain.

Review of attachment 8867980 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/forms/nsFormControlFrame.h
@@ +24,5 @@
>      * Main constructor
>      * @param aContent the content representing this frame
>      * @param aParentFrame the parent frame
>      */
> +  explicit nsFormControlFrame(nsStyleContext*, nsIFrame::ClassID);

We could drop the 'explicit' here (and in other places where a 1-arg constructor is becoming a 2-arg one), though I guess it's harmless to leave it.

::: layout/generic/nsIFrame.h
@@ +606,4 @@
>  
>    NS_DECL_QUERYFRAME_TARGET(nsIFrame)
>  
> +  explicit nsIFrame(ClassID aID, mozilla::LayoutFrameType aType)

Drop 'explicit'.

(Looks like there are quite a few more places we could do this; I'm not noting them individually.)
Attachment #8867980 - Flags: review?(jfkthame) → review+
Comment on attachment 8867982 [details] [diff] [review]
part 3 - Implement Type() by indexing a static array using the mClass field.  Remove the nsIFrame::mType field.

Review of attachment 8867982 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/generic/nsIFrame.h
@@ +2643,5 @@
>     *
>     * @see mozilla::LayoutFrameType
>     */
> +  mozilla::LayoutFrameType Type() const {
> +    MOZ_ASSERT(uint8_t(mClass) < MOZ_ARRAY_LENGTH(sLayoutFrameTypes));

s/MOZ_ARRAY_LENGTH/mozilla::ArrayLength/, I think.
Attachment #8867982 - Flags: review?(jfkthame) → review+
Comment on attachment 8867980 [details] [diff] [review]
part 2 - Add a nsIFrame::mClass field and propagate the concrete class' value up the ctor chain.

Review of attachment 8867980 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/generic/nsIFrame.h
@@ +606,4 @@
>  
>    NS_DECL_QUERYFRAME_TARGET(nsIFrame)
>  
> +  explicit nsIFrame(ClassID aID, mozilla::LayoutFrameType aType)

Actually, given that patch 4 will then remove the second (original) param, you might as well leave this alone at this stage.
Attachment #8867983 - Flags: review?(jfkthame) → review+
(In reply to David Baron :dbaron: ⌚️UTC-7 from comment #8)
> How does the cost of looking up a global variable compare to the cost of a
> virtual function call?  (Does this vary by OS, depending on how the
> adjustments for loading libraries at different base addresses are handled?)

Intuitively, I'd expect the global variable to be slightly cheaper; both need to look something up -- the global var or the class's vtable -- but the global variable doesn't involve a PC jump, so instruction cache behavior should be better. But the combination of optimizing compilers and CPU features like branch prediction can easily play havoc with my intuition about stuff...!
(Assignee)

Comment 14

3 months ago
(In reply to David Baron :dbaron: ⌚️UTC-4 from comment #8)
> How does the cost of looking up a global variable compare to the cost of a
> virtual function call?  (Does this vary by OS, depending on how the
> adjustments for loading libraries at different base addresses are handled?)

I posted some performance analysis in bug 1362886 comment 17.
It seems to me that an array lookup is 1.3-2.0 times faster than a virtual call,
even under ideal conditions for virtual calls (no I-cache misses).

Comment 15

3 months ago
Pushed by mpalmgren@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/45bf72e9b3b3
part 1 - Make every concrete frame class be a NS_DECL_QUERYFRAME_TARGET.  r=jfkthame
https://hg.mozilla.org/integration/mozilla-inbound/rev/47ed59045f9b
part 2 - Add a nsIFrame::mClass field and propagate the concrete class' value up the ctor chain.  r=jfkthame
https://hg.mozilla.org/integration/mozilla-inbound/rev/9164c8933697
part 3 - Implement Type() by indexing a static array using the mClass field.  Remove the nsIFrame::mType field.  r=jfkthame
https://hg.mozilla.org/integration/mozilla-inbound/rev/78e989ad55af
part 4 - Remove the now unused LayoutFrameType values from the ctors.  r=jfkthame

Comment 16

3 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/45bf72e9b3b3
https://hg.mozilla.org/mozilla-central/rev/47ed59045f9b
https://hg.mozilla.org/mozilla-central/rev/9164c8933697
https://hg.mozilla.org/mozilla-central/rev/78e989ad55af
Status: NEW → RESOLVED
Last Resolved: 3 months ago
status-firefox55: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
According to AreWeFastYet.com, this contributed a 20% improvement to speedometer-misc-Preact-TodoMVC-Adding100Items-sync.

https://arewefastyet.com/#machine=17&view=single&suite=speedometer-misc&subtest=Preact-TodoMVC-Adding100Items-sync
Blocks: 1339557
Also some nice improvements in speedometer-misc-VueJS-TodoMVC-Adding100Items-sync:

https://arewefastyet.com/#machine=17&view=single&suite=speedometer-misc&subtest=VueJS-TodoMVC-Adding100Items-sync
You need to log in before you can comment on or make changes to this bug.