Open Bug 1449288 Opened 6 years ago Updated 1 year ago

Consider reducing the size of vtables

Categories

(Core :: General, defect, P3)

People

(Reporter: bzbarsky, Unassigned)

References

(Depends on 2 open bugs, Blocks 1 open bug)

Details

(Whiteboard: [overhead:2MB])

Attachments

(2 files)

At least on some platforms, vtables are not shareable across processes.  So we should see what we can do to reduce the space used by vtables.

On 64-bit Linux, today, opt build, not --enable-release, using Nathan's suggestion of:

  readelf -sW libxul.so | grep _ZTV | awk '{ sum += $3 } END { print sum }'

I get 2,349,024 bytes.

Some highlights:

1) Layout frames:

  readelf -sW ../obj-firefox-opt/dist/bin/libxul.so | grep _ZTV | awk '{ print $3, $8 }' | c++filt | grep 'Frame$' | grep -v :: | awk '{ sum += $1 } END { print sum }'

says that we have 190KB of vtables for layout frame classes.  The vtable for nsIFrame is 1KB.  The vtable for nsFrame is 1128 bytes.  We should see about shrinking down the nsIFrame vtable, at least.
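To put those byte counts in terms of methods, here is a rough sketch. It assumes 8-byte pointers and a single primary vtable with a two-word header (offset-to-top and RTTI pointer), as in the Itanium C++ ABI; classes with multiple inheritance would not fit this model exactly. The function names are made up for illustration.

```shell
# Approximate number of virtual-function slots in a vtable of the given size.
# Assumes 8-byte pointers and a two-word header (offset-to-top, RTTI pointer).
vtable_slots() {
  echo $(( $1 / 8 - 2 ))
}

# Bytes saved by dropping METHODS virtual methods from a base class whose
# vtable layout is repeated in CLASSES classes (the base plus its descendants).
# Usage: vtable_savings METHODS CLASSES
vtable_savings() {
  echo $(( $1 * $2 * 8 ))
}
```

Under those assumptions, `vtable_slots 1024` gives 126 slots for nsIFrame. Each virtual method removed from nsIFrame saves 8 bytes in every frame class vtable, so `vtable_savings 1 170` (170 being an illustrative class count, not a measured one) comes to 1360 bytes per method.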

2) Elements:

  readelf -sW ../obj-firefox-opt/dist/bin/libxul.so | grep _ZTV | awk '{ print $3, $8 }' | c++filt | egrep '(HTML|SVG|XUL).*Element$' | awk '{ sum += $1 } END { print sum }'

says we have about 180KB of element class vtables.  The vtable for nsINode is 488 bytes.  The vtable for nsIContent is 752 bytes.  The vtable for mozilla::dom::Element is 960 bytes.  The vtable for nsGenericHTMLElement is 1032 bytes.  Again, shrinking down these "shared by everything" vtables can lead to somewhat easy wins.

3) Runnables:

  readelf -sW ../obj-firefox-opt/dist/bin/libxul.so | grep _ZTV | awk '{ print $3, $8 }' | c++filt | grep Runnable | awk '{ sum += $1 } END { print sum }'

says we might have 190KB of runnable vtables.  That's going to decrease for release due to bug 1447744, of course.

4) Events:

  readelf -sW ../obj-firefox-opt/dist/bin/libxul.so | grep _ZTV | awk '{ print $3, $8 }' | c++filt | grep 'Event$' | grep mozilla::dom | awk '{ sum += $1 } END { print sum }'

says we might have 74KB of event vtables.

5) Display items:

  readelf -sW ../obj-firefox-opt/dist/bin/libxul.so | grep _ZTV | awk '{ print $3, $8 }' | c++filt | grep 'nsDisplay' | awk '{ sum += $1 } END { print sum }'

says we might have 44KB of display item vtables.

I'll attach something people can grep for themselves to see what else they can find.  First column is bytes, after that is demangled name.
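The per-category pipelines above all have the same shape, so a small helper may make it easier to explore a list in that "bytes, then demangled name" format. This is only a sketch: the function names and the `vtables.txt` file name are assumptions, not the name of the actual attachment.

```shell
# Turn `readelf -sW` output (stdin) into "<bytes> <demangled name>" lines.
# Field 3 is the symbol size and field 8 the mangled name.
vtable_list() {
  awk '$8 ~ /^_ZTV/ { print $3, $8 }' | c++filt
}

# Sum the byte counts of all lines on stdin that match an extended regex.
# Prints 0 when nothing matches.
vtable_sum_matching() {
  grep -E -- "$1" | awk '{ sum += $1 } END { print sum + 0 }'
}

# Usage (paths are hypothetical):
#   readelf -sW libxul.so | vtable_list > vtables.txt
#   vtable_sum_matching 'Frame$' < vtables.txt
#   vtable_sum_matching 'Runnable' < vtables.txt
```

Further filters such as `grep -v ::` can be chained in front of `vtable_sum_matching` in the same way as in the pipelines above.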
Depends on: 1449290
Depends on: 1449393
Depends on: 1449404
For events, the virtual AS*Event methods are probably not helping.
Bug 1332680 has a bunch of classes and methods that can get 'final' keywords without refactoring.

The biggest wins from that right now are targeting Release and AddRef. Adding 'final' versions of those macros is Bug 1446509 - once that's done I can go around sprinkling them; but I haven't tried tackling the macro problem yet.
Depends on: 1332680, 1446509
(In reply to Tom Ritter [:tjr] from comment #4)
> Bug 1332680 has a bunch of classes and methods that can get 'final' keywords
> without refactoring.
> 
> The biggest wins from that right now are targeting Release and AddRef.
> Adding 'final' versions of those macros is Bug 1446509 - once that's done I
> can go around sprinkling them; but I haven't tried tackling the macro
> problem yet.

Does that actually reduce the size of vtables (i.e. those methods are no longer in the vtable), or does it just mean that particular calls can be made non-virtual?
You will get vtable size reduction (with GCC) only if devirtualization actually renders the vtable dead.  It happens only if GCC manages to inline enough to see all the way from ctor vtable store to the end of lifetime of the instance.

Changing vtable layout on the compiler side is pretty hard because it is difficult to track what can happen outside of the current translation unit (even with LTO the program is not complete because of shared libs). Could be done for anonymous namespaces.
(In reply to Jan Hubicka from comment #6)
> You will get vtable size reduction (with GCC) only if devirtualization
> actually renders the vtable dead.  It happens only if GCC manages to inline
> enough to see all the way from ctor vtable store to the end of lifetime of
> the instance.
> 
> Changing vtable layout on the compiler side is pretty hard because it is
> difficult to track what can happen outside of the current translation unit
> (even with LTO the program is not complete because of shared libs). Could be
> done for anonymous namespaces.

Both of these sound pretty scary, because they'd totally violate the assumptions of XPConnect/xptcall.  At least they're very unlikely to happen...
clang has a -Wweak-vtables warning for classes that have no out-of-line virtual functions (e.g. classes defined only in header files) and thus need a vtable in every translation unit in which they are used. I don't know if LTO already consolidates the duplicate vtables.

https://clang.llvm.org/docs/DiagnosticsReference.html#wweak-vtables
For GCC (and almost surely for clang too) such vtables will end up in a COMDAT section and will be merged by the linker (at least on targets that support COMDAT merging, which includes all ELF targets, macOS and Windows).

So all you waste in this case is object file size.  With LTO the vtable will also land in every IL object file that may possibly use it and will be merged at link time.
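One way to check this on a given object file is to count vtable symbols by binding. The sketch below assumes that vtables emitted in every translation unit show up as WEAK symbols in `readelf -sW` output, which is what GCC and clang typically do on ELF but has not been verified against this build. The function name and the `Foo.o` path are hypothetical.

```shell
# Count vtable symbols per binding (WEAK, GLOBAL, LOCAL) in `readelf -sW`
# output read from stdin. Field 5 is the binding, field 8 the mangled name.
vtable_bindings() {
  awk '$8 ~ /^_ZTV/ { n[$5]++ } END { for (b in n) print b, n[b] }' | sort
}

# Usage (path is hypothetical):
#   readelf -sW Foo.o | vtable_bindings
```

A high WEAK count across many object files would point at object file bloat only, since the linker keeps one copy of each such vtable in the final library.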

Concerning comment #7, I do not understand what is scary here, perhaps because I do not know what XPConnect/xptcall is.
Depends on: 1451363
Priority: -- → P3
Whiteboard: [overhead:2MB]
We can probably eliminate the need for this with bug 1470591.
It's still desirable to reduce the amount of vtables we have, whether or not that data is shared across processes.
Severity: normal → S3