Need table driven implementation of |QueryInterface|




18 years ago
13 years ago


(Reporter: Scott Collins, Assigned: shaver)




Firefox Tracking Flags

(Not tracked)



(2 attachments)



18 years ago
We think we can save significant code-space by implementing |QueryInterface| with
a single table-driven implementation.


18 years ago
Target Milestone: M13


18 years ago
Priority: P3 → P1


18 years ago
Target Milestone: M13 → M14

Comment 1

18 years ago
Moving out to M14 per scc's email and since he is on vacation.

Comment 2

18 years ago
Changing to M15 as per discussion with dp
Target Milestone: M14 → M15

Comment 3

18 years ago
I need to do a space calculation to see exactly what savings we're talking about 
here.  Then we can better prioritize this against other performance/size tasks.
Target Milestone: M15 → M20

Comment 4

17 years ago
mass re-assigning to my new bugzilla account
Assignee: scc → scc


17 years ago

Comment 5

17 years ago
dp is no longer changing qa contact to default for this product
QA Contact: dp → kandrot
I'm not sure what you meant by a single table-driven implementation.  However, I
made an attempt to do something that seems table-driven to me.  (I wouldn't call
it single since it still does the work in a macro, although it could, I think,
be moved to a function.)  Anyway, it generated slightly larger code on both
optimized and debug builds with gcc 2.91.66.

But it's probably not what you were thinking of...
Created attachment 21255 [details] [diff] [review]
my attempt...
Er, ignore that patch.  I'd have thought testing it on the entire docshell
directory would catch really stupid mistakes, but it obviously didn't.  jag did.
Created attachment 21262 [details] [diff] [review]
a version of the previous patch that actually seems to work

Comment 10

17 years ago
If the table-driven implementation is in (or whatever), wouldn't
that add a link-time requirement?

Is there a way to define a function in a header file and have it folded in a
single instance of it in the object code when linked together (each object file
has an instance of the compiled code, but only one is kept when linking occurs)?

Also, some speed improvement could be had by having the table be really smart
and actually put the entries into a hash table (the first word of a UUID makes a
good hash value!). I had thought of doing this in XPLC
(, but I have a easier time, as my
AddRef/Release/QueryInterface implementation is provided through a mix-in
template instead of a template, so I can easily have a constructor for the
template to convert the table parameter into a static hash table.

Theoretically I think a good enough compiler could optimize this kind of stuff
at compile time, but I don't think that exist yet...

Comment 11

17 years ago
People who care: do we still want this?  Consideration over the past year or so
makes me wonder if it's worth the effort.  Thoughts?
Target Milestone: --- → Future
I think it's worth the effort, even where that effort involves providing a way
for people to do this without a linkage dependency.  Being able to share this
data with the classinfo stuff (see 67699) is probably going going to be a
biggish space win as nsIClassInfo starts to spread, kudzu-like, through the tree.

scc: I'll take this bug, because of the synergy with 67699, but you should feel
free to snatch it back if you want.
Assignee: scc → shaver
So it's a space loss on Linux because g++'s inlining of class-static methods
returning method-static constants loses.  I've got it working in my tree, and
I'm working on a way to trick g++ into not sucking as hard.

I could attach my patch to see if it helps on the Mac, where the compiler is
from heaven.  If anyone's interested, let me know.  (It requires chunks of my
patch from 67699, too, sorry.)

This is a 0.9 thing, I think -- ain't nobody gonna let me land it before 0.8.1,
even if I fixed the inlining thing.
Now that we have better compilers and more places using classinfo, I'm going to
try this again.
Priority: P1 → --
Target Milestone: Future → ---
Unless someone can provide compelling footprint/perf reasons, this is not a 1.x
Target Milestone: --- → Future

Comment 16

15 years ago
so I spent a little time with this again... my approach involved externalizing
the the loop which does the QI.. (i.e. so its shared between all classes, as
NS_TableQueryInterface()) but I can't seem to make any interface table
initialize. I'm still poking at it. does anyone who poked at this remember any
particular problem they had getting their table to initialize? 

I have this:
struct nsInterfaceEntry {
  nsIID mIID;
  PRUint32 mOffset;

and then even manual table descriptions like:

const nsInterfaceEntry foo[] = {
  { NS_GET_IID(nsIGenericFactory), 0 }


I'm hitting errors in VC++ like "cannot convert from 'const struct nsID' to
'const struct nsInterfaceEntry' when clearly I have the {} in the right places..
I've fiddled with const, references (nsIID&) and I just can't make it work!

I'm going to try compiling the above patch next.
Now that we are using gcc3.x for default builds which generates better code than
the 2.9x series, I think this is worth revisiting.
Keywords: footprint

Comment 18

13 years ago
so i've played with this a little bit (like those brave souls before me...) with
gcc 3.4 on linux. i started off basically with dbaron's patch, and made a few
interesting observations which shaver summed up already in comment 13.

1) when storing the IIDs in a table via a reference (REFNSIID), gcc stores the
IIDs as separate static constants, and generates code (which lives in QI) to
initialize the table with the pointers to the nsIID's at runtime. the efficiency
of static data storage is okay with this approach (we just pay overhead for one
pointer per IID), but the runtime init code sucks. (for both space and perf - it
requires a comparison and jump at runtime, to check the table's inited for each
QI call.)

2) when attempting to store the IIDs inline in the table (raw nsIID), to avoid
the runtime init code, gcc isn't able to handle it well. it sets aside 16 bytes
(of |zero|) for each IID in the table, but still stores each IID separately and
generates runtime init code for them. like shaver, i've been unable to trick gcc
into seeing through the static |classname->GetIID()| fncall, to insert the
|static const nsIID| directly - that function call indirection confuses it.

if we can find a way to get this to work, my impression is that it'd be well
worth it - we're looking at decent codesize savings, no perf impact (or possibly
an improvement), "classinfo for cheap", and it opens the way for other tricks
(like alecf's single NS_TableQueryInterface() per library - more space savings).

Comment 19

13 years ago
i have an idea i'm going to try... if we can get a |#define| or a |static const
nsIID| variable in the .h's generated by xpidlgen, which holds the nsIID value
directly and whose name contains the classname, then we can alter the
NS_IMPL_QUERY_BODY(_interface) macro to reference this directly. we can use the
C preprocessor's concatenation function to make a name (like NS_nsIFoo_IID) out
of the _classname (nsIFoo), which can't be done with the current naming scheme
used by xpidlgen (NS_IFOO_IID). (i've tried to do it by having xpidlgen declare
a |static const nsIID _iid = {...};| inside the interface class, but C++ doesn't
allow static initializers inside class declarations afaict.)

this will be a pain to implement, given the number of interface definition .h's
in the tree that aren't generated by xpidlgen, but it'll give us the best end
result possible. thoughts?

Comment 20

13 years ago
I remember hacking at this quite a bit.

The problem I kept running into was that in order to get the array to be 'const'
enough, you needed different macros, as Dan mentions. I think you did need to
end up having declarations like (I may have the pointer/const stuff wrong, but
fiddle around with some hand written tables before trying to make macros work)

struct nsInterfaceEntry {
  nsIID mIID;
  PRUint32 mOffset;

static const nsIID& ifoo = { ... };
static const nsIID& ibar = { ....};

static const nsInterfaceEntry[] interface_list = { 
{ ifoo, ifoo_offset},
{ ibar, ibar_offset}

basically, you couldn't declare them completely inline.. and because you
basically have to make use of the IID twice (once to declare the IID constant,
and once in the array) it would make macros pretty hard to write.

look in Essential COM by Don Box for some handwritten code that does what we
want - you'll see how hard it is to translate it into macros..

Comment 21

13 years ago
oh and be careful with static const in .h files - I think that means you end up
with a private, unreferenced static global in each file that #includes it. That
would make a lot of bloat on any platform other than Win32

Comment 22

13 years ago
The lazy way is just to store the pointers to nsCOMTypeInfo<type>::GetIID in the
table. There's no problem initializing those, even in full debug.

Comment 23

13 years ago
Just FYI... I used to work on XPCOM, a long time ago, and spun out to work on
XPLC (there was a demand for an "XPCOM Lite", back in the days, and XPLC is
enjoying some success on its own).

BTW, I'm also one of those who first reported possibility of bloat related to
UUID storage explosion, but if I recall, the problem that we had was that GCC
did not handle the static variable in the function properly, and that each
compilation unit would contain all the included IIDs, whether they were used or
not. There was also another method at the time which involved an NS_DEFINE_IID
macro, which would also take space even if not used, if I am not mistaken. Turns
out that having a copy of an IID in every single compilation unit THAT ACTUALLY
USES IT (instead of just "that includes it") isn't so bad.

It's actually a bit safer, in case of header disparity between different
compilation units (which ought to never happen, but...).
gcc has improved a lot since then.

Comment 25

13 years ago
i finished adding the extra IID #defines treewide, and surprisingly the codesize
savings are only 7.5k (out of 12.1M). i'm not sure why the savings are so small
here... perhaps the linker previously was culling off duplicate IID definitions
within libraries, which it can no longer do if they're integrated into separate
tables. i'll have to look into this more, but for now, this really is looking
like a dead end to me. an external NS_TableQueryInterface() might buy us
something, but my gut feeling is that it'll be small potatoes, given the small
size of the generated assembly that actually does the loop & comparison.

(perf impact of the fully-static-table situation was zero - same as our existing
Based on dwitte's recent findings, WONTFIX.
Last Resolved: 13 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.