Open
Bug 250107
Opened 20 years ago
Updated 2 years ago
inline nsQueryInterface operator () ... and more
Categories
(Core :: XPCOM, defect)
Tracking
()
NEW
People
(Reporter: bernard.alleysson, Unassigned)
Details
(Keywords: perf)
Attachments
(1 file, 7 obsolete files)
54.09 KB,
patch
|
Details | Diff | Splinter Review |
nsQueryInterface doesn't inherit from nsCOMPtr_helper and thus nsQueryInterface operator() is not virtual and can be inlined. Because it is not inlined at -O1 and there's only 2 call sites (only one gets compiled actually), operator() is simply removed and the original call is inlined by hand. It will completely the operator() call overhead noticed in bug 249652. Same applies to nsQueryInterfaceWithError operator (). I only tested this with MSVC 7.1. See the patch for more information. Note that assign_assuming_AddRef() calls in nscomptr.cpp are not inlined at -O1. They could be replaced by some macros to force inlining. Note that the code can be simplified if it is sure that a NULL pointer is returned by QI when QI fails (see comment in the patch). Note that other operator () from classes inheriting from nsCOMPtr_helper could be simplified also by changing the signature to void operator() ( ... ) instead of nsresult operator () (...) and ensuring that the returned pointer is NULL on error (just forget about the result of QI)
Reporter | ||
Comment 1•20 years ago
|
||
Reporter | ||
Comment 2•20 years ago
|
||
assign_assuming_addref() is meant to be inlined because it is defined inline with the class declaration but it is not inlined with MSVC 7.1 -O1 build, force inlining with the macro NSCAP_ASSIGN_ASSUMING_ADDREF With this patch, I have verified that all code of nscomptr.cpp is inlined, no more call overhead for operator() and assign_assuming_addref(). It saves a few cycles for each of nscomptr.cpp methods and they are called quite a lot...
Reporter | ||
Comment 3•20 years ago
|
||
Make all nsCOMPtr_helper return "void" instead of "nsresult" and insure that they set the answer to NULL in case of error (most of them did that already). Before : 2 checks, one in assign_from_helper(), one in operator(). After: only one check, in operator(). With the removing of the return value + the check saved, it saves a few cycles of every operator() call, and there are quite a lot...
Reporter | ||
Comment 4•20 years ago
|
||
It is the final patch. The standard nsISupports implementation returns NULL on error, so I guess that it is the contract. Removed the checks for failure in assign_from_qi() and assign_from_qi_with_error(). Also changed some QueryElementAt implementation to match nsISupports and removed checks from the operators () using them. I'm setting the perf keyword because it touches some top functions.
Reporter | ||
Updated•20 years ago
|
Keywords: perf
Summary: inline nsQueryInterface operator () → inline nsQueryInterface operator () ... and more
Reporter | ||
Comment 5•20 years ago
|
||
Sorry, I've changed the signatures (again): "virtual void operator()( const nsIID& aIID, void** aInstancePtr) const;" to "virtual void* operator()( const nsIID& aIID ) const;" because the return value is *aInstancePtr and "void assign_from_qi_with_error( nsISupports *rawPtr, nsresult *errorPtr, const nsIID& iid )" to "nsresult assign_from_qi_with_error( nsISupports *rawPtr, const nsIID& iid )" because the return value is nsresult The conversion was pretty straightforward. The assembly code is much better now. Less parameters on the stack, eax is used for the return value, ecx for "this", and hopefully with the __fastcall stuff edx will hold &aIID. It will be just perfect. I think that the next step would be to remove more operator() virtualness but it would require to add headers (nsIWeakreference.h and stuff) to nsComptr.h and/or to move code around. nsQueryInterface also looks quite a bit useless now (remove it ?).
Can you really change the signature of nsCOMPtr_helper? Won't that break plugins?
The reason I didn't inline nsQueryInterface::operator() right away was that I wasn't sure if we ever had NSCAP_FEATURE_USE_BASE undefined. In that case nsQueryInterface::operator() is called from the function nsCOMPtr<T>::assign_from_qi( const nsQueryInterface qi, const nsIID& aIID ) which is potentially compiled once per .cpp file, so an inlined nsQueryInterface::operator() could increase the binsize a lot. However, there are a lot of 'if's in the above (which is why I didn't deal with this in the first place). We might define NSCAP_FEATURE_USE_BASE in all builds that we care about. And even when it's undefined the linker might remove all redundant functions.
Reporter | ||
Comment 8•20 years ago
|
||
(In reply to comment #7) NSCAP_FEATURE_USE_BASE is undefined when NS_DEBUG is defined and otherwise it is explitcitly set on top of nsCOMPtr.h, so I guess that NSCAP_FEATURE_USE_BASE is only undefined with debug builds. With my patch, it adds 190 Ko of code (didn't try without). > The reason I didn't inline nsQueryInterface::operator() right away was that I > wasn't sure if we ever had NSCAP_FEATURE_USE_BASE undefined. In that case > nsQueryInterface::operator() is called from the function > nsCOMPtr<T>::assign_from_qi( const nsQueryInterface qi, const nsIID& aIID ) > which is potentially compiled once per .cpp file, so an inlined > nsQueryInterface::operator() could increase the binsize a lot. I can add a macro, same thing I've done with assign_assuming_addref() (it isn't inlined when FEATURE_USE_BASE is undefined, but is always inlined in nscomptr.cpp). >Can you really change the signature of nsCOMPtr_helper? Won't that break >plugins? I don't know. Plugins should link with a static glue library. I thought all plugins would go through xpt calls and the static glue would contain static implementation of ncCOMPtr_helper methods, thus no need to link against xpcom.dll. Maybe I could keep the old signatures but only use the new ones (be sure they are not removed by the linker). I'll test some plugins.
Reporter | ||
Comment 9•20 years ago
|
||
I added some macros and now with a build without NSCAP_USE_BASE I'm at +180 Ko instead of +190 Ko. I tried different plugins (flash, java, quicktime, wmp, adobe pdf). They all work fine. The ActiveX plugin is broken because it links against xpcom.dll (import a lot of stuff) but I've rebuilt it and used successfully. But it is even better ! I've made the constructors "nsCOMPtr( const nsQueryInterface qi )" and "nsCOMPtr( const nsQueryInterfaceWithError qi )" inline (see patch v1). For instance the first one is just now: nsCOMPtr( const nsQueryInterface qi ) : nsCOMPtr_base(qi.get()) // assign from |do_QueryInterface(expr)| { if (mRawPtr) mRawPtr->QueryInterface(NS_GET_IID(nsISupports), NS_REINTERPRET_CAST(void**, &mRawPtr)); } it is very efficient, it is now much better to write: nsCOMPtr<nsIXXXXXX> ptr(do_QueryInterface(other)); // optimized, inline nsCOMPtr<nsIXXXXXX> ptr(do_QueryInterface(other, &rv)); // optimized, inline than nsCOMPtr<nsIXXXXXX> ptr = do_QueryInterface(other); // not optimized, not inline nsCOMPtr<nsIXXXXXX> ptr = do_QueryInterface(other, &rv); // not optimized, not inline because we don't have to check the old pointer and release it if it's not NULL and because it is totally inline. The generated code is the same you would write with raw COM pointers. No more assign_from_qi() overhead. There's no code size increase (I've done my best to avoid it at all cost). There's even a small footprint win (a few Ko). I've also rewritten nsGetInterface operator() to use raw pointers, because even with the optimizations above, the ~nsCOMPtr() was not inlined. It is much better now. Now I am looking into removing nsGetInterface operator() virtual call (I don't think I can inline do_GetInterface() because it is a bit large). And there's also a lot of cleanup to do to remove bad nsCOMPtr construction code, patterns like "ptr = do_QueryInterface" to replace with "ptr(do_QueryInterface)") I've also run browser buster, I went to all my favorites sites, used print preview : no problem at all, it works for me
Reporter | ||
Updated•20 years ago
|
Attachment #152476 -
Attachment is obsolete: true
Attachment #152482 -
Attachment is obsolete: true
Attachment #152491 -
Attachment is obsolete: true
Attachment #152507 -
Attachment is obsolete: true
Attachment #152522 -
Attachment is obsolete: true
Comment 10•20 years ago
|
||
(In reply to comment #9) > nsCOMPtr<nsIXXXXXX> ptr(do_QueryInterface(other)); // optimized, inline > nsCOMPtr<nsIXXXXXX> ptr = do_QueryInterface(other); // not optimized, not hm, do compilers really not optimize the second to be exactly the same as the first?
I'm still nervous about changing the nsCOMPtr_helper signature. I guess it might
work since nsCOMPtr and nsCOMPtr_base are the only callers of nsCOMPtr_helper
and it lives in the gulecode which is linked into all plugins. It would be nice
if someone that knows this code better then me could verify this though
(bsmedberg, dbaron, bryner, scc). Triel and error in nsCOMPtr makes me nervous.
> nsCOMPtr<nsIXXXXXX> ptr(do_QueryInterface(other)); // optimized, inline
> nsCOMPtr<nsIXXXXXX> ptr = do_QueryInterface(other); // not optimized, not
I too would think that these compile to the same thing. In fact, the C++
specification says that these mean the same thing as long as nsCOMPtr<T> has a
ctor that takes a whatever-do_QueryInterface-returns. It shouldn't even need to
rely on the optimizer. This is a behaviour inherited from C language.
|int i = 5| does not mean "create i and then assign it 5". It means "create an i
with the value of 5". Same thing goes for classes.
There are articles online but i can't remember the site off the top of my head.
Reporter | ||
Comment 12•20 years ago
|
||
I've just made some breakpoints randomly (xul), and I can see that you're right. I spoke too fast. But it's worse, in that file the contructor is not inlined but it's still 2 lines of code and 30 bytes ! I guess I will have to check all this again later. I worked on a bit of code (nsGetInterface operator()) and I tried to optimize it. I swear it was inlined there ! Maybe the compiler (msvc7.1) can choose not to inline some template methods in one dll (layout), and inline in another dll (xpcom) ? Maybe it is smarter than I am :-) I'll check that later. I don't know if I can change nsCOMPtr_helper methods but I noticed that in bug 249652 NS_FASTCALL is applied to all methods. It is the same as changing parameters : the plugin will crash if it is linked dynamically with xpcom.dll. I think that it changes the name decoration so the symbols will not be found. Some input from people that knows about this code would be great ! I think that even if it's not as good as I thought it would be in the first place it's still an improvement.
Reporter | ||
Comment 13•20 years ago
|
||
The fact that the new nsCOMPtr contructors were inlined at some places and not inlined at other places was expected. Declaring a function inline just makes it a good candidate for inlining, nothing more. Then, given the constraint on code size (/O1), the compiler can choose to inline or not. It depends on the compiler. So I've just removed the inline code from the constructors for more safety and went back to make a function call instead. I've added detailed comments in the patch, I'm just summarizing the benefits of the patch: a) Remove some call overhead from all operator() by changing the declaration of all operator() and adjusting the implementation accordingly. Saves a few cycles on every operator() call. b) Inlined nsQueryInterface operator() and assign_assuming_addref(). No more call overhead from these functions at all. No function calls from nsCOMPtr.cpp anymore. It is done by using some macros. No code size increase when nsCOMPtr_base is not used because the common code is factored out of nsCOMPtr<> into a new nsCOMPtr_factor class (jonas request). c) New constructors from nsQueryInterface. Saves a few cycles on every constructor by providing specialized assign_from_qi function for this particular case. d) Faster do_GetInterface() by removing all overhead from nsCOMPtr<> in its implementation. Saves one pair of AddRef()/Release() calls for every use of do_GetInterface() (there are more than 400 occurences all over). code size numbers: without the patch: 12984832 with the patch: 12977152 (-7.5 Kb) Most of savings (6 Kb) come from d) because removing nsCOMPtr<> remove some hidden inlines constructor/destructor call from nsGetInterface class. Other changes account for 1.5 Kb only. when nsCOMPtr_base is not used: without the patch: 13178368 with the patch: 13197312 (+18.5 Kb) Code size increased because of c) because it introduced 2 new nsCOMPtr<> methods and these could not be factored out because they use mRawPtr member. Feel free to take a look at the patch. Any comments are welcome. I don't know if I have to go for reviews now. I think I would need someone to test this on linux/mac before. I can also provide smaller patches based on a) b) c) or d) if you're only insterested in part of it.
Reporter | ||
Updated•20 years ago
|
Attachment #152683 -
Attachment is obsolete: true
Reporter | ||
Comment 14•20 years ago
|
||
Comment on attachment 153142 [details] [diff] [review] updated with detailed comments I'd like to know what reviewers think of it. I'm running with the patch, tested the plugins, run browser buster. No problem.
Attachment #153142 -
Flags: review?(dougt)
Updated•20 years ago
|
Attachment #153142 -
Flags: review?(dougt) → review?(scc)
Reporter | ||
Comment 15•20 years ago
|
||
Attachment #153142 -
Attachment is obsolete: true
Reporter | ||
Updated•20 years ago
|
Attachment #153142 -
Flags: review?(scc)
Reporter | ||
Updated•20 years ago
|
Attachment #153780 -
Flags: review?(scc)
Updated•18 years ago
|
Assignee: dougt → nobody
QA Contact: xpcom
Comment 16•17 years ago
|
||
This this something we want to resurrect?
Comment 17•5 years ago
|
||
Seems like this might still be useful. Presumably we want to at least make sure nsQueryInterfaceISupports::operator()
gets inlined into its two callsites?
Flags: needinfo?(nfroyd)
Comment 18•5 years ago
|
||
(In reply to Boris Zbarsky [:bzbarsky, bz on IRC] from comment #17)
Seems like this might still be useful. Presumably we want to at least make sure
nsQueryInterfaceISupports::operator()
gets inlined into its two callsites?
I think it already does, thanks to LTO? readelf -sW $NIGHTLY/libxul.so |grep nsQueryInterface
doesn't seem to turn up anything related to this bug.
Flags: needinfo?(nfroyd)
Updated•2 years ago
|
Severity: normal → S3
You need to log in
before you can comment on or make changes to this bug.
Description
•