Created attachment 639827
use MOZ_ALWAYS_INLINE

Comparing gcc 4.2 performance with clang, I found that forcing the inlining of these functions improves clang's performance in dromaeo. Looking at it a bit, the reasons seem to be, in addition to the call overhead:

* rv becomes a register. These two functions take a pointer to rv and the callers have it on the stack. Once they are inlined, the compiler can simplify things like
    if (foo) *rv = bar;
  to a simple phi and delete the BB.
* ~XPCLazyCallContext has:
    if (mCcxToDestroy) mCcxToDestroy->~XPCCallContext();
  By inlining, the compiler realizes that nothing is setting mCcxToDestroy and removes the check and the call.

This also seems to help other compilers that are more conservative than gcc 4.2.

Talos is still running, but the comparison is at bit.ly/Ps8qxz
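To illustrate the two effects described above, here is a minimal sketch, not the code from the patch. Every name in it (SKETCH_ALWAYS_INLINE, sketch_unwrap, SketchCallContext, SketchLazyContext, sketch_caller) is hypothetical, and the macro is only an assumed GCC/clang-style spelling of what MOZ_ALWAYS_INLINE is meant to achieve. Whether a given compiler actually performs these simplifications depends on the compiler and optimization level.

```cpp
// Hypothetical stand-in for MOZ_ALWAYS_INLINE (GCC/clang spelling only;
// other compilers just get a plain inline hint in this sketch).
#if defined(__GNUC__) || defined(__clang__)
#  define SKETCH_ALWAYS_INLINE __attribute__((always_inline)) inline
#else
#  define SKETCH_ALWAYS_INLINE inline
#endif

typedef unsigned int sketch_result;  // hypothetical stand-in for an nsresult-like code

// Pattern 1: a helper that reports status through an out-pointer.
// If this is not inlined, the caller's rv must live in memory so its
// address can be passed. Once inlined, the stores to *rv can become
// plain values that the optimizer is free to keep in a register.
SKETCH_ALWAYS_INLINE bool sketch_unwrap(int value, sketch_result* rv) {
  if (value < 0) {
    *rv = 1;  // failure code
    return false;
  }
  *rv = 0;    // success code
  return true;
}

// Pattern 2: a destructor guarded by a member that this caller never sets.
struct SketchCallContext {
  static int destroyed;  // counts explicit destructor calls
  ~SketchCallContext() { ++destroyed; }
};
int SketchCallContext::destroyed = 0;

struct SketchLazyContext {
  SketchCallContext* mCcxToDestroy;
  SketchLazyContext() : mCcxToDestroy(nullptr) {}
  // Once inlined into a caller that never assigns mCcxToDestroy, the
  // optimizer can see the pointer is still null and drop both the
  // check and the call.
  SKETCH_ALWAYS_INLINE ~SketchLazyContext() {
    if (mCcxToDestroy)
      mCcxToDestroy->~SketchCallContext();
  }
};

// A caller shaped like a quickstub: rv and the lazy context are locals.
int sketch_caller(int value) {
  SketchLazyContext lccx;  // nothing sets lccx.mCcxToDestroy
  sketch_result rv;
  if (!sketch_unwrap(value, &rv))
    return -static_cast<int>(rv);
  return value * 2;
}
```

Comparing the assembly for sketch_caller with and without the always_inline attribute (for example with -O2 -S) is one way to check what a particular compiler does with these two patterns.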
Forgot to add: compiling the same preprocessed dom_quickstubs.cpp (so no clang-only features like final) with and without this patch, with gcc 4.2 and clang 159509, the sizes are:

-rw-r--r-- 1 espindola staff 1197428 6 Jul 18:23 dom_quickstubs-clang-fast.o
-rw-r--r-- 1 espindola staff 1156736 6 Jul 18:23 dom_quickstubs-clang-master.o
-rw-r--r-- 1 espindola staff 1316792 6 Jul 18:22 dom_quickstubs-gcc-fast.o
-rw-r--r-- 1 espindola staff 1316832 6 Jul 18:22 dom_quickstubs-gcc-master.o

Note how the size with gcc 4.2 decreases with this patch. I guess it was already inlining everything anyway and does it earlier with this patch.
Comment on attachment 639827
use MOZ_ALWAYS_INLINE

This looks very sane to me, but I would also like us to have Boris' take on this.
Comment on attachment 639827
use MOZ_ALWAYS_INLINE

r=me. I can totally see how not inlining these would have a big effect! We should take a good look at our new DOM bindings and what clang actually generates for them codewise...
FWIW, I can definitely attest to these functions being hot. See bug 622301 comment 15.
Rafael, it would be really interesting if you could do another try push to compare the perf numbers with this fix applied. Thanks!
I did: http://bit.ly/MgFlAl

traverse.html is fixed; I am now looking at modify.html. The problem there seems to be in isalloc_validate.