Last Comment Bug 771683 - Mark some functions MOZ_ALWAYS_INILINE
: Mark some functions MOZ_ALWAYS_INILINE
Product: Core
Classification: Components
Component: XPConnect (show other bugs)
: unspecified
: x86 Mac OS X
-- normal (vote)
: mozilla16
Assigned To: Nobody; OK to take it and work on it
: Andrew Overholt [:overholt]
Depends on:
  Show dependency treegraph
Reported: 2012-07-06 15:20 PDT by Rafael Ávila de Espíndola (:espindola) (not reading bugmail)
Modified: 2012-07-09 09:39 PDT (History)
4 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---

use MOZ_ALWAYS_INLINE (1.16 KB, patch)
2012-07-06 15:20 PDT, Rafael Ávila de Espíndola (:espindola) (not reading bugmail)
ehsan: review+
bzbarsky: review+
Details | Diff | Splinter Review

Description User image Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-07-06 15:20:05 PDT
Created attachment 639827 [details] [diff] [review]

Comparing gcc 4.2 performance with clang I found out that forcing the inlining of these functions improves clang's performance in dromaeo. Looking at it a bit, the reasons seems to be in addition to the call overhead:

* rv becomes a register. These two functions take a pointer to rv an the callers have it on the stack. This allows the compiler to simplify things like

if (foo)
  *rv = bar;

to a simple phi and delete the BB once they are inlined.

* ~XPCLazyCallContext has:

        if (mCcxToDestroy)

By inlining the compiler realizes that nothing is setting mCcxToDestroy and removes the check an the call. These also seems to help other compiler that are more conservative that gcc 4.2. Talos is still running, but the comparison is at
Comment 1 User image Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-07-06 15:26:51 PDT
Forgot to add: compiling the same preprocessed dom_quickstubs.cpp (so no clang only features like final) with and without this patch, with gcc 4.2 and clang 159509 the sizes are:

-rw-r--r--  1 espindola  staff  1197428  6 Jul 18:23 dom_quickstubs-clang-fast.o
-rw-r--r--  1 espindola  staff  1156736  6 Jul 18:23 dom_quickstubs-clang-master.o
-rw-r--r--  1 espindola  staff  1316792  6 Jul 18:22 dom_quickstubs-gcc-fast.o
-rw-r--r--  1 espindola  staff  1316832  6 Jul 18:22 dom_quickstubs-gcc-master.o

Note how the size with gcc 4.2 decreases with this patch. I guess it was already inlining every anyway and does it earlier with this patch
Comment 2 User image :Ehsan Akhgari 2012-07-06 16:21:22 PDT
Comment on attachment 639827 [details] [diff] [review]

This looks very sane to me, but I would also like us to have Boris' take on this as well.
Comment 3 User image Boris Zbarsky [:bz] (still a bit busy) 2012-07-06 17:50:41 PDT
Comment on attachment 639827 [details] [diff] [review]

r=me.  I can totally see how not inlining these would have a big effect!

We should take a good look at our new DOM bindings and what clang actually generates for them codewise...
Comment 4 User image Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-07-06 18:19:40 PDT
Comment 5 User image Ryan VanderMeulen [:RyanVM] 2012-07-07 12:00:23 PDT
Comment 6 User image Bobby Holley (:bholley) (busy with Stylo) 2012-07-08 03:23:33 PDT
FWIW, I can definitely attest to these functions being hot. See bug 622301 comment 15.
Comment 7 User image :Ehsan Akhgari 2012-07-09 08:22:17 PDT
Rafael, it would be really interesting if you could do another try push to compare the perf numbers with this fix being in.  Thanks!
Comment 8 User image Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-07-09 09:39:16 PDT
I did:

traverse.html is fixed, I am looking at modify.html. It seems to be in isalloc_validate.

Note You need to log in before you can comment on or make changes to this bug.