Last Comment Bug 771266 - Inline nsCOMPtr_base's destructor
: Inline nsCOMPtr_base's destructor
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: XPCOM (show other bugs)
: unspecified
: x86 Mac OS X
: -- normal (vote)
: mozilla16
Assigned To: :Ehsan Akhgari (busy, don't ask for review please)
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2012-07-05 12:11 PDT by :Ehsan Akhgari (busy, don't ask for review please)
Modified: 2012-07-05 18:04 PDT (History)
6 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Patch (v1) (1.31 KB, patch)
2012-07-05 12:13 PDT, :Ehsan Akhgari (busy, don't ask for review please)
justin.lebar+bug: review+
Details | Diff | Review

Description :Ehsan Akhgari (busy, don't ask for review please) 2012-07-05 12:11:50 PDT
Today we were looking at some profiles, and we saw that ~nsCOMPtr_base() is not inlined.  It should probably be declared as such.  :-)
Comment 1 :Ehsan Akhgari (busy, don't ask for review please) 2012-07-05 12:13:52 PDT
Created attachment 639425 [details] [diff] [review]
Patch (v1)
Comment 2 :Ehsan Akhgari (busy, don't ask for review please) 2012-07-05 13:10:00 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/b84af5d0c13c
Comment 3 :Ehsan Akhgari (busy, don't ask for review please) 2012-07-05 13:27:54 PDT
Also, https://hg.mozilla.org/integration/mozilla-inbound/rev/34c8a2369cbc, with irc-review from Neil.
Comment 4 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-07-05 14:39:36 PDT
For the record, the gcc produced body in a release build is:

 125fd6a:       55                      push   %rbp
 125fd6b:       48 8b 3f                mov    (%rdi),%rdi
 125fd6e:       48 89 e5                mov    %rsp,%rbp
 125fd71:       48 85 ff                test   %rdi,%rdi
 125fd74:       74 0a                   je     125fd80 <_ZN13nsCOMPtr_baseD1Ev+0x16>
 125fd76:       48 8b 07                mov    (%rdi),%rax
 125fd79:       48 8b 40 10             mov    0x10(%rax),%rax
 125fd7d:       c9                      leaveq 
 125fd7e:       ff e0                   jmpq   *%rax
 125fd80:       c9                      leaveq 
 125fd81:       c3                      retq 

and we have 21717 calls to it in XUL :-)
Comment 5 Justin Lebar (not reading bugmail) 2012-07-05 14:52:52 PDT
Wow, I should not have r+ed this so quickly!
Comment 6 :Ehsan Akhgari (busy, don't ask for review please) 2012-07-05 14:56:35 PDT
(In reply to comment #5)
> Wow, I should not have r+ed this so quickly!

If you're worried about the code size, I'll watch it and backout if necessary.
Comment 7 Justin Lebar (not reading bugmail) 2012-07-05 14:58:34 PDT
Sounds good.  I'm also worried about icache locality, but that's obviously a trade-off, and we'll see how our benchmarks are affected.
Comment 8 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-07-05 15:05:11 PDT
(In reply to Ehsan Akhgari [:ehsan] from comment #3)
> Also, https://hg.mozilla.org/integration/mozilla-inbound/rev/34c8a2369cbc,
> with irc-review from Neil.

Why?
Comment 9 :Ehsan Akhgari (busy, don't ask for review please) 2012-07-05 15:08:45 PDT
(In reply to comment #8)
> (In reply to Ehsan Akhgari [:ehsan] from comment #3)
> > Also, https://hg.mozilla.org/integration/mozilla-inbound/rev/34c8a2369cbc,
> > with irc-review from Neil.
> 
> Why?

See the bug that Jeff filed.
Comment 10 Justin Lebar (not reading bugmail) 2012-07-05 15:15:28 PDT
> See the bug that Jeff filed.

Bug 771317.
Comment 11 :Ehsan Akhgari (busy, don't ask for review please) 2012-07-05 15:21:45 PDT
(In reply to Ehsan Akhgari [:ehsan] from comment #3)
> Also, https://hg.mozilla.org/integration/mozilla-inbound/rev/34c8a2369cbc,
> with irc-review from Neil.

Backed out: https://hg.mozilla.org/integration/mozilla-inbound/rev/57ae1fdbcfc5
Comment 12 :Ehsan Akhgari (busy, don't ask for review please) 2012-07-05 15:50:41 PDT
Codesighs on Mac:

before: 42432300
after: 42531300
difference: 99000

For Linux32 non-PGO, seems like we have not gathered enough info on the graph server, and the difference is sort of lost.  But this shouldn't make a difference on Windows and Linux, since we do PGO there and the compiler is supposed to be smart enough to inline these calls...
Comment 13 Joshua Cranmer [:jcranmer] 2012-07-05 15:54:14 PDT
(In reply to Rafael Ávila de Espíndola (:espindola) from comment #4)
> For the record, the gcc produced body in a release build is:
> 
>  125fd6a:       55                      push   %rbp
>  125fd6b:       48 8b 3f                mov    (%rdi),%rdi
>  125fd6e:       48 89 e5                mov    %rsp,%rbp
>  125fd71:       48 85 ff                test   %rdi,%rdi
>  125fd74:       74 0a                   je     125fd80
> <_ZN13nsCOMPtr_baseD1Ev+0x16>
>  125fd76:       48 8b 07                mov    (%rdi),%rax
>  125fd79:       48 8b 40 10             mov    0x10(%rax),%rax
>  125fd7d:       c9                      leaveq 
>  125fd7e:       ff e0                   jmpq   *%rax
>  125fd80:       c9                      leaveq 
>  125fd81:       c3                      retq 
> 
> and we have 21717 calls to it in XUL :-)

Note that a lot of this is just from frame overhead. You should see in the inlined version just
mov (%rdi), %rdi
test %rdi, %rdi
je +N
mov (%rdi), %rax
mov 0x10(%rax),%rax
call *%rax

That's 17 bytes if I can count correctly, 9 bytes if the compiler can optimize array the null check, and 6 bytes or fewer if the vtable pointer can be reused, compared to a baseline of 5 bytes for the call itself.
Comment 15 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-07-05 18:04:56 PDT
> For Linux32 non-PGO, seems like we have not gathered enough info on the
> graph server, and the difference is sort of lost.  But this shouldn't make a
> difference on Windows and Linux, since we do PGO there and the compiler is
> supposed to be smart enough to inline these calls...

Note that linux pgo doesn't use LTO, so this should help there too (if there is a hot call).

Cheers,
Rafael

Note You need to log in before you can comment on or make changes to this bug.