Closed
Bug 227537
Opened 21 years ago
Closed 18 years ago
Make use of ELF visibility attribute
Categories
(Core :: XPCOM, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: bryner, Assigned: bryner)
Details
(Keywords: fixed1.7, perf)
Attachments
(7 files, 2 obsolete files)
20.81 KB,
patch
|
dbaron
:
review+
darin.moz
:
superreview+
|
Details | Diff | Splinter Review |
12.42 KB,
patch
|
dbaron
:
review+
dbaron
:
superreview+
|
Details | Diff | Splinter Review |
257.76 KB,
patch
|
Details | Diff | Splinter Review | |
2.35 KB,
text/plain
|
Details | |
179.78 KB,
patch
|
Details | Diff | Splinter Review | |
929 bytes,
patch
|
bryner
:
review+
blizzard
:
superreview+
asa
:
approval1.7+
|
Details | Diff | Splinter Review |
32.19 KB,
patch
|
dbaron
:
review+
dbaron
:
superreview+
|
Details | Diff | Splinter Review |
We have the potential for a fair speedup across the board on Linux (at least
6-7% by some rough measurements) by marking most of our functions and methods as
having hidden visibility, using __attribute__ ((visibility ("hidden"))). In
particular, we can do this for the following types of functions:
- All methods and functions which are not to be called from outside of the DSO.
- _All_ functions and methods in our XPCOM component libraries, because these
are only ever called from other DSOs via vtables. (with the special exception of
NSGetModule).
What does this buy us?
- You save a trip through the PLT for each call (one load and one jump)
- If a function does not access global variables and calls only functions marked
static or hidden, you save some instructions at the beginning of the function to
set up the PIC register (and you gain ebx as a general purpose register in that
function to boot).
Hidden visibility also applies to global data, though the savings isn't quite as
much.
So, now the question is how we could apply this to the code base with the
minimal amount of work. There are a few possibilities:
1. A macro, on each method declaration. It could be folded into NS_IMETHOD too,
but we'd want a version that doesn't declare virtual and stdcall.
2. Implement per-class visibility in gcc and use a macro on each class declaration.
3. Implement a global -fvisibility=hidden switch in gcc, and explicitly mark the
methods we want to _export_ (most of these are already annotated with NS_EXPORT
for win32).
Options 2 and 3 obviously don't buy us anything immediately and depend on
getting a gcc patch done and checked in and probably waiting for the next gcc
release that contains it. They do, however, make the Mozilla code less messy.
#2 isn't really that hard (I even have a rough patch, although visibility is a
bit hosed in general on the gcc trunk). #3 requires a lot more thought and we
could have a difficult time getting concensus on that (you have to come up with
a mechanism to know that all of your symbols from DSOs that you link with will
not be hidden).
Comment 1•21 years ago
|
||
We did something similar with NO_VTABLE. See:
http://lxr.mozilla.org/seamonkey/source/xpcom/base/nscore.h#159
Comment 2•21 years ago
|
||
while i agree that it would be very nice to avoid having to decorate every
function that we wish to make hidden, i'm not sure i can think of a better
solution either that does not require GCC changes.
>1. A macro, on each method declaration. It could be folded into NS_IMETHOD
>too,but we'd want a version that doesn't declare virtual and stdcall.
you threw me a bit with the "virtual and stdcall" ... on linux NS_IMETHOD is
declared like this:
#define NS_IMETHOD_(type) virtual type
#define NS_IMETHOD NS_IMETHOD_(nsresult)
are you saying that the "hidden" attribute does not work when added to a
function marked with the virtual keyword? i'm not sure i understand ;-)
at any rate, i was going to point out that it should be okay to decorate all of
our functions that currently use NS_IMETHOD as "hidden" since they do not have
__declspec(dllexport) on windows. in other words, if it is true that these
functions are already by default hidden on windows, then we know that it should
be okay to mark all of them hidden on linux. or, am i missing something?
Comment 3•21 years ago
|
||
one more comment: my thinking is that it is probably a reasonable, partial
solution to just do the NS_IMETHOD fixup, and then wait for GCC changes and/or
provide a NS_HIDDEN macro that can be selectively applied elsewhere in the tree.
that would at least get us some significant savings immediately, right?
> are you saying that the "hidden" attribute does not work when added to a
> function marked with the virtual keyword? i'm not sure i understand ;-)
I think he meant that we'd want to mark some functions that aren't virtual as
hidden. This would require an additional macro that could be used in places
where we now have no macro.
Comment 5•21 years ago
|
||
yeah, bryner confirmed that. i misunderstood.
> you save some instructions at the beginning of the function to set up the PIC
> register (and you gain ebx as a general purpose register in that function to
> boot).
I believe this happens only for "internal" visibility, "hidden" allows a
function to be called indirectly via a pointer so PIC setup is still necessary.
Assignee | ||
Comment 7•21 years ago
|
||
My tests seem to show that you can eliminate those instructions. With the
following testcase:
void func();
void func_hidden() __attribute__ ((visibility ("hidden")));
void call_func()
{
func();
}
void call_func_hidden()
{
func_hidden();
}
this assembly is generated (c++ -Os -fPIC -c -o vis.o vis.cpp):
00000000 <call_func()>:
0: 55 push %ebp
1: 89 e5 mov %esp,%ebp
3: 53 push %ebx
4: e8 fc ff ff ff call 5 <call_func()+0x5>
9: 81 c3 02 00 00 00 add $0x2,%ebx
f: e8 fc ff ff ff call 10 <call_func()+0x10>
14: 5b pop %ebx
15: 5d pop %ebp
16: c3 ret
17: 90 nop
00000018 <call_func_hidden()>:
18: 55 push %ebp
19: 89 e5 mov %esp,%ebp
1b: e8 fc ff ff ff call 1c <call_func_hidden()+0x4>
20: 5d pop %ebp
21: c3 ret
with gcc 3.3.2.
Assignee | ||
Comment 8•21 years ago
|
||
something like this...
Assignee | ||
Updated•21 years ago
|
Attachment #136923 -
Flags: superreview?(darin)
Attachment #136923 -
Flags: review?(dbaron)
> My tests seem to show that you can eliminate those instructions.
I was think of the hidden function (func_hidden in your example). It will
still probably load the GOT offset.
Gcc seems to be inconsistent in the example. In one case it loads the GOT even
though it doesn't have to. In the other it assumes that a function outside the
current file is well-behaved. I think there may be a gcc bug here. I wouldn't
be surprised if gcc changed its behavior in the next version. I also haven't
tried different optimizations to see if there's a difference.
I'm not objecting to this mind you. It's a good idea. You should go farther.
Have you considered regparm in certain situations?
BTW, I've been testing visibility in some projects of my own and I don't see
the speedup you do. Maybe mozilla's making too many function calls.
Assignee | ||
Comment 10•21 years ago
|
||
> In one case it loads the GOT even though it doesn't have to.
Sure it has to, if I link this object file into a DSO. Then the section of
interest becomes:
818: e8 f3 ff ff ff call 810 <__i686.get_pc_thunk.bx>
81d: 81 c3 57 12 00 00 add $0x1257,%ebx
823: e8 b4 fe ff ff call 6dc <_init+0x38>
with the last call being into a PLT stub:
6dc: ff a3 10 00 00 00 jmp *0x10(%ebx)
6e2: 68 08 00 00 00 push $0x8
6e7: e9 d0 ff ff ff jmp 6bc <_init+0x18>
So it's necessary to load the ebx register with the GOT address in this case,
since you go through the PLT (which you have to do for functions with default
visibility, since you have no idea which DSO it will be found in).
> In the other it assumes that a function outside the current file is well-behaved.
How do you mean?
> Have you considered regparm in certain situations?
Yes, as well as using stdcall (to reduce code bloat from many callers containing
the same code to pop arguments). I'm holding off on doing either of those for
NS_IMETHOD though because of binary compatibility, and we'll need another macro
to use it for internal things. Though it would make sense to do this for
NS_IMETHOD in cases where we don't care about binary compatibility.
Also, I should cite my reference for a number of the things I've talked about here:
http://people.redhat.com/~drepper/dsohowto.pdf
Comment 11•21 years ago
|
||
My take on this is:
I personally think -fvisibility is too coarse grained and that the default as it
is today is fine. Mark the functions which can be hidden as such. Using the
attribute for NS_IMETHOD is certainly no problem. Other places can change with
NS_HIDDEN.
As for the effects on code generation: it is actually more important to use the
visibility attributes correctly for global variable. They are otherwise a
complete mess.
If a function, compiled with -fpic/-fPIC, is not using any global data or calls
any non-static function, the PIC register isn't loaded. Now, with the
visibility attribute this can be extended. Even non-static function, declared
with hidden, can be called without the PIC register being loaded. And hidden
variable can be accessed without PIC as well since the addressing can be done
relative to the GOT or via PC-relative addressing.
On platforms where the caller is responsible for loading the PIC register it is
now possible to determine whether the destination is in the same DSO and avoid
even more PIC loads.
As for the "internal" visibility. It's not defined, I don't know where somebody
would get the idea it is useful for this purpose. Consider reading the paper
bryner cited.
Re regparm and stdcall. We use both in glibc, selectively though. Not all
internal functions are declared this way. This has mainly historical reasons.
In the past gcc had some issues with using regparm for functions with many
parameter. The code was correct, but slow. Today this should mostly be no issue.
Anyway: I'm all for adding the attributes in as many places as possible as soon
as possible. For functions with up to 3 parameters regparms can be a big help.
stdcall is not necessarily a win nowadays since the compiler generated code
which resuses stack frames. On modern processors
movl %eax, (%esp)
movl %ebx 4(%esp)
is faster than
pushl %ebx
pushl %eax
The stack frames are cleaned up in due time but can be reused if there is more
than one function call.
Assignee | ||
Comment 12•21 years ago
|
||
> And hidden variable can be accessed without PIC as well since the addressing
can > be done relative to the GOT or via PC-relative addressing.
Didn't we talk about this awhile back? I thought at that point you said that
x86 doesn't have a PC-relative addressing mode for data, so the PIC register
must always be loaded to access global variables. And aside from global
variables, I don't know what other sort of variable you'd want to declare with
hidden visibility.
Comment 13•21 years ago
|
||
> Didn't we talk about this awhile back? I thought at that point you said that
> x86 doesn't have a PC-relative addressing mode for data, so the PIC register
> must always be loaded to access global variables.
x86 has no PC-relative addressing, others do. This is why x86 uses addressing
relative to the GOT.
> And aside from global variables, I don't know wat other sort of variable you'd
> want to declare with hidden visibility.
Certainly only for global variable. What I said is that this is important since
adressing indirectly through an address in the GOT is more expensive than
accessing relative through the GOT address, and more so for register starved
machines. This is the code needed to access a global variable
movl foo@GOT(%ebx), %eax
movl (%eax), %eax
This is the code which can be used if foo is known to be hidden
movl foo@GOTOFF(%ebx), %eax
For amd64 the later becomes
movl foo(%rip), %eax
which you see does not need the PIC register. I.e., if a function does not
access non-automatic or non-thread-local variables and doesn't call global
functions, it does not need the PIC register loaded. If the knowledge about the
function called is left to the linker (i.e., the hidden attribute is not given)
then the compiler has to load the PIC register. Any access to a non-automatic
and non-thread-local variable will cause, on x86, the PIC register to be loaded
(well, the compiler might decide to use a different register than %ebx but the
loading has to happen). On processors with better instruction sets it is
usually not necessary to load the PID register when non-global variables are
accessed.
Now you see, telling the compiler about hidden variables is important since the
linker cannot do any optimization. With function which the linker knows are
local, one at least saves the PLT slot and the indirect jump. But optimal
results, including avoiding to load the PIC register, can only be achieved if
the compiler knows everything.
I'd prefer hidden to be the default in Mozilla.
1) More compatible with MSVC++
1b) Don't need to deal with both NS_HIDDEN and NS_EXPORT macros
2) Less likely to break something making a hidden function public than vice versa
3) Most functions and data will be hidden
Comment 15•21 years ago
|
||
Brian Ryner:
> So it's necessary to load the ebx register with the GOT address in this
> case
You're right, I wasn't thinking clearly.
>> In the other it assumes that a function outside the current file is
>> well-behaved.
> How do you mean?
I was wondering if a function is responsible for preserving ebx when it
doesn't have much knowledge of the function it's calling. If yes, then I
would think the calling function should preserve ebx, which gcc doesn't.
If no, then it's irrelevant.
Brian points out that it's probably not possible to make hidden the default
without massive changes. That's OK, I just want to get on the record as saying
that that's where we'd like to be in case someone ever gets a chance to evolve
things in that direction.
In the meantime I agree in favour of option #2, plus we'll need an NS_HIDDEN
macro that can be applied to global functions (and preferably also global
variables and methods of a non-hidden class). The NS_IMETHOD changes are OK but
hopefully only as a temporary hack.
Assignee | ||
Comment 17•21 years ago
|
||
I don't think that option #2 (per-class visibility) will be that difficult to do
correctly in gcc... I'm just waiting on some of the visibility stuff to
stabilize a bit on the gcc trunk. However, from my reading of the gcc checkin
policies, such a change would not be accepted for 3.4 at this point, which means
it would be probably 7 months (very optimistically) before we could make use of it.
In the meantime I'd like to go ahead and mark individual methods, even if we
have to mark all the methods in a class.
I'd rather not complexify our code, especially given that we've lived without
this optimization for many years.
Can't we just use our own patched gcc on appropriate tinderboxen/build machines?
Comment 19•21 years ago
|
||
This might be slightly offtopic to this bug, but I am now able to link the
entire gecko core (GRE) as a single static lib including xpcom (and optionally
NSPR, but I don't see how that is helpful); in that scenario, linking our
non-frozen string impl (nsString etc) as "hidden" would be very helpful to
embedders (especially medium-weight embedders like mfcembed for whom
nsEmbedstring or darin's new string API are rather impractical).
--BDS
Assignee | ||
Comment 20•21 years ago
|
||
> I'd rather not complexify our code, especially given that we've lived without
> this optimization for many years.
I don't see this as making the code more complicated. More verbose, perhaps,
but that's like saying specifying 'static' on a C function makes the code
complex. And yes, we've lived without the optimization for years. Too long -
we're trying to improve things.
> Can't we just use our own patched gcc on appropriate tinderboxen/build machines?
Seems risky and it's a pain for anyone who wants to build it themselves.
> Seems risky and it's a pain for anyone who wants to build it themselves.
Yeah but that's only temporary until your changes propagate.
Maybe it's just me but every NS_BLAH macro that we add to function declarations
makes me a little more sad.
Assignee | ||
Comment 22•21 years ago
|
||
Comment on attachment 136923 [details] [diff] [review]
make NS_IMETHOD have hidden visibility by default
This patch had no meaurable effect on startup time or page load time, but did
reduce code size by about 30 KB (on redwood).
We can probably get more performance gains by putting hidden visibility on our
_non_ virtual methods.
Comment 23•21 years ago
|
||
It seems to me that a simple metric would be to just count the number of
relocation records in each library. Hidden visibility should make some of them
disappear.
Comment 24•21 years ago
|
||
bryner: can you make all the methods on my templated hashtables invisible? that
should make a significant impact on codesize, and perhaps perf (although I don't
think they are used in much perf-sensitive code at this point).
Comment 25•21 years ago
|
||
> * Using the visibility("hidden") attribute allows the compiler to use
> * PC-relative addressing to call this function. If a function does not
> * access any global data, and does not call any methods which are not either
> * file-local or hidden, then on ELF systems we avoid loading the PC into a
> * register at the start of the function.
i think perhaps this should read something more like:
"... then on ELF systems we avoid loading the address of the PLT into a
register at the start of the function. This not only reduces code but also
frees up that register for general purpose use."
Updated•21 years ago
|
Attachment #136923 -
Flags: superreview?(darin) → superreview+
Attachment #136923 -
Flags: review?(dbaron) → review+
Assignee | ||
Comment 26•21 years ago
|
||
I checked the patch in but I had to disable it because gcc on luna does not
like visibility("default"). This patch re-enables it but makes us not depend
on visibility("default") being honored. The major change is that
IMETHOD_VISIBILITY is now the entire __attribute__ string, and I provide magic
defines NS_VISIBILITY_HIDDEN and NS_VISIBILITY_DEFAULT (the latter is just an
empty #define).
I also added a couple of examples to the comments for using NS_HIDDEN by itself
(ctors and dtors) and NS_HIDDEN_(type).
I tested this on Linux (and verified the expected hidden symbols), Mac, and
Windows.
Assignee | ||
Updated•21 years ago
|
Attachment #140499 -
Flags: superreview?(dbaron)
Attachment #140499 -
Flags: review?(dbaron)
Comment 27•21 years ago
|
||
Comment on attachment 140499 [details] [diff] [review]
don't depend on visibility("default")
#define NS_IMETHODIMP_(type) type
does this not require IMETHOD_VISIBILITY?
Assignee | ||
Comment 28•21 years ago
|
||
No, visibility is only on the declaration, not the implementation.
Attachment #140499 -
Flags: superreview?(dbaron)
Attachment #140499 -
Flags: superreview+
Attachment #140499 -
Flags: review?(dbaron)
Attachment #140499 -
Flags: review+
Comment 29•21 years ago
|
||
fwiw, i think this last change was probably good anyways since some other random
compiler may have yet another way of specifying visibility :-/
Comment 30•21 years ago
|
||
Updated•21 years ago
|
Attachment #140952 -
Flags: superreview?(dbaron)
Attachment #140952 -
Flags: review?(bryner)
Comment on attachment 140952 [details] [diff] [review]
Swap type and visibility
What's this for?
Comment on attachment 140952 [details] [diff] [review]
Swap type and visibility
minusing for lack of an explanation as to what this is fixing and why it won't
break existing things
Attachment #140952 -
Flags: superreview?(dbaron) → superreview-
Comment 33•21 years ago
|
||
Comment on attachment 140952 [details] [diff] [review]
Swap type and visibility
Strange, I could have sworn that I typed in a long comment when requesting
(super)review... anyway, basically, gcc 2.9.5 likes them the other way around.
Attachment #140952 -
Flags: superreview- → superreview?(dbaron)
Comment on attachment 140952 [details] [diff] [review]
Swap type and visibility
Does it work on gcc 3.x this way? (Have you tested?)
Assignee | ||
Comment 35•21 years ago
|
||
Fwiw, the gcc documentation about attributes is very long and yet fails to give
a definitive answer about where the attribute should (and can) go for function
declarations. Some examples show it at the end of the declaration, but that's a
no-go for us if we want it to be automatic for NS_IMETHOD. I was perhaps
confused into thinking that if I put it before the return type, it may end up
applying to the return type instead of to the function (per one of the "It
should do <X> but does not currently" comments in the documentation).
So yeah, if we can verify that this works on gcc 2.95.x and 3.x (not just
compiles, but actually generates the symbol as hidden) then let's do this.
Assignee | ||
Comment 36•21 years ago
|
||
This adds NS_HIDDEN to everything we don't want to export (which, fortunately,
we've already annotated with NS_COM for Windows). It's a large patch but the
changes are very mechanical. I did go through and verify that everything that
still had GLOBAL binding and DEFAULT visibility in the final libxpcom.so was
supposed to be exported. And aside from testing that the build generally
works, I built with -Wl,--no-undefined and got no linker errors about
undefined symbols.
Assignee | ||
Comment 37•21 years ago
|
||
These are some guidelines I figured out and followed as I was creating the
above patch. I'm planning to post it to n.p.m.xpcom after getting a little bit
of feedback on it, but I thought it might be helpful here.
Assignee | ||
Comment 38•21 years ago
|
||
Comment on attachment 141309 [details] [diff] [review]
Apply NS_HIDDEN liberally within libxpcom
Sorry to doom you with this darin, but it's what you get for being an xpcom
peer :)
Attachment #141309 -
Flags: review?(darin)
Assignee | ||
Comment 39•21 years ago
|
||
Comment on attachment 140952 [details] [diff] [review]
Swap type and visibility
So, er, regarding this patch to swap the type and visibility... I just had to
do some digging for when support for the visibility attribute appears and it
doesn't seem to have been until gcc 3.3 (although Red Hat backported it to
3.1/3.2 in the 8.0 and 9 releases).
Am I missing something?
Assignee | ||
Comment 40•21 years ago
|
||
This patch implements hidden visibility for internal functions for the JS
engine. It's a reimplementation of what's in XPCOM, because the JS engine
builds standalone. Since the JS standalone build isn't autoconf'd, I used
#ifdefs to enable this for gcc >= 3.3. For the full Seamonkey/Firefox/etc
build we will automatically #define (or not) HAVE_VISIBILITY_ATTRIBUTE via
mozilla-config.h.
In addition, because the JS engine is C-only, I was able to use
hidden-visibility aliases for exported functions/data so that those calls don't
need to go through the PLT from inside the library. I made a pair of macros,
JS_DECL_ALIAS and JS_IMPL_ALIAS which automagically cause calls to foo() from
inside libmozjs to go to _foo_int(), which is a hidden-visibility alias.
The patch saves just over 30 KB of code size with gcc 3.3.2 (-Os).
Oh, one other thing. The mismatched parentheses for the NAME_ALL_GC_ROOTS
version of JS_AddRoot was confusing emacs. I fixed the parentheses... if this
was incorrect then I will take out that change.
Assignee | ||
Updated•21 years ago
|
Attachment #141524 -
Flags: review?(brendan)
Comment 41•21 years ago
|
||
Comment on attachment 141524 [details] [diff] [review]
use visibility("hidden") and alias symbols for js engine
Can you macroize further, say JS_IMPL_PUBLIC_API(t,n) for type t and name n,
that expands to JS_IMPL_ALIAS(n); JS_PUBLIC_API(t) ? Or does the JS_IMPL_ALIAS
have to follow the function (seems likely, but I thought I'd ask)?
It sucks that gcc doesn't give us an option. I would love to hear from drepper
on the possibility of that, to avoid such mindless and invasive, yet perf- and
footprint-winning source changes.
/be
Assignee | ||
Comment 42•21 years ago
|
||
Comment on attachment 141524 [details] [diff] [review]
use visibility("hidden") and alias symbols for js engine
One problem with this patch, that brendan and I have been discussing, is that
JS may return the addresses of externally-visible functions and these will not
pass an equality test against the public function symbol from outside of the
library.
So, at the minimum, for any function address that we could return externally,
we need to get the address of the _exported_ symbol. We could declare a second
asm renaming via .set that gives you a way to reference the external symbol
explicitly (&foo_ext). If that's too much trouble, we can remove the aliases
and renaming for those functions whose addresses might be returned.
Comment 43•21 years ago
|
||
> The patch saves just over 30 KB of code size with gcc 3.3.2 (-Os).
any idea what kind of performance impact this patch has?
Assignee | ||
Comment 44•21 years ago
|
||
Comment on attachment 140952 [details] [diff] [review]
Swap type and visibility
minusing until someone tells me an actual compiler i can get that passes the
configure test and doesn't accept this placement of the attribute
Attachment #140952 -
Flags: review?(bryner) → review-
Attachment #140952 -
Flags: superreview?(dbaron)
Comment 45•21 years ago
|
||
Attachment #140952 -
Attachment is obsolete: true
Comment 46•21 years ago
|
||
Comment on attachment 143670 [details] [diff] [review]
Swap both types and visibilities
The build page still appears to claim gcc2.96-77 support; should this be
accurate then you need to switch the type and attribute because that compiler
does not recognize them your way around.
Attachment #143670 -
Flags: review?(bryner)
Comment 47•21 years ago
|
||
With the help of drepper in #mozilla I tested and gcc2.96 supports hidden
visibility in 5 of 6 cases:
TYPEDEF HIDDEN FUNC(ARGS);
TYPEDEF FUNC(ARGS) HIDDEN;
HIDDEN TYPEDEF FUNC(ARGS);
TYPE * FUNC(ARGS) HIDDEN;
HIDDEN TYPE * FUNC(ARGS);
gcc3.x also supports TYPE * HIDDEN FUNC(ARGS); which is what the current macro
generates. I expect that the configure test is only testing the first case.
Comment 48•21 years ago
|
||
Some comments on attachment 141524 [details] [diff] [review].
1) The macros seem overly complex. I would think it would be simpler to
just have a macro to redefine all the necessary names and a special
wrapper file for ELF platforms that exposes the public API and just calls
the internal functions. You only need to create this file once and you can
compile it any way you want for speed.
2) Are you sure that all ELF platforms understand the assembler tricks
you're using.
3) Isn't using leading underscores encroaching on the system name space?
Assignee | ||
Updated•21 years ago
|
Attachment #143670 -
Flags: review?(bryner) → review+
Updated•21 years ago
|
Attachment #143670 -
Flags: superreview?(blizzard)
Comment 49•21 years ago
|
||
Comment on attachment 141309 [details] [diff] [review]
Apply NS_HIDDEN liberally within libxpcom
removing review request for now since we decided against this (at least i think
that's what we decided)... let me know if i'm wrong ;-)
Attachment #141309 -
Flags: review?(darin)
Updated•21 years ago
|
Attachment #143670 -
Flags: superreview?(blizzard) → superreview+
Comment 50•21 years ago
|
||
Comment on attachment 143670 [details] [diff] [review]
Swap both types and visibilities
Low-risk patch just changes the order of specifiers to satisfy some gcc
versions.
Attachment #143670 -
Flags: approval1.7?
Comment 51•21 years ago
|
||
Comment on attachment 143670 [details] [diff] [review]
Swap both types and visibilities
a=asa (on behalf of drivers) for checkin to 1.7
Attachment #143670 -
Flags: approval1.7? → approval1.7+
Comment 52•21 years ago
|
||
Comment on attachment 143670 [details] [diff] [review]
Swap both types and visibilities
Patch checked in.
Comment 53•21 years ago
|
||
Comment on attachment 141524 [details] [diff] [review]
use visibility("hidden") and alias symbols for js engine
We now think there has to be a better way, right?
/be
Attachment #141524 -
Flags: review?(brendan)
Comment 54•21 years ago
|
||
(In reply to comment #40)
> Created an attachment (id=141524)
> use visibility("hidden") and alias symbols for js engine
Hm... why can't this use visibility("protected")?
from info gcc:
"protected"
Protected visibility indicates that the symbol will be placed
in the dynamic symbol table, but that references within the
defining module will bind to the local symbol. That is, the
symbol cannot be overridden by another module.
Comment 55•21 years ago
|
||
> Hm... why can't this use visibility("protected")?
Forget about protected. It is very expensive at runtime for some architectures
due to the requirements of the C/C++ spec. See my DSO paper if you need to know
more details.
Assignee | ||
Comment 56•21 years ago
|
||
Profiling (with Quantify on Windows) shows that style system methods
(particularly nsStyleContext::GetStyleData, nsRuleNode::GetParentData, and some
of the nsCSSValue methods) are among the most frequently-called functions
during page load. So, applying NS_HIDDEN here seems like the most effective
way to reduce the per-call overhead from the PLT.
I went ahead and applied NS_HIDDEN for the entire class where the
frequently-called methods showed up.
Assignee | ||
Updated•21 years ago
|
Attachment #148258 -
Flags: superreview?(dbaron)
Attachment #148258 -
Flags: review?(dbaron)
Attachment #148258 -
Flags: superreview?(dbaron)
Attachment #148258 -
Flags: superreview+
Attachment #148258 -
Flags: review?(dbaron)
Attachment #148258 -
Flags: review+
Comment 57•21 years ago
|
||
That didn't really seem to affect the tinderbox times....
btek is using a compiler too old to make any difference, and it looks like it
did help Tp on luna.
Comment 59•21 years ago
|
||
Oh, ok. Didn't realize btek's compiler was too old.
Assignee | ||
Comment 60•21 years ago
|
||
_PR_x86_AtomicIncrement/Decrement also have very high call counts. We should
definitely try to reduce that, but this should also help by avoiding the PLT
overhead in the jump to the internal platform-specific atomic op functions.
Assignee | ||
Updated•21 years ago
|
Attachment #148377 -
Flags: review?(wchang0222)
Comment 61•21 years ago
|
||
Comment on attachment 148377 [details] [diff] [review]
apply NS_HIDDEN to internal NSPR atomic functions
bryner, sorry about the delay in reviewing
this patch.
A better solution is to eliminate the
intermediate _PR_x86_AtomicXXX functions
for Linux/x86 and directly implement the
exported PR_AtomicXXX functions in assembly.
Do you agree?
You should be able to implement this solution
by changing three files:
1. mozilla/nsprpub/pr/src/misc/pratom.c: put
the PR_AtomicXXX functions inside an ifdef
like this:
#if defined(LINUX) && defined(__i386__)
/* PR_AtomicXXX implemented in os_Linux_x86.s */
#else
PR_IMPLEMENT(PRInt32)
PR_AtomicIncrement(PRInt32 *val)
{
return _PR_MD_ATOMIC_INCREMENT(val);
}
...
#endif
2. mozilla/nsprpub/pr/include/md/_linux.h:
remove all lines containing _PR_x86_Atomic.
3. mozilla/nsprpub/pr/src/md/unix/os_Linux_x86.s:
globally change _PR_x86_Atomic to PR_Atomic.
Assignee | ||
Comment 62•20 years ago
|
||
Comment on attachment 148377 [details] [diff] [review]
apply NS_HIDDEN to internal NSPR atomic functions
I agree, that would be more efficient.
Attachment #148377 -
Attachment is obsolete: true
Attachment #148377 -
Flags: review?(wtchang)
Updated•18 years ago
|
QA Contact: xpcom
Comment 63•18 years ago
|
||
I'm going to say this is FIXED, since we're actually at the point where we're mostly only exporting frozen symbols. Let's file followups for the atomic operations and whatnot.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•