Closed Bug 224331 Opened 21 years ago Closed 21 years ago

inline nsIContent document and parent getters

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: bryner, Assigned: bryner)

Details

(Keywords: perf)

Attachments

(1 file)

I got about a 1.5% speedup on page load by inlining GetDocument and GetParent on
nsIContent.  To deal with GenericDOMDataNode overloading the low two bits of the
parent pointer, I just always mask those in GetParent().  I don't believe that's
going to slow anything down since a single AND instruction is still cheaper than
a virtual function call.  I also inline an overloaded GetParent() in the
subclasses that don't use those bits for anything, so that they don't pay a penalty.

I also made SetParent() return void since no one seems to care about the return
value.
Attached patch patchSplinter Review
Attachment #134581 - Flags: superreview?(jst)
Attachment #134581 - Flags: review?(bugmail)
When you replaced mParent with GetParent(), there are now several functions that
call GetParent() multiple times.  Should these all be using temporaries?
It shouldn't matter, the function should be inlined.
Comment on attachment 134581 [details] [diff] [review]
patch

sr=jst, though as we discussed (ofline) we might end up undoing the mDocument
part of this of we eliminate mDocument completely from our text nodes (to cut
their size down).
Attachment #134581 - Flags: superreview?(jst) → superreview+
Index: xul/content/src/nsXULElement.h
===================================================================
+    nsIContent*  GetParent() const {

random nit, you have an extra space after nsIContent* ;)
Comment on attachment 134581 [details] [diff] [review]
patch

I *think* you might want to explicitly mark the GetParent-implementations as
inline, otherwise some compilers (gcc) won't always inline it.

>-  NS_IMETHOD SetParent(nsIContent* aParent) = 0;
>+  NS_IMETHOD_(void) SetParent(nsIContent* aParent) = 0;
I wouldn't mind seing a default implementation here, not sure if it should set
all of mParent or if it should leave the two lower bits untouched. Not required
though.

>Index: html/content/src/nsGenericHTMLElement.cpp
>@@ -4147,32 +4139,26 @@ nsGenericHTMLContainerFormElement::SetDo
>                                                PRBool aDeep,
>                                                PRBool aCompileEventHandlers)
> {
>-  nsresult rv = NS_OK;
>-
::SetDocument should still return an errorcode, right? (Though i'm not sure if
it should, but the .h file still says so). Same in
nsGenericHTMLLeafFormElement::SetDocument

> nsXULElement::GetPreviousSibling(nsIDOMNode** aPreviousSibling)
> {
>-    if (nsnull != mParent) {
>-        PRInt32 pos = mParent->IndexOf(this);
>+    if (nsnull != GetParent()) {
Please change this to |if (GetParent())|.

With that, r=sicking
Attachment #134581 - Flags: review?(bugmail) → review+
I believe that according to the C++ standard, member functions defined in a
class are automatically "inline" and an explicit "inline" is redundant.

Anyway, I don't think there's any need to "force" methods to be inline in
general. I would rather trust the compiler to make sensible decisions about when
to inline functions. If gcc is failing to inline trivial methods in optimized
builds, then we have a problem with gcc and should get the problem fixed.
Also there is really no need for this NS_IMETHOD_(...) stuff for methods that
aren't meant to be exported from gklayout or usable from XPCOM.
roc: regarding the NS_IMETHOD_(foo) stuff, I just wanted to keep this patch
somewhat self-contained.  We should go back later and tidy up the interface a bit.
I don't have a copy of the C++ standard, but someone I know does, and here's the
quote:

> from the spec (7.1.2, par. 3): 
>
>  A function defined within a class definition is an inline
>  function. The inline specifier shall not appear on a block scope
>  function declaration.
Checked in.  Per sicking's suggestion, I added default implementations of
SetDocument() and SetParent() on nsIContent (they simply set the members).  This
allowed some code to be removed elsewhere.  It's purely aesthetic, of course,
since the nsIContent version is inlined.

Also, I had to change the |static const int| to an enum to satisfy MSVC.  I'm
not 100% sure that was legal to start with.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
> Also there is really no need for this NS_IMETHOD_(...) stuff for methods that
> aren't meant to be exported from gklayout or usable from XPCOM.

roc: I'm wondering, what does that mean for a) consumers and b) implementors of
nsIContent outside of gklayout?

Looking at the (early) Tinderbox Tp numbers: Windows seems to have gone up by
about 1%, OS X stays the same and Linux goes down by .5%, so right now this
doesn't look like a clear win.
> roc: I'm wondering, what does that mean for a) consumers and b) implementors of
> nsIContent outside of gklayout?

Nothing, as far as I know. You should be able to make and implement plain
virtual calls across shared library boundaries on all platforms+compilers. I
believe we already rely on this elsewhere. The only limitation I can think of is
that you'll need to use the same compiler to build gklayout and transformiix,
because we're not forcing the calling convention to be anything in particular
(all that NS_IMETHOD_(...) does is add "__stdcall" on Windows).
Ok, cool. I'm not asking this only from a transformiix viewpoint btw ;-).
Not sure if creature's Tp increase is something to be relied upon, the first
increase (from 215 to 220) happened before it picked up these changes.
I still favour having components that must use Gecko-private interfaces like
nsIContent and nsIFrame being linked into gklayout.
*Sigh*. We've already been through this: then we must provide a suitable
alternative to nsIContent, nsIDocument and nsINodeInfo for performance-critical
access to our content model to non-gklayout linked components (and no, the
nsIDOM* ones aren't it).
Sure, but what are these mysterious components and why can't they be linked into
gklayout?
On separate notes:

Regarding comment 7: Are you sure about that? Before the checkin of bug 224331
we actually saw those functions show up in instructiontraces (not meaning that
they took long time, but meaning that they got compiled into actual functions).
This was on gcc-builds.

Regarding comment 12: Do we really want to do this [bug] if it doesn't improve Tp?
> we actually saw those functions show up

Which functions? Opt builds or debug builds? At call sites that could be
inlined, or call sites doing indirect calls (virtual or through function pointers)?

I'm sure about what the standard says. I'm pretty sure gcc follows the standard
and treats a defined-in-class method just as if you'd written 'inline'. I'm not
so sure that gcc will actually always inline such methods, but if it doesn't,
adding 'inline' won't help.
The functions in question are the ones that gets marked 'inline' in the patch in
bug 206338 (sorry, posted wrong bugnumber in my previous comment), for example
|txStack::push|. The funtions are not virtual or called through functionpointers
so there are no obvious reason not to inline the function.

I'm not sure if adding the 'inline' keyword actually forced the functions to get
inlined since we didn't look at instructiontraces after the patch landed. But we
did see a 1% speedup by just adding the keyword as well as bypass functions that
are supposed to get inlined in nsVoidArray. I.e. calling |InsertElementAt|
rather then |AppendElement|.
OK, so I just did a little experiment. I compiled txExecutionState.cpp with 3
versions of the txStack class:
1) the trunk code, with "inline" removed from all methods and reverting "push"
to call AppendElement
2) just rewriting AppendElement to InsertElementAt in "push"
3) As 2) but adding "inline" to the methods (i.e., the trunk code)
I didn't run the code, I just generated the assembler output for all three
versions. I'm using gcc 3.2.2 -O3 -DDEBUG which means assertions are on, but I
don't see why that could have any effect.

The output for versions 2 and 3 is absolutely identical, as I expected. In
particular all 6 calls to the push() method are inlined down to an
InsertElementAt call with some glue around it.

The code in version 2 is actually bigger than version 1, and I presume slower,
because it has to convert the result of InsertElementAt to an nsresult using a
conditional (although of course that fixes a bug).

So I'm not sure where 1% could have come from moving from 1 to 3.
If you still have those "instruction traces", whatever they are, maybe we should
take another look...
When you went from 1 to 2 did you also add the "nsresult conversion" code?

I should mention that the actual code in the bug does a bit more then what was
done when the 1% performance measurement was done. The 1% didn't have the
"nsresult conversion" (which probably slows things down) and it didn't remove
the boundscheck in pop() and peek() (which probably speeds things up).

The only thing i can think of that would otherwise have speeded things up was
calling InsertElementAt rather then AppendElement. However that is just inlining
AppendElement by hand which shouldn't be different from the compiler doing it...

I'll try instructiontrace the same three versions and see what i come up with
next week.
> When you went from 1 to 2 did you also add the "nsresult conversion" code?

Yes.
Regarding comment 12 and comment 19, I think if you look at the moving average
on the tinderbox graph you'll see that this does improve Tp on some machines and
has pretty much no effect on some other machines.

It's not clear to me that it increased Tp on creature, and the only scenario I
think think up where things would get slower is if MSVC doesn't behave the same
as gcc with respect to overriding a non-virtual parent class method, as I did
for nsGenericElement and nsXULElement.  That would cause accesses to mParent
from within those classes to be a bit slower.  I'll test that theory in a bit,
and also kick creature which seems to have gone missing in action.  (Even this
idea doesn't seem that likely to me because caladan on the Phoenix page didn't
show any change in Tp on that checkin, and it's a slower machine which should be
more sensitive).
... MSVC definitely behaves as expected when overriding a non-virtual function
from a base class.
> Sure, but what are these mysterious components and why can't they be linked into
> gklayout?

You're not seriously proposing linking stuff like editor or an XML Schema
validator or even transformiix into gklayout?
Bryner: yeah, I think we can safely ignore creature, so this does look like a
win/status quo (a shame that creature acts up like that though :-/).
re comment 28: I could definitely make a case for linking editor into gklayout
-- text form controls don't function at all without it.  It's not really
optional in any sense of the word.
If XML Schema or XSLT need to be really intimate with Gecko internals, and web
authors are going to regard them as part of what is provided by "Gecko", then I
wouldn't object to them being linked into gklayout.

[BTW, bryner, what's the story with merging remaining libraries like widget,
gfx, etc into gklayout?]
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: