Closed Bug 224331 Opened 21 years ago Closed 21 years ago

inline nsIContent document and parent getters

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: bryner, Assigned: bryner)

Details

(Keywords: perf)

Attachments

(1 file)

I got about a 1.5% speedup on page load by inlining GetDocument and GetParent on nsIContent. To deal with GenericDOMDataNode overloading the low two bits of the parent pointer, I just always mask those in GetParent(). I don't believe that's going to slow anything down since a single AND instruction is still cheaper than a virtual function call. I also inline an overloaded GetParent() in the subclasses that don't use those bits for anything, so that they don't pay a penalty. I also made SetParent() return void since no one seems to care about the return value.
Attached patch patch — — Splinter Review
Attachment #134581 - Flags: superreview?(jst)
Attachment #134581 - Flags: review?(bugmail)
When you replaced mParent with GetParent(), there are now several functions that call GetParent() multiple times. Should these all be using temporaries?
It shouldn't matter, the function should be inlined.
Comment on attachment 134581 [details] [diff] [review] patch sr=jst, though as we discussed (ofline) we might end up undoing the mDocument part of this of we eliminate mDocument completely from our text nodes (to cut their size down).
Attachment #134581 - Flags: superreview?(jst) → superreview+
Index: xul/content/src/nsXULElement.h =================================================================== + nsIContent* GetParent() const { random nit, you have an extra space after nsIContent* ;)
Comment on attachment 134581 [details] [diff] [review] patch I *think* you might want to explicitly mark the GetParent-implementations as inline, otherwise some compilers (gcc) won't always inline it. >- NS_IMETHOD SetParent(nsIContent* aParent) = 0; >+ NS_IMETHOD_(void) SetParent(nsIContent* aParent) = 0; I wouldn't mind seing a default implementation here, not sure if it should set all of mParent or if it should leave the two lower bits untouched. Not required though. >Index: html/content/src/nsGenericHTMLElement.cpp >@@ -4147,32 +4139,26 @@ nsGenericHTMLContainerFormElement::SetDo > PRBool aDeep, > PRBool aCompileEventHandlers) > { >- nsresult rv = NS_OK; >- ::SetDocument should still return an errorcode, right? (Though i'm not sure if it should, but the .h file still says so). Same in nsGenericHTMLLeafFormElement::SetDocument > nsXULElement::GetPreviousSibling(nsIDOMNode** aPreviousSibling) > { >- if (nsnull != mParent) { >- PRInt32 pos = mParent->IndexOf(this); >+ if (nsnull != GetParent()) { Please change this to |if (GetParent())|. With that, r=sicking
Attachment #134581 - Flags: review?(bugmail) → review+
I believe that according to the C++ standard, member functions defined in a class are automatically "inline" and an explicit "inline" is redundant. Anyway, I don't think there's any need to "force" methods to be inline in general. I would rather trust the compiler to make sensible decisions about when to inline functions. If gcc is failing to inline trivial methods in optimized builds, then we have a problem with gcc and should get the problem fixed.
Also there is really no need for this NS_IMETHOD_(...) stuff for methods that aren't meant to be exported from gklayout or usable from XPCOM.
roc: regarding the NS_IMETHOD_(foo) stuff, I just wanted to keep this patch somewhat self-contained. We should go back later and tidy up the interface a bit.
I don't have a copy of the C++ standard, but someone I know does, and here's the quote: > from the spec (7.1.2, par. 3): > > A function defined within a class definition is an inline > function. The inline specifier shall not appear on a block scope > function declaration.
Checked in. Per sicking's suggestion, I added default implementations of SetDocument() and SetParent() on nsIContent (they simply set the members). This allowed some code to be removed elsewhere. It's purely aesthetic, of course, since the nsIContent version is inlined. Also, I had to change the |static const int| to an enum to satisfy MSVC. I'm not 100% sure that was legal to start with.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
> Also there is really no need for this NS_IMETHOD_(...) stuff for methods that > aren't meant to be exported from gklayout or usable from XPCOM. roc: I'm wondering, what does that mean for a) consumers and b) implementors of nsIContent outside of gklayout? Looking at the (early) Tinderbox Tp numbers: Windows seems to have gone up by about 1%, OS X stays the same and Linux goes down by .5%, so right now this doesn't look like a clear win.
> roc: I'm wondering, what does that mean for a) consumers and b) implementors of > nsIContent outside of gklayout? Nothing, as far as I know. You should be able to make and implement plain virtual calls across shared library boundaries on all platforms+compilers. I believe we already rely on this elsewhere. The only limitation I can think of is that you'll need to use the same compiler to build gklayout and transformiix, because we're not forcing the calling convention to be anything in particular (all that NS_IMETHOD_(...) does is add "__stdcall" on Windows).
Ok, cool. I'm not asking this only from a transformiix viewpoint btw ;-).
Not sure if creature's Tp increase is something to be relied upon, the first increase (from 215 to 220) happened before it picked up these changes.
I still favour having components that must use Gecko-private interfaces like nsIContent and nsIFrame being linked into gklayout.
*Sigh*. We've already been through this: then we must provide a suitable alternative to nsIContent, nsIDocument and nsINodeInfo for performance-critical access to our content model to non-gklayout linked components (and no, the nsIDOM* ones aren't it).
Sure, but what are these mysterious components and why can't they be linked into gklayout?
On separate notes: Regarding comment 7: Are you sure about that? Before the checkin of bug 224331 we actually saw those functions show up in instructiontraces (not meaning that they took long time, but meaning that they got compiled into actual functions). This was on gcc-builds. Regarding comment 12: Do we really want to do this [bug] if it doesn't improve Tp?
> we actually saw those functions show up Which functions? Opt builds or debug builds? At call sites that could be inlined, or call sites doing indirect calls (virtual or through function pointers)? I'm sure about what the standard says. I'm pretty sure gcc follows the standard and treats a defined-in-class method just as if you'd written 'inline'. I'm not so sure that gcc will actually always inline such methods, but if it doesn't, adding 'inline' won't help.
The functions in question are the ones that gets marked 'inline' in the patch in bug 206338 (sorry, posted wrong bugnumber in my previous comment), for example |txStack::push|. The funtions are not virtual or called through functionpointers so there are no obvious reason not to inline the function. I'm not sure if adding the 'inline' keyword actually forced the functions to get inlined since we didn't look at instructiontraces after the patch landed. But we did see a 1% speedup by just adding the keyword as well as bypass functions that are supposed to get inlined in nsVoidArray. I.e. calling |InsertElementAt| rather then |AppendElement|.
OK, so I just did a little experiment. I compiled txExecutionState.cpp with 3 versions of the txStack class: 1) the trunk code, with "inline" removed from all methods and reverting "push" to call AppendElement 2) just rewriting AppendElement to InsertElementAt in "push" 3) As 2) but adding "inline" to the methods (i.e., the trunk code) I didn't run the code, I just generated the assembler output for all three versions. I'm using gcc 3.2.2 -O3 -DDEBUG which means assertions are on, but I don't see why that could have any effect. The output for versions 2 and 3 is absolutely identical, as I expected. In particular all 6 calls to the push() method are inlined down to an InsertElementAt call with some glue around it. The code in version 2 is actually bigger than version 1, and I presume slower, because it has to convert the result of InsertElementAt to an nsresult using a conditional (although of course that fixes a bug). So I'm not sure where 1% could have come from moving from 1 to 3.
If you still have those "instruction traces", whatever they are, maybe we should take another look...
When you went from 1 to 2 did you also add the "nsresult conversion" code? I should mention that the actual code in the bug does a bit more then what was done when the 1% performance measurement was done. The 1% didn't have the "nsresult conversion" (which probably slows things down) and it didn't remove the boundscheck in pop() and peek() (which probably speeds things up). The only thing i can think of that would otherwise have speeded things up was calling InsertElementAt rather then AppendElement. However that is just inlining AppendElement by hand which shouldn't be different from the compiler doing it... I'll try instructiontrace the same three versions and see what i come up with next week.
> When you went from 1 to 2 did you also add the "nsresult conversion" code? Yes.
Regarding comment 12 and comment 19, I think if you look at the moving average on the tinderbox graph you'll see that this does improve Tp on some machines and has pretty much no effect on some other machines. It's not clear to me that it increased Tp on creature, and the only scenario I think think up where things would get slower is if MSVC doesn't behave the same as gcc with respect to overriding a non-virtual parent class method, as I did for nsGenericElement and nsXULElement. That would cause accesses to mParent from within those classes to be a bit slower. I'll test that theory in a bit, and also kick creature which seems to have gone missing in action. (Even this idea doesn't seem that likely to me because caladan on the Phoenix page didn't show any change in Tp on that checkin, and it's a slower machine which should be more sensitive).
... MSVC definitely behaves as expected when overriding a non-virtual function from a base class.
> Sure, but what are these mysterious components and why can't they be linked into > gklayout? You're not seriously proposing linking stuff like editor or an XML Schema validator or even transformiix into gklayout?
Bryner: yeah, I think we can safely ignore creature, so this does look like a win/status quo (a shame that creature acts up like that though :-/).
re comment 28: I could definitely make a case for linking editor into gklayout -- text form controls don't function at all without it. It's not really optional in any sense of the word.
If XML Schema or XSLT need to be really intimate with Gecko internals, and web authors are going to regard them as part of what is provided by "Gecko", then I wouldn't object to them being linked into gklayout. [BTW, bryner, what's the story with merging remaining libraries like widget, gfx, etc into gklayout?]
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: