Closed Bug 111771 Opened 23 years ago Closed 23 years ago

win32/xptcinvoke.cpp should be streamlined

Categories

(Core :: XPCOM, defect)

x86
Windows 2000
defect
Not set
normal

Tracking

()

RESOLVED FIXED

People

(Reporter: jband_mozilla, Assigned: jband_mozilla)

Details

Attachments

(1 file)

jdunn pointed out that it was not optimal (speedwise) to do separate calls out
of XPTC_InvokeByIndex to count the stack words needed and then to copy the
words. We can instead be pessimistic about the space required rather than being
exact - and then just clean up the extra space on the way out.

I've experimented with this and discovered that this can reduce almost half of
the extra call overhead added by XPTC_InvokeByIndex (where call overhead is
measured as the difference betwen calling though XPTC_InvokeByIndex and calling
the target object directly).

With a test I'll attach I see something like this:

(old code...)
Doing 100000000 direct call iterations...
Doing 100000000 invoked call iterations...
 direct took 1.71 seconds
 invoke took 10.48 seconds
 So, invoke overhead was ~ 8.77 seconds (~ 84%)

(new code...)
Doing 100000000 direct call iterations...
Doing 100000000 invoked call iterations...
 direct took 1.69 seconds
 invoke took 6.65 seconds
 So, invoke overhead was ~ 4.96 seconds (~ 75%)

This varies from run to run on my laptop. But, the above is typical. I doubt
this translates into much in the browser. But, every little bit...

The patch adds __declspec(naked) to allow more control over the prolog and
epilog. I also added a short circuit path for cases where the method takes no
params.
This worksforme in debug and release builds. I'll give it a try under Purify and
run with it for a while.
Attached patch proposed changeSplinter Review
Why not just eliminate the use of i

-    for(PRUint32 i = paramCount; i > 0; i--, d++, s++)
+    for(;paramCount > 0; paramCount--, d++, s++)

Under VC++ it generates the same code, not sure about other compilers. Might
generate better code under some compilers.
Comment on attachment 59071 [details] [diff] [review]
proposed change

sr=shaver.  (I don't know the assembly or Win32 calling conventions well 
enough to r= confidently.)
Attachment #59071 - Flags: superreview+
dbradley: sure, why not. That change is in my tree. (Note that we don't compile
this code on other compilers :)

Can you bless this with an r=?
Comment on attachment 59071 [details] [diff] [review]
proposed change

Everything looks fine. I double checked the stack frame setup against the
compiler, just to make sure. There's so many ways to setup the frame on Intel.
Attachment #59071 - Flags: review+
checked in.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: