invoke_copy_to_stack can be slightly optimized

RESOLVED INCOMPLETE

Status

()

Core
XPCOM
P3
enhancement
RESOLVED INCOMPLETE
15 years ago
5 years ago

People

(Reporter: David Bradley, Unassigned)

Tracking

({perf})

Trunk
mozilla1.7alpha
x86
Windows XP
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

15 years ago
For the Win32 implementation, the parameter count is checked before entry to the
function, so there's no need to use a for statement for this loop. I doubt this
will make any substantial impact, but it's an easy change.

I suspect similar changes could be made to other platforms. A quick glance
showed several platforms don't check the paramCount before entering the function.

Another idea is to just inline this function since it is only used in one spot.
Unfortunately the compiler won't inline functions in code that contains __asm.
So the only option would be to completely rewrite the function's code in
assembler and put it within XPTC_InvokeByIndex.
(Reporter)

Comment 1

15 years ago
Created attachment 122965 [details] [diff] [review]
Patch to slightly optimize the invoke_copy_to_stack function for win32

This patch eliminates the initial test of paramCount, as well as removes the
continue statement. The continue statement causes the optimizer to generate
slight larger and slower code, an extra instruction or two, though the
assembler output from the non-continue version is much more intuitive.

Comment 2

15 years ago
The gcc version had far fewer cases but also explicitly uses a temporary
variable
   for(PRUint32 i = paramCount; i >0; i--, d++, s++)
        ...
That might produce better code. Is there some reason the win32 version
has such a large switch statement?

I thought about writing invoke_copy_to_stack in assembler for gcc but
that means embedding knowledge of the internals of nsXPTCVariant. That
didn't seem worth the trouble given gcc's penchant for changing things
from one version to another. Besides, gcc generates efficient code

You bring up a good point about paramCount == 0. Would you like a gcc
version that tests for that in XPTC_InvokeByIndex?
(Reporter)

Comment 3

15 years ago
Not sure, why, other than just to be safe. VC++ boils the switch down to the 3
paths. It's interesting that the gcc version separated out the T_I64 and T_U64
but consolidated the 32 bit and less integers.

I agree, I think inlining the function is too risky.

If you want to provide a patch for the gcc version that would be great.

It's not that big of a deal. I just noticed it when I was working on another bug
and wanted to get the patch recorded. Figured I or someone else could when they
had time look at the other platforms. Right now I only have Win32 and Mac
available to me.
Severity: normal → enhancement
Status: NEW → ASSIGNED
Priority: -- → P3
Target Milestone: --- → mozilla1.5alpha

Comment 4

15 years ago
Okay, I'll give it a whirl, but not right now. What should 
XPTC_InvokeByIndex return on error?

FWIW, gcc 2.95.3 handles T_DOUBLE by pushing it onto the fpu stack and 
popping into the destination. Gcc 3.2.x treats is just like T_I64 and 
T_U64. Just an interesting observation.
(Reporter)

Comment 5

15 years ago
When you say what should it return on error, do you mean an error from the
method invoked, or an internal error in the function? If the method it calls
fails, that returns an nsresult which is just passed through. If the function
itself fails, you could return NS_ERROR_FAILURE or another NS_ERROR code if more
appropriate.

Comment 6

15 years ago
I was under the impression that XPTC_InvokeByIndex was supposed to be 
called with at least one param. That turns out not to be true for Linux. 
There are lots of such calls, so an error probably isn't right. There is 
a small gain by not calling invoke_copy_to_stack with paramCount = 0 so 
I guess it's worth doing. Maybe this week if I have time.

If paramCount = 0 is supposed to be an error then something more serious 
is going on.
(Reporter)

Comment 7

15 years ago
The parameters are determined by paramCount. When the actual function gets
invoked, "this" may be added parameter as well, but that doesn't count for the
invoke_copy_to_stack call, since it is pushed separately. Depending on platform,
"this" is passed either via register or pushed on the stack as an additional
function parameter.
(Reporter)

Updated

15 years ago
Target Milestone: mozilla1.5alpha → mozilla1.5beta
(Reporter)

Comment 8

15 years ago
Moving out, speak up if you believe this needs to be considered for 1.5b
Target Milestone: mozilla1.5beta → mozilla1.6alpha
(Reporter)

Comment 9

14 years ago
Moving out
Target Milestone: mozilla1.6alpha → mozilla1.7alpha
Assignee: dbradley → nobody
Status: ASSIGNED → NEW
QA Contact: scc → xpcom

Updated

5 years ago
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.