For the Win32 implementation, the parameter count is checked before entry to the function, so there's no need to use a for statement for this loop. I doubt this will make any substantial impact, but it's an easy change. I suspect similar changes could be made to other platforms. A quick glance showed several platforms don't check the paramCount before entering the function. Another idea is to just inline this function since it is only used in one spot. Unfortunately the compiler won't inline functions in code that contains __asm. So the only option would be to completely rewrite the function's code in assembler and put it within XPTC_InvokeByIndex.
Created attachment 122965 [details] [diff] [review] Patch to slightly optimize the invoke_copy_to_stack function for win32 This patch eliminates the initial test of paramCount, as well as removes the continue statement. The continue statement causes the optimizer to generate slight larger and slower code, an extra instruction or two, though the assembler output from the non-continue version is much more intuitive.
The gcc version had far fewer cases but also explicitly uses a temporary variable for(PRUint32 i = paramCount; i >0; i--, d++, s++) ... That might produce better code. Is there some reason the win32 version has such a large switch statement? I thought about writing invoke_copy_to_stack in assembler for gcc but that means embedding knowledge of the internals of nsXPTCVariant. That didn't seem worth the trouble given gcc's penchant for changing things from one version to another. Besides, gcc generates efficient code You bring up a good point about paramCount == 0. Would you like a gcc version that tests for that in XPTC_InvokeByIndex?
Not sure, why, other than just to be safe. VC++ boils the switch down to the 3 paths. It's interesting that the gcc version separated out the T_I64 and T_U64 but consolidated the 32 bit and less integers. I agree, I think inlining the function is too risky. If you want to provide a patch for the gcc version that would be great. It's not that big of a deal. I just noticed it when I was working on another bug and wanted to get the patch recorded. Figured I or someone else could when they had time look at the other platforms. Right now I only have Win32 and Mac available to me.
Okay, I'll give it a whirl, but not right now. What should XPTC_InvokeByIndex return on error? FWIW, gcc 2.95.3 handles T_DOUBLE by pushing it onto the fpu stack and popping into the destination. Gcc 3.2.x treats is just like T_I64 and T_U64. Just an interesting observation.
When you say what should it return on error, do you mean an error from the method invoked, or an internal error in the function? If the method it calls fails, that returns an nsresult which is just passed through. If the function itself fails, you could return NS_ERROR_FAILURE or another NS_ERROR code if more appropriate.
I was under the impression that XPTC_InvokeByIndex was supposed to be called with at least one param. That turns out not to be true for Linux. There are lots of such calls, so an error probably isn't right. There is a small gain by not calling invoke_copy_to_stack with paramCount = 0 so I guess it's worth doing. Maybe this week if I have time. If paramCount = 0 is supposed to be an error then something more serious is going on.
The parameters are determined by paramCount. When the actual function gets invoked, "this" may be added parameter as well, but that doesn't count for the invoke_copy_to_stack call, since it is pushed separately. Depending on platform, "this" is passed either via register or pushed on the stack as an additional function parameter.
Moving out, speak up if you believe this needs to be considered for 1.5b