In many cases targets of method invocations are either constant or change infrequently. Avoiding indirect calls may improve performance (even if the caller has to pass MethodEnv).
Created attachment 437678 [details] [diff] [review] Devirtualize calls to final, jit-compiled methods. Early experiment, no promises.
nice. one problem with the patch: initially, even jit methods are not verified, and get verified on first call. So in your patch, I think the patch has a chance of directly calling the verify trampoline. (and re-calling it over and over). why verify native methods on first call? well, there's no good reason now, but we just to jit-compile thunks. Now, the if(isNative()) branch in MethodInfo::verify() installs the thunk pointers, and there are no failure modes, IIRC. Possible fix: The if(isNative()) branch in MethodInfo::verify() can be hoisted to resolve time. (I did this in the now-defunct jit-cache experiment last year). Every method is guaranteed to be resolved before being a) early bound to at a call site, or b) verified. By setting up the thunk addresses at resolve time, an early bound call site to a final native method will see the final value of MethodEnv.implGPR at JIT time. in principle its also possible to inline the thunk, but thats another experiment. there's another very minor gotcha: when debugger is enabled, the address called will be debugEnterExitWrapper32, which then calls the underlying native method whose address is stored in a different field. directly calling this wrapper probably doesn't help much, but also doesn't hurt much either.
never comment before coffee: i read "jit compiled" as "native". the patch should work fine for jit-compiled methods, although we'll need to patch that call address when deopt comes along. and of course it will work for native calls to if we hoist the if(isNative()) branch in MI::verify() like i mentioned.
Priority: -- → P3
Target Milestone: --- → flash10.2
Created attachment 438213 [details] [diff] [review] handling of native, call target patching. Messy proof of concept, x86-only patching, one acceptance test fails.
Attachment #437678 - Attachment is obsolete: true
Created attachment 438607 [details] [diff] [review] page protection shenanigans workaround, all acceptance tests pass (obviously not an acceptable solution though).
Attachment #438213 - Attachment is obsolete: true
Any preliminary performance #s to report?
Preliminary results are flat, I suspect that too few call sites can be devirtualized on larger benchmarks, and microbenchmarks are so small, that hardware support for indirect calls is adequate (e.g. branch target buffer misses are rare). More call sites could be devirtualized if we either had OSR or did preexistence analysis (I was considering doing the latter). Also, the calling sequence is so long (bug 511873) that perhaps the effects of devirutalization won't be seen until it's pared down.
Created attachment 441642 [details] [diff] [review] better handling of page protection. Saves code allocation info with the patch address so that the appropriate chunk can be unprotected for the duration of patching.
Attachment #438607 - Attachment is obsolete: true
bug 563146 will set up native method invocation ponters sooner, so that when early binding to a final native method, we can call it directly with no patching and no indirection.
(In reply to comment #9) The patch already tries to do that, but it makes sense to factor out.
Is this still valid?
Assignee: nobody → kpalacz
Target Milestone: Q3 11 - Serrano → Q1 12 - Brannan
Not actively worked on at the moment, but valid.
Krzys, shall this bug be deferred to Cyril?
yes, it's a performance enhancement, not actively worked on.
Dropping from Andre watch list, as it has target milestone.
Status: NEW → RESOLVED
Last Resolved: 6 days ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.