The attached patch translates JM's getNewObject() allocation code to IonMonkey. SS-1.0's access-binary-trees now executes in ~3ms instead of ~6ms. Apart from DeltaBlue, which improves from 495ms to 483ms, all other tests appear to be unaffected by inline allocation or to regress very slightly.
This is a bit of a mystery.
Created attachment 610367
Created attachment 610737
'perf stat' shows this patch eliminating ~100,000,000 instructions from Earley-Boyer execution, but the suites are so large that allocation from Ion context appears to be only a small component of the total runtime. Further measurements show small improvements in all benchmarks.
Comment on attachment 610737
Review of attachment 610737:
@@ +907,5 @@
> + pushArg(protoReg);
> + pushArg(calleeReg);
> + if (!callVM(CreateThisInfo, lir))
> + return false;
Looks good, but if we have a template object, this should be in the out-of-line path instead (you can just bind the inline failure to ool->entry()). We don't want to mark LCreateThis as a call either, since we expect to take the fast path.
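To make the suggested control flow concrete, here is a minimal standalone sketch of the pattern: an inline bump allocation that falls through to an out-of-line slow path only when the inline attempt fails. It is not the patch's code and does not use Ion's MacroAssembler. FreeSpan, Counters, allocateInline, allocateOutOfLine and runDemo are hypothetical names, and std::malloc merely stands in for the VM call.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical stand-in for a span of free, fixed-size GC cells.
struct FreeSpan {
    std::uintptr_t first;  // address of the next free cell
    std::uintptr_t limit;  // one past the end of the span
};

// Counts how often each path is taken, for illustration only.
struct Counters {
    int fastPath = 0;
    int slowPath = 0;
};

// Out-of-line path: stands in for the VM call (assumption: malloc is only a
// placeholder for whatever the real slow path does).
void* allocateOutOfLine(std::size_t thingSize, Counters& counters) {
    ++counters.slowPath;
    return std::malloc(thingSize);
}

// Inline path: a bounds check and a pointer bump when the span has room.
// On failure, control transfers to the out-of-line path.
inline void* allocateInline(FreeSpan& span, std::size_t thingSize, Counters& counters) {
    assert(span.first <= span.limit);
    std::uintptr_t result = span.first;
    if (span.limit - result < thingSize)
        return allocateOutOfLine(thingSize, counters);
    span.first = result + thingSize;
    ++counters.fastPath;
    return reinterpret_cast<void*>(result);
}

// Allocates three 32-byte cells from a 64-byte span: the first two fit
// inline, the third has to take the out-of-line path.
Counters runDemo() {
    alignas(16) static unsigned char arena[64];
    FreeSpan span;
    span.first = reinterpret_cast<std::uintptr_t>(arena);
    span.limit = span.first + sizeof(arena);
    Counters counters;
    allocateInline(span, 32, counters);
    allocateInline(span, 32, counters);
    void* third = allocateInline(span, 32, counters);
    std::free(third);  // the third cell came from the malloc placeholder
    return counters;
}
```

The point of the shape is that the common case never sets up a call. The slow path's cost is paid only when the span is exhausted, which is presumably why the instruction should not be treated as a call.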
@@ +381,5 @@
> + storePtr(ImmWord(emptyObjectElements), Address(result, JSObject::offsetOfElements()));
> + }
> + storePtr(ImmWord(templateObject->lastProperty()), Address(result, JSObject::offsetOfShape()));
> + storePtr(ImmWord(templateObject->type()), Address(result, JSObject::offsetOfType()));
Both of these stores should use ImmGCPtr instead.
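As a rough illustration of why the immediate kind matters, the toy model below contrasts a raw pointer-sized immediate with one whose location is recorded for the collector. It assumes the relevant difference is that ImmGCPtr makes the embedded pointer visible to the GC while ImmWord does not. ToyCodeBuffer, Cell, DemoResult and runRelocationDemo are hypothetical and are not Ion's assembler API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical GC-managed object.
struct Cell { int payload; };

// Toy code buffer that stores pointer-sized immediates in a byte vector.
class ToyCodeBuffer {
  public:
    // Emit a raw word: the collector has no record of it.
    std::size_t emitRawWord(const void* p) { return append(p); }

    // Emit a GC pointer: its offset is recorded so it can be found later.
    std::size_t emitGCPointer(const Cell* p) {
        std::size_t offset = append(p);
        relocations_.push_back(offset);
        return offset;
    }

    // What a moving collector would do: patch every recorded immediate
    // that still points at the old location.
    void updateGCPointers(const Cell* from, const Cell* to) {
        for (std::size_t offset : relocations_) {
            if (read(offset) == from)
                write(offset, to);
        }
    }

    const void* read(std::size_t offset) const {
        assert(offset + sizeof(const void*) <= bytes_.size());
        const void* p;
        std::memcpy(&p, &bytes_[offset], sizeof p);
        return p;
    }

  private:
    std::size_t append(const void* p) {
        std::size_t offset = bytes_.size();
        bytes_.resize(offset + sizeof p);
        write(offset, p);
        return offset;
    }

    void write(std::size_t offset, const void* p) {
        std::memcpy(&bytes_[offset], &p, sizeof p);
    }

    std::vector<unsigned char> bytes_;
    std::vector<std::size_t> relocations_;
};

struct DemoResult {
    bool rawWordIsStale;      // raw immediate still points at the old cell
    bool gcPointerIsUpdated;  // recorded immediate follows the moved cell
};

DemoResult runRelocationDemo() {
    Cell oldCell{1};
    Cell newCell{1};
    ToyCodeBuffer buffer;
    std::size_t rawOffset = buffer.emitRawWord(&oldCell);
    std::size_t gcOffset = buffer.emitGCPointer(&oldCell);
    buffer.updateGCPointers(&oldCell, &newCell);  // simulate the cell moving
    return DemoResult{buffer.read(rawOffset) == &oldCell,
                      buffer.read(gcOffset) == &newCell};
}
```

In this model the raw word silently keeps the stale address. That is the kind of hazard the review comment is guarding against for the shape and type pointers.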
Created attachment 611614
Inline allocation, v2.
Implements the above changes. Unmarking LCreateThis as a call and moving the VM call to the OOL path appears to have made v8-deltablue run slightly more slowly (~15ms out of 485ms), but 'perf stat' shows a strict reduction in cycles and branches.