Closed
Bug 740167
Opened 12 years ago
Closed 12 years ago
IonMonkey: 15x slower on reduced kraken audio-beat-detection benchmark
Categories
(Core :: JavaScript Engine, defect)
Core
JavaScript Engine
Tracking
()
RESOLVED
FIXED
People
(Reporter: h4writer, Assigned: h4writer)
Details
Attachments
(1 file, 1 obsolete file)
In order to find where we still can improve IonMonkey I reduced kraken audio-beat-detection. This gives me a rather small testcase pinpointing possible improvements IonMonkey still lacks. IonMonkey is 15x slower on following benchmark (to methodJIT): BeatDetektor = function() { this.a_freq_range = new Array(128); this.ma_freq_range = new Array(128); }; BeatDetektor.prototype.process = function() { this.last_update = 1; var range = 0; for (x = 0; x < 512000; x += 1) { this.a_freq_range[range] = 0; this.ma_freq_range[range] -= (this.ma_freq_range[range] - this.a_freq_range[range]) * this.last_update * 12.0; } } var bd = new BeatDetektor(); bd.process(); The optimization we lack is hoisting property get/sets. In JaegerMonkey we hoist the getproperty to the start of the loop and save it in a temporary value. The set property is hoisted till after the loop. That way we don't need to do the slow set/get property during the loop. (Except if we call another function or something or bailout or ...)
Assignee | ||
Comment 1•12 years ago
|
||
Ah forgot to mention. The logic of jaegermonkey is in methodjit/LoopState.cpp
Assignee | ||
Updated•12 years ago
|
Summary: IonMonkey: implement hoisting getproperty/setproperty → IonMonkey: 15x slower on reduced kraken audio-beat-detection benchmark
Assignee | ||
Comment 2•12 years ago
|
||
Ok, I was a bit fast with my explanation and it wasn't right. The following shows the problem better: var ma_freq_range = new Array(128); //ma_freq_range[0] = 0 function process() { for (var x = 0; x < 512000; x += 1) { ma_freq_range[0] = ma_freq_range[0]*12; } } process(); If you uncomment the second line it will be as fast as JM. So because ma_freq_range isn't initialized the body of the loop isn't as optimized as it could: loadelement37-vn32loadelement elements29-vn29 constant0-vn7 mul38-vn34 mul loadelement37-vn32 constant0-vn33 storeelement39-vn39storeelement elements29-vn29 constant0-vn7 mul38-vn34 add40-vn41 add phi33-vn17 constant0-vn40 goto98-vn98 goto VS slots33-vn21 slots constant32-vn20 loadslot34-vn22 loadslot slots33-vn21 elements35-vn29 elements loadslot34-vn22 initializedlength36-vn30initializedlength elements35-vn29 loadelementhole37-vn31loadelementhole elements35-vn29 constant0-vn7 initializedlength36-vn30 mul41-vn34 mul loadelementhole37-vn31 box39-vn33 elements45-vn36 elements loadslot34-vn22 storeelementhole37-vn37storeelementhole loadslot34-vn22 elements45-vn36 constant0-vn7 mul41-vn34 add48-vn39 add phi28-vn17 constant0-vn38 goto90-vn90 goto Not sure what the best solution is.
How does Crankshaft do? JM has a pretty specific optimization for this case, I think it updates cached pointers to invalidated slots vectors. I can't think of any easy way to do this for IonMonkey yet.
Assignee | ||
Comment 4•12 years ago
|
||
V8: 46ms JM: 22ms IM: 287ms IM patched: 10ms So I've found the problem. TI returns "void/int/double" for the mul as lhs operand (getelem) and "int32" for rhs. This means lhs is translated into a MIRType_Value and we do the VM-call "Multiply". The performance difference here is the overhead of the call! In the best case MBinaryArithInstruction should be able to distinguish between MIRType_Value containing only elements from {MIRType_Undefined, MIRType_Null, MIRType_Int32, MIRType_Double, MIRType_Boolean} or also containing other. - Maybe introducing MIRType_Number_Value? - Or make it possible to view what the types exactly are in MIRType_Value? (To test speed difference I just adjusted TypeInferenceOracle::getMIRType to check the baseFlags for TYPE_FLAG_UNDEFINED | TYPE_FLAG_INT32 | TYPE_FLAG_DOUBLE to return MIRType_Double) Note: looks like by reducing the testcase it lost the real speed difference between JM and IM in the kraken audio-beat-detection benchmark. My quick fix doesn't introduce a speed difference on the full audio-beat-detection benchmark. So when I find some time I will reduce it again and see if I find something else.
Assignee | ||
Comment 5•12 years ago
|
||
Looks like my suggestion in previous comment is already possible, as MCompare does that. So this patch adjusts the infer function of MBinaryArithInstruction to get TypeOracle::BinaryTypes instead of TypeOracle::Binary. Using BinaryTypes it is possible to determine what types are used in MIRType_Value. V8: 53ms JM+TI: 13ms IM: 13ms
Assignee: general → hv1989
Attachment #611826 -
Flags: review?(dvander)
Comment on attachment 611826 [details] [diff] [review] Specialize binary arith to double if result is double and lhs/rhs only contains double/int/undefined/null Review of attachment 611826 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/ion/MIR.cpp @@ +805,5 @@ > + if ((lhs >= MIRType_String && > + (b.lhsTypes->baseFlags() == 0 || b.lhsTypes->baseFlags() >= types::TYPE_FLAG_STRING)) || > + (rhs >= MIRType_String && > + (b.rhsTypes->baseFlags() == 0 || b.rhsTypes->baseFlags() >= types::TYPE_FLAG_STRING))) { > + if (rval != MIRType_Double) { These tests are kind of complicated - it would be good to use local variables to shorten it a little. It looks like the string check is duplicated, and it looks like (Object + Object => Double) might accidentally get a double specialization.
Assignee | ||
Comment 7•12 years ago
|
||
Adjusted the check to use local variables and it now contains the correct logic.
Attachment #611826 -
Attachment is obsolete: true
Attachment #611826 -
Flags: review?(dvander)
Attachment #612445 -
Flags: review?(dvander)
Comment on attachment 612445 [details] [diff] [review] Specialize binary arith to double if result is double and lhs/rhs only contains double/int/undefined/null Review of attachment 612445 [details] [diff] [review]: ----------------------------------------------------------------- r=me with first two comments addressed. ::: js/src/ion/MIR.cpp @@ +802,5 @@ > + > + // Test if lhs contain only undefined/null/int32/double types > + types::TypeFlags lhsFlags = b.lhsTypes->baseFlags(); > + bool lhsIsNumber = lhsFlags != 0 && lhsFlags < types::TYPE_FLAG_STRING; > + lhsIsNumber = lhsIsNumber && b.lhsTypes->getObjectCount() == 0; This still seems fairly complicated. Instead, I'd add a function to TypeSet to answer the question you're trying to ask, also to avoid inlining TI internals here. @@ +809,5 @@ > + bool rhsIsNumber = true; > + if (b.rhsTypes) { > + types::TypeFlags rhsFlags = b.rhsTypes->baseFlags(); > + rhsIsNumber = rhsFlags != 0 && rhsFlags < types::TYPE_FLAG_STRING; > + rhsIsNumber = rhsIsNumber && b.rhsTypes->getObjectCount() == 0; Same here. @@ +820,5 @@ > } > > + // Don't specialize values (having only undefined/null/int32/double) when result isn't Double > + if (lhs == MIRType_Value || rhs == MIRType_Value) { > + if (rval != MIRType_Double) { At this point, we're guaranteed that lhs and rhs don't contain Object or String, right? @@ +842,1 @@ > specialization_ = MIRType_None; Right. Important not to specialize to Double here since that would conflict with TI.
Attachment #612445 -
Flags: review?(dvander) → review+
Assignee | ||
Comment 9•12 years ago
|
||
(In reply to David Anderson [:dvander] from comment #8) > This still seems fairly complicated. Instead, I'd add a function to TypeSet > to answer the question you're trying to ask, also to avoid inlining TI > internals here. Done. It's called "knownNonStringPrimitive". This matched the other functions (knownNonEmpty/KnownSubset) in TypeSet best. > @@ +820,5 @@ > > } > > > > + // Don't specialize values (having only undefined/null/int32/double) when result isn't Double > > + if (lhs == MIRType_Value || rhs == MIRType_Value) { > > + if (rval != MIRType_Double) { > > At this point, we're guaranteed that lhs and rhs don't contain Object or > String, right? Yes, the first check removes any type containing Object/String. (before the check listed here). The check listed here removes any not specialized input that can coerce to double, but TI report non-double return value. Maybe the comment made you doubt? I've adjusted it to be more clear. http://hg.mozilla.org/projects/ionmonkey/rev/ad6e00aaa6cd
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•