Closed
Bug 738873
Opened 12 years ago
Closed 12 years ago
IonMonkey: Optimize truncates to have fewer bailouts
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: mjrosenb, Unassigned)
References
Details
Attachments
(2 files, 2 obsolete files)
241 bytes,
application/javascript
|
Details | |
23.40 KB,
patch
|
sstangl
:
review+
|
Details | Diff | Splinter Review |
This optimization looks for (a+b)|0 (or any other operation that has an implicit truncateToInt32 in it, paired with an operation that has a non-int32 check, like add, sub, and eventually, div), and omits the overflow / remainder check and bailout.
Attachment #608939 -
Flags: review?(sstangl)
Comment 1•12 years ago
|
||
Could you make this patch on top of bug 736135 ? That patch will land soon and change some things... Secondly the patch in bug 736135 also introduces canOverflow boolean on MDiv (like MMul has already a long time). I would suggest you reuse that boolean for MDiv. And MSub/MAdd could use that same name (instead of elideOverflowCheck) ?
Reporter | ||
Comment 2•12 years ago
|
||
results of patch: ~/bench; js.ion.opt --ion -n --ion-trunc=off ./micro.js 13193 +2012-03-24 01:37 0 !218 (0) mrosenberg@planetaryac:$ ~/bench; js.ion.opt --ion -n --ion-trunc=on ./micro.js 11394 TEST COMPARISON FROM TO DETAILS ============================================================================= ** TOTAL **: 1.004x as fast 381.6ms +/- 0.2% 380.3ms +/- 0.1% significant ============================================================================= 3d: 1.005x as fast 60.8ms +/- 0.3% 60.5ms +/- 0.1% significant cube: 1.004x as fast 28.4ms +/- 0.3% 28.3ms +/- 0.2% significant morph: 1.015x as fast 8.7ms +/- 0.8% 8.5ms +/- 0.2% significant raytrace: - 23.8ms +/- 0.3% 23.7ms +/- 0.2% access: - 47.3ms +/- 0.4% 47.2ms +/- 0.2% binary-trees: - 26.5ms +/- 0.5% 26.5ms +/- 0.4% fannkuch: - 11.7ms +/- 0.4% 11.7ms +/- 0.2% nbody: ?? 5.5ms +/- 0.5% 5.5ms +/- 0.2% not conclusive: might be *1.005x as slow* nsieve: - 3.6ms +/- 0.6% 3.6ms +/- 0.2% bitops: 1.016x as fast 15.7ms +/- 0.2% 15.4ms +/- 0.1% significant 3bit-bits-in-byte: ?? 1.2ms +/- 1.2% 1.2ms +/- 0.4% not conclusive: might be *1.002x as slow* bits-in-byte: 1.019x as fast 5.5ms +/- 0.2% 5.4ms +/- 0.1% significant bitwise-and: - 4.3ms +/- 0.1% 4.2ms +/- 0.1% nsieve-bits: 1.030x as fast 4.7ms +/- 0.6% 4.6ms +/- 0.1% significant controlflow: ?? 3.2ms +/- 0.7% 3.2ms +/- 0.2% not conclusive: might be *1.003x as slow* recursive: ?? 3.2ms +/- 0.7% 3.2ms +/- 0.2% not conclusive: might be *1.003x as slow* crypto: - 35.0ms +/- 0.2% 35.0ms +/- 0.1% aes: *1.006x as slow* 19.5ms +/- 0.3% 19.6ms +/- 0.2% significant md5: - 9.3ms +/- 0.4% 9.2ms +/- 0.2% sha1: 1.010x as fast 6.2ms +/- 0.6% 6.2ms +/- 0.2% significant date: - 87.2ms +/- 0.4% 86.9ms +/- 0.4% format-tofte: - 47.9ms +/- 0.4% 47.9ms +/- 0.6% format-xparb: 1.006x as fast 39.2ms +/- 0.4% 39.0ms +/- 0.4% significant math: *1.005x as slow* 21.5ms +/- 0.3% 21.6ms +/- 0.1% significant cordic: *1.012x as slow* 4.1ms +/- 0.7% 4.2ms +/- 0.3% significant partial-sums: ?? 13.2ms +/- 0.4% 13.2ms +/- 0.1% not conclusive: might be *1.003x as slow* spectral-norm: *1.006x as slow* 4.2ms +/- 0.2% 4.2ms +/- 0.1% significant regexp: 1.014x as fast 14.1ms +/- 0.2% 13.9ms +/- 0.1% significant dna: 1.014x as fast 14.1ms +/- 0.2% 13.9ms +/- 0.1% significant string: 1.004x as fast 96.9ms +/- 0.3% 96.5ms +/- 0.2% significant base64: *1.006x as slow* 6.0ms +/- 0.4% 6.0ms +/- 0.3% significant fasta: ?? 8.8ms +/- 0.3% 8.9ms +/- 0.1% not conclusive: might be *1.003x as slow* tagcloud: 1.006x as fast 35.3ms +/- 0.5% 35.1ms +/- 0.2% significant unpack-code: 1.007x as fast 38.5ms +/- 0.4% 38.3ms +/- 0.2% significant validate-input: ?? 8.2ms +/- 0.5% 8.3ms +/- 0.2% not conclusive: might be *1.005x as slow* ~/bench/v8/v7; js.ion.opt --ion -n --ion-trunc=off ./run-crypto.js Crypto: 9865 Crypto: 9942 Crypto: 10385 Crypto: 10074 Crypto: 9708 Crypto: 9978 Crypto: 10033 Crypto: 10457 Crypto: 10364 ---- Score (version 7): 10087 ~/bench/v8/v7; js.ion.opt --ion -n --ion-trunc=off ./run.js Richards: 2067 DeltaBlue: 1053 Crypto: 10041 RayTrace: 385 EarleyBoyer: 523 RegExp: 466 Splay: 3423 NavierStokes: 658 ---- Score (version 7): 1211 ~/bench/v8/v7; js.ion.opt --ion -n --ion-trunc=on ./run.js Richards: 2090 DeltaBlue: 1080 Crypto: 10419 RayTrace: 386 EarleyBoyer: 522 RegExp: 464 Splay: 3414 NavierStokes: 682 ---- Score (version 7): 1227 .3 +2012-03-23 21:22 0 !179 (0) mrosenberg@planetaryac:$ ~/bench/v8/v7; js.ion.opt --ion -n --ion-trunc=on ./run-crypto.js Crypto: 10522 Crypto: 10161 Crypto: 10649 Crypto: 10570 Crypto: 10683 Crypto: 10842 Crypto: 10699 Crypto: 10644 Crypto: 10450 ---- Score (version 7): 10579 TEST COMPARISON FROM TO DETAILS ============================================================================= ** TOTAL **: - 5121.9ms +/- 0.1% 5113.5ms +/- 0.3% ============================================================================= ai: - 180.1ms +/- 0.2% 180.2ms +/- 0.3% astar: - 180.1ms +/- 0.2% 180.2ms +/- 0.3% audio: - 1562.4ms +/- 0.3% 1559.2ms +/- 0.4% beat-detection: - 509.5ms +/- 0.8% 505.0ms +/- 0.9% dft: ?? 663.1ms +/- 0.3% 663.9ms +/- 0.4% not conclusive: might be *1.001x as slow* fft: - 174.0ms +/- 0.3% 174.2ms +/- 0.4% oscillator: ?? 215.7ms +/- 0.3% 216.1ms +/- 0.3% not conclusive: might be *1.002x as slow* imaging: ?? 792.4ms +/- 0.2% 793.8ms +/- 0.3% not conclusive: might be *1.002x as slow* gaussian-blur: ?? 294.2ms +/- 0.2% 294.8ms +/- 0.3% not conclusive: might be *1.002x as slow* darkroom: ?? 335.7ms +/- 0.3% 336.7ms +/- 0.6% not conclusive: might be *1.003x as slow* desaturate: - 162.4ms +/- 0.7% 162.3ms +/- 0.7% json: ?? 143.6ms +/- 0.3% 144.0ms +/- 0.3% not conclusive: might be *1.003x as slow* parse-financial: ?? 91.2ms +/- 0.4% 91.5ms +/- 0.4% not conclusive: might be *1.003x as slow* stringify-tinderbox: ?? 52.4ms +/- 0.3% 52.5ms +/- 0.4% not conclusive: might be *1.002x as slow* stanford: - 2443.4ms +/- 0.2% 2436.2ms +/- 0.3% crypto-aes: - 1386.6ms +/- 0.3% 1387.0ms +/- 0.3% crypto-ccm: ?? 344.9ms +/- 0.3% 345.6ms +/- 0.4% not conclusive: might be *1.002x as slow* crypto-pbkdf2: 1.009x as fast 567.3ms +/- 0.5% 562.2ms +/- 0.5% significant crypto-sha256-iterative: 1.022x as fast 144.5ms +/- 0.3% 141.5ms +/- 0.3% significant
Reporter | ||
Comment 3•12 years ago
|
||
(In reply to Hannes Verschore from comment #1) > Could you make this patch on top of bug 736135 ? That patch will land soon > and change some things... > > Secondly the patch in bug 736135 also introduces canOverflow boolean on MDiv > (like MMul has already a long time). I would suggest you reuse that boolean > for MDiv. And MSub/MAdd could use that same name (instead of > elideOverflowCheck) ? Sure thing. I'd decided to change the name of the pass mid-writing, and decided to just leave it as a split name for testing. other thing to note: I'm going to strip out the debug fprintf's as well.
Reporter | ||
Comment 4•12 years ago
|
||
Reporter | ||
Comment 5•12 years ago
|
||
Merged this patch with all of the current "range analysis" stuff. I also split the old range analysis into a forwards and backwards pass, while adding the new code specifically into the backwards pass.
Attachment #608939 -
Attachment is obsolete: true
Attachment #608939 -
Flags: review?(sstangl)
Attachment #613813 -
Flags: review?(sstangl)
Comment 6•12 years ago
|
||
Can I raise some concerns. I noticed you moved the range analysis pass before the GVN pass. This actually worsen the effect of the 'NegativeZeroCheck' removal. This is because GVN can also remove the negative zero check and range analysis uses this information! Theoretical the pass would be best at the end, like it was. Another question you've added foldIfNegOne, but I nowhere found any calls to that function in your diff?
Reporter | ||
Comment 7•12 years ago
|
||
(In reply to Hannes Verschore from comment #6) > Can I raise some concerns. I noticed you moved the range analysis pass > before the GVN pass. This actually worsen the effect of the > 'NegativeZeroCheck' removal. This is because GVN can also remove the > negative zero check and range analysis uses this information! Theoretical > the pass would be best at the end, like it was. > I'd asked around if there was any reason that we did negative-zero before gvn, and didn't hear anything definite, so I moved it earlier. If you have any test cases that used to get optimized that no longer do, I'll fix them. Also is there some time we can chat for 15-20 minutes about this rather than through bugzilla/email? > Another question you've added foldIfNegOne, but I nowhere found any calls to > that function in your diff? oops, that was supposed to be called from the binary folding function, guess I never actually added it in. I guess there aren't many cases that people execute x & -1 in our benches :)
Comment 8•12 years ago
|
||
(In reply to Marty Rosenberg [:mjrosenb] from comment #7) > I'd asked around if there was any reason that we did negative-zero before > gvn, and didn't hear anything definite, so I moved it earlier. If you have > any test cases that used to get optimized that no longer do, I'll fix them. > Also is there some time we can chat for 15-20 minutes about this rather than > through bugzilla/email? Small example: (x*6)+(y*z) - Without extra information range analysis can remove one check of both checks if both have "negative zero checks" - GVN removes "negative zero check" on (x*6) - With this information, range analysis can remove the other "negative zero check" too. => GVN followed by RangeAnalysis: removes definitely both checks => RangeAnalysis followed by GVN: removes one or both checks depending on order of MUL's I'll try to be online from 6am PST (now) to 9am PST and from 1pm PST onwards
Comment 9•12 years ago
|
||
Comment on attachment 613813 [details] [diff] [review] /home/mrosenberg/patches/elideOverflow-r3.patch Review of attachment 613813 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/Makefile.in @@ +319,5 @@ > TypePolicy.cpp \ > ValueNumbering.cpp \ > VMFunctions.cpp \ > AliasAnalysis.cpp \ > + OverflowEliminator.cpp \ This is legacy stuff, right? It doesn't look like there is any such file in this cset. ::: js/src/ion/Ion.cpp @@ +681,5 @@ > IonSpewPass("Alias analysis"); > AssertGraphCoherency(graph); > } > > + if (js_IonOptions.rangeAnalysis) { It would be useful to explain the logic for requiring this pass to occur before GVN in a comment; it is not immediately obvious. ::: js/src/ion/Lowering.cpp @@ +541,5 @@ > JS_ASSERT(lhs->type() == MIRType_Int32); > ReorderCommutative(&lhs, &rhs); > LAddI *lir = new LAddI; > + if (!ins->implicitTruncate()) { > + if (!assignSnapshot(lir)) if (!ins->implicitTruncate() && !assignSnapshot(lir)) return false; The else block with the manual DEBUG spew to stderr should be removed. @@ +572,5 @@ > if (ins->specialization() == MIRType_Int32) { > JS_ASSERT(lhs->type() == MIRType_Int32); > LSubI *lir = new LSubI; > + if (!ins->implicitTruncate()) { > + if (!assignSnapshot(lir)) Same comments as with visitAdd(). ::: js/src/ion/MIR.cpp @@ +672,1 @@ > // This is only meaningfull when doing integer division preexisting nit: "meaningful". Period at end. s/when doing/for/. @@ +704,4 @@ > if (canBeNegativeZero_) > canBeNegativeZero_ = NeedNegativeZeroCheck(this); > + if (js::ion::RangeAnalysis::allUsesTruncate(this)) > + implicitTruncate_ = true; setTruncates(). @@ +709,4 @@ > } > > +bool > +MDiv::updateForReplacement(MDefinition *ins_, bool merging) Non-nit: The existence of the boolean parameter is troublesome. Booleans are horribly ugly and make maintenance irritating. Is there a better way to structure this? Could we get by without information about GVN phase leaking? Perhaps have a different function for each phase? updateForReplacement() is intended for instructions that are not equivalent, per comment in MIR.h. @@ +715,5 @@ > + return true; > + JS_ASSERT(ins_->isDiv()); > + MDiv *ins = ins_->toDiv(); > + // Since these properties are based solely on the inputs, GVN equivalence > + // implies that they are also equivalent. updateForReplacement() is commented that it is intended specifically for instructions that are not equivalent (MIR.h:409). @@ +720,5 @@ > + JS_ASSERT(canBeNegativeOverflow() == ins->canBeNegativeOverflow()); > + JS_ASSERT(canBeDivideByZero() == ins->canBeDivideByZero()); > + > + canBeNegativeZero_ = canBeNegativeZero() || ins->canBeNegativeZero(); > + implicitTruncate_ = implicitTruncate() && ins->implicitTruncate(); These should use setters. These comments apply to all the other updateForReplacement() functions. @@ +785,5 @@ > +void > +MAdd::analyzeRangeBackward() > +{ > + if (js::ion::RangeAnalysis::allUsesTruncate(this)) > + implicitTruncate_ = true; setTruncates(); ::: js/src/ion/MIR.h @@ +531,5 @@ > } > MResumePoint *resumePoint() const { > return resumePoint_; > } > + Don't need extra whitespace here. @@ +1838,5 @@ > MDefinition *foldIfEqual() { > return MConstant::New(Int32Value(0)); > } > + MDefinition *foldIfNegOne(size_t operand) { > + // Technically bitNot. This patch introduces a number of functions named foldIfNegOne(), but they appear to be unused. They should be removed. @@ +2052,5 @@ > return new MAdd(left, right); > } > + void analyzeRangeBackward(); > + > + bool implicitTruncate() const { This name seems strange, because it contains a verb but does not perform that action. How about simply "isTruncating()"? @@ +2055,5 @@ > + > + bool implicitTruncate() const { > + return implicitTruncate_; > + } > + void setImplicitTruncate(bool val) { Can get rid of the boolean parameter and just always set it to true. These methods should be put on MBinaryArithInstruction to avoid duplication. ::: js/src/ion/RangeAnalysis.cpp @@ +24,5 @@ > RangeAnalysis::analyze() > { > for (ReversePostorderIterator block(graph.rpoBegin()); block != graph.rpoEnd(); block++) { > for (MDefinitionIterator iter(*block); iter; iter++) { > + (*iter)->analyzeRangeForward(); iter->analyzeRangeForward(); @@ +28,5 @@ > + (*iter)->analyzeRangeForward(); > + } > + } > + > + for (PostorderIterator block(graph.poBegin()); block != graph.poEnd(); block++) { Comment on the necessity of the backwards phase: only used to find and group truncating definitions. ::: js/src/ion/RangeAnalysis.h @@ -14,5 @@ > > class RangeAnalysis > { > MIRGraph &graph; > - This whitespace should be kept. @@ +18,4 @@ > public: > RangeAnalysis(MIRGraph &graph); > bool analyze(); > + static bool allUsesTruncate(MInstruction *m) { This function should be defined in RangeAnalysis.cpp. Could we call it "AllUsesTruncate()"? Capitalized since static. @@ +20,5 @@ > bool analyze(); > + static bool allUsesTruncate(MInstruction *m) { > + for (MUseIterator use = m->usesBegin(); use != m->usesEnd(); use++) { > + if (!use->node()->isDefinition()) > + return false; if (use->node()->isResumePoint()) return false; @@ +33,5 @@ > + if(def->isBitXor()) > + continue; > + if(def->isLsh()) > + continue; > + if(def->isRsh()) nit: need whitespace between "if" and "(" ::: js/src/ion/ValueNumbering.cpp @@ +388,5 @@ > MDefinition *dom = findDominatingDef(defs, ins, index); > if (!dom) > return false; // Insertion failed. > > + if (dom == ins || !dom->updateForReplacement(ins, true)) { The boolean argument should go away; see comments in MIR.cpp. ::: js/src/ion/arm/CodeGenerator-arm.cpp @@ +525,5 @@ > masm.setupAlignedABICall(2); > masm.passABIArg(lhs); > masm.passABIArg(rhs); > masm.callWithABI(JS_FUNC_TO_DATA_PTR(void *, __aeabi_idivmod)); > // idivmod returns the qoutient in r0, and the remainder in r1. preexisting nit: "quotient" @@ +528,5 @@ > masm.callWithABI(JS_FUNC_TO_DATA_PTR(void *, __aeabi_idivmod)); > // idivmod returns the qoutient in r0, and the remainder in r1. > + if (!mir->implicitTruncate()) { > + masm.ma_cmp(r1, Imm32(0)); > + if (!bailoutIf(Assembler::NonZero, ins->snapshot())) Since the above operation is a comparison, it looks more idiomatic here to use Assembler::Equal, even though it aliases to the same thing as NonZero.
Attachment #613813 -
Flags: review?(sstangl) → review+
Reporter | ||
Comment 10•12 years ago
|
||
nits addressed, both sstangl's and h4writer's. lots of fiddly code movement, so asking for a quick follow-up review.
Attachment #613813 -
Attachment is obsolete: true
Attachment #614959 -
Flags: review?(sstangl)
Comment 11•12 years ago
|
||
Comment on attachment 614959 [details] [diff] [review] /home/mrosenberg/patches/elideOverflow-r4.patch Review of attachment 614959 [details] [diff] [review]: ----------------------------------------------------------------- ::: js/src/ion/Ion.cpp @@ +685,5 @@ > + if (js_IonOptions.rangeAnalysis) { > + RangeAnalysis rangeAnalysis(graph); > + if (!rangeAnalysis.analyzeEarly()) > + return false; > + IonSpewPass("Range Analysis"); This should be "Range Analysis Early" or something to differentiate it from the other pass. ::: js/src/ion/MIR.h @@ +409,5 @@ > > // Mark this instruction as having replaced all uses of ins, as during GVN, > // returning false if the replacement should not be performed. For use when > // GVN eliminates instructions which are not equivalent to one another. > + Don't need this added whitespace. @@ +1800,5 @@ > } > + > + MDefinition *foldIfNegOne(size_t operand) { > + return getOperand(1-operand); > + } Still don't need these functions! @@ +2099,4 @@ > double getIdentity() { > return 0; > } > + Still don't need this extra whitespace. @@ +2194,5 @@ > bool canBeDivideByZero() { > return canBeDivideByZero_; > } > + bool updateForReplacement(MDefinition *ins); > + Don't need extra whitespace. ::: js/src/ion/RangeAnalysis.cpp @@ +25,3 @@ > { > for (ReversePostorderIterator block(graph.rpoBegin()); block != graph.rpoEnd(); block++) { > for (MDefinitionIterator iter(*block); iter; iter++) { nit: Braces unnecessary on the innermost loop. Likewise for the other iterators in this file. @@ +25,4 @@ > { > for (ReversePostorderIterator block(graph.rpoBegin()); block != graph.rpoEnd(); block++) { > for (MDefinitionIterator iter(*block); iter; iter++) { > + (*iter)->analyzeRangeForward(); I think this can just be "iter->". Likewise for the other iterators in this file. @@ +43,5 @@ > +{ > + > + for (PostorderIterator block(graph.poBegin()); block != graph.poEnd(); block++) { > + for (MInstructionReverseIterator riter(block->rbegin()); riter != block->rend(); riter++) { > + (*riter)->analyzeTruncBackward(); Is truncating "Truncate" saving anything? "analyzeTruncateBackward" is probably OK. @@ +54,5 @@ > +bool > +RangeAnalysis::AllUsesTruncate(MInstruction *m) > +{ > + for (MUseIterator use = m->usesBegin(); use != m->usesEnd(); use++) { > + if (!use->node()->isDefinition()) if (use->node()->isResumePoint()) ::: js/src/ion/RangeAnalysis.h @@ +18,5 @@ > > public: > RangeAnalysis(MIRGraph &graph); > + bool analyzeEarly(); > + bool analyzeLate(); I like this. It's cute.
Attachment #614959 -
Flags: review?(sstangl) → review+
Comment 12•12 years ago
|
||
Looks like this has caused a large regression on the kraken-gaussian-blur test (371ms => 102447.8ms) going on what AWFY is indicating.
Reporter | ||
Comment 13•12 years ago
|
||
Yeah, this was pointed out last night. Testing on linux reveals no clues, I suspect I need to try this on an OS box.
Comment 14•12 years ago
|
||
I can reproduce this locally by running the entire kraken-1.1 testsuite.
It also reproduces by just running the test individually:
> time ~/dev/ionmonkey/js/src/opt32/js --ion -n -f imaging-gaussian-blur-data.js -f imaging-gaussian-blur.js
> real 0m49.261s
Reporter | ||
Comment 15•12 years ago
|
||
looks like when I updated my configure scripts to run on my desktop, they broke my laptop, such that I was building a 64 bit shell in my 32 bit directory. Now that *that* has been fixed, investigation will begin (and I suspect hilarity will ensue)
Reporter | ||
Comment 16•12 years ago
|
||
landed: http://hg.mozilla.org/projects/ionmonkey/rev/a5788d299d6a
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•