Closed
Bug 693449
Opened 13 years ago
Closed 13 years ago
IonMonkey: More ARM improvements
Categories
(Core :: JavaScript Engine, defect)
Tracking
()
RESOLVED
FIXED
People
(Reporter: mjrosenb, Assigned: mjrosenb)
Details
Attachments
(4 files, 1 obsolete file)
101.53 KB,
patch
|
jbramley
:
review+
|
Details | Diff | Splinter Review |
107.75 KB,
patch
|
jbramley
:
review+
|
Details | Diff | Splinter Review |
107.78 KB,
patch
|
Details | Diff | Splinter Review | |
104.20 KB,
patch
|
Details | Diff | Splinter Review |
Most things that I have tested seem to work. Current things that I know need to be implemented: Function calls Pools (Sadly, we *really* want these for loading doubles) In general, handling immediates that cannot fit into a single instruction ability to patch code after finalization.
Assignee | ||
Comment 1•13 years ago
|
||
*rename toInt() to encode() *rename writeBlob() to writeInst() *move most functions from {Macro,}Assembler-arm.h to {Macro,}Assembler.cpp and create MacroAssembler-arm.cpp, since it did not previously exist.
Assignee | ||
Updated•13 years ago
|
Attachment #566033 -
Flags: review?(Jacob.Bramley)
Assignee | ||
Comment 2•13 years ago
|
||
Comment on attachment 566079 [details] [diff] [review] Implement all of the simple, far reaching changes discussed in 690053 Most of this work was accomplished via emacs macros. Notably, I never bothered to strip out the comments from the header files. I suspect the comments delimiting sections should stay, and others should be stripped (or the comments should not exist, and the declarations should be grouped using some syntatic construct?)
Attachment #566079 -
Flags: review?(Jacob.Bramley)
Comment 3•13 years ago
|
||
Comment on attachment 566033 [details] [diff] [review] Improve on last week's patch of the ARM/IM code Review of attachment 566033 [details] [diff] [review]: ----------------------------------------------------------------- Mostly fine, but one or two questions. ::: .hgignore @@ +11,5 @@ > ^\.sw.$ > .[^/]*\.sw.$ > > +(^|/)#*# > +(^|/).#*# What are these? Emacs swap files? Comment please! ::: js/src/ion/arm/Architecture-arm.h @@ +64,5 @@ > // These offsets are related to bailouts. > //// > > // Size of each bailout table entry. On arm, this is at most a ldr, then a branch > +static const uint32 BAILOUT_TABLE_ENTRY_SIZE = 4; The size no longer matches up with the comment. LDR+B is 8 bytes. What's going on here? ::: js/src/ion/arm/Assembler-arm.cpp @@ +341,5 @@ > > ALUOp > ion::ALUNeg(ALUOp op, Imm32 *imm) > { > + return op_invalid; Did you mean to leave this in here? ::: js/src/ion/arm/Assembler-arm.h @@ +199,5 @@ > + } > + struct Serial; > + Serial encode(); > + // for serializing values > + struct Serial { I don't like the name of this. It is a very specialized struct with a generic name. How about VFPRegIndexSplit? @@ +482,3 @@ > return imm4L | (imm4H << 16); > }; > + Imm8VFPImmData() : imm4L(-1), imm4H(-1), isInvalid(-1) {} Why do you need a default constructor? @@ +733,2 @@ > VFPOffImm(uint32 imm) > + : VFPOff(datastore::Imm8VFPOffData(imm >> 2), imm < 0 ? IsDown : IsUp) {} I'd add ASSERT(!(imm & ~0x3)) to the content. @@ +1209,5 @@ > > // VFP instructions! > + enum vmov_op { > + opv_dbl = 0xC000A10, > + opv_sng = 0xC000A10 Are these supposed to be identical? Also, you've only given 7 nibbles. If you meant 0x0C000A10, it's best to write it that way to avoid confusion. @@ +1225,5 @@ > JS_ASSERT(vd.equiv(vn) && vd.equiv(vm)); > + vfp_size sz = isDouble; > + if (!vd.isDouble()) { > + sz = isSingle; > + } It's personal preference, but I think this would be neater: vfp_size sz = (vd.isDouble()) ? (isDouble) : (isSingle); (Also seen elsewhere.) @@ +1296,5 @@ > > // specifically, a move between two same sized-registers > void as_vmov(VFPRegister vd, VFPRegister vsrc, Condition c = Always) > { > + as_vfp_float(vd, NoVFPRegister, vsrc, opv_add, c); opv_add? @@ +1306,5 @@ > }; > + > + enum VFPXferSize { > + WordTransfer = 0x0E000A10, > + DoubleTransfer = 0x0C400A10 These are just single bits combined with opv_dbl and opv_sng. Ideally, you should express them in terms of opv_(dbl|sng) or just express the single bits and combine them in as_vxfer. @@ +1320,5 @@ > Condition c = Always) > { > + vfp_size sz = isSingle; > + if (vm.isDouble()) { > + // Technically, this can be done with a vmov à la A8-644/A8-646 What are A8-644 and A8-646? @@ +1325,5 @@ > + // however, that requires at least an extra bit saying if the > + // operation should be performed on the lower or upper half of the > + // double. Moving a single to/from 2N/2N+1 isn't equivalent, > + // since there are 32 single registers, and 32 double registers > + // so there is no way to encode the last 16 double registers. For the record, you can also VMOV two adjacent but unaligned S registers. @@ +1367,5 @@ > + vfp_size sz = isDouble; > + if (!vd.isDouble()) { > + sz = isSingle; > + } else { > + length = length << 1; "length *= 2" is clearer, in my opinion. @@ +1381,5 @@ > + vfp_size sz = isDouble; > + if (!vd.isDouble()) { > + // totally do not know how to handle this right now > + sz = isSingle; > + JS_NOT_REACHED("non-double immediate"); See here: http://code.google.com/p/v8/source/browse/branches/bleeding_edge/src/arm/assembler-arm.cc#1908 or my original here (for a month): http://pastebin.mozilla.org/1353337 That is actually my code (with ARM copyright) so you can use it under the tri-license without any licensing mess. I wrote it for NanoJIT but never integrated it. I gave it to some V8 contributors at ARM, which is how it ended up there. @@ +1605,5 @@ > + ret = (ret << 1) | b; > + return ret; > + } > + uint32 encode(uint8 value) { > + //ARM ARM A7-25 Page references to the ARM ARM are (unfortunately) a bad idea. Things might get moved around in future revisions and suchlike. ::: js/src/ion/arm/CodeGenerator-arm.cpp @@ +96,5 @@ > + masm.ma_b(ifTrue->label(), cond); > + } else { > + masm.ma_b(ifFalse->label(), Assembler::InvertCondition(cond)); > + if (!isNextBlock(ifTrue)) > + masm.ma_b(ifTrue->label()); Curly brackets. Also, this would be an ideal place to flush the literal pool (once we get to that). To stop it happening too often, a 'MaybeFlushLiteralPool' function would be handy. @@ +130,5 @@ > > // Test the operand > + // jbramley says that cmp is more efficent than tst. > + // this can be replaced with cmp opd, 0. > + masm.ma_tst(ToRegister(opd), ToRegister(opd)); He does! Why did you revert this back to ma_tst? @@ +310,5 @@ > + > + if (rhs->isConstant()) > + masm.ma_sub(ToRegister(lhs), Imm32(ToInt32(rhs)), ToRegister(dest)); > + else > + masm.ma_sub(ToRegister(lhs), ToOperand(rhs), ToRegister(dest)); Curly brackets. (I'm not going to mention them elsewhere, but there are several other missing ones.) @@ +492,5 @@ > + const LAllocation *lhs = ins->getOperand(0); > + const LAllocation *rhs = ins->getOperand(1); > + const LDefinition *dest = ins->getDef(0); > + > + switch (ins->bitop()) { In all of these, you actually need to explicitly AND the rhs register with 0x1f, for the same reason that you do for the immediates. (That is, unless JSOP_*SH guarantees that it won't give shifts larger than 0x1f.) @@ +512,5 @@ > + masm.ma_asr(Imm32(ToInt32(rhs) & 0x1F), ToRegister(lhs), ToRegister(dest)); > + else > + masm.ma_asr(ToRegister(rhs), ToRegister(lhs), ToRegister(dest)); > + > + // Note: this is an unsigned operation. ASR (arithmetic shift right) is surely a signed operation. Should you have used LSR here, and ASR for the JSOP_RSH case? @@ +848,5 @@ > const LConstantIndex *cindex = ins->getOperand(0)->toConstantIndex(); > const Value &v = graph.getConstant(cindex->index()); > > jsdpun dpun; > dpun.d = v.toDouble(); Type punning through a union is a GCC extension. Using memcpy is always better. @@ +849,5 @@ > const Value &v = graph.getConstant(cindex->index()); > > jsdpun dpun; > dpun.d = v.toDouble(); > + if ((dpun.u64 &0xffffffff) == 0) { Put a space after '&'. @@ +850,5 @@ > > jsdpun dpun; > dpun.d = v.toDouble(); > + if ((dpun.u64 &0xffffffff) == 0) { > + VFPImm dblEnc(dpun.u64 >> 32); It looks like you really want a punning destination with two 32-bit words. That would probably generate better code, too. @@ +922,5 @@ > + uint32 argslot = arg->argslot(); > + int32 stack_offset = StackOffsetOfPassedArg(argslot); > + > + masm.ma_str(val.typeReg(), DTRAddr(StackPointer, DtrOffImm(stack_offset))); > + masm.ma_str(val.payloadReg(), DTRAddr(StackPointer, DtrOffImm(stack_offset+4))); Ideally, these would be passed through some ma_str64 method that knows about STRD and suchlike, like the Store64* methods I added to Macro Assembler. You need to have suitable typeReg and payloadReg assignments for optimal code, though. ::: js/src/ion/arm/CodeGenerator-arm.h @@ +86,5 @@ > > bool emitDoubleToInt32(const FloatRegister &src, const Register &dest, Label *fail); > void emitTruncateDouble(const FloatRegister &src, const Register &dest, Label *fail); > + // Emits a conditional set. > + void emitSet(Assembler::Condition cond, const Register &dest); This isn't defined (yet). However, I'm curious. To what is 'dest' set? Shouldn't this take an additional argument (or two)?
Comment 4•13 years ago
|
||
Comment on attachment 566079 [details] [diff] [review] Implement all of the simple, far reaching changes discussed in 690053 Review of attachment 566079 [details] [diff] [review]: ----------------------------------------------------------------- I skimmed the bits that just looked like they were a mess move and suchlike. I'd leave the comments in the headers, as they are now. ::: js/src/ion/arm/Assembler-arm.cpp @@ +714,5 @@ > + as_alu(InvalidReg, src1, op2, op_tst, SetCond, c); > +} > + > +// Not quite ALU worthy, but useful none the less: > +// These also have the isue of these being formatted s/isue/issue/ @@ +802,5 @@ > +} > + > +// Branch can branch to an immediate *or* to a register. > +// Branches to immediates are pc relative, branches to registers > +// are absolute The B instruction itself is relative-immediate-only. BX can be used to branch to an address in a register. Note that the three as_b implementations only actually use immediate offsets here.
Attachment #566079 -
Flags: review?(Jacob.Bramley) → review+
Comment 5•13 years ago
|
||
(In reply to Jacob Bramley [:jbramley] from comment #4) > I skimmed the bits that just looked like they were a mess move and suchlike. I meant "mass move" of course.
Updated•13 years ago
|
Assignee: general → mrosenberg
Status: NEW → ASSIGNED
Assignee | ||
Comment 6•13 years ago
|
||
This also rebases against the latest patchset in the IonMonkey repo... Now to rebase the next patch
Attachment #566033 -
Attachment is obsolete: true
Attachment #566033 -
Flags: review?(Jacob.Bramley)
Attachment #566768 -
Flags: review?(Jacob.Bramley)
Assignee | ||
Comment 7•13 years ago
|
||
This time around, I'm attempting to fix the nits that need fixing, then explain the rest, since I missed some last time. (In reply to Jacob Bramley [:jbramley] from comment #3) > Comment on attachment 566033 [details] [diff] [review] [diff] [details] [review] > Improve on last week's patch of the ARM/IM code > > > ^\.sw.$ > > .[^/]*\.sw.$ > > > > +(^|/)#*# > > +(^|/).#*# > > What are these? Emacs swap files? Comment please! > Those are from my own .hgignore. I should find some way of adding those that doesn't get pushed upstream ever :( > ::: js/src/ion/arm/Architecture-arm.h > @@ +64,5 @@ > > // These offsets are related to bailouts. > > //// > > > > // Size of each bailout table entry. On arm, this is at most a ldr, then a branch > > +static const uint32 BAILOUT_TABLE_ENTRY_SIZE = 4; > > The size no longer matches up with the comment. LDR+B is 8 bytes. What's > going on here? > comment has been updated, and expanded to remind me to actually fix it back to the way it should be (8, but for a different reason). > ::: js/src/ion/arm/Assembler-arm.cpp > @@ +341,5 @@ > > > > ALUOp > > ion::ALUNeg(ALUOp op, Imm32 *imm) > > { > > + return op_invalid; > > Did you mean to leave this in here? > No, but it did turn up a bug! I should probably make all of the functions that are able to fail without spoiling the compilation fail under certain conditions (env variables maybe?) > ::: js/src/ion/arm/Assembler-arm.h > @@ +199,5 @@ > > + } > > + struct Serial; > > + Serial encode(); > > + // for serializing values > > + struct Serial { > > I don't like the name of this. It is a very specialized struct with a > generic name. How about VFPRegIndexSplit? I didn't like the name horribly either, which is why it is embedded in the VFPRegister class > > @@ +482,3 @@ > > return imm4L | (imm4H << 16); > > }; > > + Imm8VFPImmData() : imm4L(-1), imm4H(-1), isInvalid(-1) {} > > Why do you need a default constructor? > because I declare an array of them, which requires a default constructor. > @@ +733,2 @@ > > VFPOffImm(uint32 imm) > > + : VFPOff(datastore::Imm8VFPOffData(imm >> 2), imm < 0 ? IsDown : IsUp) {} > > I'd add ASSERT(!(imm & ~0x3)) to the content. > I'm not going to bother, since I have another patch in the pipeline that changes this up again. > @@ +1209,5 @@ > > > > // VFP instructions! > > + enum vmov_op { > > + opv_dbl = 0xC000A10, > > + opv_sng = 0xC000A10 > > Are these supposed to be identical? Also, you've only given 7 nibbles. If > you meant 0x0C000A10, it's best to write it that way to avoid confusion. > They aren't referenced, and have been removed. > @@ +1225,5 @@ > > JS_ASSERT(vd.equiv(vn) && vd.equiv(vm)); > > + vfp_size sz = isDouble; > > + if (!vd.isDouble()) { > > + sz = isSingle; > > + } > > It's personal preference, but I think this would be neater: > vfp_size sz = (vd.isDouble()) ? (isDouble) : (isSingle); > (Also seen elsewhere.) It is indeed neater, but that was so I could break on said lines. > > @@ +1296,5 @@ > > > > // specifically, a move between two same sized-registers > > void as_vmov(VFPRegister vd, VFPRegister vsrc, Condition c = Always) > > { > > + as_vfp_float(vd, NoVFPRegister, vsrc, opv_add, c); > > opv_add? good catch. > > @@ +1306,5 @@ > > }; > > + > > + enum VFPXferSize { > > + WordTransfer = 0x0E000A10, > > + DoubleTransfer = 0x0C400A10 > > These are just single bits combined with opv_dbl and opv_sng. Ideally, you > should express them in terms of opv_(dbl|sng) or just express the single > bits and combine them in as_vxfer. > Expressing the single bits is fine, but I am not sure where to put the rest of the immediate. the old opv_dbl didn't really belong. > @@ +1320,5 @@ > > Condition c = Always) > > { > > + vfp_size sz = isSingle; > > + if (vm.isDouble()) { > > + // Technically, this can be done with a vmov à la A8-644/A8-646 > > What are A8-644 and A8-646? pages in the ARM ARM. that part of the comment has been stripped. > > @@ +1325,5 @@ > > + // however, that requires at least an extra bit saying if the > > + // operation should be performed on the lower or upper half of the > > + // double. Moving a single to/from 2N/2N+1 isn't equivalent, > > + // since there are 32 single registers, and 32 double registers > > + // so there is no way to encode the last 16 double registers. > > For the record, you can also VMOV two adjacent but unaligned S registers. good to know! > @@ +130,5 @@ > > > > // Test the operand > > + // jbramley says that cmp is more efficent than tst. > > + // this can be replaced with cmp opd, 0. > > + masm.ma_tst(ToRegister(opd), ToRegister(opd)); > > He does! Why did you revert this back to ma_tst? I think the code that I'd written was ma_cmp(ToRegister(opd), ToRegister(opd)) which was wrong, and rather than thinking about why it was wrong (for testing of course), I just reverted it back to ma_tst with the same arguments. > > @@ +492,5 @@ > > + const LAllocation *lhs = ins->getOperand(0); > > + const LAllocation *rhs = ins->getOperand(1); > > + const LDefinition *dest = ins->getDef(0); > > + > > + switch (ins->bitop()) { > > In all of these, you actually need to explicitly AND the rhs register with > 0x1f, for the same reason that you do for the immediates. (That is, unless > JSOP_*SH guarantees that it won't give shifts larger than 0x1f.) > I've added a comment saying this needs to be done. I need to change a bunch of code to steal an extra register, or mark one as dead. > @@ +512,5 @@ > > + masm.ma_asr(Imm32(ToInt32(rhs) & 0x1F), ToRegister(lhs), ToRegister(dest)); > > + else > > + masm.ma_asr(ToRegister(rhs), ToRegister(lhs), ToRegister(dest)); > > + > > + // Note: this is an unsigned operation. > > ASR (arithmetic shift right) is surely a signed operation. Should you have > used LSR here, and ASR for the JSOP_RSH case? I'm going to assume that I was asleep while writing this. > > @@ +848,5 @@ > > const LConstantIndex *cindex = ins->getOperand(0)->toConstantIndex(); > > const Value &v = graph.getConstant(cindex->index()); > > > > jsdpun dpun; > > dpun.d = v.toDouble(); > > Type punning through a union is a GCC extension. Using memcpy is always > better. > that was copied from x86, If we change it here, it should also be changed there. > @@ +850,5 @@ > > > > jsdpun dpun; > > dpun.d = v.toDouble(); > > + if ((dpun.u64 &0xffffffff) == 0) { > > + VFPImm dblEnc(dpun.u64 >> 32); > > It looks like you really want a punning destination with two 32-bit words. > That would probably generate better code, too. > I think the union already has a 32+32 struct in it. Mind if this gets rolled into a later patch? > @@ +922,5 @@ > > + uint32 argslot = arg->argslot(); > > + int32 stack_offset = StackOffsetOfPassedArg(argslot); > > + > > + masm.ma_str(val.typeReg(), DTRAddr(StackPointer, DtrOffImm(stack_offset))); > > + masm.ma_str(val.payloadReg(), DTRAddr(StackPointer, DtrOffImm(stack_offset+4))); > > Ideally, these would be passed through some ma_str64 method that knows about > STRD and suchlike, like the Store64* methods I added to Macro Assembler. You > need to have suitable typeReg and payloadReg assignments for optimal code, > though. Agreed. I'm not too sure how the register allocator needs to be modified to assign these regs efficently. > > ::: js/src/ion/arm/CodeGenerator-arm.h > @@ +86,5 @@ > > > > bool emitDoubleToInt32(const FloatRegister &src, const Register &dest, Label *fail); > > void emitTruncateDouble(const FloatRegister &src, const Register &dest, Label *fail); > > + // Emits a conditional set. > > + void emitSet(Assembler::Condition cond, const Register &dest); > > This isn't defined (yet). However, I'm curious. To what is 'dest' set? > Shouldn't this take an additional argument (or two)? It is supposed to mimmic an x86 instruction that sets a register depending on the condition codes, so setne %eax will set eax to 1 if the equal bit is off, and to 0 if it is on.
Comment 8•13 years ago
|
||
(In reply to Marty Rosenberg [:Marty] from comment #7) > > > ALUOp > > > ion::ALUNeg(ALUOp op, Imm32 *imm) > > > { > > > + return op_invalid; > > > > Did you mean to leave this in here? > > > No, but it did turn up a bug! > I should probably make all of the functions that are able to fail without > spoiling the compilation fail under certain conditions (env variables maybe?) That is an excellent idea. Debug-only environment variables would work well. It'd also be handy to have conditional failures, like 'fail after N times', 'fail when called by X', 'fail with probability Y' and suchlike. These are things to add later, however. > > Why do you need a default constructor? > > > because I declare an array of them, which requires a default constructor. Oh, I see. I didn't spot that. > > I'd add ASSERT(!(imm & ~0x3)) to the content. > > > I'm not going to bother, since I have another patch in the pipeline that > changes this up again. Right-o. > > @@ +1306,5 @@ > > > }; > > > + > > > + enum VFPXferSize { > > > + WordTransfer = 0x0E000A10, > > > + DoubleTransfer = 0x0C400A10 > > > > These are just single bits combined with opv_dbl and opv_sng. Ideally, you > > should express them in terms of opv_(dbl|sng) or just express the single > > bits and combine them in as_vxfer. > > > Expressing the single bits is fine, but I am not sure where to put the rest > of the immediate. > the old opv_dbl didn't really belong. Well you removed opv_dbl anyway, so it's probably best to least VFPXferSize as it is. > > Type punning through a union is a GCC extension. Using memcpy is always > > better. > > > that was copied from x86, If we change it here, it should also be changed > there. Ok, not a big problem, just something specific that bugs me :-) I'm _really_ conservative about type aliasing because it results in the worst possible class of bugs when it goes wrong. (Weird wrong-result failures rather than crashes, intermittent, goes away with different compiler options and suchlike.) > > It looks like you really want a punning destination with two 32-bit words. > > That would probably generate better code, too. > > > I think the union already has a 32+32 struct in it. Mind if this gets > rolled into a later patch? No problem.
Updated•13 years ago
|
Attachment #566768 -
Flags: review?(Jacob.Bramley) → review+
Assignee | ||
Comment 9•13 years ago
|
||
Assignee | ||
Comment 10•13 years ago
|
||
http://hg.mozilla.org/projects/ionmonkey/rev/2101d4b040ee http://hg.mozilla.org/projects/ionmonkey/rev/6e442768f0ca
Updated•13 years ago
|
Summary: More ARM/IM improvements → IonMonkey: More ARM improvements
Assignee | ||
Updated•13 years ago
|
Status: ASSIGNED → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•