Last Comment Bug 698564 - IonMonkey: implement gc tracing for ARM
: IonMonkey: implement gc tracing for ARM
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: ARM Linux
: -- normal (vote)
: ---
Assigned To: general
:
: Jason Orendorff [:jorendorff]
Mentors:
Depends on: 1125701
Blocks:
  Show dependency treegraph
 
Reported: 2011-10-31 13:22 PDT by Marty Rosenberg [:mjrosenb]
Modified: 2015-01-25 19:30 PST (History)
1 user (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
implement a bit-extractor to safely decode instructions (15.95 KB, patch)
2011-11-09 12:24 PST, Marty Rosenberg [:mjrosenb]
no flags Details | Diff | Splinter Review
Use the instruction decoder to trace (44.64 KB, patch)
2011-11-09 12:26 PST, Marty Rosenberg [:mjrosenb]
Jacob.Bramley: review+
Details | Diff | Splinter Review
Tracing take 2 (38.02 KB, patch)
2011-12-02 05:07 PST, Jan de Mooij [:jandem]
Jacob.Bramley: review+
Details | Diff | Splinter Review

Description Marty Rosenberg [:mjrosenb] 2011-10-31 13:22:39 PDT

    
Comment 1 Marty Rosenberg [:mjrosenb] 2011-11-09 12:24:23 PST
Created attachment 573282 [details] [diff] [review]
implement a bit-extractor to safely decode instructions
Comment 2 Marty Rosenberg [:mjrosenb] 2011-11-09 12:26:06 PST
Created attachment 573284 [details] [diff] [review]
Use the instruction decoder to trace
Comment 3 Jacob Bramley [:jbramley] 2011-11-10 08:38:35 PST
Comment on attachment 573284 [details] [diff] [review]
Use the instruction decoder to trace

Review of attachment 573284 [details] [diff] [review]:
-----------------------------------------------------------------

I only see formatting and trivial issues. My review of the tracing stuff was light as I'm not aware of all that's going on there.

High-level comments:

 - There are quite a few blocks of code now wrapped in #if 0. Is this really necessary? If they can't be deleted, they should perhaps have TODO comments explaining that they're there to support future work and suchlike.

 - Many curly brackets are missing.

::: js/src/ion/arm/Assembler-arm.cpp
@@ +184,5 @@
> +Assembler::Zero(EQ);
> +const Assembler::Condition
> +Assembler::NonZero(NE);
> +const Assembler::Condition
> +Assembler::Always(AL);

These should be on one line each. (Better readability _and_ more compact.)

@@ +189,4 @@
>  
>  Assembler::Condition
>  Assembler::InvertCondition(Condition cond)
> +\{

Looks like a stray backslash.

@@ +1348,1 @@
>          return false;

Add curly brackets whilst you're here.

@@ +1349,1 @@
>      // and let everyone know about it.

This comment no longer makes sense.

@@ +1475,5 @@
> +        // this gets encoded as an absolute jump, so it happens to be correct
> +        // now. This is the place to optimize the absolute load/branch
> +        // to a single relative branch
> +
> +    }

That's just an empty loop!

@@ +1603,5 @@
> +    if (dataRelocations_.length()) {
> +        CompactBufferReader reader(dataRelocations_);
> +        ::TraceDataRelocations(trc, m_buffer.buffer(), reader);
> +    }
> +}

Many missing curly branckets around here.

@@ +1614,5 @@
> +
> +void
> +Assembler::processDeferredData(IonCode *code, uint8 *data)
> +{
> +

Extra white space.

@@ +1618,5 @@
> +
> +    JS_ASSERT(dataSize() == 0);
> +    for (size_t i = 0; i < data_.length(); i++) {
> +        DeferredData *deferred = data_[i];
> +        //Bind(code, deferred->label(), data + deferred->offset());

Why is this commented? What's it for?

::: js/src/ion/arm/Assembler-arm.h
@@ +776,5 @@
> +        return base + ((int32)(8<<data) >> 8);
> +    }
> +    virtual uint32 decode(uint32 x) const {
> +        int32 xx = x;
> +        return ((xx << 8)>>6);

Whilst this is common usage, note that >> is not necessarily an arithmetic shift (and technically a uint->int conversion is implementation-defined for values that would be negative). An assembly implementation is the only well-defined way to achieve this without a loss of efficiency.

A equivalent, valid, pure-C implementation (requiring 4 instructions rather of 2) might look like this:
   return (((x & 0x00ffffff) ^ 0x00800000) - 0x00800000) * 4;

@@ +906,5 @@
>          NotUnordered = VC,
>          Greater_Unordered = Above
>          // there are more.
>      };
> +#endif

Can't you just delete it?

@@ +971,5 @@
>      js::Vector<CodeLabel *, 0, SystemAllocPolicy> codeLabels_;
> +    // The table of jumps has two main functions.  Its primary function is to
> +    // give offsets for branches to code sections that need to be traced.
> +    // Its second function is to give the offsets of the jumps for the code
> +    // copying stage, where we (may) need to replace and absolute adress with

s/and/an/

@@ +973,5 @@
> +    // give offsets for branches to code sections that need to be traced.
> +    // Its second function is to give the offsets of the jumps for the code
> +    // copying stage, where we (may) need to replace and absolute adress with
> +    // an immediate offset (and of course, doing the same during a copy section
> +    // of a copy collection

Unmatched (

::: js/src/ion/arm/CodeGenerator-arm.cpp
@@ +834,5 @@
>  CodeGeneratorARM::emitDoubleToInt32(const FloatRegister &src, const Register &dest, Label *fail)
>  {
>      // convert the floating point value to an integer, if it did not fit,
>      //     then when we convert it *back* to  a float, it will have a
> +    //        different value, which we can test.

The indentation here is more broken than it was before. Is this your editor's auto-indentation getting confused?

::: js/src/ion/arm/MacroAssembler-arm.cpp
@@ +755,5 @@
>      uint32 trg = (uint32)target;
> +    if (!use_pool()) {
> +        as_movw(ScratchRegister, Imm16(trg & 0xffff), c);
> +        as_movt(ScratchRegister, Imm16(trg >> 16), c);
> +        // this is going to get the branch predictor pissed off.

No more than loading from memory and immediately branching to it.

@@ +913,5 @@
>  void
>  MacroAssemblerARM::movePtr(ImmGCPtr imm, const Register dest)
>  {
>      ma_mov(imm, dest);
> +    

Superfluous white space.
Comment 4 Luke Wagner [:luke] 2011-11-17 08:29:08 PST
Comment on attachment 573282 [details] [diff] [review]
implement a bit-extractor to safely decode instructions

Due to the level of abstraction, I kindof have no idea what this code is trying to do.  But, if I get the gist right, I suspect the whole thing could be done without any virtual dispatch.  Perhaps we could chat later?
Comment 5 Luke Wagner [:luke] 2011-11-29 16:03:16 PST
Comment on attachment 573282 [details] [diff] [review]
implement a bit-extractor to safely decode instructions

From IRL: we talked about monomorphizing this by making ADT-like classes for reading/writing individual instructions.
Comment 6 Jan de Mooij [:jandem] 2011-12-02 05:07:37 PST
Created attachment 578552 [details] [diff] [review]
Tracing take 2

Attaching this patch for Marty since he's unable to upload it atm.
Comment 7 Jacob Bramley [:jbramley] 2011-12-05 07:00:21 PST
Comment on attachment 578552 [details] [diff] [review]
Tracing take 2

Review of attachment 578552 [details] [diff] [review]:
-----------------------------------------------------------------

One or two things look broken, and there are a few errors in comments and suchlike, but the changes required should be trivial so this has my r+ with those changes made.

General comment: Lots of missing curly brackets.

::: js/src/ion/arm/Assembler-arm.cpp
@@ +123,5 @@
> +Register
> +js::ion::toRM(Instruction &i)
> +{
> +    return Register::FromCode((i.encode()>>8) & 0xf);
> +}

Admittedly, the locations of RM, RD and so on move around from instruction to instruction, but I would argue that the names of the functions above are misleading:

  toR is used to extract the RM field from a BX or BLX instruction. (I think they're the only branch-to-register instructions that we use.)

  toRM isn't actually used at the moment, but I think it only works for MUL instructions.

  toRD is correct for ALU instructions (but not multiply).

It's all very frustrating, but in general, you need to know at least the class of the instruction in order to know which register is in which field.

Could these be member functions of the Inst* derived classes, perhaps?

@@ +178,5 @@
> +    return (i.encode() & IsDTRMask) == IsDTR;
> +}
> +
> +InstDTR *
> +InstDTR::asTHIS(Instruction &i)

Confusing use of 'THIS', I think, but I can't think of a good alternative.

@@ +331,5 @@
> +
> +InstMovW *
> +InstMovW::asTHIS(Instruction &i)
> +{
> +    return (InstMovW*) (&i);

This (and the MOVT equivalent) should probably have isTHIS checks, like the other asTHIS methods.

@@ +346,5 @@
> +    return (i.encode() & IsWTMask) == IsT;
> +}
> +
> +Imm16::Imm16(Instruction &inst)
> +  : lower(inst.encode() & 0xfff), upper(inst.encode() >> 16), invalid(0xfff) {}

That constructor ought to be with the class declaration, simply because there's another constructor there and that might not be clear when debugging.

@@ +385,5 @@
>          return offset_;
>      }
>  };
> +uint32 *
> +Assembler::getCF32Target(Instruction *jump)

What is a CF32? A comment would help.

@@ +396,5 @@
> +        return imm.getDest(jump)->raw();
> +    } else if (jump->is<InstMovW>() &&
> +               jump->next()->is<InstMovT>() &&
> +               jump->next()->next()->is<InstBranchReg>()) {
> +        // see if we have the complex case,

It looks like a nested call to getPtr32Target could simplify this somewhat.

@@ +399,5 @@
> +               jump->next()->next()->is<InstBranchReg>()) {
> +        // see if we have the complex case,
> +        // movw r_temp, #imm1
> +        // movt r_temp, #imm2
> +        // b r_temp

You can never B to a register. For IM, I think it can be either BX or BLX.

@@ +416,5 @@
> +        top->extractImm(&targ_top);
> +        // make sure they are being loaded intothe same register
> +        top->checkDest(temp);
> +        // make sure we're branching to the same register.
> +        branch->checkDest(temp);

You've ignored the return value. Did you mean to ASSERT this?

@@ +417,5 @@
> +        // make sure they are being loaded intothe same register
> +        top->checkDest(temp);
> +        // make sure we're branching to the same register.
> +        branch->checkDest(temp);
> +        uint32 *dest = (uint32*) (targ_bot.decode() | (targ_top.decode() << 12));

I think you mean << 16.

@@ +418,5 @@
> +        top->checkDest(temp);
> +        // make sure we're branching to the same register.
> +        branch->checkDest(temp);
> +        uint32 *dest = (uint32*) (targ_bot.decode() | (targ_top.decode() << 12));
> +        return dest;

You should also assert that there's a branch instruction after the MOVW/MOVT pair (and that it targets 'temp').

@@ +425,5 @@
> +    return NULL;
> +}
> +
> +uint32 *
> +Assembler::getPtr32Target(Instruction *load, Register *dest)

It seems to me that this works for any 32-bit literal, not just pointers. Does that matter?

@@ +445,5 @@
> +        bottom->extractDest(&temp);
> +        // extract the top part of the immediate
> +        top->extractImm(&targ_top);
> +        // make sure they are being loaded intothe same register
> +        top->checkDest(temp);

You didn't use the return value. (ASSERT?)

@@ +446,5 @@
> +        // extract the top part of the immediate
> +        top->extractImm(&targ_top);
> +        // make sure they are being loaded intothe same register
> +        top->checkDest(temp);
> +        uint32 *value = (uint32*) (targ_bot.decode() | (targ_top.decode() << 12));

<< 16

@@ +527,3 @@
>  }
>  
> +// As far as I can tell, CodeLabels were supposed to be used in swith cables

s/swith/switch/

I think that's what you meant, anyway. I'm not sure what a "switch cable" is.

@@ +789,5 @@
> +Instruction *
> +BOffImm::getDest(Instruction *src)
> +{
> +    // TODO: it is probably worthwhile to verify that src is actually a branch
> +    return &src[((int32)data<<8)>>8];

1: I think you mean to shift right by 6 (since the offset implies two 0 bits at the bottom).

2: I don't like this method of calculating sign extensions as the behaviour is not well defined. It is, however, common practice.

::: js/src/ion/arm/Assembler-arm.h
@@ +1395,5 @@
> +{
> +    uint32 data;
> +  protected:
> +    // This is not for defaulting to always, this is for instructions that
> +    // cannot be made conditional, and have the usualy invalid 4b1111 cond field

s/usualy/usually/

@@ +1422,5 @@
> +    const Instruction & operator=(const Instruction &src) {
> +        data = src.data;
> +        return *this;
> +    }
> +    // Since almost all instructions have condition codes, thie condition

s/thie/this/

@@ +1428,5 @@
> +    void extractCond(Assembler::Condition *c) {
> +        if (data >> 28 != 0xf )
> +            *c = (Assembler::Condition)(data & 0xf0000000);
> +    }
> +    // Cet the next instruction in the instruction stream.

s/Cet/Get/

@@ +1430,5 @@
> +            *c = (Assembler::Condition)(data & 0xf0000000);
> +    }
> +    // Cet the next instruction in the instruction stream.
> +    // This will likely eventually do neat things like ignore
> +    // constant pools and their guards.

Neat indeed, but arguably essential unless we can guarantee that pools won't be dumped in the middle of MOVW/MOVT sequences and suchlike. (We rely on next() in getCF32Target, for example.) If we do guarantee that in all cases where next() might be called, there is no problem here.

@@ +1432,5 @@
> +    // Cet the next instruction in the instruction stream.
> +    // This will likely eventually do neat things like ignore
> +    // constant pools and their guards.
> +    Instruction *next() { return &this[1]; }
> +    // Sometimes, and api wants a uint32 (or a pointer to it) rather than

s/and/an/

@@ +1434,5 @@
> +    // constant pools and their guards.
> +    Instruction *next() { return &this[1]; }
> +    // Sometimes, and api wants a uint32 (or a pointer to it) rather than
> +    // an instruction.  raw() just coerces this into a pointer to a uint32
> +    uint32 *raw() { return (uint32*)this; }

I think that breaks strict aliasing rules. Fixing it might be non-trivial.

::: js/src/methodjit/BaseAssembler.h
@@ +568,5 @@
>          // that make 12 insufficent.  In case 16 is also insufficent, I've bumped
>          // it to 20.
>          ensureSpace(20);
>          int initFlushCount = flushCount();
> +#endif'

s/'//
Comment 8 Marty Rosenberg [:mjrosenb] 2011-12-06 14:57:46 PST
(In reply to Jacob Bramley [:jbramley] from comment #7)
> Comment on attachment 578552 [details] [diff] [review] [diff] [details] [review]
> Tracing take 2
> 
> Review of attachment 578552 [details] [diff] [review] [diff] [details] [review]:
> -----------------------------------------------------------------
> 
> One or two things look broken, and there are a few errors in comments and
> suchlike, but the changes required should be trivial so this has my r+ with
> those changes made.
> 
> General comment: Lots of missing curly brackets.
> 
> ::: js/src/ion/arm/Assembler-arm.cpp
> @@ +123,5 @@
> > +Register
> > +js::ion::toRM(Instruction &i)
> > +{
> > +    return Register::FromCode((i.encode()>>8) & 0xf);
> > +}
> 
> Admittedly, the locations of RM, RD and so on move around from instruction
> to instruction, but I would argue that the names of the functions above are
> misleading:
> 
>   toR is used to extract the RM field from a BX or BLX instruction. (I think
> they're the only branch-to-register instructions that we use.)
> 
>   toRM isn't actually used at the moment, but I think it only works for MUL
> instructions.
> 
>   toRD is correct for ALU instructions (but not multiply).
> 
> It's all very frustrating, but in general, you need to know at least the
> class of the instruction in order to know which register is in which field.
> 
Yeah.  that is why I eventually gave up on the naming suffix, and used "toR" :(

> Could these be member functions of the Inst* derived classes, perhaps?
Yes (after I make derived classes for the ALU ops + mul ops), but I suspect that the code duplication will start to get on my nerves

> 
> @@ +178,5 @@
> > +    return (i.encode() & IsDTRMask) == IsDTR;
> > +}
> > +
> > +InstDTR *
> > +InstDTR::asTHIS(Instruction &i)
> 
> Confusing use of 'THIS', I think, but I can't think of a good alternative.
> 
> @@ +331,5 @@
> > +
> > +InstMovW *
> > +InstMovW::asTHIS(Instruction &i)
> > +{
> > +    return (InstMovW*) (&i);
> 
> This (and the MOVT equivalent) should probably have isTHIS checks, like the
> other asTHIS methods.
> 
> @@ +346,5 @@
> > +    return (i.encode() & IsWTMask) == IsT;
> > +}
> > +
> > +Imm16::Imm16(Instruction &inst)
> > +  : lower(inst.encode() & 0xfff), upper(inst.encode() >> 16), invalid(0xfff) {}
> 
> That constructor ought to be with the class declaration, simply because
> there's another constructor there and that might not be clear when debugging.
>
I  cannot do this because I use inst.encode(), and Instruction in defined after Imm16.
Instead, I moved all of the constructors out of the Imm16 declaration, into the .cpp file.

> It looks like a nested call to getPtr32Target could simplify this somewhat.
Eventually, yes. In the mean time, if getPtr32Target cannot decode the sequence it is given,
it always asserts.  I first need to make it such that it can just return an error, then I'll be able to use it here (never mind the fact that getCF32Target() currently asserts if that case fails).

> You can never B to a register. For IM, I think it can be either BX or BLX.
Yep.  My bad.

> I think you mean << 16.
I actually caught that like an hour after attaching this patch.
> @@ +418,5 @@
> > +        top->checkDest(temp);
> > +        // make sure we're branching to the same register.
> > +        branch->checkDest(temp);
> > +        uint32 *dest = (uint32*) (targ_bot.decode() | (targ_top.decode() << 12));
> > +        return dest;
> 
> You should also assert that there's a branch instruction after the MOVW/MOVT
> pair (and that it targets 'temp').
It looks like that is what we're doing here.
> It seems to me that this works for any 32-bit literal, not just pointers.
> Does that matter?
No, but for now, it is only being used as part of the GC, and the GC only deals with pointers

> > +// As far as I can tell, CodeLabels were supposed to be used in swith cables
> 
> s/swith/switch/
> 
> I think that's what you meant, anyway. I'm not sure what a "switch cable" is.
>
Gibberish! that was quite a typo for "switch tables"
> 1: I think you mean to shift right by 6 (since the offset implies two 0 bits
> at the bottom).
>
Nope, We're indexing into an array of Inst sized objects, so the sign extended offset is implicitly multiplied by 4.  I've added a comment pointing this out.
 
> @@ +1422,5 @@
> > +    const Instruction & operator=(const Instruction &src) {
> > +        data = src.data;
> > +        return *this;
> > +    }
> > +    // Since almost all instructions have condition codes, thie condition
> 
> s/thie/this/
>
s/thie/the/ actually 

> Neat indeed, but arguably essential unless we can guarantee that pools won't
> be dumped in the middle of MOVW/MOVT sequences and suchlike. (We rely on
> next() in getCF32Target, for example.) If we do guarantee that in all cases
> where next() might be called, there is no problem here.
I'll file in a bug for that.  In the mean time, we ASSERT() rather than emitting a guard, so...

> 
> @@ +1432,5 @@
> > +    // Cet the next instruction in the instruction stream.
> > +    // This will likely eventually do neat things like ignore
> > +    // constant pools and their guards.
> > +    Instruction *next() { return &this[1]; }
> > +    // Sometimes, and api wants a uint32 (or a pointer to it) rather than
> 
> s/and/an/
Funny, I don't remember being drunk when I wrote this, but the number of typos seems to disagree with that memory.

> 
> @@ +1434,5 @@
> > +    // constant pools and their guards.
> > +    Instruction *next() { return &this[1]; }
> > +    // Sometimes, and api wants a uint32 (or a pointer to it) rather than
> > +    // an instruction.  raw() just coerces this into a pointer to a uint32
> > +    uint32 *raw() { return (uint32*)this; }
> 
> I think that breaks strict aliasing rules. Fixing it might be non-trivial.
> 
This is true, however, we should be pretty safe, since anyone that gets one of these pointers should never modify it, and anything that they do inspect at that address will need it to be cast back to an Instruction *

Note You need to log in before you can comment on or make changes to this bug.