Last Comment Bug 695219 - IonMonkey: Rewrite most (if not all) of the constant pool logic
: IonMonkey: Rewrite most (if not all) of the constant pool logic
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: JavaScript Engine (show other bugs)
: unspecified
: ARM Linux
: -- normal (vote)
: ---
Assigned To: Marty Rosenberg [:mjrosenb]
:
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-10-17 16:00 PDT by Marty Rosenberg [:mjrosenb]
Modified: 2012-01-30 13:27 PST (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
WIP (64.18 KB, patch)
2011-12-21 02:49 PST, Marty Rosenberg [:mjrosenb]
no flags Details | Diff | Splinter Review
WIP v. 2.0 (80.37 KB, patch)
2011-12-27 00:59 PST, Marty Rosenberg [:mjrosenb]
no flags Details | Diff | Splinter Review
cleaned up and rebased (69.55 KB, patch)
2012-01-06 15:13 PST, Marty Rosenberg [:mjrosenb]
dvander: review+
Details | Diff | Splinter Review

Description Marty Rosenberg [:mjrosenb] 2011-10-17 16:00:54 PDT
AssemblerBufferWithConstantPool is not as featureful as I would like it to be.  It (in my opinion) does not have adequate support for load instructions with varying ranges.  It also seems to lack support for referencing constants with negative offsets, and its handling of deduplication could be improved.  I'm expecting this to be an optimization after I get the rest of IonMonkey working on ARM.
Comment 1 Marty Rosenberg [:mjrosenb] 2011-12-21 02:49:20 PST
Created attachment 583436 [details] [diff] [review]
WIP

Works for most tests.  Definitely missing some functionality w.r.t. backwards references to pools.  Has not been tested on 32 bit values.
Comment 2 Marty Rosenberg [:mjrosenb] 2011-12-27 00:59:12 PST
Created attachment 584399 [details] [diff] [review]
WIP v. 2.0

If all of the failures that I'm seeing are not caused by this patch, I'll ask for review.
Comment 3 Marty Rosenberg [:mjrosenb] 2011-12-28 14:16:24 PST
Comment on attachment 584399 [details] [diff] [review]
WIP v. 2.0

I still need some way of getting the lowering code to use useOrConstant on x* and useRegisterOrConstant on ARM, but that should be decently minor.
Comment 4 David Anderson [:dvander] 2012-01-05 19:38:48 PST
Comment on attachment 584399 [details] [diff] [review]
WIP v. 2.0

Review of attachment 584399 [details] [diff] [review]:
-----------------------------------------------------------------

This patch is a little hard to read, so I'd like a second pass at it with some style fixes. General style rules:
  * Spaces in between operators (i.e. (a + b) not (a+b)
  * Newlines above comments
  * First letter of comments in general should be capitalized (i.e. look like a full sentence)
  * No :TODOs: or :FIXME:s unless there's a cited bug on file
  * No commented or #if 0'd code

::: js/src/assembler/wtf/SegmentedVector.h
@@ +49,5 @@
>  
>          // Only prefix ++ operator supported
>          Iterator& operator++()
>          {
> +            //ASSERT(m_index != SegmentSize);

Why did this assert get commented?

@@ +171,5 @@
>          }
>  
>          void grow(size_t size)
>          {
> +            //ASSERT(size > m_size);

Ditto..

::: js/src/ion/IonLinker.h
@@ +50,5 @@
>  #include "ion/IonMacroAssembler.h"
>  
>  namespace js {
>  namespace ion {
> +static const int codeAlign = 8;

Nit: Newline before and after this new declaration.

@@ +73,5 @@
>  
>          uint8 *result = (uint8 *)comp->execAlloc()->alloc(bytesNeeded, &pool, JSC::METHOD_CODE);
>          if (!result)
>              return fail(cx);
> +        // bump the code up to a nice alignment.

Nit: Newline above this comment.

::: js/src/ion/LIR-Common.h
@@ +224,5 @@
>      LCallGeneric(const LAllocation &func,
>                   uint32 argslot, const LDefinition &token,
>                   const LDefinition &nargsreg,
> +                 const LDefinition &tmpobjreg,
> +                 const LDefinition &wtfreg)

Say what? :)

::: js/src/ion/Lowering.cpp
@@ +199,5 @@
>          MDefinition *right = comp->getOperand(1);
>  
>          if (comp->specialization() == MIRType_Int32) {
>              JSOp op = ReorderComparison(comp->jsop(), &left, &right);
> +            return add(new LCompareIAndBranch(op, useRegister(left), useRegisterOrConstant(right),

Let's find a way to make this change ARM specific.

::: js/src/ion/arm/Assembler-arm.cpp
@@ +365,4 @@
>      ASSERT(m_buffer.sizeOfConstantPool() == 0);
> +#endif
> +    m_buffer.executableCopy(buffer);
> +    //    memcpy(buffer, m_buffer.data(), m_buffer.size());

Remove this comment, as well as the code in the #if 0 if it no longer applies.

@@ +1258,5 @@
>      php.phd.loadType = PoolHintData::poolVDTR;
>      php.phd.destReg = dest.code();
>      php.phd.index = 0;
>      php.phd.ONES = 0xf;
> +    //m_buffer.putIntWithConstantDouble(php.raw, value);

Remove this comment.

::: js/src/ion/arm/Assembler-arm.h
@@ +933,5 @@
>      }
>  
>      Instruction * editSrc (BufferOffset bo) {
> +        return m_buffer.getInst(bo);
> +        //return (Instruction*)(((char*)m_buffer.data()) + bo.getOffset());

Remove instead of comment.

@@ +1003,2 @@
>      {
> +        Pool *tmp = static_cast<Pool*>(malloc(sizeof(Pool)*2));

Any of these mallocs can fail, which is problematic because we tend to do things like:

MacroAssembler masm(cx);
masm.blah(x);
...

So you'll want to make sure that the Assembler has the appearance of working if these allocations fail (and to propagate OOM at link time).

::: js/src/ion/arm/CodeGenerator-arm.cpp
@@ +1127,5 @@
> +        masm.ma_ldr(calleeop.toDTRAddr(), objreg);
> +    } else {
> +        // Make the modifiable object reg a copy of the immutable function reg
> +        masm.as_mov(objreg, calleeop.toOp2());
> +        //masm.ma_mov(calleeop.toOp2(), objreg);

Remove this comment.

::: js/src/ion/arm/MacroAssembler-arm.cpp
@@ +4,5 @@
>   * ***** BEGIN LICENSE BLOCK *****
>   * Version: MPL 1.1/GPL 2.0/LGPL 2.1
>   *
>   * The contents of this file are subject to the Mozilla Public License Version
> + * 1.1 (the "License"); you ma_y not use this file except in compliance with

Accidental change?

::: js/src/ion/arm/Trampoline-arm.cpp
@@ +439,5 @@
>  
>      // Test for an exception
>      masm.as_cmp(r0, Imm8(0));
>      masm.ma_b(&exception, Assembler::Zero);
> +    //masm.as_dtr(IsLoad, 32, PostIndex, pc, DTRAddr(sp, DtrOffImm(4)));

Remove instead of commenting.

@@ +455,5 @@
>      // The return value is how much stack to adjust before returning.
>      masm.as_add(sp, sp, O2Reg(r0));
>      // We're going to be returning by the ion calling convention, which returns
>      // by ??? (for now, I think ldr pc, [sp]!)
> +    //masm.as_dtr(IsLoad, 32, PostIndex, pc, DTRAddr(sp, DtrOffImm(4)));

This whole block is in an #if 0 - just remove it.

::: js/src/ion/shared/Assembler-shared.h
@@ +189,5 @@
>          JS_ASSERT(!used());
>      }
>  };
>  
> +// An abspolute label is like a Label, except it represents an absolute

Accidental change?

::: js/src/ion/shared/IonAssemblerBuffer.h
@@ +46,5 @@
> +namespace js {
> +namespace ion {
> +// I'd like this to actually reside inside of AssemblerBuffer, but that won't be nice
> +// since I'll need to provide a template to use this.
> +class BufferOffset

Newlines above the comment here, and above the namespace declarations.

@@ +57,5 @@
> +    template <class BOffImm>
> +    BOffImm diffB(BufferOffset other) const {
> +        return BOffImm(offset - other.offset-8);
> +    }
> +    template <class BOffImm>

These template functions should have some newline breathing room.

@@ +83,5 @@
> +        JS_ASSERT(next_ == NULL);
> +        next_ = next;
> +    }
> +    uint8 instructions [NodeSize];
> +    unsigned int size() {return nodeSize;}

Style is:
    uint8 instructions [NodeSize];
    unsigned int size() {
        return nodeSize;
    }

It looks like size is an accessor for nodeSize - was it intended to be private & decorated, or is the accessor not needed?

@@ +114,5 @@
> +        }
> +        new (tmp) Node;
> +        return tmp;
> +    }
> +    virtual void ensureSpace(int size) {

We probably don't want this to be virtual. It's called everywhere.

@@ +183,5 @@
> +            }
> +            JS_ASSERT(cur != NULL);
> +        }
> +        // I cannot figure out what invariant this was supposed to be testing
> +        //JS_ASSERT(local_off - bufferSize < tail->size());

Delete commented code.

::: js/src/ion/shared/IonAssemblerBufferWithConstantPools.h
@@ +60,5 @@
> +    uint8 *poolData;
> +    uint32 numEntries;
> +    uint32 buffSize;
> +    LoadOffsets loadOffsets;
> +    // When filling pools where the the size of an immediate is larger

Newline above this comment block.

@@ +124,5 @@
> +        // Inserting an instruction into the stream can
> +        // push any of the pools out of range.
> +        // Similarly, inserting into a pool can push the pool entry out of range
> +        JS_ASSERT(!isBackref);
> +        // We're considered "full" when:

Newline above this comment.

@@ +171,5 @@
> +
> +template <int NodeSize, int InstBaseSize>
> +struct InstNodeTail : public InstNode<NodeSize> {
> +    // dear me, why did I add this variable?
> +    // I have no memory of why it is here.

Hrm :)

@@ +292,5 @@
> +        Chunk *dest = (Chunk*)(((uint32)dest_ + instBufferAlign - 1) & ~(instBufferAlign -1));
> +        int curIndex = 0;
> +        int curInstOffset = 0;
> +        for (int* i = (int*)start; i < (int*)dest; i++) {
> +            *i = 0xe1a00000;

What is this magic number?

@@ +434,5 @@
> +        }
> +        if (p == NULL) {
> +            return INT_MIN;
> +        }
> +#if 0

Remove commented code.
Comment 5 Marty Rosenberg [:mjrosenb] 2012-01-05 23:35:43 PST
(In reply to David Anderson [:dvander] from comment #4)
> Comment on attachment 584399 [details] [diff] [review]
> WIP v. 2.0
> 
> Review of attachment 584399 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This patch is a little hard to read, so I'd like a second pass at it with
> some style fixes. General style rules:
>   * Spaces in between operators (i.e. (a + b) not (a+b)
>   * Newlines above comments
>   * First letter of comments in general should be capitalized (i.e. look
> like a full sentence)
>   * No :TODOs: or :FIXME:s unless there's a cited bug on file
>   * No commented or #if 0'd code
> 
I'll write a comment-liner for myself that will warn me if I did anything silly like that in the code.

> ::: js/src/assembler/wtf/SegmentedVector.h
> @@ +49,5 @@
> >  
> >          // Only prefix ++ operator supported
> >          Iterator& operator++()
> >          {
> > +            //ASSERT(m_index != SegmentSize);
> 
> Why did this assert get commented?
I don't remember, I think they were tripping in a case I thought was valid.
I've uncommented both of them and am testing now.
> ::: js/src/ion/LIR-Common.h
> @@ +224,5 @@
> >      LCallGeneric(const LAllocation &func,
> >                   uint32 argslot, const LDefinition &token,
> >                   const LDefinition &nargsreg,
> > +                 const LDefinition &tmpobjreg,
> > +                 const LDefinition &wtfreg)
> 
> Say what? :)
> 
That was while we were attempting to debug LSRA.  It has been removed.
> ::: js/src/ion/Lowering.cpp
> @@ +199,5 @@
> >          MDefinition *right = comp->getOperand(1);
> >  
> >          if (comp->specialization() == MIRType_Int32) {
> >              JSOp op = ReorderComparison(comp->jsop(), &left, &right);
> > +            return add(new LCompareIAndBranch(op, useRegister(left), useRegisterOrConstant(right),
> 
> Let's find a way to make this change ARM specific.
> 
Already done.  Bug 714949 should have that stuff.  sstangl may ask for some more tweaks to it.

> @@ +1003,2 @@
> >      {
> > +        Pool *tmp = static_cast<Pool*>(malloc(sizeof(Pool)*2));
> 
> Any of these mallocs can fail, which is problematic because we tend to do
> things like:
> 
> MacroAssembler masm(cx);
> masm.blah(x);
> ...
> 
> So you'll want to make sure that the Assembler has the appearance of working
> if these allocations fail (and to propagate OOM at link time).
>
I actually killed a bunch of allocations and wrapped them into the structure.
C++ doesn't support initializing arrays the way I want, but Looks like I can make it work. 
> ::: js/src/ion/arm/MacroAssembler-arm.cpp
> @@ +4,5 @@
> >   * ***** BEGIN LICENSE BLOCK *****
> >   * Version: MPL 1.1/GPL 2.0/LGPL 2.1
> >   *
> >   * The contents of this file are subject to the Mozilla Public License Version
> > + * 1.1 (the "License"); you ma_y not use this file except in compliance with
> 
> Accidental change?
I've done this more times than I care to count.
> @@ +292,5 @@
> > +        Chunk *dest = (Chunk*)(((uint32)dest_ + instBufferAlign - 1) & ~(instBufferAlign -1));
> > +        int curIndex = 0;
> > +        int curInstOffset = 0;
> > +        for (int* i = (int*)start; i < (int*)dest; i++) {
> > +            *i = 0xe1a00000;
> 
> What is this magic number?
> 
*cough*
aside from "no longer there",it is the traditional NOP on ARM.  That code was for aligning the beginning of the instruction stream, and now that we can guarantee the stream will always be aligned, this should never be hit.
Comment 6 Marty Rosenberg [:mjrosenb] 2012-01-06 15:13:41 PST
Created attachment 586582 [details] [diff] [review]
cleaned up and rebased

All of the minor bug fixes have already landed, so they've been removed from this patch.
Comment 7 David Anderson [:dvander] 2012-01-11 19:56:58 PST
Comment on attachment 586582 [details] [diff] [review]
cleaned up and rebased

Review of attachment 586582 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/ion/IonLinker.h
@@ +51,5 @@
>  
>  namespace js {
>  namespace ion {
>  
> +static const int codeAlign = 8;

Nit: CodeAlignment

@@ +76,5 @@
>          if (!result)
>              return fail(cx);
>  
> +        // Bump the code up to a nice alignment.
> +        uint8 *codeStart = (uint8 *) (((uintptr_t)result + sizeof(IonCode*) + (codeAlign - 1)) &

I think this can just be AlignBytes(result, CodeAlignment)

::: js/src/ion/arm/Assembler-arm.cpp
@@ +1789,5 @@
> +    BOffImm dest;
> +    i->extractImm(&dest);
> +    return dest.decode()+8;
> +}
> +void

nit: space separating functions

@@ +1798,5 @@
> +    if (i->is<InstBImm>()) {
> +        new (i) InstBImm(BOffImm(offset-8), cond);
> +    } else {
> +        new (i) InstBLImm(BOffImm(offset-8), cond);
> +    }

nit: no braces needed here

@@ +1801,5 @@
> +        new (i) InstBLImm(BOffImm(offset-8), cond);
> +    }
> +}
> +void
> +Assembler::writePoolHeader(uint8 *start, Pool *p) {

nit: brace on newline

@@ +1806,5 @@
> +    return;
> +}
> +
> +void
> +Assembler::writePoolFooter(uint8 *start, Pool *p) {

ditto

::: js/src/ion/shared/IonAssemblerBuffer.h
@@ +57,5 @@
> +    explicit BufferOffset(int offset_) : offset(offset_) {}
> +    int getOffset() const { return offset; }
> +
> +    template <class BOffImm>
> +    BOffImm diffB(BufferOffset other) const {

What is a BufferOffset, or BOffImm? What does "diffB" mean?

@@ +64,5 @@
> +
> +    template <class BOffImm>
> +    BOffImm diffB(Label *other) const {
> +        JS_ASSERT(other->bound());
> +        return BOffImm(offset - other->offset()-8);

What is this -8 here? I get the feeling this code is ARM specific, should this whole file be under arm/, or just this class?

@@ +75,5 @@
> +    bool assigned() { return offset != INT_MIN; };
> +};
> +
> +template<int NodeSize>
> +struct InstNode {

This needs an explanation, and a better name, like "BufferSlice" or just "Buffer" and ideally should inherit from InlineForwardListNode<Self> to avoid duplicating linked list code.

@@ +94,5 @@
> +    InstNode() : next_(NULL), nodeSize(0) {}
> +    void putBlob(uint32 instSize, uint8* inst) {
> +        if (inst != NULL)
> +            memcpy(&instructions[size()], inst, instSize);
> +        nodeSize+=instSize;

nodeSize += instSize;

@@ +141,5 @@
> +            head = tmp;
> +        return true;
> +    }
> +
> +    void putByte(uint8 value) {

Some renamings to look like other data structures in the engine: appendByte here

@@ +145,5 @@
> +    void putByte(uint8 value) {
> +        putBlob(sizeof(value), (uint8*)&value);
> +    }
> +
> +    void putShort(uint16 value) {

appendUInt16

@@ +149,5 @@
> +    void putShort(uint16 value) {
> +        putBlob(sizeof(value), (uint8*)&value);
> +    }
> +
> +    void putInt(uint32 value) {

appendUInt32

@@ +152,5 @@
> +
> +    void putInt(uint32 value) {
> +        putBlob(sizeof(value), (uint8*)&value);
> +    }
> +    void putBlob(uint32 instSize, uint8 *inst) {

appendRaw(const uint8 *from, uint32 length)

@@ +190,5 @@
> +            JS_ASSERT(cur != NULL);
> +        }
> +        // the offset within this node should not be larger than the node itself.
> +        JS_ASSERT(local_off < cur->size());
> +        return (Inst*)&cur->instructions[local_off];

It looks like the only purpose of the <Inst> typename is for the cast down below. If this file is intended to be arch-generic, I'd advocate just returning uint8 * instead.

::: js/src/ion/shared/IonAssemblerBufferWithConstantPools.h
@@ +83,5 @@
> +          poolData(static_cast<uint8 *>(malloc(8*immSize))), numEntries(0),
> +          buffSize(8), limitingUser(), limitingUsee(INT_MAX)
> +    {
> +    }
> +    Pool() : maxOffset(0xa5a5a5a5), immSize(0xa5a5a5a5), instSize(0xa5a5a5a5), bias(0xa5a5a5a5),

What is this constant? Seems like it deserves to be named.
Comment 8 :Ms2ger (⌚ UTC+1/+2) 2012-01-18 05:03:28 PST
JS_FREE_PATTERN?
Comment 9 Jacob Bramley [:jbramley] 2012-01-18 07:59:09 PST
Comment on attachment 586582 [details] [diff] [review]
cleaned up and rebased

Review of attachment 586582 [details] [diff] [review]:
-----------------------------------------------------------------

I've partially reviewed it. (I skimmed a lot.) However, the principle looks sound. I like the way you've split the pools, and the forward and backward pools are certainly interesting.

::: js/src/ion/arm/Assembler-arm.cpp
@@ +1273,5 @@
> +{
> +    uint32 *load = (uint32*) load_;
> +    PoolHintPun php;
> +    php.raw = *load;
> +    JS_ASSERT(php.phd.ONES == 0xf && php.phd.loadType != PoolHintData::poolBOGUS);

Parentheses would make this easier to read, like ((a == b) && (c == d)).

::: js/src/ion/shared/IonAssemblerBufferWithConstantPools.h
@@ +69,5 @@
> +    // Moreover, If we want to do fancy things like deduplicate pool entries at
> +    // dump time, we may not know the location in a pool (and thus the limiting load)
> +    // until very late.
> +    // Lastly, it may be beneficial to interleave the pools.  I have absolutely no idea
> +    // how that will work, but my suspicions are that it will be difficult.

If you mean interleaving the int32 literals with double literals, I would advise against it, unless you can interleave on a cache-line granularity. Even then, it's probably not worth it.
Comment 10 :Ms2ger (⌚ UTC+1/+2) 2012-01-18 09:00:36 PST
(In reply to Jacob Bramley [:jbramley] from comment #9)
> ::: js/src/ion/arm/Assembler-arm.cpp
> @@ +1273,5 @@
> > +{
> > +    uint32 *load = (uint32*) load_;
> > +    PoolHintPun php;
> > +    php.raw = *load;
> > +    JS_ASSERT(php.phd.ONES == 0xf && php.phd.loadType != PoolHintData::poolBOGUS);
> 
> Parentheses would make this easier to read, like ((a == b) && (c == d)).

And two asserts would probably make debugging easier at some point
Comment 11 Jacob Bramley [:jbramley] 2012-01-19 02:31:07 PST
Comment on attachment 586582 [details] [diff] [review]
cleaned up and rebased

Review of attachment 586582 [details] [diff] [review]:
-----------------------------------------------------------------

Ok, this mechanism is clever, but also very complicated. There is a lot of code here, and so there's an increased risk that it contains bugs. Bugs in literal-pool handling code usually manifest as security holes, such as inadvertently branching into the literal pool and thus allowing remote code execution. I didn't spot any problems. However, as I said, there is a lot of code, and it's really easy to miss off-by-one errors and suchlike.

I think that this code needs a lot of testing before it can go into production. That means specific, stand-alone tests to ensure that pools work at and around their limits. Based on experience with the much simpler schemes used in TM and JM, the standard JS tests will not suffice. Since this is going on a development branch, the tests could come later.

Finally, due to the complexity of this, and the interesting algorithm that is used, it'd be good to write an article or blog post with some diagrams, explaining how the pools are constructed. A big comment in the code (with
ASCII art) would also suffice. I've thought about schemes like this before, but I've never seen one implemented. Nice job!

::: js/src/ion/shared/IonAssemblerBufferWithConstantPools.h
@@ +123,5 @@
> +    }
> +
> +    // poolOffset is the offset of the beginning of the pool were it placed after
> +    // the last instruction currently in the pool (It is ok for the caller to lie,
> +    // and pretend there is an extra instruction in the pool).

I think I'm a bit confused here. What is "the pool" in this case?

@@ +138,5 @@
> +        }
> +        return false;
> +    }
> +
> +    // By the time this function is called, we'd damn well better know that this is going to succeed.

Can we ASSERT that this is the case?

@@ +204,5 @@
> +// happen (in some instances) after the pools have been finalized. Attempting to compute
> +// the correct offsets for branches as the pools are finalized is quite infeasible.
> +// Instead, I write *just* the number of instructions that will be jumped over, then
> +// when we go to copy the instructions into the executable buffer, fix up all of the
> +// offsets to include the pools. Since we have about 32 megabytes worth of offset,

This file isn't ARM-specific, so you might not have 32MB of offset!

@@ +205,5 @@
> +// the correct offsets for branches as the pools are finalized is quite infeasible.
> +// Instead, I write *just* the number of instructions that will be jumped over, then
> +// when we go to copy the instructions into the executable buffer, fix up all of the
> +// offsets to include the pools. Since we have about 32 megabytes worth of offset,
> +// I am not very worried about the pools moving it out of range.

You still need to handle it, for security reasons if nothing else. What if a malicious script finds a way to meet this limit?

@@ +244,5 @@
> +    int numDumps;
> +    struct PoolInfo {
> +        int offset; // the number of instructions before the start of the pool
> +        int size;   // the size of the pool, including padding
> +        int finalPos; // the end uf the buffer, in bytes from the beginning of the buffer

s/uf/of/

@@ +412,5 @@
> +        uint32 poolOffset = nextOffset;
> +        Pool *tmp;
> +        // If we need a guard instruction, reserve space for that.
> +        if (true)
> +            poolOffset += guardSize;

'if (true)' is redundant.
Comment 12 Marty Rosenberg [:mjrosenb] 2012-01-20 18:59:28 PST
(In reply to Jacob Bramley [:jbramley] from comment #11)
> Comment on attachment 586582 [details] [diff] [review]
> cleaned up and rebased
> 
> Review of attachment 586582 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Ok, this mechanism is clever, but also very complicated. There is a lot of
> code here, and so there's an increased risk that it contains bugs. Bugs in
> literal-pool handling code usually manifest as security holes, such as
> inadvertently branching into the literal pool and thus allowing remote code
> execution. I didn't spot any problems. However, as I said, there is a lot of
> code, and it's really easy to miss off-by-one errors and suchlike.
> 
> I think that this code needs a lot of testing before it can go into
> production. That means specific, stand-alone tests to ensure that pools work
> at and around their limits. Based on experience with the much simpler
> schemes used in TM and JM, the standard JS tests will not suffice. Since
> this is going on a development branch, the tests could come later.
> 
> Finally, due to the complexity of this, and the interesting algorithm that
> is used, it'd be good to write an article or blog post with some diagrams,
> explaining how the pools are constructed. A big comment in the code (with
> ASCII art) would also suffice. I've thought about schemes like this before,
> but I've never seen one implemented. Nice job!
> 
> ::: js/src/ion/shared/IonAssemblerBufferWithConstantPools.h
> @@ +123,5 @@
> > +    }
> > +
> > +    // poolOffset is the offset of the beginning of the pool were it placed after
> > +    // the last instruction currently in the pool (It is ok for the caller to lie,
> > +    // and pretend there is an extra instruction in the pool).
> 
> I think I'm a bit confused here. What is "the pool" in this case?
> 
> @@ +138,5 @@
> > +        }
> > +        return false;
> > +    }
> > +
> > +    // By the time this function is called, we'd damn well better know that this is going to succeed.
> 
> Can we ASSERT that this is the case?
> 
> @@ +204,5 @@
> > +// happen (in some instances) after the pools have been finalized. Attempting to compute
> > +// the correct offsets for branches as the pools are finalized is quite infeasible.
> > +// Instead, I write *just* the number of instructions that will be jumped over, then
> > +// when we go to copy the instructions into the executable buffer, fix up all of the
> > +// offsets to include the pools. Since we have about 32 megabytes worth of offset,
> 
> This file isn't ARM-specific, so you might not have 32MB of offset!
> 
True.
> @@ +205,5 @@
> > +// the correct offsets for branches as the pools are finalized is quite infeasible.
> > +// Instead, I write *just* the number of instructions that will be jumped over, then
> > +// when we go to copy the instructions into the executable buffer, fix up all of the
> > +// offsets to include the pools. Since we have about 32 megabytes worth of offset,
> > +// I am not very worried about the pools moving it out of range.
> 
> You still need to handle it, for security reasons if nothing else. What if a
> malicious script finds a way to meet this limit?
At least in debug builds, we'll trip an assertion failure when we go to construct the branch (on arm at least).  I'll file a bug about this, and CC you on it.
> 
> @@ +244,5 @@
> > +    int numDumps;
> > +    struct PoolInfo {
> > +        int offset; // the number of instructions before the start of the pool
> > +        int size;   // the size of the pool, including padding
> > +        int finalPos; // the end uf the buffer, in bytes from the beginning of the buffer
> 
> s/uf/of/
> 
> @@ +412,5 @@
> > +        uint32 poolOffset = nextOffset;
> > +        Pool *tmp;
> > +        // If we need a guard instruction, reserve space for that.
> > +        if (true)
> > +            poolOffset += guardSize;
> 
> 'if (true)' is redundant.
and there to remind me to add a check for a guard instruction.  I figured that case would almost never come up, so it wouldn't hurt to be conservative.
Comment 13 Jacob Bramley [:jbramley] 2012-01-23 01:01:33 PST
(In reply to Marty Rosenberg [:Marty] from comment #12)
> > You still need to handle it, for security reasons if nothing else. What if a
> > malicious script finds a way to meet this limit?
> At least in debug builds, we'll trip an assertion failure when we go to
> construct the branch (on arm at least).  I'll file a bug about this, and CC
> you on it.

Right, but we won't be shipping debug builds, so the assertion will only get hit on our own tests. If the offset is ultimately derived from user input (i.e. JavaScript), then it's very hard to know if the offset range is safe unless we have release-mode checks of some kind, or unless it is possible to prove that the offset range will never be hit. Since the algorithm is not architecture-specific, that could be very difficult.

Note You need to log in before you can comment on or make changes to this bug.