Closed Bug 631106 Opened 9 years ago Closed 9 years ago

JM: simplify allocation of executable memory

Categories

(Core :: JavaScript Engine, defect)

defect
Not set

Tracking

()

RESOLVED FIXED

People

(Reporter: njn, Assigned: njn)

References

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file)

Allocation of executable memory in JM is done with ExecutableAllocator and
ExecutablePool.  Here's how they work.

- ExecutablePool holds multiple contiguous chunks of executable memory
  (pointers to the chunks are kept in m_pools).  The pool as a whole is
  refcounted and all its chunks are freed when the refcount reaches zero.
  The first chunk is allocated when the pool is created;  subsequent chunks
  can be allocated with alloc() if the current chunk isn't big enough to
  satisfy the request.  (Nb: alloc() calls poolAllocate(); the call could be
  inlined).  Not too strange so far.

- But ExecutablePools aren't used directly.  Their use is always mediated
  by ExecutableAllocator, which is layered on top.  And the way
  ExecutableAllocator uses ExecutablePools causes them to never have
  multiple chunks!  (ExecutablePool aids and abets this via its
  available() function, which returns 0 if it has contains more than one
  chunk.)  ExecutableAllocator minimizes fragmentation by keeping track of
  multiple ExecutablePools and allocating from the most appropriate one.  If
  none of the tracked ExecutablePools have enough space (as reported by
  available()) for an allocation, ExecutableAllocator will create a new
  ExecutablePool.

- The usage pattern is that you call allocator.poolForSize() (indirectly,
  eg. via getExecPool/GetExecPool) which gives you a pool that you know
  can satisfy a request of a particular size, then you call
  executableCopy(), which goes through multiple functions but ends up
  calling ExecutableAlloc::alloc() which we know can satisfy the request
  without allocating any more memory (though it can do so as a backup step).
  In other words, the allocation process is split into two steps, and
  conceptually either of which could actually allocate (eg. mmap) some
  memory, but in practice only the first step ever does, the second one just
  gives you a handle to it.

- Oh, but wait, it gets better!  The second step *can* allocate memory,
  resulting in an ExecutablePool with two chunks.  This is because there's
  an off-by-one error in ExecutablePool::alloc(), which means that an
  allocation that fits exactly in the pool's remaining space (and which has
  previously been correctly approved by ExecutableAllocator) gets rejected
  as too big!  So a second chunk *is* allocated.

- I think some of the executableCopy() functions are dead, too.

- Basically, the whole thing -- the allocator itself, plus the layers that
  JM has put in front of it -- is *absurdly* complex.  I suspect it's been
  modified by half a dozen different people at different times, none of
  whom knew exactly how the whole thing worked.  (I plead guilty to being
  one of those people, in bug 616310.) ExecutablePool and
  ExecutableAllocator should be merged into a single class, and allocations
  should occur in a single step.

The only downside is that this will move us further away from JSC's copy of
all this code, but we've already changed it quite a bit so that doesn't
strike me as a problem.
Also, I'm suspicious about the reference counting of pools, it wouldn't surprise me if there were mistakes there.  It could certainly be clearer.  Alternatively, the lifetimes of executable code pieces mostly aren't complex -- eg. PICs are all purged together.  So as long as code pieces with different lifetimes didn't end up in the same pool -- which can lead to fragmentation anyway -- then direct freeing might be simpler and more obviously correct.
Because memory allocation is delicate, esp. when reference counting is involved, I'm breaking this into a few steps.  Bug 632629 is the first step.
Some more notes:
- There are six different executableCopy() functions, but unfortunately I don't think any can be removed/merged.  On x86 they're simple, but on ARM they're more complex due to constant pools.  I don't think they can be merged.

- The executableCopy() calls in finishThisUp() can return NULL.  This currently
isn't checked, it should be.

- ExecutablePool needs to be separate from ExecutableAllocator in order for various places to add/remove references to a pool.  After bug 631045 lands, we'll be able to move to lifetime-based pool management instead of reference counting.

- The handling of OOMs in AssemblerBuffer is weird.  Maybe that can be simplified.
Oh god, I now understand why this is so complex -- because of constant pools on ARM.  For example, the weird two-step allocation process is necessary because on ARM, when copying code from AssemblerBuffer we occasionally have to dump a constant pool, which means that we may need more memory than we originally asked for.  This also explains how an ExecutablePool can end up with more than one chunk.

How annoying.  I'll have to scale this bug's ambition back greatly, and just do a bit of renaming and commenting to make the abominations clearer.
Ah, I worked out a way to clean things up despite the complications of ARM
constant pools.

Changes:

- There were two chains of executableCopy() functions;  this patch renames
  one of those chains executableAllocAndCopy(), which better reflects what
  it does, and also better distinguishes it from the vanilla
  executableCopy() functions.  executableAllocAndCopy() now returns both the
  void* pointer and the ExecutablePool (by reference) from which it came.

- Replaces the two-step allocation process with a sane one-step process:
  ExecutableAllocator::alloc().  This was made possible by delaying the
  allocation until the end of the executableAllocAndCopy() chain, when the
  required size is known, even on ARM with its constant pools.

- Removes some awful variable naming, where ExecutableAllocators were
  sometimes called "pool", and ExecutablePools were sometimes called
  "allocator"(!)

- Changes ExecutablePool::m_pools from a vector to a scalar, just like the
  patch in bug 632629.

- ExecutablePool is now mostly accessed through ExecutableAllocator.
  Reference incrementing/decrementing is the main exception.

- Removes getExecPool() and GetExecPool() by inlining them;  getExecPool()
  was only called once, GetExecPool() was called twice but is very small and
  at one of the callsites its used caused js_ReportOutOfMemory() to be
  called twice on OOM.

  (This leaves BaseCompiler almost empty;  I didn't remove it because I was
  too lazy to change all those |cx| occurrences in sub-classes to |f.cx|.  I
  could do it if people think it's worthwhile.)

- Nb: in LinkerHelper::init(), m_size was recorded before the call to
  executableCopy(), which on ARM can change masm.size() -- this looks like a
  bug.  m_size is now assigned afterwards.
Attachment #512101 - Flags: review?(dvander)
Attachment #512101 - Attachment description: 62168:51cecc00aa12 → patch (against TM 62168:51cecc00aa12)
> - Nb: in LinkerHelper::init(), m_size was recorded before the call to
>   executableCopy(), which on ARM can change masm.size() -- this looks like a
>   bug.  m_size is now assigned afterwards.

Jacob says (via private email) that this is definitely a bug.
(In reply to comment #6)
> 
> Jacob says (via private email) that this is definitely a bug.

And as such, maybe it should be fixed separately in order to make it into 4.0.  It's an easy change, just requires swapping the order of the assignments to m_code and m_size.
> And as such, maybe it should be fixed separately in order to make it into 4.0. 

Bug 634118.
Comment on attachment 512101 [details] [diff] [review]
patch (against TM 62168:51cecc00aa12)

>+        // This alloc is infallible because we poolForSize() just found
>+        // (created, if necessary) a pool that had enough space.

s/we//

>+    JSC::ExecutableAllocator::makeWritable(result, totalSize);
>+    void *ret = masm.executableCopy(result);
>+    if (!ret) {
>         js_ReportOutOfMemory(cx);
>         return Compile_Error;
>     }
>-    JSC::ExecutableAllocator::makeWritable(result, totalSize);
>-    masm.executableCopy(result);
>-    stubcc.masm.executableCopy(result + masm.size());
>+    ret = stubcc.masm.executableCopy(result + masm.size());
>+    JS_ASSERT(ret); // if the first executableCopy() call succeeded, the second should too

No need to use the return values of these executableCopy calls, since they just return the result of memcpy().

r=me with that fixed
Attachment #512101 - Flags: review?(dvander) → review+
It looks like the ARM one checks if the buffer had OOM, but we should check that (or should have checked that) way before because the x86 buffer can OOM too.

Btw,
> Removes some awful variable naming, where ExecutableAllocators were
> sometimes called "pool", and ExecutablePools were sometimes called
> "allocator"(!)

I laughed when I read that. Thanks, that was so confusing.
(In reply to comment #10)
> It looks like the ARM one checks if the buffer had OOM, but we should check
> that (or should have checked that) way before because the x86 buffer can OOM
> too.

I don't understand, can you clarify?

I also found that ~ExecutablePool() needs to check that its allocation isn't NULL before calling systemRelease() on it.
(In reply to comment #9)
> 
> No need to use the return values of these executableCopy calls, since they just
> return the result of memcpy().

(In reply to comment #10)
> It looks like the ARM one checks if the buffer had OOM, but we should check
> that (or should have checked that) way before because the x86 buffer can OOM
> too.

(In reply to comment #11)
> 
> I don't understand, can you clarify?

Oh, I think the sentence in comment 10 refers to the sentence in comment 9.

Both ARMAssembler::executableCopy() and X86Assembler::executableCopy() can return NULL if an OOM happened earlier (as recorded by |m_buffer| and |m_formatter| respectively).  AFAICT an OOM in the buffer/formatter gets propagated along and this is the point where it's checked.

So I think the patch is correct as is.  dvander, do you agree?
It's correct but I'm saying we should make executableCopy infallible because all it does is memcpy(), and hoist the oom check into finishThisUp since I think there is even already one there.
(In reply to comment #13)
> It's correct but I'm saying we should make executableCopy infallible because
> all it does is memcpy(), and hoist the oom check into finishThisUp

On X86 that's true, but not on ARM:

void* ARMAssembler::executableCopy(void * buffer)
{       
    if (m_buffer.oom())
        return NULL;
    
    ASSERT(m_buffer.sizeOfConstantPool() == 0);
    
    memcpy(buffer, m_buffer.data(), m_buffer.size());
    fixUpOffsets(buffer);
    return buffer;
}
After discussion with dvander on IRC, the pushed patch makes executableCopy() infallible and changes it to have no return value.  finishThisUp() checks for OOM earlier on:

http://hg.mozilla.org/tracemonkey/rev/441bc12e94e2
Whiteboard: fixed-in-tracemonkey
http://hg.mozilla.org/mozilla-central/rev/441bc12e94e2
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.