Closed Bug 462021 Opened 13 years ago Closed 13 years ago

TM: Make JSStackFrame reconstitution infallible

Categories

(Core :: JavaScript Engine, defect)

Other Branch
defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: jorendorff, Assigned: jorendorff)

References

Details

(Keywords: verified1.9.1)

Attachments

(4 files, 5 obsolete files)

To make reentering the interpreter safe, we need to make JSStackFrame infallible even when out of memory.  This means each trace tree needs to know the maximum amount of memory it might need on side exit, and we need to set that memory aside before executing a trace.

(This code is already immune to all other failures, I think.)
Suggest we take this opportunity to replace stackPool with a power-of-two shrinkable and growable jsval array and preallocate space in it on trace entry. If that's too much change to fix this bug, we could preallocate in the arena-pool (should work, but may be harder than the single-growable-stack-array idea).

/be
Blocks: deepbail
Making sure Igor is cc'ed -- we have common interest in stack reforms to help both tracing and inline-threaded (or just faster, via reduced load/store overhead) interpreter performance.

/be
Assignee: general → gal
Taking.

(In reply to comment #1)
This sounds tricky: we pass native functions a vp that points into the stackPool and must not move the stack out from under that pointer. Separate bug at least.
Assignee: gal → jorendorff
Status: The only two things that can actually go wrong here are allocating actual interpreter stack space (in js_SynthesizeFrame) and allocating Call objects (js_GetCallObject).

I'm most of the way toward fixing the former by pre-allocating that stack space.  I just need to arrange for js_CallTree to be able to return the right side exit when pre-allocation there fails. (We should always snapshot there, so there should be a side exit. We don't always snapshot now, but as far as Andreas and I can tell, that's a bug.)

Making js_GetCallObject infallible will be more interesting.
Making js_GetCallObject infallible when leaving a tree will be much easier than I thought.  Its properties are lazily resolved, so we don't have to preallocate for dslots.  Just preallocating a fixed number of JSObjects suffices.  I don't even have to root them, since GC can't happen.
Attached patch Part 1 - easy stuff - v1 (obsolete) — Splinter Review
Attachment #352394 - Flags: review?(gal)
Andreas, I don't know if you've messed with the GC before; feel free to redirect r? to igor or brendan.
Attachment #352396 - Flags: review?(gal)
Attachment #352394 - Flags: review?(gal) → review+
Comment on attachment 352395 [details] [diff] [review]
Part 2 - making js_SynthesizeFrame infallible - v1

This might slow down CallTree, which is very frequently used. We should measure it.
Attachment #352395 - Flags: review?(gal) → review+
Attachment #352396 - Flags: review?(gal) → review+
Comment on attachment 352396 [details] [diff] [review]
Part 3 - Making js_GetCallObject infallible - v1

Did you confirm that we actually trace all your test cases?
Comment on attachment 352395 [details] [diff] [review]
Part 2 - making js_SynthesizeFrame infallible - v1

> SideExit* FASTCALL
> js_CallTree(InterpState* state, Fragment* f)
> {
>+    TreeInfo *ti = (TreeInfo *) f->vmprivate;
>+    void *mark = JS_ARENA_MARK(&state->cx->stackPool);
>+    void *reserve;
>+    JS_ARENA_ALLOCATE(reserve, &state->cx->stackPool, ti->maxInterpStackBytes);
>+    if (!reserve)
>+        return NULL;

This shouldn't cost much if the stack chunking is working for us. We'll usually have a fast path return (test and bump) from within the JS_ARENA_ALLOCATE macro. Good to verify that, though.

>+/* Calculate the amount of stack needed to js_SynthesizeFrame the top n frames. */
>+JS_REQUIRES_STACK size_t
>+InterpreterStackBytes(JSContext *cx, unsigned callDepth)
>+{
>+    size_t nslots = 0;
>+    for (JSStackFrame *fp = cx->fp; callDepth--; fp = fp->down) {
>+        nslots += (fp->script->nslots +
>+                   JS_HOWMANY(sizeof(JSInlineFrame), sizeof(jsval)));
>+        size_t missing = fp->fun->nargs - fp->argc;
>+        if (missing > 0)
>+            nslots += missing;
>+    }
>+    return nslots * sizeof(jsval);

Don't we already have two or three functions similar to this? I'm thinking of js_NativeStackSlots in particular (the others do more than count slots).

/be
OK, I'll run benchmarks tomorrow.

(In reply to comment #11)
> Don't we already have two or three functions similar to this? I'm thinking of
> js_NativeStackSlots in particular (the others do more than count slots).

Yes, we do.  I can probably combine this with that.
*1.018x as slow* (2319ms to 2360ms).

The patch should also add calls to js_ReserveObjects/js_FreeReservedObjects in js_CallTree.  I just forgot them.

One workaround is to do this preallocation in js_ExecuteTree for all nested trees.  This would require a fancier data structure: when a tree is patched, the amount of memory needed (ti->maxInterpStackBytes and ti->maxCallObjects) can go up, not just for that tree but for other trees that call it.

Advice?
Carrying forward Andreas's r+.  Unchanged except I added an #ifdef to suppress a warning.
Attachment #352394 - Attachment is obsolete: true
Attachment #352636 - Flags: review+
Overallocate stack instead of putting more code in js_CallTree.
Attachment #352395 - Attachment is obsolete: true
Attachment #352637 - Flags: review?(gal)
Attached patch Part 3 - call objects - v2 (obsolete) — Splinter Review
This stack of patches is just 0.8% slower, and even that may just be noise.
Attachment #352396 - Attachment is obsolete: true
Attachment #352638 - Flags: review?(gal)
Comment on attachment 352637 [details] [diff] [review]
Part 2 - stack - v2

Meant to ask about:

>+        /* This allocation is infallible due to traceRun.stackMark. */

What is traceRun?

/be
Comment on attachment 352638 [details] [diff] [review]
Part 3 - call objects - v2

The recoveryDoublePool is in the trace monitor, which is in the thread for JS_THREADSAFE, else in the runtime. Shouldn't the reservedObjects members go in the TM too, not in the cx?

I like reserved better than recovery for the grand unified name scheme, FWIW.

/be
(In reply to comment #17)
> (From update of attachment 352637 [details] [diff] [review])
> >+        /* This allocation is infallible due to traceRun.stackMark. */
> 
> What is traceRun?

Oh - an anachronism.  That won't be introduced until bug 462027.  No more late-night time travel for me.

I agree that reservedObjects members belong in the TraceMonitor.
Attached patch Part 3 - call objects - v3 (obsolete) — Splinter Review
Moved reservedObject bits to the JSTraceMonitor as suggested.
Attachment #352638 - Attachment is obsolete: true
Attachment #352844 - Flags: review?(brendan)
Attachment #352638 - Flags: review?(gal)
Attachment #352844 - Flags: review?(brendan)
Comment on attachment 352844 [details] [diff] [review]
Part 3 - call objects - v3

Part 3 has a bug.  Once the reserved objects are cut loose, a later GC will crash trying to finalize them.  The fix is to initialize these objects more completely.
Comment on attachment 352845 [details] [diff] [review]
Part 4 - rename recoveryDoublePool to reservedDoublePool

>diff --git a/js/src/jscntxt.h b/js/src/jscntxt.h
>--- a/js/src/jscntxt.h
>+++ b/js/src/jscntxt.h
>@@ -121,18 +121,18 @@ typedef struct JSTraceMonitor {
>      * JS_ReportOutOfMemory when failing due to runtime limits.
>      */
>     JSBool                  onTrace;
>     CLS(nanojit::Fragmento) fragmento;
>     CLS(TraceRecorder)      recorder;
>     uint32                  globalShape;
>     CLS(SlotList)           globalSlots;
>     CLS(TypeMap)            globalTypeMap;
>-    jsval                   *recoveryDoublePool;
>-    jsval                   *recoveryDoublePoolPtr;
>+    jsval                   *reservedDoublePool;
>+    jsval                   *reservedDoublePoolPtr;
> 
>     /* State to make LeaveTrace infallible. See jstracer.cpp. */
>     JSObject            *reservedObjects;  /* linked list via fslots[0] */
>     JSBool              useReservedObjects;

Nit: keep these indented the same as the other members.

Could put that "linked list via fslots[0]" comment into the before-line comment, making it major style. Otherwise the one sentence in that comment ("State to make LeaveTrace infallible.") seems to trail off in a couplet of sentence fragments.

Looks fine otherwise, thanks. r=me with nit removal.

/be
Attachment #352845 - Flags: review?(brendan) → review+
Fixes that bug.  This also changes allocation so that instead of allocating and zeroing 64 objects every time we enter a tree, we just do it once and never release them (around 2.5KB overhead per thread).
Attachment #352844 - Attachment is obsolete: true
Attachment #353310 - Flags: review?(brendan)
Comment on attachment 353310 [details] [diff] [review]
Part 3 - call objects - v3

>+    /*
>+     * reservedObjects is a linked list (via fslots[0]) of preallocated JSObjects.
>+     * The JIT uses this to ensure that leaving a trace tree can't fail.
>+     */
>+    JSObject                *reservedObjects;  /* linked list via fslots[0] */

Could lose the inline comment now.

>+            if (JS_TRACE_MONITOR(cx).useReservedObjects) {
>+#ifdef JS_GC_ZEAL
>+testReservedObjects:
>+#endif
>+                JSTraceMonitor *tm = &JS_TRACE_MONITOR(cx);
>+
>+                thing = (JSGCThing *) tm->reservedObjects;
>+                flagp = GetGCThingFlags(thing);
>+                JS_ASSERT(thing);
>+                tm->reservedObjects = JSVAL_TO_OBJECT(tm->reservedObjects->fslots[0]);
>+                break;
>+            }

#ifdef JS_TRACER around this, right?

> JSBool
>+js_ReserveObjects(JSContext *cx, size_t nobjects)

This too should be #ifdef JS_TRACER afaict.

>+{
>+    /*
>+     * Ensure at least nobjects objects are in the list.  fslots[1] of each
>+     * object on the reservedObjects list is the length of the list from there.
>+     */
>+    JSObject *&head = JS_TRACE_MONITOR(cx).reservedObjects;
>+    size_t i = head ? JSVAL_TO_INT(head->fslots[1]) : 0;
>+    while (i < nobjects) {
>+        JSObject *obj = (JSObject *) js_NewGCThing(cx, GCX_OBJECT, sizeof(JSObject));
>+        if (!obj)
>+            return JS_FALSE;
>+        memset(obj, 0, sizeof(JSObject));
>+        obj->classword = (jsuword) &js_CallClass;  /* necessary for finalization */

A little odd to have something mostly zeroed of call class, esp. with an int-tagged parent slot. Use &js_ObjectClass instead? May avoid trouble in the future.

>+        obj->fslots[0] = OBJECT_TO_JSVAL(head);
>+        i++;
>+        obj->fslots[1] = INT_TO_JSVAL(i);

No problem setting this to an int-tagged jsval given a class that doesn't care.

/be
Attachment #353310 - Flags: review?(brendan) → review+
OK, I have a patch to push that picks brendan's nits.

Review ping, Andreas.
Attachment #352637 - Flags: review?(gal) → review+
Comment on attachment 352637 [details] [diff] [review]
Part 2 - stack - v2

The patch looks ok but I am worried about performance. Do we have some numbers on the perf impact?
SunSpider shows no change (1305.8ms -> 1304.7ms on my MacBook).

Pushed:
http://hg.mozilla.org/tracemonkey/rev/25cd79874d1e
http://hg.mozilla.org/tracemonkey/rev/3a730996372f
http://hg.mozilla.org/tracemonkey/rev/60916da8fc4c
http://hg.mozilla.org/tracemonkey/rev/d9edda70cc0e
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Depends on: 472461
test included in js1_8_1/trace/trace-test.js 
http://hg.mozilla.org/mozilla-central/rev/8f967a7729e2
Flags: in-testsuite+
Flags: in-litmus-
Whiteboard: [needs 1.9.1 landing]
v 1.9.1, 1.9.2
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.