Closed Bug 480244 Opened 15 years ago Closed 15 years ago

TM: assertion IsInt32(*p) in pretty raytracer

Categories

(Core :: JavaScript Engine, defect)

x86
Linux
defect
Not set
normal

Tracking

()

VERIFIED FIXED

People

(Reporter: graydon, Assigned: graydon)

References

()

Details

(Keywords: testcase, verified1.9.1, Whiteboard: fixed-in-tracemonkey)

Attachments

(3 files)

This only traps in debug builds of course; the raytracer happily completes in opt builds. But something's fishy. dvander suggests it has to do with disagreement in the FrameInfo and the exit typemap.

Also reproducible in shell. See the attachment in bug 463478.

#0  JS_Assert (s=0xb7f47cc2 "isInt32(*p)", file=0xb7f44cb0 "jstracer.cpp", ln=1742)
#1  0xb7eff225 in TraceRecorder::import (this=0xaae303e0, base=0xa9a1bbd8, offset=16, p=0xb283d178, t=@0xaae16607, prefix=0xb7f3be17 "stack", index=2, fp= 0xb283d0f0) at jstracer.cpp:1742
#2  0xb7f00d96 in TraceRecorder::import (this=0xaae303e0, treeInfo=0xb1f65160, sp=0xa9a1bbd8, stackSlots=10, ngslots=40, callDepth=1, typeMap=0xaae16600 "") at jstracer.cpp:1834
#3  0xb7f01388 in TraceRecorder (this=0xaae303e0, cx=0xb2119400, _anchor=0xa9a6fc6c, _fragment=0xaae30340, ti=0xb1f65160, stackSlots=10, ngslots=40, typeMap=0xaae16600 "", innermostNestedGuard=0xa9a4f1f8, outerToBlacklist=0x0) at jstracer.cpp:1261
#4  0xb7f014fd in js_StartRecorder (cx=0xb2119400, anchor=0xa9a6fc6c, f=0xaae30340, ti=0xb1f65160, stackSlots=10, ngslots=40, typeMap=0xaae16600 "", expectedInnerExit=0xa9a4f1f8, outer=0x0) at jstracer.cpp:3064
#5  0xb7f016e2 in js_AttemptToExtendTree (cx=0xb2119400, anchor=0xa9a6fc6c, exitedFrom=0xa9a4f1f8, outer=0x0) at jstracer.cpp:3512
#6  0xb7f083e8 in js_MonitorLoopEdge (cx=0xb2119400, inlineCallCount=@0xbfffec2c)
Assignee: general → graydon
Blocks: 463478
Interesting details. Stop me if any of this sounds familiar:

What's happening is not disagreement between any two typemaps, as far as I can tell: the synthesis frameinfo typemap is specifying the same entries to save as the import typemap is to load.

What's going wrong is the number and location of the saved and loaded *slots*. Saved, in particular. 

We're saving a sequence of frames that's ... funny, let's say. Note that on line 4072 or thereabouts, we have an *explicit* loop that's doing FlushNativeStackFrame *repeatedly* on a pseudo-calldepth of 1 and a manually adjusted callstack input, stepping up the callstack buffer, rather than letting FlushNativeStackFrame use FORALL_SLOTS_IN_PENDING_FRAMES on the full intended stack structure, as the latter appears to be written to try to do. This is the "heavy frame synthesis" bit. Then we do something similar-but-different (skipped in this case) for "shallow frames" (which I think are JSInlineFrames), and finally around line 4180 we "top up" the saved frame by calling FlushNativeStackFrame *again* on the innermost exit's calldepth and typemap.

That's where things go wrong. In our case there's only 1 heavy call, no shallow calls, and 1 top-up.

The heavy call to FlushNativeStackFrame writes the sequence:

 callee (0x8267f7c)=object<0x825b5b0:Function>
 this0  (0x8267f80)=object<0x8241000:global>
 vars0  (0x8268018)=double<-45.0455>
 vars1  (0x826801c)=int<4>
 vars2  (0x8268020)=int<3>

then the top-up call to FlushNativeStackFrame writes the sequence:

 callee0 (0x8268024)=object<0x825b508:Function>
 this0   (0x8268028)=object<0x8241000:global>
 vars0   (0x82680b8)=int<9>
 stack0  (0x82680bc)=int<9>
 stack1  (0x82680c0)=int<9>

We then turn around and attempt to do the following typemap-driven import:

import vp=0x8267f7c name=$callee0 type=object flags=0
import vp=0x8267f80 name=$this0 type=object flags=0
import vp=0x8268018 name=$cast.t type=double flags=0
import vp=0x826801c name=$cast.obj type=int flags=0
import vp=0x8268020 name=$cast.o type=int flags=0
import vp=0x8268024 name=$stack0 type=object flags=0
import vp=0x8268028 name=$stack1 type=object flags=0

and we fault on the import of what we think of as "stack2" at 0x826802c. This happens because in the *import* we're in a single FORALL_SLOTS_IN_PENDING_FRAMES call, and by that macro's logic, this is the sequence of vps to read back. It just has nothing to do with the stack of 2 overlaid frames we (I think incorrectly) wrote out (I mean, it's related, and in fact overlaps a bit, but it has a different structure and lacks the second copy of callee/this/var, and such). 

I *think* the write-out loop is wrong, in LeaveTree; in particular I think the top-up can't get away with calling FlushNativeStackFrame, as it thinks its starting from scratch with a heavy frame. But it also seems like the pending-frame loop macro might be able to do the whole frame at once higher up, in the heavy synthesis stage, rather than this 1-step-at-a-time thing? Anyway, I can't wave a wand and figure out how it ought to go. That will take work and/or consultation with the authors. 

This feels pretty serious if it's wrong! I'll CC a few more curious observers who may shed some light. Blocking?

As an aside: is it correct for the inner guard -- the LOOP_EXIT -- to have a calldepth less than the calldepth of its anchor guard? this seems like it can't happen from my reading of the code, so I'm surprised.
Lovely day, much learned.

First: yet another wild goose chase, but mostly having to do with trying to figure out whose responsibility it is to flush or import which parts of each frame.

My conclusion is that we are handling unexpected-numbers-of-arguments wrong. I have a minimal example now that illustrates:

function outer() { 
  var v = 10.234;
  for (var i = 0; i < 0xff; ++i) {
    inner(v);
  }
}

var g = 0;
var h = 0;

function inner() {
  var v = 10;
  for (var k = 0; k < 0xff; ++k) {
    g++;
    if (g & 0xff == 0xff)
      h++;
  }
  return h;
}

outer();
print("g=" + g + " h=" + h);

This traps exactly the same way as the raytracer. It has to do with the fact that the typemap used during import is missing an entry for the unexpected argument in flight between the outer and the inner. The fact that the argument is not flushed back to the stack is somewhat immaterial, according to gal, since the inner didn't make use of it, so we can just leave it as-it-was before the call (though I imagine it could have via arguments[] hackery, perhaps we guard on that?)

Anyway, the failure happens as the importer drives its import loop using FORALL_SLOTS_IN_PENDING_FRAMES, and this tries to perform an import for all the outer function's active stack entries, *including* the unexpected argument; but it's using a typemap consisting of just "what the inner expected in the outer". So once this loop passes over the stack0 and stack1 entries in the outer (callee and this), it tries to apply the "next" typemap code it has (which describes the first int-typed slot in the callee) to the "next" slot the loop is importing from (which is still the unexpected double arg in the caller) and presto: type mismatch.

Fixing this is still a few steps beyond my ability, but I'm getting closer. Any suggestions?
Test nit, probably doesn't matter:

    if (g & 0xff == 0xff)

This is (g & 1) thanks to the precedence botch inherited from C.

Good point about arguments, we don't trace uses of arguments that would access beyond the expected number of args (the number of formal parameters). See TraceRecorder::record_JSOP_{ARGUMENTS,ARGSUB,ARGCNT} for the abort logic.

/be
The patch also botches trace-tests.js
This approach appears to work: adjust the 3 places that pass over the caller/callee boundary to drop "extra" arguments on the floor, not enumerate them at all. A more complete fix would involve, er, possibly overhauling the entire concept of who owns which half of each frame. Sad. Ugly. 

Given that this bug feels like the icy cold hand of death on my shoulder, I would like more than 1 reviewer to read, apply, test, comment, etc. Brendan perhaps? The frame synthesis and enumeration code looks like partly your production as well.

I also felt unsure about a boundary case: at the *bottom* of the enumerated stack, the base cases where we slurp in the callee/this/args that our tree starts with (calldepth = 0), I did not make any change to skip over extra args. I suppose I can try to do so there as well. It's not clear to me what that context ... means, exactly. Probably I should, but I'm not sure. Might explain the breakage below:

It passes the minimal testcase with identical results jit/no-jit, and the raytracer example runs to completion, though gives different results, jit-vs-non-jit. Trace-test passes. So ... advice? Reviews?
Attachment #365109 - Flags: review?(gal)
Comment on attachment 365109 [details] [diff] [review]
A fix, I think. Sorta.

>@@ -1092,7 +1097,7 @@
>         JSStackFrame* fp2 = fp;
>         fp = fp->down;
>         int missing = fp2->fun->nargs - fp2->argc;
>-        if (missing > 0)
>+        if (missing != 0)
>             slots += missing;
>     }

If you really mean if (missing != 0) slots += missing, then just simplify to slots += missing.

But ... isn't missing > 0 correct here? Don't count them if we don't miss anything, but do count them if we have to add missing slots (see 1st change, we do add missing args still).

>     JS_NOT_REACHED("js_NativeStackSlots");
>@@ -1391,7 +1396,7 @@
> #define RETURN(offset) { JS_ASSERT((offset) == slow_offset); return offset; }
> #else
> #define RETURN(offset) { return offset; }
>-#endif
>+#endif 

Accidental whitespace.

>     size_t offset = 0;
>     JSStackFrame* currentFrame = cx->fp;
>     JSStackFrame* entryFrame;
>@@ -1402,32 +1407,52 @@
>     JSStackFrame** fstack = (JSStackFrame **)alloca(frames * sizeof (JSStackFrame *));
>     JSStackFrame** fspstop = &fstack[frames];
>     JSStackFrame** fsp = fspstop-1;
>+
>+#define CHECK_REGION(a,b,sz) {                              \
>+        if (a <= p && p < b)                                \
>+            RETURN(offset + ((p - a) * sizeof(double)));    \
>+        offset += sz * sizeof(double);                      \
>+    }
>+
>+#define CHECK_INTERVAL(lo,hi) {                             \
>+        jsval* a = (lo);                                    \
>+        jsval* b = (hi);                                    \
>+        size_t sz = (hi - lo);                              \
>+        CHECK_REGION(a,b,sz);                               \
>+    }
>+
>+#define CHECK_EXTENT(point,length) {                        \
>+        jsval* a = (point);                                 \
>+        size_t sz = (length);                               \
>+        jsval* b = a + sz;                                  \
>+        CHECK_REGION(a,b,sz);                               \
>+    }
>+
>+

Beautiful. Seriously. Nit: Drop one extra line here.

>     fp = currentFrame;
>     for (;; fp = fp->down) { *fsp-- = fp; if (fp == entryFrame) break; }
>     for (fsp = fstack; fsp < fspstop; ++fsp) {
>         fp = *fsp;
>         if (fp->callee) {
>             if (fsp == fstack) {
>-                if (size_t(p - &fp->argv[-2]) < size_t(2/*callee,this*/ + fp->fun->nargs))
>-                    RETURN(offset + size_t(p - &fp->argv[-2]) * sizeof(double));
>-                offset += (2/*callee,this*/ + fp->fun->nargs) * sizeof(double);
>-            }
>-            if (size_t(p - &fp->slots[0]) < fp->script->nfixed)
>-                RETURN(offset + size_t(p - &fp->slots[0]) * sizeof(double));
>-            offset += fp->script->nfixed * sizeof(double);
>+                CHECK_EXTENT(&fp->argv[-2], (2/*callee,this*/ + fp->fun->nargs));
>+            }
>+            CHECK_EXTENT(&fp->slots[0], fp->script->nfixed);
>         }
>         jsval* spbase = StackBase(fp);
>-        if (size_t(p - spbase) < size_t(fp->regs->sp - spbase))
>-            RETURN(offset + size_t(p - spbase) * sizeof(double));
>-        offset += size_t(fp->regs->sp - spbase) * sizeof(double);
>         if (fsp < fspstop - 1) {
>             JSStackFrame* fp2 = fsp[1];
>+            jsval* sp = fp->regs->sp;
>             int missing = fp2->fun->nargs - fp2->argc;
>+            if (missing < 0) {
>+                sp += missing;
>+            }
>+            CHECK_INTERVAL(spbase, sp);
>             if (missing > 0) {
>-                if (size_t(p - fp->regs->sp) < size_t(missing))
>-                    RETURN(offset + size_t(p - fp->regs->sp) * sizeof(double));
>-                offset += size_t(missing) * sizeof(double);
>-            }
>+                CHECK_EXTENT(sp, missing);
>+            }
>+        } else {
>+            offset += size_t(fp->regs->sp - spbase) * sizeof(double);
>         }
>     }
> 
>@@ -1438,6 +1463,10 @@
>     JS_ASSERT(size_t(p - currentFrame->slots) < currentFrame->script->nslots);
>     offset += size_t(p - currentFrame->regs->sp) * sizeof(double);
>     RETURN(offset);
>+
>+#undef CHECK_EXTENT
>+#undef CHECK_INTERVAL
>+#undef CHECK_REGION
> #undef RETURN
> }
>
The "fix" above, of course, doesn't really work: we're stuck between a rock and a hard place. If we import the extra args, we have to change ownership of frame parts to make the caller own the args; if we *don't* import the extra args, we wind up having to synthesize frames lacking args that the bytecode expects to see there (and pop off). Can't win without major surgery.

In the meantime -- I'll open a subsequent bug tracking said major surgery -- this bug can close if we apply the attached band-aid that just aborts when we hit the awful situation in question. It hurts the raytracer a bit but it at least maintains correctness. We're currently incorrect -- as in "silently corrupting memory" -- so this should land in the short term. The longer term fix may win back some performance, but it's much lower priority.
Attachment #365308 - Flags: review?(gal)
Attachment #365308 - Flags: review?(gal) → review+
Comment on attachment 365308 [details] [diff] [review]
A band-aid that at least covers the wound

With this patch we are faster with JIT than without, for what its worth.
Follow-on bug 481296 filed to provide a "correct" fix.
Whiteboard: fixed-in-tracemonkey
http://hg.mozilla.org/mozilla-central/rev/2b3e45ddc34f
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
Keywords: testcase
note to self: can't reproduce the assert with Graydon's testcase.
http://hg.mozilla.org/tracemonkey/rev/753d620da90d

/cvsroot/mozilla/js/tests/js1_5/Regress/regress-480244.js,v  <--  regress-480244.js
initial revision: 1.1
Flags: in-testsuite+
Attachment #365109 - Flags: review?(gal) → review-
v 1.9.1, 1.9.2 modulo grain of salt. test passes, but never was reproducible.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: