Closed Bug 449961 Opened 12 years ago Closed 12 years ago

TM: nanojit crash under TraceRecorder::record_JSOP_IFEQ


(Core :: JavaScript Engine, defect, P1)






(Reporter: sayrer, Assigned: brendan)



(Keywords: testcase)


(2 files)

I've been hitting this a fair amount. finally found a case that's easy to reproduce in the shell.
Blocks: landtm
Attached file stack
When I run this, I get a strange bit of pc-to-line-number action:

#11 0x0010821d in TraceRecorder::record_JSOP_IFEQ (this=0x2057b0) at jstracer.cpp:2514
(gdb) p js_PCToLineNumber(cx, cx->fp->script, cx->fp->regs->pc)
$12 = 188

line 188 is:

188                    for (k in value) {
189                        if (typeof k === 'string') {
190                            v = stringify(value[k], whitelist);
191                            if (v) {
192                                a.push(stringify(k) + ':' + v);
193                            }
194                        }
195                    }

And indeed we're having problems because we have a LIR_xf with a constant-false condition:

#8  0x001110ed in nanojit::LirWriter::insGuard (this=0x205890, v=nanojit::LIR_xf, c=0x1001014, x=0x205800) at LIR.h:373
(gdb) p c->imm16()
$14 = 0

which trickles through our LIR pipeline until we hit the ExprFilter, which sees xf-on-const-false and says:

		// need a way to EOT now, since this is trace end.
		return out->insGuard(LIR_x, 0, x);

That passes NULL into the pipeline as a LIns*, and then later on LirBufWriter::ensureReferenceable tries to check if it's a tramp, and that requires looking at the LIns*, and boom:

#0  0x00110cf7 in nanojit::LIns::isop (this=0x0, o=nanojit::LIR_neartramp) at LIR.h:304
#1  0x00137027 in nanojit::LIns::isTramp (this=0x0) at LIR.h:316
#2  0x001316b8 in nanojit::LirBufWriter::ensureReferenceable (this=0x205830, i=0x0, addedDistance=2) at nanojit/LIR.cpp:245

I'm not sure what the root cause of the wrongness is, I confess.
I saw this bug yesterday. Nested trees triggered it in a case. I fixed the cause and it disappeared but I guess there is an underlying problem. This happens when we constant fold conditions and end up with an "always-exit" guard. I will try to reproduce & fix.
Ok, this is an interesting bug. On a trace typeof k === 'string' is a no-op since traces are type-specialized. If you are on trace, k will always be a string at that point. That seems to cause nanojit to die spectacularly. I will try to hack up a simpler test case.
This has to be the for-in code actually. typeof is not supported yet.

import vp=0x808820 name=$this0 type=object flags=0
import vp=0x808890 name=$f.k type=string flags=0
import vp=0x808894 name=$f.q type=int flags=0
import vp=0x808898 name=$f.i type=int flags=0
Abort recording (line 5, pc 23): JSOP_TYPEOF.
Trashing tree info.

for-in is Brendan territory.
It's all my territory, but I share and care ;-).

OS: Mac OS X → All
Priority: -- → P1
Hardware: PC → All
Target Milestone: --- → mozilla1.9.1a2
Assignee: general → brendan
So basically we emit a guard through nanojit that always exits, which nanojit handles in a buggy way. We might add an assert in nanojit, but the underlying issue is that we shouldn't be emitting this guard.
In js_Interpret, the for-in opcode in question, JSOP_FORLOCAL (as do others like it) pushes true so long as it iterates to a new property identifier (stored in the variable), and this true value on the stack is what keeps the subsequent JSOP_IFEQ at the top of the for-in loop from bailing.

We really want a JSOP_IFNE at the bottom, as with all other loops, but that is not relevant here (it's bug 441479).

At the bottom of TraceRecorder::record_JSOP_FORLOCAL:

    // Stack an unboxed true to make JSOP_IFEQ loop.
    stack(0, lir->insImm(1));
    return true;

This is the culprit, but without it or something to push a value on the stack, the subsequent JSOP_IFEQ will be mishandled by the recorder.

But TraceRecorder::record_JSOP_FORLOCAL already guards for-in loop termination conditions. So to fix this bug we need either (1) to avoid the true push and ifop call from JSOP_IFEQ's record method; or (2) to fix nanojit to quash useless guard instructions. I was counting on (2).

(1) could be done by having the recorder advance the interpreter's pc over the JSOP_IFEQ once we know we are iterating.

Forcing the interpreter to skip the IFEQ is hard without more complexity in the recorder/interpreter hookup. Bleah.

Is there no easy nanojit fix in sight?

nanojit is seeing an always-tripped guard, not a never-tripped guard, though.  It seems to do the right thing for never-tripped.

At least, that's my reading from poking around in the crash.  It looks very much like we have an unboxed false on the stack there, from my reading (both the path we took and the inspection of c->imm16() in frame 8 above).  I'm away for the rest of the day, but will poke more later.
I debugged a bit. The object being enumerated is Object.prototype, and it has no enumerable properties. The false value that is on stack must therefore come from the interpreter, and we are entering the recorder at the IFEQ.

Fusing dispatch of the IFEQ (or IFEQX) with the FOR{NAME,LOCAL,ARG,...} before it (as we do for equality and relational ops followed by JSOP_IFNE{,X}) would allow us to avoid the boolean on the stack. I'll give that a try.

A false from the interpreter should result in a guard that looks like this:

LIR_xt (LIR_ld stack-offset)

(Exit if value from the interpreter stack is true, since at recording time we saw false).

Thats not what we get though. We get something thats constant-folded.
(In reply to comment #12)
> I debugged a bit. The object being enumerated is Object.prototype, and it has
> no enumerable properties. The false value that is on stack must therefore come
> from the interpreter, and we are entering the recorder at the IFEQ.

I did not mean "create a new recorder" by "entering the recorder". We seem to be using the same recorder, and its tracker sees the insImm(1) as the generator of the stacked false. Bad times.

Duh. If guarded conditions that terminate the for-in loop are false at recording time we must push a false. Fixed:

Closed: 12 years ago
Resolution: --- → FIXED
Duplicate of this bug: 450035
Keywords: testcase
reproducible with test in bug 450369
/cvsroot/mozilla/js/tests/js1_5/Regress/regress-450369.js,v  <--  regress-450369.js
initial revision: 1.1
Flags: in-testsuite+
Flags: in-litmus-
You need to log in before you can comment on or make changes to this bug.