Closed Bug 528116 Opened 15 years ago Closed 15 years ago

TM: "Assertion failure: ti->nStackTypes == NativeStackSlots(cx, 0), at ../jstracer.cpp" with eval, for...each

Categories

(Core :: JavaScript Engine, defect)

x86
macOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED

People

(Reporter: gkw, Assigned: igor)

References

Details

(Keywords: assertion, regression, testcase, Whiteboard: fixed-in-tracemonkey)

Attachments

(1 file, 2 obsolete files)

(function() {
    for each(let a in [(void 0), 0, (void 0)]) {
        eval("for (var b = 0; b < 2; ++b);", a)
  }
})()

asserts js debug shell with -j on TM tip at Assertion failure: ti->nStackTypes == NativeStackSlots(cx, 0), at ../jstracer.cpp

Nominating blocking1.9.2? because bug 520164 is marked blocking1.9.2+.

autoBisect shows this is probably related to bug 520164:

The first bad revision is:
changeset:   34357:b10d4f42ce96
user:        Blake Kaplan
date:        Fri Oct 30 17:31:15 2009 +0100
summary:     Bug 520164 - Protect the interpreter from non-expected scope chains and variable objects. r=igor
Flags: blocking1.9.2?
assigning to igor because I believe mrbkap is away.
Assignee: general → igor
Flags: blocking1.9.2? → blocking1.9.2+
To dmandelin:

while investigating this issue I have found that the eval in the test from the comment 0 goes over the following code, http://hg.mozilla.org/tracemonkey/annotate/c73182124eb7/js/src/jstracer.cpp#l7848 :

        // Even if the property is on the global object, we must guard against
        // the creation of properties that shadow the property in the middle
        // of the scope chain ***if we are in a function***.
        if (cx->fp->argv) {

But with the test case the eval is called like eval("for (var b = 0; b < 2; ++b);", new Number(0)). That will create a scope chain consisting of a global object that is followed by a with object while executing an eval script. With that cx->fp->argv will be null when the recording is started during loop execution. So why the above fragment does the check only inside functions?
Attached patch fix (obsolete) — Splinter Review
The fix makes sure that scopeChainProp calls traverseScopeChain both for function and non-function cases. I will ask for a review after more checks.
Attachment #413424 - Flags: review?(dmandelin)
Comment on attachment 413424 [details] [diff] [review]
fix

>+        } else {
>+            scope_ins = INS_CONSTOBJ(scope);
>+        }
>+        LIns *obj_ins;
>+        CHECK_STATUS_A(traverseScopeChain(scope, scope_ins, obj, obj_ins));

It doesn't seem right to me to use INS_CONSTOBJ(scope). Is the scope chain on future trace runs guaranteed to be the same as at recording time? I would have thought it could be different. And I would think that would mean we might pass all the guards, because we're checking the old object, but in reality should be using a different object where we should exit.

Can you explain the reasoning why this works? Or if not, it's probably OK to just abort in this situation, since I would expect it to be rare in perf-critical code.
(In reply to comment #4)
> It doesn't seem right to me to use INS_CONSTOBJ(scope). Is the scope chain on
> future trace runs guaranteed to be the same as at recording time?

Your are right, the patch should have used TraceRecorder::scopeChain() and pass that as obj_ins to traverseScopeChain.
(In reply to comment #4)
> It doesn't seem right to me to use INS_CONSTOBJ(scope).

The existing code in the function case calls traverseScopeChain on the parent of the function at the record time. I assume that this is correct since at the trace time the parent cannot change. That is, there is a guard somewhere that states that the closure object during trace execution must match the recorded closure. Right?
Attached patch fix v2 (obsolete) — Splinter Review
Here is a fixed patch. I will ask for a review after clearing the issue from the comment 6.
Attachment #413424 - Attachment is obsolete: true
Attachment #413424 - Flags: review?(dmandelin)
> Your are right, the patch should have used TraceRecorder::scopeChain() and 
> pass that as obj_ins to traverseScopeChain.

Be sure to note that TraceRecorder::scopeChain returns LIR that loads fp->scopeChain from the JSStackFrame that applied when the trace was entered. So if the target object is "below" that point, we should abort. I think that's about the best we can do, until we get more of a representation of the scope chain on trace. 

> The existing code in the function case calls traverseScopeChain on the parent
> of the function at the record time. I assume that this is correct since at the
> trace time the parent cannot change. That is, there is a guard somewhere that
> states that the closure object during trace execution must match the recorded
> closure. Right?

Yes, there is a guard that the parent of the callee object is the same. In this case we are also generating LIR to load the parent of the callee, so I think the function case would be correct even if we did not have that guard.
(In reply to comment #8)
> Be sure to note that TraceRecorder::scopeChain returns LIR that loads
> fp->scopeChain from the JSStackFrame that applied when the trace was entered.
> So if the target object is "below" that point, we should abort.

I am not sure what "below" means here.

> Yes, there is a guard that the parent of the callee object is the same. In this case we are also generating LIR to load the parent of the callee, so I think the function case would be correct even if we did not have that guard.

The generated code checks that the shapes along the scope chain matches the record time. But without the explicit guard the shapes along the recorded parent would be irrelevant for another closure.
(In reply to comment #8)
> Be sure to note that TraceRecorder::scopeChain returns LIR that loads
> fp->scopeChain from the JSStackFrame that applied when the trace was entered.

The generated code is for a script, not a function. Since the with statement is not traced, the script scope cannot change later during tracing.
Attached patch fix v3Splinter Review
The new patch is v2 with better comments and local names.
Attachment #413556 - Attachment is obsolete: true
Attachment #413702 - Flags: review?(dmandelin)
(In reply to comment #10)
> (In reply to comment #8)
> > Be sure to note that TraceRecorder::scopeChain returns LIR that loads
> > fp->scopeChain from the JSStackFrame that applied when the trace was entered.
> 
> The generated code is for a script, not a function. Since the with statement is
> not traced, the script scope cannot change later during tracing.

OK. 

+            // Skip any Call object when inside a function. Any reference to a
+            // Call name the compiler resolves statically and we do not need
+            // to match shapes of the Call objects.
+            chainHead = cx->fp->calleeObject()->getParent();
+            head_ins = stobj_get_parent(get(&cx->fp->argv[-2]));
+        } else {
+            head_ins = scopeChain();
+        }

Everything looks fine, except I'm concerned about the second branch of this 'if' now. If head_ins is being set to scopeChain(), then I think chainHead should get set to the fp that was active when we started recording. Otherwise chainHead and head_ins will not be referring to matching values.
(In reply to comment #12)
> If head_ins is being set to scopeChain(), then I think chainHead
> should get set to the fp that was active when we started recording. Otherwise
> chainHead and head_ins will not be referring to matching values.

The code in question will be called during recording of name-related opcodes. At that moment cx->fp->scopeChain is the head of the scope chain and chainHead matches it - see the assert that the patch adds at the beginning of ::scopeChainProp.
Besides the bug in scopeChainProp there are at least two more issues.

The first one is directly related to the assert in question. For an eval script NativeStackSlots may return different values depending whether the eval is direct or indirect. The difference is due to the eval code having fp->argv when it is a direct call from a function while for the nn-direct call fp->argv is always null. Yet CheckEntryTypes assumes that NativeStackSlots must always match ti->nStackTypes for the passed TreeInfo instance leading to an assert. I guess this happens due to improper caching of eval scripts.

The second issue is the meaning of eval(string, null). As it is currently implemented, this does not corresponds directly to eval(code) yet behaving rather similar to that. To fix that I need to know the meaning of eval(string, null) and eval(string, undefined). 

To Brendan: ES5 does not specify the semantic of eval(string, scope), but were there any discussions about eval(string, null or undefined) ?
ES5 specifies eval(s) ≡ eval(s, _) ≡ eval(s, _, _2) ≡ ....

We tried to remove the special second argument to eval a couple releases ago.  We even succeeded in a stable release -- but we didn't realize, and we'd actually reverted to supporting a second argument in trunk-like places, so there's a stable release out there that doesn't support eval(s, _), plus a newer (I think) release that does again.

We should re-remove support for the second argument to eval.  Nobody else supports it, which means there are workarounds for the functionality.  It's causing us pain (again, if I recall).  Let's worry about stuff people actually use, rather that stuff people forget about when we remove it.
(In reply to comment #14)
> Besides the bug in scopeChainProp there are at least two more issues.
> 
> The first one is directly related to the assert in question. For an eval script
> NativeStackSlots may return different values depending whether the eval is
> direct or indirect. The difference is due to the eval code having fp->argv when
> it is a direct call from a function while for the nn-direct call fp->argv is
> always null.

I don't understand "nn-direct" -- do you mean an indirect call of the form

  var evil = eval; ... evil(s, o)

or similar? Or do you mean when eval is called from global code, the calling JSOP_EVAL will find cx->fp->argv to be null? But a null argv does unambiguously indicate global code or eval-from-global code, AFAIK. So I don't know what bug you mean here, but please file it separately if it is not this bug.

> Yet CheckEntryTypes assumes that NativeStackSlots must always
> match ti->nStackTypes for the passed TreeInfo instance leading to an assert. I
> guess this happens due to improper caching of eval scripts.

What improper caching?

> The second issue is the meaning of eval(string, null). As it is currently
> implemented, this does not corresponds directly to eval(code) yet behaving
> rather similar to that. To fix that I need to know the meaning of eval(string,
> null) and eval(string, undefined). 

I agree with Waldo: we should break eval(o, s), but did we screw up (again) and wait too long to do so in 3.6?

> To Brendan: ES5 does not specify the semantic of eval(string, scope), but were
> there any discussions about eval(string, null or undefined) ?

No, no discussions. The committee does not want to extend eval itself, mainly for fear of breaking some web use-case. There's awareness and loathing of Mozilla's past extensions. One proposed course adds methods of eval, namely eval.hermetic(s, o) and eval.spawn(o), very much like modes of our shell evalcx function.

/be
(In reply to comment #16)
> I don't understand "nn-direct" -- do you mean an indirect call of the form

To implement eval(obj, scope) obj_eval uses js_ValueToObject. That returns null for null or undefined code. This implies that the code treats eval(obj, null) almost as eval(obj). "Almost" here means that at least in one place obj_eval checks for argc <= 2 to differentiate the direct call from the rest of cases. 

> What improper caching?

obj_eval contains:

    * Source matches, qualify by comparing scopeobj to the
    * COMPILE_N_GO-memoized parent of the first literal
    * function or regexp object if any. If none, then this
    * script has no compiled-in dependencies on the prior
    * eval's scopeobj.

But that is not the case due to different staticLevel for direct eval and other forms of eval.

> I agree with Waldo: we should break eval(o, s), but did we screw up (again) and
> wait too long to do so in 3.6?

3.6 is at the same beta stage now when the previous screw up with indirect eval for 3.5 was discovered. Consistency dictates that we should not change the code now, clarify the semantics of eval(source, not_an_object) and fix the eval accordingly. But that would imply that for 3.7 we would have yet another meaning of eval. IMO this is worse then fixing the eval once and for all.
Comment on attachment 413702 [details] [diff] [review]
fix v3

Sorry, I tried to r+ this the other day, but for some reason it did not go through. I must have collided and not noticed. But clearly the assert means the last problem I was concerned about can't happen.
Attachment #413702 - Flags: review?(dmandelin) → review+
Igor, please file eval bug(s) as needed, pronto. Cc: mrbkap, jwalden+bmo and me since we love eval pain.

/be
http://hg.mozilla.org/tracemonkey/rev/52969582b0bd

no test?
Whiteboard: fixed-in-tracemonkey
Yeah, need a trace-test addition -- super-easy to write.

/be
http://hg.mozilla.org/mozilla-central/rev/52969582b0bd
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → FIXED
the bugs that caused this didn't land on the branch, and the test cases don't assert or crash there. I do have a branch patch for this, since it did translate. speak up if we actually want it.
Flags: blocking1.9.2+ → blocking1.9.2-
A testcase for this bug was automatically identified at js/src/jit-test/tests/basic/bug528116.js.
Flags: in-testsuite+
You need to log in before you can comment on or make changes to this bug.