Closed Bug 757787 Opened 12 years ago Closed 12 years ago

Assertion failure: slot < jp->fun->script()->bindings.count(), at jsopcode.cpp:1744

Categories

(Core :: JavaScript Engine, defect)

Other Branch
x86_64
Linux
defect
Not set
major

Tracking

()

VERIFIED FIXED
Tracking Status
firefox-esr10 --- wontfix

People

(Reporter: decoder, Assigned: djvj)

References

Details

(Keywords: assertion, testcase, Whiteboard: js-triage-needed [jsbugmon:update,ignore])

Attachments

(1 file, 2 obsolete files)

The following testcase asserts on ionmonkey revision d5545e6d927b (run with --ion -n -m --ion-eager):


function reportCompare (expected, actual, description) {
  var testcase = new TestCase("unknown-test-name", description, expected, actual);
}
var lfcode = new Array();
lfcode.push("( function() { actual = (this === x) } ? TestCase = this : false) ();");
lfcode.push("\
test();\
function f(x) {}\
function test() {\
  reportCompare (\"PASSED\", f(true), \"Unqualified\");\
}\
");
lfcode.push("\
var originalEval = eval;\
var eval = originalEval.bind( test(), g, leftContext * RegExp(y), call) ;\
");
while (true) {
	var file = lfcode.shift(); if (file == undefined) { break; }
        loadFile(file);
}
function loadFile(lfVarx) {
	try {
			evaluate(lfVarx);
	} catch (lfVare) {	}
}
Whiteboard: [jsbugmon:update] → [jsbugmon:update] js-triage-needed
Assignee: general → kvijayan
JSBugMon: The testcase found in this bug no longer reproduces (tried revision c05b873dad48).
Whiteboard: [jsbugmon:update] js-triage-needed → js-triage-needed [jsbugmon:update,ignore]
Still repros for me on c05b873dad48.  Debug build on OSX 64-bit with '--ion-eager'.
Attached patch Refactoring and bugfix (obsolete) — Splinter Review
There are several issues here:

1. The ASSERT being triggered in GetArgOrVarAtom is written wrong.

  LOCAL_ASSERT_RV(slot < jp->fun->script()->bindings.count(), NULL);

  should instead be:

  LOCAL_ASSERT_RV(slot < jp->script->bindings.count(), NULL);

  as the script and the function don't have to correspond.  In particular when the code-path that reaches the Decompile function goes through DecompileCode, which explicitly sets a different script before calling Decompile.

2. The above is not the real bug.  The bug is that localNames in JSPrinter is not initialized properly in DecompileCode.  DecompileCode sets the script for the JSPrinter it passes into Decompile, but forgets to initialize the localNames array properly.


Solution:

There were two places in the code that initialized localNames, and this patch would have added another location where this would happen, so initialization of localNames has been refactored into SetPrinterLocalNames, and the two existing uses changed to call that instead of directly initializing localNames.

The bugfix itself is in DecompileCode.  The code is changed to save the existing localNames and initialize them with new ones corresponding to the new script being set, before calling Decompile(...).  After the call to Decompile, the old localNames is restored.

I expect that the LifoAllocScope context in DecompileCode takes care of cleaning up the allocation appropriately when returning.
Attachment #626905 - Flags: review?(jorendorff)
Comment on attachment 626905 [details] [diff] [review]
Refactoring and bugfix

Review of attachment 626905 [details] [diff] [review]:
-----------------------------------------------------------------

Clearing review. Something isn't right about this; the assumption that cx->new_ interacts with LifoAllocScope is surely wrong. I chatted with djvj about it on IRC and he agreed to look into it.

I think RAII-ifying this would clarify what's going on and likely fix some leaks.

::: js/src/jsopcode.cpp
@@ +1067,5 @@
>      }
>  };
>  
> +static bool
> +SetPrinterLocalNames(JSContext *cx, JSScript *script, JSPrinter *jp) {

Style nit: function-body-opening brace on its own line.
Attachment #626905 - Flags: review?(jorendorff)
Attached patch Second try (obsolete) — Splinter Review
So Jason is right about the LifoAllocScope not clearing the allocated memory.

This reveals a leak in the decompiler (specifically in handling JSOP_LAMBDA), as well as the fix I made to DecompileCode.

However, I don't think we can RAII-ify JSPrinter without some more major changes to the code.  A single JSPrinter gets allocated, and re-used through multiple recursive calls of Decompile.  When it's passed down to the next recursive call, the caller saves the JSPrinter's fun, script, and localNames on the C stack, performs the call, and then restores them.

The upshot is that JSPrinter's current design is more like a box that gets passed around than a fully represented value or object.  It doesn't actually have access to all the localName allocations that are used with it through the course of one or more recursive calls, and modifying the decompiler so that it does is nontrivial.  This prevents a straightforward modification of JSPrinter to an RAII model.

I'm modifying the code to fix the memory leak in a more straightforward way (remembering to free the memory before returning), but leaving the larger-scale changes.

I am not convinced this effort would be worth it - especially given the decompiler's uncertain future.
Attachment #626905 - Attachment is obsolete: true
Attachment #626972 - Flags: review?(jorendorff)
This code makes my skin crawl.

I am confused and I need to finish this review later.
Given that this is a bug in the decompiler, is this actually an IonMonkey bug?
No.  I don't think this is an Ion bug.

There's something about --ion-eager that triggers it, which originally I was suspecting might be a bug, but I don't think that's the case.
Status: NEW → ASSIGNED
Summary: IonMonkey: Assertion failure: slot < jp->fun->script()->bindings.count(), at jsopcode.cpp:1744 → Assertion failure: slot < jp->fun->script()->bindings.count(), at jsopcode.cpp:1744
Comment on attachment 626972 [details] [diff] [review]
Second try

Review of attachment 626972 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. r=me.

::: js/src/jsopcode.cpp
@@ +1751,5 @@
>  static JSAtom *
>  GetArgOrVarAtom(JSPrinter *jp, unsigned slot)
>  {
>      LOCAL_ASSERT_RV(jp->fun, NULL);
> +    LOCAL_ASSERT_RV(slot < jp->script->bindings.count(), NULL);

I added this assertion:
    LOCAL_ASSERT_RV(jp->script == jp->fun->script(), NULL);
and the test suite ran fine.

Basically, the opcodes that trigger a call to GetArgOrVarAtom are LOCAL/ARG/ALIASEDVAR opcodes, and those only occur in function code, and never in direct-eval-in-function code. So jp->fun is never null, and jp->fun and jp->script should absolutely always agree here.

(However you can't assert that jp->script->function() == jp->fun, because jp->fun can be a script-visible "cloned" Function object, whereas jp->script->function() is always the bytecode-internal compiler-generated Function.)

Having said all that, your version of the assertion makes more sense than the original. If you want to add my assertion too, or a comment, go for it. At least the next person suffering in this code will see that there's an invariant.

@@ +4700,5 @@
>                      ok = Decompile(&ss2, inner->main(), inner->length - inner->mainOffset)
>                           != NULL;
>                      jp->script = outer;
>                      jp->fun = outerfun;
> +                    Foreground::delete_(jp->localNames);

It bothers me a little that valgrind didn't complain about the leak here. Weird.
Attachment #626972 - Flags: review?(jorendorff) → review+
Unfortunately, that assert you added triggers when running the test case for this bug with --ion-eager.  There may be an Ion bug here after all.
Status: ASSIGNED → NEW
Status: NEW → ASSIGNED
In m-c, that test case tries to call the global object
which results in a call to js_ReportIsNotFunction
which ends up calling js_DecompileValueGenerator
which ends up in the do_fallback case, and calls js_ValueToSource
which, because it's the global object, ends up toSource-ing a few functions
...and that's how we end up in GetArgOrVarAtom.

It seems like it must be an Ion bug.
Attached patch Third trySplinter Review
The issue with --ion-eager was that when js_DecompileValueGenerator was getting called from an Ion context, the Function being pulled from the stack frame and the Script being retreived from the stack frame did not match.  The additional changes from the previous change are:

1. Change js_DecompileValueGenerator to use ScriptFrameIter to get the frame pointer, script, and pc, as opposed to getting it directly from the context.

2. Use the script's function as the function to pass to DecompileExpression, as opposed to the frame pointer's function.  In an Ion context those two do not have to correspond.

Tested on OSX 64-bit, against jstests using both --no-ion and --ion-eager.  No regressions introduced.
Attachment #626972 - Attachment is obsolete: true
Attachment #627330 - Flags: review?(jorendorff)
Comment on attachment 627330 [details] [diff] [review]
Third try

Review of attachment 627330 [details] [diff] [review]:
-----------------------------------------------------------------

It would've been nice to detect the problem and assert sooner, closer to the bug (perhaps in js_GetTopStackFrame or fp->maybeFun(), if it's wrong to call those while Ion code is on the stack).

I asked dvander to take a look at the last two hunks here. I don't understand why the old code is bad or why the new code is better, so maybe i'm not a great reviewer for this.

::: js/src/jsopcode.cpp
@@ +5644,5 @@
>      goto do_fallback;
>  #endif
>  
> +    ScriptFrameIter frameIter(cx);
> +    if (frameIter.done())

I think you might need to add "|| frameIter.isScript()" here. It would matter if the topmost stack frame is a call to a JSNative. Alternatively you might insert a little loop to skip over such stack frames to the topmost script frame.
Attachment #627330 - Flags: review?(jorendorff)
Attachment #627330 - Flags: review?(dvander)
Attachment #627330 - Flags: review+
(In reply to Jason Orendorff [:jorendorff] from comment #13)
> ::: js/src/jsopcode.cpp
> @@ +5644,5 @@
> >      goto do_fallback;
> >  #endif
> >  
> > +    ScriptFrameIter frameIter(cx);
> > +    if (frameIter.done())
> 
> I think you might need to add "|| frameIter.isScript()" here. It would
> matter if the topmost stack frame is a call to a JSNative. Alternatively you
> might insert a little loop to skip over such stack frames to the topmost
> script frame.

ScriptFrameIter are skipping all non isScript frames, so frameIter.isScript() should always be true.  The native won't show up under ScriptFrameIter, if you want to see native function calls, then you should use StackIter and ensure that isScript() is used when needed.
Comment on attachment 627330 [details] [diff] [review]
Third try

Review of attachment 627330 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsopcode.cpp
@@ +5649,3 @@
>          goto do_fallback;
>  
> +    fp = frameIter.fp();

With IonMonkey, we should avoid use of fp() since we get the entry frame which is only useful for the runningInIon flag.

You should remove uses of fp() if this can be an Ion frame.  You can check for this by using
frameIter.isIon(), but the best is to replace fp by frameIter.
(In reply to Nicolas B. Pierron [:pierron] from comment #14)
> ScriptFrameIter are skipping all non isScript frames, so

Oh, of course. I misread that line.

It seems to me, now, that this assertion is a symptom of the underlying bug that js_GetTopStackFrame is being used unsafely in Ion, and the main reason for that is that it doesn't assert.

<pierron> jorendorff: js_GetTopStackFrame can use StackIter and assert if the first frame is an Ion frame.
<pierron> jorendorff: and every place where this assert is raised should be replaced by StackITer / ScriptFrameIter

I agree with that. Can we fix it that way?
I'll make that into a new bug titled properly, and make this one block it.  I'll throw the ASSERT into js_GetTopStackFrame as a patch attached to that bug.

Then I'll close this bug with the posted patch, modulo review comments above.

Then we can attach any other cases we run into over time that triggers the assert, and track those via the meta bug.

Does that sound reasonable?
Sounds good. Does this bug still need to be s-g?
Comment on attachment 627330 [details] [diff] [review]
Third try

Review of attachment 627330 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jsopcode.cpp
@@ +5649,3 @@
>          goto do_fallback;
>  
> +    fp = frameIter.fp();

It looks like we can read fp in the (spindex != JSDVG_SEARCH_STACK) case, around line 5661, so if it's an Ion frame we should goto do_fallback to ensure we don't scan an unrelated frame.
Attachment #627330 - Flags: review?(dvander) → review+
I'd keep the bug as secure.  It's Ion specific but it's definitely an exploitable security issue.
Blocks: 759107
Comitted: https://hg.mozilla.org/projects/ionmonkey/rev/b9febded6011

Waiting for tbpl before closing bug.
The last patch introduce a build error with --enable-more-deterministic.  Fixing that:

https://hg.mozilla.org/projects/ionmonkey/rev/629d1e6251b9
Yeah.  Closing.
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
JSBugMon: This bug has been automatically verified fixed.
Status: RESOLVED → VERIFIED
No longer blocks: 759107
Blocks: 759107
I'm confused by comment 20. are you saying we should fix this on shipping branches (ESR? Beta?) because the patch is not in Ion code?
Sorry I got behind on my bugmail.

The patch is not in Ion code but the issue that it addresses should only show up in Ion code.  It should be fine for inclusion in shipping but it's not necessary from a security perspective.

I don't know what standard procedure would be for things like that.  The refactoring is nice, I guess, but it's more of a hygiene thing than a security thing.
Group: core-security
You need to log in before you can comment on or make changes to this bug.