Closed Bug 995704 Opened 10 years ago Closed 10 years ago

Crash [@ EnterIon] or [@ js::jit::IonCannon] or [@ js::RunScript]

Categories

(Core :: JavaScript Engine: JIT, defect, P1)

x86
Windows 7
defect

Tracking

()

RESOLVED FIXED
mozilla32
Tracking Status
firefox28 --- wontfix
firefox29 --- wontfix
firefox30 --- wontfix
firefox31 + wontfix
firefox32 --- fixed
firefox-esr24 --- unaffected
firefox-esr31 --- wontfix
b2g-v1.2 --- unaffected
b2g-v1.3 --- wontfix
b2g-v1.3T --- wontfix
b2g-v1.4 --- wontfix
b2g-v2.0 --- fixed

People

(Reporter: gkw, Assigned: djvj)

References

Details

(4 keywords, Whiteboard: [adv-main32+])

Crash Data

Attachments

(6 files, 1 obsolete file)

function g(f, inputs) {
    for (j = 0; j < 49; ++j) {
        f(inputs[j], inputs[0])
    }
}
__proto__ = Iterator(true)
m0 = (function(x, y) {
    return (Math.min((((Number.E | 0) * (y | 0) && x()) & Math.round((y | 0) ? Math.trunc((y) = 0) : Math.pow(Math.round))), (Math.tan(((Math.imul()) + ((Math.imul(Math.round(Math.round(x > 0)), Math.cbrt(Number.I))) < (y | 0)))))))
})
m3 = (function(x, y) {
    return m1((((Number.E, m0()) * Math.tan) | Math.round(m1(Math.exp, m0(Math.tan(y))))), m2(x))
})
m1 = (function(x, y) {
    return Math.round(Math.round(Math.min((Math.imul() > (m0(y | 0, Math.round(m0())))), m0())))
})
m4 = (function(x, y) {
    return Math.min((y > 0, (m2(Math.pow(Math.sin)))), (m1(m3(), Math.round(m3(m3(x))))))
});
m2 = (function(x, y) {
    return (Math.round(m1(false, Math.round())) ^ (m0((m1(m0())), m0(Math.log < y)) ^ Math.min(Math.exp(Math.round(undefined > 0)))))
})
try {
    (function() {
        m5 = (function(x, y) {
            m4(Math.round)
        })
        g(m5, [, 0])
    })()
} catch (e) {}

crashes js debug and opt shell on m-c changeset ebdf2740dc3e with --ion-eager --ion-parallel-compile=off --ion-range-analysis=off --ion-regalloc=backtracking at EnterIon with js::jit::IonCannon and js::RunScript on the stack.

My configure flags are:

MAKE=mozmake AR=ar sh c:\\\\Users\\\\mozillaadmin\\\\trees\\\\mozilla-central\\\\js\\\\src\\\\configure --enable-optimize --disable-debug --enable-profiling --enable-gczeal --enable-debug-symbols --disable-tests --enable-more-deterministic --enable-threadsafe <other NSPR options>

Run this testcase multiple times (up to 50/100 times) to get it to reproduce. An upcoming unreduced testcase is a little more reliable (crashes up to once every 5 times).

Setting s-s and sec-critical because a barely reproducible JIT crash on Windows sounds really bad, pending further analysis.

Will try bisecting this.
Note the different runtime flags vs the reduced testcase.
Due to build failures, I cannot shrink the regression window anymore. Jan, is there anything that stands out of the window that might look like a regressor?
Flags: needinfo?(jdemooij)
Based on the regression window, I'm conservatively marking 28 as affected for now. (not that it might be backported to 28 at this point)
Jan: can you take a look at this sec-crit crash?
Priority: -- → P1
I was able to repro this. It's crashing at "push edx", this means it's either the Windows-equivalent of bug 909094 (let's hope not) or another problem with the stack overflow check. Will investigate more.
Flags: needinfo?(jdemooij)
We have a pretty big stack frame with the backtracking allocator. In CodeGeneratorX86Shared::generatePrologue, frameSize() is 19784 bytes, with LSRA it's 1600.

On Windows, we have to touch the stack incrementally (write to the guard page), or we'll crash. See also the #ifdef XP_WIN code in Trampoline-x86.cpp for interpreter -> Baseline OSR.

We could do that here too, but this is the first time we know it affects Ion code too. Dan, maybe backtracking doesn't reuse stack slots somehow? Even on other platforms, 19K is large, compared to 2K with LSRA.

(This is with the 2670unreduced.js attachment and the shell flags in the "opt stack for unreduced testcase" attachment. The attachment is private; if you can't see it we can send it to you.)
Flags: needinfo?(sunfish)
It appears the backtracking allocator does not reuse stack slots. I filed bug 999538, with a hand-written testcase, to track this.

Regardless though, even if the register allocator is taught to reuse stack slots, malicious code could be written in such a way that stack slots cannot be shared, and thus could create arbitrarily large stack frames, so we have to handle this anyway.
Flags: needinfo?(sunfish)
(In reply to Dan Gohman [:sunfish] from comment #10)
> It appears the backtracking allocator does not reuse stack slots. I filed
> bug 999538, with a hand-written testcase, to track this.

Thanks!

> Regardless though, even if the register allocator is taught to reuse stack
> slots, malicious code could be written in such a way that stack slots cannot
> be shared, and thus could create arbitrarily large stack frames, so we have
> to handle this anyway.

True. I think for big frames the prologue should just emit a write to every page, that shouldn't affect most scripts or it should be only a small number of instructions. It'd be great if we could fix bug 999538 for FF31 though, to minimize the perf impact and also to avoid recursion problems on other platforms.
Flags: needinfo?(jdemooij)
Kannan maybe you can take this since it's Windows-only and similar to that Baseline-OSR bug you fixed a while ago?
Flags: needinfo?(jdemooij) → needinfo?(kvijayan)
Assignee: nobody → kvijayan
Flags: needinfo?(kvijayan)
Haven't tested yet, but based on the problem description this is a tentative fix.

For large frameSize() values, we just walk down the stack by 1k chunks, touching it as we go.
Comment on attachment 8411308 [details] [diff] [review]
fix-bug-995704.patch

Hey Gary,

Can you tell me if this patch fixes the bug?  I haven't gotten my windows setup fully online yet.  VS2012 is installed but I haven't set up a moz build environment yet.
Attachment #8411308 - Flags: feedback?(gary)
Comment on attachment 8411308 [details] [diff] [review]
fix-bug-995704.patch

I'm fairly sure this fixes the issue (which was intermittent), and I re-ran a couple of hundred times to be sure, on an opt shell.
Attachment #8411308 - Flags: feedback?(gary) → feedback+
Comment on attachment 8411308 [details] [diff] [review]
fix-bug-995704.patch

Review of attachment 8411308 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/jit/CodeGenerator.cpp
@@ +2696,5 @@
> +            sizeLeft -= 1024;
> +        }
> +        if (sizeLeft > 0) {
> +            masm.reserveStack(sizeLeft);
> +            masm.store32(Imm32(0), Address(StackPointer, 0));

This last store looks unnecessary, since the remainder is less than 1024 at that point. Assuming it is unnecessary, all the added code here can be simplified, since it no longer needs the leading "if (frameSize() < 1024)" special case.
Attachment #8411308 - Flags: feedback+
Group: javascript-core-security
Thanks for the feedback Dan.  Just fixed that up.

On topic:

I'm trying to test this patch but I can't reproduce the issue.

I'm building on Win7 with VS2012, on the revision specified by the first comment, and using the same options (except I removed the --disable-tests because it was causing a build failure).

Ran it about 20 times or so on the shell (with correct command-line arguments) to no avail.  I'm reasonably sure this patch fixes the issue, but still trying to figure out a way to confirm it.
Correction: I'm building on Windows 8, not Win7.
Attachment #8414669 - Flags: review?(sunfish)
Comment on attachment 8414669 [details] [diff] [review]
fix-large-frame-windows-bug.patch

Review of attachment 8414669 [details] [diff] [review]:
-----------------------------------------------------------------

I'm mildly curious about where the 1k size comes from. I assume the smallest page size of any machine that runs Windows is 4k.

::: js/src/jit/CodeGenerator.cpp
@@ +2702,5 @@
> +    while (frameSizeLeft > 1024) {
> +        masm.reserveStack(1024);
> +        masm.store32(Imm32(0), Address(StackPointer, 0));
> +        frameSizeLeft -= 1024;
> +    }

This doesn't handle the remainder after handling the 1024-byte ranges.
Attachment #8414669 - Flags: review?(sunfish) → review-
Learns me to make "small fixes" without re-running tests.  Jit-tests would have found this immediately.  That's embarassing.  Good catch.
Attachment #8414669 - Attachment is obsolete: true
Attachment #8414723 - Flags: review?(sunfish)
Comment on attachment 8414723 [details] [diff] [review]
fix-large-frame-windows-bug.patch

Review of attachment 8414723 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM, provided 1k is the right size.
Attachment #8414723 - Flags: review?(sunfish) → review+
I think 4k is the required size, but given that 1k frames should only occur in pathological cases it seems worth it to give us some headroom.
https://hg.mozilla.org/mozilla-central/rev/289e653a7061
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla32
I can't verify this fix as I can't reproduce the original issue.  Can someone confirm that this is fixed?
Depends on: 1005324
(In reply to Kannan Vijayan [:djvj] from comment #26)
> I can't verify this fix as I can't reproduce the original issue.  Can
> someone confirm that this is fixed?

I'll try this out sometime soon.
Flags: needinfo?(gary)
I'm assuming this is fixed as I couldn't reproduce on a shell from m-c tip, but due to the testcase being so intermittent, I was unable to reproduce this a second time from the same listed rev. Nonetheless, let's file a new bug should something else pop up again.
Flags: needinfo?(gary)
Backing out this bug temporarily because I'm having trouble verifying whether or not it introduced a perf regression, and I want to eliminate this patch from consideration.  This is a simple DoS bug, so not too worried about it.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(In reply to Kannan Vijayan [:djvj] from comment #29)
> Backing out this bug temporarily because I'm having trouble verifying
> whether or not it introduced a perf regression, and I want to eliminate this
> patch from consideration.  This is a simple DoS bug, so not too worried
> about it.

Backout changeset:

https://hg.mozilla.org/integration/mozilla-inbound/rev/a8775731e2fe

(Setting [leave open] to prevent bug from being closed when backout makes it to m-c)
Whiteboard: [leave open]
https://hg.mozilla.org/mozilla-central/rev/6990146cd895

AFAICT, the [leave open] was for comment 31, so I'm re-closing. That said, this landed w/o sec-approval? Can we please get an Aurora/Beta approval request on this ASAP?
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Flags: needinfo?(kvijayan)
Resolution: --- → FIXED
Whiteboard: [leave open]
I don't think this is that important for uplift.  It's a DoS on win8 that's only triggered by pathological conditions.  Those conditions could be easily engineered, but the best an attacker could do with this is force a crash.

I'm game for uplifting if you think it's necessary, but it doesn't seem to be.  Let me know if you still want uplift request.
Flags: needinfo?(kvijayan) → needinfo?(ryanvm)
Based on your reply, it sounds like this isn't really sec-critical then? That's all I'm going on here :)
Flags: needinfo?(ryanvm)
Yeah.  Sec-low is probably appropriate.
Keywords: sec-criticalsec-low
Marking as a "won't fix" for Aurora and Beta.
Kannan: is this Windows crash specific to CodeGenerator::generateArgumentsChecks() or could it also affect other calls to masm.reserveStack()?
Flags: needinfo?(kvijayan)
My initial reaction to that question was that it was, but I went and doublechecked anyway and I found one other place where it might be an issue.

When we ion-compile jsop_calls() along the generic path (CallGeneric), and the call contains a large number of arguments (say 512 or more), then the arguments slotting on stack will use up 4k or more in a single reserveStack().

I'll write up a patch and post it on this bug, as it's basically the same issue.  I think I'll just restrict have ion abort if scripts have calls with more than a threshold number of arguments (say 256).  That'll limit argvector size to 2k bytes.
Flags: needinfo?(kvijayan)
Just talked to jandem about this.  I was mistaken.  The stack space for the CallGeneric codegen is rolled into the frame allocation, so it's covered.

So we're good :)
Group: javascript-core-security
Based on the work involved to reproduce this, the unreliability of the crash and the conclusions drawn in comments 28, 33 and 39, I am marking qe-verify-. If someone feels that we should spend more time on this bug, please advise. Thank you.
QA Whiteboard: qe-verify-
Whiteboard: [adv-main32+]
QA Whiteboard: qe-verify-
Flags: qe-verify-
Group: core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.