Closed Bug 1064835 Opened 10 years ago Closed 10 years ago

Assertion failure: stack_[*size_].isJs(), at vm/SPSProfiler.cpp:192 or Crash on Heap

Categories

(Core :: JavaScript Engine, defect)

x86_64
Linux
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla36
Tracking Status
firefox33 --- wontfix
firefox34 --- verified
firefox35 + verified
firefox36 + verified
firefox-esr31 --- wontfix
b2g-v1.4 --- wontfix
b2g-v2.0 --- wontfix
b2g-v2.0M --- wontfix
b2g-v2.1 --- fixed
b2g-v2.2 --- fixed

People

(Reporter: decoder, Assigned: djvj)

Details

(4 keywords, Whiteboard: [adv-main34+])

Attachments

(1 file)

The following testcase asserts on mozilla-central revision 6b8da5940f74 (run with --no-threads --fuzzing-safe):


var lfcode = new Array();
lfcode.push("");
lfcode.push("");
lfcode.push("");
lfcode.push("x = function () { return arr.length; }");
lfcode.push("");
lfcode.push("");
lfcode.push("enableSPSProfiling();");
lfcode.push("");
lfcode.push("\
    x >>=  /x/;\
    (function f() {\
        x.r = x;\
        return f()\
    })();\
");
for (lfx in lfcode) { loadFile(lfcode[lfx]); }
function loadFile(lfVarx) {
    try {
        evaluate(lfVarx, { noScriptRval : true, compileAndGo : true }); 
    } catch (lfVare) {}
}
Crash trace:


Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7fe974d in ?? ()
#0  0x00007ffff7fe974d in ?? ()
#1  0x0000000000000001 in ?? ()
#2  0x00007fffffffd110 in ?? ()
#3  0x00007fffffffd770 in ?? ()
#4  0x0000000001666558 in ?? ()
#5  0x00007ffff7fe9700 in ?? ()
#6  0x00007fffffffce00 in ?? ()
#7  0x0000000001666540 in ?? ()
rbx     0x163d748       103102535496
=> 0x7ffff7fe974d:      movq   $0xaaff7c,(%rbx)


This bug does not reproduce with every run, you might need to run it multiple times. Marked s-s because I don't know if this affects more than just profiling. If it's profiling only, then it's probably sec-moderate.

This is a long standing bug that I was now able to isolate into a single testcase. Fixing it would also help triaging other heap crashes.
Keywords: crash
Needinfo from djvj because it's in SPS.
Flags: needinfo?(kvijayan)
Jan, is there somebody who could look at this?
Flags: needinfo?(jdemooij)
(In reply to Andrew McCreight [:mccr8] from comment #3)
> Jan, is there somebody who could look at this?

djvj is our SPS guru but is already on needinfo. Kannan any idea what's going on?
Flags: needinfo?(jdemooij)
I haven't looked at this yet, but I will this week.  Taking the bug.
Flags: needinfo?(kvijayan)
Assignee: nobody → kvijayan
QA Contact: kvijayan
QA Contact: kvijayan
Kannan, have you had a chance to look at this?  It would be nice to at least know that this is a profiler-only issue.
This is still occurring and might be hiding real bugs even if this is profiler-only. The heap crashes are not always the same and not trivial to distinguish. Please fix this bug :)
Kannan, we need to look at this or at least understand it enough to rate it. Can you please give it some attention at your earliest convenience? Thank you.
Flags: needinfo?(kvijayan)
I will look at this immediately.  I've been working hard on replacing this entire backend and was hoping to get my patches under review.  That process has started now.
Flags: needinfo?(kvijayan)
Found the issue.  Yet another corner-case path.

If we're bailing out from Ion to baseline, and we overflow the C stack on bailout, we pop an SPS frame for the last JS frame (if the bailout was not ArgumentCheck), because stack overflow causes the last frame to be dropped (and we need to pop the SPS frame to match it).

However, sometimes we duplicate this pop because occasionally, the baseline bailout is NOT an ArgumentCheck, but it STILL resumes into the prologue of the baseline script in some corner cases.  In these cases, the baseline-bailout code itself pops the SPS frame, and then subsequently overflows, and then the general bailout code does another pop of the SPS frame, leading to a double-pop.

This _could_ be leveraged to engineer an attack.  If an attacker were able to generate several of these, then they can cause the SPS stack size field to underflow and become negative.  Since the offset is a uint32_t, this becomes a positive offset around 4-billion.  On 64-bit machines, that could well point into valid memory.

Then by subsequently controlling the JS code that gets called AFTER that underflow, some words can have controlled data written to them.  This is because as the JS pushes pseudostack entries, it'll be pushing them at the underflowed offset (several gigabytes above the buffer pointer).  Since an attacker can control the pcOffsets at which call sites occur in JS bytecode, and the PC offsets are stored in pseudostack entries, these words may also be controllable.

This is a very hard attack to execute, but it _is_ an opening.
Fix for bug.  Running through try before sending for review:  https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=f8efc7516fce
Comment on attachment 8502058 [details] [diff] [review]
fix-bug-1064835.patch

Try run seems ok.  Sending for review.
Attachment #8502058 - Flags: review?(jdemooij)
Comment on attachment 8502058 [details] [diff] [review]
fix-bug-1064835.patch

Review of attachment 8502058 [details] [diff] [review]:
-----------------------------------------------------------------

Please add the testcase as well (not sure if we treat profiler bugs as sec-high/critical since content can't enable it...).
Attachment #8502058 - Flags: review?(jdemooij) → review+
I would err on the side of treating profiler bugs with the same level of seriousness as regular bugs.

I have the test on the patch queue.  I'll land it after the patch gets uplifted.
Keywords: sec-high
 https://hg.mozilla.org/integration/mozilla-inbound/rev/53d8fbf54227

Test case queued up.  Marking in-testsuite=? as reminder about test case push after this gets pulled upstream.
Flags: in-testsuite?
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Status: RESOLVED → VERIFIED
JSBugMon: This bug has been automatically verified fixed.
Why isn't this being backported to Aurora and Beta (and ESR31, if it is affected) since it is a sec-high rated security bug?

Also, this went in without sec-approval. Unless it is trunk only, sec-high and sec-critical rated issues need to be approved before landing to help avoid 0daying ourselves. See https://wiki.mozilla.org/Security/Bug_Approval_Process for policy.
Flags: needinfo?(kvijayan)
(In reply to Al Billings [:abillings] from comment #18)
> Why isn't this being backported to Aurora and Beta (and ESR31, if it is
> affected) since it is a sec-high rated security bug?
> 
> Also, this went in without sec-approval. Unless it is trunk only, sec-high
> and sec-critical rated issues need to be approved before landing to help
> avoid 0daying ourselves. See
> https://wiki.mozilla.org/Security/Bug_Approval_Process for policy.

An attack using this is hard to engineer, and it requires that profiling be turned on.  I documented one possible attack vector just for the sake of thoroughness.  I think the sec rating of this bug can be reduced to sec-moderate.

We have been fixing and landing these pseudostack sync bugs for a while (work is underway to replace the infrastructure to eliminate that class of bugs entirely, but it has yet to land).  We haven't been uplifting these as they land, so I assumed the same policy here.

That said, I didn't notice the sec-high on this bug, and wasn't aware of the specific policy for dealing with them.  I'll be more alert about this in the future.

Pinging gkw.. do we downgrade this bug or do you think it needs uplift?
Flags: needinfo?(kvijayan) → needinfo?(gary)
(In reply to Kannan Vijayan [:djvj] from comment #19)
> That said, I didn't notice the sec-high on this bug, and wasn't aware of the
> specific policy for dealing with them.  I'll be more alert about this in the
> future.
> 
> Pinging gkw.. do we downgrade this bug or do you think it needs uplift?

While I'm not entirely sure about the severity of this bug myself, I'd say uplift it because it would help decoder:

(In reply to Christian Holler (:decoder) from comment #1)
> This is a long standing bug that I was now able to isolate into a single
> testcase. Fixing it would also help triaging other heap crashes.

Generally it is really appreciated to uplift issues where it would help the fuzzing folks.
Flags: needinfo?(gary) → needinfo?(kvijayan)
Comment on attachment 8502058 [details] [diff] [review]
fix-bug-1064835.patch

Approval Request Comment
[Feature/regressing bug #]:
SPS Profiler implementation in JS engine.

[User impact if declined]:
Potentially exploitable crash when profiling turned on.

[Describe test coverage new/current, TBPL]:
New fuzztest to be landed in jit-tests.  Hasn't been checked in yet.

[Risks and why]: 
Low risk, minor changes to profiling logic only.

[String/UUID change made/needed]:
N/A
Flags: needinfo?(kvijayan)
Attachment #8502058 - Flags: approval-mozilla-beta?
Attachment #8502058 - Flags: approval-mozilla-aurora?
Comment on attachment 8502058 [details] [diff] [review]
fix-bug-1064835.patch

Approved for Aurora and Beta. Please get this in before Monday as our beta build for Beta branch goes out then and we're not allowing checkins that aren't sec-critical after that.
Attachment #8502058 - Flags: approval-mozilla-beta?
Attachment #8502058 - Flags: approval-mozilla-beta+
Attachment #8502058 - Flags: approval-mozilla-aurora?
Attachment #8502058 - Flags: approval-mozilla-aurora+
JSBugMon: This bug has been automatically verified fixed on Fx34
JSBugMon: This bug has been automatically verified fixed on Fx35
Whiteboard: [adv-main34+]
Lowering severity per comment 19
Keywords: sec-highsec-moderate
Group: core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.