Crash in js::jit::LiveInterval::addRangeAtHead

VERIFIED FIXED in Firefox 42

Status

()

defect
--
critical
VERIFIED FIXED
4 years ago
3 years ago

People

(Reporter: bzbarsky, Assigned: bbouvier)

Tracking

(Blocks 1 bug, {crash, sec-high})

unspecified
mozilla43
x86
macOS
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(firefox40 wontfix, firefox41- wontfix, firefox42+ verified, firefox43+ fixed, firefox-esr3842+ verified, b2g-v2.0 unaffected, b2g-v2.0M unaffected, b2g-v2.1 unaffected, b2g-v2.1S unaffected, b2g-v2.2 fixed, b2g-v2.2r fixed, b2g-master fixed, thunderbird_esr38 affected)

Details

(Whiteboard: [adv-main42+][adv-esr38.4+], crash signature)

Attachments

(4 attachments, 2 obsolete attachments)

This is likely bug 1105543 or bug 1106171...
(In reply to Jan de Mooij [:jandem] from comment #1)
> This is likely bug 1105543 or bug 1106171...

Er, to be clear, those bugs are about the same issue.
Flags: needinfo?(nicolas.b.pierron)
Flags: needinfo?(jdemooij)
Depends on: 1106171
The stack seems identical to bug 1106136, that bug is reproducible on maps.yandex.ru.
Crash Signature: [@ js::jit::LiveInterval::addRangeAtHead(js::jit::CodePosition, js::jit::CodePosition)]
Keywords: crash
I was also hitting this on loading http://www.eevblog.com/files/EEVblogJobDescription.pdf, but I can't reproduce a 3rd time.
I am removing the needinfo, since there is no more reports and that Sink is disabled for the moment, leaving this bug open as a blocker.
Flags: needinfo?(nicolas.b.pierron)

Comment 7

4 years ago
Posted file Minimal testcase
This minimal test case will trigger the crash 100% of the time, immediately on load. (FYI, this presents a security risk in the form of a DoS against Firefox users.)
Group: javascript-core-security
Thanks for the testcase! I can repro this in the shell with the test below. Debug builds assert:

Assertion failure: isLowered(), at js/src/jit/MIR.h:762

function foo() {
    var x = 0, y = 0, a = new Float32Array(1);
    function bar() {
        x = y;
        y = a[0];
    }
    for (var i = 0; i < 1000; i++) {
        bar();
    }
}
for (var i=0; i<10000; i++)
    foo();
This is probably the same issue as bug 1124617.

NI from Benjamin as this is probably caused by the float32 optimization.
Flags: needinfo?(benj)
(Assignee)

Comment 11

4 years ago
Posted image 1441877785.png
Here's some analysis, for jandem's test case:
- we have a MObjectState that takes a Float32 operand as an input. Because of MObjectState's type policy, the Float32 is converted into a double with a MToDouble.
- the MObjectState is recovered on bailout, although the MToDouble isn't.
- when lowering the MInterruptCheck, we assign a safepoint to the instruction, which creates a snapshot. This doesn't create a new use for the MObjectState as it's recovered on bailout. Although, it visits each MObjectState's operand and try to create a new use for all of them which aren't recovered on bailout.
- For most of them, it's ok: but the MToDouble introduced by the TypePolicy isn't recovered on bailouts and it's located *after* the MInterruptCheck, so it hasn't been lowered yet.
- KABOUM.
Flags: needinfo?(benj)
(Assignee)

Comment 12

4 years ago
Posted patch Tentative fix (obsolete) — Splinter Review
So this case can happen both with ObjectState and ArrayState, I think, which are the two users of NoFloatAfterPolicy that automatically are recovered on bailouts. So a naive fix consists in propagating the recoveredOnBailout flag to the MToDouble inserted after the float32 input. Unfortunately, this really looks like a bandage rather than a general fix. Moreover, now we have to create not-recoveredOnBailouts copies of the MToDouble afterwards (as it's actually reused in the loop body).

Asking tentatively for review, but feel free to provide a better, more general fix, and ask me for review.
Attachment #8659187 - Flags: review?(nicolas.b.pierron)
(Assignee)

Comment 13

4 years ago
FWIW, for the two test cases (the one in comment 9 and the one in bug 1124617 comment 0, both in the patch), the regression was introduced by

changeset:   https://hg.mozilla.org/mozilla-central/rev/b421cdd4b918
user:        Nicolas B. Pierron
date:        Fri Dec 19 15:28:31 2014 +0100
summary:     Bug 991720 part 4 - Scalar replacement registers the state instead of replacing resume points operands. r=h4writer

However, at the revision just before this one, there was another assertion failing:
Assertion failure: iter->isInterruptCheck() || iter->isInterruptCheckPar(), at /moz/repo/js/src/jit/shared/CodeGenerator-shared.cpp:1260
Comment on attachment 8659187 [details] [diff] [review]
Tentative fix

Review of attachment 8659187 [details] [diff] [review]:
-----------------------------------------------------------------

I think this fix looks good.  This sounds like the most simple approach to address this problem (thanks for the analysis in comment 11)

MInterruptCheck has some requirements of being the first instruction in the loop header, which means that when Scalar Replacement adds the MObjectState for the resume point with MBasicBlock::safeInsertTop [1], it adds the instructions after the interrupt check.
Then the TypePolicy fix the inputs of the MObjectState by inserting the MToDouble, as we cannot accept any float32 operands as the value of any property of objects (by the way we could do it in this particular test case, as the object is in fact the call object)

Flagging the newly created MToDouble as recovered instruction, when the instruction which issued the convertion is already a recovered instruction should fix this issue, as the RecoveredOnBailout flag prevent any folding from GVN, which prevents the MToDouble instruction from remaining live, while being used by the MObjectState which is it-self used by the MInterruptCheck snapshot (as described by benjamin).

I think we should make a follow-up patch to improve AssertResumePointDominatedByOperands [2] in the case of the entry resume point, by checking no fallible instruction can be placed before any of the non-recovered-on-bailout resume point operands, and the same rule as the resume point should apply for the resume point operands which are recovered on bailout.

[1] https://dxr.mozilla.org/mozilla-central/source/js/src/jit/MIRGraph.cpp?from=MIRGraph.cpp#825
[2] https://dxr.mozilla.org/mozilla-central/source/js/src/jit/IonAnalysis.cpp#2225
Attachment #8659187 - Flags: review?(nicolas.b.pierron) → review+
(Assignee)

Comment 15

4 years ago
[Tracking Requested - why for this release]: failures introduced by bug 991720, which landed in gecko 37 as per bugzilla
https://hg.mozilla.org/mozilla-central/rev/fb1f0c400ec0
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
The test case for this (jit-test/tests/ion/bug1107011-2.js) times out on arm simulator debug builds even on a fast laptop.  Is it reasonable to mark the test case as "slow" for now?  Presumably we get less test coverage that way.
Flags: needinfo?(benj)
(Assignee)

Comment 18

4 years ago
(In reply to Lars T Hansen [:lth] from comment #17)
> The test case for this (jit-test/tests/ion/bug1107011-2.js) times out on arm
> simulator debug builds even on a fast laptop.  Is it reasonable to mark the
> test case as "slow" for now?  Presumably we get less test coverage that way.

Yes, sounds fine. It was also slow on my machine (x64), so it can presumably be changed to trigger with --ion-eager... I'll look into that.
Flags: needinfo?(benj)
(Assignee)

Comment 19

4 years ago
Got r+ from lth over irc to reduce the iteration count.
Attachment #8661173 - Flags: review+
(Assignee)

Comment 21

4 years ago
Approval Request Comment
[Feature/regressing bug #]: 991720
[User impact if declined]: crashes, probably correctness issues as well
[Describe test coverage new/current, TreeHerder]: green on treeherder for a few days
[Risks and why]: low risk, if not none (two lines patch)
[String/UUID change made/needed]: n/a

Carrying forward r+ from nbp. I've folded the two patches in this one.
Attachment #8659187 - Attachment is obsolete: true
Attachment #8661176 - Flags: review+
Attachment #8661176 - Flags: approval-mozilla-beta?
Attachment #8661176 - Flags: approval-mozilla-aurora?
Assignee: nobody → benj
Comment on attachment 8661176 [details] [diff] [review]
Tentative fix with reduced iteration count

Next time, please request for the sec approval.
Taking it for 42, might be too late for 41.
Attachment #8661176 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+

Comment 23

4 years ago
This is marked security but has no security rating, I thought we needed a rating before landing any security issue (and sec-approval if it's rated sec-high or sec-critical)?
We do, I mentioned that to Benjamin on IRC... but it is too late as it already landed... :(
(Assignee)

Comment 25

4 years ago
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #23)
> This is marked security but has no security rating, I thought we needed a
> rating before landing any security issue (and sec-approval if it's rated
> sec-high or sec-critical)?

I indeed chose to not ask for sec-approval, for a few reasons:
- next release is next week, so i thought this would make it on the next release. Sylvestre told me on IRC he doubt that. That was a bad call of mine to judge this, and I am sorry about that.
- all instances in comment 0 concern a deref of a pointer located at 0x20, which is a "safe" (non-exploitable) crash, so we shouldn't be afraid of an exploit and a chemspill. The only thing we could be afraid of is a DOS, that is, a website using this to trigger a crash of firefox, but most fuzzbugs seem to fall into that category anyways.
- an example of this is bug 1124617, which has a test case that triggers the same issue, and it is *public*. It's been fixed by this patch as well (test case has also been included here).

So sorry about that, I've duly noted it and I'll think about it next time (and hopefully bug 1158178 will help for that matter).
(Assignee)

Updated

4 years ago
Duplicate of this bug: 1124617
Would still be good to get a security rating assigned to this bug.
Flags: in-testsuite+
Target Milestone: --- → mozilla43
The current problem involve a snapshot, which is filled with a registers which is not yet defined, and thus it might values which are no longer used / nor spilled. This effectively means that we can leak data in this register, which is then interpreted as a Double value after a bailout with this snapshot.
Keywords: sec-high
Given that this was rated sec-high and affects older branches as well (and has had a testcase landed now too, which we typically avoid on sec bugs that affect older branches), please request a retroactive sec-approval for tracking purposes.
https://wiki.mozilla.org/Security/Bug_Approval_Process

Also, looks like this needs an esr38 approval request as well.
Flags: needinfo?(benj)
(Assignee)

Comment 30

4 years ago
Comment on attachment 8661176 [details] [diff] [review]
Tentative fix with reduced iteration count

[Approval Request Comment]
For esr38: See comment 21.

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
See comment 0 (null deref => safe crashes) and comment 28 (fill a JS value with any arbitrary data).

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
- checkin comment: yes, it explains what the problem is.
- tests: no in a first glance; but as they trigger an assertion, a careful analysis of somebody who'd know the code quite well could lead to the same conclusion as in comment 28.

Which older supported branches are affected by this flaw?
This landed in bug 991720, so everything based on gecko 37.

If not all supported branches, which bug introduced the flaw?
bug 991720

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
I assume the initial patch can be backported on all branches without any changes (this code hasn't changed much in a few months). It doesn't seem risky (two lines change), but fuzzers could prove me wrong by exhibiting an even more over the top test case.

How likely is this patch to cause regressions; how much testing does it need?
No clear ways to cause regressions. A few tests are checked in within the patch, but a few days of fuzzing could probably make this safer.
Flags: needinfo?(benj)
Attachment #8661176 - Flags: sec-approval?
Attachment #8661176 - Flags: approval-mozilla-esr38?
Group: core-security-release
Group: javascript-core-security
Comment on attachment 8661176 [details] [diff] [review]
Tentative fix with reduced iteration count

I discussed this with DVeditz (thanks!) and we do not feel this sec issue meets the RC bar. At this point in the 41 cycle (RC week), only bugs which are sec-crit + easy to exploit, severe crash/hangs and major regressions that make the release unusable will be considered. Given that, this is a wontfix for 41.
Attachment #8661176 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
Tracked for 42 and wontfix for 41 (see comment 31)
sec-approval+ for checkin on Oct 6, two weeks into the next cycle.
Whiteboard: [checkin on 10/6]
Attachment #8661176 - Flags: sec-approval? → sec-approval+
Never mind. I see we landed this without sec-approval. Sigh.
Whiteboard: [checkin on 10/6]
Depends on: 1205842
Comment on attachment 8661176 [details] [diff] [review]
Tentative fix with reduced iteration count

This patch has been in other places for about two weeks. It needs to be uplifted to esr38 as well.
Attachment #8661176 - Flags: approval-mozilla-esr38? → approval-mozilla-esr38+
Do both of the patches need uplifted to esr38 or just the one with the flag attached to it?
Flags: needinfo?(benj)
(Assignee)

Comment 39

4 years ago
Comment on attachment 8661173 [details] [diff] [review]
reduce-number-iterations.patch

Only the one with the approval flag (obsoleting the one that doesn't need to be landed).
Attachment #8661173 - Attachment is obsolete: true
Flags: needinfo?(benj)
(Assignee)

Comment 42

4 years ago
Probably a dependent patch which needs uplifting... Looking.
(Assignee)

Comment 43

4 years ago
Posted patch uplift.patchSplinter Review
After some (fun) bisecting between mozilla-beta (which doesn't crash on these test cases) and esr38 (which does), I've identified that a small subset of 703cef22656c (landed in mozilla 39 as part of bug 1137688) was needed for this patch to work on the ESR38.

Flagging nbp for review, but this has landed on all the other branches, so it should be pretty straightforward. Also, can I push this to try, or should we be discrete, as it's sec-high? Should I fold both patches together?
Attachment #8673202 - Flags: review?(nicolas.b.pierron)
Flags: needinfo?(benj)
Attachment #8673202 - Flags: review?(nicolas.b.pierron) → review+
(Assignee)

Comment 44

4 years ago
Comment on attachment 8673202 [details] [diff] [review]
uplift.patch

Re asking for security approval and esr38 approval for this tiny patch. See previous comment for context.

- It is needed to land the other fix (other patch, already r+ed, backed out because it caused assertion failures) to esr38
- It's part of a bigger patch that landed on mozilla39, hence there were no failures on aurora/beta

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
If the other patch is landed without this one, one could more easily build another exploit that would put the content of a register inside a JS value. To be clear: an exploit could be made if and only if the other patch lands on esr38 but this one doesn't.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
The comment does.

Which older supported branches are affected by this flaw?
none, only esr38 with the other patch.

If not all supported branches, which bug introduced the flaw?
same bug, other patch.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
Not needed for other branches, this is already a backport for esr38. Very very very very low risk (see below in the esr38 comment).

How likely is this patch to cause regressions; how much testing does it need?
Not likely. Testing has been carefully done locally and mozilla39 has had this for a long time (with tests).

[ESR38 Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: the other patch can't land on ESR => crashes + security issue
Fix Landed on Version: fix is the other patch, already landed on trunk, aurora, beta
Risk to taking this patch (and alternatives if risky): very very very (...) very low, it's making one optimization more conservative, so it will actually  trigger less often.
String or UUID changes made by this patch: n/a
Attachment #8673202 - Flags: sec-approval?
Attachment #8673202 - Flags: review?(nicolas.b.pierron)
Attachment #8673202 - Flags: review+
Attachment #8673202 - Flags: approval-mozilla-esr38?
(Assignee)

Comment 45

4 years ago
Comment on attachment 8673202 [details] [diff] [review]
uplift.patch

Oops, bugzilla clobbered the r+ from nbp
Attachment #8673202 - Flags: review?(nicolas.b.pierron) → review+
Attachment #8673202 - Flags: sec-approval?
Attachment #8673202 - Flags: sec-approval+
Attachment #8673202 - Flags: approval-mozilla-esr38?
Attachment #8673202 - Flags: approval-mozilla-esr38+
Whiteboard: [adv-main42+][adv-esr38.4+]
I was able to reproduce this issue on Windows 10 x 86, using Firefox ESR 38.3.0 (20150916094008) and the 'Minimal testcase' attached.

The crash is no longer reproducible on Firefox 42.0 (20151026170526), Firefox ESR 38.4.0 (20151027170520) builds across platforms [1], using the attached 'Minimal testcase' and the problematic websites from the comments: 
- http://theconversation.com/the-rape-scene-in-brad-pitts-fury-no-one-is-talking-about-33638
- http://www.ruf.rice.edu/~lane/papers/male_female.pdf
- http://www.eevblog.com/files/EEVblogJobDescription.pdf

[1] - Windows 10 x86, Ubuntu 14.04 x86, Mac OS X 10.11
Status: RESOLVED → VERIFIED
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.