Closed Bug 1459568 Opened 3 years ago Closed 3 years ago

Assertion failure: InternalBarrierMethods<T>::thingIsNotGray(v) || CurrentThreadIsTouchingGrayThings(), at js/src/gc/Barrier.h:339

Categories

(Core :: JavaScript Engine, defect, P1)

x86_64
Linux
defect

Tracking

()

VERIFIED FIXED
mozilla62
Tracking Status
firefox-esr52 --- unaffected
firefox-esr60 61+ fixed
firefox60 --- wontfix
firefox61 + verified
firefox62 + verified

People

(Reporter: gkw, Assigned: jonco)

References

Details

(6 keywords, Whiteboard: [jsbugmon:update][adv-main61+][adv-esr60.1+])

Attachments

(2 files)

The following testcase crashes on mozilla-central revision b1628ac71fcc (build with --enable-debug --enable-more-deterministic, run with --fuzzing-safe --no-threads --no-baseline --no-ion):

// jsfunfuzz-generated
gcparam("markStackLimit", 2);
// Adapted from randomly chosen test: js/src/tests/non262/extensions/collect-gray.js
(function() {
    grayRoot().x = 0;
    gc();
    grayRoot().map = new (function(){});
})();

Backtrace:

#0  js::CheckTargetIsNotGray<JSLinearString*> (v=<optimized out>) at js/src/gc/Barrier.h:342
#1  0x0000000000c2f216 in js::CheckTargetIsNotGray<js::Shape*> (v=<optimized out>) at js/src/gc/Barrier.h:339
#2  0x0000000000c6c237 in js::GCPtr<js::Shape*>::set (v=<synthetic pointer>, this=0x7ffffd398af0) at js/src/gc/Barrier.h:489
#3  js::GCPtr<js::Shape*>::operator= (p=<synthetic pointer>, this=0x7ffffd398af0) at js/src/gc/Barrier.h:485
#4  js::Shape::setParent (this=0x7ffffd398ad8, p=0x7ffffd3b03f8) at js/src/vm/Shape.h:799
/snip

For detailed crash information, see attachment.

Setting s-s as a start because gc is involved.
Due to skipped revisions, the first bad revision could be any of:
changeset:   https://hg.mozilla.org/mozilla-central/rev/0ab0d909476f
user:        Richard Pospesel <richard>
date:        Wed Apr 18 13:41:00 2018 +0300
summary:     Bug 859782 - Firefox cannot start without /proc (chroot) r=sphink

changeset:   https://hg.mozilla.org/mozilla-central/rev/363cc3b4dd41
user:        André Bargull
date:        Thu Apr 19 10:46:49 2018 +0200
summary:     Bug 1453922: Add fast path for non-negative int32 values to ToIndex. r=jandem

changeset:   https://hg.mozilla.org/mozilla-central/rev/ff7588bba148
user:        Jan de Mooij
date:        Thu Apr 19 13:02:00 2018 +0200
summary:     Bug 1064316 - Rewrite check_macroassembler_style.py to use os.walk instead of looking at the repo data. r=nbp

changeset:   https://hg.mozilla.org/mozilla-central/rev/2f7d0134b221
user:        Jan de Mooij
date:        Thu Apr 19 13:04:46 2018 +0200
summary:     Bug 1452982 part 14 - Rename 'active thread' to 'main thread'. r=jonco

changeset:   https://hg.mozilla.org/mozilla-central/rev/292f8e5c6336
user:        Jan de Mooij
date:        Thu Apr 19 13:06:12 2018 +0200
summary:     Bug 1452982 part 15 - Rename some constants. r=jonco

changeset:   https://hg.mozilla.org/mozilla-central/rev/cf2687e4e96e
user:        Jan de Mooij
date:        Thu Apr 19 13:14:18 2018 +0200
summary:     Bug 1452602 - Mark some shell functions as fuzzing-safe. r=jonco

changeset:   https://hg.mozilla.org/mozilla-central/rev/22ed5e1657aa
user:        Margareta Eliza Balazs
date:        Thu Apr 19 14:26:20 2018 +0300
summary:     Backed out changeset 0ab0d909476f (bug 859782) for bustage in builds/worker/workspace/build/src/js/src/util/NativeStack.cpp on a CLOSED TREE

Guessing this might be related to bug 1452602, I went ahead and removed --fuzzing-safe and re-did a bisect:

autobisectjs shows this is probably related to the following changeset:

The first bad revision is:
changeset:   https://hg.mozilla.org/mozilla-central/rev/7a805b66dfcc
parent:      405572:f361cfca3755
user:        Jon Coppeard
date:        Tue Feb 27 13:01:49 2018 +0000
summary:     Bug 1440739 - Improve gray marking assertions to cover more types of pointer r=sfink

Jon, is bug 1440739 a likely regressor?
Blocks: 1440739
Flags: needinfo?(jcoppeard)
(In reply to Gary Kwong [:gkw] [:nth10sd] from comment #2)
> Jon, is bug 1440739 a likely regressor?

I expect that's the bug that added the assertion that's failing.
Flags: needinfo?(jcoppeard)
The fuzzers have found a gray marking bug!
Assignee: nobody → jcoppeard
When marking delayed arenas we need to take account of whether we are doing black or gray marking.  Only cells marked with the expected colour need to be considered.  We don't need to mark the cell itself again, just trace its children.
Attachment #8974069 - Flags: review?(sphink)
Comment on attachment 8974069 [details] [diff] [review]
bug1459568-gray-root

Review of attachment 8974069 [details] [diff] [review]:
-----------------------------------------------------------------

Whoa. Thanks for tracking that down.

::: js/src/gc/Marking.cpp
@@ +2586,5 @@
> +
> +    // Whether we need to mark children of gray or black cells in the arena
> +    // depends on which kind of marking we were doing when the arena as pushed
> +    // onto the list.  We never change mark color without draining the mark
> +    // stack though so this is the same as the current color.

It's a somewhat subtle invariant, which makes me think it would be useful to change arena->markOverflow to store the 2-bit color instead of a 1-bit bool, and then assert markColor() == arena->markOverflowColor. Except that it looks like it might complicate the bitpacking in arenas, and that's unappealing, so never mind.
Attachment #8974069 - Flags: review?(sphink) → review+
Priority: -- → P1
(In reply to Steve Fink [:sfink] [:s:] (PTO June 31) from comment #6) 
> It's a somewhat subtle invariant, which makes me think it would be useful to
> change arena->markOverflow to store the 2-bit color instead of a 1-bit bool,
> and then assert markColor() == arena->markOverflowColor. Except that it
> looks like it might complicate the bitpacking in arenas, and that's
> unappealing, so never mind.

There are still two unused bits in the Arena flags there so it should complicate the bitpacking..
Could this explain Bug 1441002 crashes?
Comment on attachment 8974069 [details] [diff] [review]
bug1459568-gray-root

[Security approval request comment]
How easily could an exploit be constructed based on the patch? Very difficult, requires controlling mark stack overflow.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem? No.

Which older supported branches are affected by this flaw? Everything back to time immemorial.

If not all supported branches, which bug introduced the flaw? All branches.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be? Should be simple.

How likely is this patch to cause regressions; how much testing does it need? Unlikely.
Attachment #8974069 - Flags: sec-approval?
(In reply to Ted Campbell [:tcampbell] from comment #8)
> Could this explain Bug 1441002 crashes?

I'm not sure but probably not.  Hopefully it explains our gray marking assertion failures though.
Comment on attachment 8974069 [details] [diff] [review]
bug1459568-gray-root

sec-approval+ to land. We'll want this on beta and ESR branches eventually.
Attachment #8974069 - Flags: sec-approval? → sec-approval+
https://hg.mozilla.org/mozilla-central/rev/f75ee4da0e5d

This grafts cleanly to Beta/ESR60 as-landed. It'll need a rebased patch for ESR52 and approval requests when you get a chance.
Group: javascript-core-security → core-security-release
Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(jcoppeard)
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla62
Duplicate of this bug: 1460065
Status: RESOLVED → VERIFIED
JSBugMon: This bug has been automatically verified fixed.
Comment on attachment 8974069 [details] [diff] [review]
bug1459568-gray-root

Approval Request Comment
[Feature/Bug causing the regression]: Unknown.
[User impact if declined]: Possible crash / security vulnerability.
[Is this code covered by automated tests?]: Yes.
[Has the fix been verified in Nightly?]: Yes.
[Needs manual test from QE? If yes, steps to reproduce]: No.
[List of other uplifts needed for the feature/fix]: None.
[Is the change risky?]: Low risk.
[Why is the change risky/not risky?]: It's a subtle problem but a relatively simple fix and it is now exercised by the tests.
[String changes made/needed]: None.
Flags: needinfo?(jcoppeard)
Attachment #8974069 - Flags: approval-mozilla-beta?
Comment on attachment 8974069 [details] [diff] [review]
bug1459568-gray-root

Ditto previous comment.
Attachment #8974069 - Flags: approval-mozilla-esr60?
Comment on attachment 8974069 [details] [diff] [review]
bug1459568-gray-root

Fixes a sec-high. Approved for 61.0b5.
Attachment #8974069 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(In reply to Ryan VanderMeulen [:RyanVM] from comment #13)
How important is this to fix for ESR52?  This code has changed a bit since then and none of the infrastructure used in the test is present on that branch so we would have to land without the test.
Flags: needinfo?(ryanvm)
It's rated sec-high, so the default position is that we need to backport it. If you don't think ESR52 is practical to fix this on, you'll want the sec team to buy in on that decision.
Flags: needinfo?(ryanvm) → needinfo?(dveditz)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #21)
It can be done, but it's going to me more work than a simple backport.  Ideally I'd like to backport enough of the test infrastructure so we can test the fix.
One question is whether this is truly sec-high given comment 9. The testcase is reliable and the results are sec-high-like, but hinges on the shell-only functions grayRoot(). We usually assume attackers can trigger gc at opportune times and don't downgrade jsshell tests because of that, but can attackers manipulate gray roots in the necessary way? If our algorithms are predictable, maybe?

How much work is the backport?
Flags: needinfo?(dveditz) → needinfo?(jcoppeard)
(In reply to Daniel Veditz [:dveditz] from comment #23)
JS GC things referenced from browser C++ objects are mostly gray roots, so yes these can be created easily.  The attacker would have to trigger mark stack exhaustion, but that is certainly possible.  So I think we probably do want to fix this.

The backport is a couple of days work I'd guess.
Flags: needinfo?(jcoppeard)
I backported the fix and a whole lot of other code required for the test to the esr52 branch, and found that the bug didn't reproduce there.

After some more digging I've come to the conclusion that this was introduced in bug 1323083, which changed the meaning of TenuredCell::markIfUnmarked(), which was called from GCMarker::markDelayedChildren() with the default color argument of black.

Previously if this was called on a gray cell, it wouldn't do anything because the black bit would be set.  With the new mark bit representation, gray cells do not have the black bit set and this call will change them from gray to black.  This is actually explained in the comments in bug 1323083, and is necessary for incrementalising gray marking.  But it's not what we want in markDelayedChildren().
Comment on attachment 8974069 [details] [diff] [review]
bug1459568-gray-root

sec-high fix for 60.1esr
Attachment #8974069 - Flags: approval-mozilla-esr60? → approval-mozilla-esr60+
Whiteboard: [jsbugmon:update] → [jsbugmon:update][adv-main61+][adv-esr60.1+]
JSBugMon: This bug has been automatically verified fixed on Fx61
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.