Closed Bug 1342261 Opened 3 years ago Closed 3 years ago

Assertion failure: comp == compartment || runtime()->isAtomsCompartment(comp) || (srcKind == JS::TraceKind::Object && InCrossCompartmentMap(static_cast<JSObject*>(src), thing)), at js/src/jsgc.cpp:3725

Categories

(Core :: JavaScript Engine, defect, critical)

x86_64
macOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox-esr45 --- unaffected
firefox52 --- unaffected
firefox-esr52 --- unaffected
firefox53 --- unaffected
firefox54 + fixed
firefox55 + fixed

People

(Reporter: gkw, Assigned: sfink)

References

(Blocks 2 open bugs)

Details

(4 keywords, Whiteboard: [jsbugmon:update])

Attachments

(3 files, 2 obsolete files)

The following testcase crashes on mozilla-central revision 5069348353f8 (build with --enable-debug --enable-more-deterministic, run with --fuzzing-safe --no-threads --baseline-eager --no-ion):

fullcompartmentchecks(true);
newGlobal({
    sameZoneAs: []
}).z;

Backtrace:

0   js-dbg-64-dm-clang-darwin-5069348353f8	0x000000010339d177 CompartmentCheckTracer::onChild(JS::GCCellPtr const&) + 727 (jsgc.cpp:3723)
1   js-dbg-64-dm-clang-darwin-5069348353f8	0x00000001036763c4 JS::CallbackTracer::onObjectEdge(JSObject**) + 52 (TracingAPI.h:142)
2   js-dbg-64-dm-clang-darwin-5069348353f8	0x000000010385161f JSObject* DoCallback<JSObject*>(JS::CallbackTracer*, JSObject**, char const*) + 63 (TracingAPI.h:258)
3   js-dbg-64-dm-clang-darwin-5069348353f8	0x0000000102fd3c26 void js::jit::TraceCacheIRStub<js::jit::ICStub>(JSTracer*, js::jit::ICStub*, js::jit::CacheIRStubInfo const*) + 214 (CacheIRCompiler.cpp:836)
/snip

For detailed crash information, see attachment.

Compartment / GC bug, let's set this s-s as a start.
autoBisect shows this is probably related to the following changeset:

The first bad revision is:
changeset:   https://hg.mozilla.org/mozilla-central/rev/6b718178f43f
user:        Tom Schuster
date:        Wed Feb 22 21:16:04 2017 +0100
summary:     Bug 1319087 - Implement a CrossCompartmentWrapper IC stub. r=bz,bholley,jandem

Tom, is bug 1319087 a likely regressor?
Flags: needinfo?(evilpies)
Flags: needinfo?(evilpies)
Okay, I am assuming this assert doesn't like the different compartment's global we are embedding.
Flags: needinfo?(evilpies)
Would it maybe be easier to use the JSCompartment* again and call mark() on that, or does that not actually help with keeping the compartment alive?
Keywords: sec-high
Assignee: nobody → evilpies
I am not really sure how to solve this. I tried changing TraceCrossCompartmentEdge so it works for just the object pointer, but that didn't help. We still assert because InCrossCompartmentMap returns false.

Steve, maybe you can take this instead? I don't really have any sense on how to fix this.
Flags: needinfo?(evilpies) → needinfo?(sphink)
I don't *think* this is a real problem, since these pointers are guaranteed to be same-zone. If they were cross-zone, this would be completely broken.

And yet, I don't really want to just loosen the compartment checker assertion, since that would apply to everything, and we'd miss a lot of bugs.

Most of the time, it isn't an issue because something else is keeping the wrapper alive. But this test shows that the stubField can be the only thing keeping the wrapped value alive. I assume that means the wrapper gets collected, and then we have an unregistered cross-compartment pointer.

I just tried adding the wrapper to guardCompartment to preserve it, and it didn't fix the problem. Looking more closely into that now...
Assignee: evilpies → sphink
Flags: needinfo?(sphink)
Ok, just keeping the wrapper alive isn't enough, and I'm not sure whether the wrapper is actually dead here in the first place.

The assertion assumes that cross-compartment edges will always be JSObject* -> JSObject*. But for the purposes of this checking, we have a JSScript* -> JSObject* edge.

Then again, that's a good thing! I can loosen the assertion for only this case now. I'm still inclined to store the wrapper in a stubField, because I'm not totally confident that the same-zone cross-compartment thing is safe without a wrapper map entry, and I could imagine us adding additional cross-compartment handling in the future that relies on the wrapper map even if we don't already have it now. Also, this may make it possible to just remove the same-zone constraint, though maybe that won't buy us anything
I still need to add the test, but I'll do that in a separate patch after a meeting. I wanted to get this up for review.
Attachment #8847682 - Flags: review?(evilpies)
Thanks for picking this up. What happens when we nuke the wrapper, but the compartment is still alive?
(In reply to Tom Schuster [:evilpie] from comment #9)
> Thanks for picking this up. What happens when we nuke the wrapper, but the
> compartment is still alive?

Which compartment?

The wrapped compartment will definitely still be alive, since you have a reference to its global in the other stubField. I didn't change that, and it is still necessary. The nuked wrapper won't really point into it anymore (well, it weakly does since bug 1343261.) But the wrapper itself doesn't get collected just because it has been nuked. It will lurk about in the wrapper map, and in the patch's new stubField, until the target object dies.

The wrapper compartment is the one containing the script. It won't be affected by the wrapper getting nuked.
Attachment #8847682 - Flags: review?(evilpies) → review+
Ugh. This seems to be breaking a jetpack test:

https://queue.taskcluster.net/v1/task/XccjMg-YQA-yng4bOBdaXA/runs/0/artifacts/public/logs/live_backing.log

It requires a "precise" GC to collect two specific objects, and it only seems to get one of them.
Phew. I finally tracked this down.

The problem is what I said above, though there are more compartments and wrappers involved than I expected.

The failing test code in addon-sdk/source/test/test-disposable.js is basically:

  let Foo = Class({ ...object implementing WeakReference... });
  let foo1 = Foo();
  foo1 = null;
  Cu.schedulePreciseGC(function () {
    unload(); # Runs "disposers" of all live Foo's
    assert that foo1 was collected before its disposer ran
  });

It's confusing because the Foo thing does not itself has a finalizer. unload() iterates through a weak set of Foo objects and calls dispose() on them.

foo1's wrapper got embedded in a baseline IC stub. How that happened, I'm really not sure; I don't see how this could ever get warm enough. It was probably getting captured as part of a different closure or something, but even then I don't see how it happened. But I guess it's good for us, since it shows the bug.

The WeakReference makes GC visible, so schedulePreciseGC is used for a language-level precise GC that is expected to clean up everything not otherwise referenced. It delays the GC by posting it to the event loop, so that the GC will run when no other code is executing. In this case, the object stored in foo1 was only referenced by the foo1 local variable, and the test is checking to be sure it got cleaned up.

The first problem is that schedulePreciseGC does not discard JIT code, and so now that we added the extra traced pointer to the wrapper, the jitcode is holding the wrapper live. I used a different gcreason enum value to request jitcode to be flushed to fix this.

Sadly, it still didn't work. The problem turned out to be that a jitcode-discarding GC works by marking everything, including the jitcode to be discarded, and then throwing out the jitcode while sweeping. Which means the wrapper gets marked before the jitcode is discarded. If I do two schedulePreciseGC calls, then everything gets properly cleaned up and the tests pass.

jonco: do you think this is a valid usage that we need to support? Given the existence of schedulePreciseGC, it does seem like we have something of a promise that we'll at least clean up easily-killable stuff. There aren't that many uses in the tree, but a bunch are in Promise test code, and it makes sense to me that you'd want something like this to make sure all Promises get resolved.

If this is valid, it seems like the two main options are (1) make schedulePreciseGC do two GCs internally, or (2) discard jitcode during marking. The former is a hack that may break if we have something else that also needs iterative cleanup, but the latter depends on us having everything registered properly so we won't introduce UAFs or something. And I know there are already things in jitcode cleanup that delay freeing certain things until the next minor GC, because the store buffer may have pointers into them, which does not make me sanguine about this being an easy change.
Flags: needinfo?(jcoppeard)
(In reply to Steve Fink [:sfink] [:s:] from comment #7)

As far as the original issue goes, we should be tracing a wrapper to the target global.  I said this in the comments on the original bug, but it didn't seem to make it into the patch.  Maybe we could do this and guard on the wrapper not being nuked (and remove the direct pointer to the global)?

Alternatively it may be possible just to call JSCompartment::mark() for the compartment.  That would allow the situation where we keep a compartment alive with no objects in it.  I don't think that happens in any other case so there may be problems with this.  This would need to happen if tracer->isMarkingTracer() only.

> The problem turned out to be that a jitcode-discarding GC works by marking everything, including the jitcode to be discarded, and then throwing out the jitcode while sweeping.

This is a little strange.  I would have thought discard-on-mark would be better... but Zone::discardJitCode seems to do quite a lot of stuff.  Also, it's called in a different place for incremental/non-incremental GC.  I guess it makes more sense to do in the mark phase, but for incremental GC we want to split it up by zone and so it happens in the sweep phase.
Flags: needinfo?(jcoppeard)
(In reply to Jon Coppeard (:jonco) from comment #13)
> (In reply to Steve Fink [:sfink] [:s:] from comment #7)
> 
> As far as the original issue goes, we should be tracing a wrapper to the
> target global.  I said this in the comments on the original bug, but it
> didn't seem to make it into the patch.  Maybe we could do this and guard on
> the wrapper not being nuked (and remove the direct pointer to the global)?
>
> Alternatively it may be possible just to call JSCompartment::mark() for the
> compartment.  That would allow the situation where we keep a compartment
> alive with no objects in it.  I don't think that happens in any other case
> so there may be problems with this.  This would need to happen if
> tracer->isMarkingTracer() only.

There are two different problems: the whole target compartment can be collected, leaving the unwrapped object stored in the source compartment dangling. Or just the target object could be collected, leaving the wrapper pointing into stale memory.

The fix for the first is what Tom added in response to your comment, which was to trace the target global. The fix for the second is what I added, which was to trace the target object's wrapper. But you're right, they would both be fixed if we traced a wrapper to the global, which sounds like it was your original intent with your comment.

It still seems a little dangerous, since we would have a cross-compartment edge (the target object stored in the IC) that does not appear in the wrapper map. Instead, it would be covered by a different edge in the wrapper map, for the global. I guess it should work, because the global outlives the object, and the only thing we need the wrapper map entry for is for the zone sweeping group dependency.

On the plus side, doing it that way prevents the wrapper from living long enough to trigger the test-disposable.js failure. I wish I'd realized that sooner, it would have saved me 2 days.

> > The problem turned out to be that a jitcode-discarding GC works by marking everything, including the jitcode to be discarded, and then throwing out the jitcode while sweeping.
> 
> This is a little strange.  I would have thought discard-on-mark would be
> better... but Zone::discardJitCode seems to do quite a lot of stuff.  Also,
> it's called in a different place for incremental/non-incremental GC.  I
> guess it makes more sense to do in the mark phase, but for incremental GC we
> want to split it up by zone and so it happens in the sweep phase.

I think with your simpler fix, I can ignore this for now. It still bothers me a bit that we now have some GC edges that won't get cleaned up immediately during a precise GC, and that we have a cross-compartment edge that is not registered in the wrapper map, but I can't think of any way it will bite us.
This ended up being a very simple fix, which apparently is what jonco intended in the first place.

There is no cross-compartment edge; the JIT code does the unwrapping. I thought this optimization was able to hardcode the unwrapped obj, but it's much more straightforward -- this is do a CCW unwrap in JIT code instead of calling out to C++. And the stubField values are all same-compartment with the code.
Attachment #8849674 - Flags: review?(jcoppeard)
Attachment #8847682 - Attachment is obsolete: true
Comment on attachment 8849674 [details] [diff] [review]
Permit cross-compartment edges in CacheIR stubs,

Review of attachment 8849674 [details] [diff] [review]:
-----------------------------------------------------------------

Great, I'm glad we could resolve this without adding a new kind of cross-compartment edge.
Attachment #8849674 - Flags: review?(jcoppeard) → review+
(In reply to Jon Coppeard (:jonco) from comment #16)
Oh, please add the testcase too if possible.
Attachment #8849674 - Attachment is obsolete: true
Comment on attachment 8850246 [details] [diff] [review]
Permit cross-compartment edges in CacheIR stubs,

Added test, carrying over r+.

[Security approval request comment]
Is sec-approval necessary? This is nightly-only.
Attachment #8850246 - Flags: sec-approval?
Attachment #8850246 - Flags: review+
(In reply to Steve Fink [:sfink] [:s:] from comment #19)

> [Security approval request comment]
> Is sec-approval necessary? This is nightly-only.

The flags here says 54 is affected but nightly is 55. Is 54 affected?
(In reply to Al Billings [:abillings] from comment #20)
> The flags here says 54 is affected but nightly is 55. Is 54 affected?

Sorry, you are correct.

> [Security approval request comment]
> How easily could an exploit be constructed based on the patch?

quite difficult

> Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

On the crash, yes. Realizing it could be a security problem is up to the viewer. I'm guessing it is, but it seems very tough to exploit.

> Which older supported branches are affected by this flaw?

54

> If not all supported branches, which bug introduced the flaw? 

bug 1319087

> Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

It should work unmodified, I'd think.

> How likely is this patch to cause regressions; how much testing does it need?

It's keeping an extra thing alive. The biggest potential would be for leaks, or for preventing finalizers from running if they're expecting stuff to be collected earlier. But generally speaking, JS can never be sure that something will be collected. And all of our tests pass.
sec-approval+ for trunk. Please nominate the patch for branch if you think it applies cleanly.
Attachment #8850246 - Flags: sec-approval? → sec-approval+
https://hg.mozilla.org/mozilla-central/rev/83d372332cb5
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Group: javascript-core-security → core-security-release
Comment on attachment 8850246 [details] [diff] [review]
Permit cross-compartment edges in CacheIR stubs,

Approval Request Comment
[Feature/Bug causing the regression]: bug 1319087
[User impact if declined]: crashes, UAF
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: no
[Why is the change risky/not risky?]: it's removing a dangerously unsafe cross-compartment pointer that would require all kinds of funky special GC handling with a completely standard same-compartment pointer that relies on existing machinery
[String changes made/needed]: none
Attachment #8850246 - Flags: approval-mozilla-aurora?
Comment on attachment 8850246 [details] [diff] [review]
Permit cross-compartment edges in CacheIR stubs,

sec-high fix for aurora54
Attachment #8850246 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.