Closed Bug 1120655 Opened 5 years ago Closed 5 years ago

Make the analysis detect compartment iterator invalidation

Categories

(Core :: JavaScript: GC, defect)

x86_64
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox36 --- wontfix
firefox37 --- wontfix
firefox38 --- fixed
firefox39 --- fixed
firefox-esr31 38+ fixed
firefox-esr38 --- fixed
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed
b2g-v2.0M --- fixed
b2g-v2.1 --- fixed
b2g-v2.1S --- fixed
b2g-v2.2 --- fixed
b2g-master --- fixed

People

(Reporter: sfink, Assigned: sfink)

References

(Blocks 1 open bug)

Details

(Keywords: csectype-bounds, sec-high, Whiteboard: [asan][adv-main38+][adv-esr31.7+][post-critsmash-triage])

Attachments

(7 files, 5 obsolete files)

10.38 KB, patch
terrence
: review+
sfink
: checkin+
Details | Diff | Splinter Review
7.15 KB, patch
terrence
: review+
sfink
: checkin+
Details | Diff | Splinter Review
14.89 KB, patch
sfink
: review+
abillings
: sec-approval+
Details | Diff | Splinter Review
17.45 KB, patch
Details | Diff | Splinter Review
16.36 KB, patch
Details | Diff | Splinter Review
18.17 KB, patch
Details | Diff | Splinter Review
16.28 KB, patch
Details | Diff | Splinter Review
Bug 1119579 found a case where we are iterating over the list of compartments while potentially modifying it. The hazard analysis can be modified to detect this case, since the main ways the list could be modified are (1) a garbage collection collecting one or more of the compartments and (2) running JS script code that creates new compartments. Running script can always GC, so the same mechanism can be used for both: label CompartmentsIter a GCPointer so the analysis will complain if you can GC while it is alive.
The analysis change is straightforward, though it exposed a legitimate hazard.
Attachment #8547789 - Flags: review?(terrence)
Assignee: nobody → sphink
Status: NEW → ASSIGNED
Comment on attachment 8547789 [details] [diff] [review]
Label various compartment iters as GCPointers

Review of attachment 8547789 [details] [diff] [review]:
-----------------------------------------------------------------

Neat!
Attachment #8547789 - Flags: review?(terrence) → review+
JS_IterateCompartments takes a general callback, and thus is inherently unsafe.

There is a single caller in the tree, which goes over the principals and grabs out their script source strings. Unfortunately, it appears that this can not only GC, but it *might* be possible for it to run script got since it goes through nsIURI.

One way to fix this is to not worry about the exact behavior of the iterator, and just make sure it never accesses anything out of bounds. If you call script or GC during iteration, you may or may not get a callback for all live compartments. But you should never see any dead ones, nor should you access anything off the end of the list.

I'm really not sure whether this is better or worse than suppressing GC during iteration. My concern with suppressing GC is that I think the iteration may happen for about:memory, which is commonly loaded when memory is low.
Attachment #8547795 - Flags: review?(terrence)
Comment on attachment 8547795 [details] [diff] [review]
Fix out of bounds error in JS_iterateCompartments

Review of attachment 8547795 [details] [diff] [review]:
-----------------------------------------------------------------

Not showing all compartments in about:memory would also be pretty bad. This seems less bad at least.

::: js/src/gc/Zone.h
@@ +386,5 @@
>  
>      bool done() const {
>          MOZ_ASSERT(it);
> +        return it < zone->compartments.begin() ||
> +               it >= zone->compartments.begin();

I'm guessing you meant it >= *end* and not it >= begin here.
Attachment #8547795 - Flags: review?(terrence) → review+
The next hazard reported is:

Function '_ZN2js11WeakMapBase16traceAllMappingsEPNS_13WeakMapTracerE|void js::WeakMapBase::traceAllMappings(js::WeakMapTracer*)' has unrooted 'c' of type 'class js::CompartmentsIterT<js::ZonesIter>' live across GC call js::WeakMapBase.traceMappings at js/src/jsweakmap.cpp:135

The problem here is js::TraceWeakMaps. It is called by https://dxr.mozilla.org/mozilla-central/source/xpcom/base/CycleCollectedJSRuntime.cpp#245 and another place in the same file, and it ends up calling the js::WeakMapTracer.callback function pointer.

One lesson: callback-based APIs make things tough for the static analysis.

So I believe this is a false positive. But determining that is not straightforward. This is called by nsCycleCollector::FixGrayBits, which in fact *does* GC, but only after calling mJSRuntime->FixWeakMappingGrayBits(). So it's not like I can just say "the cycle collector won't GC while it's figuring out the gray bits."

I guess the trick would be to prove that the WeakMapTracer.callback won't GC. The easiest way to do that would be to make the API use a virtual function.
(In reply to Terrence Cole [:terrence] from comment #4)
> > +        return it < zone->compartments.begin() ||
> > +               it >= zone->compartments.begin();
> 
> I'm guessing you meant it >= *end* and not it >= begin here.

Nah, the indentation lined up better with them both using begin(), so I think I'll keep it that way.

:-) Thanks for the catch
If you can't get to fixing the analysis soon, sfink, maybe you could spin the bounds check off into a separate bug?
Group: javascript-core-security
Sorry, this dropped off my radar. Seems like I'll need 3 bugs -- one for the reviewed fix here that can land anytime (but needs sec-approval), one for the traceMappings thing that I don't have a fix for yet, and one for the analysis fix that can't land until the other two are in.
Depends on: 1137326
Attachment #8547795 - Attachment is obsolete: true
Comment on attachment 8569986 [details] [diff] [review]
Fix out of bounds error in JS_iterateCompartments,

Doh! bzexport insisted on reusing this bug even though I changed the number in the patch. Note to self: change the mq patch name too.

I'll just obsolete this one, since I'm moving it to bug 1137326 anyway.
Attachment #8569986 - Attachment is obsolete: true
Depends on: 1137336
Is this ready to land?
(In reply to Andrew McCreight [:mccr8] from comment #11)
> Is this ready to land?

Yes, as of yesterday with https://hg.mozilla.org/mozilla-central/rev/7d8224f00aec (from bug 1137336). That was the last of the fixes to land.

I marked lots of things as affected, due to bug 1137326, but that doesn't mean we need to land this everywhere. The only thing we really need to avoid is having this patch land on a branch that one of the actual fixes (in dependent bugs) has not landed on. Because that would make the hazard analysis fail on treeherder and point directly to a problem. Having this patch landed on a given branch just means that we could detect future landings that add hazards of a sort it can detect, which is pretty unlikely in the first place for any branch other than mozilla-central.
Blocks: 1143786
Is this going to make the current 39 (trunk) cycle (and then maybe branches)? I know it bounced two weeks ago on checkin but I'm not sure where it is in the queue to be addressed. Since it is a sec-high rated issue, I want to follow up on it.
(In reply to Al Billings [:abillings] from comment #14)
> Is this going to make the current 39 (trunk) cycle (and then maybe
> branches)? I know it bounced two weeks ago on checkin but I'm not sure where
> it is in the queue to be addressed. Since it is a sec-high rated issue, I
> want to follow up on it.

Thanks for checking. When I expanded this check to include zone invalidation, I uncovered a few more problems, and decided that it would be better to make this pattern safe rather than guard against it with the analysis (since I couldn't come up with a clean way of fixing some of the revealed hazards other than by suppressing GC entirely, which could lead to OOM crashes.)

So I will not be landing the reviewed patch after all. Instead, I'd like to do something different here. I will be uploading that patch shortly.
Flags: needinfo?(sphink)
First, a purely cosmetic patch because it confused me when tracing through this stuff. There's a 'lastGC' boolean floating around. I wasn't sure if it meant "previous GC", "last slice in an incremental GC", or what. It turns out that it means it's the final GC when destroying the whole JSRuntime. So I renamed it to 'destroyingRuntime'.
Attachment #8579436 - Flags: review?(terrence)
The real patch. This tracks whether we are currently iterating over zones or compartments, and prevents them from being swept if so. This cleans up a whole pile of hazards (I think there were 11, though most were false positives.)
Attachment #8579438 - Flags: review?(terrence)
Comment on attachment 8579436 [details] [diff] [review]
Rename lastGC -> destroyingRuntime

Review of attachment 8579436 [details] [diff] [review]:
-----------------------------------------------------------------

That reads much better.
Attachment #8579436 - Flags: review?(terrence) → review+
Comment on attachment 8579438 [details] [diff] [review]
Suppress zone/compartment collection while iterating

Review of attachment 8579438 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/src/gc/GCRuntime.h
@@ +1006,5 @@
>      int64_t lastGCTime;
>  
>      JSGCMode mode;
>  
> +    mozilla::Atomic<size_t, mozilla::ReleaseAcquire> activeIters;

Could we call it |activeZoneIters| so that it is clearer in sweepZones what sort of iterations cause us to bail?
Attachment #8579438 - Flags: review?(terrence) → review+
Attachment #8547789 - Attachment is obsolete: true
[Security approval request comment]
How easily could an exploit be constructed based on the patch?

The patch is about fixing a general class of problems, so it doesn't point to any specific exploit itself. But it does give a good direction to look for exploits. On the other hand, iterator invalidation is already pretty well known as a ripe area to find boundary overruns in, and this particular set of iterators is harder than most to exploit.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?

Not really.

Which older supported branches are affected by this flaw?

The specific problems are varied, but they date way back. esr31 definitely has *something* related to this in it.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?

Sadly, I expect there will be some minor syntactical differences that will make this a little different for the different branches. They'll be doing exactly the same thing, though, and should be easy to create.

How likely is this patch to cause regressions; how much testing does it need?

At least in theory, the only thing that this patch can introduce is OOM during iteration, but the amount of memory we're talking about here is tiny. And the relevant iterations are rare. So I'd say fairly low risk.
Attachment #8580354 - Flags: sec-approval?
Comment on attachment 8580354 [details] [diff] [review]
Rollup patch for approvals and backports

Carrying over r=terrence
Attachment #8580354 - Flags: review+
Comment on attachment 8580354 [details] [diff] [review]
Rollup patch for approvals and backports

sec-approval+.

How hard would this be to get on at least Aurora (yes, I saw the comment in the approval request)?
Attachment #8580354 - Flags: sec-approval? → sec-approval+
Attachment #8579436 - Flags: checkin?
Attachment #8579438 - Flags: checkin?
(In reply to Al Billings [:abillings] from comment #22)
> Comment on attachment 8580354 [details] [diff] [review]
> Rollup patch for approvals and backports
> 
> sec-approval+.
> 
> How hard would this be to get on at least Aurora (yes, I saw the comment in
> the approval request)?

I'll need to wait for the tree to open so I can get it into inbound first, but I'll upload a version for aurora now (as expected, it wasn't an automatic backport, but it was easy).
Attachment #8580468 - Flags: approval-mozilla-aurora+
Attached patch Backport to betaSplinter Review
Attachment #8579436 - Flags: checkin? → checkin+
Attachment #8579438 - Flags: checkin? → checkin+
Comment on attachment 8580354 [details] [diff] [review]
Rollup patch for approvals and backports

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: potentially exploitable iterator invalidation bugs. Though in reviewing a list of the zone-specific hazards in https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/try-builds/sfink@mozilla.com-fa0c0df17ee3/try-linux64-br-haz/hazards.txt.gz , they all look hard to trigger. The compartment iterator ones were worse, but I've already backported spot fixes to the worst of those.

Fix Landed on Version: 39

Risk to taking this patch (and alternatives if risky): very slight chance of increased OOMs

String or UUID changes made by this patch: none

See https://wiki.mozilla.org/Release_Management/ESR_Landing_Process for more info.

Approval Request Comment
[Feature/regressing bug #]: This has been a problem for a very long time. Taking a sample problem, the code in the JS Debugger function addAllGlobalsAsDebuggees can be made to iterate out of bounds at least as far back as Fx28, since that code was changed in bug 933882. But the previous code had pretty much the same issue.

[Describe test coverage new/current, TreeHerder]: try push, and it is now on inbound (though it's too early to see if it will bounce.)

[Risks and why]: very slight chance to increase OOMs, though these iterations are rare outside of a GC (and inside of a GC, they can't cause problems.) A logic bug would be more likely, so that the GC does something wrong if it happens during an iteration, but the approach taken seems safe.

[String/UUID change made/needed]: none

Note that the aurora version of this patch is already approved for aurora. I'm not going to land until I see this happy on inbound, though. The backporting sheriff fairy is free to land the appropriate versions.
Attachment #8580354 - Flags: approval-mozilla-esr31?
Attachment #8580354 - Flags: approval-mozilla-beta?
Well, it required a trivial bustage fix for introducing a single-arg constructor, anyway:
https://hg.mozilla.org/integration/mozilla-inbound/rev/25c0b4f618fb
abillings and I agree that as we're at the end of the 37 cycle and don't need to take this fix in 37 we're going to defer to 38.
Comment on attachment 8580354 [details] [diff] [review]
Rollup patch for approvals and backports

ESR approval will be granted once we ship ESR 31.6.0.
Attachment #8580354 - Flags: approval-mozilla-beta? → approval-mozilla-beta-
cc-ing bkerensa. He couldn't see this bug and needs to since he's doing release management for ESR.
Lawrence wanted me to check per comment 33 I was unable to see these bugs and will need my privleges adjusted but do we want to respin on these?
Flags: needinfo?(abillings)
This is tracking for 38 (see tracking flag). See comment 30 and comment 31 as well.
Flags: needinfo?(abillings)
Group: javascript-core-security
FWIW, the beta patch doesn't apply to b2g37 at all. This'll need more rebasing for the remaining backports.
Flags: needinfo?(sphink)
Attachment #8580354 - Flags: approval-mozilla-esr31? → approval-mozilla-esr31+
Here's a rebased backport to b2g37. Let me know what other backports I should do.
Flags: needinfo?(sphink)
Attached patch Backport to esr31 (obsolete) — Splinter Review
This backport was not so clean. Hopefully it will work for similar branches.
Attached patch Backport to esr31 (obsolete) — Splinter Review
Hopefully fixed the Windows build breakage:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=93a3615ffa2d
Attachment #8590414 - Attachment is obsolete: true
Moar fixes. (Pull in RyanVM's fix.)
Attachment #8590545 - Attachment is obsolete: true
Is there a need for manually testing this fix? If yes, could you please provide some detailed steps on how to do it?
Flags: needinfo?(sphink)
(In reply to Florin Mezei, QA (:FlorinMezei) from comment #47)
> Is there a need for manually testing this fix? If yes, could you please
> provide some detailed steps on how to do it?

No.
Flags: needinfo?(sphink)
Whiteboard: [asan][adv-main38+][adv-esr31.7+]
Group: core-security → core-security-release
Whiteboard: [asan][adv-main38+][adv-esr31.7+] → [asan][adv-main38+][adv-esr31.7+][post-critsmash-triage]
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.