Closed Bug 1073345 Opened 5 years ago Closed 5 years ago

GMP: crash [@mozilla::gmp::GMPPlaneImpl::Copy]

Categories

(Core :: Audio/Video, defect, critical)

x86_64
macOS
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla35
Tracking Status
firefox32 --- unaffected
firefox33 + fixed
firefox34 + fixed
firefox35 + fixed
firefox-esr31 --- unaffected

People

(Reporter: posidron, Assigned: jesup)

References

(Blocks 2 open bugs)

Details

(Keywords: crash, sec-critical, testcase)

Attachments

(3 files)

This happened while opening https://gist.github.com/anonymous/a709930b5d604512cce8 in a build with '--enable-ipc-fuzzer' enabled.

Run Faulty this way to reproduce the issue:

export MOZ_IPC_MESSAGE_LOG=0
export FAULTY_ENABLE_LOGGING=1
export FAULTY_PROBABILITY=3500
export FAULTY_PICKLE=1
export FAULTY_PARENT=0
export FAULTY_CHILDREN=1
$HOME/dev/repos/mozilla/mozilla-ipc/obj-ff64-asan-opt/dist/NightlyDebug.app/Contents/MacOS/firefox -P fuzzing -no-remote $1 2>&1 | tee ~/Desktop/faulty-session.txt
per comment in another bug, we need to know how to get the code for --enable-ipc-fuzzer
Flags: needinfo?(cdiehl)
You need to apply v9 of the patch from here https://bugzilla.mozilla.org/show_bug.cgi?id=777067
Flags: needinfo?(cdiehl)
I caught this one with pain (had to add a 10 second sleep after initing the plugin so I could catch it in gdb).

It appears faulty caused a NeedsShmem() to return a shared memory buffer smaller than was requested, so when we copied to it we took an ASAN hit (in the child).  This isn't surprising (and not particularily worrying, since it can only crash/corrupt the child processes, and then only if something is badly wrong or corrupted in master).

I've added a MOZ_ASSERT(); I can make it MOZ_CRASH() in this case if we want.
Re-rating after IRC conversation with Randell.
(In reply to Randell Jesup [:jesup] from comment #3)
> This isn't surprising (and not particularily worrying, since it
> can only crash/corrupt the child processes

The attached log seems to suggest a parent crash; am I misreading it?  The stack trace in the log ends in "firefox" (not "plugin-container"), and the pid on the "SEGV on unknown address" line seems to be the parent pid, judging by other lines mentioning it.
That's correct. Messages from the child process got fuzzed which lead to a crash in the parent process. MOZ_CRASH would not be not ideal for upcoming fuzzing runs. Here are my preferences for the 'fuzzing' profile for Firefox: https://gist.github.com/posidron/a74118edcfed1910e141 it makes launching Firefox easier while reproducing issues.
Ok, back to sec-crit, though I took an identical hit (at least location, not stack) in the child process.
Ok.  last modification was 
[Faulty] pickle field {int} of value: 11 changed to: 12
which was almost certainly an Shmem id

So child wanted to transfer 11 and parent received 12 (and who knows who owns 12)

And after pulling data out of 11, it goes in the pool.  And later we try to use it. and maybe it even cycles back and forth a few times (in addition to the "real" copy).  And maybe it even gets deleted later (or rather one reference to it) but still keeps being used.

The IPC code handles which side can write to it; when we send it to the Child we call RevokeRights() on it.  So my guess is that we permuted the Id, it got (re?) adopted by the parent and cached, then used again but by this time the "real" Shmem 12 had been transferred to the child, and so we'd RevokeRights on it, and so boom.

This would not be a sec issue beyond basic DoS.  If we recorded in the IPC code who owned the Shmem, then we could check incoming ones were owned by the other side, and raise a deserialization error
FYI, I caught one of these locally, where it tried to memcpy into an Shmem which was permuted
Ok, since I added a hack to avoid duplicate Shmems in the cache I haven't seen a parent crash.  I need to make more runs (it dies in layers constantly).

However, this all makes sense:
* faulty munges Shmem ID to one the parent already has in the Shmem cache
* we use it (maybe crash if the replacement one was smaller than it needed to be, I'll check those - because of how we copy data out, if it's too small we'll just copy the data that's there)
* We Dealloc it to the Shmem cache (we now have two identical entries)
* We alloc, use and send to the child one entry
* We alloc, try to use and go boom on the second entry.

My only concern is that if we're copying from a "wrong" Shmem, and we allocate a larger correct size, we'll then copy copy the smaller amount into it.  The danger is that the allocation might have "garbage" in it that would then passed out as valid data.  Note that this is a poor leak, since it won't look like encoded video data (NALs) for RecvEncoded(), and for RecvDecoded() it will be passed out as a video frame.  The only way that could be a problem would be by the application dumping the data to a canvas and then shipping it out of the machine.  Pretty involved; REALLY hard to pull off for any useful purpose.

The "right" solution is to modify IPC Shmem code or IPC generation code to include some type of "who owns" bool (it already keeps a map of valid IDs, but it doesn't know which side owns them).  Then higher-level code doesn't need to worry about this.

I could (for 33) land the hack to disallow duplicate buffers.  It doesn't block the unlikely-leakage scenario, unless (for 33) I upgrade the hack to MOZ_CRASH(), which allows a DoS attack (likely doable anyways), but blocks any leakage.

I'd also re-downgrade this one now that it's understood.  The worst result today is DoS and maybe possible memory contents leakage (and that assumes attacker in control of the child and JS code in parent).
Flags: needinfo?(bent.mozilla)
Alternate version that could be used to wall off any possible info leaks
Attachment #8498363 - Attachment is obsolete: true
Attachment #8498363 - Attachment description: Hack to block duplicates from the GMP SharedMemory cache → Debug-only check for Shmem duplicates in the cache
Attachment #8498363 - Attachment is obsolete: false
Attachment #8498390 - Attachment description: Hack to block duplicates from the GMP SharedMemory cache → Block duplicates from the GMP SharedMemory cache by crashing
Attachment #8498390 - Flags: review?(cpearce)
Comment on attachment 8498390 [details] [diff] [review]
Block duplicates from the GMP SharedMemory cache by crashing

Review of attachment 8498390 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/gmp/GMPSharedMemManager.cpp
@@ +55,5 @@
>  
>    size_t size = aMem.Size<uint8_t>();
>    size_t total = 0;
> +
> +  // XXX Bug NNNNNNN Until we put better guards on ipc::shmem, verify we

Bug 1073345? ;)

@@ +62,5 @@
> +    if (NS_WARN_IF(aMem == GetGmpFreelist(aClass)[i])) {
> +      // Safest to crash in this case; should never happen in normal
> +      // operation.
> +      MOZ_CRASH("Deallocating Shmem we already have in our cache!");
> +      //return true;

Probably don't need the "//return true;" since MOZ_CRASH is always fatal.
Attachment #8498390 - Flags: review?(cpearce) → review+
Comment on attachment 8498390 [details] [diff] [review]
Block duplicates from the GMP SharedMemory cache by crashing

[Security approval request comment]
How easily could an exploit be constructed based on the patch?  
tough - they'd know there was something there, but trying to get it to be useful would be tough (and you need to break the plugin first as well)

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
A bit, though it isn't clear what getting it wrong allows.  We need this on beta asap anyways before Openh264 goes to release.  We could degrade the comments, but if it's going to aurora/beta ASAP it won't matter.

Which older supported branches are affected by this flaw?  Aurora and Beta

If not all supported branches, which bug introduced the flaw? GMP support

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?  Should apply trivially

How likely is this patch to cause regressions; how much testing does it need? Very unlikely to regress; has been tested locally.
Attachment #8498390 - Flags: sec-approval?
Comment on attachment 8498390 [details] [diff] [review]
Block duplicates from the GMP SharedMemory cache by crashing

sec-approval+. We'll want Aurora and Beta patches made and nominated ASAP. As soon as this goes green on Trunk, we'll want this in the branches.

Does this need to go into ESR31?
Attachment #8498390 - Flags: sec-approval? → sec-approval+
Attachment #8498390 - Flags: approval-mozilla-beta?
Attachment #8498390 - Flags: approval-mozilla-aurora?
Attachment #8498390 - Flags: approval-mozilla-beta?
Attachment #8498390 - Flags: approval-mozilla-beta+
Attachment #8498390 - Flags: approval-mozilla-aurora?
Attachment #8498390 - Flags: approval-mozilla-aurora+
Assignee: nobody → rjesup
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Blocks: 1078105
Hi Christoph, ideally we'd like to run your fuzzer to verify that this bug has been fixed. However, based on the late point in the release cycle and our queue of bugs to verify, it does not appear that we'll be able to get to it before release. 

Can you confirm in Fx33 that this has been fixed? Thank you.
Flags: qe-verify-
Flags: needinfo?(cdiehl)
Yes, this seems fixed. Let Faulty run for an hour now and did not see this crash anymore.
Flags: needinfo?(cdiehl)
Status: RESOLVED → VERIFIED
Flags: needinfo?(bent.mozilla)
Group: core-security
You need to log in before you can comment on or make changes to this bug.