Closed Bug 1041232 Opened 10 years ago Closed 10 years ago

Disabling a GMP plugin while it's in-use in a call leads to a UAF crash.

Categories

(Core :: WebRTC: Audio/Video, defect)

defect
Not set
normal

Tracking

()

VERIFIED FIXED
mozilla34
Tracking Status
firefox32 --- unaffected
firefox33 --- verified
firefox34 --- verified
firefox-esr24 --- unaffected
firefox-esr31 --- unaffected

People

(Reporter: jesup, Assigned: jesup)

References

(Blocks 2 open bugs)

Details

(Keywords: crash, csectype-uaf, sec-critical)

Attachments

(3 files, 4 obsolete files)

mHost gets freed out from under us in WebrtcGmpVideoCodec.cpp.  Likely needs a reference-counted ptr.
Blocks: 1038961
In general using raw pointers as member variables is scary and leads easily to security bugs.
If one needs to use raw pointers, adding comment about lifetime management would be really nice.
> In general using raw pointers as member variables is scary and leads easily to security bugs.

Yeah...  Not my choice; inherited from original author.

So: let's redesign this (as we discussed in #media).

Proposal:
Users keep WeakReferences to the API interface objects (i.e. currently the Proxy).  This means if the plugin crashes, we can safely delete it (as while it's actually in-use we hold a strong ref).

We add a Close() method to the Proxy (user is done with it).  Users call it when done with the codec or in their destructor, triggering cleanup of that codec use (and maybe the entire process if the codec lists become empty).

We add an entry to each CallbackProxy (maybe as a base class all the callback classes inherit from, or just collapse all the callback classes into on) for CodecTerminated(code).  (Not sure we need 'code')  If the plugin crashes or is disabled, before the objects go away the consumer is given a callback to know it's dead.

Comments?
Flags: needinfo?(cpearce)
The *CallbackProxy (or original *Callback) objects not being refcounted always bugged me. I'd prefer the callback objects to be refcounted, but to be neutered when the plugin crashes, i.e. internally have a plugin valid flag or somesuch. That way the *Callback pointer is still valid if there are runnables targeting it later on in the event queue at the point where the CodecTerminated call comes in.
Flags: needinfo?(cpearce)
You could handle the neutering by making the Callback pointer in the other direction a weakref as well, instead of a separate object with a neutering mechanism.

The downside of WeakRefs is that all our implementations seem to be non-threadsafe.  :-(

Alternative is to require implementation of the Terminate() and Close() callbacks, and when called we release the refs to the Proxy (from Terminate() in the consumer) or to the callback (in Close() in the GMP code).  And this means that you can't assume Terminate() stops calls; only object destruction (all refs gone) guarantees that.  So you likely need to (on GMPThread for example):

  mDead = true;
  mCallback->Terminate();  // From GMPThread; note that GMPVideo*coderParent objects aren't threadsafe
  mCallback = nullptr;

and in the other funcs:
Encode(...) // already asserts on GMPThread
{
  if (mDead) {
    return GMPDead;
  }
  ...
}

Similar in the other direction for Close().

If we're careful, we can use WeakRefs by proxying deletions of references to them to GMPThread (like we do with Encode() calls).
Whiteboard: [openh264-uplift]
Attached patch some api proposals (obsolete) — Splinter Review
Comment on attachment 8460000 [details] [diff] [review]
some api proposals

Review of attachment 8460000 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/gmp/GMPCallbackBase.h
@@ +5,5 @@
> +
> +#ifndef GMPCallbackBase_h_
> +#define GMPCallbackBase_h_
> +
> +class GMPCallbackBase {

I would just call this GMPCallbackProxy, and you only need Terminated(), as other non-decoding/encoding related interfaces' proxies will need to extend this too.

::: content/media/gmp/GMPVideoDecoderProxy.h
@@ +24,1 @@
>    virtual nsresult Decode(GMPVideoEncodedFrame* aInputFrame,

we should either expose the refcount on the GMPVideoDecoderParent here, or make the implementation of this a distinct object that points back to the actual GMPVideoDecoderParent. It's probably easier to expose the GMPVideoDecoderParent's refcount here, as then you can don't need to implement GMPVideoDecoderProxy on the intermediary and the GMPVideoDecoderParent to ensure we can still pass nsTArrays directly to the GMPVideoDecoderParent.

::: content/media/gmp/gmp-api/gmp-video-decode.h
@@ +46,3 @@
>  
>  // ALL METHODS MUST BE CALLED ON THE MAIN THREAD
> +class GMPVideoDecoderCallback : public GMPCallbackBase

This is unnecessary. The GMP doesn't need to see Terminated() function added here, it's only relevant in the parent process, on the proxy.
This works - I can disable the plugin in mid-call safely.  There still is likely an issue with the mCallback values - it isn't a WeakPtr (yet); I need to decide if that will work.  Also, the 'VirtualRelease' is hackish but avoids problems with making the returned object from the service a Parent instead of a Proxy.  Alternative might be to make it an XPCOM/ISupports class....
Attachment #8460000 - Attachment is obsolete: true
Attachment #8460802 - Attachment is obsolete: true
Attachment #8460953 - Attachment is obsolete: true
Comment on attachment 8461020 [details] [diff] [review]
WIP patch for GMP API/refcounting fixes

https://tbpl.mozilla.org/?tree=Try&rev=99da5d52e356
Green, though I'll repeat with Ted's GMP plugin testing patch.

Please have barf bag ready when you hit "VirtualRelease()"
Attachment #8461020 - Flags: review?(cpearce)
Comment on attachment 8461020 [details] [diff] [review]
WIP patch for GMP API/refcounting fixes

Review of attachment 8461020 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/gmp/GMPParent.cpp
@@ +260,5 @@
>    if (!pvdp) {
>      return NS_ERROR_FAILURE;
>    }
>    GMPVideoDecoderParent *vdp = static_cast<GMPVideoDecoderParent*>(pvdp);
> +  NS_ADDREF(vdp);

I think we should do the addref inside GMPVideoDecoderParent's constructor; it's an internal detail.

::: content/media/gmp/GMPVideoDecoderProxy.h
@@ +24,5 @@
>  // GMPVideoDecoderParent exposes this to users the GMP.
>  // This enables Gecko to pass nsTArrays to the child GMP and avoid
>  // an extra copy when doing so.
> +
> +// The consumer must keep a reference to this object.  This is not

I would say the consumer needs to call Close(), and then never use its pointer to the proxy after calling it, rather than explictly saying keep a refcount on the object, since the consumer never explictly addrefs it.

@@ +45,5 @@
> +  // Call to tell plugin we're going to be releasing our references, and
> +  // that in any case we're done with this encoder/decoder.
> +  virtual void Close() = 0;
> +  // Release the reference held for us on the parent object
> +  virtual void VirtualRelease() = 0;

Just considering decoding in this instance, but the same applies to encoding, and decryption for that matter...

So with this change, we have three things related to shutdown on this interface:

Close(), VirtualRelease() and DecodingComplete().

I think it's important to have separate DecodingComplete() and "consumer is done with the proxy" functions, since the GMPService/Parent should be able to kill a GMP when we're shutting down, without the consumer's ref to the proxy becoming stale.

But it still seems to me that we only need the proxy to expose one of these to the consumer, one function covers both cases, right? I mean, looking at your WebRTC GMP code, you call Close() and VirtualRelease() right after the other, and you never call DecodingComplete()...

So I propose:
* We keep Close() on the proxy interface, and drop VirtualRelease() and DecodingComplete().
* GMPVideoDecoderParent's ctor adds a refcount, and sets a flag mIsOpen=true;
* GMPVideoDecoderParent::Close() (the proxy override function) sets mIsOpen=false, calls DecodingComplete(), and then calls NS_Release(this).
* GMPVideoDecoderParent::DecodingComplete() only calls the Terminated() callback if mIsOpen==true. This means Terminated() only gets called if Gecko has to kill the plugin, an abnormal shutdown.
* Consumers call Close() when they've finished their decoding, or if they receive a Terminated() callback (to drop the GMPVideoDecoderParent's self ref).
* We also need to call Terminated() and set mIsOpen=false in ActorDestroy().

::: media/webrtc/signaling/src/media-conduit/WebrtcGmpVideoCodec.cpp
@@ +66,5 @@
> +  if (mGMPThread && mGMP) {
> +    mGMPThread->Dispatch(WrapRunnableNM(&Encoder_Close_g, mGMP),
> +                         NS_DISPATCH_NORMAL);
> +    mGMP = nullptr;
> +  }  

Trailing space, and below.
Attachment #8461020 - Flags: review?(cpearce)
Attachment #8461020 - Attachment is obsolete: true
Attachment #8461395 - Flags: review?(cpearce)
Updates per review.  Enable/disable mid-call works still.  however, second calls hangs calling Encode() repeatably.
Comment on attachment 8461795 [details] [diff] [review]
interdiffs for deferred process shutdown to avoid Shmem lockup (to be resolved)

r? for review on the updates
THere's an odd problem where if we shutdown the Process, but don't destroy the GMPParent, and then start up a call again (start a new process for the Parent), it hangs when the Child does a NeedsShmem request - it's sent, but never received.  Workaround is to not close/shutdown the plugin container except when the Parent is destroyed.  State in the MessageChannel I think
Attachment #8461795 - Flags: review?(cpearce)
Comment on attachment 8461395 [details] [diff] [review]
WIP patch for GMP API/refcounting fixes

Review of attachment 8461395 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/media/gmp/GMPVideoDecoderParent.h
@@ +28,5 @@
>  
>    GMPVideoDecoderParent(GMPParent *aPlugin);
>  
>    GMPVideoHostImpl& Host();
> +  virtual nsresult Shutdown();

This doesn't need to be virtual, right?

@@ +33,3 @@
>  
>    // GMPVideoDecoder
> +  virtual void Close();

MOZ_OVERRIDE

::: content/media/gmp/GMPVideoEncoderParent.h
@@ +28,5 @@
>  
>    GMPVideoEncoderParent(GMPParent *aPlugin);
>  
>    GMPVideoHostImpl& Host();
> +  virtual void Shutdown();

This doesn't need to be virtual, right?

@@ +33,3 @@
>  
>    // GMPVideoEncoderProxy
> +  virtual void Close();

MOZ_OVERRIDE

::: content/media/gmp/gmp-api/gmp-video-decode.h
@@ +33,5 @@
>  
>  #ifndef GMP_VIDEO_DECODE_h_
>  #define GMP_VIDEO_DECODE_h_
>  
> +#include <stdint.h>

Do you need to change the GMP-API again here? ;)

::: media/webrtc/signaling/src/media-conduit/WebrtcGmpVideoCodec.cpp
@@ +69,5 @@
> +    mozilla::SyncRunnable::DispatchToThread(mGMPThread,
> +                                            WrapRunnableNM(&Encoder_Close_g, mGMP));
> +    mGMP = nullptr;
> +  }  
> +}

Nit: trailing space, here and below.

::: media/webrtc/signaling/src/media-conduit/WebrtcGmpVideoCodec.h
@@ +47,5 @@
>    {
>      return mGMP ? mGMP->ParentID() : 0;
>    }
>  
> +  virtual void Terminated();

MOZ_OVERRIDE (and on other overriding functions here too?)
Attachment #8461395 - Flags: review?(cpearce) → review+
(In reply to Randell Jesup [:jesup] from comment #16)
> Comment on attachment 8461795 [details] [diff] [review]
> interdiffs for deferred process shutdown to avoid Shmem lockup (to be
> resolved)
> 
> r? for review on the updates
> THere's an odd problem where if we shutdown the Process, but don't destroy
> the GMPParent, and then start up a call again (start a new process for the
> Parent), it hangs when the Child does a NeedsShmem request - it's sent, but
> never received.  Workaround is to not close/shutdown the plugin container
> except when the Parent is destroyed.  State in the MessageChannel I think

I am trying to understand what's going on here... Would this effectively leak plugin-container.exe subprocesses for the entire lifetime of Firefox.exe?
Flags: needinfo?(rjesup)
Comment on attachment 8461795 [details] [diff] [review]
interdiffs for deferred process shutdown to avoid Shmem lockup (to be resolved)

Review of attachment 8461795 [details] [diff] [review]:
-----------------------------------------------------------------

The shutdown stuff here is really confusing, needs cleaning up after we fix the underlying issue.

::: content/media/gmp/GMPParent.cpp
@@ +36,5 @@
>  }
>  
>  GMPParent::~GMPParent()
>  {
> +  // Can't Close or Destroy the process here, since destruction is MainThread only

MOZ_ASSERT(NS_IsMainThread());

(I'd say assert mState==mStateNotLoaded too, but I don't think it's threadsafe to do that on anything other than the GMP thread.)

@@ +74,5 @@
>  
> +  if (!mProcess) {
> +    mProcess = new GMPProcessParent(path.get());
> +    if (!mProcess->Launch(30 * 1000)) {
> +     mProcess->Delete();

nit: indentation is off by 1 here.

@@ +143,5 @@
>      return;
>    }
>  
> +  // XXX Get rid of aPendingClose and do this on all Shutdowns once
> +  // Bug XXXXX is fixed

Please file this and update comment. :)

::: content/media/gmp/GMPParent.h
@@ +131,5 @@
>    nsCString mDescription; // description of plugin for display to users
>    nsCString mVersion;
>    nsTArray<nsAutoPtr<GMPCapability>> mCapabilities;
>    GMPProcessParent* mProcess;
> +  bool mPendingClose;

I think you should call this mDeleteProcessOnUnload, so it's clearer what it's used for.
Attachment #8461795 - Flags: review?(cpearce) → review+
(In reply to Chris Pearce (:cpearce) from comment #18)
> I am trying to understand what's going on here... Would this effectively
> leak plugin-container.exe subprocesses for the entire lifetime of
> Firefox.exe?

Yes.  There appears to be some issue with MessageChannel state - Close(), and later re-Open(). If the client calls NeedsShmem (protocol 'ipc' with a return value), it never seems to be received by the Parent, and you end up stuck.  If you destroy the GMPParent and recreate it, things work (Disable plugin, then re-enable).

I consider this temporary until we resolve the underlying issue.  This keeps one OpenH264 process (after first use) until disable, or shutdown.
Blocks: 1043671
Summary: Disabling a GMP plugin while it's in-use in a call leaks to a UAF crash. → Disabling a GMP plugin while it's in-use in a call leads to a UAF crash.
Comment on attachment 8461395 [details] [diff] [review]
WIP patch for GMP API/refcounting fixes

Note: patch to land is what landed on inbound/central.

Approval Request Comment
[Feature/regressing bug #]: original GMP code

[User impact if declined]: UAF sec bug

[Describe test coverage new/current, TBPL]: just basic GMP usage; test case not easy to test yet in automation (plugin disable in mid-call).  Crashreporter tests will hit this (kill plugin in-call).  Extensively tested by hand.

[Risks and why]: low - This majorly cleans up the ownership model, which was totally busted.  Primary risk would be a fubar in the refcounting, but as most things are refcounted or dependent on a refcounted object, it should be pretty safe.  Only a failure to null out a callback ptr associated with a reference could cause a problem.

[String/UUID change made/needed]: none
Attachment #8461395 - Flags: approval-mozilla-aurora?
Attachment #8461395 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Group: core-security
Nils, have you been able to verify that this has been fixed for Fx33 release? Thanks.
Flags: needinfo?(drno)
Verified with 34.0a2 (2014-10-09) on Mac OSX that switching the OpenH264 Plugin to disabled while in a video call with H264, the video stream stops as expected and FF does NOT crash.
Flags: needinfo?(drno)
Verified on 33.0 (20141007073543) on Mac OSX that in a H264 video call switching the OpenH264 plugin to disabled stops the video rendering and does NOT crash the browser.
Marking as verified fixed based on comment 30 and comment 31.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: