Closed Bug 1288618 Opened 8 years ago Closed 8 years ago

Do hardware accelerated video decoding in the GPU process

Categories

(Core :: Audio/Video: Playback, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla52
Tracking Status
firefox50 --- affected
firefox52 --- fixed

People

(Reporter: mattwoodrow, Assigned: mattwoodrow)

References

(Depends on 3 open bugs)

Details

Attachments

(19 files, 2 obsolete files)

58 bytes, text/x-review-board-request
dvander
: review+
Details
58 bytes, text/x-review-board-request
nical
: review+
Details
58 bytes, text/x-review-board-request
cpearce
: review+
Details
58 bytes, text/x-review-board-request
cpearce
: review+
jya
: review+
Details
58 bytes, text/x-review-board-request
dvander
: review+
Details
58 bytes, text/x-review-board-request
bholley
: review+
Details
58 bytes, text/x-review-board-request
dvander
: review+
Details
58 bytes, text/x-review-board-request
nical
: review+
Details
58 bytes, text/x-review-board-request
nical
: review+
Details
58 bytes, text/x-review-board-request
nical
: review+
Details
58 bytes, text/x-review-board-request
cpearce
: review+
Details
58 bytes, text/x-review-board-request
dvander
: review+
Details
58 bytes, text/x-review-board-request
dvander
: review+
Details
58 bytes, text/x-review-board-request
dvander
: review+
cpearce
: review+
Details
58 bytes, text/x-review-board-request
cpearce
: review+
dvander
: review+
Details
1.19 KB, patch
Details | Diff | Splinter Review
725 bytes, patch
Details | Diff | Splinter Review
738 bytes, patch
Details | Diff | Splinter Review
1.15 KB, patch
n.nethercote
: review+
Details | Diff | Splinter Review
We've had a lot of crashes with hardware video decoding, so we want to move this into the GPU process (when available) so that these crashes don't end the browser/tab process.
Depends on: e10s-gpu
Attached patch Rough outline of the infrastructure (obsolete) — — Splinter Review
Blocks: 1297311
Attachment #8773621 - Attachment is obsolete: true
David's review should be fine here unless you wanted me to look at it specifically.
(In reply to Bill McCloskey (:billm) from comment #18)
> David's review should be fine here unless you wanted me to look at it
> specifically.

That's fine, I just thought you might want to see big changes like this.
Comment on attachment 8788039 [details]
Bug 1288618 - Part 2: Allow allocating D3D9/11 Images when we don't have a recycling allocator available.

https://reviewboard.mozilla.org/r/76590/#review74788
Attachment #8788039 - Flags: review?(nical.bugzilla) → review+
Matt, could you write up an overview of how vidoe in the gpu process works?
(In reply to Nicolas Silva [:nical] from comment #21)
> Matt, could you write up an overview of how vidoe in the gpu process works?

Sure!

Basically it takes the MediaDataDecoder/MediaDataDecoderCallback interfaces, and wraps an IPDL protocol (PVideoDecoder) around them, so that we can use them in a different process. RemoteVideoDecoder is a MediaDataDecoder implementation that forwards all calls across PVideoDecoder to another MediaDataDecoder instance (always WMFMediaDataDecoder for now).

We also have the PVideoDecoderManager top-level protocol which manages IPDL threads for this to run on, and handles allocation/destruction of PVideoDecoders.

The output of a MediaDataDecoder is a layers::Image, which we would normally use to create a TextureClient and then forward to the compositor using ImageBridge. Since PVideoDecoderManager is a different top-level protocol to PImageBridge we can't use TextureClient, as IPDL actors can't move between protocols.

We instead keep the layers::Image stored in the GPU process, and pass back a SurfaceDescriptor that represents it. We need to manually deallocate these as we don't get automatic lifetimes without an IPDL actor. To do this, we make sure we always create a TextureData object for every SurfaceDescriptor that is returned, and use the Deallocate/Forget methods on that to deallocate. The existing infrastructure around TextureData/TextureClient makes sure that we don't issue the deallocate until the Compositor has created TextureHosts (if necessary) and prevents races from occuring (everything runs on different threads).
Also, deallocation is a function on the manager protocol so that if a video decoder is shut down before all decoded frames have been presented, then we don't lose those frames.

When the manager itself is shutdown (child process exiting), then we drop all remaining allocated frames from the compositor side. This makes sure we avoid leaking any, without making shutdown complicated by requiring that ImageBridge completes shutdown entirely and deallocates all SurfaceDescriptors before we begin shutting down the VideoDecoderManager.
Depends on: 1300675
Depends on: 1300676
Depends on: 1300677
Depends on: 1300678
Depends on: 1300679
Depends on: 1300681
Depends on: 1300682
Depends on: 1300684
Comment on attachment 8788044 [details]
Bug 1288618 - Part 7: Initialize AbstractThread in the GPU process.

https://reviewboard.mozilla.org/r/76600/#review75074

::: xpcom/threads/AbstractThread.h:69
(Diff revision 1)
> +  virtual void TailDispatchTasksFor(AbstractThread* aThread);
> +
> +  virtual bool HasTailTasksFor(AbstractThread* aThread);

Comment these, especially the motivation for using them rather than accessing the tail dispatcher directly.

::: xpcom/threads/AbstractThread.cpp:98
(Diff revision 1)
> +  virtual void TailDispatchTasksFor(AbstractThread* aThread) override
> +  {
> +    if (!mTailDispatcher.isSome()) {
> +      return;
> +    }
> +    return TailDispatcher().DispatchTasksFor(aThread);
> +  }
> +
> +  virtual bool HasTailTasksFor(AbstractThread* aThread) override
> +  {
> +    if (!mTailDispatcher.isSome()) {
> +      return false;
> +    }
> +    return TailDispatcher().HasTasksFor(aThread);
> +  }
> +
>    virtual nsIThread* AsXPCOMThread() override { return mTarget; }
>  
>  private:
>    RefPtr<nsIThread> mTarget;
>    Maybe<AutoTaskDispatcher> mTailDispatcher;
>  };
>  
> +void
> +AbstractThread::TailDispatchTasksFor(AbstractThread* aThread)
> +{
> +  TailDispatcher().DispatchTasksFor(aThread);
> +}
> +
> +bool
> +AbstractThread::HasTailTasksFor(AbstractThread* aThread)
> +{
> +  return TailDispatcher().HasTasksFor(aThread);
> +}
> +

Seems like this could be implemented a bit more elegantly by having a virtual bool HasTailTasks() method, and then implementing these two new methods as non-virtual on the superclass.

I'd prefer that I think, unless you find some good reason not to.
Attachment #8788044 - Flags: review?(bobbyholley) → review+
Comment on attachment 8788043 [details]
Bug 1288618 - Part 6: Avoid unnecessarily allocating a TailDispatcher for XPCOMThreadWrapper.

https://reviewboard.mozilla.org/r/76598/#review74828

So, in bug 1300118 bkelly has a patch to make TaskQueue operate on an nsIEventTarget instead of a SharedThreadPool. That approach seems much nicer compared to internal overridable dispatch methods. Could you make the chromium loop implement nsIEventTarget or use a simple wrapper class to make it do so instead?

::: xpcom/threads/TaskQueue.h:106
(Diff revision 1)
> +    RefPtr<Runnable> run = aRunner;
> +    return mPool->Dispatch(run.forget(), NS_DISPATCH_NORMAL);

You should be able to just Move(aRunner).
Attachment #8788043 - Flags: review?(bobbyholley) → review-
Attachment #8788053 - Attachment is obsolete: true
Attachment #8788053 - Flags: review?(dvander)
Attachment #8788053 - Flags: review?(cpearce)
Comment on attachment 8788043 [details]
Bug 1288618 - Part 6: Avoid unnecessarily allocating a TailDispatcher for XPCOMThreadWrapper.

https://reviewboard.mozilla.org/r/76598/#review75318

I'm kind of disoriented by review board - my intention was to r+ this patch (the HasTailTaskFor patch) with comments, and to r- the patch with the internal dispatch method. But either I did that wrong or MozReview got really confused with reordered patches or something...

::: xpcom/threads/AbstractThread.h:69
(Diff revision 2)
> +  // Returns true if we have tail tasks scheduled, or if this isn't known.
> +  // Returns false if we definitely don't have any tail tasks.
> +  virtual bool HasTailTasks() { return true; }

Given these semantics, I'd call this MightHaveTailTasks.
Attachment #8788043 - Flags: review?(bobbyholley) → review+
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #41)
> Comment on attachment 8788043 [details]
> Bug 1288618 - Part 6: Avoid unnecessarily allocating a TailDispatcher for
> XPCOMThreadWrapper.
> 
> https://reviewboard.mozilla.org/r/76598/#review75318
> 
> I'm kind of disoriented by review board - my intention was to r+ this patch
> (the HasTailTaskFor patch) with comments, and to r- the patch with the
> internal dispatch method. But either I did that wrong or MozReview got
> really confused with reordered patches or something...

You did that :) Reviewboard got confused when I pulled the other patch and renumbered everything after it.
Comment on attachment 8788040 [details]
Bug 1288618 - Part 3: Avoid accessing MediaPrefs from the GPU process within WMF code.

https://reviewboard.mozilla.org/r/76592/#review74946

::: dom/media/platforms/wmf/DXVA2Manager.cpp:492
(Diff revision 2)
>    MOZ_ASSERT(NS_IsMainThread());
>    HRESULT hr;
>  
>    // DXVA processing takes up a lot of GPU resources, so limit the number of
>    // videos we use DXVA with at any one time.
> -  const uint32_t dxvaLimit = MediaPrefs::PDMWMFMaxDXVAVideos();
> +  uint32_t dxvaLimit = 4;

Ideally whatever makes prefs work between the content/chrome process can be made to work in the gpu process too, so MediaPrefs Just Work in the gpu process...

::: dom/media/platforms/wmf/WMFVideoMFTManager.cpp:419
(Diff revision 2)
>    UINT32 aware = 0;
>    if (attr) {
>      attr->GetUINT32(MF_SA_D3D_AWARE, &aware);
>      attr->SetUINT32(CODECAPI_AVDecNumWorkerThreads,
>        WMFDecoderModule::GetNumDecoderThreads());
> -    if (MediaPrefs::PDMWMFLowLatencyEnabled()) {
> +    if ((XRE_GetProcessType() == GeckoProcessType_GPU) ||

Should we be turning on low latency mode? It crashed a lot IIRC.

So maybe for the time being just make that:

 if ((XRE_GetProcessType() != GeckoProcessType_GPU) && MediaPrefs::PDMWMFLowLatencyEnabled()) ....

?

Then once you've got the OOP decoding stabilized, try turning it on?
Attachment #8788040 - Flags: review?(cpearce) → review+
Comment on attachment 8788040 [details]
Bug 1288618 - Part 3: Avoid accessing MediaPrefs from the GPU process within WMF code.

https://reviewboard.mozilla.org/r/76592/#review74946

> Ideally whatever makes prefs work between the content/chrome process can be made to work in the gpu process too, so MediaPrefs Just Work in the gpu process...

That would be nice! For gfx, we've got gfxVars to sync values we need across to the GPU process. I've filed bug 1300678 for coming up with a solution for this.

> Should we be turning on low latency mode? It crashed a lot IIRC.
> 
> So maybe for the time being just make that:
> 
>  if ((XRE_GetProcessType() != GeckoProcessType_GPU) && MediaPrefs::PDMWMFLowLatencyEnabled()) ....
> 
> ?
> 
> Then once you've got the OOP decoding stabilized, try turning it on?

Oh good point, the default value is false. I'll fix this.
Comment on attachment 8788041 [details]
Bug 1288618 - Part 4: Fix some namespace collisions in media code.

https://reviewboard.mozilla.org/r/76594/#review75674

I feel jya should review this, as the MediaSource code is really his baby.

And I'm surprised you can't just have some typedefs here to make the existing symbol/enum names work.

::: dom/media/mediasource/MediaSourceDemuxer.cpp:418
(Diff revision 2)
>    Result result;
>    RefPtr<MediaRawData> sample =
>      mManager->GetSample(mType,
>                          media::TimeUnit(),
>                          result);
> -  MOZ_ASSERT(result != Result::ERROR && sample);
> +  MOZ_ASSERT(result != Result::ERROR_FAILURE && sample);

Why does the typedef for Result on line 382 not make this work?

::: dom/media/mediasource/TrackBuffersManager.h:159
(Diff revision 2)
>                                         const media::TimeUnit& aFuzz,
>                                         bool& aFound);
>  
>    enum class GetSampleResult
>    {
> -    NO_ERROR,
> +    NO_SAMPLE_ERROR,

The name "NO_SAMPLE_ERROR" is ambiguous; it makes me think it means "we couldn't get a sample", whereas the previous NO_ERROR meant "we didn't encounter an error" (i.e. the Apple convention of naming success codes as not-a-failure).

So if you do need to rename this, please call it something that's unambiguously a success code. Like SUCCESS, OK, or GOT_SAMPLE, or somesuch.
Attachment #8788041 - Flags: review?(cpearce) → review?(jyavenard)
Comment on attachment 8788048 [details]
Bug 1288618 - Part 11: Add a pref to enable GPU process video decoding.

https://reviewboard.mozilla.org/r/76608/#review75676
Attachment #8788048 - Flags: review?(cpearce) → review+
Comment on attachment 8788041 [details]
Bug 1288618 - Part 4: Fix some namespace collisions in media code.

https://reviewboard.mozilla.org/r/76594/#review75686

What's the conflict with?

This is an enum class, it must be fully qualified. The only way it can conflicts is if we have a crappy #define somewhere , or a #include that shouldn't be included.
In any case, this leads to bad definition somewhere.

I'd prefer the real issue fixed, rather than workaround by messing unrelated code.
Comment on attachment 8788051 [details]
Bug 1288618 - Part 14: Add PVideoDecoder protocol for individual decoders.

https://reviewboard.mozilla.org/r/76614/#review75688

::: dom/media/ipc/VideoDecoderChild.h:53
(Diff revision 2)
> +
> +  MessageLoop* mMessageLoop;
> +
> +  MediaDataDecoderCallback* mCallback;
> +
> +  RefPtr<MediaDataDecoder::InitPromise::Private> mInitPromise;

Please use a MozPromiseHolder instead.

Private shouldn't be used (hence the name)

Then to test if a promise is pending or not, you can use Exists()

::: dom/media/ipc/VideoDecoderChild.cpp:32
(Diff revision 2)
> +}
> +
> +VideoDecoderChild::~VideoDecoderChild()
> +{
> +  AssertOnManagerThread();
> +  if (mInitPromise) {

Using MozPromiseHolder; use RejectIfExists method

::: dom/media/ipc/VideoDecoderChild.cpp:90
(Diff revision 2)
> +VideoDecoderChild::RecvInitComplete()
> +{
> +  AssertOnManagerThread();
> +  mInitPromise->Resolve(TrackInfo::kVideoTrack, __func__);
> +  mInitPromise = nullptr;
> +  mInitialized = true;

this members seems redundant to me, testing if mInitPromise exists is sufficient.
No other functions can be called if an error occurred.

::: dom/media/ipc/VideoDecoderParent.h:50
(Diff revision 2)
> +  void DrainComplete() override;
> +  bool OnReaderTaskQueue() override;
> +
> +private:
> +  ~VideoDecoderParent();
> +  

trailing spaces
Comment on attachment 8788052 [details]
Bug 1288618 - Part 15: Add media code facing RemoteVideoDecoder.

https://reviewboard.mozilla.org/r/76616/#review75694

::: dom/media/ipc/RemoteVideoDecoder.h:38
(Diff revision 2)
> +  nsresult Drain() override;
> +  nsresult Shutdown() override;
> +  nsresult ConfigurationChanged(const TrackInfo& aConfig) override { MOZ_ASSERT(false); return NS_OK; }
> +
> +  virtual const char* GetDescriptionName() const override { return "RemoteVideoDecoder"; }
> +  

white spaces

::: dom/media/ipc/RemoteVideoDecoder.h:59
(Diff revision 2)
> +public:
> +  explicit RemoteDecoderModule(PlatformDecoderModule* aWrapped)
> +    : mWrapped(aWrapped)
> +  {}
> +
> +  virtual nsresult Startup() override;

redundant virtual.
virtual OR override, not both will fail static analysis
Comment on attachment 8788049 [details]
Bug 1288618 - Part 12: Initialize WMF in the GPU process.

https://reviewboard.mozilla.org/r/76610/#review75698

::: gfx/ipc/GPUProcessImpl.cpp:29
(Diff revision 2)
>  }
>  
>  bool
>  GPUProcessImpl::Init()
>  {
> +#ifdef XP_WIN

i hope this won't stay..

We've moved so that the initialisation of all PDM is now done controlled by the PDM Factory...

could simply have a PDM factory singleton in the GPU process too
Comment on attachment 8788050 [details]
Bug 1288618 - Part 13: Add VideoDecoderManager protocol.

https://reviewboard.mozilla.org/r/76612/#review75700

::: dom/media/ipc/VideoDecoderManagerChild.cpp:23
(Diff revision 2)
> +
> +static Thread* sVideoDecoderChildThread = nullptr;
> +
> +MessageLoop* VideoDecoderManagerChild::GetMessageLoop()
> +{
> +  if (!sVideoDecoderChildThread) {

Is this guaranteed to be thread safe?

It seems to me that GetMessageLoop can be called from two different threads.

Shouldn't we have a static monitor to protect sVideoDecoderChildThread?

::: dom/media/ipc/VideoDecoderManagerChild.cpp:46
(Diff revision 2)
> +
> +#ifdef XP_WIN
> +  if (!MediaPrefs::PDMUseGPUDecoder()) {
> +    return;
> +  }
> +  

white space

::: dom/media/ipc/VideoDecoderManagerChild.cpp:84
(Diff revision 2)
> +      ReentrantMonitor barrier("VideoDecoderManagerChild shutdown");
> +      ReentrantMonitorAutoEnter autoMon(barrier);
> +
> +      // Should be safe to capture stack variables by references since we're blocking
> +      // the calling thread until the lamba has completed.
> +      bool done = false;

pedantically, done should be an Atomic

::: dom/media/ipc/VideoDecoderManagerParent.h:27
(Diff revision 2)
> +
> +  static void StartupThreads();
> +  static void ShutdownThreads();
> +
> +protected:
> +  virtual bool RecvDeallocateSurfaceDescriptorGPUVideo(const SurfaceDescriptorGPUVideo& aSD) override;

while I see it's everywhere over the ipdl code, it's no reason to continue...

virtual OR override.

::: gfx/ipc/GPUProcessManager.h:51
(Diff revision 2)
>  class GPUChild;
>  class VsyncBridgeChild;
>  class VsyncIOThreadHolder;
>  class PVRManagerChild;
>  
> +

unecessary new line
(In reply to Jean-Yves Avenard [:jya] from comment #50)
> Comment on attachment 8788049 [details]
> Bug 1288618 - Part 12: Initialize WMF in the GPU process.
> 
> https://reviewboard.mozilla.org/r/76610/#review75698
> 
> ::: gfx/ipc/GPUProcessImpl.cpp:29
> (Diff revision 2)
> >  }
> >  
> >  bool
> >  GPUProcessImpl::Init()
> >  {
> > +#ifdef XP_WIN
> 
> i hope this won't stay..
> 
> We've moved so that the initialisation of all PDM is now done controlled by
> the PDM Factory...
> 
> could simply have a PDM factory singleton in the GPU process too

It would be nice to have a decent solution to this, do you want to fix this as a followup?

The problem is that WMF *needs* to be initialized on the main thread first, otherwise it just fails to post tasks internally (silently!) and hangs. Figuring this out took a lot of trial and error and debugging of closed source assembly code. They post a task to their internal task queue, and it never gets run.

We dodge this bug in the content process basically by accident, as part of trying to run a video we check canPlayType  (on the main thread!) and this instantiates the WMF PDM and initializes WMF.

We probably should have a more generic (and explicit!) way of doing the global initialization of WMF (as opposed to the per-instance init) and make sure we run that from the main thread of all relevant processes.
(In reply to Jean-Yves Avenard [:jya] from comment #51)
> Comment on attachment 8788050 [details]
> Bug 1288618 - Part 13: Add VideoDecoderManager protocol.
> 
> https://reviewboard.mozilla.org/r/76612/#review75700
> 
> ::: dom/media/ipc/VideoDecoderManagerChild.cpp:23
> (Diff revision 2)
> > +
> > +static Thread* sVideoDecoderChildThread = nullptr;
> > +
> > +MessageLoop* VideoDecoderManagerChild::GetMessageLoop()
> > +{
> > +  if (!sVideoDecoderChildThread) {
> 
> Is this guaranteed to be thread safe?
> 
> It seems to me that GetMessageLoop can be called from two different threads.
> 
> Shouldn't we have a static monitor to protect sVideoDecoderChildThread?

It's not entirely obvious, but we're guaranteed that the first call to this function happens on the main thread and before we have anything on other threads that might call it.
Comment on attachment 8788051 [details]
Bug 1288618 - Part 14: Add PVideoDecoder protocol for individual decoders.

https://reviewboard.mozilla.org/r/76614/#review75678

r+ with issues addressed.

In general, you need to be more careful about ensuring that errors are reported. Otherwise looks good.

::: dom/media/ipc/PVideoDecoder.ipdl:45
(Diff revision 2)
> +};
> +
> +// This protocol provides a way to use MediaDataDecoder/MediaDataDecoderCallback
> +// across processes. The parent side currently is only implemented to work with
> +// Window Media Foundation, but can be extended easily to support other backends.
> +// The child side runs in the child process, and the parent side runs in the

s/The child side runs in the child process/The child side runs in the content process/

?

::: dom/media/ipc/PVideoDecoder.ipdl:46
(Diff revision 2)
> +
> +// This protocol provides a way to use MediaDataDecoder/MediaDataDecoderCallback
> +// across processes. The parent side currently is only implemented to work with
> +// Window Media Foundation, but can be extended easily to support other backends.
> +// The child side runs in the child process, and the parent side runs in the
> +// GPU process. We run a separate IDPL thread for both sides.

s/We run a separate IDPL thread for both sides./We run a separate IPDL thread for both sides./

that is, s/IDPL/IPDL/, right?

::: dom/media/ipc/VideoDecoderChild.cpp:110
(Diff revision 2)
> +void
> +VideoDecoderChild::Init(RefPtr<MediaDataDecoder::InitPromise::Private> aPromise)
> +{
> +  AssertOnManagerThread();
> +  mInitPromise = aPromise;
> +  SendInit(mVideoInfo, mLayersBackend);

If the IPC send fails, reject the init promise.

::: dom/media/ipc/VideoDecoderChild.cpp:123
(Diff revision 2)
> +
> +  // TODO: It would be nice to add an allocator method to
> +  // MediaDataDecoder so that the demuxer could write directly
> +  // into shmem rather than requiring a copy here.
> +  Shmem buffer;
> +  AllocShmem(aSample->Size(), Shmem::SharedMemory::TYPE_BASIC, &buffer);

You should check the return value here, and fail if the allocation of the shmem fails. Otherwise you'll do a null pointer deref on line 125.

::: dom/media/ipc/VideoDecoderChild.cpp:134
(Diff revision 2)
> +                                        aSample->mTimecode,
> +                                        aSample->mDuration,
> +                                        aSample->mFrames,
> +                                        aSample->mKeyframe),
> +                          buffer);
> +  SendInput(sample);

If the IPC fails, you should return failure, i.e. something like:

return SendInput(sample) ? NS_OK : NS_ERROR_FAILURE;

Ditto for Flush, Drain, Shutdown, and in general if anything does IPC, you should be checking for failure and reporting failures.

::: dom/media/ipc/VideoDecoderChild.cpp:141
(Diff revision 2)
> +}
> +
> +nsresult
> +VideoDecoderChild::Flush()
> +{
> +  AssertOnManagerThread();

Once MediaDataDecoder::Flush() returns, the MFR is supposed to be able to send more input safely... I suppose that's probably still true here because the WMF decoder's Flush() implementation on the other side of the IPC is synchronous too...

::: dom/media/ipc/VideoDecoderManagerChild.h:32
(Diff revision 2)
>    // Main thread only
>    static void Initialize();
>    static void Shutdown();
>  
>  protected:
> + virtual PVideoDecoderChild* AllocPVideoDecoderChild() override;

Indentation off by one.

::: dom/media/ipc/VideoDecoderManagerChild.h:43
(Diff revision 2)
>    {}
>    ~VideoDecoderManagerChild() {}
>  
>    void Open(Endpoint<PVideoDecoderManagerChild>&& aEndpoint);
>  
>    uint32_t mManagedDecoders;

You could declare this as:

  uint32_t mManagedDecoders = 0;

And then you'd not need the constructor.

::: dom/media/ipc/VideoDecoderManagerChild.cpp:47
(Diff revision 2)
>  
>  #ifdef XP_WIN
>    if (!MediaPrefs::PDMUseGPUDecoder()) {
>      return;
>    }
>    

Trailing whitespace.

::: dom/media/ipc/VideoDecoderParent.h:39
(Diff revision 2)
> +  bool RecvInput(const MediaRawDataIPDL& aData) override;
> +  bool RecvFlush() override;
> +  bool RecvDrain() override;
> +  bool RecvShutdown() override;
> +
> +  void ActorDestroy(ActorDestroyReason aWhy) override;

ActorDestroy gets called with AbnormalShutdown when the child process crashes. In this case, it will be when the content process crashes. How does the VideoDecoderChild get a notification when the GPU process crashes? Will one of the SendXXX messages fail to send? If so, you definitely need to be checking the return values of the SendXXX functions in the child.

::: dom/media/ipc/VideoDecoderParent.h:50
(Diff revision 2)
> +  void DrainComplete() override;
> +  bool OnReaderTaskQueue() override;
> +
> +private:
> +  ~VideoDecoderParent();
> +  

Trailing whitespace.

::: dom/media/ipc/VideoDecoderParent.cpp:107
(Diff revision 2)
> +  data->mTime = aData.base().time();
> +  data->mTimecode = aData.base().timecode();
> +  data->mDuration = aData.base().duration();
> +  data->mKeyframe = aData.base().keyframe();
> +
> +  DeallocShmem(aData.buffer());

The GMP code does this complicated thing where it keeps a pool of shmems around. I think that was because we'd sometimes hit the limit on the number of file descriptors on Linux. So you may find you need to do that in future if we make other platforms decode OOP.

::: dom/media/ipc/VideoDecoderParent.cpp:109
(Diff revision 2)
> +  data->mDuration = aData.base().duration();
> +  data->mKeyframe = aData.base().keyframe();
> +
> +  DeallocShmem(aData.buffer());
> +
> +  mDecoder->Input(data);

You should send an error back to the child if Input() returns failure, as otherwise there's no other feedback.
Attachment #8788051 - Flags: review?(cpearce) → review+
Comment on attachment 8788052 [details]
Bug 1288618 - Part 15: Add media code facing RemoteVideoDecoder.

https://reviewboard.mozilla.org/r/76616/#review75684
Comment on attachment 8788052 [details]
Bug 1288618 - Part 15: Add media code facing RemoteVideoDecoder.

https://reviewboard.mozilla.org/r/76616/#review75750
Attachment #8788052 - Flags: review?(cpearce) → review+
Comment on attachment 8788038 [details]
Bug 1288618 - Part 1: Use gfxVars for CanUseHardwareVideoDecoding so we can access it in the GPU process.

https://reviewboard.mozilla.org/r/76588/#review75756
Attachment #8788038 - Flags: review?(dvander) → review+
Comment on attachment 8788042 [details]
Bug 1288618 - Part 5: Add IPC serialization helpers for some media structs/enums.

https://reviewboard.mozilla.org/r/76596/#review75758

It'd be nice to put a comment in VideoInfo mentioning when fields have to be added to the IPDL serializer. That way it's 2% less likely they get out of sync, since this has a lot of fields and only a few are serialized.
Attachment #8788042 - Flags: review?(dvander) → review+
Comment on attachment 8788044 [details]
Bug 1288618 - Part 7: Initialize AbstractThread in the GPU process.

https://reviewboard.mozilla.org/r/76600/#review75762

r=me assuming this doesn't need a paired Shutdown call or something, which would go in GPUParent::ActorDestroy.
Attachment #8788044 - Flags: review?(dvander) → review+
Comment on attachment 8788049 [details]
Bug 1288618 - Part 12: Initialize WMF in the GPU process.

https://reviewboard.mozilla.org/r/76610/#review75790

Unless there is some ordering issue that prevents this from being initialized later, it'd be nicer if this was initialized in GPUParent, where we already have #ifdef WIN stuff.
Attachment #8788049 - Flags: review?(dvander) → review+
Comment on attachment 8788050 [details]
Bug 1288618 - Part 13: Add VideoDecoderManager protocol.

https://reviewboard.mozilla.org/r/76612/#review75792

::: dom/ipc/ContentParent.cpp:1066
(Diff revision 3)
>  
> +bool
> +ContentParent::RecvInitVideoDecoderManager(bool* aSuccess, Endpoint<PVideoDecoderManagerChild>* aEndpoint)
> +{
> +  GPUProcessManager* gpm = GPUProcessManager::Get();
> +  if (!gpm) {

GPUProcessManager is always non-null, so no null-check needed here.

::: dom/ipc/ContentParent.cpp:1075
(Diff revision 3)
> +
> +  DebugOnly<bool> opened =
> +    gpm->CreateContentVideoDecoderManager(OtherPid(), aEndpoint);
> +  MOZ_ASSERT(opened);
> +  *aSuccess = true;
> +  return true;

Sending (or receiving) an invalid Endpoint over IPDL is a MOZ_RELEASE_ASSERT, so the aSuccess return value here doesn't buy us much.

I'm not sure what the right thing to do is. Inverting the message flow here is not very easy. Maybe we could do an IPDL union. Or we could just take out the MOZ_RELEASE_ASSERT and require consumers to check whether Endpoints are valid.

Either way, I'm fine with leaving this as a follow-up bug, if you do please file & leave a comment in the code.

::: dom/media/ipc/PVideoDecoderManager.ipdl:15
(Diff revision 3)
> +namespace dom {
> +
> +async protocol PVideoDecoderManager
> +{
> +parent:
> +

nit: extra newline

::: dom/media/ipc/VideoDecoderManagerChild.h:31
(Diff revision 3)
> +
> +  // Main thread only
> +  static void Initialize();
> +  static void Shutdown();
> +
> +protected:

nit: unused?

::: dom/media/ipc/VideoDecoderManagerChild.h:41
(Diff revision 3)
> +  {}
> +  ~VideoDecoderManagerChild() {}
> +
> +  void Open(Endpoint<PVideoDecoderManagerChild>&& aEndpoint);
> +
> +  uint32_t mManagedDecoders;

This field looks unused?

::: dom/media/ipc/VideoDecoderManagerChild.cpp:73
(Diff revision 3)
> +
> +/* static */ void
> +VideoDecoderManagerChild::Shutdown()
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +

nit: extra newline

::: dom/media/ipc/VideoDecoderManagerChild.cpp:95
(Diff revision 3)
> +        barrier.NotifyAll();
> +      }));
> +
> +      while (!done) {
> +        barrier.Wait();
> +      }

Since this is a brand new thread, can we use an XPCOM thread instead? Then we could use sync dispatch instead.

Alternately, you could hoist the new SynchronousTask/etc out of ImageBridgeChild.cpp into a new header.

Also, in the interest of future-proofing, you'll probably want to separate this into ShutdownSingleton/Shutdown functions so it's easy to restart the bridge.

::: dom/media/ipc/VideoDecoderManagerChild.cpp:118
(Diff revision 3)
> +{
> +  if (!aEndpoint.Bind(this)) {
> +    // We can't recover from this.
> +    MOZ_CRASH("Failed to bind VideoDecoderChild to endpoint");
> +  }
> +}

If Bind() succeeds, we need to AddRef the IPDL object. Then implement DeallocPVideoDecoderManagerChild should Release().

Otherwise, an ActorDestroy could be queued to the event loop after we've dropped the singleton ref.

::: dom/media/ipc/VideoDecoderManagerParent.h:14
(Diff revision 3)
> +#include "mozilla/dom/PVideoDecoderManagerParent.h"
> +
> +namespace mozilla {
> +namespace dom {
> +
> +class VideoDecoderManagerParent final : public PVideoDecoderManagerParent

It looks like VideoDecoderManagerParent is per-content-process, rather than a singleton. That's actually quite nice and I'm wondering whether we can just move all the static ImageMap stuff into members of VideoDecoderManagerParent.

That would also solve some of the security concern of different content processes being able to access other textures.

::: gfx/ipc/GPUProcessManager.cpp:526
(Diff revision 3)
> +bool
> +GPUProcessManager::CreateContentVideoDecoderManager(base::ProcessId aOtherProcess,
> +                                                    ipc::Endpoint<dom::PVideoDecoderManagerChild>* aOutEndpoint)
> +{
> +  if (!mGPUChild) {
> +    return true;

Return false?
Comment on attachment 8788050 [details]
Bug 1288618 - Part 13: Add VideoDecoderManager protocol.

https://reviewboard.mozilla.org/r/76612/#review75798
Attachment #8788050 - Flags: review?(dvander)
(In reply to David Anderson [:dvander] from comment #64)
> Sending (or receiving) an invalid Endpoint over IPDL is a
> MOZ_RELEASE_ASSERT, so the aSuccess return value here doesn't buy us much.
> 
> I'm not sure what the right thing to do is. Inverting the message flow here
> is not very easy. Maybe we could do an IPDL union. Or we could just take out
> the MOZ_RELEASE_ASSERT and require consumers to check whether Endpoints are
> valid.
> 
> Either way, I'm fine with leaving this as a follow-up bug, if you do please
> file & leave a comment in the code.

This is a shame. I'll file the followup, since it would be nice to have this gracefully fail if you try to enable this when the GPU process isn't running.


> > +
> > +  uint32_t mManagedDecoders;
> 
> This field looks unused?

It's used in the next patch, I probably should have pulled this line out.

> 
> Since this is a brand new thread, can we use an XPCOM thread instead? Then
> we could use sync dispatch instead.
> 
> Alternately, you could hoist the new SynchronousTask/etc out of
> ImageBridgeChild.cpp into a new header.
> 
> Also, in the interest of future-proofing, you'll probably want to separate
> this into ShutdownSingleton/Shutdown functions so it's easy to restart the
> bridge.

I'll have another look into the thread situation and see if we can do better.


> 
> It looks like VideoDecoderManagerParent is per-content-process, rather than
> a singleton. That's actually quite nice and I'm wondering whether we can
> just move all the static ImageMap stuff into members of
> VideoDecoderManagerParent.
> 
> That would also solve some of the security concern of different content
> processes being able to access other textures.

I like this idea. I'll do it as a followup since we'll need a way to get from layers code to a VideoDecoderManagerParent* in a way that the child process can't interfere with.
Depends on: 1302008
Depends on: 1302009
Comment on attachment 8788051 [details]
Bug 1288618 - Part 14: Add PVideoDecoder protocol for individual decoders.

https://reviewboard.mozilla.org/r/76614/#review75688

> this members seems redundant to me, testing if mInitPromise exists is sufficient.
> No other functions can be called if an error occurred.

This isn't true unfortunately, we seem to Flush() decoders while the Init() is in progress.
Comment on attachment 8788051 [details]
Bug 1288618 - Part 14: Add PVideoDecoder protocol for individual decoders.

https://reviewboard.mozilla.org/r/76614/#review75688

> This isn't true unfortunately, we seem to Flush() decoders while the Init() is in progress.

i don't see how this could happen.
But if it does, then it's a bug and the MediaFormatReader should be fixed accordingly.
feel free to open a bug on this, I can take it on ASAP.
I am not found of how the idea that this patchset re-implements texture sharing another way. The remote video decoders could shared textures with a simple ImageBridge-like IPDL protocol using the same CompositableForwarder infrastructure and code.
Inevitably this new texture sharing code will become more complex and will grow into needing the features already baked into ImageBridge and friends. If the already-existing texture sharing code isn't the right tool for the remote gpu video stuff, it means that it is already not the right tool for our current video needs and we should fix it.
(In reply to Nicolas Silva [:nical] from comment #69)
> I am not found of how the idea that this patchset re-implements texture
> sharing another way. The remote video decoders could shared textures with a
> simple ImageBridge-like IPDL protocol using the same CompositableForwarder
> infrastructure and code.
> Inevitably this new texture sharing code will become more complex and will
> grow into needing the features already baked into ImageBridge and friends.
> If the already-existing texture sharing code isn't the right tool for the
> remote gpu video stuff, it means that it is already not the right tool for
> our current video needs and we should fix it.

Can you please explain more about this idea?

Do you want an equivalent to ImageBridge running from the decoder thread to the compositor thread, entirely within the GPU process?

The video decoders know nothing about presentation, so we'd still need the real ImageBridge connection (from content process to GPU process) to send PTextures that signify which frames are to be displayed. We'd still need some sort of identifier sent from video decoder -> content process and then that same identifier would be sent within the ImageBridge PTexture back to the GPU process.

The main difference would be that we'd map from ID -> TextureHost (that is owned by the 'VideoBridge') rather than ID -> layers::Image (owned by the hashtable in VideoDecoderManagerParent).

We really don't want to send anything other than an arbitrary ID back to the content process, since we want to be able to avoid having shared textures entirely. This is also a problem if we did setup a new 'VideoBridge' within the content process, since we'd want the shared object to be ID3D11Texture2D* not HANDLE, and all of our PTexture code currently works with the former.
Comment on attachment 8788050 [details]
Bug 1288618 - Part 13: Add VideoDecoderManager protocol.

https://reviewboard.mozilla.org/r/76612/#review76902

::: dom/media/ipc/PVideoDecoderManager.ipdl:15
(Diff revision 4)
> +namespace dom {
> +
> +async protocol PVideoDecoderManager
> +{
> +parent:
> +

nit: extra newline

::: dom/media/ipc/VideoDecoderManagerChild.cpp:52
(Diff revision 4)
> +    return;
> +  }
> +
> +  // TODO: The above message should return an empty endpoint if there wasn't a GPU
> +  // process. Unfortunately IPDL will assert in this case, so it can't actually
> +  // happen. Bug 1302009 is filed for fixing this.

After bug 1302009 lands, we can just check endpoint.IsValid(). I can't decide if this message would be better on PCompositorBridge, but it's fine here for now.
Attachment #8788050 - Flags: review?(dvander) → review+
(In reply to Matt Woodrow (:mattwoodrow) from comment #70)

> 
> Can you please explain more about this idea?

Could be that my understanding of the big picture is still too blurry, but my worry is that sharing textures has proved to be tricky and bug prone (not only cross-process, cross-thread too). We currently have a TextureClient/Host abstraction that got a bit hairier than I'd like but is very solid when it comes to properly defining ownership and resolving the destruction sequence between the threads involved.
TextureData is intended as an implementation detail of this system, so I'm not found of using it by hand. If/when either the way TextureClient or the GPU process changes it means TextureData has to cater for the two ways it is used.
Using TextureClient/Host to handle shared textures also means we don't have to add new interfaces to Compositor like having it talk directly to the Image abstraction. When the a compositable is presented a TextureHost, it does not need to know whether it comes from the content process or a decoder thread, and the compositor doesn't need to know what Image is (The parts of the Image abstraction that matter to the compositor being redundant with TextureClient/Host).

> 
> Do you want an equivalent to ImageBridge running from the decoder thread to
> the compositor thread, entirely within the GPU process?

Something like that, or simpler, but at least a texture sharing mechanism that is based on TextureClient/Host.

> 
> The video decoders know nothing about presentation, so we'd still need the
> real ImageBridge connection (from content process to GPU process) to send
> PTextures that signify which frames are to be displayed. We'd still need
> some sort of identifier sent from video decoder -> content process and then
> that same identifier would be sent within the ImageBridge PTexture back to
> the GPU process.

Yes, that's fine. It's very similar to what ImageBridge does in coordination with CompositorBrdige in fact, with the exception that ImageBridge Ids are per video stream and in this case it is per texture.

> 
> The main difference would be that we'd map from ID -> TextureHost (that is
> owned by the 'VideoBridge') rather than ID -> layers::Image (owned by the
> hashtable in VideoDecoderManagerParent).
> 
> We really don't want to send anything other than an arbitrary ID back to the
> content process, since we want to be able to avoid having shared textures
> entirely. This is also a problem if we did setup a new 'VideoBridge' within
> the content process, since we'd want the shared object to be
> ID3D11Texture2D* not HANDLE, and all of our PTexture code currently works
> with the former.

I am not 100% sure I see what you are referring to, but the idea that the content process has a "fake" RemoteTextureData type of thing that only wraps an ID which is matched on the compositor side to replace with the real TextureHost (with a texture in it) coming from the decoder thread make sense. My issue is with doing texture sharing by hand between the decoder and the compositor, and exposing things like Image to the compositor.


> We really don't want to send anything other than an arbitrary ID back to the
> content process [...]

I am not proposing to use TextureClient/Host to share textures between the decoder thread and the content process (I mean, unless you do need it). It seems to me that what you are doing here is very similar to what ImageBridge does: it sends textures to the compositor and another channel connects things through Ids. The requirements in terms of texture sharing are the same as far as I can tell, and I don't see the need to add Image to the compositor API. TextureClient started out simple and grew complicated. whatever new APIs you add to the compositor will naturally become more complicated so I'd rather have a good reason to add a new system when there's already one in place that does more or less the same thing.

To be more precise I am more attached to the idea of using TextureClient/Host to share textures with the compositor (and not adding more texture abstractions like Image to the Compositor API) than to using a specific IPDL protocol and/or messaging scheme with or without transactions. You could even decouple the TextureClient/Host mechanism from IPDL and make it able to work with another message passing mechanism if you don't want to use IPDL with the decoder thread, or even no message passing at all for all I care.
Comment on attachment 8788051 [details]
Bug 1288618 - Part 14: Add PVideoDecoder protocol for individual decoders.

https://reviewboard.mozilla.org/r/76614/#review77038

::: dom/media/ipc/VideoDecoderParent.cpp:37
(Diff revision 4)
> +  MOZ_COUNT_CTOR(VideoDecoderParent);
> +  // We hold a reference to ourselves to keep us alive until IPDL
> +  // explictly destroys us. There may still be refs held by
> +  // tasks, but no new ones should be added after we're
> +  // destroyed.
> +  mSelfRef = this;

Can you call this mIPDLRef or mIPDLSelfRef? (With that name I'm liking this pattern better than the manual AddRef/Release, now.)

::: dom/media/ipc/VideoDecoderParent.cpp:157
(Diff revision 4)
> +{
> +  MOZ_ASSERT(mDecodeTaskQueue->IsCurrentThreadIn());
> +  RefPtr<VideoDecoderParent> self = this;
> +  RefPtr<MediaData> data = aData;
> +  mManagerTaskQueue->Dispatch(NS_NewRunnableFunction([self, data]() {
> +    if (!self->mDestroyed) {

nit: This should early-return instead.
Attachment #8788051 - Flags: review?(dvander) → review+
Comment on attachment 8788052 [details]
Bug 1288618 - Part 15: Add media code facing RemoteVideoDecoder.

https://reviewboard.mozilla.org/r/76616/#review77050

::: dom/media/ipc/RemoteVideoDecoder.cpp:34
(Diff revision 4)
> +}
> +
> +RemoteVideoDecoder::~RemoteVideoDecoder()
> +{
> +  VideoDecoderChild* actor = mActor;
> +  VideoDecoderManagerChild::GetManagerThread()->Dispatch(NS_NewRunnableFunction([actor]() {

All the async callbacks in this function should (1) hold a reference to VideoDecoderChild while it's in the event queue and (2) check that ActorDestroy hasn't run on the VideoDecoderChild by the time the event dispatches.

It looks like this'll require refcounting VideoDecoderChild and keeping an mCanSend bit or something in that class.

It's okay to push the mCanSend check down into the actor, as long as we do it somewhere.

::: dom/media/ipc/RemoteVideoDecoder.cpp:138
(Diff revision 4)
> +  VideoDecoderManagerChild::GetManagerThread()->Dispatch(NS_NewRunnableFunction([object, callback, info, backend]() {
> +    VideoDecoderChild* child = static_cast<VideoDecoderChild*>(VideoDecoderManagerChild::GetSingleton()->SendPVideoDecoderConstructor());
> +    object->mActor = child;
> +    object->mActor->mCallback = callback;
> +    object->mActor->mVideoInfo = info;
> +    object->mActor->mLayersBackend = backend;

Does it make sense to put this behind an Init function or something? That function could also acquire the IPDL ref.
Attachment #8788052 - Flags: review?(dvander)
(In reply to Nicolas Silva [:nical] from comment #87)
> 
> To be more precise I am more attached to the idea of using
> TextureClient/Host to share textures with the compositor (and not adding
> more texture abstractions like Image to the Compositor API) than to using a
> specific IPDL protocol and/or messaging scheme with or without transactions.
> You could even decouple the TextureClient/Host mechanism from IPDL and make
> it able to work with another message passing mechanism if you don't want to
> use IPDL with the decoder thread, or even no message passing at all for all
> I care.

Ok, so what if I store the layers::Image in the hashtable (as I currently do), call GetTextureClient() on it and store that, and then manually call Serialize()/CreateTextureHost() to get the TextureHost without needing IPDL.

That way the TextureHost for the SurfaceDescriptorGPUVideo can still use the ID to map into the hashtable, but it'll grab the TextureHost instead of layers::Image, and then we can share textures with the compositor using that instead.
Ugh, separating TextureHost from IPDL looks really complicated and almost certainly bug prone.

I think I'll need a new IPDL protocol similar to ImageBridge (VideoBridge?) that only manages PTextures and nothing else.

Then the TextureHost created across ImageBridge can be a wrapper TextureHost that just forwards all operations to the wrapped TextureHost from the VideoBridge.

That's a fair bit of work though. Would you be ok with landing this as is (it's all pref'd off), and then I'll fix it as a follow-up before it gets turned on? I'd like to try land what I've got before it bitrots too much.
Comment on attachment 8788052 [details]
Bug 1288618 - Part 15: Add media code facing RemoteVideoDecoder.

https://reviewboard.mozilla.org/r/76616/#review77208

::: dom/media/ipc/RemoteVideoDecoder.cpp:40
(Diff revision 6)
> +  // task queue for the VideoDecoderChild thread to keep
> +  // it alive until we send the delete message.
> +  RefPtr<VideoDecoderChild> actor = mActor;
> +  VideoDecoderManagerChild::GetManagerThread()->Dispatch(NS_NewRunnableFunction([actor]() {
> +    MOZ_ASSERT(actor);
> +    PVideoDecoderChild::Send__delete__(actor);

This should check the actor state before sending.

::: dom/media/ipc/RemoteVideoDecoder.cpp:138
(Diff revision 6)
> +
> +  VideoInfo info = aParams.VideoConfig();
> +
> +  layers::LayersBackend backend = aParams.mLayersBackend;
> +  VideoDecoderManagerChild::GetManagerThread()->Dispatch(NS_NewRunnableFunction([object, callback, info, backend]() {
> +    object->mActor = static_cast<VideoDecoderChild*>(VideoDecoderManagerChild::GetSingleton()->SendPVideoDecoderConstructor());

This is dangerous, RefPtr isn't threadsafe and mActor is being assigned and read on different threads.

Ideally, mActor would be assigned outside of the dispatch and initialized asynchronously.

Alternately, it looks like mActor is not used on the media thread except for the destructor, where we're guaranteed no one else is assigning it. So you're probably safe, but if you can't assert it - better safe than sorry.
Attachment #8788052 - Flags: review?(dvander)
Comment on attachment 8788052 [details]
Bug 1288618 - Part 15: Add media code facing RemoteVideoDecoder.

https://reviewboard.mozilla.org/r/76616/#review77214
Attachment #8788052 - Flags: review+
(In reply to Matt Woodrow (:mattwoodrow) from comment #91)
> That's a fair bit of work though. Would you be ok with landing this as is
> (it's all pref'd off), and then I'll fix it as a follow-up before it gets
> turned on?

Ok, sounds good.
Comment on attachment 8788045 [details]
Bug 1288618 - Part 8: Add Compositor API to create a TextureSource from a layers::Image.

https://reviewboard.mozilla.org/r/76602/#review77268

Please add a comment saying that this is a temporary thing that will be removed sometime soon and new code should not depend on it.
Attachment #8788045 - Flags: review?(nical.bugzilla) → review+
Comment on attachment 8788046 [details]
Bug 1288618 - Part 9: Add a new SurfaceDescriptor type for video decoding in the GPU process.

https://reviewboard.mozilla.org/r/76604/#review77270
Attachment #8788046 - Flags: review?(nical.bugzilla) → review+
Comment on attachment 8788047 [details]
Bug 1288618 - Part 10: Add a new layers::Image type for GPUVideo.

https://reviewboard.mozilla.org/r/76606/#review77272
Attachment #8788047 - Flags: review?(nical.bugzilla) → review+
Depends on: 1302918
Comment on attachment 8788041 [details]
Bug 1288618 - Part 4: Fix some namespace collisions in media code.

https://reviewboard.mozilla.org/r/76594/#review77484

this is super ugly IMHO, this undef should be done where the actual file containing that definition is included.
Attachment #8788041 - Flags: review+
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e00026b645d2
Part 1: Use gfxVars for CanUseHardwareVideoDecoding so we can access it in the GPU process. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/1d9978f82efc
Part 2: Allow allocating D3D9/11 Images when we don't have a recycling allocator available. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/36f0861336b6
Part 3: Avoid accessing MediaPrefs from the GPU process within WMF code. r=cpearce
https://hg.mozilla.org/integration/mozilla-inbound/rev/5b279f726a5e
Part 4: Fix some namespace collisions in media code. r=jya
https://hg.mozilla.org/integration/mozilla-inbound/rev/ddaa7f107193
Part 5: Add IPC serialization helpers for some media structs/enums. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/fdee8d1538b0
Part 6: Avoid unnecessarily allocating a TailDispatcher for XPCOMThreadWrapper. r=bholley
https://hg.mozilla.org/integration/mozilla-inbound/rev/30e2b2eea845
Part 7: Initialize AbstractThread in the GPU process. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/da04776fc1bd
Part 8: Add Compositor API to create a TextureSource from a layers::Image. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/7ac458ce1e64
Part 9: Add a new SurfaceDescriptor type for video decoding in the GPU process. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/409e327b69c7
Part 10: Add a new layers::Image type for GPUVideo. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/86e71b1c4edf
Part 11: Add a pref to enable GPU process video decoding. r=cpearce
https://hg.mozilla.org/integration/mozilla-inbound/rev/d5c1565d5f7f
Part 12: Initialize WMF in the GPU process. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/f981edef8f81
Part 13: Add VideoDecoderManager protocol. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/f2da4a56dc4f
Part 14: Add PVideoDecoder protocol for individual decoders. r=cpearce,dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/00cc54c30222
Part 15: Add media code interface RemoteVideoDecoder. r=cpearce,dvander
sorry had to back this out for valgrind failures like https://treeherder.mozilla.org/logviewer.html#?job_id=35895195&repo=mozilla-inbound
Flags: needinfo?(matt.woodrow)
Backout by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7821e3ae6aec
Backed out changeset 00cc54c30222 
https://hg.mozilla.org/integration/mozilla-inbound/rev/dce59210b50d
Backed out changeset f2da4a56dc4f 
https://hg.mozilla.org/integration/mozilla-inbound/rev/1d54e301b552
Backed out changeset f981edef8f81 
https://hg.mozilla.org/integration/mozilla-inbound/rev/6d8a48fbea14
Backed out changeset d5c1565d5f7f 
https://hg.mozilla.org/integration/mozilla-inbound/rev/a665d0287143
Backed out changeset 86e71b1c4edf 
https://hg.mozilla.org/integration/mozilla-inbound/rev/23a99e45a5d8
Backed out changeset 409e327b69c7 
https://hg.mozilla.org/integration/mozilla-inbound/rev/d3daabe53a71
Backed out changeset 7ac458ce1e64 
https://hg.mozilla.org/integration/mozilla-inbound/rev/84ee142476f5
Backed out changeset da04776fc1bd 
https://hg.mozilla.org/integration/mozilla-inbound/rev/c8d1c359469e
Backed out changeset 30e2b2eea845 
https://hg.mozilla.org/integration/mozilla-inbound/rev/deb400c9f162
Backed out changeset fdee8d1538b0 
https://hg.mozilla.org/integration/mozilla-inbound/rev/975f2a57e36d
Backed out changeset ddaa7f107193 
https://hg.mozilla.org/integration/mozilla-inbound/rev/d656afad6531
Backed out changeset 5b279f726a5e 
https://hg.mozilla.org/integration/mozilla-inbound/rev/fd00432644ea
Backed out changeset 36f0861336b6 
https://hg.mozilla.org/integration/mozilla-inbound/rev/81cdd5e85057
Backed out changeset 1d9978f82efc 
https://hg.mozilla.org/integration/mozilla-inbound/rev/8b0b6ff32644
Backed out changeset e00026b645d2 for valgrind failures
Juian, is there any chance you could help me with this failure?

It's almost certainly the first patch [1] as that's the only one that touches code near where the uninitialized memory is created.

It's not clear how this is unitialized during the read though as skia clearly initializes the memory right before calling cpuid.
Flags: needinfo?(matt.woodrow) → needinfo?(jseward)
(In reply to Matt Woodrow (:mattwoodrow) from comment #121)
> It's not clear how this is unitialized during the read though as skia
> clearly initializes the memory right before calling cpuid.

I have a theory, as yet unproven.  I suspect this is an obscure
Valgrind instrumentation inaccuracy.

CPUID (the instruction) reads EAX, and, depending on the EAX value, it
also reads ECX.  Then it writes results into EAX, EBX, ECX and EDX.
What may have happened is that CPUID has been called with an EAX value
that doesn't require the instruction to read ECX, and with ECX being
undefined.  The problem is that Valgrind regards CPUID as always
reading both EAX and ECX, and so it paints all of the 4 output values
as being undefined, which leads both the the complaint.  It also
explains the apparently bogus source of the undefinedness:

mozilla::gfx::gfxVars::VarImpl<bool,
&mozilla::gfx::gfxVars::GetCanUseHardwareVideoDecodingDefault>::GetValue(mozilla::gfx::GfxVarValue*)

since presumably that's where ECX most recently got made undefined
(for unrelated reasons) and so Memcheck presents that as the origin
point.

What to do about this .. I don't know yet.

I have to say, in 12+ years of shipping Valgrind on x86 targets, I have
never heard of a failure like this before.  So maybe my analysis isn't
correct.  I'll keep looking.
Flags: needinfo?(jseward)
I built with gcc-4.8 at -O2, and with the patch shown in comment 122 applied,
but I cannot reproduce the failure, even when running "./mach valgrind-test"
as that is what I think ran in automation.  Would it help to try the entire
patch set?
Assuming the theory in comment 123 is correct, this should fix the
problem by forcing ecx to zero in the case where we issue CPUID with
eax=1.  Unfortunately this changes the semantics a bit, because the
__get_cpuid() call that it removes checks that eax=1 is a supported 
value for this CPU (which it always should be; eax=1 is a pretty basic
query).

I'd be happier about this if I could actually reproduce the problem, though.
(In reply to Julian Seward [:jseward] from comment #125)
> Created attachment 8791983 [details] [diff] [review]
> Forces %ecx to zero before calling cpuid(%eax=1)
> 
> Assuming the theory in comment 123 is correct, this should fix the
> problem by forcing ecx to zero in the case where we issue CPUID with
> eax=1.  Unfortunately this changes the semantics a bit, because the
> __get_cpuid() call that it removes checks that eax=1 is a supported 
> value for this CPU (which it always should be; eax=1 is a pretty basic
> query).
> 
> I'd be happier about this if I could actually reproduce the problem, though.

This fixes the Valgrind test on treeherder at least.

Unfortunately skia is 3d party code, so maintaining a local modification is annoying.

Can we instead fix this using a suppression file entry?
Flags: needinfo?(jseward)
Comment on attachment 8788041 [details]
Bug 1288618 - Part 4: Fix some namespace collisions in media code.

https://reviewboard.mozilla.org/r/76594/#review77968
Attachment #8788041 - Flags: review?(cpearce) → review+
(In reply to Matt Woodrow (:mattwoodrow) from comment #127)
> Can we instead fix this using a suppression file entry?

Try the attached patch.  Unfortunately I can't test it because
I can't reproduce the problem.
Flags: needinfo?(jseward)
Attached patch valgrind suppression file — — Splinter Review
Looks like we need both suppression entries to get it green. Who should review this?
Attachment #8792736 - Flags: feedback?(jseward)
Comment on attachment 8792736 [details] [diff] [review]
valgrind suppression file

Looks like you've reviewed changes to this file before Nick, so flagging you for review. Feel free to drop it or defer it if you don't feel comfortable doing so. Thanks!
Attachment #8792736 - Flags: review?(n.nethercote)
Comment on attachment 8792736 [details] [diff] [review]
valgrind suppression file

Review of attachment 8792736 [details] [diff] [review]:
-----------------------------------------------------------------

::: build/valgrind/x86_64-redhat-linux-gnu.sup
@@ +206,5 @@
> +   fun:__get_cpuid
> +   fun:cpuid
> +   fun:_ZN6SkOptsL4initEv
> +   fun:sk_once_no_arg_adaptor
> +}

You could use a frame-level wildcard ("...") to combine the two suppressions into a single one if you wanted:

> {
>    Bug 1288618 comments 119 through 127
>    Memcheck:Cond
>    ...
>    fun:_ZN6SkOptsL4initEv
>    fun:sk_once_no_arg_adaptor
> }

Or not, doesn't matter much either way.

(See http://valgrind.org/docs/manual/manual-core.html#manual-core.suppress for suppression format details.)
Attachment #8792736 - Flags: review?(n.nethercote) → review+
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/da740b4fd484
Part 1: Use gfxVars for CanUseHardwareVideoDecoding so we can access it in the GPU process. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/75a61d0e71b7
Part 2: Allow allocating D3D9/11 Images when we don't have a recycling allocator available. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/2583ae4e2e3b
Part 3: Avoid accessing MediaPrefs from the GPU process within WMF code. r=cpearce
https://hg.mozilla.org/integration/mozilla-inbound/rev/53b113acee42
Part 4: Fix some namespace collisions in media code. r=jya
https://hg.mozilla.org/integration/mozilla-inbound/rev/4e7fe69d5f45
Part 5: Add IPC serialization helpers for some media structs/enums. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/8f67443dccb8
Part 6: Avoid unnecessarily allocating a TailDispatcher for XPCOMThreadWrapper. r=bholley
https://hg.mozilla.org/integration/mozilla-inbound/rev/63a9c689e1d5
Part 7: Initialize AbstractThread in the GPU process. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/95f23366de82
Part 8: Add Compositor API to create a TextureSource from a layers::Image. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/8b0adeab93df
Part 9: Add a new SurfaceDescriptor type for video decoding in the GPU process. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/f8bbdabdb6da
Part 10: Add a new layers::Image type for GPUVideo. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/24df0e89b273
Part 11: Add a pref to enable GPU process video decoding. r=cpearce
https://hg.mozilla.org/integration/mozilla-inbound/rev/e98f835c6ee5
Part 12: Initialize WMF in the GPU process. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/25396a1af922
Part 13: Add VideoDecoderManager protocol. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/e179c8e8265d
Part 14: Add PVideoDecoder protocol for individual decoders. r=cpearce,dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/2a47f8ea1d89
Part 15: Add media code interface RemoteVideoDecoder. r=cpearce,dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/06187d250f7a
Add a new valgrind suppression entry to fix a false positive with CPUID. r=njn
sorry had to back this out again for a cross-platform bustage like https://treeherder.mozilla.org/logviewer.html#?job_id=36194634&repo=mozilla-inbound
Flags: needinfo?(matt.woodrow)
Backout by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8b6401bcb115
Backed out 16 changesets for bustage on a CLOSED TREE
Pushed by mwoodrow@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b2b7cbb3efe4
Part 1: Use gfxVars for CanUseHardwareVideoDecoding so we can access it in the GPU process. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/2d24d26d01d6
Part 2: Allow allocating D3D9/11 Images when we don't have a recycling allocator available. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/d632e1e10c41
Part 3: Avoid accessing MediaPrefs from the GPU process within WMF code. r=cpearce
https://hg.mozilla.org/integration/mozilla-inbound/rev/21eda8769a71
Part 4: Fix some namespace collisions in media code. r=jya
https://hg.mozilla.org/integration/mozilla-inbound/rev/252f4945c107
Part 5: Add IPC serialization helpers for some media structs/enums. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/637bc458366b
Part 6: Avoid unnecessarily allocating a TailDispatcher for XPCOMThreadWrapper. r=bholley
https://hg.mozilla.org/integration/mozilla-inbound/rev/26730dd9f8c9
Part 7: Initialize AbstractThread in the GPU process. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/51a6e5bbafd7
Part 8: Add Compositor API to create a TextureSource from a layers::Image. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/b96953533f9d
Part 9: Add a new SurfaceDescriptor type for video decoding in the GPU process. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/bc592d380641
Part 10: Add a new layers::Image type for GPUVideo. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/d3e692026805
Part 11: Add a pref to enable GPU process video decoding. r=cpearce
https://hg.mozilla.org/integration/mozilla-inbound/rev/03cffb70d1af
Part 12: Initialize WMF in the GPU process. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/77f9319fb6f6
Part 13: Add VideoDecoderManager protocol. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/70e4b38cd629
Part 14: Add PVideoDecoder protocol for individual decoders. r=cpearce,dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/c6c28fb67917
Part 15: Add media code interface RemoteVideoDecoder. r=cpearce,dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/29f80f1769fc
Add a new valgrind suppression entry to fix a false positive with CPUID. r=njn
Depends on: 1304722
No longer blocks: 1297311
Flags: needinfo?(matt.woodrow)
Depends on: 1305213
Blocks: 1305320
Depends on: 1314187
Depends on: 1314189
Depends on: 1314191
Depends on: 1314192
Depends on: 1314193
Depends on: 1314194
Depends on: 1315141
Depends on: 1315144
This has introduced a regression on bug 1214710, which means the crashes from bug 1278574 are back.
I'm glad to help with troubleshooting and testing if required.

Regression range: https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=6cf09a35f5295baef0706f63a80c308abdfd4594&tochange=29f80f1769fc66ca5cc390183f3b49eec160ad4f
Flags: needinfo?(matt.woodrow)
More specifically, patch 13 is causing the regression (https://hg.mozilla.org/integration/mozilla-inbound/rev/25396a1af922).
This is on Android right? Wasn't GPU process decode only enabled on Windows?

As far as I was explained, the androids surface aren't compatible with the GPU process ipdl.

Eugen, can you please open another bug?
Depends on: 1317779
No longer depends on: 1317779
Flags: needinfo?(matt.woodrow)
Depends on: 1318235
Depends on: 1319190
Attachment #8792736 - Flags: feedback?(jseward)
See Also: → 1351993
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: