Don't eagerly initialize D3D11/D2D for WebRender content processes

RESOLVED FIXED in Firefox 66

Status

()

P2
normal
RESOLVED FIXED
3 months ago
25 days ago

People

(Reporter: bobbyholley, Assigned: sotaro)

Tracking

(Depends on: 1 bug, Blocks: 2 bugs)

unspecified
mozilla66
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox66 fixed)

Details

Attachments

(2 attachments, 5 obsolete attachments)

This causes a ~4MB GPU allocation per content process for the swap chains. We generally don't need it except for canvas, which is being remoted in bug 1464032.

So we could wait for bug 1464032 and then just never initialize it in WR content processes, but it's easy enough to make on-demand.
So I wrote a patch for this ([1] or equivalently [2]), but I'm hitting a weird Canvas2D bug. It works fine when we draw to the canvas every frame (i.e. [3]), but if we don't (i.e. [4]) the image is rendered and then disappears with the next frame.

You can reproduce this by applying the patch from either [1] or [2], and then loading the testcases in [3] and [4]. If you remove the !gfxVars::UseWebRender() check from gfxPlatform.cpp (i.e. call EnsureDevicesInitialized unconditionally), things work as expected.

I've stepped through nsDisplayCanvas::CreateWebRenderCommands in both cases, and it all looks similar and sane. The first time through we run CreateCompositable() and then update mCanvasClient, and subsequent runs bail out early because of the IsDirty() check. On the WebRender side, the GL texture id doesn't change between the first frame and subsequent frames. So the problem is presumably somewhere in the buffer management in the layers code.

I'm at the limits of my knowledge here, and so it would probably be more efficient for someone familiar with the setup to debug this. Sotaro, can you have a look?

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=e1b8327cf5871d01c2d09e1051f4ab0b70f028bf
[2] https://github.com/bholley/gecko/commit/fd7e0afdbfd7816f5b54421fbc169dca9910883b
[3] https://jrmuizel.github.io/webrender/perf/canvas-perf.html
[4] https://bholley.net/testcases/canvasfill.html
Flags: needinfo?(sotaro.ikeda.g)
(Assignee)

Comment 2

3 months ago
Yup, I take a look:)
Flags: needinfo?(sotaro.ikeda.g)
(Assignee)

Comment 3

3 months ago
WrBridge()->GetSyncObject()->Synchronize() was not called in WebRenderLayerManager::EndEmptyTransaction() and WebRenderLayerManager::EndEmptyTransaction(). SyncObjectD3D11Client was not initialized correctly because when SyncObjectD3D11Client was created, ContentDevice was not initialized yet.

SyncObjectD3D11Client is used to synchronize D3D11Texture on main  thread of content process.
(Assignee)

Comment 4

3 months ago
(Assignee)

Comment 5

3 months ago
attachment 9031388 [details] [diff] [review] addressed the canvas2d rendering problem. But it causes the performance regression on WebGL and software decoded video, since ContentDevice might not be initialized when they are used.
(Assignee)

Comment 6

3 months ago
One simple way to address Comment 5 is just defer d2d initialization and do not defer D3DDevice initialization.
Comment hidden (obsolete)
Comment hidden (obsolete)
(Assignee)

Comment 9

3 months ago
A bit of clean up.
Attachment #9031392 - Attachment is obsolete: true
(Assignee)

Comment 10

3 months ago
Comment on attachment 9031397 [details] [diff] [review]
patch - Defer only d2d initialization

:bholley, can you feedback to the patch? It defer d2d device initialization, but it still initialize content device(D3DDevice). It could prevent the problem of Comment 5.
Attachment #9031397 - Flags: feedback?(bobbyholley)
Comment on attachment 9031397 [details] [diff] [review]
patch - Defer only d2d initialization

Unfortunately, D3D initialization is where the large GPU memory allocations are coming from (swap chains), so that's actually the more-important part to defer.
Attachment #9031397 - Flags: feedback?(bobbyholley) → feedback-
(Reporter)

Updated

3 months ago
Summary: Don't eagerly initialize D2D for WebRender content processes → Don't eagerly initialize D3D11/D2D for WebRender content processes
(In reply to Sotaro Ikeda [:sotaro] from comment #5)
> attachment 9031388 [details] [diff] [review] addressed the canvas2d
> rendering problem. But it causes the performance regression on WebGL and
> software decoded video, since ContentDevice might not be initialized when
> they are used.

This is because WebGL and software decoded video rely on the ContentDevice to transfer surfaces from the content process to the GPU process? Could we do the same lazy initialization for those cases as well?
Reassigning to Sotaro for the time being.
Assignee: bobbyholley → sotaro.ikeda.g
Blocks: 1386669
Priority: -- → P2
Also, note that there is ongoing work to stop using Dx for both Canvas (bug 1464032) and WebGL (bug 1477756) in the content process. The intention is that we stop calling into those APIs entirely so that we can use a stronger sandbox.

So in the interim, we should use the same solution for WebGL as we do for Canvas (lazily initialize on first use). But we'll need a story long-term for software video decoding that doesn't rely on initializing D3D11 in the content process.
(Assignee)

Comment 16

3 months ago
(In reply to Bobby Holley (:bholley) from comment #13)
> This is because WebGL and software decoded video rely on the ContentDevice
> to transfer surfaces from the content process to the GPU process?

Yes, WebGL with ANGLE uses content device.
  https://dxr.mozilla.org/mozilla-central/source/gfx/gl/SharedSurfaceANGLE.cpp#215

Software decoded video uses D3D11YCbCrImage for performance. It also uses content device.
  https://dxr.mozilla.org/mozilla-central/source/dom/media/MediaData.cpp#276

> Could we do the same lazy initialization for those cases as well?

It seems possible. Handling WebGL could be similar to canvas 2D. We might need to think about OffscreenCanvas. It is in behind the pref.

Handling software decoded video could be tricky. It happens off main thread.
(Assignee)

Comment 17

3 months ago
(In reply to Sotaro Ikeda [:sotaro] from comment #16)
> 
> Handling software decoded video could be tricky. It happens off main thread.

One simple solution is using WebRenderBridgeChild::GetForMedia() to trigger d3d initialization. But it is called on main thread.
  https://dxr.mozilla.org/mozilla-central/source/gfx/layers/wr/WebRenderBridgeChild.cpp#626
  https://dxr.mozilla.org/mozilla-central/source/dom/media/MediaDecoder.cpp#916

It is also called on audio only playback. But if we want to choose another place to trigger initialization. It becomes more tricky.
(Assignee)

Comment 18

3 months ago
We also need to think about WebRTC.
(In reply to Sotaro Ikeda [:sotaro] from comment #16)
> (In reply to Bobby Holley (:bholley) from comment #13)
> > This is because WebGL and software decoded video rely on the ContentDevice
> > to transfer surfaces from the content process to the GPU process?
> 
> Yes, WebGL with ANGLE uses content device.
>  
> https://dxr.mozilla.org/mozilla-central/source/gfx/gl/SharedSurfaceANGLE.
> cpp#215
> 
> Software decoded video uses D3D11YCbCrImage for performance. It also uses
> content device.
>   https://dxr.mozilla.org/mozilla-central/source/dom/media/MediaData.cpp#276
> 
> > Could we do the same lazy initialization for those cases as well?
> 
> It seems possible. Handling WebGL could be similar to canvas 2D.

Right. Both would be lazily initialized in the interim until the remoting work is completed.

> We might
> need to think about OffscreenCanvas. It is in behind the pref.

Looks like it was landed behind a pref in bug 709490 back in 2015 and never prefed on. Not sure if anyone cares about it, but I doubt we should allow it to block this work.

> Handling software decoded video could be tricky. It happens off main thread.
>
> ...
> 
> One simple solution is using WebRenderBridgeChild::GetForMedia() to trigger
> d3d initialization. But it is called on main thread.
>  
> https://dxr.mozilla.org/mozilla-central/source/gfx/layers/wr/
> WebRenderBridgeChild.cpp#626
>  
> https://dxr.mozilla.org/mozilla-central/source/dom/media/MediaDecoder.cpp#916
> 
> It is also called on audio only playback. But if we want to choose another
> place to trigger initialization. It becomes more tricky.

Well, long-term we're going to need to have a software video path that doesn't require us to have D3D devices initialized in the content process. I am open to an interim hack similar to Canvas2D and WebGL, but we should at least understand what the long-term plan is. Alex?

(In reply to Sotaro Ikeda [:sotaro] from comment #18)
> We also need to think about WebRTC.

Same here.
Flags: needinfo?(agaynor)
(In reply to Bobby Holley (:bholley) from comment #19)

> 
> Well, long-term we're going to need to have a software video path that
> doesn't require us to have D3D devices initialized in the content process. I
> am open to an interim hack similar to Canvas2D and WebGL, but we should at
> least understand what the long-term plan is. Alex?

It looks like if we're decoding with WMF, and the GPU process is enabled, then we should be doing the decoding in the GPU process, which solves this.

There's still some cases that will decode locally, but it shouldn't be overly hard to require them to decode in the GPU process (or the newly added decoder process (currently AV1 only)).
(Assignee)

Comment 21

3 months ago
(In reply to Sotaro Ikeda [:sotaro] from comment #18)
> We also need to think about WebRTC.

Current gecko seems not use D3DTexture for WebRTC. It used BufferTextureData in SharedPlanarYCbCrImage. From performance point of view, its performance is not good compared to D3DTexture usage. But if we keep it, we do not need to handle this bug's problem.

The following is a related call stack.

> BufferTextureData::CreateForYCbCr()
> layers::TextureClient::CreateForYCbCr()
> layers::YCbCrTextureClientAllocationHelper::Allocate()
> layers::TextureClientRecycleAllocator::CreateOrRecycle()
> layers::SharedPlanarYCbCrImage::Allocate()
> layers::SharedPlanarYCbCrImage::CopyData()
> MediaEngineRemoteVideoSource::DeliverFrame()
> camera::CamerasChild::RecvDeliverFrame()
> camera::PCamerasChild::OnMessageReceived()
(Assignee)

Updated

3 months ago
Attachment #9031397 - Attachment is obsolete: true
(Assignee)

Comment 22

3 months ago
I found additional use case of using D3D11Device. async plugin's DXGISurface uses it.
 https://dxr.mozilla.org/mozilla-central/source/dom/plugins/ipc/PluginInstanceParent.cpp#660

It seems that we could address the problem by similar way to canvas.
Comment hidden (obsolete)
(Assignee)

Updated

3 months ago
Attachment #9031849 - Flags: review?(matt.woodrow)
Not 100% sure I understood the question, but the long-term plan is that all of audio, WebGL, and canvas 2D will be remoted. In fact, they're all hopeful for H1 of 2019 (Canvas 2D, bug 1464032, and audio, bug 1432303 are more or less sure things for then). Was that the info you needed?
Flags: needinfo?(agaynor)
(In reply to Alex Gaynor [:Alex_Gaynor] from comment #25)
> Not 100% sure I understood the question, but the long-term plan is that all
> of audio, WebGL, and canvas 2D will be remoted. In fact, they're all hopeful
> for H1 of 2019 (Canvas 2D, bug 1464032, and audio, bug 1432303 are more or
> less sure things for then). Was that the info you needed?

Yes, combined with comment 20 and comment 21.

The last remaining question here is what the sandboxing plan is for the plugin case in comment 22.
Flags: needinfo?(agaynor)
PluginInstanceParent is flash only, and we have no plans to implement win32k lockdown for flash, given it's impending doom. So there's no plan to make d3d changes there.
Flags: needinfo?(agaynor)
Ooof, I forgot that PluginInstanceParent ran in the content process. Yes, that'll almost certainly need to be removed, but I'm not sure we have a specific plan around it yet.
(In reply to Alex Gaynor [:Alex_Gaynor] from comment #28)
> Ooof, I forgot that PluginInstanceParent ran in the content process. Yes,
> that'll almost certainly need to be removed, but I'm not sure we have a
> specific plan around it yet.

That said, according to [1] Flash will be off by default in 2019, and it's plausible that the manual steps required to enable flash could also disable win32k lockdown. That may not be the ideal option, but it's certainly a decent backup.

It's also worth figuring out whether/how flash actually uses this API, and whether we could just turn it off.

[1] https://developer.mozilla.org/en-US/docs/Plugins/Roadmap
It looks like that's part of async drawing for plugins, and was added relatively recently (2015), so I suspect we can't just remove it. But yes, potentially trying win32k lockdown to flash being disabled is an option, as is doing an additional jump of remoting this.
Attachment #9031849 - Flags: review?(matt.woodrow) → review+

Comment 32

3 months ago
Pushed by sikeda@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9a5ef8fd69c9
Lazily initialize DirectX devices for WebRender r=mattwoodrow

Comment 33

3 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/9a5ef8fd69c9
Status: NEW → RESOLVED
Last Resolved: 3 months ago
status-firefox66: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla66
Depends on: 1530977
Depends on: 1531207
See Also: → bug 1521370
You need to log in before you can comment on or make changes to this bug.