Closed Bug 1542881 (win32k-decoder-uploads) Opened 6 years ago Closed 4 years ago

Accelerated surface backed image classes trigger win32k.sys API calls

Categories: Core :: Graphics: Layers, task, P3

RESOLVED FIXED
Tracking Status
firefox68 --- affected

People: Reporter: jimm, Unassigned

These image types often write data directly to, or read data back from, surfaces managed by the underlying GPU or accelerated rendering libraries.

We'll need some guidance from the graphics team on how to address these classes. This probably involves some sort of remoting work to push these device calls over to the GPU process.

Some examples:
https://searchfox.org/mozilla-central/source/gfx/layers/D3D11YCbCrImage.h
https://searchfox.org/mozilla-central/source/gfx/layers/D3D9SurfaceImage.h
https://searchfox.org/mozilla-central/source/gfx/layers/GPUVideoImage.h
https://searchfox.org/mozilla-central/source/gfx/layers/IMFYCbCrImage.h

Types:
https://searchfox.org/mozilla-central/source/gfx/layers/ImageTypes.h

Typical stack when in use (bug 1541029):
win32u!NtGdiDdDDIUnlock
win32u!NtGdiDdDDIUnlock
d3d11!CallAndLogImpl<long
d3d11!NDXGI::CDevice::UnlockCB
vm3dum64_10
vm3dum64_10
vm3dum64_10
d3d11!NDXGI::CDevice::Flush
d3d11!NDXGI::CDevice::ResolveSharedResourceImpl
d3d11!NDXGI::CDevice::DXGIReleaseSync
xul!mozilla::layers::AutoLockD3D11Texture::~AutoLockD3D11Texture
xul!mozilla::layers::D3D11YCbCrImage::SetData
xul!mozilla::VideoData::CreateAndCopyData
xul!mozilla::FFmpegVideoDecoder<46465650>::CreateImage
xul!mozilla::FFmpegVideoDecoder<46465650>::DoDecode
xul!mozilla::FFmpegDataDecoder<46465650>::DoDecode

Type: defect → task
Priority: -- → P3

These are for upload of decoded media/video frames.

What we'll need to do is get the decoded frame data into shmem (ideally by directing our decoders to decode directly into shmems we provide), then share that shmem with the GPU process just long enough to upload it into a GPU resource.

Relatedly, we should probably move all the decoders into the RDD process alongside AV1.

Alias: win32k-decoder-uploads

Today:

  • AV1 is sandboxed in the RDD
  • Other SW decoders are in the content process
  • Content and GPU processes have GPU access for uploads

There were four options:

Upload from RDD directly:

  • Move all SW decoding to RDD
  • Allow GPU access in the RDD and upload directly from there
  • This would make AV1 containment less secure than today, particularly if we can't allow just-texture-uploads selectively through the sandbox
  • This would keep the efficient upload paths we have today, while allowing the lockdown of the content process

Decode in GPU process:

  • Move all (non-RDD/av1?) SW decoders into the GPU process
    ** This doesn't seem less safe than what we ship today
    ** But we probably want to move SW decoders into RDD anyway
  • This would keep the efficient upload paths we have today, while allowing the lockdown of the content process

Upload from shmem:

  • Move SW decoders to RDD
  • Move decoded data via shmem shared directly between the RDD and GPU process
  • Upload in the GPU process directly from shmem data
  • This would allow lockdown of both RDD and Content process
  • This would perform worse than today, and isn't viable for decoding large resolutions because of the extra copy
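
To put a rough number on the extra copy, here is a back-of-the-envelope sketch. It assumes 8-bit 4:2:0 (I420) output and illustrative frame rates; neither figure comes from this bug, and the helper names are made up for the example. It only counts the bytes that would have to be copied from the decoder's own buffer into shmem each second.

```cpp
#include <cstdint>

// Bytes in one 8-bit I420 frame: a full-resolution luma plane plus two
// chroma planes subsampled by 2 in each dimension (rounded up).
// Assumption: tightly packed planes, no row padding.
constexpr std::uint64_t I420FrameBytes(std::uint64_t width,
                                       std::uint64_t height) {
  return width * height + 2 * ((width + 1) / 2) * ((height + 1) / 2);
}

// Bytes per second that one additional full-frame copy has to move.
constexpr std::uint64_t ExtraCopyBytesPerSecond(std::uint64_t width,
                                                std::uint64_t height,
                                                std::uint64_t fps) {
  return I420FrameBytes(width, height) * fps;
}

// 1080p at 30 fps: about 93 MB/s of extra copying.
static_assert(ExtraCopyBytesPerSecond(1920, 1080, 30) == 93312000ULL,
              "1080p30 estimate");
// 2160p at 60 fps: about 746 MB/s of extra copying.
static_assert(ExtraCopyBytesPerSecond(3840, 2160, 60) == 746496000ULL,
              "2160p60 estimate");
```

Each copied byte is both read and written, so the memory traffic is roughly twice these figures.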

Decode into shmem:

  • Same as "Upload from shmem", but also:
  • Adapt SW decoders to decode directly into shmem.
    ** This will need some amount of glue code and/or minor API changes for the SW decoders
    ** This can be done concurrently to other work, and per-decoder
  • Shmem lock-while-upload should be short-lived, so we shouldn't run into decoder-frame-recycling issues
    ** We already have good cross-process lifetime primitives in PTexture (TextureClient/TextureHost)
    ** We also have shmem versions of these
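
A minimal sketch of the buffer lifetime described above, assuming a small pool of frame buffers that the decoder writes into directly and that are released as soon as the upload finishes. Everything here is hypothetical: `FramePool`, `DecodeInto`, and `UploadAndRelease` are illustration-only names, a `std::vector` stands in for a shmem segment, and none of this is the PTexture/TextureClient/TextureHost API.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in for one shared-memory segment (assumption: a real one would be
// mapped into both the decoding process and the GPU process).
struct FrameBuffer {
  std::vector<std::uint8_t> bytes;
  bool inUse = false;  // true from acquisition until the upload completes
};

class FramePool {
 public:
  FramePool(std::size_t count, std::size_t bytesPerFrame) : mBuffers(count) {
    for (auto& buffer : mBuffers) buffer.bytes.resize(bytesPerFrame);
  }

  // Finds a free buffer and marks it in use. Returns false when every
  // buffer is still waiting on an upload.
  bool Acquire(std::size_t& outIndex) {
    for (std::size_t i = 0; i < mBuffers.size(); ++i) {
      if (!mBuffers[i].inUse) {
        mBuffers[i].inUse = true;
        outIndex = i;
        return true;
      }
    }
    return false;
  }

  void Release(std::size_t index) {
    assert(mBuffers.at(index).inUse);
    mBuffers[index].inUse = false;
  }

  FrameBuffer& At(std::size_t index) { return mBuffers.at(index); }

 private:
  std::vector<FrameBuffer> mBuffers;
};

// Hypothetical decoder entry point: writes its output straight into
// caller-provided memory instead of an internal frame buffer.
inline void DecodeInto(std::uint8_t* dst, std::size_t size,
                       std::uint8_t fill) {
  for (std::size_t i = 0; i < size; ++i) dst[i] = fill;
}

// Hypothetical upload step: reads the shared buffer into a "GPU resource"
// (a plain vector here) and releases the buffer immediately afterwards.
inline void UploadAndRelease(FramePool& pool, std::size_t index,
                             std::vector<std::uint8_t>& gpuResource) {
  const FrameBuffer& buffer = pool.At(index);
  gpuResource.assign(buffer.bytes.begin(), buffer.bytes.end());
  pool.Release(index);
}
```

The decoder only stalls when `Acquire` fails. If the upload is short, as the list above expects, buffers return to the pool before the decoder needs them again.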

So! I think we can manage "Decode into shmem", but it'll need a couple of pieces of work. It will, however, get us to a decode+upload path that is both secure and efficient.

I talked to tdaede about the software decoders we ship, and it sounds like av1, vp8, and vp9 should all be able to decode into a surface we provide (though we'll have to match alignment and stride requirements, of course).
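
As a sketch of what matching alignment and stride requirements could look like, the helper below computes plane offsets and strides for an 8-bit I420 frame inside one caller-provided buffer. The I420 format, the power-of-two alignment, and the names `PlaneLayout`, `I420Layout`, and `ComputeI420Layout` are all assumptions for illustration. Each decoder's real requirements would have to be checked against its own documentation.

```cpp
#include <cassert>
#include <cstddef>

// Where one plane lives inside the provided buffer.
struct PlaneLayout {
  std::size_t offset;  // byte offset of the first row from the buffer start
  std::size_t stride;  // bytes per row, including padding
  std::size_t rows;    // number of rows in the plane
};

struct I420Layout {
  PlaneLayout y;
  PlaneLayout cb;
  PlaneLayout cr;
  std::size_t totalBytes;  // minimum buffer size for the whole frame
};

// Rounds value up to the next multiple of alignment (a power of two).
constexpr std::size_t AlignUp(std::size_t value, std::size_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// Assumption: every row stride and every plane start must be a multiple of
// the same alignment value.
inline I420Layout ComputeI420Layout(std::size_t width, std::size_t height,
                                    std::size_t alignment) {
  assert(width > 0 && height > 0);
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);

  const std::size_t chromaWidth = (width + 1) / 2;
  const std::size_t chromaHeight = (height + 1) / 2;

  I420Layout layout{};
  layout.y = {0, AlignUp(width, alignment), height};

  const std::size_t yEnd = layout.y.stride * layout.y.rows;
  layout.cb = {AlignUp(yEnd, alignment), AlignUp(chromaWidth, alignment),
               chromaHeight};

  const std::size_t cbEnd = layout.cb.offset + layout.cb.stride * layout.cb.rows;
  layout.cr = {AlignUp(cbEnd, alignment), layout.cb.stride, chromaHeight};

  layout.totalBytes = layout.cr.offset + layout.cr.stride * layout.cr.rows;
  return layout;
}
```

For example, `ComputeI420Layout(1920, 1080, 64)` needs no padding and yields a 3110400-byte buffer. A 1001x3 frame with 32-byte alignment pads the luma stride to 1024 and the chroma stride to 512.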

See Also: → 1539043

I filed a few dependent bugs against bug 1539043 (the media side performance bug for this work) for the specific action items to implement this.

Depends on: 1595994

I think this was fixed by bug 1595994.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED