Briefly corrupted 4K video with DXVA on GPUs that don't support 4k

RESOLVED FIXED in Firefox 43

Status

defect
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: acomminos, Assigned: mattwoodrow)

Tracking

unspecified
mozilla43
Unspecified
Windows 8

Firefox Tracking Flags

(firefox41 wontfix, firefox42 wontfix, firefox43 fixed)

Details

Attachments

(4 attachments, 1 obsolete attachment)

Steps to reproduce:
- Visit http://mozvr.github.io/loops/20150729_city_2/index.html
- Watch corrupted video
- Observe things looking correct after a few seconds of playback

Relevant info: DXVAChecker reports that the device supports FHD DXVA, but not 4K.

The issue does not occur with hardware video acceleration disabled.

Application Basics
------------------

Name: Firefox
Version: 42.0a1
Build ID: 20150811030206
Update Channel: nightly
User Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:43.0) Gecko/20100101 Firefox/43.0
Multiprocess Windows: 1/1 (default: true)
Safe Mode: false

Crash Reports for the Last 3 Days
---------------------------------

All Crash Reports

Extensions
----------

Graphics
--------

Adapter Description: Intel(R) HD Graphics 3000
Adapter Drivers: igdumd64 igd10umd64 igd10umd64 igdumd32 igd10umd32 igd10umd32
Adapter RAM: Unknown
Asynchronous Pan/Zoom: wheel input enabled
Device ID: 0x0126
Direct2D Enabled: true
DirectWrite Enabled: true (6.3.9600.17795)
Driver Date: 1-29-2014
Driver Version: 9.17.10.3347
GPU #2 Active: false
GPU Accelerated Windows: 1/1 Direct3D 11 (OMTC)
Subsys ID: 21da17aa
Supports Hardware H264 Decoding: true
Vendor ID: 0x8086
WebGL Renderer: Google Inc. -- ANGLE (Intel(R) HD Graphics 3000 Direct3D11 vs_4_1 ps_4_1)
windowLayerManagerRemote: true
AzureCanvasBackend: direct2d 1.1
AzureContentBackend: direct2d 1.1
AzureFallbackCanvasBackend: cairo
AzureSkiaAccelerated: 0

Important Modified Preferences
------------------------------

browser.cache.disk.capacity: 358400
browser.cache.disk.filesystem_reported: 1
browser.cache.disk.smart_size.first_run: false
browser.cache.frecency_experiment: 4
browser.download.importedFromSqlite: true
browser.places.smartBookmarksVersion: 7
browser.sessionstore.upgradeBackup.latestBuildID: 20150811030206
browser.startup.homepage_override.buildID: 20150811030206
browser.startup.homepage_override.mstone: 43.0a1
dom.apps.reset-permissions: true
dom.mozApps.used: true
extensions.lastAppVersion: 43.0a1
gfx.direct3d.last_used_feature_level_idx: 0
gfx.driver-init.appVersion: 43.0a1
gfx.driver-init.deviceID: 0x0126
gfx.driver-init.driverVersion: 9.17.10.3347
gfx.driver-init.feature-d2d: true
gfx.driver-init.feature-d3d11: true
gfx.driver-init.status: 2
media.gmp-eme-adobe.lastUpdate: 1439321866
media.gmp-eme-adobe.version: 12
media.gmp-gmpopenh264.lastUpdate: 1439321866
media.gmp-gmpopenh264.version: 1.4
media.gmp-manager.buildID: 20150811030206
media.gmp-manager.lastCheck: 1439321999
media.hardware-video-decoding.failed: false
network.cookie.prefsMigrated: true
network.predictor.cleaned-up: true
places.history.expiration.transient_current_max_pages: 104858
plugin.disable_full_page_plugin_for_types: application/pdf
plugin.importedState: true
privacy.sanitize.migrateFx3Prefs: true
security.sandbox.content.tempDirSuffix: {9b4bf288-f3d8-48e3-8f66-b96491e51ab0}

Important Locked Preferences
----------------------------

JavaScript
----------

Incremental GC: true

Accessibility
-------------

Activated: false
Prevent Accessibility: 0

Library Versions
----------------

NSPR
Expected minimum version: 4.10.9 Beta
Version in use: 4.10.9 Beta

NSS
Expected minimum version: 3.19.3 Basic ECC
Version in use: 3.19.3 Basic ECC

NSSSMIME
Expected minimum version: 3.19.3 Basic ECC
Version in use: 3.19.3 Basic ECC

NSSSSL
Expected minimum version: 3.19.3 Basic ECC
Version in use: 3.19.3 Basic ECC

NSSUTIL
Expected minimum version: 3.19.3
Version in use: 3.19.3

Experimental Features
---------------------
This happens when the video card doesn't support hardware decoding of the resolution, and the H264 MFT we are using falls back to software internally.

I've been reverse engineering the MFT (provided by msmpeg2vdec.dll) using apitrace to figure out how it decides the capabilities of the GPU and what it does in response to that.

It appears that IDirectXVideoDecoderService::CreateVideoDecoder [1] fails to create a decoder when given resolutions that the GPU/driver doesn't support.

It then falls back to decoding in software, using IDirect3DSurface9::LockRect [2] + memcpy to upload the software decoding results into the d3d9 textures that we are expecting as output.

I currently have no idea why this results in corruption, because disabling DXVA, asking the MFT to output memory buffers and doing the upload ourselves works fine.

We should be able to work around this by calling CreateVideoDecoder ourselves and only asking the MFT for DXVA if we know it'll actually work (and won't silently be using software decoding).

This should fix the corruption for this card (and also allow us to remove the AMD specific block on DXVA+4k), and improve performance since we can upload directly into a shared texture instead of the non-shareable ones that the MFT allocates.

It also looks like we could implement the entire MFT ourselves and drop the msmpeg2vdec dependency, but that's a fair amount of work (we'd need to handle software decoders as well as DXVA), so I'm not in a rush to do so unless there's a real need.

[1] https://msdn.microsoft.com/en-us/library/windows/desktop/ms696175%28v=vs.85%29.aspx
[2] https://msdn.microsoft.com/en-us/library/windows/desktop/bb205896%28v=vs.85%29.aspx
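
A minimal sketch of the proposed workaround, written in portable C++ so it runs without the Windows SDK. The probe stands in for a real call such as IDirectXVideoDecoderService::CreateVideoDecoder; VideoDesc, DecoderProbe, ChooseDecodePath, and FhdOnlyProbe are hypothetical names used for illustration and are not taken from the patch.

```cpp
#include <cstdint>
#include <functional>

// Hypothetical description of the stream we are about to decode.
struct VideoDesc {
  uint32_t width;
  uint32_t height;
};

enum class DecodePath { Hardware, Software };

// Stand-in for a call such as IDirectXVideoDecoderService::CreateVideoDecoder:
// returns true when the GPU/driver can create a decoder for this description.
using DecoderProbe = std::function<bool(const VideoDesc&)>;

// Ask for DXVA only when the probe succeeds; otherwise pick software
// explicitly instead of letting the MFT fall back silently.
inline DecodePath ChooseDecodePath(const VideoDesc& desc,
                                   const DecoderProbe& probe) {
  return (probe && probe(desc)) ? DecodePath::Hardware : DecodePath::Software;
}

// Example probe modelling a GPU limited to FHD content. The 1088 limit
// allows for 16-pixel macroblock alignment of 1080-line video.
inline bool FhdOnlyProbe(const VideoDesc& desc) {
  return desc.width <= 1920 && desc.height <= 1088;
}
```

With this model, a 3840x2160 stream on the FHD-only probe selects the software path up front, which is the behaviour the workaround aims for.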
Assignee: nobody → matt.woodrow
Added apitrace dump of this happening.

The main lines of interest are:

121 IDirectXVideoDecoderService::CreateVideoDecoder returning E_FAIL

159-168 IDirectXVideoDecoderService::CreateSurface calls with DXVA2_VideoSoftwareRenderTarget

334-336 IDirect3DSurface9::LockRect, memcpy,  IDirect3DSurface9::UnlockRect
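
For readers unfamiliar with the upload seen in lines 334-336: LockRect hands back a base pointer and a pitch (bytes per row), and rows are usually copied one at a time because the pitch can be larger than the visible row width. The sketch below models that copy in portable C++. CopyPlane is a hypothetical helper, not code from the MFT or from Firefox, and it is not meant to explain the corruption, whose cause is not identified in this bug.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `rows` rows of `rowBytes` bytes each from a source whose rows are
// `srcPitch` bytes apart into a destination whose rows are `dstPitch` bytes
// apart. Returns false if the arguments cannot describe a valid copy.
inline bool CopyPlane(uint8_t* dst, size_t dstPitch,
                      const uint8_t* src, size_t srcPitch,
                      size_t rowBytes, size_t rows) {
  if (!dst || !src || rowBytes > dstPitch || rowBytes > srcPitch) {
    return false;
  }
  for (size_t y = 0; y < rows; ++y) {
    // Only the visible part of each row is copied; padding is left untouched.
    std::memcpy(dst + y * dstPitch, src + y * srcPitch, rowBytes);
  }
  return true;
}
```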
Attachment #8647618 - Flags: review?(jyavenard)
Attachment #8647618 - Flags: review?(cpearce)
Summary: Briefly corrupted 4K video with DXVA enabled → Briefly corrupted 4K video with DXVA on GPUs that don't support 4k
Comment on attachment 8647618 [details] [diff] [review]
Fallback to software if hardware isn't supported

Review of attachment 8647618 [details] [diff] [review]:
-----------------------------------------------------------------

Nice job reverse engineering the MFT!
Attachment #8647618 - Flags: review?(cpearce) → review+
Comment on attachment 8647618 [details] [diff] [review]
Fallback to software if hardware isn't supported

Review of attachment 8647618 [details] [diff] [review]:
-----------------------------------------------------------------

LGTM

I think there are issues that could occur when there's a resolution change.
However, this is likely minor because it physically can't happen with the new MSE. The decoder should never see a change of resolution anymore, as whenever a resolution change occurs, the current video decoder gets shut down and re-created.

Maybe you could simplify the entire thing by first removing support for the shareddecodermanager (which is unused), and then only worrying about the first data coming in.

I also have the feeling that this will break proper handling of hardware-accelerated VP8/VP9, as the entire DXVA configuration here assumes we're using h264, which isn't always the case.

::: dom/media/platforms/wmf/WMFVideoMFTManager.cpp
@@ +314,5 @@
> +    return false;
> +  }
> +
> +  if (mDXVA2Manager->SupportsConfig(aType)) {
> +    if (!mUseHwAccel) {

I'm not sure I understand this.
Why not always test for DXVA support, ignoring the current value of mUseHwAccel?

The aim is to enable it if possible anyway.

@@ +326,5 @@
> +    }
> +  } else if (mUseHwAccel) {
> +    // DXVA enabled, and not supported for this resolution
> +    HRESULT hr = mDecoder->SendMFTMessage(MFT_MESSAGE_SET_D3D_MANAGER, 0);
> +    NS_ASSERTION(SUCCEEDED(hr), "Attempting to fall back to software failed?");

Shouldn't this be a fatal error?
We've been unable to use either DXVA or software, so what can the decoder do?

@@ +352,5 @@
> +
> +    HRESULT hr = mDecoder->GetOutputMediaType(mediaType);
> +    NS_ENSURE_TRUE(SUCCEEDED(hr), hr);
> +
> +    mDecoder->Input(mLastInput);

If the resolution change failed, what happened to all the frames currently buffered and not yet output? Wouldn't they have been discarded?

Shouldn't the decoder be drained first to ensure we don't lose a big chunk of frames? The WMF decoder typically buffers over 25 frames on Win7 (which seems to be the entire GOP) and 10 frames on Win8 now that the low-latency setting is on.
Attachment #8647618 - Flags: review?(jyavenard) → review+
(In reply to Jean-Yves Avenard [:jya] from comment #6)
 
> I think there are issues that could occur when there's a resolution change.
> However, this is likely minor because it physically can't happen with the
> new MSE. The decoder should never see a change of resolution anymore as
> whenever a resolution change occurs, the current video decoders gets
> shutdown and re-created.
> 
> Maybe you could simplify the entire thing by first removing support for the
> shareddecodermanager (which is unused) ; and then only worry about the first
> data coming in.

The main problem is that we don't have an IMFMediaType* until we've submitted the first sample.

We could try to generate one, but it's hard since you need to know the framerate, among other things.

> 
> I also have the feeling that this will break properly handling of the
> VP8/VP9 hardware accelerated as the entire DXVA configuration here assumes
> we're using h264 ; which isn't always the case.

That's true. It'd be nice to figure out which decoder devices the MFTs use for a given video format, but we don't have any way (that I can think of) of knowing that without testing it and apitracing.

Does VP8/9 acceleration actually work anywhere?

I'll just restrict this test to h264 for now.

> 
> ::: dom/media/platforms/wmf/WMFVideoMFTManager.cpp
> @@ +314,5 @@
> > +    return false;
> > +  }
> > +
> > +  if (mDXVA2Manager->SupportsConfig(aType)) {
> > +    if (!mUseHwAccel) {
> 
> I'm not sure I understand this.
> why not always test for dxva support, ignoring the current value of
> mUseHwAccel
> 
> as the aim is to enable it if possible anyway.

If the MFT is already DXVA-enabled, and DXVA supports the current config, then there's nothing to do. We only need to try turning it on if it wasn't on already (4k -> lower res transition).

That said, if we never actually transition resolutions any more, then this path won't be hit.
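
The decision described above can be summarised as a small pure function. This is a sketch for illustration only: DxvaAction and NextDxvaAction are hypothetical names, and the two booleans stand in for mUseHwAccel and the result of mDXVA2Manager->SupportsConfig(aType) from the patch excerpt.

```cpp
// Possible outcomes when a new media type arrives.
enum class DxvaAction { None, EnableDxva, FallBackToSoftware };

// usingHwAccel:    whether the MFT is currently DXVA-enabled.
// configSupported: whether DXVA supports the new configuration.
inline DxvaAction NextDxvaAction(bool usingHwAccel, bool configSupported) {
  if (configSupported) {
    // Already on and supported: nothing to do. Off and supported: try to
    // turn it on (for example after a 4k -> lower resolution transition).
    return usingHwAccel ? DxvaAction::None : DxvaAction::EnableDxva;
  }
  // On but unsupported: switch to software explicitly. Off and unsupported:
  // we are already decoding in software.
  return usingHwAccel ? DxvaAction::FallBackToSoftware : DxvaAction::None;
}
```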

> 
> @@ +326,5 @@
> > +    }
> > +  } else if (mUseHwAccel) {
> > +    // DXVA enabled, and not supported for this resolution
> > +    HRESULT hr = mDecoder->SendMFTMessage(MFT_MESSAGE_SET_D3D_MANAGER, 0);
> > +    NS_ASSERTION(SUCCEEDED(hr), "Attempting to fall back to software failed?");
> 
> shouldn't this be a fatal error ?
> So we've been unable to use dxva nor software. what can the decoder do ?

Yeah, I guess; we're pretty stuck if this happens.

> 
> @@ +352,5 @@
> > +
> > +    HRESULT hr = mDecoder->GetOutputMediaType(mediaType);
> > +    NS_ENSURE_TRUE(SUCCEEDED(hr), hr);
> > +
> > +    mDecoder->Input(mLastInput);
> 
> if the resolution change failed; what happened to all the frames currently
> buffered and not output yet. Wouldn't they have been discarded ?
> 
> Shouldn't the decoder be drained first to ensure we don't lose a big chunk
> of frames (the WMF decoder typically buffers over 25 frames in win7 (seem to
> be the entire GOP) and 10 frames in win8 now that the low latency settings
> is on)

I'm really not sure, there's no documentation explaining how this is supposed to behave.

We only ever get this happening with a single sample (the initial one), so I can't easily test what would happen.
VP8/VP9 HW acceleration is supposed to work on Broadwell-based Intel GPUs. You need to enable a pref.

The HP laptop I was given at Whistler supports it; I'll need to try it.
(In reply to Matt Woodrow (:mattwoodrow) from comment #7)
> (In reply to Jean-Yves Avenard [:jya] from comment #6)
>  

> 
> The main problem is that we don't have an IMFMediaType* until we've
> submitted the first sample.
> 
> We could try generate one, but it's hard since you need to know framerate
> among other things.

The frame rate is found in the demuxer, not in the h264 data.
Before feeding the data to the decoder, we parse the NALs with the H264Converter. That does the same job as what WMF does internally to detect a change of resolution or aspect ratio.

That means that the VideoConfig the decoder receives is now always going to match what the WMF decoder is ultimately going to find.

We don't need to rely on WMF parsing the first sample and signaling the change of format, and then read the new size back from the decoder.

All of this could be simplified if we generated everything ourselves.
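
As a rough illustration of generating the media type values ourselves: Media Foundation stores frame size and frame rate as two 32-bit values packed into one 64-bit attribute, with the first value in the high half (the layout that MFSetAttributeSize and MFSetAttributeRatio produce). The sketch below is portable C++ and assumes a hypothetical DemuxerVideoInfo struct; PackPair, PackedFrameSize, and PackedFrameRate are likewise illustrative names, not Gecko or Media Foundation APIs.

```cpp
#include <cstdint>

// Hypothetical subset of what the demuxer knows before any sample is decoded.
struct DemuxerVideoInfo {
  uint32_t width;
  uint32_t height;
  uint32_t fpsNumerator;
  uint32_t fpsDenominator;
};

// Packs two 32-bit values into one 64-bit value, first value in the high half.
constexpr uint64_t PackPair(uint32_t high, uint32_t low) {
  return (static_cast<uint64_t>(high) << 32) | low;
}

// Value one would store under MF_MT_FRAME_SIZE (width high, height low).
constexpr uint64_t PackedFrameSize(const DemuxerVideoInfo& info) {
  return PackPair(info.width, info.height);
}

// Value one would store under MF_MT_FRAME_RATE (numerator high, denominator low).
constexpr uint64_t PackedFrameRate(const DemuxerVideoInfo& info) {
  return PackPair(info.fpsNumerator, info.fpsDenominator);
}
```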
https://hg.mozilla.org/mozilla-central/rev/05c65951f21d
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Blocks: 1196409
Comment on attachment 8647618 [details] [diff] [review]
Fallback to software if hardware isn't supported

Approval Request Comment
[Feature/regressing bug #]: MSE
[User impact if declined]: Broken 4k video playback on cards (excluding AMD) that don't support that in hardware (fairly rare).
[Describe test coverage new/current, TreeHerder]: Tested manually.
[Risks and why]: Low risk, just detects decoder support and reverts to software decoding.
[String/UUID change made/needed]: None
Attachment #8647618 - Flags: approval-mozilla-beta?
Attachment #8647618 - Flags: approval-mozilla-aurora?
Comment on attachment 8647618 [details] [diff] [review]
Fallback to software if hardware isn't supported

This fix has stabilized on Nightly for a week, let's uplift to Aurora and Beta.
Attachment #8647618 - Flags: approval-mozilla-beta?
Attachment #8647618 - Flags: approval-mozilla-beta+
Attachment #8647618 - Flags: approval-mozilla-aurora?
Attachment #8647618 - Flags: approval-mozilla-aurora+
Needs rebasing for Aurora/Beta uplift.
Flags: needinfo?(matt.woodrow)
Posted patch: Aurora patch
Flags: needinfo?(matt.woodrow)
Needs rebasing for Beta too.
Flags: needinfo?(matt.woodrow)
Posted patch: Beta patch (obsolete)
Flags: needinfo?(matt.woodrow)
Flags: needinfo?(matt.woodrow)
Posted patch: Beta patch v2
https://treeherder.mozilla.org/#/jobs?repo=try&revision=0466b0351229
Attachment #8655609 - Attachment is obsolete: true
Flags: needinfo?(matt.woodrow)
Depends on: 1202296
Attachment #8647618 - Flags: approval-mozilla-beta+
Attachment #8647618 - Flags: approval-mozilla-aurora+
Hindsight maybe, but I think that when we have to back out a graphics bug from beta but aren't seeing crashes reported on nightly, we can't assume things are fine on nightly.