Closed
Bug 1409176
Opened 8 years ago
Closed 8 years ago
Crash in mozilla::layers::SyncObjectD3D11Client::Init
Categories
(Core :: Graphics, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla58
| Tracking | Status | |
|---|---|---|
| firefox-esr52 | --- | unaffected |
| firefox56 | --- | unaffected |
| firefox57 | --- | wontfix |
| firefox58 | --- | fixed |
People
(Reporter: philipp, Assigned: jerry)
References
(Blocks 2 open bugs)
Details
(Keywords: crash, regression, Whiteboard: [gfx-noted])
Crash Data
Attachments
(2 files, 3 obsolete files)
|
7.49 KB,
patch
|
jerry
:
review+
|
Details | Diff | Splinter Review |
|
3.80 KB,
patch
|
jerry
:
review+
|
Details | Diff | Splinter Review |
This bug was filed from the Socorro interface and is
report bp-2f803393-bf2b-449a-af89-bbd1b0171016.
=============================================================
Crashing Thread (95), Name: MediaPDecoder #6
Frame Module Signature Source
0 xul.dll mozilla::layers::SyncObjectD3D11Client::Init() gfx/layers/d3d11/TextureD3D11.cpp:1739
1 xul.dll mozilla::layers::SyncObjectD3D11Client::Synchronize() gfx/layers/d3d11/TextureD3D11.cpp:1777
2 xul.dll mozilla::D3D11DXVA2Manager::CopyToImage(IMFSample*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::Image**) dom/media/platforms/wmf/DXVA2Manager.cpp:990
3 xul.dll mozilla::WMFVideoMFTManager::CreateD3DVideoFrame(IMFSample*, __int64, mozilla::VideoData**) dom/media/platforms/wmf/WMFVideoMFTManager.cpp:1074
4 xul.dll mozilla::WMFVideoMFTManager::Output(__int64, RefPtr<mozilla::MediaData>&) dom/media/platforms/wmf/WMFVideoMFTManager.cpp:1204
5 xul.dll mozilla::WMFMediaDataDecoder::ProcessOutput(nsTArray<RefPtr<mozilla::MediaData> >&) dom/media/platforms/wmf/WMFMediaDataDecoder.cpp:154
6 xul.dll mozilla::WMFMediaDataDecoder::ProcessDecode(mozilla::MediaRawData*) dom/media/platforms/wmf/WMFMediaDataDecoder.cpp:142
7 xul.dll mozilla::detail::ProxyRunnable<mozilla::MozPromise<nsTArray<RefPtr<mozilla::MediaData> >, mozilla::MediaResult, 1>, RefPtr<mozilla::MozPromise<nsTArray<RefPtr<mozilla::MediaData> >, mozilla::MediaResult, 1> > ( mozilla::TheoraDecoder::*)(mozilla::MediaRawData*), mozilla::TheoraDecoder, mozilla::MediaRawData*>::Run() xpcom/threads/MozPromise.h:1395
8 xul.dll mozilla::TaskQueue::Runner::Run() xpcom/threads/TaskQueue.cpp:246
9 xul.dll nsThreadPool::Run() xpcom/threads/nsThreadPool.cpp:226
10 xul.dll nsThread::ProcessNextEvent(bool, bool*) xpcom/threads/nsThread.cpp:1039
11 xul.dll NS_ProcessNextEvent(nsIThread*, bool) xpcom/threads/nsThreadUtils.cpp:521
12 xul.dll mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) ipc/glue/MessagePump.cpp:338
13 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc:319
14 xul.dll MessageLoop::Run() ipc/chromium/src/base/message_loop.cc:299
15 xul.dll nsThread::ThreadFunc(void*) xpcom/threads/nsThread.cpp:427
16 nss3.dll PR_NativeRunThread nsprpub/pr/src/threads/combined/pruthr.c:397
17 nss3.dll pr_root nsprpub/pr/src/md/windows/w95thred.c:95
18 ucrtbase.dll thread_start<unsigned int (__cdecl*)(void* __ptr64)>
19 kernel32.dll BaseThreadInitThunk
20 ntdll.dll RtlUserThreadStart
this low-level volume crash signature is regressing in firefox 57 on windows - most of them with MOZ_CRASH(GFX: Cannot get D3D11 KeyedMutex). perhaps related to bug 1357299?
Blake - can you find someone to take a look at this? We have 5 reports in the last week on Windows 10.
Flags: needinfo?(ajones) → needinfo?(bwu)
Comment 3•8 years ago
|
||
Peter,
Thanks for your asking on slack. We do need your help on this bug. :-)
Please feel free to ping me back if you find the root cause is in decoding pipeline.
Flags: needinfo?(bwu) → needinfo?(howareyou322)
Updated•8 years ago
|
Comment 4•8 years ago
|
||
Found the following error on the crash report and the corresponding error mapping from MSDN.
It looks like the device reset happened on user's environment and it was unable to query the KeyedMutex.
Jerry, it looks like the video didn't notice the device reset and caused this crash. I think it is worth to expose gfx's D3D11 wrapper API for video which involves the device reset checking logic. How do you think?
https://crash-stats.mozilla.com/report/index/5ef9ecc3-a188-4157-bcef-71b820171024#tab-metadata
[402][GFX1-]: Failed to OpenSharedResource for SyncObjectD3D11: 0x80070057 (t=9131.96) |
[403][GFX1-]: Failed to OpenSharedResource for SyncObjectD3D11: 0x80070057 (t=9131.96) |
[404][GFX1]: Failed to get KeyedMutex (2): 0x80004002 (t=9132.25) |
[405][GFX1-]: (gfxWindowsPlatform) Detected device reset: 7 (t=9132.33)
E_INVALIDARG One or more arguments are not valid 0x80070057
E_NOINTERFACE No such interface supported 0x80004002
Flags: needinfo?(howareyou322) → needinfo?(hshih)
| Assignee | ||
Comment 5•8 years ago
|
||
I would like to forward the error to decoder instead of using the gfx_critical_error(). The decoder has the mechanism to recreate itself for some error. We could set the corresponding error code in decoder to create a new decoder. Then, the decoder could get the latest d3d device which is set by our device-reset flow.
Assignee: nobody → hshih
Status: NEW → ASSIGNED
Flags: needinfo?(hshih)
| Assignee | ||
Comment 6•8 years ago
|
||
Currently, the device-reset flow doesn't notify the decoder for device change
immediately. The decoder might use an invalid sync-object for synchronization.
Then, we will hit some assertion.
This patch try to make the synchronization flow become fallible, then pass
the error code to the media framework.
MozReview-Commit-ID: ELDFV75Z3Ff
| Assignee | ||
Comment 7•8 years ago
|
||
Currently, the device-reset flow doesn't notify the decoder for device change
immediately. The decoder might use an invalid sync-object for synchronization.
Then, we will hit some assertions.
This patch try to make the synchronization flow become fallible, then we could
pass the error to the media framework for decoder recreation.
MozReview-Commit-ID: BFY32MmOdt0
Attachment #8924498 -
Flags: review?(dvander)
| Assignee | ||
Updated•8 years ago
|
Attachment #8924139 -
Attachment is obsolete: true
| Assignee | ||
Comment 8•8 years ago
|
||
The DXVA decoder might hit some error because of the hardware device status.
This patch try to pass the error code to the decoder promise object, then
try to recreate a new decoder for DXGI_ERROR_DEVICE_RESET error.
MozReview-Commit-ID: IAj8gzKGF3j
Attachment #8924499 -
Flags: review?(jyavenard)
Comment 9•8 years ago
|
||
Comment on attachment 8924498 [details] [diff] [review]
P1: make SyncObjectD3D11Client become fallible.
Review of attachment 8924498 [details] [diff] [review]:
-----------------------------------------------------------------
::: gfx/layers/SyncObject.h
@@ +60,5 @@
> };
>
> virtual SyncType GetSyncType() = 0;
>
> + // Return false for synchronization failed.
for failed synchronization
Comment 10•8 years ago
|
||
Comment on attachment 8924499 [details] [diff] [review]
P2: handle decoder error in WMFMediaDataDecoder::ProcessDecode().
Review of attachment 8924499 [details] [diff] [review]:
-----------------------------------------------------------------
LGTM.
However I'm concerned with the potential unlimited re-creation of decoder should the problem always occur.
If you think it doesn't matter than sure...
Asking Matt for his opinion on the matter.
::: dom/media/platforms/wmf/DXVA2Manager.cpp
@@ +928,5 @@
> RefPtr<TextureClient> client = image->GetTextureClient(ImageBridgeChild::GetSingleton().get());
> NS_ENSURE_TRUE(client, E_FAIL);
>
> RefPtr<IDXGIKeyedMutex> mutex;
> + HRESULT hr = S_OK;
not technically required as it will be set in AutoTextureLock, but I guess it's clearer that way
@@ +987,5 @@
> // It appears some race-condition may allow us to arrive here even when mSyncObject
> // is null. It's better to avoid that crash.
> client->SyncWithObject(mSyncObject);
> + if (!mSyncObject->Synchronize(true)) {
> + return DXGI_ERROR_DEVICE_RESET;
is a 10s timeout value always related to a device reset?
10s sounds like a damn long time, could it be possible to lower the value?
::: dom/media/platforms/wmf/WMFMediaDataDecoder.cpp
@@ +119,2 @@
> return DecodePromise::CreateAndReject(
> + MediaResult((aError == DXGI_ERROR_DEVICE_RESET) ? NS_ERROR_DOM_MEDIA_NEED_NEW_DECODER
Does the decoder need to be re-created?
when a decode error is returned, it will automatically retry up to 3 times again, before it aborts and no longer attempt to decode.
When returning NEED_NEW_DECODER however, there's no limit to how many times it will happen.
So this could potentially cause a regression in that the video won't play, yet no error is ever returned to the user, it will continue to try again and again, re-creating a new decoder each time.
Attachment #8924499 -
Flags: review?(matt.woodrow)
Attachment #8924499 -
Flags: review?(jyavenard)
Attachment #8924499 -
Flags: review+
Comment on attachment 8924498 [details] [diff] [review]
P1: make SyncObjectD3D11Client become fallible.
Review of attachment 8924498 [details] [diff] [review]:
-----------------------------------------------------------------
::: gfx/layers/d3d11/TextureD3D11.cpp
@@ +1802,5 @@
> }
>
> RefPtr<ID3D11DeviceContext> ctx;
> dev->GetImmediateContext(getter_AddRefs(ctx));
> + if (ctx) {
I don't think GetImmediateContext can give you a nullptr.
Attachment #8924498 -
Flags: review?(dvander) → review+
Comment 12•8 years ago
|
||
Comment on attachment 8924499 [details] [diff] [review]
P2: handle decoder error in WMFMediaDataDecoder::ProcessDecode().
Review of attachment 8924499 [details] [diff] [review]:
-----------------------------------------------------------------
I agree with jya, we already have retrying for normal errors, we shouldn't use NS_ERROR_DOM_MEDIA_NEED_NEW_DECODER unless we have something else that will prevent it from retrying forever.
The GPU process decoder uses this because we have code in other places to disable the GPU process if it fails too much, so we don't need the media code to add limits for us.
r=me with that changed.
Attachment #8924499 -
Flags: review?(matt.woodrow) → review+
| Assignee | ||
Comment 13•8 years ago
|
||
remove the GetImmediateContext ctx checking.
Attachment #8924815 -
Flags: review+
| Assignee | ||
Updated•8 years ago
|
Attachment #8924498 -
Attachment is obsolete: true
| Assignee | ||
Comment 14•8 years ago
|
||
Just return NS_ERROR_DOM_MEDIA_DECODE_ERR for dxva decoder error.
The decoder recreation is the future work.
Attachment #8924816 -
Flags: review+
| Assignee | ||
Updated•8 years ago
|
Attachment #8924499 -
Attachment is obsolete: true
| Assignee | ||
Comment 15•8 years ago
|
||
Comment on attachment 8924499 [details] [diff] [review]
P2: handle decoder error in WMFMediaDataDecoder::ProcessDecode().
Review of attachment 8924499 [details] [diff] [review]:
-----------------------------------------------------------------
::: dom/media/platforms/wmf/DXVA2Manager.cpp
@@ +987,5 @@
> // It appears some race-condition may allow us to arrive here even when mSyncObject
> // is null. It's better to avoid that crash.
> client->SyncWithObject(mSyncObject);
> + if (!mSyncObject->Synchronize(true)) {
> + return DXGI_ERROR_DEVICE_RESET;
It's usually failed before the waiting call. And this code is shared between here and the regular web content rendering. So, I would prefer to use the original timeout value.
| Assignee | ||
Comment 16•8 years ago
|
||
Comment 18•8 years ago
|
||
Pushed by hshih@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/00b5d1cc54d3
make SyncObjectD3D11Client become fallible. r=dvander
https://hg.mozilla.org/integration/mozilla-inbound/rev/382939cc1ff5
handle decoder error in WMFMediaDataDecoder::ProcessDecode(). r=jya,mattwoodrow
Comment 19•8 years ago
|
||
| bugherder | ||
https://hg.mozilla.org/mozilla-central/rev/00b5d1cc54d3
https://hg.mozilla.org/mozilla-central/rev/382939cc1ff5
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla58
Updated•8 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•