Closed Bug 1373937 Opened 3 years ago Closed 3 years ago

Crash in CContext::ID3D11DeviceContext1_UpdateSubresource_<T>

Categories

(Core :: Graphics: Layers, defect)

55 Branch
All
Windows
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox-esr52 --- unaffected
firefox54 --- unaffected
firefox55 + fixed
firefox56 + fixed

People

(Reporter: philipp, Assigned: vliu)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file, 1 obsolete file)

This bug was filed from the Socorro interface and is 
report bp-3f4e7b0e-60f1-4bec-a11c-112970170617.
=============================================================
Crashing Thread (0)
Frame 	Module 	Signature 	Source
0 	d3d11.dll 	CContext::ID3D11DeviceContext1_UpdateSubresource_<1>(ID3D11DeviceContext1*, ID3D11Resource*, unsigned int, D3D11_BOX const*, void const*, unsigned int, unsigned int) 	
1 	d3d11.dll 	CContext::ID3D11DeviceContext1_UpdateSubresource_Amortized<0>(ID3D11DeviceContext1*, ID3D11Resource*, unsigned int, D3D11_BOX const*, void const*, unsigned int, unsigned int) 	
2 	xul.dll 	mozilla::layers::D3D11TextureData::Create(mozilla::gfx::IntSizeTyped<mozilla::gfx::UnknownUnits>, mozilla::gfx::SurfaceFormat, mozilla::gfx::SourceSurface*, mozilla::layers::TextureAllocationFlags, ID3D11Device*) 	gfx/layers/d3d11/TextureD3D11.cpp:463
3 	xul.dll 	mozilla::layers::D3D11TextureData::Create(mozilla::gfx::SourceSurface*, mozilla::layers::TextureAllocationFlags, ID3D11Device*) 	gfx/layers/d3d11/TextureD3D11.cpp:509
4 	xul.dll 	mozilla::layers::TextureClient::CreateFromSurface(mozilla::layers::KnowsCompositor*, mozilla::gfx::SourceSurface*, mozilla::layers::BackendSelector, mozilla::layers::TextureFlags, mozilla::layers::TextureAllocationFlags) 	gfx/layers/client/TextureClient.cpp:1162
5 	xul.dll 	mozilla::layers::SourceSurfaceImage::GetTextureClient(mozilla::layers::KnowsCompositor*) 	gfx/layers/ImageContainer.cpp:798
6 	xul.dll 	mozilla::layers::ImageClientSingle::UpdateImage(mozilla::layers::ImageContainer*, unsigned int) 	gfx/layers/client/ImageClient.cpp:210
7 	xul.dll 	mozilla::layers::ClientImageLayer::RenderLayer() 	gfx/layers/client/ClientImageLayer.cpp:141
8 	xul.dll 	mozilla::layers::ClientLayer::RenderLayerWithReadback(mozilla::layers::ReadbackProcessor*) 	gfx/layers/client/ClientLayerManager.h:379
9 	xul.dll 	mozilla::layers::ClientContainerLayer::RenderLayer() 	gfx/layers/client/ClientContainerLayer.h:57
10 	xul.dll 	mozilla::layers::ClientLayer::RenderLayerWithReadback(mozilla::layers::ReadbackProcessor*) 	gfx/layers/client/ClientLayerManager.h:379
11 	xul.dll 	mozilla::layers::ClientContainerLayer::RenderLayer() 	gfx/layers/client/ClientContainerLayer.h:57
12 	xul.dll 	mozilla::layers::ClientLayerManager::EndTransactionInternal(void (*)(mozilla::layers::PaintedLayer*, gfxContext*, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::DrawRegionClip, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) 	gfx/layers/client/ClientLayerManager.cpp:375
13 	xul.dll 	mozilla::layers::ClientLayerManager::EndTransaction(void (*)(mozilla::layers::PaintedLayer*, gfxContext*, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::DrawRegionClip, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) 	gfx/layers/client/ClientLayerManager.cpp:433
14 	xul.dll 	nsDisplayList::PaintRoot(nsDisplayListBuilder*, nsRenderingContext*, unsigned int) 	layout/painting/nsDisplayList.cpp:2288
15 	xul.dll 	nsLayoutUtils::PaintFrame(nsRenderingContext*, nsIFrame*, nsRegion const&, unsigned int, nsDisplayListBuilderMode, nsLayoutUtils::PaintFrameFlags) 	layout/base/nsLayoutUtils.cpp:3697
16 	xul.dll 	mozilla::PresShell::Paint(nsView*, nsRegion const&, unsigned int) 	layout/base/PresShell.cpp:6447
17 	xul.dll 	nsViewManager::ProcessPendingUpdatesPaint(nsIWidget*) 	view/nsViewManager.cpp:481
18 	xul.dll 	nsViewManager::ProcessPendingUpdatesForView(nsView*, bool) 	view/nsViewManager.cpp:413
19 	xul.dll 	nsViewManager::ProcessPendingUpdates() 	view/nsViewManager.cpp:1095

this crash signature from intel devices seems to be quite high in stability early data from the 55.0b cycle.

Adapter device id facet
1 	0x0402 	7 	17.95 %
2 	0x1916 	7 	17.95 %
3 	0x0152 	5 	12.82 %
4 	0x0166 	3 	7.69 %
5 	0x0412 	3 	7.69 %
6 	0x0102 	2 	5.13 %
7 	0x0a16 	2 	5.13 %

Adapter driver version facet
1 	10.18.10.3304 	4 	10.26 %
2 	20.19.15.4380 	4 	10.26 %
3 	10.18.10.3355 	3 	7.69 %
4 	10.18.10.3412 	2 	5.13 %
5 	10.18.10.3540 	2 	5.13 %
6 	10.18.10.4276 	2 	5.13 %
7 	10.18.10.4425 	2 	5.13 %
8 	10.18.14.4206 	2 	5.13 %
9 	21.20.16.4590 	2 	5.13 %
10 	9.17.10.4229 	2 	5.13 %
vliu, please help to check this bug.
Assignee: nobody → vliu
All these crashes seem to be happening during or after a device reset. David, you have more experience with our device reset code, are all these happening 'during' a device reset?
Flags: needinfo?(dvander)
(In reply to Peter Chang[:pchang] from comment #1)
> vliu, please help to check this bug.

Ok, I will look into it.
A first look into it from mini-dump, Gecko got failed return with error code 0x887a0005 (DXGI_ERROR_DEVICE_REMOVED) in [1]. texture11 got nullptr so Gecko crash in [2]. 


[1]: https://searchfox.org/mozilla-central/rev/2bcd258281da848311769281daf735601685de2d/gfx/layers/d3d11/TextureD3D11.cpp#453
[2]: https://searchfox.org/mozilla-central/rev/2bcd258281da848311769281daf735601685de2d/gfx/layers/d3d11/TextureD3D11.cpp#463

From back trace, it seems that device remove happens while Gecko run in an video. 

7 	xul.dll 	mozilla::layers::ClientImageLayer::RenderLayer() 	gfx/layers/client/ClientImageLayer.cpp:141
Cool, sounds like we need to move the HRESULT check up then.
Flags: needinfo?(dvander)
[Tracking Requested - why for this release]: Top crasher on Beta.
Sure, we can track this. Vincent, if you land a fix here please request uplift to beta, since this looks like a high volume crash in beta 54.
Flags: needinfo?(vliu)
(In reply to David Anderson [:dvander] from comment #5)
> Cool, sounds like we need to move the HRESULT check up then.

Hi David,
Currently I don't have more information to figure out the root cause. But it seems that move the HRESULT is better check up then. Could you help me to review the patch to move it? Thanks.
Attachment #8881899 - Flags: review?(dvander)
Comment on attachment 8881899 [details] [diff] [review]
0001-Bug-1373937-Move-gfxCriticalError-to-get-error-infor.patch

Review of attachment 8881899 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/d3d11/TextureD3D11.cpp
@@ +455,5 @@
> +  if (FAILED(hr) || !texture11) {
> +    gfxCriticalError(CriticalLog::DefaultOptions(Factory::ReasonableSurfaceSize(aSize)))
> +      << "[D3D11] 2 CreateTexture2D failure Size: " << aSize
> +      << "texture11: " << texture11 << " Code: " << gfx::hexa(hr);
> +    return nullptr;

We still hit crash with gfxCriticalError in the release build, how about change to gfxCriticalNote? Image layer should handle the empty texture properly and then we should recover the device reset flow in the content process.
(In reply to Peter Chang[:pchang] from comment #9)
> Comment on attachment 8881899 [details] [diff] [review]
> 0001-Bug-1373937-Move-gfxCriticalError-to-get-error-infor.patch
> 
> Review of attachment 8881899 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: gfx/layers/d3d11/TextureD3D11.cpp
> @@ +455,5 @@
> > +  if (FAILED(hr) || !texture11) {
> > +    gfxCriticalError(CriticalLog::DefaultOptions(Factory::ReasonableSurfaceSize(aSize)))
> > +      << "[D3D11] 2 CreateTexture2D failure Size: " << aSize
> > +      << "texture11: " << texture11 << " Code: " << gfx::hexa(hr);
> > +    return nullptr;
> 
> We still hit crash with gfxCriticalError in the release build, how about
> change to gfxCriticalNote? Image layer should handle the empty texture
> properly and then we should recover the device reset flow in the content
> process.

It seems a great idea to add gfxCriticalNote to gather information and not crashing to enter recovery flow.

Hi David,
Could you please have a review for this? Thanks.
Attachment #8881899 - Attachment is obsolete: true
Attachment #8881899 - Flags: review?(dvander)
Flags: needinfo?(vliu)
Attachment #8882334 - Flags: review?(dvander)
Attachment #8882334 - Flags: review?(dvander) → review+
Pushed by vliu@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8323cbf507dd
Add gfxCriticalNote() to get error information. r=dvander
https://hg.mozilla.org/mozilla-central/rev/8323cbf507dd
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Do you plan on uplifting this change to 55?  We don't seem to get any of these crashes on nightly unfortunately.
Comment on attachment 8882334 [details] [diff] [review]
0001-Bug-1373937-Add-gfxCriticalNote-to-get-error-informa.patch

Approval Request Comment
[Feature/Bug causing the regression]: No
[User impact if declined]: User got crash without this patch.
[Is this code covered by automated tests?]: No
[Has the fix been verified in Nightly?]: yes
[Needs manual test from QE? If yes, steps to reproduce]:  No
[List of other uplifts needed for the feature/fix]: No
[Is the change risky?]: No
[Why is the change risky/not risky?]:
[String changes made/needed]:
Attachment #8882334 - Flags: approval-mozilla-beta?
Blocks: 1377545
Comment on attachment 8882334 [details] [diff] [review]
0001-Bug-1373937-Add-gfxCriticalNote-to-get-error-informa.patch

hopefully fix a crash in beta55.

Not quite sure what makes you say this is verified in nightly?  The crash rate there seems too low to be confident.
Attachment #8882334 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(In reply to Vincent Liu[:vliu] from comment #15)
> [Is this code covered by automated tests?]: No
> [Has the fix been verified in Nightly?]: yes
> [Needs manual test from QE? If yes, steps to reproduce]:  No

Setting qe-verify- based on Vincent's assessment on manual testing needs.
Flags: qe-verify-
You need to log in before you can comment on or make changes to this bug.