Closed Bug 1204922 Opened 4 years ago Closed 4 years ago

Increase in CompositorD3D11::HandleError crashes in 41b9

Categories

(Core :: Graphics: Layers, defect, critical)

41 Branch
x86
Windows NT
defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
firefox41 - wontfix
firefox42 + wontfix
firefox43 --- affected
firefox44 --- fixed
b2g-v2.5 --- fixed

People

(Reporter: dmajor, Assigned: bas.schouten)

References

Details

(Keywords: crash, topcrash-win, Whiteboard: [gfx-noted])

Crash Data

Attachments

(3 files, 3 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-ec5ec537-9752-4922-b8b3-d6d5e2150913.
=============================================================

This signature roughly doubled in volume in 41beta9. It's about 70% Win7, and 80% Intel graphics.

0 	xul.dll 	mozilla::layers::CompositorD3D11::HandleError(long, mozilla::layers::CompositorD3D11::Severity) 	gfx/layers/d3d11/CompositorD3D11.cpp
1 	xul.dll 	mozilla::layers::CompositorD3D11::Failed(long, mozilla::layers::CompositorD3D11::Severity) 	gfx/layers/d3d11/CompositorD3D11.cpp
2 	xul.dll 	mozilla::layers::CompositorD3D11::UpdateRenderTarget() 	gfx/layers/d3d11/CompositorD3D11.cpp
3 	xul.dll 	mozilla::layers::CompositorD3D11::BeginFrame(nsIntRegion const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*) 	gfx/layers/d3d11/CompositorD3D11.cpp
4 	xul.dll 	mozilla::layers::LayerManagerComposite::Render() 	gfx/layers/composite/LayerManagerComposite.cpp
5 	xul.dll 	mozilla::layers::LayerManagerComposite::EndTransaction(void (*)(mozilla::layers::PaintedLayer*, gfxContext*, nsIntRegion const&, mozilla::layers::DrawRegionClip, nsIntRegion const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) 	gfx/layers/composite/LayerManagerComposite.cpp
There seems to be a mixture of crashes, based on the error messages, so that could be accounting for the increase?  Let's start with the ones that fail in VerifyBufferSize - they do so because mSwapChain->ResizeBuffers fails with the out of memory error code (0x8007000e)  Are we leaking something somewhere?

If this increased in Beta 9 - these are D3D, so it shouldn't be bug 1160295 (which would just have put more people into blocklist), but perhaps we had people that were forcing DirectWrite with a preference and bug 1193641 took that away and put them in a different path?

David, was there a corresponding spike in nightly around this time?
Whiteboard: [gfx-noted]
> David, was there a corresponding spike in nightly around this time?

No, but I really can't draw any conclusions from nightly when the volume there is so low. We can easily be thrown off by a single machine that's having a bad day (or lack thereof).
[Tracking Requested - why for this release]:

#11 in Firefox 41 with 0.88% of the volume.
#11 in Firefox 42 with 1.13% of the volume.
#36 in Firefox 43 with 0.38% of the volume.
#103 in Firefox 44 with 0.11% if the volume.
Tracking as it is a top crash.
Crash Signature: [@ mozilla::layers::CompositorD3D11::HandleError(long, mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::Failed(long, mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::UpdateRenderTarget()] → [@ mozilla::layers::CompositorD3D11::HandleError(long, mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::Failed(long, mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::UpdateRenderTarget()] [@ m…
Can we make this actionable?
Assignee: nobody → bas
Flags: needinfo?(bas)
I looked at top crashes on 41.0.1 and 41.0.2 and this signature is at #13. Given that, I do not think it is so severe as to need a dot release. Marking this wontfix for 41.
This bug was filed from the Socorro interface and is 
report bp-67097138-6360-4b1e-909c-2bb992151020.
=============================================================

Application Basics
Name 	Firefox
Version 	42.0b7
Build ID 	20151015151621

Graphics
Adapter Description	Intel(R) HD Graphics 3000
Adapter Drivers	igdumd32 igd10umd32 igd10umd32
Adapter RAM	Unknown
Asynchronous Pan/Zoom	none
Device ID	0x0126
Direct2D Enabled	true
DirectWrite Enabled	true (6.2.9200.17461)
Driver Date	2-22-2013
Driver Version	9.17.10.3040
GPU #2 Active	false
GPU Accelerated Windows	6/6 Direct3D 11 (OMTC)
Subsys ID	04931028
Supports Hardware H264 Decoding	false
Vendor ID	0x8086
WebGL Renderer	Google Inc. -- ANGLE (Intel(R) HD Graphics 3000 Direct3D11 vs_4_1 ps_4_1)
windowLayerManagerRemote	true
AzureCanvasBackend	direct2d 1.1
AzureContentBackend	direct2d 1.1
AzureFallbackCanvasBackend	cairo
AzureSkiaAccelerated	0
(#0) Error	Too many dropped/corrupted frames, disabling DXVA

Crashing Thread
Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::layers::CompositorD3D11::HandleError(long, mozilla::layers::CompositorD3D11::Severity) 	gfx/layers/d3d11/CompositorD3D11.cpp
1 	xul.dll 	mozilla::layers::CompositorD3D11::Failed(long, mozilla::layers::CompositorD3D11::Severity) 	gfx/layers/d3d11/CompositorD3D11.cpp
2 	xul.dll 	mozilla::layers::CompositorD3D11::UpdateRenderTarget() 	gfx/layers/d3d11/CompositorD3D11.cpp
3 	xul.dll 	mozilla::layers::CompositorD3D11::BeginFrame(nsIntRegion const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*) 	gfx/layers/d3d11/CompositorD3D11.cpp
4 	xul.dll 	mozilla::layers::LayerManagerComposite::Render() 	gfx/layers/composite/LayerManagerComposite.cpp
5 	xul.dll 	mozilla::layers::LayerManagerComposite::EndTransaction(mozilla::TimeStamp const&, mozilla::layers::LayerManager::EndTransactionFlags) 	gfx/layers/composite/LayerManagerComposite.cpp
6 	kernelbase.dll 	BaseGetNamedObjectDirectory
This is a good one as well: https://crash-stats.mozilla.com/report/index/3a33d25c-9d80-4541-8a28-f1d042151019.  Lot of bad sizes (-458500 x 3, -597 x -8260822, 1813391347 x 184) perhaps suggests a corruption somewhere along the way?  The first error is "invalid argument" 0x80070057.
1. Assert/annotate if we call the compositor with the platform not having a valid device, while the compositor's device is still valid, and add a few more annotations for possible failures.
2. Have UpdateRenderTarget return a true/false status if it succeeded in what it tried to do.
3. (very unsure about this one) Inside of HandleError, we treat invalid call+device reset code as a device reset case; we already do that elsewhere in UpdateRenderTarget, we just make it consistent and handle it in HandleError as well.
Attachment #8677615 - Flags: review?(dvander)
Attachment #8677615 - Flags: review?(bas)
Comment on attachment 8677615 [details] [diff] [review]
More noise about mismatched devices, look harder for device resets.

Review of attachment 8677615 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/d3d11/CompositorD3D11.cpp
@@ +480,5 @@
>    CD3D11_TEXTURE2D_DESC desc(DXGI_FORMAT_B8G8R8A8_UNORM, aRect.width, aRect.height, 1, 1,
>                               D3D11_BIND_SHADER_RESOURCE | D3D11_BIND_RENDER_TARGET);
>  
> +  if (mDevice && !gfxWindowsPlatform::GetPlatform()->GetD3D11Device()) {
> +    gfxCriticalError() << "Out of sync D3D11 devices in CreateRenderTarget";

This should be:

>   if (mDevice && mDevice != gfxWindowsPlatform::GetPlatform()->GetD3D11Device()) {

A TDR could trigger this since we recreate the platform device on a different thread, so it's probably better to use the don't-crash flag. Same in the other instances of this logic. Actually I'd just put it in an AssertCoherentDevice helper or something.

@@ +1513,3 @@
>  
> +  // Crash if we are making invalid calls outside of device removal
> +  if (hr == DXGI_ERROR_INVALID_CALL && !deviceRemoved) {

I know this is old code but MOZ_CRASHES in D3D11 make me really nervous. Arguably we shouldn't have them at all. In bug 1210863 we saw that just running mochitests, we can inexplicably fail to create a swap chain halfway into the process with E_INVALIDARG. So at least some D3D11 calls have to fail gracefully.

Hopefully this helps reduce this particular volume though.
Attachment #8677615 - Flags: review?(dvander) → review+
Looks like we won't be able to fix that in 42. Fortunately, we might be able to have a fix for 43.
Wontfix it.
(In reply to Milan Sreckovic [:milan] from comment #9)
> Handy query from :ashughes to only find reports that are very unlikely to be
> standard OOMs:
> https://crash-stats.mozilla.com/search/
> ?signature=~mozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3AHandleError&availab
> le_virtual_memory=%3E%3D1024000000&_facets=signature&_columns=date&_columns=s
> ignature&_columns=product&_columns=version&_columns=build_id&_columns=platfor
> m#facet-signature

The question is whether, when filtering out the OOMs, there's any one 'cause' of this crash that's common enough to justify investing resources into fixing it. What would be really interesting is if there's a 'situation', perhaps like the one described in common #8, that is by itself common enough that it effects a lot of people and should be fixed.
Flags: needinfo?(bas)
Comment on attachment 8677615 [details] [diff] [review]
More noise about mismatched devices, look harder for device resets.

Review of attachment 8677615 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/d3d11/CompositorD3D11.cpp
@@ +1513,3 @@
>  
> +  // Crash if we are making invalid calls outside of device removal
> +  if (hr == DXGI_ERROR_INVALID_CALL && !deviceRemoved) {

I'm off the opinion that there's a value in crashing in a lot of cases as it forces the issues under our noses rather than them going unnoticed.

Having said that I agree, when there's a device removal going on not crashing seems good.
Attachment #8677615 - Flags: review?(bas) → review+
> +  if (hr == DXGI_ERROR_INVALID_CALL && !deviceRemoved) {

This may be a good candidate for the new gfxCrash?  It will MOZ_CRASH in Aurora and Nightly, and send telemetry in Beta and Release.  We can then keep an eye on the telemetry and see if we're getting any of these in the versions where we don't crash.  It's in 44, but I'll see if I can get it uplifted to 43, which means it'd be in beta by next week...
if bug 1213007 gets uplifted to 43, we can have this change uplifted to match.
Let's leave this open and see if it makes a difference; I'm having trouble looking up these crashes, so I'm not sure if there were any in nightly anyway.
When searching, the crash from comment 7 is a better one to start with.  Not a high volume on nightly, so let's wait a few more days to see if this patch made a difference, or at least gave us more info when the crashes do happen.
removing the b2g 2.5 flag since this commit has been reverted due to an incorrect merge, sorry for the confusion
The HandlError crashes are going to migrate to CrashStatsLogForwarder::CrashAction until bug 1223169 gets fixed.  E.g., https://crash-stats.mozilla.com/report/index/f2e7a15d-f5e2-41d9-a865-c589d2151109
Some notes on the crash from comment 26, as there is a bit more info.
We start with incorrect parameter, and follow by "explaining" that D3D11 swap resize buffers failed because of the >17k width.  This inside of VerifyBufferSize, inside of UpdateRenderTarget.  However, the scenario that crashes comes after that, gets past VerifyBufferSize, which should mean that ResizeBuffers worked.  So, in this case, we just have mSwapChain->GetBuffer(0, ...) returning "error invalid call", and the "is device reset" returning S_OK.

Question - do we need to watch for this scenario, because there is an explanation as to why it would happen, or should we keep just crashing?
Flags: needinfo?(dvander)
Flags: needinfo?(bas)
(In reply to Milan Sreckovic [:milan] from comment #27)
> Question - do we need to watch for this scenario, because there is an
> explanation as to why it would happen, or should we keep just crashing?

I don't know enough about D3D11 to answer here (for example, if ResizeBuffers fails, what state is the swap chain in?) - but I would err on the side of not crashing, given that bug 1210863 comment #5 had no clear explanation either.
Flags: needinfo?(dvander)
(In reply to Milan Sreckovic [:milan] from comment #27)
> Some notes on the crash from comment 26, as there is a bit more info.
> We start with incorrect parameter, and follow by "explaining" that D3D11
> swap resize buffers failed because of the >17k width.  This inside of
> VerifyBufferSize, inside of UpdateRenderTarget.  However, the scenario that
> crashes comes after that, gets past VerifyBufferSize, which should mean that
> ResizeBuffers worked.  So, in this case, we just have
> mSwapChain->GetBuffer(0, ...) returning "error invalid call", and the "is
> device reset" returning S_OK.
> 
> Question - do we need to watch for this scenario, because there is an
> explanation as to why it would happen, or should we keep just crashing?

If resize buffers was successful, and the device was not reset, I don't see any obvious reason why this would happen, if we know for a fact that it does, I should investigate further.
Flags: needinfo?(bas)
(In reply to Bas Schouten (:bas.schouten) from comment #29)
> ...
> If resize buffers was successful, and the device was not reset, I don't see
> any obvious reason why this would happen, if we know for a fact that it
> does, I should investigate further.

Looking at https://crash-stats.mozilla.com/report/index/f2e7a15d-f5e2-41d9-a865-c589d2151109, that seems to be what is happening (perhaps bug 1210863 comment #5 that David mentioned is what explains it.)  The reasoning for this crash demonstrating this situation is based on crash annotations working properly:


|[0][GFX1]: [CompositorD3D11] error code: 0x80070057, 0x0
|[1][GFX1-]: D3D11 swap resize buffers failed 0x80070057 on Size(17920,1440)
|[2][GFX1]: [CompositorD3D11] error code: 0x80070057, 0x0
|[3][GFX1-]: Failed VerifyBufferSize in UpdateRenderTarget Size(17920,1440)
|[4][GFX1]: [CompositorD3D11] error code: 0x887a0001, 0x0
|[5][GFX1 1]: Invalid D3D11 api call

[5] is https://dxr.mozilla.org/mozilla-central/source/gfx/layers/d3d11/CompositorD3D11.cpp#1516, pointing out deviceRemoved was false, supported by [4].  This is all inside of CompositorD3D11::HandleError.  Assuming mDevice is not null (*), GetDeviceRemovedReason() returned S_OK.

The stack is telling us the call crashing is Failed(hr) in CompositorD3D11::UpdateRenderTarget here https://dxr.mozilla.org/mozilla-central/source/gfx/layers/d3d11/CompositorD3D11.cpp#1282.  This means it was mSwapChain->GetBuffer call that returned hr DXGI_ERROR_INVALID_CALL.

Going up in CompositorD3D11::UpdateRenderTarget, since we got as far as GetBuffer call, it means VerifyBufferSize() succeeded, and size is positive.

VerifyBufferSize() succeeding means that mSwapChain->GetDesc succeded, and either size matches what we're holding, or resize succeeded.

(*) Is mDevice null?  No.  We are calling UpdateRenderTarget() from BeingFrame(), and before we call UpdateRenderTarget(), we access mDevice-> (https://dxr.mozilla.org/mozilla-central/source/gfx/layers/d3d11/CompositorD3D11.cpp#1034), which would have crashed.

Needinfo Bas based on "...I should investigate further..."
Flags: needinfo?(bas)
(In reply to Milan Sreckovic [:milan] from comment #30)
> (In reply to Bas Schouten (:bas.schouten) from comment #29)
> > ...
> > If resize buffers was successful, and the device was not reset, I don't see
> > any obvious reason why this would happen, if we know for a fact that it
> > does, I should investigate further.
> 
> Looking at
> https://crash-stats.mozilla.com/report/index/f2e7a15d-f5e2-41d9-a865-
> c589d2151109, that seems to be what is happening (perhaps bug 1210863
> comment #5 that David mentioned is what explains it.)  The reasoning for
> this crash demonstrating this situation is based on crash annotations
> working properly:
> 
> 
> |[0][GFX1]: [CompositorD3D11] error code: 0x80070057, 0x0
> |[1][GFX1-]: D3D11 swap resize buffers failed 0x80070057 on Size(17920,1440)
> |[2][GFX1]: [CompositorD3D11] error code: 0x80070057, 0x0
> |[3][GFX1-]: Failed VerifyBufferSize in UpdateRenderTarget Size(17920,1440)
> |[4][GFX1]: [CompositorD3D11] error code: 0x887a0001, 0x0
> |[5][GFX1 1]: Invalid D3D11 api call
> 
> [5] is
> https://dxr.mozilla.org/mozilla-central/source/gfx/layers/d3d11/
> CompositorD3D11.cpp#1516, pointing out deviceRemoved was false, supported by
> [4].  This is all inside of CompositorD3D11::HandleError.  Assuming mDevice
> is not null (*), GetDeviceRemovedReason() returned S_OK.
> 
> The stack is telling us the call crashing is Failed(hr) in
> CompositorD3D11::UpdateRenderTarget here
> https://dxr.mozilla.org/mozilla-central/source/gfx/layers/d3d11/
> CompositorD3D11.cpp#1282.  This means it was mSwapChain->GetBuffer call that
> returned hr DXGI_ERROR_INVALID_CALL.
> 
> Going up in CompositorD3D11::UpdateRenderTarget, since we got as far as
> GetBuffer call, it means VerifyBufferSize() succeeded, and size is positive.
> 
> VerifyBufferSize() succeeding means that mSwapChain->GetDesc succeded, and
> either size matches what we're holding, or resize succeeded.
> 
> (*) Is mDevice null?  No.  We are calling UpdateRenderTarget() from
> BeingFrame(), and before we call UpdateRenderTarget(), we access mDevice->
> (https://dxr.mozilla.org/mozilla-central/source/gfx/layers/d3d11/
> CompositorD3D11.cpp#1034), which would have crashed.
> 
> Needinfo Bas based on "...I should investigate further..."

So I checked this in a windows 10 stand-alone program. What happens when you create an insanely large window with an invalid size (i.e. a dimension bigger than 16384) and try to ResizeBuffers to that is this:

1) ResizeBuffers fails with E_INVALIDARG
2) GetDesc returns the dimension values of the failed ResizeBuffers call

In other words if one frame successfully aborts because of ResizeBuffers failing, the next frame will happily succeed in VerifyBufferSize and proceed with an unusable swap chain.
Flags: needinfo?(bas)
I think this case won't be too common, but regardless, let's deal with it, who knows what other issues where ResizeBuffers fails but GetDesc is already updated this may cover.
Attachment #8686810 - Flags: review?(milan)
Comment on attachment 8686810 [details] [diff] [review]
Do not attempt to paint again when ResizeBuffers fails until it succeeds

Review of attachment 8686810 [details] [diff] [review]:
-----------------------------------------------------------------

Good patch for the uplift.  We can clean things up (mostly what was there before), but I like this one for simplicity.
Attachment #8686810 - Flags: review?(milan) → review+
I'd like to get more information in the cases where we crash.  Not sure if it'll be useful, but it can't hurt.  Almost makes Failure() unnecessary though.
Attachment #8687198 - Flags: review?(bas)
This bug was filed from the Socorro interface and is 
report bp-e332885b-2da4-42f2-9b52-1eedc2151116.
=============================================================

Crashing Thread
Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::layers::CompositorD3D11::HandleError(long, mozilla::layers::CompositorD3D11::Severity) 	gfx/layers/d3d11/CompositorD3D11.cpp
1 	xul.dll 	mozilla::layers::CompositorD3D11::Failed(long, mozilla::layers::CompositorD3D11::Severity) 	gfx/layers/d3d11/CompositorD3D11.cpp
2 	xul.dll 	mozilla::layers::CompositorD3D11::UpdateRenderTarget() 	gfx/layers/d3d11/CompositorD3D11.cpp
3 	xul.dll 	mozilla::layers::CompositorD3D11::BeginFrame(nsIntRegion const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits, float> const*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits, float> const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits, float>*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits, float>*) 	gfx/layers/d3d11/CompositorD3D11.cpp
Comment on attachment 8686810 [details] [diff] [review]
Do not attempt to paint again when ResizeBuffers fails until it succeeds

Approval Request Comment
A lot of crashes that we hope are handled with this patch, let's see what it does to Aurora/44 population (the first patch is at 44 already.)
Attachment #8686810 - Flags: approval-mozilla-aurora?
Rebased now that Part 2 has landed.
Attachment #8687198 - Attachment is obsolete: true
Attachment #8687198 - Flags: review?(bas)
Attachment #8688505 - Flags: review?(bas)
Comment on attachment 8686810 [details] [diff] [review]
Do not attempt to paint again when ResizeBuffers fails until it succeeds

Approved with the hope that this patch fixes the crashes. Aurora44+
Attachment #8686810 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8688505 [details] [diff] [review]
Part 3. Handle error after we annotate with more info. r=bas

Review of attachment 8688505 [details] [diff] [review]:
-----------------------------------------------------------------

I could suggest making Failed() take an (optional) const std::string& argument and passing the failure string into that, and then going the gfxCriticalWarning inside ::Failed, would that work for you?
Flags: needinfo?(milan)
(In reply to Bas Schouten (:bas.schouten) from comment #41)
> ...
> I could suggest making Failed() take an (optional) const std::string&
> argument and passing the failure string into that, and then going the
> gfxCriticalWarning inside ::Failed, would that work for you?

Gets somewhat messy.  We sometimes want to abort the debug (wherever we're using gfxCriticalError) regardless of what type of error it is; sometimes we only want to abort (debug and ndebug) on bad errors (with HandleError.)  Also, we'd have to create the string (with the values that we want shown, these are not constant strings), and now we have to do that even if there is no error at all - it's only inside of Failed() that we decide whether we have an error or not and thus whether we need that error string or not.
Flags: needinfo?(milan)
Attachment #8688505 - Flags: review?(bas) → review+
NI Wes for part 2 uplift.
Flags: needinfo?(wkocher)
Flags: needinfo?(wkocher)
Check-in of part 3, and then we'll stop attaching patches to this bug :) - leave-open until we see if the crash rate drops on aurora.
Keywords: checkin-needed
Attachment #8689633 - Flags: checkin? → checkin+
Comment on attachment 8686810 [details] [diff] [review]
Do not attempt to paint again when ResizeBuffers fails until it succeeds

Approval Request Comment
What's our appetite for picking this up for 43?  High enough crasher?  It seems to have made a difference on nightly (45) and aurora (44), I'm not seeing crashes, though I don't know if they started up showing elsewhere.
Attachment #8686810 - Flags: approval-mozilla-beta?
Crash Signature: , mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::UpdateRenderTarget()] [@ mozilla::layers::CompositorD3D11::HandleError | mozilla::layers::CompositorD3D11::Failed | mozilla::layers::CompositorD3D11::UpdateRenderTarget] → , mozilla::layers::CompositorD3D11::Severity) | mozilla::layers::CompositorD3D11::UpdateRenderTarget()] [@ mozilla::layers::CompositorD3D11::HandleError | mozilla::layers::CompositorD3D11::Failed | mozilla::layers::CompositorD3D11::UpdateRenderTarget] […
Milan, all 3 patches?   It does seem to have brought the crashes down on nightly/aurora and there are significant amount of reports still on beta. Maybe a good idea.
The second patch is the one that actually did the work, we believe.  However, we haven't tested it without the first one, so I'd say we take the first and second.
Third can stay on Aurora (the third one will help us diagnose any remaining or new problems, so we should be OK without it for now at least.)
First two patches didn't take care of all the crashes: https://crash-stats.mozilla.com/report/index/1d06d20e-68bb-4d91-8b52-232b72151125.  Note that the signature changes after the first patch is applied.
Crash Signature: mozilla::layers::CompositorD3D11::UpdateRenderTarget] [@ mozilla::layers::CompositorD3D11::HandleError | mozilla::layers::CompositorD3D11::VerifyBufferSize] → mozilla::layers::CompositorD3D11::UpdateRenderTarget] [@ mozilla::layers::CompositorD3D11::HandleError | mozilla::layers::CompositorD3D11::VerifyBufferSize] [@ CrashStatsLogForwarder::CrashAction ]
Comment on attachment 8679664 [details] [diff] [review]
More noise about mismatched devices, look harder for device resets. r=bas, dvander

Crash fix, worked well on nightly and aurora. Please uplift to beta.
Attachment #8679664 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment on attachment 8686810 [details] [diff] [review]
Do not attempt to paint again when ResizeBuffers fails until it succeeds

D3D11 crash fix, worked well on nightly and aurora. Please uplift to beta.
Attachment #8686810 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Keywords: checkin-needed
this has problems uplifting to beta like:

grafting 315880:74501e7db97f "Bug 1204922 - More information about crashes. r=bas a=ritu"
merging gfx/2d/Logging.h
merging gfx/layers/d3d11/CompositorD3D11.cpp
merging gfx/layers/d3d11/CompositorD3D11.h
warning: conflicts while merging gfx/layers/d3d11/CompositorD3D11.cpp! (edit, then use 'hg resolve --mark')
abort: unresolved conflicts, can't continue
Flags: needinfo?(bas)
Still a top crash in beta 9.  We are planning a dot release for next week. This seems a bit risky for a ridealong but if the crash looks much worse on release, that is still an option.
also needinfo'ing milan to get this on his radar
Flags: needinfo?(milan)
Did we change trains?  When I sync to what is mozilla-beta today, I get the correct version of the file already.
The trains did move out today. m-c is now 46 (integration branches still technically say 45, but anything on there will be for 46), aurora is now 45, beta is now 44, and release is now 43.
Based on comment 60 it doesn't look like anything needs checking in, clearing c-n keyword. Feel free to readd it if I'm mistaken.
Keywords: checkin-needed
Yes, this should be OK up to 44.
Flags: needinfo?(milan)
Flags: needinfo?(bas)
I believe we can consider this 'fixed' at this point unless I'm mistaking?
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
We can close it.  Other things will have taken over.
(In reply to Milan Sreckovic [:milan] from comment #65)
> We can close it.  Other things will have taken over.

Which would be bug 1245680.
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.