Closed Bug 1133623 Opened 10 years ago Closed 8 years ago

crash in mozilla::layers::CompositorD3D11::BeginFrame

Categories

(Core :: Graphics: Layers, defect)

x86
Windows NT
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla50
Tracking Status
firefox50 --- fixed

People

(Reporter: away, Assigned: jerry)

References

Details

(Keywords: crash, Whiteboard: [gfx-noted][tbird topcrash])

Crash Data

Attachments

(1 file, 2 obsolete files)

This bug was filed from the Socorro interface and is report bp-4e47d175-93f9-478d-8775-0f5892150215.
=============================================================

This comes and goes from the top crash list on nightly 38.

0 xul.dll mozilla::layers::CompositorD3D11::BeginFrame(nsIntRegion const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*) gfx/layers/d3d11/CompositorD3D11.cpp
1 xul.dll mozilla::layers::LayerManagerComposite::Render() gfx/layers/composite/LayerManagerComposite.cpp
2 xul.dll mozilla::layers::LayerManagerComposite::EndTransaction(void (*)(mozilla::layers::PaintedLayer*, gfxContext*, nsIntRegion const&, mozilla::layers::DrawRegionClip, nsIntRegion const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) gfx/layers/composite/LayerManagerComposite.cpp
3 xul.dll mozilla::layers::LayerManagerComposite::EndEmptyTransaction(mozilla::layers::LayerManager::EndTransactionFlags) gfx/layers/composite/LayerManagerComposite.cpp
4 xul.dll mozilla::layers::CompositorParent::CompositeToTarget(mozilla::gfx::DrawTarget*, nsIntRect const*) gfx/layers/ipc/CompositorParent.cpp
5 xul.dll mozilla::layers::CompositorParent::CompositeCallback(mozilla::TimeStamp) gfx/layers/ipc/CompositorParent.cpp
6 xul.dll RunnableMethod<mozilla::layers::CompositorParent, void ( mozilla::layers::CompositorParent::*)(mozilla::TimeStamp), Tuple1<mozilla::TimeStamp> >::Run() ipc/chromium/src/base/task.h
7 xul.dll MessageLoop::DoWork() ipc/chromium/src/base/message_loop.cc
8 xul.dll base::MessagePumpForUI::DoRunLoop() ipc/chromium/src/base/message_pump_win.cc
9 xul.dll base::MessagePumpWin::RunWithDispatcher(base::MessagePump::Delegate*, base::MessagePumpWin::Dispatcher*) ipc/chromium/src/base/message_pump_win.cc
10 xul.dll base::MessagePumpWin::Run(base::MessagePump::Delegate*) ipc/chromium/src/base/message_pump_win.h
11 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc
12 xul.dll MessageLoop::Run() ipc/chromium/src/base/message_loop.cc
13 xul.dll base::Thread::ThreadMain() ipc/chromium/src/base/thread.cc
14 xul.dll `anonymous namespace'::ThreadFunc(void*) ipc/chromium/src/base/platform_thread_win.cc
15 kernel32.dll BaseThreadInitThunk
16 ntdll.dll __RtlUserThreadStart
17 ntdll.dll _RtlUserThreadStart
What is the action item if we see these mutexes time out? Can we do anything about it?
Blocks: 1119854
Flags: needinfo?(bas)
I get these crashes every time I view full-screen Flash video for more than a couple of minutes. They are followed a minute or two later by a Windows message that the graphics card has stopped responding, and that is followed by a full-on blue screen crash and system restart. In the last 24 hours there has been the added twist of Windows identifying plugin-container as having stopped responding before the blue screen. The one thing I don't think the crash data supplies is that I am running with dual monitors.
Whiteboard: gfx-noted
I can reproduce this by forcing Firefox onto discrete GPU (on a W540 dual GPU setup), then disabling the discrete GPU in the device manager.
Whiteboard: gfx-noted → [gfx-noted][tbird crash]
This signature is 0.6% of crashes in 38.0b6 and makes it into the top 20 with that.
This is I guess another of the few TDR signatures.
Assignee: nobody → bas
#2 crash for TB38.0b4
Whiteboard: [gfx-noted][tbird crash] → [gfx-noted][tbird topcrash]
Crash Signature: , mozilla::gfx:...] → , mozilla::gfx:...] [@ mozilla::layers::CompositorD3D11::BeginFrame(nsIntRegion const&, mozilla::gfx::RectTyped<T> const*, mozilla::gfx::RectTyped<T> const&, mozilla::gfx::RectTyped<T>*, mozilla::gfx::RectTyped<T>*) ] [@ mozilla::layers::CompositorD3D11…
I had another of these, this time in Thunderbird Daily following the restart after an update. https://crash-stats.mozilla.com/report/index/8bde75d3-afc5-4722-90c7-c4ef12150614 refers.

The program then went on to produce a number of Windows "stopped functioning" crashes but worked fine in Safe Mode. I found my router had crashed, and on resetting it all errors disappeared.
¡Hola!

Another data point that I hope is useful; if not, my apologies for the bug spam =)

On https://bugzilla.mozilla.org/show_bug.cgi?id=1127270#c48 I was instructed to update the driver for Intel HD3000, so I installed win64_152824.exe.

I disobeyed the installer and left Nightly running during the update. This resulted in the following crash:

Report ID: bp-4388b86f-bd1b-4444-a05c-be5682150625
Date Submitted: 25/06/2015 04:04 p.m.

That is seemingly this bug...
Flags: needinfo?(bas)
This is no longer a top issue for Thunderbird but remains a significant issue for Firefox.

Thunderbird 38.3.0 has 1 reports
Thunderbird 38.2.0 has 0 reports
Thunderbird 38.1.0 has 11 reports
Firefox 41 has 1012 reports
Firefox 40 has 876 reports
Firefox 39 has 30 reports

Based on volume this would rank #30 for Firefox and doesn't rank at all for Thunderbird. I'm not sure if Firefox crashes are being investigated here or if that's a different bug report.
changing a signature, since it doesn't seem to be properly mapped in crash-stats atm.
Crash Signature: , mozilla::gfx::RectTyped<T>*) ] [@ mozilla::layers::CompositorD3D11::BeginFrame ] → , mozilla::gfx::RectTyped<T>*) ] [@ mozilla::layers::CompositorD3D11::BeginFrame]
(In reply to Anthony Hughes, QA Mentor (:ashughes) from comment #9)
> This is no longer a top issue for Thunderbird but remains a significant
> issue for Firefox.
>
> Thunderbird 38.3.0 has 1 reports
> Thunderbird 38.2.0 has 0 reports

Ah, but we disabled HWA in 38.2.0
https://www.mozilla.org/en-US/thunderbird/38.2.0/releasenotes/
Jerry, you've spent some time dealing with device resets now, can you dig a bit into this one? I wonder if we get a device reset while we're waiting to AcquireSync (https://hg.mozilla.org/integration/mozilla-inbound/annotate/b0096c5c7277/gfx/layers/d3d11/CompositorD3D11.cpp#l1198) so we time out. Here's a recent crash: https://crash-stats.mozilla.com/report/index/3b10f2c6-a6d3-4068-8ec9-301de2160529
Flags: needinfo?(hshih)
Assignee: bas → nobody
Assignee: nobody → hshih
Flags: needinfo?(hshih)
Status: NEW → ASSIGNED
Hi Bas,
If there is an IDXGIKeyedMutex from device context A and then the A is device-removed, does that mutex still work?

From
https://msdn.microsoft.com/en-us/library/windows/desktop/ff471339%28v=vs.85%29.aspx

The return value are:
E_FAIL
WAIT_ABANDONED
WAIT_TIMEOUT

I'm not sure the AcquireSync() call is still workable when the device is removed.
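To make the return values listed above easier to reason about, here is a minimal, portable sketch that classifies an AcquireSync() result. Everything in it is a stand-in: `HResult`, the `k...` constants, `AcquireOutcome`, `ClassifyAcquireSync` and `HasFailureBit` are hypothetical names that do not exist in Gecko or the Windows SDK, although the numeric values mirror the Windows definitions of S_OK, WAIT_ABANDONED, WAIT_TIMEOUT and E_FAIL. It does not answer whether the mutex keeps working after device removal.

```cpp
#include <cstdint>

// Hypothetical stand-ins so the sketch builds without the Windows SDK.
// The numeric values mirror the Windows definitions of the same names.
using HResult = std::uint32_t;
constexpr HResult kOk            = 0x00000000u;  // S_OK
constexpr HResult kWaitAbandoned = 0x00000080u;  // WAIT_ABANDONED
constexpr HResult kWaitTimeout   = 0x00000102u;  // WAIT_TIMEOUT
constexpr HResult kFail          = 0x80004005u;  // E_FAIL

// Hypothetical outcome type a caller could switch on.
enum class AcquireOutcome { Acquired, Abandoned, TimedOut, Failed };

// Same test the FAILED() macro performs: is the severity (top) bit set?
constexpr bool HasFailureBit(HResult hr) {
  return (hr & 0x80000000u) != 0;
}

// Map a raw AcquireSync() result to an outcome.
inline AcquireOutcome ClassifyAcquireSync(HResult hr) {
  switch (hr) {
    case kOk:
      return AcquireOutcome::Acquired;
    case kWaitAbandoned:
      return AcquireOutcome::Abandoned;
    case kWaitTimeout:
      return AcquireOutcome::TimedOut;
    default:
      // E_FAIL and any other unexpected value.
      return AcquireOutcome::Failed;
  }
}
```

One detail the sketch highlights: WAIT_TIMEOUT and WAIT_ABANDONED do not have the failure bit set, so a FAILED(hr)-style check alone would not catch them and they need an explicit comparison.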
Flags: needinfo?(bas)
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #13)
> Hi Bas,
> If there is an IDXGIKeyedMutex from device context A and then the A is
> device-removed, does that mutex still work?
>
> From
> https://msdn.microsoft.com/en-us/library/windows/desktop/ff471339%28v=vs.85%29.aspx
>
> The return value are:
> E_FAIL
> WAIT_ABANDONED
> WAIT_TIMEOUT
>
> I'm not sure the AcquireSync() call is still workable when the device is
> removed.

We've discussed this a couple of times before, I've always been in favor of not crashing but checking whether the device has been reset in this situation.
Flags: needinfo?(bas)
If we handle it that way, let's make sure we do so in other places trying to get the sync texture (e.g., bug 1275798 comment 11)
Attachment #8760161 - Attachment is obsolete: true
Attachment #8760161 - Flags: review?(bas)
Comment on attachment 8760166 [details] [diff] [review]
check device-removed status when we have timeout. v2

Review of attachment 8760166 [details] [diff] [review]:
-----------------------------------------------------------------

This patch will empty the renderBound when we have driver-removed during AcquireSync(). Then that frame is skipped.
That might prevent a lot of timeout MOZ_ASSERT() in our textureHost code.
Should we update all textureHost timeout call or update the BeginFrame() at this moment?

::: gfx/layers/d3d11/CompositorD3D11.cpp
@@ +1195,5 @@
>      MOZ_ASSERT(mutex);
>      HRESULT hr = mutex->AcquireSync(0, 10000);
>      if (hr == WAIT_TIMEOUT) {
> +      hr = mDevice->GetDeviceRemovedReason();
> +      if (hr == S_OK) {

If the device status is normal, we use crash for the timeout.

@@ +1203,5 @@
> +
> +        // Since the timeout is related to the driver-removed, clear the
> +        // render-bounding size to skip this frame.
> +        gfxCriticalNote << "GFX: D3D11 timeout with device-removed:" << gfx::hexa(hr);
> +        *aRenderBoundsOut = IntRect();

If this is related to driver-removed, empty the renderBound to skip this frame.
Attachment #8760166 - Flags: review?(milan)
Attachment #8760166 - Flags: review?(bas) → review+
Comment on attachment 8760166 [details] [diff] [review]
check device-removed status when we have timeout. v2

Review of attachment 8760166 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/d3d11/CompositorD3D11.cpp
@@ +1197,5 @@
>      if (hr == WAIT_TIMEOUT) {
> +      hr = mDevice->GetDeviceRemovedReason();
> +      if (hr == S_OK) {
> +        // There is no driver-removed event. Crash with this timeout.
> +        MOZ_CRASH("GFX: D3D11 timeout");

I would change this message slightly: MOZ_CRASH("GFX: D3D11 normal status timeout"); for example. That way, we can quickly search for the old type of crashes (we time out) vs. new type of crashes (we time out without a device reset) and can more easily find out if the original problem was fixed.
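For readers following the review, the control flow under discussion can be sketched outside of Gecko roughly as follows. This is only an illustration under stated assumptions, not the landed patch: `MockMutex`, `MockDevice`, `FrameAction`, `FrameDecision`, `BeginFrameSketch` and the simplified `IntRect` are all hypothetical, and the "crash" branch returns a value instead of calling MOZ_CRASH so the logic can be exercised in a test. The two message strings are the ones quoted in the review comments, and the numeric constants mirror the Windows values of S_OK, WAIT_TIMEOUT and DXGI_ERROR_DEVICE_REMOVED.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-ins so the sketch builds without the Windows SDK.
using HResult = std::uint32_t;
constexpr HResult kOk            = 0x00000000u;  // S_OK
constexpr HResult kWaitTimeout   = 0x00000102u;  // WAIT_TIMEOUT
constexpr HResult kDeviceRemoved = 0x887A0005u;  // DXGI_ERROR_DEVICE_REMOVED

// Simplified rectangle; the real gfx::IntRect is richer than this.
struct IntRect {
  int x, y, width, height;
  bool IsEmpty() const { return width <= 0 || height <= 0; }
};

// Mock keyed mutex: returns a canned result instead of waiting.
struct MockMutex {
  HResult result;
  HResult AcquireSync(std::uint64_t /*key*/, std::uint32_t /*timeoutMs*/) const {
    return result;
  }
};

// Mock device: reports a canned device-removed reason.
struct MockDevice {
  HResult removedReason;
  HResult GetDeviceRemovedReason() const { return removedReason; }
};

enum class FrameAction { Render, SkipFrame, Crash };

struct FrameDecision {
  FrameAction action;
  std::string note;
};

// On timeout, ask the device whether it was removed. A healthy device means
// the timeout is unexplained (crash with a distinct message); a removed
// device means the frame is skipped by emptying the render bounds.
inline FrameDecision BeginFrameSketch(const MockMutex& mutex,
                                      const MockDevice& device,
                                      IntRect* aRenderBoundsOut) {
  HResult hr = mutex.AcquireSync(0, 10000);
  if (hr == kWaitTimeout) {
    hr = device.GetDeviceRemovedReason();
    if (hr == kOk) {
      // Assumption: the real code would call MOZ_CRASH here.
      return {FrameAction::Crash, "GFX: D3D11 normal status timeout"};
    }
    *aRenderBoundsOut = IntRect{};  // Empty bounds: nothing to draw.
    return {FrameAction::SkipFrame, "GFX: D3D11 timeout with device-removed"};
  }
  return {FrameAction::Render, ""};
}
```

Keeping the two messages distinct is what makes the suggested search possible: a timeout on a healthy device and a timeout caused by device removal end up under different strings.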
Attachment #8760166 - Flags: review?(milan) → review+
Attachment #8760166 - Attachment is obsolete: true
Pushed by cbook@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/afd3d8815462
check device-removed status when we have timeout. r=milan, r=bas
Keywords: checkin-needed
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla50
This is still reproducible on Fx50, based on the last 2 months of crash data.

SIGNATURE | mozilla::layers::CompositorD3D11::BeginFrame
----------------------------------------------------------
CRASH STATS | http://tinyurl.com/hc7otrn
----------------------------------------------------------
OVERVIEW | 33 crashes on nightly 52
         | 127 crashes on nightly 51
         | 23 crashes on aurora 51
         | 2 crashes on nightly 50
         | 12 crashes on aurora 50
         | 4 crashes on beta 50
----------------------------------------------------------
LAST CRASH | 2016-09-26 (on 50.0b1, 52.0a1)
Bug 1306168 is tracking ongoing crashes with this signature.