Closed Bug 1362366 Opened 8 years ago Closed 7 years ago

Crash in mozilla::layers::LockD3DTexture<T>

Categories

(Core :: Graphics: Layers, defect)

All
Windows
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox-esr52 --- wontfix
firefox53 --- wontfix
firefox54 - wontfix
firefox55 + fixed

People

(Reporter: philipp, Assigned: vliu)

References

Details

(Keywords: crash, regression)

Crash Data

Attachments

(1 file, 1 obsolete file)

[Tracking Requested - why for this release]: this signature is the #6 browser crash on nightly currently and is regressing in volume in 55.0a1. This bug was filed from the Socorro interface and is report bp-ec46cfdb-e252-4bd0-af22-0dc8b0170505. ============================================================= Crashing Thread (3) Frame Module Signature Source 0 xul.dll mozilla::layers::LockD3DTexture<ID3D11Texture2D> gfx/layers/d3d11/TextureD3D11.cpp:216 1 xul.dll mozilla::layers::DXGITextureHostD3D11::LockInternal() gfx/layers/d3d11/TextureD3D11.cpp:819 2 xul.dll mozilla::layers::ImageHost::Lock() gfx/layers/composite/ImageHost.cpp:403 3 xul.dll mozilla::layers::AutoLockCompositableHost::AutoLockCompositableHost(mozilla::layers::CompositableHost*) obj-firefox/dist/include/CompositableHost.h:259 4 xul.dll mozilla::layers::ImageHost::Composite(mozilla::layers::Compositor*, mozilla::layers::LayerComposite*, mozilla::layers::EffectChain&, float, mozilla::gfx::Matrix4x4Typed<mozilla::gfx::UnknownUnits, mozilla::gfx::UnknownUnits> const&, mozilla::gfx::SamplingFilter, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const*, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ImageHost.cpp:217 5 xul.dll <lambda_900b9b3618a2532715a35ddaee6fe7a4>::operator() gfx/layers/composite/ImageLayerComposite.cpp:105 6 xul.dll mozilla::layers::RenderWithAllMasks<<lambda_900b9b3618a2532715a35ddaee6fe7a4> >(mozilla::layers::Layer*, mozilla::layers::Compositor*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, <lambda_900b9b3618a2532715a35ddaee6fe7a4>) obj-firefox/dist/include/mozilla/layers/LayerManagerComposite.h:753 7 xul.dll mozilla::layers::ImageLayerComposite::RenderLayer(mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ImageLayerComposite.cpp:102 8 xul.dll mozilla::layers::RenderLayers<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::RenderTargetPixel> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ContainerLayerComposite.cpp:471 9 xul.dll mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ContainerLayerComposite.cpp:636 10 xul.dll mozilla::layers::RenderLayers<mozilla::layers::RefLayerComposite>(mozilla::layers::RefLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::RenderTargetPixel> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ContainerLayerComposite.cpp:471 11 xul.dll mozilla::layers::ContainerRender<mozilla::layers::RefLayerComposite>(mozilla::layers::RefLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ContainerLayerComposite.cpp:636 12 xul.dll mozilla::layers::RenderLayers<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::RenderTargetPixel> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ContainerLayerComposite.cpp:471 13 xul.dll mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ContainerLayerComposite.cpp:636 14 xul.dll mozilla::layers::LayerManagerComposite::Render(mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&) gfx/layers/composite/LayerManagerComposite.cpp:928 15 xul.dll mozilla::layers::LayerManagerComposite::UpdateAndRender() gfx/layers/composite/LayerManagerComposite.cpp:515 16 xul.dll mozilla::layers::LayerManagerComposite::EndTransaction(mozilla::TimeStamp const&, mozilla::layers::LayerManager::EndTransactionFlags) gfx/layers/composite/LayerManagerComposite.cpp:436 17 xul.dll mozilla::layers::CompositorBridgeParent::CompositeToTarget(mozilla::gfx::DrawTarget*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const*) gfx/layers/ipc/CompositorBridgeParent.cpp:1008 18 xul.dll mozilla::layers::CompositorVsyncScheduler::Composite(mozilla::TimeStamp) gfx/layers/ipc/CompositorVsyncScheduler.cpp:258 19 xul.dll mozilla::detail::RunnableMethodImpl<SoftwareDisplay* const, void ( SoftwareDisplay::*)(mozilla::TimeStamp), 1, 1, mozilla::TimeStamp>::Run() obj-firefox/dist/include/nsThreadUtils.h:909 20 xul.dll MessageLoop::RunTask(already_AddRefed<mozilla::Runnable>) ipc/chromium/src/base/message_loop.cc:361 21 xul.dll MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask&&) ipc/chromium/src/base/message_loop.cc:369 22 xul.dll MessageLoop::DoWork() ipc/chromium/src/base/message_loop.cc:444 23 xul.dll base::MessagePumpForUI::DoRunLoop() ipc/chromium/src/base/message_pump_win.cc:212 24 xul.dll base::MessagePumpWin::Run(base::MessagePump::Delegate*) ipc/chromium/src/base/message_pump_win.h:80 25 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc:231 26 xul.dll MessageLoop::Run() ipc/chromium/src/base/message_loop.cc:211 27 xul.dll base::Thread::ThreadMain() ipc/chromium/src/base/thread.cc:179 28 xul.dll `anonymous namespace'::ThreadFunc ipc/chromium/src/base/platform_thread_win.cc:28 29 kernel32.dll BaseThreadInitThunk 30 ntdll.dll RtlUserThreadStart filing a new bug for the signature to get some traction on it. the signature has been around for a while but on nightly it is jumping up in frequency since build 20170323030203. this would be the changelog to the day before: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=201231223cd4354a450c3e5d80959f35b8e4cf0c&tochange=7513b3f42058e9bcf9950d4acf4647d4ad2240f0 out of that, perhaps bug 1343814 is related? Correlations for Firefox Nightly (22.05% in signature vs 00.17% overall) GFX_ERROR "(nsWindow) Detected device reset: " = true [100.0% vs 00.22% if startup_crash = 0] (22.05% in signature vs 00.17% overall) GFX_ERROR "(nsWindow) Finished device reset." = true [100.0% vs 00.22% if startup_crash = 0] (100.0% in signature vs 04.79% overall) address = 0x0 (100.0% in signature vs 12.06% overall) reason = EXCEPTION_ACCESS_VIOLATION_READ (100.0% in signature vs 31.48% overall) moz_crash_reason = null (100.0% in signature vs 36.63% overall) abort_message = null (53.54% in signature vs 01.26% overall) Module "msvproc.dll" = true [69.88% vs 02.55% if platform_pretty_version = Windows 10] (16.54% in signature vs 00.59% overall) GFX_ERROR "(gfxWindowsPlatform) Finished device reset." = true [75.00% vs 00.66% if startup_crash = 0] (16.54% in signature vs 00.61% overall) GFX_ERROR "(gfxWindowsPlatform) Detected device reset: " = true [75.00% vs 00.69% if startup_crash = 0] (89.76% in signature vs 00.70% overall) GFX_ERROR "GFX: D3D11 skip BeginFrame with device-removed." = true [100.0% vs 38.51% if process_type = gpu]
Crash report showed that EXCEPTION_ACCESS_VIOLATION_READ happens in [1]. In mini dump, aTexture in [1] was nullptr so we crashed in accessing QueryInterface. [1]: https://searchfox.org/mozilla-central/rev/8b70b0a5038ef9472fe7c53e04a70c978bb06aed/gfx/layers/d3d11/TextureD3D11.cpp#216 But more looked into it, device-removed happens many times before hitting the crash. Maybe that is why we don't have D3D11Texture in mTextureSource to lock it. [G0][GFX1-]: GFX: D3D11 skip BeginFrame with device-removed.
225 crashes in the last week on nightly - that is a lot. 54 also seems to be affected. Vincent, is this something you are going to work on further? Or should we try to find someone else?
Flags: needinfo?(vliu)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #2) > 225 crashes in the last week on nightly - that is a lot. 54 also seems to be > affected. > Vincent, is this something you are going to work on further? Or should we > try to find someone else? Currently I don't have any further thought to figure out the possible reason why this crash happens. Maybe David can have any idea with that.
Flags: needinfo?(vliu) → needinfo?(dvander)
No idea, unfortunately. Looking at crash-stats it's almost exclusively on Windows 8+10, which means GPU process isolation is kicking in. Is this crash correlated with device resets?
Flags: needinfo?(dvander)
I found there was code we are missing checking which may need to have a patch to fix. When We have a calling in [1], it will call into [2] to construct DataTextureSourceD3D11. In this case, it seems that aProvider->GetD3D11Device() doesn't have any check to make sure whether it is valid or not. Is that missing relative to this issue? [1]: https://searchfox.org/mozilla-central/rev/224cc663d54085994a4871ef464b7662e0721e83/gfx/layers/d3d11/TextureD3D11.cpp#828 [2]: https://searchfox.org/mozilla-central/rev/224cc663d54085994a4871ef464b7662e0721e83/gfx/layers/d3d11/TextureD3D11.cpp#196
Flags: needinfo?(dvander)
Probably not, though a correlation with device resets could make a better argument. Or you could MOZ_RELEASE_ASSERT that mProvider->IsValid is true.
Flags: needinfo?(dvander)
See Also: → 1363675
Assignee: nobody → vliu
Hi David, From back trace in mini-dump, it might have chance to call [1] to NULL mTexture in mTextureSource[2]. If it did, Gecko might hit access violation in [3] when Lock D3DTexture. The above case might happens when Content device was changed[4]. [1]: http://searchfox.org/mozilla-central/rev/484d2b7f51b7aed035147bbb4a565061659d9278/gfx/layers/composite/ImageHost.cpp#213 [2]: http://searchfox.org/mozilla-central/rev/484d2b7f51b7aed035147bbb4a565061659d9278/gfx/layers/d3d11/TextureD3D11.cpp#1144 [3]: http://searchfox.org/mozilla-central/rev/484d2b7f51b7aed035147bbb4a565061659d9278/gfx/layers/d3d11/TextureD3D11.cpp#212 [4]: http://searchfox.org/mozilla-central/rev/484d2b7f51b7aed035147bbb4a565061659d9278/gfx/layers/d3d11/TextureD3D11.cpp#1174 To avoid this, the attached patch intends to NULL mTextureSource when device change was detected. After that, we may create a new mTextureSource when Lock D3DTexture is called. Could you please have a review? Really thanks.
Attachment #8867681 - Flags: review?(dvander)
Comment on attachment 8867681 [details] [diff] [review] 0001-Bug-1362366-Set-mTextureSource-to-nullptr-when-ID3D1.patch Review of attachment 8867681 [details] [diff] [review]: ----------------------------------------------------------------- ::: gfx/layers/d3d11/TextureD3D11.cpp @@ +766,5 @@ > mTextureSource = nullptr; > return; > } > > mProvider = aProvider; If mDevice is changed, I think you need to change mTextureSource in line 770, why do you want to cleanup everything in line 762? @@ +1166,5 @@ > } > > void > DataTextureSourceD3D11::SetTextureSourceProvider(TextureSourceProvider* aProvider) > { I think it's better to add assertion if we detect mDevice changed.
There's a couple instances of this on ESR52 as well, but low volume enough that I think we can wontfix. Feel free to set the status back to affected if you want to consider this for backport there as well.
Comment on attachment 8867681 [details] [diff] [review] 0001-Bug-1362366-Set-mTextureSource-to-nullptr-when-ID3D1.patch Okay, this looks good. I agree with Peter re: adding an assert in DataTextureSourceD3D11, but actually you could just remove the whole function and override. It doesn't do anything after your patch, and it should have no other callers. Whichever you do, I would also add this before you set mTextureSource = nullptr: > if (mTextureSource) { > mTextureSource->Reset(); > } Just in case some layer or cache is retaining it.
Attachment #8867681 - Flags: review?(dvander) → review+
Track 54- as the volume of crashes in Beta54 is low.
(In reply to David Anderson [:dvander] from comment #10) > Comment on attachment 8867681 [details] [diff] [review] > 0001-Bug-1362366-Set-mTextureSource-to-nullptr-when-ID3D1.patch > > Okay, this looks good. I agree with Peter re: adding an assert in > DataTextureSourceD3D11, but actually you could just remove the whole > function and override. It doesn't do anything after your patch, and it > should have no other callers. > May I clarify that the function I could just remove the whole function and override was [1] pointed to? [1]: http://searchfox.org/mozilla-central/source/gfx/layers/d3d11/TextureD3D11.cpp#1185-1199 I have two questions. 1. DXGIYCbCrTextureHostD3D11 also use this override. Since it called SetNextSibling() in [2], mNextSibling would be valid. In this case, should we still keep this function for DXGIYCbCrTextureHostD3D11? [2]: http://searchfox.org/mozilla-central/source/gfx/layers/d3d11/TextureD3D11.cpp#968-969 2. if the override function [1] was removed, [3] would call into its base case (class TextSource). The behavior would be like other TextureSouce (MacIOSurfaceTextureSourceBasic is a case). Based on this, does it have any concern for DXGIYCbCrTextureHostD3D11 and DXGITextureHostD3D11. [3]: http://searchfox.org/mozilla-central/rev/79f625ac56d936ef5d17ebe13a123b25efee4241/gfx/layers/d3d11/TextureD3D11.cpp#1197 Please correct me if I got anything wrong. Really thanks.
Flags: needinfo?(dvander)
IMO, DXGIYCbCrTextureHostD3D11 and DXGITextureHostD3D11 are basically the same thing. Probably in both cases we should actually revoke the device on the sources, but just doing nothing is probably fine too (compositing should re-acquire the source every frame, and therefore get no TextureSource until the device reset is handled). Does that answer your question?
Flags: needinfo?(dvander)
That is: the code for handling a different TextureSourceProvider should look the same for both.
(In reply to David Anderson [:dvander] from comment #13) > IMO, DXGIYCbCrTextureHostD3D11 and DXGITextureHostD3D11 are basically the > same thing. Probably in both cases we should actually revoke the device on > the sources, but just doing nothing is probably fine too (compositing should > re-acquire the source every frame, and therefore get no TextureSource until > the device reset is handled). > > Does that answer your question? Thanks for your reply. Based on above, I drafted a patch to fix this issue. Could you please review again my patch to make sure the patch get the thing right. Really thanks
Attachment #8867681 - Attachment is obsolete: true
Attachment #8870741 - Flags: review?(dvander)
Attachment #8870741 - Flags: review?(dvander) → review+
Pushed by vliu@mozilla.com: https://hg.mozilla.org/integration/mozilla-inbound/rev/e22128ea91bf Set DataTextureSourceD3D11 to nullptr once devce change was detected. r=dvander
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Crash volume seems pretty low on 54. With the RC build happening on Monday, can we let this ride the 55 train at this point or did you want to request Beta approval on it? If yes, please do so ASAP :).
Flags: needinfo?(vliu)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #19) > Crash volume seems pretty low on 54. With the RC build happening on Monday, > can we let this ride the 55 train at this point or did you want to request > Beta approval on it? If yes, please do so ASAP :). We just let this ride the 55 train and I won't request Beta approval since the Crash volume pretty low. Please have comment if anyone need this Beta approval.
Flags: needinfo?(vliu)
Blocks: 1709600
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: