Closed
Bug 1362366
Opened 8 years ago
Closed 7 years ago
Crash in mozilla::layers::LockD3DTexture<T>
Categories
(Core :: Graphics: Layers, defect)
Tracking
()
RESOLVED
FIXED
mozilla55
People
(Reporter: philipp, Assigned: vliu)
References
Details
(Keywords: crash, regression)
Crash Data
Attachments
(1 file, 1 obsolete file)
3.42 KB,
patch
|
dvander
:
review+
|
Details | Diff | Splinter Review |
[Tracking Requested - why for this release]:
this signature is the #6 browser crash on nightly currently and is regressing in volume in 55.0a1.
This bug was filed from the Socorro interface and is
report bp-ec46cfdb-e252-4bd0-af22-0dc8b0170505.
=============================================================
Crashing Thread (3)
Frame Module Signature Source
0 xul.dll mozilla::layers::LockD3DTexture<ID3D11Texture2D> gfx/layers/d3d11/TextureD3D11.cpp:216
1 xul.dll mozilla::layers::DXGITextureHostD3D11::LockInternal() gfx/layers/d3d11/TextureD3D11.cpp:819
2 xul.dll mozilla::layers::ImageHost::Lock() gfx/layers/composite/ImageHost.cpp:403
3 xul.dll mozilla::layers::AutoLockCompositableHost::AutoLockCompositableHost(mozilla::layers::CompositableHost*) obj-firefox/dist/include/CompositableHost.h:259
4 xul.dll mozilla::layers::ImageHost::Composite(mozilla::layers::Compositor*, mozilla::layers::LayerComposite*, mozilla::layers::EffectChain&, float, mozilla::gfx::Matrix4x4Typed<mozilla::gfx::UnknownUnits, mozilla::gfx::UnknownUnits> const&, mozilla::gfx::SamplingFilter, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const*, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ImageHost.cpp:217
5 xul.dll <lambda_900b9b3618a2532715a35ddaee6fe7a4>::operator() gfx/layers/composite/ImageLayerComposite.cpp:105
6 xul.dll mozilla::layers::RenderWithAllMasks<<lambda_900b9b3618a2532715a35ddaee6fe7a4> >(mozilla::layers::Layer*, mozilla::layers::Compositor*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, <lambda_900b9b3618a2532715a35ddaee6fe7a4>) obj-firefox/dist/include/mozilla/layers/LayerManagerComposite.h:753
7 xul.dll mozilla::layers::ImageLayerComposite::RenderLayer(mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ImageLayerComposite.cpp:102
8 xul.dll mozilla::layers::RenderLayers<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::RenderTargetPixel> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ContainerLayerComposite.cpp:471
9 xul.dll mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ContainerLayerComposite.cpp:636
10 xul.dll mozilla::layers::RenderLayers<mozilla::layers::RefLayerComposite>(mozilla::layers::RefLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::RenderTargetPixel> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ContainerLayerComposite.cpp:471
11 xul.dll mozilla::layers::ContainerRender<mozilla::layers::RefLayerComposite>(mozilla::layers::RefLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ContainerLayerComposite.cpp:636
12 xul.dll mozilla::layers::RenderLayers<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::RenderTargetPixel> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ContainerLayerComposite.cpp:471
13 xul.dll mozilla::layers::ContainerRender<mozilla::layers::ContainerLayerComposite>(mozilla::layers::ContainerLayerComposite*, mozilla::layers::LayerManagerComposite*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::Maybe<mozilla::gfx::PolygonTyped<mozilla::gfx::UnknownUnits> > const&) gfx/layers/composite/ContainerLayerComposite.cpp:636
14 xul.dll mozilla::layers::LayerManagerComposite::Render(mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&) gfx/layers/composite/LayerManagerComposite.cpp:928
15 xul.dll mozilla::layers::LayerManagerComposite::UpdateAndRender() gfx/layers/composite/LayerManagerComposite.cpp:515
16 xul.dll mozilla::layers::LayerManagerComposite::EndTransaction(mozilla::TimeStamp const&, mozilla::layers::LayerManager::EndTransactionFlags) gfx/layers/composite/LayerManagerComposite.cpp:436
17 xul.dll mozilla::layers::CompositorBridgeParent::CompositeToTarget(mozilla::gfx::DrawTarget*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const*) gfx/layers/ipc/CompositorBridgeParent.cpp:1008
18 xul.dll mozilla::layers::CompositorVsyncScheduler::Composite(mozilla::TimeStamp) gfx/layers/ipc/CompositorVsyncScheduler.cpp:258
19 xul.dll mozilla::detail::RunnableMethodImpl<SoftwareDisplay* const, void ( SoftwareDisplay::*)(mozilla::TimeStamp), 1, 1, mozilla::TimeStamp>::Run() obj-firefox/dist/include/nsThreadUtils.h:909
20 xul.dll MessageLoop::RunTask(already_AddRefed<mozilla::Runnable>) ipc/chromium/src/base/message_loop.cc:361
21 xul.dll MessageLoop::DeferOrRunPendingTask(MessageLoop::PendingTask&&) ipc/chromium/src/base/message_loop.cc:369
22 xul.dll MessageLoop::DoWork() ipc/chromium/src/base/message_loop.cc:444
23 xul.dll base::MessagePumpForUI::DoRunLoop() ipc/chromium/src/base/message_pump_win.cc:212
24 xul.dll base::MessagePumpWin::Run(base::MessagePump::Delegate*) ipc/chromium/src/base/message_pump_win.h:80
25 xul.dll MessageLoop::RunHandler() ipc/chromium/src/base/message_loop.cc:231
26 xul.dll MessageLoop::Run() ipc/chromium/src/base/message_loop.cc:211
27 xul.dll base::Thread::ThreadMain() ipc/chromium/src/base/thread.cc:179
28 xul.dll `anonymous namespace'::ThreadFunc ipc/chromium/src/base/platform_thread_win.cc:28
29 kernel32.dll BaseThreadInitThunk
30 ntdll.dll RtlUserThreadStart
filing a new bug for the signature to get some traction on it. the signature has been around for a while but on nightly it is jumping up in frequency since build 20170323030203. this would be the changelog to the day before: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=201231223cd4354a450c3e5d80959f35b8e4cf0c&tochange=7513b3f42058e9bcf9950d4acf4647d4ad2240f0
out of that, perhaps bug 1343814 is related?
Correlations for Firefox Nightly
(22.05% in signature vs 00.17% overall) GFX_ERROR "(nsWindow) Detected device reset: " = true [100.0% vs 00.22% if startup_crash = 0]
(22.05% in signature vs 00.17% overall) GFX_ERROR "(nsWindow) Finished device reset." = true [100.0% vs 00.22% if startup_crash = 0]
(100.0% in signature vs 04.79% overall) address = 0x0
(100.0% in signature vs 12.06% overall) reason = EXCEPTION_ACCESS_VIOLATION_READ
(100.0% in signature vs 31.48% overall) moz_crash_reason = null
(100.0% in signature vs 36.63% overall) abort_message = null
(53.54% in signature vs 01.26% overall) Module "msvproc.dll" = true [69.88% vs 02.55% if platform_pretty_version = Windows 10]
(16.54% in signature vs 00.59% overall) GFX_ERROR "(gfxWindowsPlatform) Finished device reset." = true [75.00% vs 00.66% if startup_crash = 0]
(16.54% in signature vs 00.61% overall) GFX_ERROR "(gfxWindowsPlatform) Detected device reset: " = true [75.00% vs 00.69% if startup_crash = 0]
(89.76% in signature vs 00.70% overall) GFX_ERROR "GFX: D3D11 skip BeginFrame with device-removed." = true [100.0% vs 38.51% if process_type = gpu]
Assignee | ||
Comment 1•8 years ago
|
||
Crash report showed that EXCEPTION_ACCESS_VIOLATION_READ happens in [1]. In mini dump, aTexture in [1] was nullptr so we crashed in accessing QueryInterface.
[1]: https://searchfox.org/mozilla-central/rev/8b70b0a5038ef9472fe7c53e04a70c978bb06aed/gfx/layers/d3d11/TextureD3D11.cpp#216
But more looked into it, device-removed happens many times before hitting the crash. Maybe that is why we don't have D3D11Texture in mTextureSource to lock it.
[G0][GFX1-]: GFX: D3D11 skip BeginFrame with device-removed.
Comment 2•8 years ago
|
||
225 crashes in the last week on nightly - that is a lot. 54 also seems to be affected.
Vincent, is this something you are going to work on further? Or should we try to find someone else?
Assignee | ||
Comment 3•8 years ago
|
||
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #2)
> 225 crashes in the last week on nightly - that is a lot. 54 also seems to be
> affected.
> Vincent, is this something you are going to work on further? Or should we
> try to find someone else?
Currently I don't have any further thought to figure out the possible reason why this crash happens. Maybe David can have any idea with that.
Flags: needinfo?(vliu) → needinfo?(dvander)
No idea, unfortunately. Looking at crash-stats it's almost exclusively on Windows 8+10, which means GPU process isolation is kicking in. Is this crash correlated with device resets?
Flags: needinfo?(dvander)
Assignee | ||
Comment 5•8 years ago
|
||
I found there was code we are missing checking which may need to have a patch to fix. When We have a calling in [1], it will call into [2] to construct DataTextureSourceD3D11. In this case, it seems that aProvider->GetD3D11Device() doesn't have any check to make sure whether it is valid or not. Is that missing relative to this issue?
[1]: https://searchfox.org/mozilla-central/rev/224cc663d54085994a4871ef464b7662e0721e83/gfx/layers/d3d11/TextureD3D11.cpp#828
[2]: https://searchfox.org/mozilla-central/rev/224cc663d54085994a4871ef464b7662e0721e83/gfx/layers/d3d11/TextureD3D11.cpp#196
Flags: needinfo?(dvander)
Probably not, though a correlation with device resets could make a better argument. Or you could MOZ_RELEASE_ASSERT that mProvider->IsValid is true.
Flags: needinfo?(dvander)
Updated•8 years ago
|
Assignee: nobody → vliu
Assignee | ||
Comment 7•7 years ago
|
||
Hi David,
From back trace in mini-dump, it might have chance to call [1] to NULL mTexture in mTextureSource[2]. If it did, Gecko might hit access violation in [3] when Lock D3DTexture. The above case might happens when Content device was changed[4].
[1]: http://searchfox.org/mozilla-central/rev/484d2b7f51b7aed035147bbb4a565061659d9278/gfx/layers/composite/ImageHost.cpp#213
[2]: http://searchfox.org/mozilla-central/rev/484d2b7f51b7aed035147bbb4a565061659d9278/gfx/layers/d3d11/TextureD3D11.cpp#1144
[3]: http://searchfox.org/mozilla-central/rev/484d2b7f51b7aed035147bbb4a565061659d9278/gfx/layers/d3d11/TextureD3D11.cpp#212
[4]: http://searchfox.org/mozilla-central/rev/484d2b7f51b7aed035147bbb4a565061659d9278/gfx/layers/d3d11/TextureD3D11.cpp#1174
To avoid this, the attached patch intends to NULL mTextureSource when device change was detected. After that, we may create a new mTextureSource when Lock D3DTexture is called. Could you please have a review? Really thanks.
Attachment #8867681 -
Flags: review?(dvander)
Comment 8•7 years ago
|
||
Comment on attachment 8867681 [details] [diff] [review]
0001-Bug-1362366-Set-mTextureSource-to-nullptr-when-ID3D1.patch
Review of attachment 8867681 [details] [diff] [review]:
-----------------------------------------------------------------
::: gfx/layers/d3d11/TextureD3D11.cpp
@@ +766,5 @@
> mTextureSource = nullptr;
> return;
> }
>
> mProvider = aProvider;
If mDevice is changed, I think you need to change mTextureSource in line 770, why do you want to cleanup everything in line 762?
@@ +1166,5 @@
> }
>
> void
> DataTextureSourceD3D11::SetTextureSourceProvider(TextureSourceProvider* aProvider)
> {
I think it's better to add assertion if we detect mDevice changed.
Comment 9•7 years ago
|
||
There's a couple instances of this on ESR52 as well, but low volume enough that I think we can wontfix. Feel free to set the status back to affected if you want to consider this for backport there as well.
Comment on attachment 8867681 [details] [diff] [review]
0001-Bug-1362366-Set-mTextureSource-to-nullptr-when-ID3D1.patch
Okay, this looks good. I agree with Peter re: adding an assert in DataTextureSourceD3D11, but actually you could just remove the whole function and override. It doesn't do anything after your patch, and it should have no other callers.
Whichever you do, I would also add this before you set mTextureSource = nullptr:
> if (mTextureSource) {
> mTextureSource->Reset();
> }
Just in case some layer or cache is retaining it.
Attachment #8867681 -
Flags: review?(dvander) → review+
Assignee | ||
Comment 12•7 years ago
|
||
(In reply to David Anderson [:dvander] from comment #10)
> Comment on attachment 8867681 [details] [diff] [review]
> 0001-Bug-1362366-Set-mTextureSource-to-nullptr-when-ID3D1.patch
>
> Okay, this looks good. I agree with Peter re: adding an assert in
> DataTextureSourceD3D11, but actually you could just remove the whole
> function and override. It doesn't do anything after your patch, and it
> should have no other callers.
>
May I clarify that the function I could just remove the whole function and override was [1] pointed to?
[1]: http://searchfox.org/mozilla-central/source/gfx/layers/d3d11/TextureD3D11.cpp#1185-1199
I have two questions.
1. DXGIYCbCrTextureHostD3D11 also use this override. Since it called SetNextSibling() in [2], mNextSibling would be valid.
In this case, should we still keep this function for DXGIYCbCrTextureHostD3D11?
[2]: http://searchfox.org/mozilla-central/source/gfx/layers/d3d11/TextureD3D11.cpp#968-969
2. if the override function [1] was removed, [3] would call into its base case (class TextSource).
The behavior would be like other TextureSouce (MacIOSurfaceTextureSourceBasic is a case). Based on this, does it have any concern for
DXGIYCbCrTextureHostD3D11 and DXGITextureHostD3D11.
[3]: http://searchfox.org/mozilla-central/rev/79f625ac56d936ef5d17ebe13a123b25efee4241/gfx/layers/d3d11/TextureD3D11.cpp#1197
Please correct me if I got anything wrong. Really thanks.
Flags: needinfo?(dvander)
IMO, DXGIYCbCrTextureHostD3D11 and DXGITextureHostD3D11 are basically the same thing. Probably in both cases we should actually revoke the device on the sources, but just doing nothing is probably fine too (compositing should re-acquire the source every frame, and therefore get no TextureSource until the device reset is handled).
Does that answer your question?
Flags: needinfo?(dvander)
That is: the code for handling a different TextureSourceProvider should look the same for both.
Assignee | ||
Comment 15•7 years ago
|
||
(In reply to David Anderson [:dvander] from comment #13)
> IMO, DXGIYCbCrTextureHostD3D11 and DXGITextureHostD3D11 are basically the
> same thing. Probably in both cases we should actually revoke the device on
> the sources, but just doing nothing is probably fine too (compositing should
> re-acquire the source every frame, and therefore get no TextureSource until
> the device reset is handled).
>
> Does that answer your question?
Thanks for your reply. Based on above, I drafted a patch to fix this issue. Could you please review again my patch to make sure the patch get the thing right. Really thanks
Attachment #8867681 -
Attachment is obsolete: true
Attachment #8870741 -
Flags: review?(dvander)
Attachment #8870741 -
Flags: review?(dvander) → review+
Assignee | ||
Comment 16•7 years ago
|
||
Comment 17•7 years ago
|
||
Pushed by vliu@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e22128ea91bf
Set DataTextureSourceD3D11 to nullptr once devce change was detected. r=dvander
Comment 18•7 years ago
|
||
bugherder |
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
Comment 19•7 years ago
|
||
Crash volume seems pretty low on 54. With the RC build happening on Monday, can we let this ride the 55 train at this point or did you want to request Beta approval on it? If yes, please do so ASAP :).
Flags: needinfo?(vliu)
Assignee | ||
Comment 20•7 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM] from comment #19)
> Crash volume seems pretty low on 54. With the RC build happening on Monday,
> can we let this ride the 55 train at this point or did you want to request
> Beta approval on it? If yes, please do so ASAP :).
We just let this ride the 55 train and I won't request Beta approval since the Crash volume pretty low. Please have comment if anyone need this Beta approval.
Flags: needinfo?(vliu)
Updated•7 years ago
|
You need to log in
before you can comment on or make changes to this bug.
Description
•