Closed Bug 1202700 Opened 8 years ago Closed 8 years ago

crash in igd10umd32.dll@0x18f35 coming from mozilla::layers::DataTextureSourceD3D11::Update


(Core :: Graphics, defect)

Windows NT
Not set



Tracking Status
firefox41 --- unaffected
firefox42 --- unaffected
firefox43 --- fixed


(Reporter: kairo, Assigned: mattwoodrow)



(Keywords: crash)

Crash Data


(1 file)

[Tracking Requested - why for this release]:

This bug was filed from the Socorro interface and is report bp-2f88ebca-3c70-42b9-bc25-2d6502150908.

This may be related to bug 1098597, which shares the signature but seems to have somewhat different stacks - but even those go through mozilla::layers::DataTextureSourceD3D11::Update.

Stack Trace:
0 	igd10umd32.dll 	igd10umd32.dll@0x18f35 	
Ø 1 	igd10umd32.dll 	igd10umd32.dll@0x7a07 	
Ø 2 	igd10umd32.dll 	igd10umd32.dll@0x7024 	
Ø 3 	igd10umd32.dll 	igd10umd32.dll@0x335f 	
4 	xul.dll 	mozilla::layers::DataTextureSourceD3D11::Update(mozilla::gfx::DataSourceSurface*, nsIntRegion*, mozilla::gfx::IntPointTyped<mozilla::gfx::UnknownUnits>*) 	gfx/layers/d3d11/TextureD3D11.cpp
5 	xul.dll 	mozilla::layers::BufferTextureHost::Upload(nsIntRegion*) 	gfx/layers/composite/TextureHost.cpp
6 	xul.dll 	mozilla::layers::BufferTextureHost::MaybeUpload(nsIntRegion*) 	gfx/layers/composite/TextureHost.cpp
7 	xul.dll 	mozilla::layers::BufferTextureHost::UpdatedInternal(nsIntRegion const*) 	gfx/layers/composite/TextureHost.cpp
8 	xul.dll 	mozilla::layers::TextureHost::Updated(nsIntRegion const*) 	gfx/layers/composite/TextureHost.cpp
9 	xul.dll 	mozilla::layers::ContentHostSingleBuffered::UpdateThebes(mozilla::layers::ThebesBufferData const&, nsIntRegion const&, nsIntRegion const&, nsIntRegion*) 	gfx/layers/composite/ContentHost.cpp
10 	xul.dll 	mozilla::layers::CompositableParentManager::ReceiveCompositableUpdate(mozilla::layers::CompositableOperation const&, std::vector<mozilla::layers::EditReply, std::allocator<mozilla::layers::EditReply> >&) 	gfx/layers/ipc/CompositableTransactionParent.cpp
11 	xul.dll 	mozilla::layers::LayerTransactionParent::RecvUpdate(nsTArray<mozilla::layers::Edit>&&, unsigned __int64 const&, mozilla::layers::TargetConfig const&, nsTArray<mozilla::layers::PluginWindowData>&&, bool const&, bool const&, unsigned int const&, bool const&, mozilla::TimeStamp const&, nsTArray<mozilla::layers::EditReply>*) 	gfx/layers/ipc/LayerTransactionParent.cpp

These crashes spiked in Firefox 41.0b7 over the last weekend and now account for 1.7% of all b7 crashes (rank #5). The signature exists in other versions too, where it mostly looks like bug 1098597. This 40.0.3 crash looks more like the stack in here as well, though: bp-3ae588ee-c09d-4669-b5b8-21fcd2150908
On beta 41.0b7 this seems to be confined to adapters from the Intel GMA 4500 series:
Rank	Adapter device ID	Crashes	Share
1	0x2a42	668	92.39 %
2	0x2e12	47	6.50 %
3	0x2e22	8	1.11 %
Given this signature spans multiple stacks and the bug here is only tracking a spike in 41.0b7 for the specific stack, I'm not sure how to break this down further; at least not without doing a manual report-by-report correlation.

That said, here is the 41.0b7 breakdown for the signature by driver versions:
1 	313 	42.59 %
2 	181 	24.63 %
3 	104 	14.15 %
4 	73 	9.93 %
5 	62 	8.44 %
6 	2 	0.27 %

The latest driver is from November 6, 2013 so maybe we can look at blocklisting as a workaround.
The spike is on Beta 7 only?
(In reply to Milan Sreckovic [:milan] from comment #3)
> The spike is on Beta 7 only?

That's what KaiRo said in the Release Coordination meeting today. I'll let him elaborate.
We blocklist D2D on all driver versions for these cards, and all functionality on driver version  Perhaps we should blocklist all versions lower than as well.  It would be good to understand what caused the increase, though I didn't see anything obvious in the beta 6 -> beta 7 patches.

Jeff, what do you think about blocklisting all the drivers below the one above?

Anthony, do we have the telemetry for which other GMAX4500 drivers we're seeing, that are not in the list above?  It'd be good to understand if we'd be blocklisting versions that didn't have problems.
Flags: needinfo?(jmuizelaar)
Flags: needinfo?(
(In reply to Milan Sreckovic [:milan] from comment #5)
> Anthony, do we have the telemetry for which other GMAX4500 drivers we're
> seeing, that are not in the list above?  It'd be good to understand if we'd
> be blocklisting versions that didn't have problems.

As far as I know the list of driver versions I provided in comment 2 is a complete list.
Flags: needinfo?(
(In reply to Anthony Hughes, QA Mentor (:ashughes) from comment #4)
> (In reply to Milan Sreckovic [:milan] from comment #3)
> > The spike is on Beta 7 only?
> That's what KaiRo said in the Release Coordination meeting today. I'll let
> him elaborate.

Yes, this spike is b7 only so far (we don't have usable data for b8 yet as we just shipped that today).
I suspect this may ultimately be an OOM issue. A large number of these reports have dangerously-low virtual and/or physical memory availability.

FWIW three out of nine comments from b7 mention playing video.
Wonder if the patch for bug 1193547 could be involved; it landed between beta 6 and beta 7 from what I can tell.
Flags: needinfo?(matt.woodrow)
I think that is the problem; we should back that out of aurora/beta (for bug 1202296 as well).

I'm pretty sure these devices are ones that fail the DoesD3D11TextureSharingWork() test, which is why we're taking this specific path.

When playing HD video without the patch, the decoder would upload to D3D9 textures internally (D3D9 textures don't seem to ever have issues with sharing, unlike D3D11 ones). With the patch, the decoder outputs system memory; we then see that texture sharing doesn't work, copy the data into shmem, and do the upload on the compositor.

I guess this could use more memory (since we have the shmem copy, as well as the GPU copy), but it's also possible that this is just a spike in this particular allocation stack rather than a true crash spike. Or a combination of both.

Let's back out for now. I think we can upload to d3d9 on the client side to more closely match the previous behaviour.
Flags: needinfo?(matt.woodrow)
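
For illustration only, the path selection and the memory concern described in the comment above can be sketched roughly as follows. None of these names are Gecko APIs: UploadPath, ChoosePath, FrameBytes and FootprintBytes are hypothetical, the sharingWorks flag merely stands in for the result of DoesD3D11TextureSharingWork(), the "sharing works" branch is an assumption the comment does not spell out, and the frame format (8-bit YCbCr 4:2:0) is assumed as well.

```cpp
#include <cassert>
#include <cstddef>

// All names below are hypothetical stand-ins, not Gecko APIs.
enum class UploadPath {
  DecoderD3D9,      // before the patch: the decoder uploads to D3D9 textures itself
  SharedD3D11,      // assumed: system-memory output uploaded to a shared D3D11 texture
  CompositorShmem,  // system-memory output copied to shmem, uploaded by the compositor
};

// sharingWorks stands in for the result of DoesD3D11TextureSharingWork().
UploadPath ChoosePath(bool patchApplied, bool sharingWorks) {
  if (!patchApplied) return UploadPath::DecoderD3D9;
  return sharingWorks ? UploadPath::SharedD3D11 : UploadPath::CompositorShmem;
}

// Bytes in one 8-bit YCbCr 4:2:0 frame (assumed format): a full-size luma
// plane plus two chroma planes at half resolution in each dimension.
std::size_t FrameBytes(std::size_t width, std::size_t height) {
  assert(width % 2 == 0 && height % 2 == 0);
  return width * height + 2 * (width / 2) * (height / 2);
}

// Rough per-frame footprint: the shmem path keeps a shmem copy as well as
// the GPU copy, so it is counted twice here.
std::size_t FootprintBytes(UploadPath path, std::size_t width, std::size_t height) {
  const std::size_t frame = FrameBytes(width, height);
  return path == UploadPath::CompositorShmem ? 2 * frame : frame;
}
```

Under these assumptions a 1920x1080 frame is 3,110,400 bytes, so the shmem path would hold about 6.2 MB per frame instead of about 3.1 MB. That is only a rough model of the "shmem copy, as well as the GPU copy" concern, not a measurement.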
Tracked as this crash is in the top 10 for FF41.
FWIW, the spike persists in b8, so that's more confirmation that a change between b6 and b7 triggered the issue of this signature rising significantly (and thanks for digging and finding a possible culprit).
I backed bug 1193547 out of aurora/beta, so this should only affect nightly now.
This more closely matches what the MFTransform would do, and uploads to d3d9 on the client side.

This should stop us needing to keep shmem around and keep memory usage a bit lower.
Assignee: nobody → matt.woodrow
Attachment #8658941 - Flags: review?(bas)
Comment on attachment 8658941 [details] [diff] [review]
Upload to d3d9 textures

Review of attachment 8658941 [details] [diff] [review]:

::: gfx/layers/IMFYCbCrImage.cpp
@@ +231,1 @@
>        return GetD3D9TextureClient(aClient);

Probably worth a comment to note this will return null in case there is no D3D9 device.
Attachment #8658941 - Flags: review?(bas) → review+
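
The review note about the null return could be captured along these lines. This is a minimal sketch with hypothetical types (TextureClient, Device9, MakeD3D9TextureClient, DescribeBackend); it is not the code from the attachment, and what the real caller does when it gets null back is not stated in this bug, so the "fallback" branch here is an assumption.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical stand-ins; these are not the Gecko types.
struct Device9 {};
struct TextureClient {
  std::string backend;
};

// Returns null when there is no D3D9 device, which is the case the review
// comment asks to have documented.
std::unique_ptr<TextureClient> MakeD3D9TextureClient(const Device9* device) {
  if (!device) {
    return nullptr;  // no D3D9 device available
  }
  return std::make_unique<TextureClient>(TextureClient{"d3d9"});
}

// Caller side: the null return has to be checked before the client is used.
// Reporting "fallback" here is an assumption made for the sketch.
std::string DescribeBackend(const Device9* device) {
  std::unique_ptr<TextureClient> client = MakeD3D9TextureClient(device);
  if (!client) {
    return "fallback";
  }
  assert(client->backend == "d3d9");
  return client->backend;
}
```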
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Flags: needinfo?(jmuizelaar)