Closed Bug 1221348 Opened 4 years ago Closed 4 years ago

startup crash in igd10umd64.dll@0x1e669 when detecting texture sharing on 8.5.10.[18xx-1994] Intel driver

Categories

(Core :: Graphics, defect, critical)

Version: 44 Branch
Hardware: Unspecified
OS: Windows NT
Type: defect
Priority: Not set
Severity: critical

Tracking

RESOLVED FIXED
Tracking Status
firefox44 + fixed
firefox45 + fixed
b2g-v2.5 --- fixed

People

(Reporter: ashughes, Assigned: BenWa)

Details

(Keywords: crash, topcrash-win)

Attachments

(2 files, 6 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-cfe21b32-417b-4dfc-8436-cca942151103.
=============================================================
Ø 0 	igd10umd64.dll 	igd10umd64.dll@0x1e669 	
Ø 1 	igd10umd64.dll 	igd10umd64.dll@0x2ad04b 	
Ø 2 	igd10umd64.dll 	igd10umd64.dll@0x1017a 	
Ø 3 	kernel32.dll 	kernel32.dll@0x21a79 	
Ø 4 	igd10umd64.dll 	igd10umd64.dll@0xa58a 	
Ø 5 	igd10umd64.dll 	igd10umd64.dll@0x2ad04b 	
6 	d3d11.dll 	NDXGI::CDevice::AllocateCB(void*, _D3DDDICB_ALLOCATE*) 	
Ø 7 	igd10umd64.dll 	igd10umd64.dll@0x87de 	
Ø 8 	ntdll.dll 	ntdll.dll@0x11143f 	
Ø 9 	igd10umd64.dll 	igd10umd64.dll@0x7bf5 	
Ø 10 	igd10umd64.dll 	igd10umd64.dll@0x2d8e4 	
Ø 11 	igd10umd64.dll 	igd10umd64.dll@0xb6a3 	
Ø 12 	igd10umd64.dll 	igd10umd64.dll@0xa2ca 	
13 	d3d11.dll 	stdext::_Hash<stdext::_Hmap_traits<unsigned long, SDDIErrorCtx*, stdext::hash_compare<unsigned long, std::less<unsigned long> >, std::allocator<std::pair<unsigned long const, SDDIErrorCtx*> >, 0> >::find(unsigned long const&) 	
Ø 14 	igd10umd64.dll 	igd10umd64.dll@0x97d3 	
Ø 15 	igd10umd64.dll 	igd10umd64.dll@0x2b5b 	
16 	d3d11.dll 	CResource<ID3D11Texture3D>::CLS::FinalConstruct(CContext*, D3D11DDIARG_CREATERESOURCE const*, SD3D11SharedResourceCreationArgs*, SD3D11CrossLayerData*, D3D10DDI_HRTRESOURCE) 	
17 	d3d11.dll 	C10and11Resource<ID3D11Texture2D, ID3D10Texture2D>::C10and11Resource<ID3D11Texture2D, ID3D10Texture2D>(SD3D11LayeredDeviceChildCreationArgs const&, D3D11_USAGE, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, DXGI_FORMAT, DXGI_SAMPLE_DESC, unsigned int, unsigned int, unsigned int, unsigned int, unsigned int, SCLSInfo<CContext> const*, SD3D11SharedResourceCreationArgs const*, _GUID const&, D3D11_RESOURCE_DIMENSION) 	
18 	d3d11.dll 	TCLSWrappers<CTexture2D>::CLSFinalConstructFn(CTexture2D::CLS*, CContext*, CTexture2D::TConstructorArgs const*) 	
19 	d3d11.dll 	CLayeredObjectWithCLS<CTexture2D>::CreateInstance(CTexture2D::TConstructorArgs&, void*, void*, _GUID const&, void**, CLayeredObjectWithCLS<CTexture2D>::SInfo const*) 	
20 	d3d11.dll 	ATL::CComPtrBase<ID3D11Resource>::IsEqualObject(IUnknown*) 	
21 	d3d11.dll 	CResource<ID3D11Resource>::InvalidAlignment(D3D11_BOX const*) 	
22 	d3d11.dll 	CSafeDDIResource::CSafeDDIResource(SD3D11LayeredTexture2DCreationArgs const&) 	
23 	d3d11.dll 	CDevice::CreateLayeredChild(unsigned int, void const*, unsigned __int64, ID3D11LayeredUseCounted*, _GUID const&, void**) 	
=============================================================
More reports: https://crash-stats.mozilla.com/report/list?product=Firefox&signature=igd10umd64.dll%400x1e669
[Tracking Requested - why for this release]: New crash in Nightly starting on October 29, 2015.

Comments suggest this started with the update to Firefox 45.0a1; however, reports indicate it started with the 2015-10-28 build.

Platforms:
100% on Windows (88% Win 7)

GPU: 
100% with an Intel 4 Series Chipset Integrated Graphics Controller

Drivers:
Rank 	Driver version 	Reports 	Share
1 	8.15.10.1855 	130 	27.14 %
2 	8.15.10.1892 	129 	26.93 %
3 	8.15.10.1994 	114 	23.80 %
4 	8.15.10.1883 	62 	12.94 %
5 	8.15.10.1872 	39 	8.14 %
6 	8.15.10.1851 	5 	1.04 %
One other note: there is a high number of DUPE reports, so I suspect this might be a small number of users crashing a lot.
Keywords: topcrash-win
Summary: crash in igd10umd64.dll@0x1e669 → startup crash in igd10umd64.dll@0x1e669
This is now the #2 topcrash in Nightly and is also a startup crash.
This started in the October 28 Nightly build; here is the pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=9a8f2342fb3116d23989087e026448d38a3768c5&tochange=fc706d376f0658e560a59c3dd520437b18e8c4a4

A couple of anecdotes:
* The GPUs represent 9.83% of the sessions in our Telemetry data
* Firefox 44.0a1 reports 5.72 crashes per install
* Firefox 45.0a1 reports 3.46 crashes per install
Possibly bug 1097321?  The timing is right.
Assignee: nobody → bgirard
It looks like these are all from 64-bit Firefox. Is there a corresponding 32-bit crash?
Flags: needinfo?(anthony.s.hughes)
KaiRo, we're tracking this based on your explosiveness report and it being a startup crash.

This landed on central on the last day of nightly, just before we found out that was the last day of nightly :), so it's on dev edition at this point.  We should definitely track it.
We can reproduce this locally
Crash Signature: [@ igd10umd64.dll@0x1e669] → [@ igd10umd64.dll@0x1e669] [@ igd10umd32.dll@0x18f35]
Flags: needinfo?(anthony.s.hughes)
We weren't able to reproduce this until we downgraded the Intel driver on our test machine.
Summary: startup crash in igd10umd64.dll@0x1e669 → startup crash in igd10umd64.dll@0x1e669 when opening a device on older Intel driver
Blocks: 1097321
(In reply to Benoit Girard (:BenWa) from comment #9)
> We weren't able to reproduce this until we downgraded the Intel driver on our
> test machine.

Which driver version triggered it? Does that correlate to what I put in comment 1?
Jeff installed 8.15.10.1855 probably due to Comment 1, so yes, perfectly. Very useful :D
I was able to isolate the problem to just this change:
if (FAILED(device->CreateTexture2D(&desc, nullptr, byRef(texture))))
->
if (FAILED(device->CreateTexture2D(&desc, &data, byRef(texture)))) {
The crash only seems to occur if you create a texture with the D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX flag while also passing initial data to CreateTexture2D.
(In reply to Milan Sreckovic [:milan] from comment #7)
> KaiRo, we're tracking this based on your explosiveness report and it being a
> startup crash.

FWIW, the igd10umd32.dll@0x18f35 signature is present in significant volume on release, but not as a startup crash there.
Attached patch Part 1: Unify crash signatures (obsolete)
Attachment #8683842 - Flags: review?(jmuizelaar)
Comment on attachment 8683842
Part 1: Unify crash signatures

Review of attachment 8683842:
-----------------------------------------------------------------

MOZ_CRASH is probably the better choice here.
Attachment #8683842 - Flags: review?(jmuizelaar) → review+
Attached patch Part 1: Unify crash signatures (obsolete)
Attachment #8683842 - Attachment is obsolete: true
Attachment #8683853 - Flags: review+
Duplicate of this bug: 1098597
Attached patch Part 1: Unify crash signatures (obsolete)
Attachment #8683853 - Attachment is obsolete: true
Attachment #8683880 - Flags: review+
igd10umd64.dll@0x1e7a9 seems to be the signature for: AdapterDriverVersion: 8.15.10.1808
Crash Signature: [@ igd10umd64.dll@0x1e669] [@ igd10umd32.dll@0x18f35] → [@ igd10umd64.dll@0x1e669] [@ igd10umd32.dll@0x18f35] [@ igd10umd64.dll@0x1e7a9]
I looked through the crash reports and the affected versions are 8.15.10.[18xx-1994] (inclusive).
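For illustration only, the affected range could be written as a small predicate. This is a sketch under stated assumptions, not anything taken from the patches on this bug: `ParseDriverVersion` and `IsAffectedIntelDriver` are hypothetical names, and "18xx" is read here as a lower bound of 1800 for the last version component.

```cpp
#include <array>
#include <cstdio>
#include <string>

// Hypothetical helper: parses "a.b.c.d" into four integers.
// Returns false unless the string is exactly four dot-separated numbers.
bool ParseDriverVersion(const std::string& s, std::array<int, 4>& out) {
  char trailing = 0;
  // The trailing %c only matches if extra characters follow the fourth
  // number, so a well-formed version yields exactly 4 conversions.
  int n = std::sscanf(s.c_str(), "%d.%d.%d.%d%c",
                      &out[0], &out[1], &out[2], &out[3], &trailing);
  return n == 4;
}

// Hypothetical predicate for the range named in this bug:
// 8.15.10.[18xx-1994], inclusive. Treating "18xx" as ">= 1800" is an
// assumption made for this sketch.
bool IsAffectedIntelDriver(const std::string& version) {
  std::array<int, 4> v{};
  if (!ParseDriverVersion(version, v)) {
    return false;
  }
  return v[0] == 8 && v[1] == 15 && v[2] == 10 &&
         v[3] >= 1800 && v[3] <= 1994;
}
```

With this reading, `IsAffectedIntelDriver("8.15.10.1855")` and `IsAffectedIntelDriver("8.15.10.1808")` are true, while `IsAffectedIntelDriver("8.15.10.1995")` is false.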
Summary: startup crash in igd10umd64.dll@0x1e669 when opening a device on older Intel driver → startup crash in igd10umd64.dll@0x1e669 when opening a device on 8.5.10.[18xx-1994] Intel driver
I'm narrowing the scope of this bug. This is a general crash signature that shows up in several situations. I want to limit this bug to the specific STR/signature under consideration, which comes from the texture sharing detection code that we regressed in bug 1097321.

Since we believe this signature can also be caused by other things, like UpdateSubresource during normal runtime, those cases should be investigated in a different bug. I expect they will need different investigation and patches, even though the affected driver range will likely be the same.
Summary: startup crash in igd10umd64.dll@0x1e669 when opening a device on 8.5.10.[18xx-1994] Intel driver → startup crash in igd10umd64.dll@0x1e669 when detecting texture sharing on 8.5.10.[18xx-1994] Intel driver
Attachment #8684408 - Flags: review?(jmuizelaar)
Keywords: leave-open
Comment on attachment 8684408
Part 2: Fix texture sharing detection code on Intel.

Review of attachment 8684408:
-----------------------------------------------------------------

::: gfx/thebes/gfxWindowsPlatform.cpp
@@ +1903,5 @@
>    // We're going to check that sharing actually works with this format
>    D3D11_SUBRESOURCE_DATA data;
>    data.pSysMem = color;
>    data.SysMemPitch = texture_size * 4;
>    data.SysMemSlicePitch = 0;

You should drop this data block and add a comment about why we need to use UpdateSubresource instead of CreateTexture2D for init.
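A minimal sketch of the approach described in this review: create the keyed-mutex texture with no initial data, then upload the pixels with UpdateSubresource. This is not the code from the attached patch. `TestPattern`, `MakeTestPattern`, and `CreateSharedTestTexture` are hypothetical names, and the pixel format, bind flags, and 30-second mutex timeout are assumptions chosen for illustration. The Direct3D part only compiles on Windows, so it is guarded by `_WIN32`.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Portable part: a solid-color test pattern and its row pitch in bytes.
struct TestPattern {
  std::vector<uint32_t> pixels;
  unsigned rowPitch;  // bytes per row (4 bytes per pixel)
};

TestPattern MakeTestPattern(unsigned size, uint32_t color) {
  TestPattern p;
  p.pixels.assign(static_cast<std::size_t>(size) * size, color);
  p.rowPitch = size * 4;
  return p;
}

#ifdef _WIN32
#include <d3d11.h>
#include <wrl/client.h>

// Hypothetical helper: creates a size x size shared texture and fills it
// from `pattern` without passing initial data to CreateTexture2D.
HRESULT CreateSharedTestTexture(ID3D11Device* device,
                                const TestPattern& pattern,
                                unsigned size,
                                ID3D11Texture2D** outTexture) {
  using Microsoft::WRL::ComPtr;

  D3D11_TEXTURE2D_DESC desc = {};
  desc.Width = size;
  desc.Height = size;
  desc.MipLevels = 1;
  desc.ArraySize = 1;
  desc.Format = DXGI_FORMAT_B8G8R8A8_UNORM;  // assumption for this sketch
  desc.SampleDesc.Count = 1;
  desc.Usage = D3D11_USAGE_DEFAULT;
  desc.BindFlags = D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE;
  desc.MiscFlags = D3D11_RESOURCE_MISC_SHARED_KEYEDMUTEX;

  // Pass nullptr instead of initial data: per this bug, combining the
  // keyed-mutex flag with initial data crashes the affected Intel drivers.
  ComPtr<ID3D11Texture2D> texture;
  HRESULT hr = device->CreateTexture2D(&desc, nullptr, &texture);
  if (FAILED(hr)) return hr;

  // A keyed-mutex texture must be acquired before this device writes to it.
  ComPtr<IDXGIKeyedMutex> mutex;
  hr = texture.As(&mutex);
  if (FAILED(hr)) return hr;
  hr = mutex->AcquireSync(0, 30 * 1000);  // timeout is an assumption
  if (hr != S_OK) return FAILED(hr) ? hr : E_FAIL;

  // Upload the pixels after creation instead of at creation time.
  ComPtr<ID3D11DeviceContext> context;
  device->GetImmediateContext(&context);
  context->UpdateSubresource(texture.Get(), 0, nullptr,
                             pattern.pixels.data(), pattern.rowPitch, 0);

  hr = mutex->ReleaseSync(0);
  if (FAILED(hr)) return hr;

  *outTexture = texture.Detach();
  return S_OK;
}
#endif  // _WIN32
```

A caller would build the pattern once, for example `MakeTestPattern(32, 0xff00ffff)`, and pass it to `CreateSharedTestTexture` together with an existing device.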
Attachment #8684408 - Flags: review?(jmuizelaar) → review+
Attachment #8684408 - Attachment is obsolete: true
Attachment #8684423 - Flags: review+
Attachment #8684423 - Attachment is obsolete: true
Attachment #8684458 - Attachment is obsolete: true
Attachment #8684461 - Flags: review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/4f9d6851b04a4b30c8b13197a951c1e6d96567ef
Bug 1221348 - Part 2: Fix texture sharing detection code on Intel. r=jrmuizel
Tracked for 44. At the moment this is #19 (moved 10 spots down), but still in the top 25.
Benoit, should we consider uplifting the fix to FF44 given that it was tagged as a top crash?
Flags: needinfo?(bgirard)
Absolutely. Otherwise we need to back out the offending patch from aurora.
Flags: needinfo?(bgirard)
Comment on attachment 8683880
Part 1: Unify crash signatures

Approval Request Comment
[Feature/regressing bug #]: bug 1097321
[User impact if declined]: Bricked build for users with an affected Intel driver.
[Describe test coverage new/current, TreeHerder]: Some testing, but we don't have any test machines with the required driver.
[Risks and why]: Some. This is a change with potential driver stability impact, but it has been on nightly and is aimed at fixing a regression.
[String/UUID change made/needed]: none
Attachment #8683880 - Flags: approval-mozilla-aurora?
Comment on attachment 8684461
Part 2: Fix texture sharing detection code on Intel.

Approval Request Comment
[Feature/regressing bug #]: bug 1097321
[User impact if declined]: Bricked build for users with an affected Intel driver.
[Describe test coverage new/current, TreeHerder]: Some testing, but we don't have any test machines with the required driver.
[Risks and why]: Some. This is a change with potential driver stability impact, but it has been on nightly and is aimed at fixing a regression.
[String/UUID change made/needed]: none
Attachment #8684461 - Flags: approval-mozilla-aurora?
Benoit, your comment about backing out the offending patch makes me wonder: why don't we back out the offending patch from Aurora instead of uplifting these two new patches? To me the former seems more logical than the latter. Do you feel the same way?
Flags: needinfo?(bgirard)
Well, if we care only about risk management, then doing the backout is the best solution.
However, if we uplift the fix then we're shipping an improvement for people with AMD Switchable graphics hardware, which helps us in benchmarks. Given that we're early in the aurora cycle, with a full beta cycle still to come, and that we already have a fix for this bug, I think it's a calculated risk to go ahead. While I noted the possibility of more bugs, it's not really likely, and we still have reasonable time to catch an issue if it comes up. Should something else come up later, we would use the backout solution.

Agreed?
Flags: needinfo?(bgirard) → needinfo?(rkothari)
Benoit, while gfx issues often cause hard-to-find-and-debug crashes or startup crashes (which are absolutely worse), I tend to agree with your reasoning that we are early in the Aurora44 cycle and crash-stats data on Nightly45 is definitely not showing these signatures since the two patches landed. Let's take it in Aurora and hopefully we won't have to worry about these patches again (!). Thanks for the discussion, it is most helpful.
Flags: needinfo?(rkothari)
Attachment #8683880 - Attachment is obsolete: true
Attachment #8683880 - Flags: approval-mozilla-aurora?
Attachment #8687350 - Flags: review+
Attachment #8687350 - Flags: approval-mozilla-aurora?
Comment on attachment 8687350
Part 1: Unify crash signatures

The Nightly45 crash data for this signature shows that it is gone since we landed these patches. Let's uplift to Aurora44 with the hope that the startup crash goes away on that channel too with these two patches.
Attachment #8687350 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Comment on attachment 8684461
Part 2: Fix texture sharing detection code on Intel.

Aurora44+
Attachment #8684461 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Wes, tomcat: This is a critical patch that will improve DevEd44 stability. Could you please help uplift asap? Thanks!
Flags: needinfo?(wkocher)
Flags: needinfo?(cbook)
Flags: needinfo?(wkocher)
Flags: needinfo?(cbook)
These signatures are completely gone now in builds where the patch has landed. We still see crashes in 44 and 45 but they're with older builds.
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED