Closed Bug 1429608 Opened 2 years ago Closed 2 years ago

AMD Content crashes in d2d1.dll in Windows insider builds

Categories

(Core :: Graphics, defect, P3, critical)

57 Branch
x86_64
Windows 10
defect

Tracking


RESOLVED WORKSFORME
Tracking Status
firefox-esr52 --- unaffected
firefox58 --- wontfix
firefox59 + fixed
firefox60 + fixed
firefox61 + fixed

People

(Reporter: philipp, Unassigned)

References

Details

(Keywords: crash, regression, Whiteboard: [gfx-noted])

Crash Data

This bug was filed from the Socorro interface and is
report bp-ace9e323-cd16-4c4b-9db6-c1df10180105.
=============================================================

Top 10 frames of crashing thread:

0 d2d1.dll d2d1.dll@0xb4935 
1 d2d1.dll d2d1.dll@0x1145d7 
2 d2d1.dll d2d1.dll@0x1be398 
3 ntdll.dll ntdll.dll@0x30e4d 
4 d2d1.dll d2d1.dll@0x9c405 
5 ntdll.dll ntdll.dll@0x30e4d 
6 d2d1.dll d2d1.dll@0xdf5e0 
7 d2d1.dll d2d1.dll@0x2f7287 
8 d2d1.dll d2d1.dll@0xa4d92 
9 d2d1.dll d2d1.dll@0x9358a 

=============================================================

there are a number of content crash signatures in d2d1.dll spiking among users with an AMD gpu on firefox 64bit builds. they are present in release as well but are growing in volume in 58 beta and 59 nightly.
if i'm reading the platform information correctly, they are all coming from windows 10 insider builds of the version that's targeted for release in spring 2018. judging by the user comments, the crashes seem to be triggered repeatedly on particular sites.

Adapter device id facet
Rank 	Device id 	Count 	%
1 	0x67df 	524 	20.74 %
2 	0x67b0 	182 	7.21 %
3 	0x67b1 	182 	7.21 %
4 	0x6798 	130 	5.15 %
5 	0x6939 	106 	4.20 %
6 	0x67ef 	96 	3.80 %
7 	0x6819 	93 	3.68 %
8 	0x6810 	87 	3.44 %
9 	0x679a 	86 	3.40 %
10 	0x687f 	76 	3.01 %

Adapter driver version facet
Rank 	Driver version 	Count 	%
1 	23.20.15007.1005 	1070 	42.36 %
2 	23.20.780.2 	635 	25.14 %
3 	23.20.15002.11 	377 	14.92 %
4 	23.20.15007.2002 	59 	2.34 %
5 	23.20.782.259 	40 	1.58 %
6 	23.20.788.0 	39 	1.54 %
7 	23.20.793.1024 	35 	1.39 %
8 	23.20.793.0 	34 	1.35 %
9 	21.19.512.0 	30 	1.19 %
10 	16.300.2701.0 	21 	0.83 %
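
As a rough sanity check on the driver version facet above, each row's count and percentage imply a total number of matching reports. A minimal sketch using the top three rows; the percentages in the facet are rounded, so the totals are estimates only, and the variable names are illustrative:

```python
# Top three rows of the driver version facet: (version, count, percent).
rows = [
    ("23.20.15007.1005", 1070, 42.36),
    ("23.20.780.2", 635, 25.14),
    ("23.20.15002.11", 377, 14.92),
]

# Each row implies a total: count / (percent / 100).
totals = [count / (pct / 100) for _, count, pct in rows]

# Combined share of the three most common driver versions.
top3_share = sum(pct for _, _, pct in rows)

for (version, _, _), total in zip(rows, totals):
    print(f"{version}: implied total of about {total:.0f} reports")
print(f"top three driver versions: {top3_share:.2f} % of reports")
```

The three implied totals agree closely with each other, which suggests the facet percentages are all computed against the same result set.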

crash stats search query: https://crash-stats.mozilla.com/search/?submitted_from_infobar=%21__true__&platform_pretty_version=%3DWindows%2010&adapter_vendor_id=0x1002&signature=%5Ed2d1.dll&product=Firefox&process_type=content&date=%3E%3D2017-12-24&_sort=-date&_facets=signature&_facets=user_comments&_facets=version&_facets=build_id&_facets=install_time&_facets=release_channel&_facets=graphics_critical_error&_facets=adapter_device_id&_facets=adapter_driver_version&_facets=app_notes&_facets=platform_version&_facets=cpu_arch&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-adapter_device_id
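
For readers unfamiliar with the query string, the filter portion of the search link above can be decoded with a short script. This is only a sketch: the URL is shortened to the filter parameters (the `_facets`, `_columns` and `_sort` parameters are omitted), and the operator meanings noted in the comments are my reading of the crash-stats search syntax, not something stated in this bug.

```python
from urllib.parse import urlsplit, parse_qs

# Filter parameters copied from the search link above; the _facets,
# _columns and _sort parameters are left out to keep the example short.
url = (
    "https://crash-stats.mozilla.com/search/"
    "?submitted_from_infobar=%21__true__"
    "&platform_pretty_version=%3DWindows%2010"
    "&adapter_vendor_id=0x1002"
    "&signature=%5Ed2d1.dll"
    "&product=Firefox"
    "&process_type=content"
    "&date=%3E%3D2017-12-24"
)

# parse_qs percent-decodes each value and returns a dict of lists.
filters = parse_qs(urlsplit(url).query)

# Assumed operator meanings: "^" = starts with, "=" = exact match,
# "!" = negation, ">=" = on or after.
# 0x1002 is the PCI vendor id for AMD.
for name, values in filters.items():
    print(f"{name}: {values[0]}")
```

Read this way, the query selects Firefox content-process crashes on Windows 10 with an AMD adapter, a signature beginning with `d2d1.dll`, and a report date of 2017-12-24 or later.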
A while back I filed Bug 1419052 for 57; I think it is the same issue as this.
yes, the correlations are certainly similar (content crashes on win10 insider builds & amd gpus).
Crash Signature: [@ d2d1.dll@0xb38c6] [@ d2d1.dll@0xb4935] [@ d2d1.dll@0x46500] [@ d2d1.dll@0x4a15e] [@ d2d1.dll@0xb4ef6] [@ d2d1.dll@0x201aea] [@ d2d1.dll@0x322bb5] [@ d2d1.dll@0xf7094] → [@ d2d1.dll@0xb38c6] [@ d2d1.dll@0xb4935] [@ d2d1.dll@0x46500] [@ d2d1.dll@0x4a15e] [@ d2d1.dll@0xb4ef6] [@ d2d1.dll@0x201aea] [@ d2d1.dll@0x322bb5] [@ d2d1.dll@0xf7094] [@ CSysToHwTransferBuffer::NotifyWrittenBytes ]
Duplicate of this bug: 1419052
Crash Signature: [@ d2d1.dll@0xb38c6] [@ d2d1.dll@0xb4935] [@ d2d1.dll@0x46500] [@ d2d1.dll@0x4a15e] [@ d2d1.dll@0xb4ef6] [@ d2d1.dll@0x201aea] [@ d2d1.dll@0x322bb5] [@ d2d1.dll@0xf7094] [@ CSysToHwTransferBuffer::NotifyWrittenBytes ] → [@ d2d1.dll@0xb38c6] [@ d2d1.dll@0xb4935] [@ d2d1.dll@0x46500] [@ d2d1.dll@0x4a15e] [@ d2d1.dll@0xb4ef6] [@ d2d1.dll@0x201aea] [@ d2d1.dll@0x322bb5] [@ d2d1.dll@0xf7094] [@ d2d1.dll@0xb3996] [@ d2d1.dll@0xb4705] [@ d2d1.dll@0x46510] [@ CSysToHw…
The traces I looked at are all mostly trashed and/or very deep, and we are missing the symbols (if you want to inspect the minidumps and lack access, I can provide them). The volume is fairly low, but the aggregate of users with those AMD driver versions is only ~0.5%; presumably that is because relatively few people are in the Windows 10 Insider program. This could get a lot worse in a few months if released as is.
Flags: needinfo?(bas)
Priority: -- → P3
Whiteboard: [gfx-noted]
(In reply to Andrew Osmond [:aosmond] from comment #4)
> The traces I looked at are all mostly trashed and/or very deep, and we are
> missing the symbols (if you want to inspect the minidumps and lack access, I
> can provide them). The volume is fairly low, but the aggregate of users with
> those AMD driver versions is only ~0.5%; presumably that is because
> relatively few people are in the Windows 10 Insider program. This could get
> a lot worse in a few months if released as is.

I think reporting this to AMD is the right course of action. This doesn't look like it has anything to do with 'our' code.
Flags: needinfo?(bas)
MSFT engineer here with an AMD rig hitting this issue. If you have more info or repro steps, I can pull some more data, walk it across the hall, and see what I can do from our end too.
(ni? Bas in case he has ideas re: Comment #6)
Flags: needinfo?(bas)
No one's commented on any of these bug reports. A very similar crash happens here: https://crash-stats.mozilla.com/report/index/d949dd24-4194-42ce-843e-cb1d60180120
This one is from a slightly older version of the D2D1 DLL, so we have stacks for it (it appears the 170067 symbols haven't been downloaded yet). It seems to be related to vertex buffer uploads, so maybe some complex SVGs can do the trick? That's the main thing I can think of where we rely very heavily on D2D's tessellation code, although in general we use it more than Edge does, as we use D2D less efficiently because of our additional levels of abstraction.
Flags: needinfo?(bas) → needinfo?(jonathan.gardner04)
Duplicate of this bug: 1432063
Duplicate of this bug: 1425741
Crash Signature: CSysToHwTransferBuffer::NotifyWrittenBytes ] → CSysToHwTransferBuffer::NotifyWrittenBytes ] [@ CHwVertexBufferWriter::AddQuads ] [@ Fill6x5FilteredText<T> ] [@ CHwVertexBufferWriter::AddTriangles ]
(we seem to have symbols for the more recent win10 insider builds, so i'm removing a bunch of generic d2d1.dll signatures here)

the next windows 10 version based on these insider builds is reported to be generally released to the public "sometime in March 2018", which would somewhat coincide with our 59 release.
therefore this issue probably needs to be bumped up in priority as the content crash situation of amd users on win10 insider builds looks really dire - these graphics-related crashes amount to ~85% of all the content crashes reported from this user group: http://bit.ly/2DYhRQC
Crash Signature: [@ d2d1.dll@0xb38c6] [@ d2d1.dll@0xb4935] [@ d2d1.dll@0x46500] [@ d2d1.dll@0x4a15e] [@ d2d1.dll@0xb4ef6] [@ d2d1.dll@0x201aea] [@ d2d1.dll@0x322bb5] [@ d2d1.dll@0xf7094] [@ d2d1.dll@0xb3996] [@ d2d1.dll@0xb4705] [@ d2d1.dll@0x46510] [@ CSysToHw… → [@ CSysToHwTransferBuffer::NotifyWrittenBytes ] [@ CHwVertexBufferWriter::AddQuads ] [@ Fill6x5FilteredText<T> ] [@ CHwVertexBufferWriter::AddTriangles ] [@ memcpy | CSysToHwTransferBuffer::Flush] [@ memcpy | GlyphRunFiller::FillSubrect] [@ memcpy |…
Flags: needinfo?(milan)
Flags: needinfo?(milan) → needinfo?(bas)
Crash Signature: [@ CSysToHwTransferBuffer::NotifyWrittenBytes ] [@ CHwVertexBufferWriter::AddQuads ] [@ Fill6x5FilteredText<T> ] [@ CHwVertexBufferWriter::AddTriangles ] [@ memcpy | CSysToHwTransferBuffer::Flush] [@ memcpy | GlyphRunFiller::FillSubrect] [@ memcpy |… → [@ CSysToHwTransferBuffer::NotifyWrittenBytes ] [@ CHwVertexBufferWriter::AddQuads ] [@ Fill6x5FilteredText<T> ] [@ CHwVertexBufferWriter::AddTriangles ] [@ CHwVertexBufferWriter::AddTriangleFan] [@ memcpy | CSysToHwTransferBuffer::Flush] [@ memcpy …
Any news on this from MSFT?
(In reply to Bas Schouten (:bas.schouten) from comment #12)
> Any news on this from MSFT?

I am not sure what has changed, but I am running 16299.192 and FF Nightly and am no longer seeing issues with hardware acceleration.
Flags: needinfo?(jonathan.gardner04)
OK, let's keep an eye on this over the next week. Maybe we'll see the crash rate decline for this group of people running the insider build, as they update to 16299.192.
the crash pattern seems to have stopped in newer insider builds with platform version 10.0.17083 & 10.0.17093 when looking at crash-stats data.
Duplicate of this bug: 1433028
Duplicate of this bug: 1433541
Good to hear. It looks low volume on 59, anyway, and it is too late for a 59 fix.
Can we chalk this up to some buggy Insider builds and go ahead and close this now?
Flags: needinfo?(madperson)
yes, we can probably call this fixed (by third-party). in retrospect it was an issue confined to the two insider builds https://changewindows.org/build/17063 & https://changewindows.org/build/17074
Status: NEW → RESOLVED
Closed: 2 years ago
Flags: needinfo?(madperson)
Flags: needinfo?(bas)
Resolution: --- → WORKSFORME