Closed
Bug 1207665
Opened 9 years ago
Closed 9 years ago
startup crash in mozilla::layers::CompositorD3D11::GetTextureFactoryIdentifier()
Categories
(Core :: Graphics, defect)
Tracking
()
People
(Reporter: philipp, Assigned: dvander)
References
Details
(Keywords: crash)
Crash Data
Attachments
(5 files, 1 obsolete file)
3.28 KB,
patch
|
jrmuizel
:
review+
|
Details | Diff | Splinter Review |
1.73 KB,
patch
|
jrmuizel
:
review+
dvander
:
checkin+
|
Details | Diff | Splinter Review |
5.74 KB,
patch
|
bas.schouten
:
review+
|
Details | Diff | Splinter Review |
3.30 KB,
patch
|
dvander
:
review+
Sylvestre
:
approval-mozilla-aurora+
Sylvestre
:
approval-mozilla-beta+
Sylvestre
:
approval-mozilla-release+
cbook
:
checkin+
|
Details | Diff | Splinter Review |
2.20 KB,
patch
|
vladan
:
review+
bas.schouten
:
review+
|
Details | Diff | Splinter Review |
[Tracking Requested - why for this release]:
This bug was filed from the Socorro interface and is
report bp-10f6c07f-7aa4-4983-953c-4176c2150922.
=============================================================
0 xul.dll mozilla::layers::CompositorD3D11::GetTextureFactoryIdentifier() gfx/layers/d3d11/CompositorD3D11.cpp
1 xul.dll mozilla::layers::CompositorParent::AllocPLayerTransactionParent(nsTArray<mozilla::layers::LayersBackend> const&, unsigned __int64 const&, mozilla::layers::TextureFactoryIdentifier*, bool*) gfx/layers/ipc/CompositorParent.cpp
2 xul.dll mozilla::layers::PCompositorParent::OnMessageReceived(IPC::Message const&, IPC::Message*&) obj-firefox/ipc/ipdl/PCompositorParent.cpp
3 xul.dll mozilla::ipc::MessageChannel::DispatchSyncMessage(IPC::Message const&, IPC::Message*&) ipc/glue/MessageChannel.cpp
4 xul.dll mozilla::ipc::MessageChannel::DispatchMessageW(IPC::Message const&) ipc/glue/MessageChannel.cpp
5 xul.dll mozilla::ipc::MessageChannel::OnMaybeDequeueOne() ipc/glue/MessageChannel.cpp
this crash seems to spike after the 41 release, at the moment it is the #4 crasher in 41.0 and on top of the crash score board due to crashing on startup - many user comments also reflect that this happened after a firefox update.
the gpu adapter mainly affected in early data seems to be the "Intel Media Accelerator 3150":
1 0xa001 164 98.80 %
2 0x0106 1 0.60 %
3 0x0116 1 0.60 %
Adapter driver version facet
1 8.14.10.2230 104 62.65 %
2 8.14.10.2117 59 35.54 %
3 8.15.10.2253 1 0.60 %
4 8.15.10.2567 1 0.60 %
5 9.17.10.2867 1 0.60 %
> 1 0xa001 164 98.80 %
I bet this fell through the cracks of bug 1137716?
Flags: needinfo?(jmuizelaar)
(Just a note in case this comes up in Fx41 post-mortem discussions)
41 Nightly had 14 crashes
41 Aurora had 36 crashes
41 Beta had 231 crashes
41 Release has 195 crashes
At these volumes we would not have seen it in pre-release topcrash reports. This is likely an instance of where our pre-release audience is not indicative of our release audience.
That 3150 seems to be a nettop Atom device. I can see something like that not showing up on nightly.
For some reason this adapter did not crash with this signature in versions before 41. Are we doing something differently now that would send it down this codepath?
David (Anderson), would you expect to have this caught in the startup testing? Can we check telemetry if these devices showed up before, or just started showing up now?
David (Major) - let me dig through some late fixed bugs, see if something we recently fixed could have introduced this.
Flags: needinfo?(dvander)
Assignee | ||
Comment 6•9 years ago
|
||
(In reply to Milan Sreckovic [:milan] from comment #5)
> David (Anderson), would you expect to have this caught in the startup
> testing? Can we check telemetry if these devices showed up before, or just
> started showing up now?
No, there is no startup crash guard here. In fact it's a self-induced crash[1]. Instead of crashing we should probably just make the compositor fail, if possible, and fallback to software.
What do you mean by "if these devices showed up before"? Notably, using the hardware search on the dashboard, device 0x0106 is 2.53% of the sample population and device 0x0116 is 4.51%. Device 0xa001 with 98% of the crashes is 0.21% of the population. So that seems like a good candidate for blocking.
[1] http://hg.mozilla.org/releases/mozilla-release/annotate/78c82e5cd777/gfx/layers/d3d11/CompositorD3D11.cpp#l422
Flags: needinfo?(dvander)
Assignee | ||
Comment 7•9 years ago
|
||
The metadata for the report says: "Failed to get SharedHandle for sync texture. Result: 0x0" and has a bunch of other random d3d11 error messages.
David, since you already got emotionally involved, how about a patch? :)
Assignee: nobody → dvander
Tracked for 41 as this is a top startup crash.
Comment 10•9 years ago
|
||
42 is also affected.
status-firefox44:
--- → ?
tracking-firefox42:
--- → ?
tracking-firefox43:
--- → +
tracking-firefox44:
--- → +
Assignee | ||
Comment 11•9 years ago
|
||
This should be backported as far as reasonable. But if someone knows how to update the downloadable blocklist, that would work for Firefox 41 and older.
A downloadable blocklist update will not work for Firefox 42+ because of a bug that I'll post a patch for in a moment.
Attachment #8665630 -
Flags: review?(jmuizelaar)
Assignee | ||
Comment 12•9 years ago
|
||
Whoops. This one is my fault due to the gfxWindowsPlatform refactoring, and I'm glad we caught it before Firefox 42 went to release. D2D 1.1 is disabled for WARP, but D2D 1.0 was not. This patch just moves the check further up.
Attachment #8665634 -
Flags: review?(jmuizelaar)
Updated•9 years ago
|
Attachment #8665634 -
Flags: review?(jmuizelaar) → review+
Comment 13•9 years ago
|
||
Comment on attachment 8665630 [details] [diff] [review]
part 1, block device for d3d/d2d
Review of attachment 8665630 [details] [diff] [review]:
-----------------------------------------------------------------
Any reason to not block the entire family?
"a001",
"a002",
"a011",
"a012"
Assignee | ||
Comment 14•9 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #13)
> Comment on attachment 8665630 [details] [diff] [review]
> part 1, block device for d3d/d2d
>
> Review of attachment 8665630 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> Any reason to not block the entire family?
> "a001",
> "a002",
> "a011",
> "a012"
We aren't seeing any crashes on the other devices. 0xa011 appears ~0.85% of the time in Telemetry which is kinda high for one device. Should I stick the driver versions in there as well?
Flags: needinfo?(jmuizelaar)
Assignee | ||
Comment 15•9 years ago
|
||
This moves the GetSharedHandle call to CompositorD3D11::Initialize, instead of GetTextureFactoryIdentifier. It does not remove the MOZ_CRASH though.
Attachment #8665769 -
Flags: review?(bas)
Assignee | ||
Comment 16•9 years ago
|
||
I'd like to turn the MOZ_CRASH here into a return false (with a telemetry ping), but it looks like Compositor::Initialize returning false has some weird side effects. Lots of gfx error log spam and it looks like we continue to use D2D1. Investigating how safe this is first.
Assignee | ||
Comment 17•9 years ago
|
||
Filed bug 1208638 to make failing gracefully more reliable.
Attachment #8666203 -
Flags: review?(bas)
Assignee | ||
Comment 18•9 years ago
|
||
Comment on attachment 8666203 [details] [diff] [review]
part 4, fail gracefully
Vladan, this histogram is really a proxy for having dropped the MOZ_CRASH. We think this value being true is severe enough that we want to monitor it on all channels. Is a boolean histogram the right thing to do?
Attachment #8666203 -
Flags: review?(vladan.bugzilla)
Comment 19•9 years ago
|
||
Comment on attachment 8665630 [details] [diff] [review]
part 1, block device for d3d/d2d
Review of attachment 8665630 [details] [diff] [review]:
-----------------------------------------------------------------
Yeah, please block the entire family.
Attachment #8665630 -
Flags: review?(jmuizelaar) → review+
David, is part 1 the patch that will fix this startup crash for 41? As you may know, updates are disabled for FF41 due to this bug. I am hoping to build a 41 dot release (like 41.0.1) in the next 12 hrs or so and would appreciate it if you nominated the right patch for uplift to mozilla-release branch.
Flags: needinfo?(dvander)
Assignee | ||
Comment 21•9 years ago
|
||
(In reply to Ritu Kothari (:ritu) from comment #20)
> David, is part 1 the patch that will fix this startup crash for 41? As you
> may know, updates are disabled for FF41 due to this bug. I am hoping to
> build a 41 dot release (like 41.0.1) in the next 12 hrs or so and would
> appreciate it if you nominated the right patch for uplift to mozilla-release
> branch.
No, I didn't know that.
Flags: needinfo?(dvander)
Assignee | ||
Comment 22•9 years ago
|
||
w/ Jeff's comment, carrying r+ forward.
Approval Request Comment
[Feature/regressing bug #]: None
[User impact if declined]: Startup crashes on some low-end Intel video cards
[Describe test coverage new/current, TreeHerder]:
[Risks and why]: This is just a driver blocklist entry, so no risk. The video cards in this blocklist are about 1% of the population.
[String/UUID change made/needed]:
Attachment #8666610 -
Flags: review+
Attachment #8666610 -
Flags: approval-mozilla-release?
Attachment #8666610 -
Flags: approval-mozilla-beta?
Attachment #8666610 -
Flags: approval-mozilla-aurora?
Comment 23•9 years ago
|
||
Comment on attachment 8666610 [details] [diff] [review]
minimal fix for aurora/beta/release
Taking it for all branches.
Ritu, I am approving it for you because I know you want it for 41.0.1. Having it sooner will help with treeherder and potential rebasing issues.
Attachment #8666610 -
Flags: approval-mozilla-release?
Attachment #8666610 -
Flags: approval-mozilla-release+
Attachment #8666610 -
Flags: approval-mozilla-beta?
Attachment #8666610 -
Flags: approval-mozilla-beta+
Attachment #8666610 -
Flags: approval-mozilla-aurora?
Attachment #8666610 -
Flags: approval-mozilla-aurora+
Assignee | ||
Comment 24•9 years ago
|
||
Comment on attachment 8666610 [details] [diff] [review]
minimal fix for aurora/beta/release
This should land on nightly too but inbound is closed right now.
Attachment #8666610 -
Flags: checkin?
Comment 25•9 years ago
|
||
Comment 26•9 years ago
|
||
Updated•9 years ago
|
Attachment #8666610 -
Flags: checkin? → checkin+
Assignee | ||
Updated•9 years ago
|
Keywords: leave-open
Comment 27•9 years ago
|
||
Comment on attachment 8666203 [details] [diff] [review]
part 4, fail gracefully
Review of attachment 8666203 [details] [diff] [review]:
-----------------------------------------------------------------
::: gfx/layers/d3d11/CompositorD3D11.cpp
@@ +1366,5 @@
> if (FAILED(hr) || !mSyncHandle) {
> gfxCriticalError() << "Failed to get SharedHandle for sync texture. Result: "
> << hexa(hr);
> + NS_DispatchToMainThread(NS_NewRunnableFunction([] () -> void {
> + Accumulate(Telemetry::D3D11_SYNC_HANDLE_FAILURE, true);
you never accumulate a false value?
::: toolkit/components/telemetry/Histograms.json
@@ +9658,5 @@
> "description": "Attempt to notify ServiceWorker of push notification resubscription."
> + },
> + "D3D11_SYNC_HANDLE_FAILURE": {
> + "alert_emails": ["bschouten@mozilla.com"],
> + "expires_in_version": "never",
I don't like opt-out histograms that never expire, can you set it say for version 50 and then we can revisit its expiration then?
Attachment #8666203 -
Flags: review?(vladan.bugzilla) → review-
Comment 28•9 years ago
|
||
Clarification: if you don't need to accumulate false values, then you should use a count histogram
Comment 29•9 years ago
|
||
Assignee | ||
Comment 30•9 years ago
|
||
Attachment #8666203 -
Attachment is obsolete: true
Attachment #8666203 -
Flags: review?(bas)
Attachment #8666960 -
Flags: review?(vladan.bugzilla)
Assignee | ||
Updated•9 years ago
|
Attachment #8666960 -
Flags: review?(bas)
Comment 31•9 years ago
|
||
Comment on attachment 8666960 [details] [diff] [review]
part 4, fail gracefully
Review of attachment 8666960 [details] [diff] [review]:
-----------------------------------------------------------------
::: toolkit/components/telemetry/Histograms.json
@@ +9763,5 @@
> + "alert_emails": ["bschouten@mozilla.com","danderson@mozilla.com","msreckovic@mozilla.com","ashughes@mozilla.com"],
> + "expires_in_version": "50",
> + "releaseChannelCollection": "opt-out",
> + "kind": "count",
> + "description": "True if the D3D11 compositor failed to get a texture sync handle."
Update the description (it's not a boolean histogram anymore)
Attachment #8666960 -
Flags: review?(vladan.bugzilla) → review+
Updated•9 years ago
|
Attachment #8665769 -
Flags: review?(bas) → review+
Updated•9 years ago
|
Attachment #8666960 -
Flags: review?(bas) → review+
Comment 33•9 years ago
|
||
Comment on attachment 8665630 [details] [diff] [review]
part 1, block device for d3d/d2d
Review of attachment 8665630 [details] [diff] [review]:
-----------------------------------------------------------------
Fwiw, theoretically there's no reason to block D3D11 here as far as I can see. Shared textures are only required for D2D and WebGL, so we should try unblocking D3D11 layers on inbound.
Comment 34•9 years ago
|
||
Assignee | ||
Comment 35•9 years ago
|
||
Comment on attachment 8665634 [details] [diff] [review]
part 2, don't use d2d 1.0 on WARP
Approval Request Comment
[Feature/regressing bug #]: None
[User impact if declined]: Graphics corruption/instability due to using an unsupported combination of acceleration features: D3D10 content with software compositing.
[Describe test coverage new/current, TreeHerder]: Nightly
[Risks and why]: None, we're just aborting acceleration earlier.
[String/UUID change made/needed]:
This patch should be taken for aurora/beta as well to eliminate another source of problems there.
(The rest of this bug is more speculative though so the other patches will be nightly only.)
Attachment #8665634 -
Flags: approval-mozilla-beta?
Attachment #8665634 -
Flags: approval-mozilla-aurora?
Comment 36•9 years ago
|
||
Comment 37•9 years ago
|
||
Comment 38•9 years ago
|
||
Comment on attachment 8665634 [details] [diff] [review]
part 2, don't use d2d 1.0 on WARP
OK, let's take in 42. Should be in beta 3.
Attachment #8665634 -
Flags: approval-mozilla-beta?
Attachment #8665634 -
Flags: approval-mozilla-beta+
Attachment #8665634 -
Flags: approval-mozilla-aurora?
Attachment #8665634 -
Flags: approval-mozilla-aurora+
Comment 39•9 years ago
|
||
Added to the release notes:
"Fix a startup crash with some Intel Media Accelerator 3150 graphic cards (1207665)"
relnote-firefox:
--- → 41+
Updated•9 years ago
|
This patch failed to apply on aurora. Unsure what that means.
Flags: needinfo?(dvander)
Assignee | ||
Comment 42•9 years ago
|
||
Comment on attachment 8665634 [details] [diff] [review]
part 2, don't use d2d 1.0 on WARP
Whoops, it's already on Aurora, sorry about that.
Flags: needinfo?(dvander)
Attachment #8665634 -
Flags: approval-mozilla-beta+ → checkin+
Assignee | ||
Comment 43•9 years ago
|
||
Comment on attachment 8665634 [details] [diff] [review]
part 2, don't use d2d 1.0 on WARP
(Err.. removed the wrong flag.)
Attachment #8665634 -
Flags: approval-mozilla-aurora+
Assignee | ||
Updated•9 years ago
|
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Comment 44•9 years ago
|
||
Wes, should this be marked fixed for all the branches where it landed?
Flags: needinfo?(wkocher)
I think the "leave-open" made our tools skip over marking the flags. Sorry about that.
Flags: needinfo?(wkocher)
Comment 46•9 years ago
|
||
FF41 was EOL'd last week, so it's a wontfix.
Comment 48•7 years ago
|
||
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in
before you can comment on or make changes to this bug.
Description
•