Closed
Bug 1146313
Opened 10 years ago
Closed 9 years ago
crash in mozilla::layers::CompositorD3D11::UpdateConstantBuffers()
Categories
(Core :: Graphics, defect)
Tracking
()
RESOLVED
FIXED
mozilla39
People
(Reporter: kairo, Assigned: mattwoodrow)
Details
(Keywords: crash, regression)
Crash Data
Attachments
(2 files)
38.60 KB, text/plain
1.21 KB, patch (jrmuizel: review+)
This bug was filed from the Socorro interface and is
report bp-8c45571f-fdd1-43fb-94bf-cc7d72150323.
=============================================================
This is a new topcrash in 37.0b7 (it exists at very low level in other versions, but only spiked with this one).
There seem to be a few different stacks: besides the one above, there are also those in bp-7ac3a62e-5d81-49fc-98e6-6277f2150323 and bp-c0e3f013-cb56-4b4a-a654-27ce42150323.
The top few frames are either this:
0 xul.dll mozilla::layers::CompositorD3D11::UpdateConstantBuffers() gfx/layers/d3d11/CompositorD3D11.cpp
1 xul.dll mozilla::layers::CompositorD3D11::DrawQuad(mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::EffectChain const&, float, mozilla::gfx::Matrix4x4 const&) gfx/layers/d3d11/CompositorD3D11.cpp
2 xul.dll mozilla::layers::ContentHostTexture::Composite(mozilla::layers::EffectChain&, float, mozilla::gfx::Matrix4x4 const&, mozilla::gfx::Filter const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, nsIntRegion const*) gfx/layers/composite/ContentHost.cpp
3 xul.dll mozilla::layers::PaintedLayerComposite::RenderLayer(nsIntRect const&) gfx/layers/composite/PaintedLayerComposite.cpp
[...]
Or this:
0 xul.dll mozilla::layers::CompositorD3D11::UpdateConstantBuffers() gfx/layers/d3d11/CompositorD3D11.cpp
1 xul.dll mozilla::layers::CompositorD3D11::ClearRect(mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&) gfx/layers/d3d11/CompositorD3D11.cpp
2 xul.dll mozilla::layers::CompositorD3D11::BeginFrame(nsIntRegion const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*, mozilla::gfx::RectTyped<mozilla::gfx::UnknownUnits>*) gfx/layers/d3d11/CompositorD3D11.cpp
[...]
The crashes are mainly on Win7 and are all EXCEPTION_ACCESS_VIOLATION_WRITE with non-null, mostly high addresses.
Most graphics adapters are
Click the link in the Crash Signature field of this bug to get more reports and stats.
Reporter
Comment 1•10 years ago
[Tracking Requested - why for this release]:
This is #4 with 2.8% of all 37.0b7 crashes (and b7 seems to have a higher crash rate than b6 before).
Bas, nical: Any idea what's up here?
status-firefox37:
--- → affected
tracking-firefox37:
--- → ?
Flags: needinfo?(nical.bugzilla)
Flags: needinfo?(bas)
Reporter
Comment 2•10 years ago
Also note that this is the #1 crash (16%) in 37.0b7 among reports with YouTube in the URL.
Comment 3•10 years ago
As Kairo said, this is not a new crash in 37 but is certainly much more explosive.
We're scheduled to build the 37 desktop RC today. ni kats and Jeff to help as well, as we have very little time to figure this out. Is there something that we can back out?
status-firefox36:
--- → wontfix
status-firefox38:
--- → affected
status-firefox39:
--- → affected
Flags: needinfo?(jmuizelaar)
Flags: needinfo?(bugmail.mozilla)
Comment 4•10 years ago
I have no idea what could have caused this. It seems like it should only happen with a driver bug. It looks like it is happening more on Intel cards. Can we get some correlation information on devices/drivers?
Flags: needinfo?(jmuizelaar) → needinfo?(kairo)
Reporter
Comment 5•10 years ago
(In reply to Jeff Muizelaar [:jrmuizel] from comment #4)
> I have no idea what could have caused this. It seems like it should only
> happen with a driver bug. It looks like it is happening more on Intel cards.
> Can we get some correlation information on devices/drivers?
You can get that in detail from the Signature Summary in the link on the Crash Signature field, but I see I didn't finish the sentence I started in comment #0 about the adapters, sorry.
What I wanted to say there is: "Most graphics adapters are Intel, but there are AMD and NVidia adapters in the mix as well."
Flags: needinfo?(kairo)
Comment 6•10 years ago
From a random sampling of the crash stacks it looks like some are crashing at [1] and some are crashing at [2]. I don't know this code at all, but clearly the mContext->Map call is not populating resource.pData properly and we are expecting it to. Considering the crash address is nonzero, I don't think we can check for null to guard against this; it seems to be putting some unwritable address there. As far as I can tell from my random sampling, there doesn't appear to be a correlation in the actual crash address or in any of the memory stats (total/available memory/page file/physical memory/etc.).
It also doesn't look like this code was modified directly between b6 and b7, so I'm not sure what we could back out to fix this; it must be fallout from some other change.
[1] http://hg.mozilla.org/releases/mozilla-beta/annotate/790546ceb89f/gfx/layers/d3d11/CompositorD3D11.cpp#l1362
[2] http://hg.mozilla.org/releases/mozilla-beta/annotate/790546ceb89f/gfx/layers/d3d11/CompositorD3D11.cpp#l1355
Flags: needinfo?(bugmail.mozilla)
Comment 7•10 years ago
Here's a list of all the app notes extracted from the raw crash data on 2015-03-23. I ran it through sort | uniq -c | sort -rn to get a list sorted by frequency (the number at the start of each line is the number of occurrences).
Reporter
Comment 8•10 years ago
FWIW, I pretty much suspect that this could be fallout from bug 1138967, which was the most risky gfx patch we took specifically for 37.0b7. Would that patch change the cases you talk about in comment #6?
Comment 9•10 years ago
It looks like the most likely candidate, but again I'm not familiar with that code so I can't say for sure.
Comment 10•10 years ago
kats - We're going to need to deal with this in 37. Who is more familiar with the code and can determine whether we need to back out bug 1138967?
Flags: needinfo?(bugmail.mozilla)
Comment 11•10 years ago
Matt would probably be the person. On IRC he said "I'm working on it".
Flags: needinfo?(bugmail.mozilla) → needinfo?(matt.woodrow)
Assignee
Comment 12•10 years ago
pData isn't initialized by us, so I guess it's possible that Map() is returning S_OK, but not setting pData.
We could initialize it to null and then check for that as well as the HRESULT.
That doesn't explain why this spiked, but I agree that bug 1138967 is the most likely.
I'll probably back out part 3 of that bug soon, given the number of regressions.
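The guard proposed in this comment can be sketched in portable C++. This is a minimal illustration under stated assumptions, not the landed patch: `HRESULT`, `S_OK`, `Failed`, `MappedSubresource`, `MapFn`, `GoodMap`, and `BuggyMap` are stand-ins defined here so the example compiles without the Windows SDK. In the real code the call would be `ID3D11DeviceContext::Map` filling in a `D3D11_MAPPED_SUBRESOURCE`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-ins for the Windows types (assumptions for this sketch only).
using HRESULT = std::int32_t;
constexpr HRESULT S_OK = 0;
inline bool Failed(HRESULT hr) { return hr < 0; }

// Mirrors only the pData field of D3D11_MAPPED_SUBRESOURCE.
struct MappedSubresource {
  void* pData;
};

// Hypothetical mapper signature standing in for ID3D11DeviceContext::Map.
using MapFn = HRESULT (*)(MappedSubresource* out);

// Copy `size` bytes into the mapped buffer, but only if the call succeeded
// AND actually filled in pData.
bool UpdateConstantBuffer(MapFn map, const void* src, std::size_t size) {
  assert(map != nullptr && src != nullptr);
  MappedSubresource resource;
  resource.pData = nullptr;  // pre-initialize so an untouched field is detectable
  const HRESULT hr = map(&resource);
  if (Failed(hr) || resource.pData == nullptr) {
    return false;  // the caller can log the failure and skip the draw
  }
  std::memcpy(resource.pData, src, size);
  return true;
}

// Simulated drivers for demonstration.
static unsigned char gFakeBuffer[16];
HRESULT GoodMap(MappedSubresource* out) {
  out->pData = gFakeBuffer;
  return S_OK;
}
HRESULT BuggyMap(MappedSubresource*) {
  return S_OK;  // reports success but never sets pData
}
```

With these stand-ins, `UpdateConstantBuffer(BuggyMap, data, size)` returns false instead of writing through an uninitialized pointer. The check only covers the case where the driver leaves the field untouched; it cannot catch a driver that stores a non-null but unwritable address, which is the concern raised in comment 6.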
Assignee
Updated•10 years ago
Flags: needinfo?(matt.woodrow)
Updated•10 years ago
Flags: needinfo?(nical.bugzilla)
Comment 13•10 years ago
Parking with matt for now as he's looking into it.
Assignee: nobody → matt.woodrow
Flags: needinfo?(bas)
Assignee
Comment 14•10 years ago
Blacklisting the main driver here should get this back to the previously low levels, but it shouldn't hurt to avoid crashing anyway.
It looks like the Map call is returning S_OK, but not setting pData.
Attachment #8582836 -
Flags: review?(bas)
Comment 15•10 years ago
Comment on attachment 8582836
Avoid crashing in UpdateConstantBuffers
Review of attachment 8582836:
-----------------------------------------------------------------
This shouldn't make a difference, but you should add a gfxCriticalError in the case where things go unexpected.
Attachment #8582836 -
Flags: review?(bas) → review+
Assignee
Comment 16•10 years ago
Comment 17•10 years ago
Comment on attachment 8582836
Avoid crashing in UpdateConstantBuffers
Review of attachment 8582836:
-----------------------------------------------------------------
This is insane. But let's do it.
Comment 18•10 years ago
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Reporter
Comment 19•10 years ago
This is back to larger volume in Dev Edition 39 builds of April 9 - 11. I'm not reopening right now, though, as I don't see reports on the April 12 builds.
Comment 20•10 years ago
This no longer looks like a significant enough crash to fix in 37. I have marked 37 as wontfix. 38 is marked as affected. Does this need to be uplifted to Beta?
Flags: needinfo?(matt.woodrow)
Reporter
Comment 21•10 years ago
(In reply to Lawrence Mandel [:lmandel] (use needinfo) from comment #20)
> This no longer looks like a significant enough crash to fix in 37. I have
> marked 37 as wontfix. 38 is marked as affected. Does this need to be
> uplifted to Beta?
No, it also is of no significant volume in 38. It only is back to larger levels on 39.
Comment 22•10 years ago
This is definitely back on aurora 39, and on nightly 40 as well. It looks like the same symptoms as before: bogus pData pointers -- despite the init and hr check!
It shot up on aurora build 20150409004007. Regression range: https://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=85071beda936&tochange=9dd03bf49426
Two-thirds of URLs are on YouTube. All Win7 and Win7SP1. All Intel drivers with version <= 8.15.10.2993, but nearly all of them are <= 8.15.10.2622, which makes me think bug 1151721 may be related.
Comment 23•10 years ago
Aha: pData is pointing to nonwritable pages inside igd10umd32.dll.
Assignee
Comment 24•10 years ago
Bug 1151721 seems believable: disabling hardware decoding and uploading video textures manually will change our driver usage and hit bugs that we weren't hitting before.
Even without hardware decoding blacklisted, we still might not use it in some cases (too many active DXVA decoders for one), so we probably need to handle <= 2993.
Bas, are you ok with us blacklisting d3d11 layers for these intel driver versions?
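A range check like "driver version <= 8.15.10.2993" has to compare the four components numerically, since a plain string comparison would rank "8.15.10.999" above "8.15.10.2993". The following is a minimal sketch of such a comparison, assuming the `a.b.c.d` format seen in the crash reports. `DriverVersion`, `ParseDriverVersion`, and `IsAtOrBelow` are hypothetical names for illustration, and this is not how Gecko's own driver blocklist is implemented.

```cpp
#include <array>
#include <cassert>
#include <cstdio>
#include <string>

using DriverVersion = std::array<unsigned, 4>;

// Parse "a.b.c.d" into four numeric parts. Rejects input with missing
// components or trailing characters after the fourth component.
bool ParseDriverVersion(const std::string& text, DriverVersion& out) {
  unsigned a = 0, b = 0, c = 0, d = 0;
  char trailing = 0;
  // The extra %c only matches if something follows the fourth number,
  // in which case sscanf returns 5 instead of 4.
  if (std::sscanf(text.c_str(), "%u.%u.%u.%u%c", &a, &b, &c, &d, &trailing) != 4) {
    return false;
  }
  out = DriverVersion{{a, b, c, d}};
  return true;
}

// True if `version` parses and is numerically <= `limit`.
// An unparseable `version` is treated as not matching the range.
bool IsAtOrBelow(const std::string& version, const std::string& limit) {
  DriverVersion parsedLimit{};
  const bool limitOk = ParseDriverVersion(limit, parsedLimit);
  assert(limitOk && "limit must be a well-formed a.b.c.d string");
  (void)limitOk;
  DriverVersion parsedVersion{};
  if (!ParseDriverVersion(version, parsedVersion)) {
    return false;
  }
  return parsedVersion <= parsedLimit;  // std::array compares lexicographically
}
```

For example, `IsAtOrBelow("8.15.10.2622", "8.15.10.2993")` is true and `IsAtOrBelow("8.15.10.3000", "8.15.10.2993")` is false. Treating unparseable versions as "not in range" is a design choice of this sketch; a real blocklist might prefer to fail closed.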
Flags: needinfo?(matt.woodrow) → needinfo?(bas)
Comment 25•10 years ago
Matt, let's get a patch and we can get it reviewed and landed; this is way too high in the crashers list to not do something about it :)
Flags: needinfo?(matt.woodrow)
Comment 26•10 years ago
Is this related to bug 1153123?
Reporter
Comment 27•10 years ago
[Tracking Requested - why for this release]:
This is now the top crash on 39.0b1 outside of OOM|small.
tracking-firefox39:
--- → ?
Updated•10 years ago
Flags: needinfo?(milan)
Comment 28•10 years ago
Tracking 39+ because regression, tracking 40+ because it could affect 40; is a top crash.
Comment 29•10 years ago
Matt, are you still looking at this crash? It still sounds like the top crash on 39 beta 3, 5% of overall crash rate for 39.
Reporter
Comment 30•10 years ago
Note that this seems to be practically exclusive to Intel chipsets with driver versions of 8.15.10.* (a lot of different numbers appear on that 4th part of the version, so a pretty big range there), see the different facets on https://crash-stats.mozilla.com/search/?product=Firefox&version=39.0b3&process_type=browser&process_type=content&signature=%3Dmozilla%3A%3Alayers%3A%3ACompositorD3D11%3A%3AUpdateConstantBuffers%28%29&_facets=adapter_vendor_id&_facets=adapter_device_id&_facets=adapter_driver_version&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-adapter_driver_version
Comment 31•10 years ago
Yeah, the Intel correlation would be consistent with comment 23.
I see that nearly all the crashes have a gfxCriticalError that says "Failed to map PSConstantBuffer. Result: -2147024882" over and over. In hex that's 0x8007000e, which means "Not enough storage is available to complete this operation."
Virtual/physical/pagefile stats generally look fine. Could it be referring to video memory?
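The decimal-to-hex step above can be reproduced by reinterpreting the signed value as a 32-bit unsigned HRESULT and splitting it into its standard fields. A small sketch follows; the helper names (`HresultBits`, `IsFailure`, `Facility`, `Code`, `ToHex`) are made up for this example. For 0x8007000e the severity bit is set, the facility is 7 (FACILITY_WIN32), and the code is 14 (ERROR_OUTOFMEMORY), which is where the quoted message text comes from.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Reinterpret the signed value printed in a log as the raw 32 HRESULT bits.
std::uint32_t HresultBits(std::int32_t hr) {
  return static_cast<std::uint32_t>(hr);
}

// Bit 31 is the severity bit: set means failure.
bool IsFailure(std::uint32_t bits) { return (bits >> 31) != 0; }

// Bits 16-26 hold the facility (7 is FACILITY_WIN32).
std::uint32_t Facility(std::uint32_t bits) { return (bits >> 16) & 0x7FFu; }

// Bits 0-15 hold the error code within that facility.
std::uint32_t Code(std::uint32_t bits) { return bits & 0xFFFFu; }

// Format as "0x" followed by eight lowercase hex digits.
std::string ToHex(std::uint32_t bits) {
  char buf[11];
  const int written =
      std::snprintf(buf, sizeof buf, "0x%08x", static_cast<unsigned>(bits));
  assert(written == 10);
  (void)written;
  return std::string(buf);
}
```

Decoding the facility and code this way shows the value is a wrapped Win32 error rather than a D3D-specific one, so the HRESULT alone does not say which kind of memory ran out.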
Comment 32•10 years ago
Are all of these crashes D3D11 + D2D combination (as in, not D3D11 + D2D 1.1 combination), right?
Comment 33•10 years ago
(In reply to Milan Sreckovic [:milan] from comment #32)
> Are all of these crashes D3D11 + D2D combination (as in, not D3D11 + D2D 1.1
> combination), right?
Right.
Assignee
Comment 34•10 years ago
We tried to reproduce this on a machine in the Toronto office, and got black video and browser hangs but no crash.
It's fixed in nightly though, my best guess is bug 1153123 (though I haven't confirmed it).
I don't see any crash reports for nightly with builds since this landed, might not be enough data. We should see if this drops off in beta 4 when this was uplifted to beta.
Flags: needinfo?(matt.woodrow)
Comment 35•10 years ago
(In reply to Matt Woodrow (:mattwoodrow) from comment #34)
> I don't see any crash reports for nightly with builds since this landed,
> might not be enough data.
Nightly had a consistent trickle of single-digit crashes per day, up until the day that bug 1153123 landed, and zero crashes in the two weeks since. That's pretty good in my book!
Comment 36•10 years ago
While we're waiting for the beta numbers (bug 1153123 got uplifted to beta on Monday), Matt is going to prepare a patch to completely disable client side uploading, and that will be the big hammer ready to be applied to beta in case we don't see this bug go away there.
Flags: needinfo?(milan)
Comment 37•10 years ago
I'm adding this FlushDeletionPool signature because it has the same symptoms: 8.15.10.x drivers; D3D11+D2D; no crashes on nightly after bug 1153123. I thoroughly expect it to disappear in 39b4. In the unlikely event that it doesn't, I'll split off a new bug.
Crash Signature: [@ mozilla::layers::CompositorD3D11::UpdateConstantBuffers()] → [@ mozilla::layers::CompositorD3D11::UpdateConstantBuffers()]
[@ NOutermost::CDevice::FlushDeletionPool(bool) ]
Comment 38•10 years ago
No 39b7 crashes.
Comment 39•10 years ago
[Tracking Requested - why for this release]:
I see 2 crashes for 39.0b7, but that sounds encouraging!
I'll mark this fixed for 39.
Comment 40•10 years ago
40 betas are affected too.
Comment 41•10 years ago
I see very few reports for 40 going back as far as Nightly. It doesn't look like this qualifies as a topcrash on 40 and, given that the crash rate for 40 is in an acceptable range, this is now wontfix for 40.
Note that given the low rate on 40, this bug is not tracked for 41+.
Comment 42•9 years ago
This still affects some users but it's nowhere near a topcrash anymore.
Firefox 40 has 4 reports.
Firefox 41 has 17 reports.
Firefox 42 has 3 reports.
Firefox 43 has 0 reports.
Firefox 44 has 0 reports.
Keywords: topcrash
Updated•9 years ago
Crash Signature: [@ mozilla::layers::CompositorD3D11::UpdateConstantBuffers()]
[@ NOutermost::CDevice::FlushDeletionPool(bool) ] → [@ mozilla::layers::CompositorD3D11::UpdateConstantBuffers()]
[@ NOutermost::CDevice::FlushDeletionPool(bool) ]
[@ mozilla::layers::CompositorD3D11::UpdateConstantBuffers]
[@ NOutermost::CDevice::FlushDeletionPool ]
Comment 44•9 years ago
This still affects users in current branches but at extremely low volume. I think this bug was originally filed because it was a spiking crash so I think we can close this now. I nominate that we close this bug report and file a new one if we want to deal with the outliers.
Let's call it closed.
Status: REOPENED → RESOLVED
Closed: 10 years ago → 9 years ago
Flags: needinfo?(bas)
Resolution: --- → FIXED