Closed Bug 758531 Opened 8 years ago Closed 7 years ago

crash in gfxContext::PushClipsToDT

Categories

(Core :: Graphics, defect, critical)

Version: 15 Branch
Hardware: All
OS: Windows 7
Type: defect
Priority: Not set
Severity: critical

Tracking

RESOLVED FIXED
mozilla19
Tracking Status
firefox15 - ---
firefox17 - wontfix
firefox18 + verified
firefox19 --- verified
firefox-esr17 --- wontfix

People

(Reporter: scoobidiver, Assigned: bas.schouten)

Details

(Keywords: crash, regression, topcrash)

Attachments

(4 files)

It first appeared in 15.0a1/20120523164348 and is currently the #4 top crasher in today's build. The regression range is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=36e938e51481&tochange=d499dc65cdab
It's probably a regression from bug 715768 (see bug 734948).

There are two kinds of stack:
Frame 	Module 	Signature 	Source
0 	xul.dll 	gfxContext::PushClipsToDT 	gfx/thebes/gfxContext.cpp:2019
1 	xul.dll 	gfxContext::PushGroup 	gfx/thebes/gfxContext.cpp:1467
2 	xul.dll 	gfxContext::PushGroupAndCopyBackground 	gfx/thebes/gfxContext.cpp:1547
3 	xul.dll 	mozilla::layers::BasicLayerManager::PushGroupForLayer 	gfx/layers/basic/BasicLayers.cpp:620
4 	xul.dll 	mozilla::layers::BasicThebesLayer::PaintThebes 	gfx/layers/basic/BasicLayers.cpp:677
5 	xul.dll 	mozilla::layers::BasicLayerManager::PaintLayer 	gfx/layers/basic/BasicLayers.cpp:2023
6 	xul.dll 	mozilla::layers::BasicLayerManager::PaintLayer 	gfx/layers/basic/BasicLayers.cpp:2038
7 	xul.dll 	mozilla::layers::BasicLayerManager::EndTransactionInternal 	gfx/layers/basic/BasicLayers.cpp:1688
8 	xul.dll 	mozilla::layers::BasicLayerManager::EndTransaction 	gfx/layers/basic/BasicLayers.cpp:1635
9 	xul.dll 	mozilla::PaintInactiveLayer 	layout/base/FrameLayerBuilder.cpp:1560
10 	xul.dll 	mozilla::FrameLayerBuilder::DrawThebesLayer 	layout/base/FrameLayerBuilder.cpp:2527
11 	xul.dll 	gfxContext::Clip 	gfx/thebes/gfxContext.cpp:1127
12 	xul.dll 	gfxContext::Clip 	gfx/thebes/gfxContext.cpp:1128
13 	xul.dll 	mozilla::layers::ThebesLayerD3D10::DrawRegion 	gfx/layers/d3d10/ThebesLayerD3D10.cpp:460

Frame 	Module 	Signature 	Source
0 	xul.dll 	gfxContext::PushClipsToDT 	gfx/thebes/gfxContext.cpp:2019
1 	xul.dll 	gfxContext::PushGroup 	gfx/thebes/gfxContext.cpp:1467
2 	xul.dll 	nsCSSBorderRenderer::DrawBorders 	layout/base/nsCSSRenderingBorders.cpp:1649
3 	xul.dll 	nsCSSRendering::PaintBorderWithStyleBorder 	layout/base/nsCSSRendering.cpp:553
4 	xul.dll 	nsCSSRendering::PaintBorder 	layout/base/nsCSSRendering.cpp:398
5 	xul.dll 	nsDisplayBorder::Paint 	layout/base/nsDisplayList.cpp:1578
6 	xul.dll 	mozilla::FrameLayerBuilder::DrawThebesLayer 	layout/base/FrameLayerBuilder.cpp:2541
7 	xul.dll 	gfxContext::Clip 	gfx/thebes/gfxContext.cpp:1127

More reports at:
https://crash-stats.mozilla.com/report/list?signature=gfxContext%3A%3APushClipsToDT%28mozilla%3A%3Agfx%3A%3ADrawTarget*%29
Bent and I are hitting this a lot.  We would appreciate a timely fix.
I'm crashing here a bunch, but crash-stats is somehow unable to give me a stack. I've verified this stack manually using a minidump though.
Assignee: nobody → bas.schouten
Crash Signature: [@ gfxContext::PushClipsToDT(mozilla::gfx::DrawTarget*)] → [@ gfxContext::PushClipsToDT(mozilla::gfx::DrawTarget*)] [@ mozilla::gfx::DrawTargetDual::SetTransform(mozilla::gfx::Matrix const&)]
I'd just like to reiterate that I'm actively looking into this and am doing everything I can to find the cause. I'm guessing it's easy if I manage to catch this in a debugger.
Status: NEW → ASSIGNED
(In reply to ben turner [:bent] from comment #5)
> The minidump in
> https://crash-stats.mozilla.com/report/index/bp-015ff951-8d5b-4ee1-a123-2292d2120526
> is from my machine, maybe that will help?

I can't access raw dumps.
That minidump is 0 bytes :(
I've sent that minidump to Bas.
Hrm, it's really hard to figure this out from the minidump. If any of you is seeing this crash, please try to create a full dump so I can look at what's on the heap; I haven't been able to figure out what's going on yet :(
Caught this in a debugger today. Looks like we're OOM in gfxContext::PushGroup (CreateSimilarDrawTarget is returning null at http://mxr.mozilla.org/mozilla-central/source/gfx/thebes/gfxContext.cpp#1461 ). Bas suspects some kind of leak in some new azure code.
I'm looking into this. OOM does indeed look to be the cause, and it appears to show up mostly on specific operations (presumably pushing groups). I hope to track this down in the next day or so.
When a gfxPattern with the Azure-Thebes wrapper is used repeatedly, we placement-new Azure Patterns for efficiency. If we use the same gfxPattern repeatedly, though, we fail to call the last constructed Pattern's destructor before creating a new one. This causes us to leak a reference to the GradientStops collection for GradientPatterns.
Attachment #628210 - Flags: review?(roc)
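For illustration only (this is not the attached patch): a minimal C++ sketch of the leak pattern described above. The types Stops, GradientPattern and PatternHolder and the helper UseCountAfterReuse are hypothetical stand-ins, and std::shared_ptr stands in for Mozilla's reference counting. The point is that storage reused via placement new needs an explicit destructor call before the next construction, otherwise the old object's reference is never released.

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <utility>

// Hypothetical stand-in for a ref-counted GradientStops collection.
struct Stops {};

// Hypothetical stand-in for an Azure gradient pattern holding a reference.
struct GradientPattern {
  explicit GradientPattern(std::shared_ptr<Stops> aStops)
    : mStops(std::move(aStops)) {}
  std::shared_ptr<Stops> mStops;
};

// Wrapper that reuses inline storage for the pattern via placement new.
class PatternHolder {
public:
  ~PatternHolder() { Reset(); }

  GradientPattern* Init(std::shared_ptr<Stops> aStops) {
    // Without this Reset(), a second Init() would construct over the old
    // object without running its destructor, leaking its Stops reference.
    Reset();
    mPattern = new (mStorage) GradientPattern(std::move(aStops));
    return mPattern;
  }

private:
  void Reset() {
    if (mPattern) {
      mPattern->~GradientPattern();  // explicit destructor call
      mPattern = nullptr;
    }
  }

  alignas(GradientPattern) unsigned char mStorage[sizeof(GradientPattern)];
  GradientPattern* mPattern = nullptr;
};

// Returns the reference count on the stops while a holder that has been
// initialized aReuses times is still alive.
long UseCountAfterReuse(int aReuses) {
  auto stops = std::make_shared<Stops>();
  PatternHolder holder;
  for (int i = 0; i < aReuses; ++i) {
    holder.Init(stops);
  }
  return stops.use_count();
}
```

With the Reset() call in place, the count stays at two (the local pointer plus the one live pattern) however often the holder is reused. Removing it would let the count grow by one per reuse, which is the kind of leak the comment describes.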
https://hg.mozilla.org/mozilla-central/rev/2cfe694cbc1a
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla15
There are still crashes in 15.0a1/20120531.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
It does look to be a -lot- less common though, so it's possible this is just a general OOM that changed stack traces with Azure. How's the situation for khuey?
This seems to have dropped below the #20 top crasher. So at least we're in much better shape.
Duplicate of this bug: 758532
fwiw I haven't seen this since this landed.
It's #51 top crasher in 15.0a2 and #42 in 16.0a1.
Keywords: topcrash
[Triage Comment]
As per email from Bas stating that this likely only crashes now when we have a genuine OOM, combined with the crash volume on crash-stats now being relatively low, this can be untracked.
Crash Signature: [@ gfxContext::PushClipsToDT(mozilla::gfx::DrawTarget*)] [@ mozilla::gfx::DrawTargetDual::SetTransform(mozilla::gfx::Matrix const&)] → [@ gfxContext::PushClipsToDT(mozilla::gfx::DrawTarget*)] [@ mozilla::gfx::DrawTargetDual::SetTransform(mozilla::gfx::Matrix const&)] [@ mozilla::gfx::DrawTargetDual::DrawTargetDual(mozilla::gfx::DrawTarget*, mozilla::gfx::DrawTarget*)]
There's a spike in crashes from 17.0a1/20120714 (twice the previous level) affecting specifically Windows 8. The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f89feda9d997&tochange=22288130fea2
Keywords: topcrash
(In reply to Scoobidiver from comment #23)
> There's a spike [...] affecting specifically Windows 8.

There might be some correlation with Win8 shipping a public consumer preview yesterday, IIRC.
Let's wait and see if it's just another spike or if there is sustainable volume before tracking for 17.
Just had this crash.  I'm using Windows 8 RTM with the latest nVidia drivers (302.80) on a 8600M GT.  The crash profile is here: https://crash-stats.mozilla.com/report/index/bp-8b85e764-f761-45e7-9646-40b6a2120831

I was in the process of attempting to go full-screen on a Vimeo embedded player when the crash happened.
In my case Nightly crashes with taskbar tab previews: if I just hover over a tab on the taskbar, I get the crash right away. With HWA off there are no problems.
Just got this today viewing bugzilla. Seriously doubt it was an OOM problem.

https://crash-stats.mozilla.com/report/index/9beb9e69-634f-44d0-a260-5372a2120921

Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/18.0 Firefox/18.0
(In reply to semtex2 from comment #28)
> In My case Nightly crash with Taskbar tabs preview, if I only hover some tab
> on taskbar I got crash right away, when HWA off no problems.

Same here on Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:18.0) Gecko/18.0 Firefox/18.0 ID:20120921030601

bp-8bbd10fc-32f4-4a39-a6db-1a97f2120921
Depends on: 793175
Jim (or anyone), can you attach a debugger next time you crash and try to get more info? E.g., is aDT null? If so, try to figure out how mDT->CreateSimilarDrawTarget(mDT->GetSize(), gfxPlatform::GetPlatform()->Optimal2DFormatForContent(content)) in gfxContext::PushGroup returns null... is the size crazy?
STR:

1) Open the browser with a single tab
2) Enable tab previews under the tabs panel in options
3) restart, one tab open
4) hover over the firefox taskbar icon to bring up a tab preview
5) hover over the tab preview to bring up aero peek

crash
That #3 step isn't needed.
This is #3 crasher as of 9-24-12, up by 3.8%. Maybe this is a result of a recent change?
It spiked as of the 20th, so can someone get a list of changes?
Comment 33 is related to bug 793175, not this bug. The regression range for the spike is also in bug 793175.
The regression range for this bug is in comment 0.
(In reply to Scoobidiver from comment #37)
> Comment 33 is related to bug 793175, not this bug. The regression range for
> the spike is also in bug 793175.
> The regression range for this bug is in comment 0.

Ah sorry, thought I had it.
(In reply to Jim Mathies [:jimm] from comment #33)
> 1) Open the browser with a single tab
> 2) Enable tab previews under the tabs panel in options
> 3) restart, one tab open
> 4) hover over the firefox taskbar icon to bring up a tab preview
> 5) hover over the tab preview to bring up aero peek

This doesn't crash for me :-(
Does the page you've loaded in the tab matter?
Not sure. I get that crash in nightlies reliably though on win7. Let me refresh my mc repo and build up a debug build, see if I can reproduce and get a good stack.
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #40)
> Does the page you've loaded in the tab matter?

I can reproduce on about:home after opening the browser. 

https://crash-stats.mozilla.com/report/index/bp-0b05dbf9-8f48-407c-b5df-cbb9e2120925
Attached file stack
CreateSimilarDrawTarget in PushGroup returns null, and we don't check the result.

As scoobie points out though, this isn't the SetTransform crash.
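For illustration only: a minimal C++ sketch of what checking the result could look like. FakeDrawTarget and FakeContext are hypothetical, simplified stand-ins and not the real Moz2D or gfxContext interfaces; the failing flag simply simulates a backend that is out of memory or in an error state. The sketch assumes the caller can tolerate PushGroup reporting failure instead of crashing.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified draw target; not the real Moz2D interface.
struct FakeDrawTarget {
  int width;
  int height;
  bool failing;  // simulates a backend in an error state or out of memory

  // Returns nullptr on failure, mirroring the behavior reported above.
  std::unique_ptr<FakeDrawTarget> CreateSimilar(int aWidth, int aHeight) const {
    if (failing || aWidth <= 0 || aHeight <= 0) {
      return nullptr;
    }
    return std::unique_ptr<FakeDrawTarget>(
      new FakeDrawTarget{aWidth, aHeight, false});
  }
};

// Hypothetical context that keeps a stack of draw targets.
class FakeContext {
public:
  explicit FakeContext(const FakeDrawTarget& aTarget) {
    mTargets.push_back(
      std::unique_ptr<FakeDrawTarget>(new FakeDrawTarget(aTarget)));
  }

  // Pushes a new group target; reports failure instead of using a null one.
  bool PushGroup() {
    const FakeDrawTarget& current = *mTargets.back();
    std::unique_ptr<FakeDrawTarget> group =
      current.CreateSimilar(current.width, current.height);
    if (!group) {
      return false;  // leave the stack untouched
    }
    mTargets.push_back(std::move(group));
    return true;
  }

  std::size_t Depth() const { return mTargets.size(); }

private:
  std::vector<std::unique_ptr<FakeDrawTarget>> mTargets;
};
```

A null check like this only turns the crash into a recoverable failure; it would not explain why the similar target could not be created in the first place.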
Attached file stack with params
Here's the failure, it's in DrawTargetCairo::CreateSimilarDrawTarget's cairo_surface_create_similar.
Dupe or not, this is at #60 topcrasher for 17 which is too low to track for release. If that changes once 17 is on Beta and we collect more data AND this bug isn't a dupe of bug 793175, please renominate for tracking.
It's #2 top browser crasher in today's build, back to the pre-bug 793175 volume:
2012100903      20
builds with bug 793175
2012092003	25
2012091903	30
2012091803	22
2012091703	12
2012091603	11
2012091503	26
2012091403	28  <-- second spike
2012091303	12
2012091208	5
2012091203	7
2012091114	9  <-- first spike with IonMonkey
2012091103	3
2012091003	3
2012090903	4
2012090803	3
2012090703	4
2012090603	3
2012090503	2
2012090103	2
2012083003	1
2012082903	1
2012082803	5
FWIW Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:19.0) Gecko/19.0 Firefox/19.0 ID:20121010030605 crashes like bp-7e941b8b-a29f-45fc-9ae6-4ac1f2121010 when I was trying to verify this bug as fixed.

The crash report points to bug 698391.

If there's anything I could provide, please ask.
(In reply to alex_mayorga from comment #49)
> FWIW Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:19.0) Gecko/19.0
> Firefox/19.0 ID:20121010030605 crashes like
> bp-7e941b8b-a29f-45fc-9ae6-4ac1f2121010 when I was trying to verify this bug
> as fixed.
How can you verify an open bug? Did you want to verify bug 793175 instead?

If you have STR with a new crash signature, please file a new bug.
Silly me =(

You're right, I've posted the crash IDs to bug 793175.
Let's see if this continues to be a top crash on Aurora before tracking for release.
My crash report: https://crash-stats.mozilla.com/report/index/bp-6bb5f3f1-49fe-457e-81d8-cc8082121011

Build: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:19.0) Gecko/19.0 Firefox/19.0 ID:20121010030605

about:buildconfig
Build Machine

w64-ix-slave102
Source

Built from http://hg.mozilla.org/mozilla-central/rev/ec10630b1a54
Build platform
target
i686-pc-mingw32
Build tools
Compiler 	Version 	Compiler flags
%cl InvokeClWithDependencyGeneration cl 	16.00.30319.01 	-TC -nologo -W3 -Gy -Fdgenerated.pdb -we4553 -DNDEBUG -DTRIMMED -Zi -UDEBUG -DNDEBUG -GL -wd4624 -wd4952 -O1 -Oy-
%cl InvokeClWithDependencyGeneration cl 	16.00.30319.01 	-TP -nologo -W3 -Gy -Fdgenerated.pdb -wd4345 -wd4351 -wd4800 -we4553 -GR- -DNDEBUG -DTRIMMED -Zi -UDEBUG -DNDEBUG -GL -wd4624 -wd4952 -O1 -Oy-
Configure arguments

--enable-update-channel=nightly --enable-update-packaging --enable-jemalloc --enable-signmar --enable-profiling --enable-js-diagnostics
I got a crash here on Try by running reftests with Cairo canvas and D2D content. That's not a properly supported configuration, but it occurs, for example, when we fall back for a very large canvas, and (I think) for tab previews too, which would explain the STR above. This is fairly new: a month ago, we did not crash on the same Try run.

https://tbpl.mozilla.org/php/getParsedLog.php?id=16105085&tree=Try#error162
So this doesn't appear to be a duplicate of bug 793175.

The other theory was that this is somehow related to Azure. Any reason why this bug would drop off when we moved Azure up to Beta, Bas?
(In reply to Alex Keybl [:akeybl] from comment #55)
> So this doesn't appear to be a duplicate of bug 793175.
> 
> The other theory was that this is somehow related to Azure. Any reason why
> this bug would drop off when we moved Azure up to Beta, Bas?

Presumably fewer people on Beta have acceleration enabled? :)
(In reply to Nick Cameron [:nrc] from comment #54)
> I got a crash here on Try by running reftests with Cairo canvas and D2D
> content. That's not a properly supported configuration, but occurs, for
> example when we fall back for a very large canvas, and (I think) for tab
> previews too, which would explain the STR above. This is a fairly new thing,
> a month ago, we did not crash on the same Try run.
> 
> https://tbpl.mozilla.org/php/getParsedLog.php?id=16105085&tree=Try#error162

I can't seem to reproduce this locally :(
Did you see the tbpl crash multiple times Nick?
Flags: needinfo?(ncameron)
(In reply to Bas Schouten (:bas) from comment #58)
> Did you see the tbpl crash multiple times Nick?

I've only seen it once, but I've only recently pushed Cairo canvas/D2D content to Try once. I haven't tried to recreate locally either. I can do another similar push if you would like and see if it still happens.
Flags: needinfo?(ncameron)
(In reply to Nick Cameron [:nrc] from comment #54)
> I got a crash here on Try by running reftests with Cairo canvas and D2D
> content. That's not a properly supported configuration, but occurs, for
> example when we fall back for a very large canvas, and (I think) for tab
> previews too, which would explain the STR above. This is a fairly new thing,
> a month ago, we did not crash on the same Try run.
> 
> https://tbpl.mozilla.org/php/getParsedLog.php?id=16105085&tree=Try#error162

Since I can't repro it would be great if you could repush a similar run at least so we can see if we're dealing with a reproducible problem.
(In reply to Nick Cameron [:nrc] from comment #61)
> Sure, lets see what happens:
> https://tbpl.mozilla.org/?tree=Try&rev=dd8653476848

Yep, same crash again: https://tbpl.mozilla.org/php/getParsedLog.php?id=16291750&tree=Try#error161
DWrite fonts create mScaledFont on demand, so GetCairoScaledFont can return NULL if it has not been created yet. That put the surface into an error status, which in turn caused the CreateSimilar call on that surface to fail. This patch uses the call that creates the cairo_scaled_font_t if it's not there yet.

I'm unsure this is what's responsible for most of the crashes we're seeing, but let's get it in quickly and see.
Attachment #673630 - Flags: review?(jmuizelaar)
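For illustration only (this is not the attached patch): a minimal C++ sketch of the difference between a plain getter and an accessor that creates its object on demand. FakeFont, FakeNativeFont, GetCached and GetOrCreate are hypothetical names chosen for the example; they are not the real ScaledFont API.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a native handle such as cairo_scaled_font_t.
struct FakeNativeFont {
  int size;
};

// Hypothetical font wrapper that builds its native handle lazily.
class FakeFont {
public:
  explicit FakeFont(int aSize) : mSize(aSize) {}

  // Plain getter: returns the cached handle, which is null until created.
  FakeNativeFont* GetCached() const { return mNative.get(); }

  // Creating accessor: builds the handle on first use, then reuses it.
  FakeNativeFont* GetOrCreate() {
    if (!mNative) {
      mNative.reset(new FakeNativeFont{mSize});
    }
    return mNative.get();
  }

private:
  int mSize;
  std::unique_ptr<FakeNativeFont> mNative;
};
```

A caller that uses the plain getter before anything has triggered creation gets a null handle, which is the situation the comment describes. Calling the creating accessor first avoids it.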
Blocks: 803949
Comment on attachment 673630 [details] [diff] [review]
Use CairoScaledFont instead of GetCairoScaledFont

Review of attachment 673630 [details] [diff] [review]:
-----------------------------------------------------------------

Include the nice bugzilla comment in the checkin comment.
Attachment #673630 - Flags: review?(jmuizelaar) → review+
https://hg.mozilla.org/mozilla-central/rev/74dd92789173

Should this have a test?
Status: REOPENED → RESOLVED
Closed: 8 years ago → 7 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
(In reply to Ryan VanderMeulen from comment #65)
> https://hg.mozilla.org/mozilla-central/rev/74dd92789173
> 
> Should this have a test?

This particular crash is not reproducible 'on purpose' to the best of my knowledge.
Target Milestone: mozilla15 → mozilla19
Only two crashes so far with the new build, which is considerably fewer than we saw before. Let's wait one more day and then start requesting aurora and beta approval for these patches (along with the patches suspected of causing the crash).
There are currently 7 crashes, 6 of them on Windows 8, compared with an average of 25 previously, so they dropped by roughly 75%. I filed bug 805406 for the remaining crashes.
Comment on attachment 673630 [details] [diff] [review]
Use CairoScaledFont instead of GetCairoScaledFont

[Approval Request Comment]
Bug caused by (feature/regressing bug #): Bug 778367
User impact if declined: Possible, rare crash.
Testing completed (on m-c, etc.): nightly coverage
Risk to taking this patch (and alternatives if risky): Barely any, this is strictly an improvement.
String or UUID changes made by this patch: None
Attachment #673630 - Flags: approval-mozilla-beta?
Attachment #673630 - Flags: approval-mozilla-aurora?
Comment on attachment 673630 [details] [diff] [review]
Use CairoScaledFont instead of GetCairoScaledFont

Given the risk assessment and the opportunity to get this into Beta 4 for 17, approving for uplift to branches.
Attachment #673630 - Flags: approval-mozilla-beta?
Attachment #673630 - Flags: approval-mozilla-beta+
Attachment #673630 - Flags: approval-mozilla-aurora?
Attachment #673630 - Flags: approval-mozilla-aurora+
(In reply to Lukas Blakk [:lsblakk] from comment #70)
> Comment on attachment 673630 [details] [diff] [review]
> Use CairoScaledFont instead of GetCairoScaledFont
> 
> Given the risk assessment and the opportunity to get this into Beta 4 for 17
> approving for uplift to branches.

Turns out this patch does not need uplifting; this issue was only introduced on Aurora. The reason it was nominated was that we didn't know which of the two PushClipsToDT patches fixed the crashes! Guess it wasn't this one!
(In reply to Bas Schouten (:bas.schouten) from comment #71)
> (In reply to Lukas Blakk [:lsblakk] from comment #70)
> > Comment on attachment 673630 [details] [diff] [review]
> > Use CairoScaledFont instead of GetCairoScaledFont
> > 
> > Given the risk assessment and the opportunity to get this into Beta 4 for 17
> > approving for uplift to branches.
> 
> Turns out this patch does not need uplifting, this issue only introduced on
> Aurora. The reason it was nominated was because we didn't know which of the
> two PushClipsToDT patches fixed the crashes! Guess it wasn't this one!

Doesn't need uplifting to beta, that is. It should still go on Aurora.
Still a good number of crashes on 18 beta, but these seem to be tracked in bug 805406, so setting this to verified (original high number reduced)

https://crash-stats.mozilla.com/report/list?signature=gfxContext%3A%3APushClipsToDT%28mozilla%3A%3Agfx%3A%3ADrawTarget*%29
Keywords: verifyme
QA Contact: virgil.dicu
Wontfixing for ESR17 since this unfortunately didn't make it into Firefox 17 and it doesn't meet ESR criteria without signs of significant user pain. We can revisit if there are high levels of crash volume in ESR deployments.
(In reply to Virgil Dicu [:virgil] [QA] from comment #74)
> Still a good number of crashes on 18 beta, but these seem to be tracked in
> bug 805406, so setting this to verified (original high number reduced)
> 
> https://crash-stats.mozilla.com/report/list?signature=gfxContext%3A%3APushClipsToDT%28mozilla%3A%3Agfx%3A%3ADrawTarget*%29

Setting to verified in 19 for the same reason. Still a good number of crashes, though. Bug 805406 to track those, I guess.
mass remove verifyme requests greater than 4 months old
Keywords: verifyme