Closed Bug 793175 Opened 9 years ago Closed 9 years ago

Firefox 18 spike in crashes @ gfxContext::PushClipsToDT while hovering over tab preview in Task Bar

Categories

(Core :: Graphics, defect)

18 Branch
All
Windows 7
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla18
Tracking Status
firefox18 + verified

People

(Reporter: scoobidiver, Assigned: bas.schouten)

References

Details

(4 keywords)

Crash Data

Attachments

(1 file)

There's a spike in crashes from 18.0a1/20120921. The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=1e56d3016820&tochange=48c4938eaf57

Signature 	gfxContext::PushClipsToDT(mozilla::gfx::DrawTarget*) More Reports Search
UUID	e74430b0-4e7a-4c86-ac21-706972120921
Date Processed	2012-09-21 14:23:41
Uptime	1335
Last Crash	2.7 weeks before submission
Install Age	22.2 minutes since version was first installed.
Install Time	2012-09-21 14:01:16
Product	Firefox
Version	18.0a1
Build ID	20120921030601
Release Channel	nightly
OS	Windows NT
OS Version	6.1.7601 Service Pack 1
Build Architecture	x86
Build Architecture Info	GenuineIntel family 6 model 58 stepping 9
Crash Reason	EXCEPTION_ACCESS_VIOLATION_READ
Crash Address	0x0
App Notes 	
AdapterVendorID: 0x10de, AdapterDeviceID: 0x0e22, AdapterSubsysID: 23811462, AdapterDriverVersion: 9.18.13.623
Has dual GPUs. GPU #2: AdapterVendorID2: 0x8086, AdapterDeviceID2: 0x0162, AdapterSubsysID2: 0000000c, AdapterDriverVersion2: 9.17.10.2849D2D! D2D+ DWrite? DWrite+ D3D10 Layers? D3D10 Layers+ 
EMCheckCompatibility	True
Adapter Vendor ID	0x10de
Adapter Device ID	0x0e22
Total Virtual Memory	4294836224
Available Virtual Memory	3217252352
System Memory Use Percentage	20
Available Page File	22063792128
Available Physical Memory	9950052352

Frame 	Module 	Signature 	Source
0 	xul.dll 	gfxContext::PushClipsToDT 	gfx/thebes/gfxContext.cpp:2031
1 	xul.dll 	gfxContext::PushGroup 	gfx/thebes/gfxContext.cpp:1479
2 	xul.dll 	mozilla::layers::BasicLayerManager::PushGroupForLayer 	gfx/layers/basic/BasicLayerManager.cpp:81
3 	xul.dll 	mozilla::layers::BasicThebesLayer::PaintThebes 	gfx/layers/basic/BasicThebesLayer.cpp:132
4 	xul.dll 	mozilla::layers::BasicLayerManager::PaintSelfOrChildren 	gfx/layers/basic/BasicLayerManager.cpp:822
5 	xul.dll 	mozilla::layers::BasicLayerManager::PaintLayer 	gfx/layers/basic/BasicLayerManager.cpp:941
6 	xul.dll 	mozilla::layers::BasicLayerManager::PaintSelfOrChildren 	gfx/layers/basic/BasicLayerManager.cpp:837
7 	xul.dll 	mozilla::layers::BasicLayerManager::PaintLayer 	gfx/layers/basic/BasicLayerManager.cpp:941
8 	xul.dll 	mozilla::layers::BasicLayerManager::EndTransactionInternal 	gfx/layers/basic/BasicLayerManager.cpp:585
9 	xul.dll 	mozilla::layers::BasicLayerManager::EndTransaction 	gfx/layers/basic/BasicLayerManager.cpp:506

More reports at:
https://crash-stats.mozilla.com/report/list?signature=gfxContext%3A%3APushClipsToDT%28mozilla%3A%3Agfx%3A%3ADrawTarget*%29
Keywords: topcrash
Bug 772726 will have caused this.
Blocks: 772726
It's #1 top crasher with about 15 crashes an hour.
Assignee: nobody → bas.schouten
STR:
* Open two tabs
* Check "Show tab preview in the Windows taskbar" in Options - Tabs
* Hover over one of the tab thumbnail in the taskbar
(In reply to Scoobidiver from comment #3)
> STR:
> * Open two tabs
> * Check "Show tab preview in the Windows taskbar" in Options - Tabs
> * Hover over one of the tab thumbnail in the taskbar

And HWA must be on, when not available/disabled then there is no problem.
DrawTargetCairo::CreateSimilarDrawTarget is failing in cairo_surface_create_similar because the current surface is in an error state.

We shouldn't crash when newDT is null, at least.

(We're using cairo because the tab preview code creates a 2D canvas context to draw into a gfxWindowsSurface (from which it later extracts an HDC).)

The reason the current surface is in error is because painting an image called DrawTargetCairo::DrawPattern passing in a SurfacePattern with a SourceSurfaceD2D surface, and GetCairoSurfaceForSourceSurface has returned null.

GetCairoSurfaceForSourceSurface returned null because GetDataSurface returned null. Because it just isn't implemented :-(.

Image drawing's SurfacePattern uses a SourceSurfaceD2D because we start with a gfxImageSurface, but we pass it to gfxPlatform::GetSourceSurfaceForSurface, which turns it into a SourceSurfaceD2D. That's unfortunate since when we draw it into the gfxWindowsSurface, we'll need the gfxImageSurface contents.
This patch discards a source surface when it was created with a different backend type. In its current form this means if a surface is rapidly switching between usage for two backend types that could potentially be very inefficient. However I believe with the interop layer that should never really be happening(except in some extremely rare case of a complex design with a very large canvas that won't fit in D2D, and skia-azure-content drawing for example, when an image is rapidly redrawn on both), if it does we probably need to fix that anyway.
Attachment #664526 - Flags: review?(jmuizelaar)
Comment on attachment 664526 [details] [diff] [review]
Discard SourceSurface when of the wrong backend type

Review of attachment 664526 [details] [diff] [review]:
-----------------------------------------------------------------

This needs a test case that exercises this code path.

::: gfx/thebes/gfxPlatform.cpp
@@ +493,5 @@
>  {
> +  RefPtr<SourceSurface> mSrcSurface;
> +  BackendType mBackendType;
> +};
> +

This should have a comment about the problem it's trying to avoid.
Attachment #664526 - Flags: review?(jmuizelaar) → review+
Duplicate of this bug: 794119
Bas, can you fix SourceSurfaceD2D::GetDataSurface as well?
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #11)
> Bas, can you fix SourceSurfaceD2D::GetDataSurface as well?

This isn't trivial, I do plan to do so but it means changing the way we upload -or- making a really slow version just for fallback purposes, I'm not 100% sure I know what the right approach is.
It's currently #2 top crasher in 18.0a1. Can we land the reviewed patch as it is and fix SourceSurfaceD2D::GetDataSurface in another bug?
https://hg.mozilla.org/mozilla-central/rev/6c89d2b6ea4b
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
Backed out for (somehow) turning Win7 opt mochitest-other's browser_updatessl.js and friends permaorange. Retriggers on the way to be sure; but already looks prettty clear cut (even though from the diff, this wouldn't have been my first choice).

https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=Rev3%20WINNT%206.1%20mozilla-inbound%20opt%20test%20mochitest-other&rev=6c89d2b6ea4b

https://tbpl.mozilla.org/?tree=Mozilla-Inbound&jobname=Rev3%20WINNT%206.1%20mozilla-inbound%20opt%20test%20mochitest-other (and page down lots until you start to see green)

Backout:
https://hg.mozilla.org/integration/mozilla-inbound/rev/e330290d261f
https://hg.mozilla.org/mozilla-central/rev/08bf91d3c0c3
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla18 → ---
It does seem fairly impossible :) But who knows.
https://hg.mozilla.org/mozilla-central/rev/e1fb91e37cb8
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
Was trying to verify this on Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:19.0) Gecko/19.0 Firefox/19.0 ID:20121010030605 and got these crashes:

bp-7e941b8b-a29f-45fc-9ae6-4ac1f2121010
bp-7a83da0e-ba1d-45bf-9799-3b8182121010
bp-a4ab73a6-2f2e-4ab6-ac8d-58a332121010

Should this be reopened?
For me, I can't reproduce with the STR in comment 3.

(In reply to alex_mayorga from comment #20)
> Was trying to verify this on Mozilla/5.0 (Windows NT 6.1; Win64; x64;
> rv:19.0) Gecko/19.0 Firefox/19.0 ID:20121010030605 and got these crashes:
What are your exact STR?
I believe the key is to try to preview a tab that's yet to be loaded right after a session restore.

Another crash ID if it helps:

bp-6a549719-1e56-4127-a194-a38362121010
Can someone who sees this say what their memory usage at the time of crashing roughly looks like?
(In reply to alex_mayorga from comment #22)
> I believe the key is to try to preview a tab that's yet to be loaded right
> after a session restore.
> 
> Another crash ID if it helps:
> 
> bp-6a549719-1e56-4127-a194-a38362121010

I strongly suspect what you're seeing is bug 698391.
(In reply to Bas Schouten (:bas) from comment #23)
> Can someone who sees this say what their memory usage at the time of
> crashing roughly looks like?

It was 71% total usage when it crashed again like bp-4246ac6f-ff36-410b-b3dc-c475d2121010 but that's apparently bug 798274.

In case it is of interest, these are some screen captures of visual oddities that happen on or around these crashes:

http://imm.io/HqKp The URL bar disappears from the preview an the window's border overflows to the second monitor, perhaps bug 724940

http://imm.io/HqKM The whole preview is blank and the window's border overflows too, perhaps bug 724940

http://imm.io/HnnJ Nightly button ant tabs become transparent, perhaps bug 798224
This looks fixed to me on FF 18b1 and Nightly 20.0a1 (2012-11-27) based on the STR in comment 3, but I see some crash reports related to these versions. Any thoughts?
https://crash-stats.mozilla.com/report/list?signature=gfxContext%3A%3APushClipsToDT%28mozilla%3A%3Agfx%3A%3ADrawTarget*%29
(In reply to Paul Silaghi [QA] from comment #26)
> I see some crash reports related to these versions. Any thoughts?
They are bug 758531 and bug 803949 both fixed in 19.0, and bug 805406 not fixed.
Verified fixed on Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/18.0 Firefox/18.01b based on comment 27.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.