crash in mozilla::gfx::SourceSurfaceD2DTarget::GetDataSurface()

RESOLVED FIXED

Status

()

defect
--
critical
RESOLVED FIXED
4 years ago
Last year

People

(Reporter: philipp, Assigned: milan)

Tracking

({crash})

42 Branch
x86
Windows NT
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox41 unaffected, firefox42+ fixed, firefox43 fixed, firefox44 fixed)

Details

(crash signature)

Attachments

(1 attachment)

This bug was filed from the Socorro interface and is 
report bp-07ec21c3-c988-4b1b-9f4d-5c8b02150924.
=============================================================
Crashing Thread
Frame 	Module 	Signature 	Source
0 	xul.dll 	mozilla::gfx::SourceSurfaceD2DTarget::GetDataSurface() 	gfx/2d/SourceSurfaceD2DTarget.cpp
1 	xul.dll 	mozilla::gfx::GetCairoSurfaceForSourceSurface(mozilla::gfx::SourceSurface*, bool, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&) 	gfx/2d/DrawTargetCairo.cpp
2 	xul.dll 	mozilla::gfx::GfxPatternToCairoPattern 	gfx/2d/DrawTargetCairo.cpp
3 	xul.dll 	mozilla::gfx::DrawTargetCairo::DrawPattern(mozilla::gfx::Pattern const&, mozilla::gfx::StrokeOptions const&, mozilla::gfx::DrawOptions const&, mozilla::gfx::DrawTargetCairo::DrawPatternType, bool) 	gfx/2d/DrawTargetCairo.cpp

this graphics crash seems to have been introduced in 42.0a1 and subsequent builds - it's on rank #16 in the 42.0b1 crash list at them moment. a broad range of devices from all major gpu vendors are affected.

https://crash-stats.mozilla.com/search/?signature=%3Dmozilla%3A%3Agfx%3A%3ASourceSurfaceD2DTarget%3A%3AGetDataSurface%28%29&date=%3E2015-01-01&_facets=signature&_facets=version&_facets=adapter_vendor_id&_facets=adapter_device_id&_facets=adapter_driver_version&_facets=app_notes&_facets=user_comments&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform#facet-version
Based on build IDs it looks like this *may* have started with the July 30, 2015 build which assumes the following pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=2ee9895e032c&tochange=62469b20ec84
Assignee: nobody → bas
Status: NEW → ASSIGNED
Tracking as it is an important crash in 42.
Milan, do you know who could help with this? Thanks
Flags: needinfo?(milan)
The most supicious bug in the pushlog is bug 1183910.
Blocks: 1183910
We should be able to just patch the null pointer access, if the stack is to be trusted - David, could this be a result of the "things are getting mixed up with D2D and unaccelerated" you dealt with recently?
Assignee: bas → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(milan) → needinfo?(dvander)
Yeah, I bet it is exactly that. We used to not clear the ID3D10Device on TDRs, which bug 1183910 changed. We may have gotten a TDR, removed the factory, and fallen back to software. Then if we still had a 2D draw target lying around it might now assume we're still accelerated and still have an ID3D10 device.

Too bad we don't have more stack here, since I bet some cache is not being invalidated on TDRs and that's the real bug. But a null check on the d3d10 should be there anyway.
Flags: needinfo?(dvander)
Comment on attachment 8671538 [details] [diff] [review]
Stop the nullptr dereference. r=bschouten

Review of attachment 8671538 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/2d/SourceSurfaceD2DTarget.cpp
@@ +64,5 @@
>    desc.BindFlags = 0;
>    desc.MiscFlags = 0;
>  
> +  if (!Factory::GetDirect3D10Device()) {
> +    gfxCriticalError() << "Invalid D3D10 device in D2D target surface";

nit: No D3D10 device assigned to factory.

Would be a more correct message.
Attachment #8671538 - Flags: review?(bas) → review+
checkin-needed without a try run; we're replacing a nullptr dereference, but let me know if you want a run.
Keywords: checkin-needed
There are some more worrisome cases here: in gfx/2d/SourceSurfaceD2DTarget.cpp
Leave open after we land the patch above, just to see what happens to the crashes.
Keywords: leave-open
Comment on attachment 8671538 [details] [diff] [review]
Stop the nullptr dereference. r=bschouten

Approval Request Comment
[Feature/regressing bug #]: Suspected bug 1183910
[User impact if declined]: Crash
[Describe test coverage new/current, TreeHerder]: None, code not used anymore on trunk
[Risks and why]: Low, nullpointer check, may change crash into some artifacting
[String/UUID change made/needed]: None
Attachment #8671538 - Flags: approval-mozilla-beta?
Attachment #8671538 - Flags: approval-mozilla-aurora?
Comment on attachment 8671538 [details] [diff] [review]
Stop the nullptr dereference. r=bschouten

Fix a crash, taking it. Should be in 38 beta 6.
Attachment #8671538 - Flags: approval-mozilla-beta?
Attachment #8671538 - Flags: approval-mozilla-beta+
Attachment #8671538 - Flags: approval-mozilla-aurora?
Attachment #8671538 - Flags: approval-mozilla-aurora+
In the previous comment s/38/42/
Looks like I was traumatized by the 38 cycle :p
Crash Signature: [@ mozilla::gfx::SourceSurfaceD2DTarget::GetDataSurface()] → [@ mozilla::gfx::SourceSurfaceD2DTarget::GetDataSurface()] [@ mozilla::gfx::SourceSurfaceD2DTarget::GetDataSurface]
The crashes with this particular signature look appears to have stopped - was there a spike elsewhere?
(In reply to Milan Sreckovic [:milan] from comment #21)
> The crashes with this particular signature look appears to have stopped -
> was there a spike elsewhere?

Perhaps in OOMs?
Assignee: nobody → milan
We're still seeing reports of this crash but it's at really low volume and all on unsupported Firefox versions (14 reports last week, 100% in Firefox Beta 42). For this reason I'm closing this bug as fixed.
Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Removing leave-open keyword from resolved bugs, per :sylvestre.
Keywords: leave-open
You need to log in before you can comment on or make changes to this bug.