Closed Bug 622165 Opened 14 years ago Closed 13 years ago

Firefox 4.0b9pre Crash Report [@ _moz_cairo_surface_set_device_offset ]

Product/Component: Core :: Graphics
Type: defect
Priority: Not set
Severity: normal
Platform: x86, macOS

Status: RESOLVED FIXED
Tracking Status: blocking2.0 final+

Reporter: chofmann
Assigned: mattwoodrow

Keywords: crash, reproducible, testcase
Whiteboard: [sg:dos][softblocker]

Attachments: 4 files, 1 obsolete file

spin off from testing of security bug 581539

load mz's fuzzer planned for release next week:

http://lcamtuf.coredump.cx/cross_fuzz/cross_fuzz_randomized_20100729_seed.html


I consistently hit a crash [@ _moz_cairo_surface_set_device_offset ] 

bp-77a7fdaf-3d67-4cc7-a719-eaa4d2101230 	12/30/10 12:35 PM
bp-e703d730-aea6-4a55-a9f7-4abae2101230 	12/30/10 12:27 PM


Frame 	Module 	Signature 	Source
0 	XUL 	_moz_cairo_surface_set_device_offset 	gfx/cairo/cairo/src/cairo-surface.c:1126
1 	XUL 	mozilla::gl::BasicTextureImage::BeginUpdate 	gfx/thebes/GLContext.cpp:585
2 	XUL 	mozilla::layers::BasicBufferOGL::BeginPaint 	gfx/layers/opengl/ThebesLayerOGL.cpp:462
3 	XUL 	mozilla::layers::ThebesLayerOGL::RenderLayer 	gfx/layers/opengl/ThebesLayerOGL.cpp:546
4 	XUL 	mozilla::layers::ContainerLayerOGL::RenderLayer 	gfx/layers/opengl/ContainerLayerOGL.cpp:250
5 	XUL 	mozilla::layers::LayerManagerOGL::Render 	gfx/layers/opengl/LayerManagerOGL.cpp:600
6 	XUL 	mozilla::layers::LayerManagerOGL::EndTransaction 	gfx/layers/opengl/LayerManagerOGL.cpp:418
7 	XUL 	nsDisplayList::PaintForFrame 	layout/base/nsDisplayList.cpp:477
8 	XUL 	nsLayoutUtils::PaintFrame 	layout/base/nsLayoutUtils.cpp:1433
9 	XUL 	PresShell::Paint 	layout/base/nsPresShell.cpp:6108
10 	XUL 	nsViewManager::RenderViews 	view/src/nsViewManager.cpp:447
11 	XUL 	nsViewManager::Refresh 	view/src/nsViewManager.cpp:413
12 	XUL 	nsViewManager::DispatchEvent 	view/src/nsViewManager.cpp:912
13 	XUL 	HandleEvent 	view/src/nsView.cpp:161
14 	XUL 	nsChildView::DispatchEvent 	widget/src/cocoa/nsChildView.mm:1786
15 	XUL 	nsChildView::DispatchWindowEvent 	widget/src/cocoa/nsChildView.mm:1796
16 	XUL 	-[ChildView drawRect:inContext:] 	widget/src/cocoa/nsChildView.mm:2728
17 	XUL 	-[ChildView drawRect:] 	widget/src/cocoa/nsChildView.mm:2632
18 	AppKit 	AppKit@0x101080

security closed until this gets checked out.
Group: core-security
marcia, can you check this out with any release builds that you have set up in the lab?
looks like these show up at about the rate of 1-4 per day on b7 and b8.

once every few days we get the same signature but different stack on 3.6.13

those reports look like

http://crash-stats.mozilla.com/report/index/f7d054d4-2b6f-4078-bb87-f14942101229
also see this on another laptop with 10.5

http://crash-stats.mozilla.com/report/index/efafa8b9-1a1b-462d-b180-670da2101230
on winXP with 3.6.13 I ran a bit longer but eventually crashed in [@ WrappedNativeMarker ] after JC

http://crash-stats.mozilla.com/report/index/60ec9f76-7e75-4de4-9259-438742101230
and with winXP on 4.0b8 a different stack with crash [@ DefinePropertyIfFound ]

http://crash-stats.mozilla.com/report/index/37a1e391-7dad-4179-b391-39e672101230
winXP from mozilla-central seems to be the only release that holds up pretty well without crashing for me so far.
I had some crashes on XP after a while, too (well, and Flash is crashing a lot, but that's a separate story).
I notice these errors on the console just before crashing on a clean profile

Fri Dec 31 22:19:41 chris-hofmanns-macbook-pro.local firefox-bin[8318] <Error>: kCGErrorFailure: CGSShapeWindow
Fri Dec 31 22:19:41 chris-hofmanns-macbook-pro.local firefox-bin[8318] <Error>: kCGErrorFailure: Set a breakpoint @ CGErrorBreakpoint() to catch errors as they are logged.
2010-12-31 22:19:41.472 firefox-bin[8318:903] _NXPlaceWindow: error setting window shape (1000)
Fri Dec 31 22:19:41 chris-hofmanns-macbook-pro.local firefox-bin[8318] <Error>: kCGErrorFailure: CGSShapeWindow
2010-12-31 22:19:41.472 firefox-bin[8318:903] _NSShapeRoundedWindowWithWeighting: error setting window shape (1000)
2010-12-31 22:19:58.312 firefox-bin[8334:903] *** __NSAutoreleaseNoPool(): Object 0x11fb00d20 of class __NSFastEnumerationEnumerator autoreleased with no pool in place - just leaking
2010-12-31 22:19:58.314 firefox-bin[8334:903] *** __NSAutoreleaseNoPool(): Object 0x10420b630 of class TopLevelWindowData autoreleased with no pool in place - just leaking
2010-12-31 22:19:58.315 firefox-bin[8334:903] *** __NSAutoreleaseNoPool(): Object 0x11fb00ee0 of class NSCFString autoreleased with no pool in place - just leaking
2010-12-31 22:19:58.316 firefox-bin[8334:903] *** __NSAutoreleaseNoPool(): Object 0x103048750 of class ToolbarWindow autoreleased with no pool in place - just leaking
2010-12-31 22:19:58.316 firefox-bin[8334:903] *** __NSAutoreleaseNoPool(): Object 0x104209460 of class WindowDelegate autoreleased with no pool in place - just leaking
mac 10.6 crash :

http://crash-stats.mozilla.com/report/index/8ae84975-53d1-43f9-8594-9e9572110102

Frame 	Module 	Signature 	Source
0 	XUL 	_moz_cairo_surface_set_device_offset 	gfx/cairo/cairo/src/cairo-surface.c:1126
1 	XUL 	mozilla::gl::BasicTextureImage::BeginUpdate 	gfx/thebes/GLContext.cpp:585
2 	XUL 	mozilla::layers::BasicBufferOGL::BeginPaint 	gfx/layers/opengl/ThebesLayerOGL.cpp:462
3 	XUL 	mozilla::layers::ThebesLayerOGL::RenderLayer 	gfx/layers/opengl/ThebesLayerOGL.cpp:546
4 	XUL 	mozilla::layers::ContainerLayerOGL::RenderLayer 	gfx/layers/opengl/ContainerLayerOGL.cpp:250
5 	XUL 	mozilla::layers::LayerManagerOGL::Render 	gfx/layers/opengl/LayerManagerOGL.cpp:600
6 	XUL 	mozilla::layers::LayerManagerOGL::EndTransaction 	gfx/layers/opengl/LayerManagerOGL.cpp:418
7 	XUL 	nsDisplayList::PaintForFrame 	layout/base/nsDisplayList.cpp:477
8 	XUL 	nsLayoutUtils::PaintFrame 	layout/base/nsLayoutUtils.cpp:1433
9 	XUL 	PresShell::Paint 	layout/base/nsPresShell.cpp:6108
10 	XUL 	nsViewManager::RenderViews 	view/src/nsViewManager.cpp:447
11 	XUL 	nsViewManager::Refresh 	view/src/nsViewManager.cpp:413
12 	XUL 	nsViewManager::DispatchEvent 	view/src/nsViewManager.cpp:912
13 	XUL 	HandleEvent 	view/src/nsView.cpp:161
14 	XUL 	nsChildView::DispatchEvent 	widget/src/cocoa/nsChildView.mm:1786
15 	XUL 	nsChildView::DispatchWindowEvent 	widget/src/cocoa/nsChildView.mm:1796
16 	XUL 	-[ChildView drawRect:inContext:] 	widget/src/cocoa/nsChildView.mm:2728
17 	XUL 	-[ChildView drawRect:] 	widget/src/cocoa/nsChildView.mm:2632
I suspect not everyone who runs the fuzzer off coredump.cx will submit reports when they crash, but it looks like the number of people who reported crashes while running the fuzzer on the first day of release was actually pretty low.

JS_CallTracer 3.6.13 Linux 2.6.35 x86_64 
wyciwyg://83/http://lcamtuf.coredump.cx/cross_fuzz/cross_fuzz_final_20100728.html \N
http://crash-stats.mozilla.com/report/index/e6c22613-5093-4c5b-bdc6-f6b892110101

WrappedNativeMarker 3.6.13 Windows NT 6.1.7600 
http://lcamtuf.coredump.cx/cross_fuzz/targets/target.html \N
http://crash-stats.mozilla.com/report/index/73e32642-6247-4f40-a923-a75a82110101

XUL@0x11985b6 4.0b8pre Mac OS X 10.6.5 10H574 
http://lcamtuf.coredump.cx/cross_fuzz/targets/target.html \N
http://crash-stats.mozilla.com/report/index/45b7aeca-4dd4-4796-9d30-b294b2110101

WrappedNativeJSGCThingTracer 4.0b8 Windows NT 6.1.7600 
http://lcamtuf.coredump.cx/cross_fuzz/targets/%5Bobject%20History%5D RUNNING fuzzer
http://crash-stats.mozilla.com/report/index/9b2bb13b-d788-4d91-ba3e-4ddc52110101

XPCNativeSet::Mark() 4.0b8 Windows NT 5.1.2600 Service Pack 3 
wyciwyg://42/http://lcamtuf.coredump.cx/cross_fuzz/cross_fuzz_final_20100728.html \N
http://crash-stats.mozilla.com/report/index/ed08f663-0d4b-4a59-aaf5-fd6ec2110101
If I'm reading the format right, the two last ones look bad (i.e. non-NULL bad memory access).
Nothing in #11 looks thebes/gfx related. This is all GC and wrapper (well native wrapper == reflectors) business. But yeah, at least 2 look bad. We have to immediately get on top of this with the fuzzer out there. I will clone the bug for the wrapper stuff. Can I get some qa assistance with reproducing the crashes?
Blocks: 622456
No longer blocks: 622456
Depends on: 622456
The Cairo surface here is null, but the gfxASurface isn't.

Matt, can you look into this? We probably just need to insert one more check.
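
To make the "one more check" idea concrete, here is a minimal sketch of the situation: a wrapper object that exists while the native surface it wraps is null. All names here (FakeCairoSurface, FakeSurfaceWrapper, SetOffsetChecked) are hypothetical stand-ins for illustration, not the real cairo or thebes types, and this is not the patch itself.

```cpp
#include <cassert>

// Hypothetical stand-in for the native surface; not the real cairo type.
struct FakeCairoSurface {
    double offsetX = 0.0;
    double offsetY = 0.0;
};

// Hypothetical wrapper: the wrapper object can be valid while the native
// surface it holds was never created.
class FakeSurfaceWrapper {
public:
    explicit FakeSurfaceWrapper(FakeCairoSurface* native) : mNative(native) {}
    FakeCairoSurface* Native() const { return mNative; }

private:
    FakeCairoSurface* mNative;
};

// Checks the native pointer as well as the wrapper before touching it.
// Returns false instead of dereferencing a null native surface.
bool SetOffsetChecked(FakeSurfaceWrapper* wrapper, double x, double y) {
    assert(wrapper != nullptr);  // in this scenario the wrapper itself exists
    FakeCairoSurface* native = wrapper->Native();
    if (native == nullptr) {
        return false;
    }
    native->offsetX = x;
    native->offsetY = y;
    return true;
}
```

With a wrapper built around a null native pointer, SetOffsetChecked reports failure and the caller can bail out of the update rather than crash.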

This blocks because it's a very common crash from the fuzzer, and we need to actually be able to find dangerous security bugs from it.
Assignee: nobody → matt.woodrow+bugzilla
blocking2.0: --- → final+
Whiteboard: [sg:dos]
Attached patch Fix (obsolete)
I can't reproduce this locally; I'm hitting a crash inside PL_Base64DecodeBuffer instead. Will look for a bug for this.

This patch should fix the crash, though it yet again highlights our problem of using a constructor to initialize surfaces, which can fail silently.

I'd be interested in seeing exactly what conditions caused this to fail, just ignoring it may not be the best solution.
Attachment #501173 - Flags: review?(joe)
Bug 345094 looks to be my issue.
No longer depends on: 622456
Comment on attachment 501173
Fix

I don't think this will work. I'm pretty sure this is being caused by gfxQuartzSurface constructors' early exit; we don't really put the surface in an error state in that case. It's perfectly valid to have an erroneous surface - drawing to it just doesn't do anything.

I think we need to fix the gfxQuartzSurface constructors to put their surfaces in an error state in that case. Then we can return NULL here if the surface is in error, or, if we're feeling bold, just continue on. (The worry I have with just carrying on is that the error status will propagate to our surfaces elsewhere.)
Attachment #501173 - Flags: review?(joe) → review-
CairoStatus() checks mSurfaceValid which is initialized to false in the gfxASurface constructor. If we early exit from the gfxQuartzSurface ctor, this should stay false, and return -1 from CairoStatus.
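
A rough model of that argument, for anyone following along: the flag starts out false in the base constructor, so a derived constructor that returns early never flips it and the status query reports -1. The class names SurfaceBase and QuartzLikeSurface, the Init() helper, the 0 success value, and the specific early-exit condition are all assumptions made for this sketch; only mSurfaceValid, CairoStatus and the -1 result come from the comment above.

```cpp
#include <cassert>

// Hypothetical model of the base class; not the real gfxASurface code.
class SurfaceBase {
public:
    SurfaceBase() : mSurfaceValid(false) {}
    virtual ~SurfaceBase() = default;

    // 0 stands in for "success" here; -1 means Init() was never reached.
    int CairoStatus() const { return mSurfaceValid ? 0 : -1; }

protected:
    // Marks the surface as usable; expected to run at most once.
    void Init() {
        assert(!mSurfaceValid);
        mSurfaceValid = true;
    }

private:
    bool mSurfaceValid;
};

// Hypothetical derived surface whose constructor can exit early.
class QuartzLikeSurface : public SurfaceBase {
public:
    QuartzLikeSurface(int width, int height) {
        // Assumed failure condition for illustration: unusable dimensions.
        // Init() is skipped, so the flag keeps its initial value of false.
        if (width <= 0 || height <= 0) {
            return;
        }
        Init();
    }
};
```

Under these assumptions a caller can test CairoStatus() right after construction and treat a non-zero result as "do not use this surface".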
No longer blocks: crossfuzz
Whiteboard: [sg:dos] → [sg:dos][softblocker]
Comment on attachment 501173
Fix

In that case, we should get this in and see if it fixes the problem.
Attachment #501173 - Flags: review- → review+
Attached file testcase
bp-51cbbd68-f6b8-45cd-a7c5-9d8092110111

reproduced on mac os x 10.5 intel. haven't tried elsewhere yet.
Thanks for the test case Bob!

The above patch fixes the specific crash that you are experiencing, but Firefox then locked up and eventually killed my entire session on Mac (all applications closed :( ).

The problem is mLayer->GetVisibleRegion().GetBounds() inside BasicBufferOGL::BeginPaint is returning:

(gdb) p visibleBounds
$4 = {
  x = 0, 
  y = 0, 
  width = 1000000, 
  height = 65, 
}

This is beyond the max texture size and is causing our texture allocations to fail.
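
For illustration, the size test that would catch these bounds before any allocation could look roughly like the sketch below. The Rect struct and FitsInTexture are hypothetical names, and the limit is passed in as a parameter; in real GL code it would normally be queried with glGetIntegerv and GL_MAX_TEXTURE_SIZE, which this sketch does not do.

```cpp
#include <cassert>

// Hypothetical rectangle type mirroring the fields printed by gdb above.
struct Rect {
    int x;
    int y;
    int width;
    int height;
};

// Returns true when a texture covering `bounds` stays within the given
// per-dimension limit, false when either dimension is empty or too large.
bool FitsInTexture(const Rect& bounds, int maxTextureSize) {
    assert(maxTextureSize > 0);
    return bounds.width > 0 && bounds.height > 0 &&
           bounds.width <= maxTextureSize &&
           bounds.height <= maxTextureSize;
}
```

With the bounds from the gdb dump (width 1000000, height 65) and an assumed limit of 8192, FitsInTexture returns false, so a caller could skip the paint instead of attempting the allocation.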

CC'ing roc in case he has any ideas how this could happen.
The testcase can be further simplified to:

<script>
window.resizeTo(50, 4500000000);
window.resizeTo(1e6, 50);
</script>

The first resizeTo call is hitting:

###!!! ASSERTION: non-root frame's desired size changed during an incremental reflow: '(target == rootFrame && size.height == NS_UNCONSTRAINEDSIZE) || (desiredSize.width == size.width && desiredSize.height == size.height)', file /Users/mattwoodrow/mozilla-central2/layout/base/nsPresShell.cpp, line 7818


The second then ends up with a visible region 1e6 wide and away we go.

Neither line crashes on its own: the first results in no rendering, and the second clamps the width to my screen size and renders fine.
We simply try to create layers as big as the window.
I don't understand why the 1e6 value is clamped to window width normally, but accepted as is when used after the first resizeTo attempt.

It seems like we should fix the assertion and make the resizeTo behaviour consistent.

If we still need to patch the accelerated backends to skip rendering on oversized layers at that point, then I can work on patches for that.
Attached patch Fix v2
This fixes the crash entirely and stops all GL errors.

Obviously rendering is entirely broken, and trying to use Exposé causes the WindowServer process to use 100% CPU for 30 seconds, making my computer unusable.

Doesn't sound like an acceptable solution, but it could be a usable stopgap.
Attachment #501173 - Attachment is obsolete: true
Attachment #504315 - Flags: review+
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Crash Signature: [@ _moz_cairo_surface_set_device_offset ]
Group: core-security