Closed Bug 627656 Opened 9 years ago Closed 9 years ago

Crash in nsCocoaWindow::DrawOver [@ GeForceGLDriver@0x12910 ] in glDeleteTextures

Categories

(Core :: Graphics, defect, critical)

x86
macOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: scoobidiver, Assigned: joe)

References

Details

(Keywords: crash, regression, topcrash, Whiteboard: [softblocker])

Crash Data

Attachments

(4 files, 4 obsolete files)

It is #32 top crasher on Mac OS X in 4.0b9 for the last week.

Signature	GeForceGLDriver@0x12910
UUID	38d10328-8faf-444e-a94b-482ce2110119
Time 	2011-01-19 14:03:14.964447
Uptime	363073
Last Crash	5012390 seconds (8.3 weeks) before submission
Install Age	363073 seconds (4.2 days) since version was first installed.
Product	Firefox
Version	4.0b9
Build ID	20110110191913
Branch	2.0
OS	Mac OS X
OS Version	10.6.6 10J567
CPU	amd64
CPU Info	family 6 model 23 stepping 10
Crash Reason	EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
Crash Address	0x0
App Notes 	Renderers: 0x22600,0x22600,0x20400

Frame 	Module 	Signature [Expand] 	Source
0 	GeForceGLDriver 	GeForceGLDriver@0x12910 	
1 	GeForceGLDriver 	GeForceGLDriver@0xd5786 	
2 	GLEngine 	GLEngine@0x11e2a7 	
3 	GLEngine 	GLEngine@0xf0ab1 	
4 	GLEngine 	GLEngine@0x11e275 	
5 	GLEngine 	GLEngine@0x1878a 	
6 	GLEngine 	GLEngine@0x20013 	
7 	GLEngine 	GLEngine@0x11e275 	
8 	XUL 	nsCocoaWindow::DrawOver 	GLContext.h:1979
9 	XUL 	nsChildView::DrawOver 	widget/src/cocoa/nsChildView.mm:2085
10 	XUL 	mozilla::layers::LayerManagerOGL::Render 	gfx/layers/opengl/LayerManagerOGL.cpp:604
11 	XUL 	mozilla::layers::LayerManagerOGL::EndTransaction 	gfx/layers/opengl/LayerManagerOGL.cpp:420
12 	XUL 	nsDisplayList::PaintForFrame 	layout/base/nsDisplayList.cpp:495
13 	XUL 	nsLayoutUtils::PaintFrame 	layout/base/nsLayoutUtils.cpp:1460
14 	XUL 	PresShell::Paint 	layout/base/nsPresShell.cpp:6112
15 	XUL 	nsViewManager::RenderViews 	view/src/nsViewManager.cpp:447
16 	XUL 	nsViewManager::Refresh 	view/src/nsViewManager.cpp:413
17 	XUL 	nsViewManager::DispatchEvent 	view/src/nsViewManager.cpp:912
18 	XUL 	HandleEvent 	view/src/nsView.cpp:161
19 	XUL 	nsChildView::DispatchEvent 	widget/src/cocoa/nsChildView.mm:1786
20 	XUL 	nsChildView::DispatchWindowEvent 	widget/src/cocoa/nsChildView.mm:1796
21 	XUL 	-[ChildView drawRect:inContext:] 	widget/src/cocoa/nsChildView.mm:2737
22 	XUL 	-[ChildView drawRect:] 	widget/src/cocoa/nsChildView.mm:2641
23 	AppKit 	AppKit@0x100c48 	

More reports at:
http://crash-stats.mozilla.com/report/list?product=Firefox&version=Firefox%3A4.0b9&platform=mac&query_search=signature&query_type=exact&query=&range_value=4&range_unit=weeks&hang_type=any&process_type=any&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&admin=&signature=GeForceGLDriver%400x12910
Blocks: 595180
blocking2.0: --- → ?
Keywords: regression
All these crashes appear to be NULL dereferences, and to take place at
the following line (a call to fDeleteTextures()):

http://hg.mozilla.org/mozilla-central/diff/8d005c3a6685/widget/src/cocoa/nsCocoaWindow.mm

But 'tex' is always '0', and (according to the following doc on
glDeleteTextures()) "attempts to delete nonexistent texture names or
the texture name of zero are ignored without generating an error".  So
the call to fDeleteTextures() is (at best) a no-op.

http://glprogramming.com/red/chapter09.html#name4

But in fact at least one GL driver seems to crash (at least some of
the time) on calls to to glDeleteTextures() with 'const GLuint
*textureNames' set to '0'.

So, Markus, why not get rid of the call to fDeleteTextures(), or at
least make it conditional on 'tex' being non-zero?
> All these crashes appear to be NULL dereferences, and to take place
> at the following line (a call to fDeleteTextures()):
>
> http://hg.mozilla.org/mozilla-central/diff/8d005c3a6685/widget/src/cocoa/nsCocoaWindow.mm

The link should be as follows.  Sorry.

http://hg.mozilla.org/mozilla-central/annotate/90ee185d052e/widget/src/cocoa/nsCocoaWindow.mm#l1037
(In reply to comment #1)
> But 'tex' is always '0'

tex is passed to UploadSurfaceToTexture by reference, it shouldn't be 0 after that call.

Maybe manager->gl() is NULL.
> tex is passed to UploadSurfaceToTexture by reference

Doesn't look like that to me :-)

> #ifdef MOZ_ENABLE_LIBXUL
>   manager->gl()->UploadSurfaceToTexture(image, nsIntRect(0, 0, 15, 15), tex);
> #else
>   manager->gl()->UploadSurfaceToTextureExternal(image, nsIntRect(0, 0, 15, 15), tex);
> #endif
> ShaderProgramType UploadSurfaceToTexture(gfxASurface *aSurface, 
>                                          const nsIntRegion& aDstRegion,
>                                          GLuint& aTexture,
>                                          bool aOverwrite = false,
>                                          const nsIntPoint& aSrcPoint = nsIntPoint(0, 0),
>                                          bool aPixelBuffer = PR_FALSE);

A pointer sure would make it more obvious than a reference, I agree ;-)
Oh, OK.  False lead, then.  Sorry.

I *do* hate references.  I never use them in my own code.

C++ code doesn't need to hide pointers, or to be embarrasses that it uses them :-)
There are similar crashes in the Intel driver, though I expect this to be *much* less common, as you have to use a special program (like gfxCardStatus) to force us to use the Intel GPU - otherwise OS X forces us to use the NVIDIA GPU. 

bp-767dbadf-ddfd-4dcc-8ec1-5e29e2110123

I really want us to fix this, because this should be safe code, and it sucks being broken by polish code, but I have trouble saying we shouldn't release 4.0 because of this bug.
blocking2.0: ? → .x
It is #6 top crasher on Mac OS X in 4.0b10 over the last week.
blocking2.0: .x → ?
Keywords: topcrash
Assignee: nobody → joe
blocking2.0: ? → final+
Whiteboard: [softblocker]
Summary: Crash in nsCocoaWindow::DrawOver [@ GeForceGLDriver@0x12910 ] → Crash in nsCocoaWindow::DrawOver [@ GeForceGLDriver@0x12910 ] in glDeleteTextures
This needs beta coverage to validate whether the crash goes away, since we don't have steps to reproduce.

I'm thinking about making nsCocoaWindow store and hold on to a single texture rather than creating a new texture on every draw. I don't know if that is likely to avoid the problem, but if we only call glDeleteTextures once, on shutdown, it should at least mitigate the problem.
blocking2.0: final+ → betaN+
I had a better idea: let's just use the built-in resizer ability Neil built for Windows XP.
Attached patch step 1: remove the GL resizer (obsolete) — Splinter Review
If we're going to be using the XUL resizer elements, we don't need to draw them using OpenGL at all, so let's just remove this code.

I'm not removing nsIWidget_MOZILLA_2_0_BRANCH::DrawOver, though, because that's been added to our interface for 2.0. There's no reason for us to keep it past 2.0, so I've added a comment to that effect.
Attachment #512701 - Flags: review?(mstange)
Attached patch step 2: use the built-in resizer (obsolete) — Splinter Review
The resizer is only used for Windows 2000 and XP right now, but it's very easy to make that work on OS X as well.
Attachment #512702 - Flags: review?
Attachment #512702 - Flags: review? → review?(enndeakin)
The resizer, right now, confusingly contains both -moz-appearance and background, and it appears that the background is used until a scrollbar appears, at which time moz-appearance kicks in.

There doesn't seem to be any reason to use moz-appearance at all, because we've already got carefully hand-drawn png files for the resizer. However, unlike most OS X resizers, which stop 1 px from the edge of the window, the resizer in these png files go right to the edge. So I've just moved the contents over and up one pixel.
Attachment #512703 - Flags: review?(gavin.sharp)
(In reply to comment #13)
> There doesn't seem to be any reason to use moz-appearance at all

Resizers in native apps get a grey box drawn around them as soon as a scrollbar is present, which is a detail that our old resizer didn't emulate. So using -moz-appearance (which has the box) in the scrollbar case would actually fix that.

> However, unlike
> most OS X resizers, which stop 1 px from the edge of the window, the resizer in
> these png files go right to the edge.

That's something that has been annoying me about the textarea resizers ever since we started using them, so I'm very happy that you're fixing it now :)
Comment on attachment 512702 [details] [diff] [review]
step 2: use the built-in resizer

This fix doesn't work for me. I only get a resizer in the main browser window, and only when the Addons bar is hidden. Am I missing something?
(In reply to comment #3)
> Maybe manager->gl() is NULL.

What's the ownership / lifecycle situation here? Is it possible for a non-null LayerManagerOGL to have a null GLContext? Or would we already have crashed in UploadSurfaceToTexture if that were the case?
(In reply to comment #10)
> I had a better idea: let's just use the built-in resizer ability Neil built for
> Windows XP.

That one only shows up for the Firefox browser window though, not other windows.
Bah. Neil's right. Further, these patches will need more rework to work on 32-bit OS X builds, because we don't use OpenGL there, so the window manager will draw the resizer for us.

I guess it's back to plan A, outlined in comment 9!
Attachment #512701 - Attachment is obsolete: true
Attachment #512701 - Flags: review?(mstange)
Attachment #512702 - Attachment is obsolete: true
Attachment #512702 - Flags: review?(enndeakin)
Attachment #512703 - Attachment is obsolete: true
Attachment #512703 - Flags: review?(gavin.sharp)
(In reply to comment #14)
> That's something that has been annoying me about the textarea resizers ever
> since we started using them, so I'm very happy that you're fixing it now :)

Feel free to resurrect (parts of) my patch 3 in a separate bug :)
Interestingly, we don't ever call MakeCurrent in DrawOver. I wonder if that's related!
(In reply to comment #20)
> Interestingly, we don't ever call MakeCurrent in DrawOver. I wonder if that's
> related!

The way to tell that is to run a debug build with MOZ_GL_DEBUG=1. This will abort on first GL-call-on-wrong-context, hopefully pinpointing the issue.
There's no reason for nsCocoaWindow to have a DrawOver implementation, and we can more easily reason about the lifetime of layer managers wrt nsChildView, so I'm moving the implementation of DrawOver to nsChildView.
Attachment #513701 - Flags: review?(jmuizelaar)
Attachment #513701 - Flags: approval2.0?
Attachment #513703 - Flags: review?(jmuizelaar)
Attachment #513703 - Flags: approval2.0?
Whiteboard: [softblocker] → [softblocker][needs review jrmuizel][needs approval jrmuizel][beltzner do not clear approval flags]
Comment on attachment 513701 [details] [diff] [review]
part 1: move DrawOver to nsChildView

Provided this is just the move, it looks fine.
Attachment #513701 - Flags: review?(jmuizelaar) → review+
Comment on attachment 513703 [details] [diff] [review]
part 2: store our resizer in a TextureImage, and reuse it every time

Renaming DrawOver might also be a good idea. The current name makes you ask Draw what? Over what?
 
>+
>+    nsRefPtr<gfxQuartzSurface> image = static_cast<gfxQuartzSurface*>(asurf);
>+    CGContextRef ctx = image->GetCGContext();
>+

This CG code should be in a separate function called something like drawResizer()

>+    CGContextSetShouldAntialias(ctx, false);
>+    CGPoint points[6];
>+    points[0] = CGPointMake(13.0f, 4.0f);
>+    points[1] = CGPointMake(3.0f, 14.0f);
>+    points[2] = CGPointMake(13.0f, 8.0f);
>+    points[3] = CGPointMake(7.0f, 14.0f);
>+    points[4] = CGPointMake(13.0f, 12.0f);
>+    points[5] = CGPointMake(11.0f, 14.0f);
>+    CGContextSetRGBStrokeColor(ctx, 0.00f, 0.00f, 0.00f, 0.15f);
>+    CGContextStrokeLineSegments(ctx, points, 6);
>+
>+    points[0] = CGPointMake(13.0f, 5.0f);
>+    points[1] = CGPointMake(4.0f, 14.0f);
>+    points[2] = CGPointMake(13.0f, 9.0f);
>+    points[3] = CGPointMake(8.0f, 14.0f);
>+    points[4] = CGPointMake(13.0f, 13.0f);
>+    points[5] = CGPointMake(12.0f, 14.0f);
>+    CGContextSetRGBStrokeColor(ctx, 0.13f, 0.13f, 0.13f, 0.54f);
>+    CGContextStrokeLineSegments(ctx, points, 6);
>+
>+    points[0] = CGPointMake(13.0f, 6.0f);
Attachment #513703 - Flags: review?(jmuizelaar) → review+
(In reply to comment #25)
> Comment on attachment 513703 [details] [diff] [review]
> part 2: store our resizer in a TextureImage, and reuse it every time
> 
> Renaming DrawOver might also be a good idea. The current name makes you ask
> Draw what? Over what?

That's a bit out of scope for this bug, but I'll file a separate one to do so.
Whiteboard: [softblocker][needs review jrmuizel][needs approval jrmuizel][beltzner do not clear approval flags] → [softblocker][needs approval jrmuizel]
(In reply to comment #26)
> (In reply to comment #25)
> > Comment on attachment 513703 [details] [diff] [review]
> > part 2: store our resizer in a TextureImage, and reuse it every time
> > 
> > Renaming DrawOver might also be a good idea. The current name makes you ask
> > Draw what? Over what?
> 
> That's a bit out of scope for this bug, but I'll file a separate one to do so.

This is bug 635544.
Attachment #513701 - Flags: review?(matt.woodrow+bugzilla)
Attachment #513796 - Flags: review?(matt.woodrow+bugzilla)
Attachment #513796 - Flags: approval2.0? → approval2.0+
Attachment #513701 - Flags: review?(matt.woodrow+bugzilla) → review+
Comment on attachment 513796 [details] [diff] [review]
part 2: store our resizer in a TextureImage, and reuse it every time, updated for review comments


>+
>+    if (asurf->GetType() != gfxASurface::SurfaceTypeQuartz) {
>+      mResizerImage = nsnull;
>+      return;
>+    }
>+

Can we make this an assertion instead? This should be impossible with the current code. If it ever gets changed, silently removing the resizer probably isn't helpful.

Rest looks great, land it!
Attachment #513796 - Flags: review?(matt.woodrow+bugzilla) → review+
Whiteboard: [softblocker][needs approval jrmuizel] → [softblocker]
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Comment on attachment 513701 [details] [diff] [review]
part 1: move DrawOver to nsChildView

Has landed already, so clearing redundant approval2.0?, since it's causing this fixed bug to erroneously appear under approval requests here:
http://office.smedbergs.us/blocker-report/
Attachment #513701 - Flags: approval2.0?
Crash Signature: [@ GeForceGLDriver@0x12910 ]
You need to log in before you can comment on or make changes to this bug.