Last Comment Bug 305827 - Mac OS X gcc 4 builds with --disable-optimize/-O0 crash, often at startup [@ SetGWorld]
: Mac OS X gcc 4 builds with --disable-optimize/-O0 crash, often at startup [@ ...
Status: RESOLVED FIXED
: fixed1.8.0.1, fixed1.8.1
Product: Core Graveyard
Classification: Graveyard
Component: GFX: Mac (show other bugs)
: Trunk
: PowerPC Mac OS X
: -- critical (vote)
: mozilla1.8.1
Assigned To: Mark Mentovai
:
:
Mentors:
http://paste.lisp.org/display/11087
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2005-08-24 15:04 PDT by Mark Mentovai
Modified: 2009-01-22 10:17 PST (History)
5 users (show)
asa: blocking1.8rc1-
benjamin: blocking1.8.1+
dveditz: blocking1.8.0.1+
See Also:
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
Fix (6.35 KB, patch)
2005-10-24 13:44 PDT, Mark Mentovai
jaas: review+
sfraser_bugs: superreview+
dveditz: approval1.8.0.1+
benjamin: approval1.8.1+
Details | Diff | Splinter Review

Description Mark Mentovai 2005-08-24 15:04:18 PDT
From bug 294244 comment 8:

OS X builds produced with gcc 4 and optimization disabled are unstable.  They
crash in QuickDraw, often at startup.  They're more likely to crash at startup
if the profile manager is configured to display at start.

Building at -O1 and up is fine.

I haven't touched this in a while, so I'm not sure if it's our bug or Apple's,
but I recall that I was leaning toward Apple.  Needs more investigation.

http://paste.lisp.org/display/11087 has a backtrace from someone else's debug
build produced with --disable-optimize.
Comment 1 Josh Aas 2005-10-13 17:15:29 PDT
Unoptimized GCC4 builds crash almost immediately on Intel Macs as well. Some lines from different 
traces:
--------------------------------------------
Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_PROTECTION_FAILURE (0x0002) at 0x00000215
Thread 0 Crashed:
0   com.apple.QD          	0x91668c0b GetPortBounds + 20
1   libgfx_mac.dylib      	0x25b2e35c nsRenderingContextMac::SelectDrawingSurface
(nsDrawingSurfaceMac*, unsigned) + 194 (nsRenderingContextMac.cpp:221)
2   libgfx_mac.dylib      	0x25b2e755 nsRenderingContextMac::Init(nsIDeviceContext*, nsIWidget*) + 
277 (nsRenderingContextMac.cpp:160)
------------------------------------------------
Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_PROTECTION_FAILURE (0x0002) at 0x00000008
Thread 0 Crashed:
0   libgfx_mac.dylib      	0x0644ce1b nsRenderingContextMac::~nsRenderingContextMac [in-charge 
deleting]() + 47 (nsRenderingContextMac.cpp:99)
1   libgfx_mac.dylib      	0x0644d1b0 nsRenderingContextMac::Release() + 276 
(nsRenderingContextMac.cpp:127)
2   libgkgfx.dylib        	0x01d1ca14 nsCOMPtr<nsIRenderingContext>::~nsCOMPtr [in-charge]() + 
66 (nsCOMPtr.h:583)
Comment 2 Mark Mentovai 2005-10-24 13:44:32 PDT
Created attachment 200650 [details] [diff] [review]
Fix

Got it.  This is our bug.  So sorry to have doubted you, Apple.

The problem is that we've got two different StPortSetter classes, with different structures.  Both are declared in .h files, so this isn't a problem at higher optimization levels presumably due to more aggressive inlining.  When this breaks, it seems that the ctor and dtor from different classes are being used.  This results in a bad port being set when the StPortSetter is destroyed.

The StPortSetters are defined in gfx/src/mac/nsGfxUtils.h and xpfe/bootstrap/appleevents/nsAEUtils.h .  We use the nsGfxUtils one all over gfx and widget.  The nsAEUtils one is only used in xpfe/bootstrap/appleevents/nsAEWindowClass.cpp.  There is nothing preventing appleevents from using the StPortSetter in nsGfxUtils, except that gfx' headers aren't available to appleevents.  There's no reason for them not to be, since gfx is already a prerequisite for appleevents.

This patch removes the StPortSetter from nsAEUtils and modifies nsAEWindowClass to pick up the nsGfxUtils version.  The Makefile is adjusted to allow gfx includes in appleevents.  Bringing in nsGfxUtils also brings in StHandleLocker, which is also defined in nsAEUtils.  Since these are also functionally identical, the nsAEUtils version is removed.  The only user of StHandleLocker in appleevents is nsAEWordClass, it's modified accordingly, but note that it is never compiled.

I've also cleaned up several unused variables in nsAEWindowClass.

This patch allows --disable-optimize and -O0 builds to launch and function on Mac PPC.  I can't test Mac Intel.  Those may still have trouble due to bug 313398.
Comment 3 Mark Mentovai 2005-10-24 14:28:33 PDT
Risk is low.  Our current behavior is wrong.  We don't break under other compilers or optimization levels because we rely on toolchain behaviors that aren't guaranteed, and have been lucky up until now.  The benefit of this change is that it's actually possible to debug non-optimized builds, and as everyone knows, debug builds are most easily debugged when they're not optimized.  I'm now working with this in my trunk tree and am actually able to use a proper debug build.  This is critical for continued development of the Intel Mac port, because gcc 4 is Apple's only supported/shipped compiler on that architecture.
Comment 4 Mark Mentovai 2005-10-24 14:31:45 PDT
Whoops, bugzilla didn't warn me about a mid-air.  I'd reminus but I'm not "sufficiently empowered."
Comment 5 Josh Aas 2005-10-24 17:01:58 PDT
Comment on attachment 200650 [details] [diff] [review]
Fix

This patch works.

I'm not sure what our dependency story is between xpfe and gfx - seems like this code was copied, and I wonder what the reason for that was. So long as you understand why the code was copied in the first place and didn't just use the same class like your patch makes it, r=josh.
Comment 6 Simon Fraser 2005-10-24 20:24:40 PDT
Comment on attachment 200650 [details] [diff] [review]
Fix

>Index: mozilla/xpfe/bootstrap/appleevents/nsAEWindowClass.cpp
>===================================================================

> 		if (err == noErr)
> 		{		
>-			short windowWidth =  r.right - r.left;
>-			short windowDepth =  r.bottom - r.top;
>-				
>  			MoveWindow(window, r.left, r.top, false);
>- 			// ¥¥¥ÊDoResize(window, windowWidth, windowDepth);
>+			// short windowWidth =  r.right - r.left;
>+			// short windowDepth =  r.bottom - r.top;
>+ 			// DoResize(window, windowWidth, windowDepth);
> 		}

What is this change doing?
Comment 7 Mark Mentovai 2005-10-24 21:03:47 PDT
(In reply to comment #6)
> (From update of attachment 200650 [details] [diff] [review] [edit])
> >Index: mozilla/xpfe/bootstrap/appleevents/nsAEWindowClass.cpp
> >===================================================================
> 
> > 		if (err == noErr)
> > 		{		
> >-			short windowWidth =  r.right - r.left;
> >-			short windowDepth =  r.bottom - r.top;
> >-				
> >  			MoveWindow(window, r.left, r.top, false);
> >- 			// ¥¥¥ÊDoResize(window, windowWidth, windowDepth);
> >+			// short windowWidth =  r.right - r.left;
> >+			// short windowDepth =  r.bottom - r.top;
> >+ 			// DoResize(window, windowWidth, windowDepth);
> > 		}
> 
> What is this change doing?

windowWidth and windowDepth were unused.  Normally, I'd just get rid of them, but they are used by the commented-out DoResize call, so I commented out the declarations too to match.  DoResize was preceded by bullets in MacRoman encoding, which come up as yen in UTF-8.  Since we don't really need anything non-ASCII in there, I just got rid of them.

I could also whack the lines.
Comment 8 Simon Fraser 2005-10-25 08:46:55 PDT
> windowWidth and windowDepth were unused.  Normally, I'd just get rid of them,
> but they are used by the commented-out DoResize call, so I commented out the
> declarations too to match.  DoResize was preceded by bullets in MacRoman
> encoding, which come up as yen in UTF-8.  Since we don't really need anything
> non-ASCII in there, I just got rid of them.
> 
> I could also whack the lines.

I'd leave them in, since in theory setting the bounds property should resize the window. Just take out the bullets.
Comment 9 Mark Mentovai 2005-10-25 11:03:14 PDT
The StPortSetter constructor was being inlined in nsRenderingContextMac, but the destructor was not and we were picking up the one from nsAEUtils.

Fixed on trunk.

Although this won't be taken on the 1.8 branch now, it will need attention there in the future.
Comment 10 Asa Dotzler [:asa] 2005-10-26 11:23:05 PDT
not going to block our release. developers who need this can patch locally for now.
Comment 11 Mark Mentovai 2005-11-11 07:47:58 PST
Nominating for 1.8.1 so we'll be able to properly debug on Intel when it becomes a real shipping product.  This has been on the trunk for a few weeks.
Comment 12 Mark Mentovai 2005-11-22 09:18:16 PST
Moving my stability bugs to blocking1.8.0.1?
Comment 13 Mark Mentovai 2005-12-28 07:48:29 PST
Comment on attachment 200650 [details] [diff] [review]
Fix

This is important to be able to debug on x86 Macs.  No risk.
Comment 14 Josh Aas 2006-01-04 12:41:03 PST
I disagree with 1.8.0.1 landing not getting approved. This is a very low risk patch and is a serious problem for anyone trying to work with this code on a mac.
Comment 15 Josh Aas 2006-01-04 12:46:35 PST
Furthermore, this has been baking on the trunk for a over a month with no regressions reported.
Comment 16 Mark Mentovai 2006-01-04 17:52:23 PST
...over TWO months!
Comment 17 Mark Mentovai 2006-01-05 08:42:02 PST
Fixed on MOZILLA_1_8_BRANCH.
Comment 18 Daniel Veditz [:dveditz] 2006-01-06 13:18:23 PST
Comment on attachment 200650 [details] [diff] [review]
Fix

a=dveditz for drivers
Comment 19 Mark Mentovai 2006-01-06 14:43:13 PST
Fixed on MOZILLA_1_8_0_BRANCH.

Note You need to log in before you can comment on or make changes to this bug.