Closed Bug 305827 Opened 19 years ago Closed 19 years ago

Mac OS X gcc 4 builds with --disable-optimize/-O0 crash, often at startup [@ SetGWorld]

Categories

(Core Graveyard :: GFX: Mac, defect)

PowerPC
macOS
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.8.1

People

(Reporter: mark, Assigned: mark)

References

()

Details

(Keywords: fixed1.8.0.1, fixed1.8.1)

Attachments

(1 file)

From bug 294244 comment 8:

OS X builds produced with gcc 4 and optimization disabled are unstable.  They
crash in QuickDraw, often at startup.  They're more likely to crash at startup
if the profile manager is configured to display at start.

Building at -O1 and up is fine.

I haven't touched this in a while, so I'm not sure if it's our bug or Apple's,
but I recall that I was leaning toward Apple.  Needs more investigation.

http://paste.lisp.org/display/11087 has a backtrace from someone else's debug
build produced with --disable-optimize.
Component: Build Config → GFX: Mac
Summary: Mac OS X gcc 4 builds with --disable-optimize/-O0 don't work → Mac OS X gcc 4 builds with --disable-optimize/-O0 crash
Summary: Mac OS X gcc 4 builds with --disable-optimize/-O0 crash → Mac OS X gcc 4 builds with --disable-optimize/-O0 crash [@ SetGWorld]
Summary: Mac OS X gcc 4 builds with --disable-optimize/-O0 crash [@ SetGWorld] → Mac OS X gcc 4 builds with --disable-optimize/-O0 crash, often at startup [@ SetGWorld]
Unoptimized GCC4 builds crash almost immediately on Intel Macs as well. Some lines from different 
traces:
--------------------------------------------
Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_PROTECTION_FAILURE (0x0002) at 0x00000215
Thread 0 Crashed:
0   com.apple.QD          	0x91668c0b GetPortBounds + 20
1   libgfx_mac.dylib      	0x25b2e35c nsRenderingContextMac::SelectDrawingSurface
(nsDrawingSurfaceMac*, unsigned) + 194 (nsRenderingContextMac.cpp:221)
2   libgfx_mac.dylib      	0x25b2e755 nsRenderingContextMac::Init(nsIDeviceContext*, nsIWidget*) + 
277 (nsRenderingContextMac.cpp:160)
------------------------------------------------
Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_PROTECTION_FAILURE (0x0002) at 0x00000008
Thread 0 Crashed:
0   libgfx_mac.dylib      	0x0644ce1b nsRenderingContextMac::~nsRenderingContextMac [in-charge 
deleting]() + 47 (nsRenderingContextMac.cpp:99)
1   libgfx_mac.dylib      	0x0644d1b0 nsRenderingContextMac::Release() + 276 
(nsRenderingContextMac.cpp:127)
2   libgkgfx.dylib        	0x01d1ca14 nsCOMPtr<nsIRenderingContext>::~nsCOMPtr [in-charge]() + 
66 (nsCOMPtr.h:583)
Severity: normal → critical
Attached patch FixSplinter Review
Got it.  This is our bug.  So sorry to have doubted you, Apple.

The problem is that we've got two different StPortSetter classes, with different structures.  Both are declared in .h files, so this isn't a problem at higher optimization levels presumably due to more aggressive inlining.  When this breaks, it seems that the ctor and dtor from different classes are being used.  This results in a bad port being set when the StPortSetter is destroyed.

The StPortSetters are defined in gfx/src/mac/nsGfxUtils.h and xpfe/bootstrap/appleevents/nsAEUtils.h .  We use the nsGfxUtils one all over gfx and widget.  The nsAEUtils one is only used in xpfe/bootstrap/appleevents/nsAEWindowClass.cpp.  There is nothing preventing appleevents from using the StPortSetter in nsGfxUtils, except that gfx' headers aren't available to appleevents.  There's no reason for them not to be, since gfx is already a prerequisite for appleevents.

This patch removes the StPortSetter from nsAEUtils and modifies nsAEWindowClass to pick up the nsGfxUtils version.  The Makefile is adjusted to allow gfx includes in appleevents.  Bringing in nsGfxUtils also brings in StHandleLocker, which is also defined in nsAEUtils.  Since these are also functionally identical, the nsAEUtils version is removed.  The only user of StHandleLocker in appleevents is nsAEWordClass, it's modified accordingly, but note that it is never compiled.

I've also cleaned up several unused variables in nsAEWindowClass.

This patch allows --disable-optimize and -O0 builds to launch and function on Mac PPC.  I can't test Mac Intel.  Those may still have trouble due to bug 313398.
Attachment #200650 - Flags: superreview?(sfraser_bugs)
Attachment #200650 - Flags: review?(joshmoz)
Flags: blocking1.8rc1?
Flags: blocking1.8rc1? → blocking1.8rc1-
Risk is low.  Our current behavior is wrong.  We don't break under other compilers or optimization levels because we rely on toolchain behaviors that aren't guaranteed, and have been lucky up until now.  The benefit of this change is that it's actually possible to debug non-optimized builds, and as everyone knows, debug builds are most easily debugged when they're not optimized.  I'm now working with this in my trunk tree and am actually able to use a proper debug build.  This is critical for continued development of the Intel Mac port, because gcc 4 is Apple's only supported/shipped compiler on that architecture.
Status: NEW → ASSIGNED
Flags: blocking1.8rc1- → blocking1.8rc1?
Whoops, bugzilla didn't warn me about a mid-air.  I'd reminus but I'm not "sufficiently empowered."
Comment on attachment 200650 [details] [diff] [review]
Fix

This patch works.

I'm not sure what our dependency story is between xpfe and gfx - seems like this code was copied, and I wonder what the reason for that was. So long as you understand why the code was copied in the first place and didn't just use the same class like your patch makes it, r=josh.
Attachment #200650 - Flags: review?(joshmoz) → review+
Comment on attachment 200650 [details] [diff] [review]
Fix

>Index: mozilla/xpfe/bootstrap/appleevents/nsAEWindowClass.cpp
>===================================================================

> 		if (err == noErr)
> 		{		
>-			short windowWidth =  r.right - r.left;
>-			short windowDepth =  r.bottom - r.top;
>-				
>  			MoveWindow(window, r.left, r.top, false);
>- 			// ¥¥¥ÊDoResize(window, windowWidth, windowDepth);
>+			// short windowWidth =  r.right - r.left;
>+			// short windowDepth =  r.bottom - r.top;
>+ 			// DoResize(window, windowWidth, windowDepth);
> 		}

What is this change doing?
(In reply to comment #6)
> (From update of attachment 200650 [details] [diff] [review] [edit])
> >Index: mozilla/xpfe/bootstrap/appleevents/nsAEWindowClass.cpp
> >===================================================================
> 
> > 		if (err == noErr)
> > 		{		
> >-			short windowWidth =  r.right - r.left;
> >-			short windowDepth =  r.bottom - r.top;
> >-				
> >  			MoveWindow(window, r.left, r.top, false);
> >- 			// ¥¥¥ÊDoResize(window, windowWidth, windowDepth);
> >+			// short windowWidth =  r.right - r.left;
> >+			// short windowDepth =  r.bottom - r.top;
> >+ 			// DoResize(window, windowWidth, windowDepth);
> > 		}
> 
> What is this change doing?

windowWidth and windowDepth were unused.  Normally, I'd just get rid of them, but they are used by the commented-out DoResize call, so I commented out the declarations too to match.  DoResize was preceded by bullets in MacRoman encoding, which come up as yen in UTF-8.  Since we don't really need anything non-ASCII in there, I just got rid of them.

I could also whack the lines.
> windowWidth and windowDepth were unused.  Normally, I'd just get rid of them,
> but they are used by the commented-out DoResize call, so I commented out the
> declarations too to match.  DoResize was preceded by bullets in MacRoman
> encoding, which come up as yen in UTF-8.  Since we don't really need anything
> non-ASCII in there, I just got rid of them.
> 
> I could also whack the lines.

I'd leave them in, since in theory setting the bounds property should resize the window. Just take out the bullets.
Attachment #200650 - Flags: superreview?(sfraser_bugs) → superreview+
The StPortSetter constructor was being inlined in nsRenderingContextMac, but the destructor was not and we were picking up the one from nsAEUtils.

Fixed on trunk.

Although this won't be taken on the 1.8 branch now, it will need attention there in the future.
not going to block our release. developers who need this can patch locally for now.
Flags: blocking1.8rc1? → blocking1.8rc1-
Nominating for 1.8.1 so we'll be able to properly debug on Intel when it becomes a real shipping product.  This has been on the trunk for a few weeks.
Flags: blocking1.8.1?
Moving my stability bugs to blocking1.8.0.1?
Flags: blocking1.8.1? → blocking1.8.0.1?
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Comment on attachment 200650 [details] [diff] [review]
Fix

This is important to be able to debug on x86 Macs.  No risk.
Attachment #200650 - Flags: approval1.8.0.1?
Attachment #200650 - Flags: approval1.8.1+
Attachment #200650 - Flags: approval1.8.0.1?
Attachment #200650 - Flags: approval1.8.0.1-
Flags: blocking1.8.1+
Flags: blocking1.8.0.1?
Flags: blocking1.8.0.1-
I disagree with 1.8.0.1 landing not getting approved. This is a very low risk patch and is a serious problem for anyone trying to work with this code on a mac.
Flags: blocking1.8.0.1- → blocking1.8.0.1?
Furthermore, this has been baking on the trunk for a over a month with no regressions reported.
...over TWO months!
Fixed on MOZILLA_1_8_BRANCH.
Keywords: fixed1.8.1
Target Milestone: --- → mozilla1.8.1
Comment on attachment 200650 [details] [diff] [review]
Fix

a=dveditz for drivers
Attachment #200650 - Flags: approval1.8.0.1- → approval1.8.0.1+
Flags: blocking1.8.0.1? → blocking1.8.0.1+
Fixed on MOZILLA_1_8_0_BRANCH.
Keywords: fixed1.8.0.1
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: