Mac OS X gcc 4 builds with --disable-optimize/-O0 crash, often at startup [@ SetGWorld]

RESOLVED FIXED in mozilla1.8.1

Status

Core Graveyard
GFX: Mac
--
critical
RESOLVED FIXED
12 years ago
8 years ago

People

(Reporter: Mark Mentovai, Assigned: Mark Mentovai)

Tracking

({fixed1.8.0.1, fixed1.8.1})

Trunk
mozilla1.8.1
PowerPC
Mac OS X
fixed1.8.0.1, fixed1.8.1
Bug Flags:
blocking1.8rc1 -
blocking1.8.1 +
blocking1.8.0.1 +

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(1 attachment)

Fix
6.35 KB, patch
Josh Aas
: review+
Simon Fraser
: superreview+
bsmedberg
: approval1.8.1+
Details | Diff | Splinter Review
(Assignee)

Description

12 years ago
From bug 294244 comment 8:

OS X builds produced with gcc 4 and optimization disabled are unstable.  They
crash in QuickDraw, often at startup.  They're more likely to crash at startup
if the profile manager is configured to display at start.

Building at -O1 and up is fine.

I haven't touched this in a while, so I'm not sure if it's our bug or Apple's,
but I recall that I was leaning toward Apple.  Needs more investigation.

http://paste.lisp.org/display/11087 has a backtrace from someone else's debug
build produced with --disable-optimize.
(Assignee)

Updated

12 years ago
Component: Build Config → GFX: Mac
(Assignee)

Updated

12 years ago
Summary: Mac OS X gcc 4 builds with --disable-optimize/-O0 don't work → Mac OS X gcc 4 builds with --disable-optimize/-O0 crash

Updated

12 years ago
Summary: Mac OS X gcc 4 builds with --disable-optimize/-O0 crash → Mac OS X gcc 4 builds with --disable-optimize/-O0 crash [@ SetGWorld]

Updated

12 years ago
Summary: Mac OS X gcc 4 builds with --disable-optimize/-O0 crash [@ SetGWorld] → Mac OS X gcc 4 builds with --disable-optimize/-O0 crash, often at startup [@ SetGWorld]

Comment 1

12 years ago
Unoptimized GCC4 builds crash almost immediately on Intel Macs as well. Some lines from different 
traces:
--------------------------------------------
Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_PROTECTION_FAILURE (0x0002) at 0x00000215
Thread 0 Crashed:
0   com.apple.QD          	0x91668c0b GetPortBounds + 20
1   libgfx_mac.dylib      	0x25b2e35c nsRenderingContextMac::SelectDrawingSurface
(nsDrawingSurfaceMac*, unsigned) + 194 (nsRenderingContextMac.cpp:221)
2   libgfx_mac.dylib      	0x25b2e755 nsRenderingContextMac::Init(nsIDeviceContext*, nsIWidget*) + 
277 (nsRenderingContextMac.cpp:160)
------------------------------------------------
Exception:  EXC_BAD_ACCESS (0x0001)
Codes:      KERN_PROTECTION_FAILURE (0x0002) at 0x00000008
Thread 0 Crashed:
0   libgfx_mac.dylib      	0x0644ce1b nsRenderingContextMac::~nsRenderingContextMac [in-charge 
deleting]() + 47 (nsRenderingContextMac.cpp:99)
1   libgfx_mac.dylib      	0x0644d1b0 nsRenderingContextMac::Release() + 276 
(nsRenderingContextMac.cpp:127)
2   libgkgfx.dylib        	0x01d1ca14 nsCOMPtr<nsIRenderingContext>::~nsCOMPtr [in-charge]() + 
66 (nsCOMPtr.h:583)
(Assignee)

Updated

12 years ago
Severity: normal → critical
(Assignee)

Comment 2

12 years ago
Created attachment 200650 [details] [diff] [review]
Fix

Got it.  This is our bug.  So sorry to have doubted you, Apple.

The problem is that we've got two different StPortSetter classes, with different structures.  Both are declared in .h files, so this isn't a problem at higher optimization levels presumably due to more aggressive inlining.  When this breaks, it seems that the ctor and dtor from different classes are being used.  This results in a bad port being set when the StPortSetter is destroyed.

The StPortSetters are defined in gfx/src/mac/nsGfxUtils.h and xpfe/bootstrap/appleevents/nsAEUtils.h .  We use the nsGfxUtils one all over gfx and widget.  The nsAEUtils one is only used in xpfe/bootstrap/appleevents/nsAEWindowClass.cpp.  There is nothing preventing appleevents from using the StPortSetter in nsGfxUtils, except that gfx' headers aren't available to appleevents.  There's no reason for them not to be, since gfx is already a prerequisite for appleevents.

This patch removes the StPortSetter from nsAEUtils and modifies nsAEWindowClass to pick up the nsGfxUtils version.  The Makefile is adjusted to allow gfx includes in appleevents.  Bringing in nsGfxUtils also brings in StHandleLocker, which is also defined in nsAEUtils.  Since these are also functionally identical, the nsAEUtils version is removed.  The only user of StHandleLocker in appleevents is nsAEWordClass, it's modified accordingly, but note that it is never compiled.

I've also cleaned up several unused variables in nsAEWindowClass.

This patch allows --disable-optimize and -O0 builds to launch and function on Mac PPC.  I can't test Mac Intel.  Those may still have trouble due to bug 313398.
Attachment #200650 - Flags: superreview?(sfraser_bugs)
Attachment #200650 - Flags: review?(joshmoz)

Updated

12 years ago
Flags: blocking1.8rc1?

Updated

12 years ago
Flags: blocking1.8rc1? → blocking1.8rc1-
(Assignee)

Comment 3

12 years ago
Risk is low.  Our current behavior is wrong.  We don't break under other compilers or optimization levels because we rely on toolchain behaviors that aren't guaranteed, and have been lucky up until now.  The benefit of this change is that it's actually possible to debug non-optimized builds, and as everyone knows, debug builds are most easily debugged when they're not optimized.  I'm now working with this in my trunk tree and am actually able to use a proper debug build.  This is critical for continued development of the Intel Mac port, because gcc 4 is Apple's only supported/shipped compiler on that architecture.
Status: NEW → ASSIGNED
Flags: blocking1.8rc1- → blocking1.8rc1?
(Assignee)

Comment 4

12 years ago
Whoops, bugzilla didn't warn me about a mid-air.  I'd reminus but I'm not "sufficiently empowered."

Comment 5

12 years ago
Comment on attachment 200650 [details] [diff] [review]
Fix

This patch works.

I'm not sure what our dependency story is between xpfe and gfx - seems like this code was copied, and I wonder what the reason for that was. So long as you understand why the code was copied in the first place and didn't just use the same class like your patch makes it, r=josh.
Attachment #200650 - Flags: review?(joshmoz) → review+

Comment 6

12 years ago
Comment on attachment 200650 [details] [diff] [review]
Fix

>Index: mozilla/xpfe/bootstrap/appleevents/nsAEWindowClass.cpp
>===================================================================

> 		if (err == noErr)
> 		{		
>-			short windowWidth =  r.right - r.left;
>-			short windowDepth =  r.bottom - r.top;
>-				
>  			MoveWindow(window, r.left, r.top, false);
>- 			// ¥¥¥ÊDoResize(window, windowWidth, windowDepth);
>+			// short windowWidth =  r.right - r.left;
>+			// short windowDepth =  r.bottom - r.top;
>+ 			// DoResize(window, windowWidth, windowDepth);
> 		}

What is this change doing?
(Assignee)

Comment 7

12 years ago
(In reply to comment #6)
> (From update of attachment 200650 [details] [diff] [review] [edit])
> >Index: mozilla/xpfe/bootstrap/appleevents/nsAEWindowClass.cpp
> >===================================================================
> 
> > 		if (err == noErr)
> > 		{		
> >-			short windowWidth =  r.right - r.left;
> >-			short windowDepth =  r.bottom - r.top;
> >-				
> >  			MoveWindow(window, r.left, r.top, false);
> >- 			// ¥¥¥ÊDoResize(window, windowWidth, windowDepth);
> >+			// short windowWidth =  r.right - r.left;
> >+			// short windowDepth =  r.bottom - r.top;
> >+ 			// DoResize(window, windowWidth, windowDepth);
> > 		}
> 
> What is this change doing?

windowWidth and windowDepth were unused.  Normally, I'd just get rid of them, but they are used by the commented-out DoResize call, so I commented out the declarations too to match.  DoResize was preceded by bullets in MacRoman encoding, which come up as yen in UTF-8.  Since we don't really need anything non-ASCII in there, I just got rid of them.

I could also whack the lines.

Comment 8

12 years ago
> windowWidth and windowDepth were unused.  Normally, I'd just get rid of them,
> but they are used by the commented-out DoResize call, so I commented out the
> declarations too to match.  DoResize was preceded by bullets in MacRoman
> encoding, which come up as yen in UTF-8.  Since we don't really need anything
> non-ASCII in there, I just got rid of them.
> 
> I could also whack the lines.

I'd leave them in, since in theory setting the bounds property should resize the window. Just take out the bullets.

Updated

12 years ago
Attachment #200650 - Flags: superreview?(sfraser_bugs) → superreview+
(Assignee)

Comment 9

12 years ago
The StPortSetter constructor was being inlined in nsRenderingContextMac, but the destructor was not and we were picking up the one from nsAEUtils.

Fixed on trunk.

Although this won't be taken on the 1.8 branch now, it will need attention there in the future.

Comment 10

12 years ago
not going to block our release. developers who need this can patch locally for now.
Flags: blocking1.8rc1? → blocking1.8rc1-
(Assignee)

Comment 11

12 years ago
Nominating for 1.8.1 so we'll be able to properly debug on Intel when it becomes a real shipping product.  This has been on the trunk for a few weeks.
Flags: blocking1.8.1?
(Assignee)

Comment 12

12 years ago
Moving my stability bugs to blocking1.8.0.1?
Flags: blocking1.8.1? → blocking1.8.0.1?
(Assignee)

Updated

12 years ago
Status: ASSIGNED → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED
(Assignee)

Comment 13

12 years ago
Comment on attachment 200650 [details] [diff] [review]
Fix

This is important to be able to debug on x86 Macs.  No risk.
Attachment #200650 - Flags: approval1.8.0.1?
Attachment #200650 - Flags: approval1.8.1+
Attachment #200650 - Flags: approval1.8.0.1?
Attachment #200650 - Flags: approval1.8.0.1-
Flags: blocking1.8.1+
Flags: blocking1.8.0.1?
Flags: blocking1.8.0.1-

Comment 14

12 years ago
I disagree with 1.8.0.1 landing not getting approved. This is a very low risk patch and is a serious problem for anyone trying to work with this code on a mac.

Updated

12 years ago
Flags: blocking1.8.0.1- → blocking1.8.0.1?

Comment 15

12 years ago
Furthermore, this has been baking on the trunk for a over a month with no regressions reported.
(Assignee)

Comment 16

12 years ago
...over TWO months!
(Assignee)

Comment 17

12 years ago
Fixed on MOZILLA_1_8_BRANCH.
Keywords: fixed1.8.1
Target Milestone: --- → mozilla1.8.1
Comment on attachment 200650 [details] [diff] [review]
Fix

a=dveditz for drivers
Attachment #200650 - Flags: approval1.8.0.1- → approval1.8.0.1+
Flags: blocking1.8.0.1? → blocking1.8.0.1+
(Assignee)

Comment 19

12 years ago
Fixed on MOZILLA_1_8_0_BRANCH.
Keywords: fixed1.8.0.1
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.