Closed Bug 340453 Opened 14 years ago Closed 5 years ago

Objective-C/Cocoa exceptions are ignored

Categories

(Core :: Widget: Cocoa, defect)

PowerPC
macOS
defect
Not set

Tracking

()

RESOLVED INVALID

People

(Reporter: mark, Unassigned)

References

(Blocks 1 open bug)

Details

Since bug 339371, Objective-C assertions raised by Cocoa frameworks are ignored instead of causing app termination.

2006-06-05 14:10:39.021 firefox-bin[28714] *** -[NSView getNativeWindow]:
selector not recognized [self = 0x2a8525a0]
2006-06-05 14:10:39.021 firefox-bin[28714] *** NSRunLoop ignoring exception
'*** -[NSView getNativeWindow]: selector not recognized [self = 0x2a8525a0]'
that raised during posting of delayed perform with target 133d260 and selector
'runAppShell'

(from bug 340436)

NSRunLoop is handling these exceptions (by ignoring them) and allowing the app to continue running.  This frustrates debugging and will likely result in a hung app or an app that crashes with a stack entirely unrelated to the root cause.
Are you saying that objective-C assertions should be fatal in debug builds, like JavaScript engine assertions?
I mis-subjected the bug - it should have read "exceptions," not "assertions."  Previously, uncaught Objective-C exceptions were fatal, debug or not.  This is the expected behavior.
Summary: Objective-C/Cocoa assertions are ignored → Objective-C/Cocoa exceptions are ignored
Is Bug 163260 the same bug?
Nope.
bug 163260 is kind of related. Making these exceptions fatal would at least give us a safe way to recover.
Er, don't know why I said recover. I meant crash.

Anyway, the test extension on bug 163260 will be useful in testing any fix for this.
Flags: blocking1.9?
This seems to me like a subset of 163260.  Mark you seemed to think not... how critical is this bug and how do they relate?
Flags: blocking1.9? → blocking1.9-
As I read it, 163260 is specifically about bad things that can happen when gecko code calls into objc code, so any objc entry points that can be accessed from XPCOM, and any code that relies on C++ destructors destructing things when they're popped off the stack, need to be wrapped up in an objc @try.

This bug is more about objc exceptions that would otherwise be fatal not terminating the app.  This can happen with stacks that don't touch XPCOM at all.  For example, if a native event occurs, resulting in some objc object being sent a message, we might hit an objc exception during processing.  The example above ([NSView getNativeWindow]: selector not recognized) could happen without any XPCOM or C++ code on the stack at all.  You could catch it by expanding the scope of your @try-addition project (163260), but that shouldn't be necessary.  We should just let those exceptions be unhandled but remain fatal.

This bug (340453) was caused by some unfortunate workarounds in the Cocoa nsAppShell, that I'd like to see removed in favor of a Cocoa nsAppShell that delegates the entire run loop to NSApplicationRun/[NSApp run].  The model that we adopted last year (thread manager rewrite) doesn't really play nicely with Cocoa, which really wants to drive the loop itself all the time.  Everything we've done to the Cocoa app shell since last May has been a hack, and all of those hacks have had side effects, like this one, and like the one that gives you a few run loops on the stack before the application gets off the ground.
Blocks: 373122
Blocks: 163260
The current plan for this bug (and also bug 163260) is to crash the
browser whenever an exception is (or would have been) raised by
[NSException raise], after having NSLogged whatever information is in
the NSException object (i.e. it's name and description, and possibly
also its userInfo (if present)).

I hope to have a patch to accomplish this by the end of next week (by
2007-06-08).
If we fix the appshell to not run the entire program on a delayed perform, this whole problem (bug not 163260) should go away.
> If we fix the appshell to not run the entire program on a delayed
> perform, this whole problem (bug not 163260) should go away.

Unfortunately it doesn't.

About five weeks ago I wrote a patch to appshell to do this, and have
been using it steadily ever since.  It doesn't fix this problem.  (One
example is that bug 373122's testcase (attachment 257787 [details]) still causes
a bunch of bounds exceptions to happen without terminating the
browser, even with my appshell patch in place.)

(The reason I didn't post or land the patch is that there are still
some (minor) issues with it, and I first wanted to investigate the
problems that it was supposed to fix, but didn't.  Even though all the
so-called appshell problems turned out to be unrelated, I still think
the appshell patch is a good idea -- because it's simpler than the
current behavior and conforms better to Apple's "standard practice".
Would you like me to email you a copy?)
(Following up comment #11)

Here's an example of what you see in the console log (for bug 373122's
testcase) using my appshell patch (after disabling the fix for bug
373122):

  *** -[NSCFArray objectAtIndex:]: index (2) beyond bounds (2)

There's nothing about NSRunLoop ignoring an exception, but the browser
does keep running (it doesn't crash).
Is that really an NSException, or is it just something that gets NSLogged?
> Is that really an NSException

Yes.  The message is displayed by [NSException raise].  (I found this
out at bug 373122 by breaking on '-[NSException raise]' in gdb.)
What other than the delayed-perform is catching that exception, then?

I haven't seen your patch yet, but I'm concerned that it might over-catch exceptions that would otherwise be properly caught and handled (or purposely ignored) by some other code.
> I'm concerned that it might over-catch exceptions that would
> otherwise be properly caught and handled (or purposely ignored) by
> some other code.

It probably will.  But I don't think there's any feasible quick fix
other than what I outline in comment #9.  I think past performance
shows that we can't rely on Apple's code (wherever it might be) to
reliably distinguish between fatal and non-fatal exceptions.

Once "we" (that is I) know more about which exceptions should be
fatal, we can start drawing some distinctions.

(It's also possible that the bad things that have happened in the past
when Apple's code ignored certain NSExceptions aren't Apple's fault,
but are (somehow) the fault of the browser.  For example, I've noticed
that Colin's testcase for bug 163260 no longer "works" (see bug 163260
comment #15).  But even if the problem is already "fixed", discovering
what caused it will take time ... and the powers-that-be seem to want
this problem to be "fixed" quickly.)

(It's entirely possible that my "fix" will cause the browser to crash
much too often, and will need to be rolled back ... but I think we
should cross that bridge when (and if) we come to it.)
(Following up comment #16)

> For example, I've noticed that Colin's testcase for bug 163260 no
> longer "works" (see bug 163260 comment #15).

Actually it never worked properly.  But now I've created a slighly
altered version that _does_ "work" -- it crashes as Colin describes in
bug 163260 comment #10)
Hi Mark.  I see this is blocking bug 163260... can you revisit this to see if we can develop a patch to fix or work around this?
(In reply to comment #18)
> Hi Mark.  I see this is blocking bug 163260... can you revisit this to see if
> we can develop a patch to fix or work around this?

Mark hasn't had time to work on Mozilla for a while. This bug shouldn't be owned by him.
Assignee: mark → nobody
This should probably be owned by Steven or Josh.  Guys?
I'll grab it.  I've been working on a related issue at bug 441880 comment #1 and bug 442245.

In sum, bug 441880 comment #1 shows that Safari behaves exactly this way (as FF3 behaved before the patches for bug 163260 were landed), and therefore that what's reported here isn't really a problem (in itself).

Problems only arise when this behavior causes an XPCOM method to be aborted without its cleanup code having been called.  In other words, it's bug 163260 that's the real problem, not this bug.
Assignee: nobody → smichaud
I think this is (and always was) most likely invalid.  In any case I'll almost certainly never have time to work on it.
Assignee: smichaud → nobody
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → INVALID
You need to log in before you can comment on or make changes to this bug.