Closed Bug 398002 Opened 17 years ago Closed 17 years ago

Camino sometimes crashes when quitting

Categories

(Camino Graveyard :: General, defect)

x86
macOS
defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: suishouen, Unassigned)

References

Details

(Keywords: crash, regression)

Attachments

(2 files)

User-Agent:       Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en; rv:1.9a9pre) Gecko/2007092902 Camino/2.0a1pre (like Firefox/3.0a9pre)
Build Identifier: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en; rv:1.9a9pre) Gecko/2007092902 Camino/2.0a1pre (like Firefox/3.0a9pre)

Camino sometimes crashes when quiting, and every time it crashes Console logs;
Camino[1350] *** _NSAutoreleaseNoPool(): Object 0x******* of class BrowserWindowController autoreleased with no pool in place - just leaking

Reproducible: Sometimes
Attached file Camino.crash.log
Summary: Camino sometimes crashes when quiting → Camino sometimes crashes when quitting
Severity: normal → critical
Keywords: crash
I've experience the same crash half an hour ago. Two tabs open, nothing special. Happened after a short session.
Have you found any pattern that could help reproduce this ?

From my crash log:
Thread 0 Crashed:
0   libobjc.A.dylib                	0x90a400f8 objc_msgSend + 24
1   com.apple.Foundation           	0x92be85f4 -[NSArray makeObjectsPerformSelector:withObject:] + 264
2   com.apple.AppKit               	0x938433fc -[NSApplication _deallocHardCore:] + 220
3   com.apple.AppKit               	0x93841fb4 -[NSApplication terminate:] + 520
4   com.apple.AppKit               	0x9383fc4c -[NSApplication sendAction:to:from:] + 108
5   com.apple.AppKit               	0x9389a4b8 -[NSMenu performActionForItemAtIndex:] + 392
6   com.apple.AppKit               	0x9389a23c -[NSCarbonMenuImpl performActionWithHighlightingForItemAtIndex:] + 104
7   com.apple.AppKit               	0x937a1500 _NSHandleCarbonMenuEvent + 372
8   com.apple.AppKit               	0x9379ee64 _DPSNextEvent + 1280
9   com.apple.AppKit               	0x9379e7a8 -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] + 116
10  com.apple.AppKit               	0x9379acec -[NSApplication run] + 472
11  com.apple.AppKit               	0x9388b87c NSApplicationMain + 452
12  org.mozilla.camino             	0x0000380c start + 812
13  org.mozilla.camino             	0x00003510 start + 48
Status: UNCONFIRMED → NEW
Ever confirmed: true
I'm seeing this 100% of the time with today's (2007-09-29-02) trunk build, with a simple launch/quit.
Version: unspecified → Trunk
Josh, Steven, have you all been doing anything with autorelease pools in the core? I've seen pretty much identical behavior with an extension trying to manage its own autorelease pools in low-level Gecko code, with the result that Camino objects were ending up in a pool that the Gecko cleanup was destroying right in the middle of Camino's shutdown process. Having autorelease pools with overlapping, non-subset durations tends to get very ugly.
Flags: blocking1.9?
Another crashlog;

Thread 0 Crashed:
0   libobjc.A.dylib                	0x90a61b09 _objc_error + 86
1   libobjc.A.dylib                	0x90a61b40 __objc_error + 45
2   libobjc.A.dylib                	0x90a601a0 _freedHandler + 53
3   com.apple.Foundation           	0x927f0fcd -[NSArray makeObjectsPerformSelector:withObject:] + 324
4   com.apple.Foundation           	0x92801ca1 -[NSArray makeObjectsPerformSelector:] + 141
5   com.apple.AppKit               	0x9341a7a3 -[NSApplication _deallocHardCore:] + 233
6   com.apple.AppKit               	0x9341a2a5 -[NSApplication terminate:] + 629
7   com.apple.AppKit               	0x93380d88 -[NSApplication sendAction:to:from:] + 107
8   com.apple.AppKit               	0x9342ece7 -[NSMenu performActionForItemAtIndex:] + 455
9   com.apple.AppKit               	0x9342ea29 -[NSCarbonMenuImpl performActionWithHighlightingForItemAtIndex:] + 103
10  com.apple.AppKit               	0x9342e680 -[NSMenu performKeyEquivalent:] + 766
11  com.apple.AppKit               	0x9342e121 -[NSApplication _handleKeyEquivalent:] + 381
12  com.apple.AppKit               	0x93361d87 -[NSApplication sendEvent:] + 3542
13  com.apple.AppKit               	0x9328cdfe -[NSApplication run] + 547
14  com.apple.AppKit               	0x93280d2f NSApplicationMain + 573
15  org.mozilla.camino             	0x00002666 start + 258
16  org.mozilla.camino             	0x0000258d start + 41
> Josh, Steven, have you all been doing anything with autorelease
> pools in the core?

I haven't been ... and I suspect Josh hasn't been either.

But I know of a couple cases where my new appshell (bug 395397) has
triggered/unmasked what are arguably Apple bugs (bug 396860 and bug
397039).  This may be another such case.

As I was developing the appshell patch I noticed that Camino started
crashing more often on exit.  But most of those stacks were different
and I put in workarounds (see [TopLevelWindowData windowWillClose:] in
widget/src/cocoa/nsWindowMap.mm and nsAppShell::WillTerminate() in
widget/src/cocoa/nsAppShell.mm; I'm no longer sure that
WillTerminate()'s "extra" call to NS_ProcessPendingEvents() triggers
these problems).

I _have_ also seen stacks like Eiichi's, which my workarounds didn't
fix ... but never enough to test with.  If it's now easy to reproduce
them 100% of the time, it'll be worth trying to tackle them, too.
Here's the part of the console.log from quitting my debug build (built last night); there's an appshell warning in there among all the other xpcom stuff, too (in addition to the autorelease message).
> WARNING: nsAppShell::Exit() called redundantly: ...

This is an old bug (probably cross-platform), and almost certainly
benign ... though the OS X appshell code only started complaining
about it recently.

See bug 396833.
I can crash reliably in 2007-09-28-00, but after a number of attempts with 2007-09-27-00, I was unable to crash it on quit at all.

That gives us this range: http://bonsai.mozilla.org/cvsquery.cgi?treeid=default&module=Camino&branch=HEAD&branchtype=match&dir=&file=&filetype=match&who=&whotype=match&sortby=Date&hours=2&date=explicit&mindate=2007-09-27+00%3A00&maxdate=2007-09-28+00%3A00&cvsroot=%2Fcvsroot

There was an appshell patch landed that day, bug 397439, but I'm not really sure which, if any, other bugs are interesting there.
Keywords: regression
I still think (as I say in comment #5) that this problem is most
likely to be a side effect of my original appshell patch (bug 395397)
... though there are clearly also other factors involved (which would
explain why the problem seems to have drifted in and out of existence
since then).

Smokey, try testing with the Camino trunk nightlies for 2007-09-19 and
2007-09-20 (the second is the first nightly my original appshell patch
appeared in).
> (as I say in comment #5)

Should have been "(as I say in comment #6)".
I tried 2007-09-27-00, but it also crashes.

Camino.crash.log;
Thread 0 Crashed:
0   libobjc.A.dylib                	0x90a594c0 objc_msgSend + 16
1   com.apple.Foundation           	0x92801ca1 -[NSArray makeObjectsPerformSelector:] + 141
2   com.apple.AppKit               	0x9341a7a3 -[NSApplication _deallocHardCore:] + 233
3   com.apple.AppKit               	0x9341a2a5 -[NSApplication terminate:] + 629
4   com.apple.AppKit               	0x93380d88 -[NSApplication sendAction:to:from:] + 107
5   com.apple.AppKit               	0x9342ece7 -[NSMenu performActionForItemAtIndex:] + 455
6   com.apple.AppKit               	0x9342ea29 -[NSCarbonMenuImpl performActionWithHighlightingForItemAtIndex:] + 103
7   com.apple.AppKit               	0x9342e680 -[NSMenu performKeyEquivalent:] + 766
8   com.apple.AppKit               	0x9342e121 -[NSApplication _handleKeyEquivalent:] + 381
9   com.apple.AppKit               	0x93361d87 -[NSApplication sendEvent:] + 3542
10  com.apple.AppKit               	0x9328cdfe -[NSApplication run] + 547
11  com.apple.AppKit               	0x93280d2f NSApplicationMain + 573
12  org.mozilla.camino             	0x00003126 start + 258
13  org.mozilla.camino             	0x0000304d start + 41
I'm testing 2007-09-19 and 2007-09-20.
2007-09-20 crashes, but 2007-09-19 seems to be O.K for now.
I've figured out the _deallocHardCore: crash, and why it's happening.
I've also revised my appshell patch to work around the problem (see
bug 400321).

As I explain in bug 400321 comment #0, this crash is ultimately caused
by a design flaw in Camino, which predates my appshell patch (bug
395397).  All my appshell patch did was uncover it.

For the sake of completeness, here's a step-by-step explanation of the
_deallocHardCore: crash:

After [NSApplication terminate:] has sent NSApplicationWillTerminate
notifications to all subscribers, it calls [NSApplication
_deallocHardCore:].  From it's name you'd expect this method to delete
everything that hasn't already been deleted, and this seems to be
exactly what it does.

Among other things, it does the following:

1) Gets a list of currently open windows, then uses [NSArray
   makeObjectsPerformSelector:] to send a "close" message to each of
   them.
2) Calls +[NSAutorelease releaseAllPools].

Eiichi's crash (which others have reported and which I've often seen
myself) takes place as makeObjectsPerformSelector: tries to send a
"close" message to an AutoCompleteWindow object that has already been
deleted (this class is defined in
camino/src/browser/AutoCompleteTextField.mm).

The reason this AutoCompleteWindow object has already been deleted is
that [NSArray makeObjectsPerformSelector:] previously sent a "close"
message to the last open browser window (class BrowserWindow).  This
call triggered an NSWindowWillClose notification to that browser
window's controller (class BrowserWindowController), which in turn
(indirectly) triggered a call to NS_TermEmbedding(), which (among
other things) destroyed the currently running appshell.  The
appshell's destructor contains code that (at least some of the time)
causes all extant autorelease pools (and all of the objects they
contain) to be released.  Among the objects released by this code is
the AutoCompleteWindow object that [NSArray
makeObjectsPerformSelector:] crashes on.
I should mention that whenever I see the _deallocHardCore: crash, I
also see one or more messages in the system console saying that an
object has been leaked (released with no autorelease pool in place).
One of these objects is always a BrowserWindowController.
(In reply to comment #15)
> As I explain in bug 400321 comment #0, this crash is ultimately caused
> by a design flaw in Camino

To capture the important part of our IRC discussion, the issue at hand is whether embedding is embedded in Camino, or Camino is embedded in embedding. If appshell is the app, and owns all the UI, then Camino is indeed doing very dangerous things. My perspective is the opposite, and if Camino owns the UI elements then having their lifetimes shortened is unexpected.

That's something that needs a larger discussion at some point, especially in light of the XULRunner framework discussions.
I think Stuart has summed up our discussion well:  If Camino is
supposed to be embedded in the appshell (to be owned by it), then the
design flaw is in Camino.  But if things are the other way around,
then the design flaw is in the appshell.

To my mind the fact that the appshell is currently responsible (on its
destruction) for releasing all autorelease pools (below the mMainPool
it creates in its constructor) gives it defacto ownership of Camino.
But this may not have been the original design, or how things should
ultimately be.

However things turn out, I don't think we should try to resolve this
before the Firefox 3.0 release -- I think my current workaround (the
patch for bug 400321) is a good compromise for the medium-term future.
By the way, Stuart, if Camino destroyed its own windows (on receipt of
an NSApplicationWillTerminate notification), and only called
NS_TermEmbedding() afterwards, this bug's crash on exit would never
have happened.

Camino should be responsible for destroying the things it "owns" :-)
Minusing per Steven's comments in #18 and #19
Flags: blocking1.9? → blocking1.9-
Actually, this should have been fixed by my patch for bug 400321.
Looks good here. Closing as FIXED by the patch in bug 400321.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: