[BeOS] xpcom crash on shutdown

RESOLVED FIXED

Status

--
critical
RESOLVED FIXED
17 years ago
17 years ago

People

(Reporter: cls, Assigned: cls)

Tracking

Trunk
x86
BeOS
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments, 1 obsolete attachment)

(Assignee)

Description

17 years ago
We are crashing inside NS_ShutdownXPCOM(nsnull).  If you run the browser and
exit, it works fine.  Viewer crashes when you close the window.  The tinderboxes
send a SIGTERM to close the app during the tinderbox tests and it crashes then
as well.  It appears to have been this way since shortly after NSCP's power
outage last month (~8/16).  I haven't been able to track this down to any single
checkin.  

Part of this might be due to BeOS' timer setup.  Via a static constructor in
libtimer_beos.so, we create a TimerManager object/native thread which is what
actually fires off the timer events.  This thread is never killed until the
TimerManager object is destroyed, which happens on shutdown.  Since this seems
to be asking for trouble, I (locally) made the TimerManager object dynamically
created using NS_GENERIC_FACTORY_CONSTRUCTOR_INIT, so the TimerManager should be
shut down/removed when the module destructor is called.  In theory, at least,
all of that should work.
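
To illustrate the lifetime change described above, here is a minimal sketch in
portable C++ rather than BeOS/XPCOM code.  The TimerManager class, ModuleInit(),
ModuleShutdown() and ManagerAlive() are hypothetical stand-ins and are not taken
from the patch; the point is only that the worker thread starts and stops at
explicit module init/shutdown instead of at static construction/destruction.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <memory>
#include <thread>

// Hypothetical stand-in for the BeOS TimerManager (not the Mozilla class).
// It owns a worker thread that would normally fire timer events.
class TimerManager {
public:
    // Start the worker thread explicitly instead of in a static constructor.
    bool Init() {
        running_ = true;
        worker_ = std::thread([this] {
            while (running_) {
                std::this_thread::sleep_for(std::chrono::milliseconds(1));
                ++ticks_;
            }
        });
        return true;
    }

    // Stop and join the worker so no thread outlives the manager.
    ~TimerManager() {
        running_ = false;
        if (worker_.joinable()) worker_.join();
    }

private:
    std::atomic<bool> running_{false};
    std::atomic<int> ticks_{0};
    std::thread worker_;
};

// Owned by the (hypothetical) module, not constructed at load time.
static std::unique_ptr<TimerManager> gManager;

bool ModuleInit() {
    gManager.reset(new TimerManager);
    return gManager->Init();
}

// Called from the module destructor: tears down the manager and its thread.
void ModuleShutdown() { gManager.reset(); }

bool ManagerAlive() { return gManager != nullptr; }
```

With this shape the timer thread is guaranteed to be gone once ModuleShutdown()
returns, which is the property the dynamic TimerManager was meant to provide.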

It turns out that the bookmarks' timer is a repeating timer.  It gets created
early and is one of the last things deleted, or so it appears.  After some wild
debugging, I discovered that the bookmarks' timer doesn't get deleted until the
bookmarks component is deleted, and that doesn't happen until the service
manager is shut down.  When the bookmarks' timer is killed, the last nsTimerBeOS object
is destroyed.  But the TimerManager & the timer thread are still there
(sometimes). Then when xpcom goes to clear the event queue, the program crashes
because it attempts to handle what I'm going to assume is some native BeOS event
with a dialog stating:

"You need a valid BApplication object before interacting with the app_server"

If the TimerManager destructor is called (printf debugging), then the crash
doesn't occur.

FYI, following the example in gtk viewer, I killed the nsViewerApp (and its
BApplication) before calling NS_ShutdownXPCOM but I'm not sure what event is
triggering the crash.

Comment 1

17 years ago
Hey Chris, 
Can you get a stack trace?  Also, do you have a debug build that I can look at?

Who is the be owner?  I posted a ng message asking for someone to look at this.
(Assignee)

Comment 2

17 years ago
Be has had a few strong contributors but currently doesn't have an owner afaik.I'm not sure how to get a better trace out of bdb but it looks like:_startmainNS_ShutdownXPCOMnsEventQueueImpl:ProcessPendingEvents(void)PL_ProcessPendingEventsPL_HandleEvent(0xeeb5f3c6)(0xeeb0fc91)(0xeeb1002e)(0xeeb20064)(0xeeb0a9e4)(0xeeb0acbd)nsSupportsArray:Clear(void)(0xeebbd1c4)(0xeebbd3e4)(0xeebbedd4)(0xecf6d294)(0xecf6d4ad)BBitmap:~BBitmap(void)_BAppServerLink_::_BAppServerLink_(void)debugger
(Assignee)

Comment 3

17 years ago
Created attachment 49369 [details] [diff] [review]
dynamically allocate TimerManager
(Assignee)

Comment 4

17 years ago
Created attachment 49371 [details] [diff] [review]
Explicitly init & shutdown XPCOM from beos viewer
(Assignee)

Comment 5

17 years ago
Argh! NetPositive & bugzilla don't mix!

Be has had a few strong contributors but currently doesn't have an owner afaik.
I'm not sure how to get a better trace out of bdb but it looks like:

_start
main
NS_ShutdownXPCOM
nsEventQueueImpl:ProcessPendingEvents(void)
PL_ProcessPendingEvents
PL_HandleEvent
(0xeeb5f3c6)
(0xeeb0fc91)
(0xeeb1002e)
(0xeeb20064)
(0xeeb0a9e4)
(0xeeb0acbd)
nsSupportsArray:Clear(void)
(0xeebbd1c4)
(0xeebbd3e4)
(0xeebbedd4)
(0xecf6d294)
(0xecf6d4ad)
BBitmap:~BBitmap(void)
_BAppServerLink_::_BAppServerLink_(void)
debugger


(Assignee)

Comment 6

17 years ago
Comment on attachment 49371 [details] [diff] [review]
Explicitly init & shutdown XPCOM from beos viewer

New patches coming up
Attachment #49371 - Attachment is obsolete: true
(Assignee)

Comment 7

17 years ago
Ok, I found the actual problem.  We weren't releasing the timer sync semaphore
when we exited nsAppShell via (::Exit).   It was only being released during the
normal course of the event handling loop.   So we were waiting in an
acquire_sem() call.  So the dynamic TimerManager patch isn't necessary (tested). 
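
For anyone following along, the hang can be sketched in portable C++.  This is
not the nsAppShell code: the Semaphore class stands in for a BeOS sem_id, and
AppShell, StartTimerThread() and TimerDone() are hypothetical names.  The
assumption is simply that some thread blocks in an acquire until the app shell
lets go of the semaphore, so the exit path has to do that too.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Minimal counting semaphore standing in for a BeOS semaphore.
class Semaphore {
public:
    // Block until the count is positive, then take one unit.
    void Acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        cv_.wait(lock, [this] { return count_ > 0; });
        --count_;
    }

    // Add one unit and wake a waiter.
    void Release() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++count_;
        }
        cv_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable cv_;
    int count_ = 0;
};

// Hypothetical app shell with a timer thread that syncs on the semaphore.
class AppShell {
public:
    void StartTimerThread() {
        timer_ = std::thread([this] {
            sync_.Acquire();      // blocks until the app shell releases
            timerDone_ = true;
        });
    }

    // Exit path: release the semaphore so the waiter can finish.
    // Without the Release() call, the join below would never return.
    void Exit() {
        sync_.Release();
        if (timer_.joinable()) timer_.join();
    }

    bool TimerDone() const { return timerDone_; }

private:
    Semaphore sync_;
    std::atomic<bool> timerDone_{false};
    std::thread timer_;
};
```

Removing the Release() call in Exit() reproduces the shape of the problem: the
waiter sits in Acquire() forever and shutdown cannot complete cleanly.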

I had to tweak the sighandlers for viewer & mozilla.  Just killing the
BApplication object wasn't cleaning things up correctly, so I modified the
handlers to just call nsAppShell::Exit() and shut down normally.  I also changed
viewer so that it creates the BApplication object in the same fashion as mozilla
does: on a separate thread that is mostly orthogonal to the main appshell thread.
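
A rough sketch of the "shut down normally on SIGTERM" idea, in portable C++.
Note the differences from the patch: this version sets a flag from the handler
and lets the event loop notice it, rather than calling an exit routine from the
handler itself, and RunAppShell(), OnSigterm() and gExitRequested are
hypothetical names.  The loop raises SIGTERM at itself on the third pass only to
simulate the tinderbox.

```cpp
#include <cassert>
#include <csignal>

// Set by the signal handler, polled by the event loop.
static volatile std::sig_atomic_t gExitRequested = 0;

// Handler does the minimum: record that a shutdown was requested.
extern "C" void OnSigterm(int) { gExitRequested = 1; }

// Hypothetical event loop.  Returns how many iterations ran before exit.
int RunAppShell() {
    std::signal(SIGTERM, OnSigterm);

    int iterations = 0;
    while (!gExitRequested) {
        ++iterations;
        // ... process pending events here ...
        if (iterations == 3) {
            std::raise(SIGTERM);  // simulate the tinderbox sending SIGTERM
        }
    }

    // The loop ended on its own, so the usual shutdown code after it
    // (releasing semaphores, shutting down services) still runs.
    return iterations;
}
```

The design point is that the signal only requests an exit; all teardown stays on
the normal path instead of being short-circuited by destroying the application
object.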
Assignee: dougt → cls
(Assignee)

Comment 8

17 years ago
Created attachment 49560 [details] [diff] [review]
delete syncsem when shutting down nsAppShell
(Assignee)

Comment 9

17 years ago
Created attachment 49561 [details] [diff] [review]
shutdown appshell when hit with SIGTERM

Comment 10

17 years ago
sr=alecf on both patches.
(Assignee)

Comment 11

17 years ago
The last two patches have been checked in and the beos tinderbox is green once
again. :-)
Status: NEW → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED