Closed Bug 135330 Opened 22 years ago Closed 22 years ago

shutdown crash [@PR_EnterMonitor] called with a deleted monitor on the MemoryFlusher thread

Categories

(Core :: XPCOM, defect)

x86
Windows 2000
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla1.0

People

(Reporter: timeless, Assigned: dougt)

Details

(Keywords: crash, topembed+)

Crash Data

Attachments

(1 file, 1 obsolete file)

For reference I'm going to list all the threads as seen by msvc. I'm curious as 
to why there are other threads alive this late into shutdown...

Some interesting points.. 1. this isn't the first time i've crashed on one 
thread and later (not too much later) received an alert/assert dialog from a 
second thread. (I think it might be the first time MSVC didn't crash while 
trying to play with such a case) 2. the same thing happened here, i crashed and 
then i got the main thread assert. 3. that means i can't guarantee that threads 
as listed here match their state at the time of the crash *sigh*.

main thread:
NTDLL! 77f827e8()
KERNEL32! 77e86a3d()
nsDebug::WarnIfFalse(const char * 0x101070b8, const char * 0x101070ac, const 
char * 0x1010707c, int 582) line 397 + 21 bytes
NS_ShutdownXPCOM(nsIServiceManager * 0x00000000) line 582 + 31 bytes
main(int 1, char * * 0x00304b60) line 1774 + 8 bytes
mainCRTStartup() line 338 + 17 bytes
KERNEL32! 77e97d08()

The warning:
###!!! ASSERTION: Component Manager being held past XPCOM shutdown.: 'cnt == 
0', file f:\build\mozilla\xpcom\build\nsXPComInit.cpp, line 582

Thread main+1:
NTDLL! 77fa018d()
KERNEL32! 77e8758a()

crashed thread:
PR_EnterMonitor(PRMonitor * 0x002fd0e8) line 92 + 6 bytes
nsAutoMonitor::nsAutoMonitor(PRMonitor * 0x002fd0e8) line 200 + 13 bytes
nsComponentManagerImpl::GetServiceByContractID(nsComponentManagerImpl * const 
0x002fcf94, const char * 0x10108354, const nsID & {...}, void * * 0x00acfe94) 
line 2150
nsGetServiceByContractID::operator()(const nsID & {...}, void * * 0x00acfe94) 
line 123 + 38 bytes
nsCOMPtr<nsIEventQueueService>::assign_from_helper(const nsCOMPtr_helper & 
{...}, const nsID & {...}) line 922 + 18 bytes
nsCOMPtr<nsIEventQueueService>::nsCOMPtr<nsIEventQueueService>(const 
nsCOMPtr_helper & {...}) line 554
nsMemoryImpl::FlushMemory(const unsigned short * 0x10108074, int 0) line 439 + 
30 bytes
MemoryFlusher::Run(MemoryFlusher * const 0x002f74f0) line 192 + 43 bytes
nsThread::Main(void * 0x002f76b8) line 120 + 26 bytes
_PR_NativeRunThread(void * 0x002f77b8) line 433 + 13 bytes
_threadstartex(void * 0x002f7a30) line 212 + 13 bytes
KERNEL32! 77e8758a()

Thread main+3:
USER32! 77e1325c()
nsDNSService::Run(nsDNSService * const 0x00e5b334) line 1360 + 21 bytes
nsThread::Main(void * 0x00e306a0) line 120 + 26 bytes
_PR_NativeRunThread(void * 0x00dd5080) line 433 + 13 bytes
_threadstartex(void * 0x00e49050) line 212 + 13 bytes
KERNEL32! 77e8758a()

Thread main+4:
NTDLL! 77f82829()
KERNEL32! 77e86e1a()
KERNEL32! 77e8758a()

Thread main+5:
NTDLL! 77f827e8()
KERNEL32! 77e86a3d()
8b000000()

Thread main+6:
NTDLL! 77f82a84()
RPCRT4! 77d50781()
RPCRT4! 77d50c7a()
KERNEL32! 77e8758a()

Thread main+7:
NTDLL! 77f82837()
KERNEL32! 77e8758a()

Relevant bits from the top of the crashed thread (aka main+2):
-	mon	0x002fd0e8
|+	name	0xdddddddd ""
|+	cvar	0xdddddddd
\	entryCount	3722304989

from nsComponentManagerImpl::GetServiceByContractID
-	this	0x002fcf94
|+	nsIComponentManager	{...}
|+	nsIServiceManager	{...}
|+	nsIComponentRegistrar	{...}
|-	nsSupportsWeakReference	{...}
||+	nsISupportsWeakReference	{...}
|\+	mProxy	0x00000000
|+	nsIInterfaceRequestor	{...}
|+	nsIServiceManagerObsolete	{...}
|+	nsIComponentManagerObsolete	{...}
|	mRefCnt	3
|	_mOwningThread	0x00302ca0
|+	gComponentManager	0x002fcf90
|-	mFactories	{...}
||+	ops	0x00000000
||	data	0x00000000
||	hashShift	22
||	maxAlphaFrac	224 'р'
||	minAlphaFrac	51 '3'
||	entrySize	8
||	entryCount	888
||	removedCount	0
||	generation	0
|\+	entryStore	0x00d60068 
"нннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннн
ннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннн
ннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннннн
нннннн"
|-	mContractIDs	{...}
||+	ops	0x00000000
||	data	0x00000000
||	hashShift	21
||	maxAlphaFrac	224 'р'
||	minAlphaFrac	102 'f'
||	entrySize	12
||	entryCount	1040
||	removedCount	0
||	generation	1
|\-	entryStore	0x00dad630 ""
| \		CXX0030: Error: expression cannot be evaluated
|	mMon	0x002fd0e8
|+	mRegistry	0x00000000
|	mXPCOMKey	559
|	mClassesKey	602
|	mCLSIDKey	642
|	mPrePopulationDone	1
|	mLoadersKey	3452816845
|+	mNativeComponentLoader	0x00000000
|+	mStaticComponentLoader	0x00000000
|-	mComponentsDir	{...}
|\+	mRawPtr	0x00000000
|	mComponentsOffset	48
|	mShuttingDown	2
|+	mLoaderData	0x00000000
|	mNLoaderData	2
\	mMaxNLoaderData	6

My guess is that /a/ fault lies in:
nsresult nsComponentManagerImpl::Shutdown(void) 
{
/*...*/
    if (mMon)
        PR_DestroyMonitor(mMon);
/*...*/
}

It looks like the code should null out mMon :-) ... But to do that, it needs to 
hold a lock to prevent people from trying to use mMon :-)

Of course, this doesn't resolve any of the big problems like why other threads 
are still alive when this late into shutdown :-)

fwiw, bug 119693 is among the PR_EnterMonitor crash bugs, however it's 
essentially a talkback crash site collector and doesn't appear to have any 
useful information.
see 134070 (and dupe 134361)
Keywords: crash
Timeless, can you apply the patch in this bug?

http://bugzilla.mozilla.org/show_bug.cgi?id=132705
dougt: this crash is not something i can just run moz, quit and reproduce. 
actually i'm not even sure i want to close this session (i'd like an ok from 
one of the initial email'ees before i close the debugger). however a quick read 
of that bug's patch indicates to me that it wouldn't help, the function listed 
in the patch doesn't fit in the call stack.

dougt... note that the patch you're poointing e to is for creating instances, 
whereas this is a crash doing a getservice....

Andrew: my crash really happens before any code from nsIEventQueueService hits 
the stack, so while it's a nice suggestion I also don't believe it's relevant.
/me wonders why we aquire the mMon monitor before checking gXPCOMShuttingDown. 
Since gXPCOMShuttingDown is only set on one thread, it doesn't need to be
protected, does it?!

Timeless, your right.  XPCOM joinable threads should be interrupted and joined
prior to the end of xpcom shutdown.
Attached patch v.1 (obsolete) — Splinter Review
moves the monitor acquisition after the shutdown check.
sets mMon to null after freeing it.

Timeless, can you review this change?
Keywords: nsbeta1, topembed
Attachment #78177 - Flags: review+
Comment on attachment 78177 [details] [diff] [review]
v.1

I'm not buying this. Moving the nsAutoMonitors is fine. But I 
don't think this is good enough.

I think it is wrong to be creating and destroying the monitor
in the Init/Shutdown methods. If this was getting done in the
ctor/dtor then this problem would mostly go away. Creating the
monitor in the Init method is not so whacky if you want to
detect failure and have that impact the result of the Init. 
But, destroying the monitor in the shutdown method is just 
wrong. The dtor is the place for this.

BTW, you've since learned that PR_NewMonitor is not the right 
way to make a monitor. You ought to fix this while you're 
here.

The fact that NS_ShutdownXPCOM's call of 
(nsComponentManagerImpl::gComponentManager)->Shutdown() 
can race with normal callers to the component/service 
manager is just broken. But I suppose that's another bug.
Attachment #78177 - Flags: needs-work+
I will:
a) destroy the monitor in the objects destructor
b) use the new style monitor creation fu.  
Attached patch v.2Splinter Review
Attachment #78177 - Attachment is obsolete: true
Comment on attachment 78428 [details] [diff] [review]
v.2

sr=jband.

+	 mMon = nsnull;

I'd remove this now that the monitor is destroyed in the dtor.
Attachment #78428 - Flags: superreview+
right. done.
Keywords: mozilla1.0
Target Milestone: --- → mozilla1.0
Keywords: topembedtopembed+
Comment on attachment 78428 [details] [diff] [review]
v.2

r=brendan@mozilla.org -- for optional bonus points, change those NULLs to be
nsnulls, or better, test if (!mMon) to match style elsewhere.

/be
Attachment #78428 - Flags: review+
Checking in nsComponentManager.cpp;
/cvsroot/mozilla/xpcom/components/nsComponentManager.cpp,v  <-- 
nsComponentManager.cpp
new revision: 1.195; previous revision: 1.194
done

We will want this on the mozilla 1.0 branch.
Comment on attachment 78428 [details] [diff] [review]
v.2

a=asa (on behalf of drivers) for checkin to the 1.0 branch
Attachment #78428 - Flags: approval+
Landed fix on branch:
Checking in nsComponentManager.cpp;
/cvsroot/mozilla/xpcom/components/nsComponentManager.cpp,v  <-- 
nsComponentManager.cpp
new revision: 1.194.2.2; previous revision: 1.194.2.1
done

Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
adding fixed1.0.0 keyword (branch resolution). This bug has comments saying it
was fixed on the 1.0 branch and a bonsai checkin comment that agrees. To verify
the bug has been fixed on the 1.0 branch please replace the fixed1.0.0 keyword
with verified1.0.0.
Keywords: fixed1.0.0
Crash Signature: [@PR_EnterMonitor]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: