Closed Bug 44090 Opened 25 years ago Closed 23 years ago

service manager assertions when starting viewer

Categories

(Core :: XPCOM, defect, P3)

x86
Windows 98
defect

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: dbaron, Assigned: dougt)

Details

There are three assertions about reentering the service manager when starting viewer on Win98. This is a bit of a pain when running regression tests. This is the assertion we added a few days ago after fixing the only place it happened in mozilla. We should figure out what's happening here too.
Hrm, I'm not seeing these on NT. Could you grab stack traces?
Status: NEW → ASSIGNED
Here are the stack traces for the three assertions: KERNEL32! bff768a0() nsDebug::Assertion(const char * 0x0053d10c, const char * 0x0053d0f4, const char * 0x0053d0b4, int 355) line 246 + 13 bytes nsServiceManagerImpl::GetService(nsServiceManagerImpl * const 0x00be4d30, const nsID & {...}, const nsID & {...}, nsISupports * * 0x0203fc6c, nsIShutdownListener * 0x00000000) line 355 + 44 bytes nsServiceManager::GetService(const nsID & {...}, const nsID & {...}, nsISupports * * 0x0203fc6c, nsIShutdownListener * 0x00000000) line 562 nsGetServiceByCID::operator()(const nsID & {...}, void * * 0x0203fc6c) line 46 + 22 bytes nsCOMPtr<nsIMIMEService>::assign_from_helper(const nsCOMPtr_helper & {...}, const nsID & {...}) line 856 + 18 bytes nsCOMPtr<nsIMIMEService>::nsCOMPtr<nsIMIMEService>(const nsCOMPtr_helper & {...}) line 553 nsFileIO::Open(nsFileIO * const 0x00cd0890, char * * 0x00cd0900, int * 0x00cd0928) line 156 + 30 bytes nsFileTransport::Process() line 478 + 58 bytes nsFileTransport::Run(nsFileTransport * const 0x00cd08e4) line 452 nsThreadPoolRunnable::Run(nsThreadPoolRunnable * const 0x0132bb50) line 689 + 12 bytes nsThread::Main(void * 0x0132bb00) line 84 + 26 bytes _PR_NativeRunThread(void * 0x0132b990) line 399 + 13 bytes _threadstartex(void * 0x0132b7e0) line 212 + 13 bytes KERNEL32! bff88f20() KERNEL32! bff869ef() KERNEL32! bff868ec() KERNEL32! bff768a0() nsDebug::Assertion(const char * 0x0053d10c, const char * 0x0053d0f4, const char * 0x0053d0b4, int 355) line 246 + 13 bytes nsServiceManagerImpl::GetService(nsServiceManagerImpl * const 0x00be4d30, const nsID & {...}, const nsID & {...}, nsISupports * * 0x01f1fc6c, nsIShutdownListener * 0x00000000) line 355 + 44 bytes nsServiceManager::GetService(const nsID & {...}, const nsID & {...}, nsISupports * * 0x01f1fc6c, nsIShutdownListener * 0x00000000) line 562 nsGetServiceByCID::operator()(const nsID & {...}, void * * 0x01f1fc6c) line 46 + 22 bytes nsCOMPtr<nsIMIMEService>::assign_from_helper(const nsCOMPtr_helper & {...}, const nsID & {...}) line 856 + 18 bytes nsCOMPtr<nsIMIMEService>::nsCOMPtr<nsIMIMEService>(const nsCOMPtr_helper & {...}) line 553 nsFileIO::Open(nsFileIO * const 0x01326050, char * * 0x013260c0, int * 0x013260e8) line 156 + 30 bytes nsFileTransport::Process() line 478 + 58 bytes nsFileTransport::Run(nsFileTransport * const 0x013260a4) line 452 nsThreadPoolRunnable::Run(nsThreadPoolRunnable * const 0x00cd3410) line 689 + 12 bytes nsThread::Main(void * 0x00cd4410) line 84 + 26 bytes _PR_NativeRunThread(void * 0x01328eb0) line 399 + 13 bytes _threadstartex(void * 0x01328d00) line 212 + 13 bytes KERNEL32! bff88f20() KERNEL32! bff869ef() KERNEL32! bff868ec() KERNEL32! bff768a0() nsDebug::Assertion(const char * 0x0053d10c, const char * 0x0053d0f4, const char * 0x0053d0b4, int 355) line 246 + 13 bytes nsServiceManagerImpl::GetService(nsServiceManagerImpl * const 0x00be4d30, const nsID & {...}, const nsID & {...}, nsISupports * * 0x01dffc6c, nsIShutdownListener * 0x00000000) line 355 + 44 bytes nsServiceManager::GetService(const nsID & {...}, const nsID & {...}, nsISupports * * 0x01dffc6c, nsIShutdownListener * 0x00000000) line 562 nsGetServiceByCID::operator()(const nsID & {...}, void * * 0x01dffc6c) line 46 + 22 bytes nsCOMPtr<nsIMIMEService>::assign_from_helper(const nsCOMPtr_helper & {...}, const nsID & {...}) line 856 + 18 bytes nsCOMPtr<nsIMIMEService>::nsCOMPtr<nsIMIMEService>(const nsCOMPtr_helper & {...}) line 553 nsFileIO::Open(nsFileIO * const 0x01325900, char * * 0x01325b20, int * 0x01325b48) line 156 + 30 bytes nsFileTransport::Process() line 478 + 58 bytes nsFileTransport::Run(nsFileTransport * const 0x01325b04) line 452 nsThreadPoolRunnable::Run(nsThreadPoolRunnable * const 0x01327c80) line 689 + 12 bytes nsThread::Main(void * 0x01327c30) line 84 + 26 bytes _PR_NativeRunThread(void * 0x01327ac0) line 399 + 13 bytes _threadstartex(void * 0x01327910) line 212 + 13 bytes KERNEL32! bff88f20() KERNEL32! bff869ef() KERNEL32! bff868ec() I was not able to break on the inner GetService. Could this have something to do with getting the service on multiple threads at the same time?
Leak log from leaksoup: <void*> (16) Leaks 0x0A34E138 [0] <void*> (16){19946} 0x0B26401C ·&@· 0x0A372258 [1] <nsMIMEService> (16){19930} 0x00000000 0x00000000 malloc operator new(unsigned long, const std::nothrow_t&) operator new(unsigned long) nsServiceManagerImpl::GetService(const nsID&, const nsID&, nsISupports**, nsIShutdownListener*) nsServiceManager::GetService(const nsID&, const nsID&, nsISupports**, nsIShutdownListener*) nsGetServiceByCID::operator()(const nsID&, void**) const nsCOMPtr<14nsIMIMEService>::assign_from_helper(const nsCOMPtr_helper&, const nsID&) nsFileIO::Open(char**, int*) nsFileTransport::Process() nsFileTransport::Run() nsThreadPoolRunnable::Run() nsThread::Main(void*) _PR_UserRunThread _PR_UserRunThread
We tracked at least one of these assertions down to a race condition that occurs when the service manager's monitor is exited, here: http://lxr.mozilla.org/mozilla/source/xpcom/components/nsServiceManager.cpp#346 We tried taking out the monitor exit (so that the service manager holds the monitor through the CreateInstance call), and ended up triggering a bunch of brendan-assertions that complain about dangerous lock ordering. This is definitely bad, but nobody had great ideas about how to fix.
brendan suggested putting dummy record into the hashtable before we call out to CreateInstance(). Then, a nested call to GetService() would discover the dummy record and could return NS_ERROR_REGISTER_AGAIN_LATER or something. Of course, this doesn't really *solve* the problem of the nested service, but at least it provides a better way to detect the problem and recover. Adding rayw and scc to the cc list: any chance I could convince either of you to hack this out?
In threaded systems, racing cache fills often must rendezvous via a "fill in progress" dummy (really, incomplete) entry. We could use nsServiceEntry with a null mService to signify such a state. The problem here of a nested GetService, where there is no concurrency and there is presumably a data dependency on the return value of the inner GetService, cannot be solved with a new state and error result alone, although as waterson says, it beats an assertbotch! What we should endeavor to do is un-nest the control flow, or otherwise undo the data dependency. Maybe we can use the new error result to cause a deferral through the event loop. That's a dignified old hack that has saved the day numerous times in the MozillaClassic codebase. /be
WFM win98 I think the asserts stopped 10 days ago.
Even if the assertions stopped, the underlying thread safety problems probably still exist.
Giving to the XPCOM owner.
Assignee: waterson → scc
Status: ASSIGNED → NEW
Component: Viewer App → XPCOM
QA Contact: rickg → kandrot
Giving to the XPCOM owner.
Assignee: scc → dougt
QA Contact: kandrot → scc
we already have thread safety bugs which address most of the thread problems in XPCOM. I want to close this bug since it really doesn't specifically address any of the real problems. (Check out my bug list for the real xpcom threadsafety problems.) i am marking wfm but it really may just be a dup.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → WORKSFORME
These assertions, were, IIRC, a sign that the service manager was creating *multiple instances* of a service. That's bad. What other bug covers this problem?
I think that http://bugzilla.mozilla.org/show_bug.cgi?id=5795 would cover this problem. Note that it is marked INVALID. Maybe we should be locking in the generic factory code... ick. I would rather not.
You need to log in before you can comment on or make changes to this bug.