Closed Bug 14259 Opened 25 years ago Closed 25 years ago

Linux/Alpha: nsNativeComponentLoader::Init doesn't get called in gdb

Categories

(Core :: XPCOM, defect, P3)

DEC
Linux

Tracking

VERIFIED DUPLICATE of bug 14263

People

(Reporter: niles, Assigned: shaver)

Details

For some reason the address of the mLock variable in xpcom/ds/nsHashtable.cpp is getting set to 0x8, which causes a segmentation fault. This happens fairly far into the loading process, so it is not the first call to it. I've included the GDB traceback, but it's a bit mysterious how the address of a protected variable is getting changed.

My theory is that nsNativeComponentLoader::CreateDll is being called without a (required?) nsNativeComponentLoader::Init before it. This is shown by the fact that the "this" pointer is NULL in nsHashtable::Get (see traceback).

Anyway, Mozilla is totally broken on this platform as of the 19990917 source that this bug report is based on. I think adding some more PR_ASSERT calls would help catch these sorts of bugs early. I added:

    PR_ASSERT(mDllStore != NULL);

at nsNativeComponentLoader.cpp:937.

Here's the traceback:

    > gdb ./simplebrowser
    GNU gdb 4.17.0.11 with Linux support
    Copyright 1998 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "alpha-redhat-linux"...
    (gdb) run
    Starting program: /home/niles/mozilla/dist/bin/./simplebrowser
    Assertion: "Cannot obtain unix toolkit service." (rv == NS_OK) at file ../../../../webshell/tests/viewer/nsSetupRegistry.cpp, line 285
    Break: at file ../../../../webshell/tests/viewer/nsSetupRegistry.cpp, line 285
    NS_SetupRegistry() MOZ_TOOLKIT=error, WIDGET_DLL=error, GFX_DLL=error
    Program received signal SIGSEGV, Segmentation fault.
    nsHashtable::Get (this=0x0, aKey=0x11fffd4a0) at nsHashtable.cpp:162
    162 if (mLock) PR_Lock(mLock);
    Current language: auto; currently c++
    (gdb) where
    #0 nsHashtable::Get (this=0x0, aKey=0x11fffd4a0) at nsHashtable.cpp:162
    #1 0x20000606048 in nsNativeComponentLoader::CreateDll (this=0x12018cb70, aSpec=0x0, aLocation=0x120194300 "lib:libraptorwebwidget.so", modificationTime=0, fileSize=0, aDll=0x11fffd568) at nsNativeComponentLoader.cpp:937
    #2 0x2000060250c in nsNativeComponentLoader::GetFactory (this=0x12018cb70, aCID=@0x120194330, aLocation=0x120194300 "lib:libraptorwebwidget.so", aType=0x120194370 "application/x-mozilla-native", _retval=0x11fffda70) at nsNativeComponentLoader.cpp:104
    #3 0x20000600588 in nsFactoryEntry::GetFactory (this=0x120194330, aFactory=0x11fffda70, mgr=0x12018b740) at nsComponentManager.h:193
    #4 0x200005f9c94 in nsComponentManagerImpl::FindFactory (this=0x12018b740, aClass=@0x20000125e78, aFactory=0x11fffda70) at nsComponentManager.cpp:1046
    #5 0x200005fa478 in nsComponentManagerImpl::CreateInstance (this=0x12018b740, aClass=@0x20000125e78, aDelegate=0x0, aIID=@0x20000125e68, aResult=0x120196260) at nsComponentManager.cpp:1209
    #6 0x2000060e418 in nsComponentManager::CreateInstance (aClass=@0x20000125e78, aDelegate=0x0, aIID=@0x20000125e68, aResult=0x120196260) at nsRepository.cpp:77
    #7 0x2000012308c in GtkMozillaContainer::Show (this=0x120196250) at GtkMozillaContainer.cpp:61
    #8 0x20000122534 in gtk_mozilla_realize (widget=0x120195f10) at gtkmozilla.cpp:133
    #9 0x20000d7753c in gtk_marshal_NONE__NONE ()
    #10 0x20000d1e4ac in gtk_signal_real_emit ()
    #11 0x20000d1b8c4 in gtk_signal_emit ()
    #12 0x20000d65400 in gtk_widget_realize ()
    #13 0x20000d64f0c in gtk_widget_map ()
    #14 0x20000c4c1dc in gtk_box_map ()
    #15 0x20000d7753c in gtk_marshal_NONE__NONE ()
    #16 0x20000d1e4ac in gtk_signal_real_emit ()
    #17 0x20000d1b8c4 in gtk_signal_emit ()
    #18 0x20000d64f64 in gtk_widget_map ()
    #19 0x20000d74344 in gtk_window_map ()
    #20 0x20000d7753c in gtk_marshal_NONE__NONE ()
    #21 0x20000d1e4ac in gtk_signal_real_emit ()
    #22 0x20000d1b8c4 in gtk_signal_emit ()
    #23 0x20000d64f64 in gtk_widget_map ()
    #24 0x20000d74034 in gtk_window_show ()
    #25 0x20000d7753c in gtk_marshal_NONE__NONE ()
    #26 0x20000d1e4ac in gtk_signal_real_emit ()
    #27 0x20000d1b8c4 in gtk_signal_emit ()
    #28 0x20000d64254 in gtk_widget_show ()
    #29 0x120001ed8 in main (argc=1, argv=0x11ffff688) at simplebrowser.c:172
    #30 0x2000179ffb0 in __libc_start_main (main=0x120001a20 <main>, argc=1, argv=0x11ffff688, init=0x120001400 <_init>, fini=0x1200022c0 <_fini>, rtld_fini=0, stack_end=0x11ffff670) at ../sysdeps/generic/libc-start.c:78
    (gdb) p &mLock
    $1 = (struct PRLock **) 0x8
    (gdb)
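A note on the `p &mLock` result: with `this=0x0` in frame #0, the address of a member is just that member's offset within the object, so 0x8 need not mean that anything overwrote mLock. The sketch below illustrates the arithmetic. `FakeHashtable`, `FakeLock`, and `AddressOfLock` are hypothetical stand-ins, and the layout (one pointer-sized member ahead of mLock) is an assumption; the real nsHashtable layout is not shown in this report.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the real class. The assumed layout places one
// pointer-sized member ahead of mLock; the actual nsHashtable may differ.
struct FakeLock;
struct FakeHashtable {
    void*     table;  // assumed first member
    FakeLock* mLock;  // the member whose address gdb printed
};

// Computes the address a debugger would report for &mLock given a value of
// `this`. Uses integer arithmetic only, so no null pointer is dereferenced.
std::uintptr_t AddressOfLock(std::uintptr_t thisValue) {
    return thisValue + offsetof(FakeHashtable, mLock);
}
```

Under that assumed layout, `AddressOfLock(0)` equals `sizeof(void*)`, which is 8 on a 64-bit target such as Alpha. That would be consistent with a null hashtable pointer (mDllStore at the call site) rather than a corrupted member.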
Shaver, I would gladly accept any help you can provide here. Take the bug over if you are helping.
Target Milestone: M11
Shaver ?
Assignee: dp → shaver
Status: ASSIGNED → NEW
This is going to be a pain in the butt for me to debug, because I don't have a Linux/Alpha system, but I'll take it for now. I'll not lie to you, though: it might not get fixed before beta. Try putting a watchpoint on mLock and see where it gets changed. That'll help a lot.
Status: NEW → ASSIGNED
I just noticed you were using simplebrowser, and it looks like you're not getting the registry set up correctly. Try running apprunner or viewer, and try deleting the component.reg file in dist/bin.
I should have made this clear before... If I run the shell scripts mozilla-apprunner.sh or mozilla-viewer.sh, then it finds and loads the components. However, if I want to run in the debugger, I can't ever get it to find those components. I've tried simplebrowser, apprunner, and viewer and I get the same behavior.

When I try to watch mLock, it seems the addresses in the whole call tree go crazy... what do you think of this?

    Breakpoint 4, nsHashtable::nsHashtable (this=0x12010f840, aInitSize=0, threadSafe=0) at nsHashtable.cpp:100
    100 : mLock(NULL)
    (gdb) wher
    #0 nsHashtable::nsHashtable (this=0x12010f840, aInitSize=0, threadSafe=0) at nsHashtable.cpp:100
    #1 0x200004ec608 in nsComponentManagerImpl::Init (this=0x12010ee50) at nsComponentManager.cpp:209
    #2 0x200004892c4 in NS_InitXPCOM (result=0x0, registryFile=0x0, componentDir=0x0) at nsXPComInit.cpp:187
    #3 0x120004940 in main (argc=1, argv=0x11ffff688) at nsAppRunner.cpp:633
    #4 0x200011e3fb0 in __libc_start_main (main=0x1200047a0 <main>, argc=1, argv=0x11ffff688, init=0x120002400 <_init>, fini=0x1200093c0 <_fini>, rtld_fini=0x12010f840, stack_end=0x11ffff670) at ../sysdeps/generic/libc-start.c:78
    (gdb) c
    Continuing.
    Breakpoint 4, nsHashtable::nsHashtable (this=0x10, aInitSize=536868048, threadSafe=1) at nsHashtable.cpp:100
    100 : mLock(NULL)
    (gdb) where
    #0 nsHashtable::nsHashtable (this=0x10, aInitSize=536868048, threadSafe=1) at nsHashtable.cpp:100
    #1 0x200004ec76c in nsComponentManagerImpl::Init (this=0x12010ee50) at nsComponentManager.cpp:233
    warning: Hit heuristic-fence-post without finding
    warning: enclosing function for address 0x2fe0000047ff041f
I forced it to find the components by symlinking to the explicit path /usr/lib/mozilla. Now I get:

    > gdb apprunner
    GNU gdb 4.17.0.11 with Linux support
    Copyright 1998 Free Software Foundation, Inc.
    GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions.
    Type "show copying" to see the conditions.
    There is absolutely no warranty for GDB. Type "show warranty" for details.
    This GDB was configured as "alpha-redhat-linux"...
    (gdb) run
    Starting program: /home/niles/mozilla/dist/bin/apprunner
    nsNativeComponentLoader: autoregistering /usr/lib/mozilla/components
    PreCondition: "You can't dereference a NULL nsCOMPtr with operator->()." (mRawPtr != 0) at file ../base/nsCOMPtr.h, line 588
    Break: at file ../base/nsCOMPtr.h, line 588
    Program received signal SIGSEGV, Segmentation fault.
    0x200004fb378 in nsNativeComponentLoader::AutoRegisterComponent (this=0x120110280, when=0, component=0x120117070, registered=0x11ffff204) at nsNativeComponentLoader.cpp:719
    719 rv = mCompMgr->RegistryLocationForSpec(component, &persistentDescriptor);
    (gdb) where
    #0 0x200004fb378 in nsNativeComponentLoader::AutoRegisterComponent (this=0x120110280, when=0, component=0x120117070, registered=0x11ffff204) at nsNativeComponentLoader.cpp:719
    #1 0x200004f99b8 in nsNativeComponentLoader::RegisterComponentsInDir (this=0x120110280, when=0, dir=0x120114d10) at nsNativeComponentLoader.cpp:323
    #2 0x200004f947c in nsNativeComponentLoader::AutoRegisterComponents (this=0x120110280, aWhen=0, aDirectory=0x120114d10) at nsNativeComponentLoader.cpp:267
    #3 0x200004f2e08 in nsComponentManagerImpl::AutoRegister (this=0x12010ee50, when=0, inDirSpec=0x0) at nsComponentManager.cpp:1955
    #4 0x20000504c10 in nsComponentManager::AutoRegister (when=0, directory=0x0) at nsRepository.cpp:196
    #5 0x120005cc8 in NS_AutoregisterComponents () at nsSetupRegistry.cpp:95
    #6 0x120007700 in NS_SetupRegistry_1 () at nsSetupRegistry.cpp:115
    #7 0x120003cf4 in main1 (argc=1, argv=0x11ffff698) at nsAppRunner.cpp:501
    #8 0x120004998 in main (argc=1, argv=0x11ffff698) at nsAppRunner.cpp:637
    #9 0x200011e3fb0 in __libc_start_main (main=0x1200047a0 <main>, argc=1, argv=0x11ffff698, init=0x120002400 <_init>, fini=0x1200093c0 <_fini>, rtld_fini=0x120110298, stack_end=0x11ffff680) at ../sysdeps/generic/libc-start.c:78
    (gdb)
Summary: Linux/Alpha: xpcom/ds/nsHashtable.cpp mLock gets bogus address. → Linux/Alpha: nsNativeComponentLoader has null mCompMgr
Try adding ``#define DEBUG_shaver 1'' to the top of xpcom/components/nsNativeComponentLoader.cpp and let us know what messages, if any, you see. It sounds like nsNativeComponentLoader::Init isn't getting called, which will cause some rather serious stress. I'll read through the code and see if I can figure out which path it's taking, or which questions I should be asking. (I _love_ store-and-forward debugging.)
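For illustration only, here is a minimal sketch of the failure mode being discussed: a loader whose other methods depend on Init having run, with a guard that reports the missing Init instead of calling through a null member. `Loader`, `Manager`, `Result`, and `RegisterOne` are hypothetical names and are not the XPCOM classes or their signatures.

```cpp
#include <cassert>

// Hypothetical stand-in for the component manager the loader depends on.
struct Manager {
    int registrations = 0;
};

enum Result { kOk = 0, kNotInitialized = 1 };

// Hypothetical loader: mMgr stays null until Init() runs.
class Loader {
public:
    // Must be called once before any other method.
    void Init(Manager* mgr) {
        assert(mgr != nullptr);  // catch a bad argument in debug builds
        mMgr = mgr;
    }

    // Checks the precondition and returns an error code rather than
    // dereferencing a null member when Init() was skipped.
    Result RegisterOne() {
        if (mMgr == nullptr) {
            return kNotInitialized;
        }
        ++mMgr->registrations;
        return kOk;
    }

private:
    Manager* mMgr = nullptr;
};
```

With a guard of this kind, a skipped Init shows up as an error return at the first dependent call instead of a SIGSEGV further down. A debug-only assertion at the same spot would make it louder still.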
Summary: Linux/Alpha: nsNativeComponentLoader has null mCompMgr → Linux/Alpha: nsNativeComponentLoader::Init doesn't get called in gdb
Agreed! After adding that #define I found that the behavior is different in and out of the debugger, which is quite annoying. Inside the debugger the Init function is never called. Outside the debugger it is called and seems to run correctly (i.e. neither of your other print messages occurs), but other bad things happen later (see bug 14263). I would not mark this as a duplicate bug yet, as I believe there are two separate bugs happening here.
I believe this could be an SMP + (create an initial .mozilla dir) bug. I forgot to mention that this Alpha is SMP (sorry). I tried M10 on my dual PPro after removing the .mozilla directory and I had similar problems. Once I rebooted in non-SMP mode I was able to create the .mozilla directory and then run in SMP mode. I can't reboot the Alpha in non-SMP mode (it's too important). Can someone else test M10 on Linux/i386/SMP without an existing .mozilla directory and post the results? I think this is a problem with threads running out of order.
Works fine on Alpha/SMP using tip of tree + patch in http://bugzilla.mozilla.org/show_bug.cgi?id=14263
Status: ASSIGNED → RESOLVED
Closed: 25 years ago
Resolution: --- → DUPLICATE
I am going to mark this bug a dup of 14263, as it describes two different behaviours of the same problem. *** This bug has been marked as a duplicate of 14263 ***
verified dupe of now fixed bug
Status: RESOLVED → VERIFIED
Component: XPCOM Registry → XPCOM
QA Contact: dp → xpcom