Closed Bug 234020 Opened 21 years ago Closed 20 years ago

nsTestSample & regxpcom result in "bus error" on OSX [crash @CFStringGetLength]

Categories

(Core :: XPCOM, defect)

PowerPC
macOS
defect
Not set
normal

RESOLVED FIXED

People

(Reporter: darin.moz, Assigned: benjamin)

Details

(Keywords: crash, regression)

nsTestSample results in a "bus error" on OSX. This seems like some kind of regression. A trunk build from Nov 14, 2003 works fine, but a trunk build from Feb 12, 2004 fails. I first noticed this on the STRING_20040119_BRANCH, so that helps narrow the interval in which this regression was introduced.

I also noticed some link time warnings about prebinding. Have we changed the build significantly on OSX recently? Could prebinding be related?

Here's the stack trace:

Program received signal EXC_BAD_ACCESS, Could not access memory.
0x00000000 in ?? ()
(gdb) bt
#0  0x00000000 in ?? ()
Cannot access memory at address 0x0
Cannot access memory at address 0x0
#1  0x90196aac in CFStringGetLength ()
#2  0x901b2e84 in _resolveFileSystemPaths ()
#3  0x901c8c6c in _CFURLCopyAbsoluteFileURL ()
#4  0x901aecfc in CFURLCopyAbsoluteURL ()
#5  0x901ae458 in _CFURLInit ()
#6  0x901b32d4 in _CFURLInitWithString ()
#7  0x901bcdfc in CFURLCreateWithString ()
#8  0x901be560 in _CFBundleCopyInfoDictionaryInDirectoryWithVersion ()
#9  0x901c8484 in CFBundleGetInfoDictionary ()
#10 0x901ccb10 in _CFBundleCopyExecutableURLInDirectoryWithAllocator ()
#11 0x02b59a5c in nsDirectoryService::GetCurrentProcessDirectory(nsILocalFile**) (this=Cannot access memory at address 0x4f9) at /builds/darinf/mozilla/xpcom/io/nsDirectoryService.cpp:198
#12 0x02b5b278 in nsDirectoryService::GetFile(char const*, int*, nsIFile**) (this=0x4029f0, prop=0x2c213cc "XCurProcD", persistent=0xbffff828, _retval=0xbffff824) at /builds/darinf/mozilla/xpcom/io/nsDirectoryService.cpp:763
#13 0x02b5a784 in FindProviderFile(nsISupports*, void*) (aElement=0x4029f8, aData=0xbffff820) at /builds/darinf/mozilla/xpcom/io/nsDirectoryService.cpp:617
#14 0x02b5aae4 in nsDirectoryService::Get(char const*, nsID const&, void**) (this=0x4029f0, prop=0x2c213cc "XCurProcD", uuid=@0x2c7ffdc, result=0xbffff8b0) at /builds/darinf/mozilla/xpcom/io/nsDirectoryService.cpp:657
#15 0x02b1fd14 in NS_InitXPCOM2 (result=0xbffffa90, binDirectory=0x0, appFileLocationProvider=0x0) at /builds/darinf/mozilla/xpcom/build/nsXPComInit.cpp:490
#16 0x00009500 in NS_InitXPCOM2 (result=0xbffffa90, binDirectory=0x0, appFileLocationProvider=0x0) at /builds/darinf/mozilla/xpcom/glue/standalone/nsXPCOMGlue.cpp:170
#17 0x00008b48 in main () at /builds/darinf/mozilla/xpcom/sample/nsTestSample.cpp:71

I tried moving the call to CFBundleCopyExecutableURL earlier so as to minimize the amount of our code before this function is called. In fact, I was able to cut NS_InitXPCOM2 down to nothing more than the following:

{
  nsCOMPtr<nsIDirectoryService> ds;
  nsresult rv = nsDirectoryService::Create(nsnull,
                                           NS_GET_IID(nsIDirectoryService),
                                           getter_AddRefs(ds));
  ds->Init();
}

And, then I changed nsDirectoryService::Init to the following:

nsresult
nsDirectoryService::Init()
{
  nsresult rv;
  printf(">>> calling NS_NewISupportsArray...\n");
  rv = NS_NewISupportsArray(getter_AddRefs(mProviders));
  if (NS_FAILED(rv)) return rv;

  {
    CFBundleRef appBundle0 = CFBundleGetMainBundle();
    if (appBundle0 != nsnull) {
      CFURLRef bundleURL0 = CFBundleCopyExecutableURL(appBundle0);
    }
  }
  ...

This call to CFBundleCopyExecutableURL ends up crashing. I also removed the mHashtable member variable from nsDirectoryService to minimize action in the constructor. From the looks of things, it seems unlikely that our code is the one trashing memory.

This bug means that invoking the GRE via the XPCOM glue is horked on OSX.
regxpcom is also borked on OSX as a result of this bug.
Summary: nsTestSample results in "bus error" on OSX [crash @CFStringGetLength] → nsTestSample & regxpcom result in "bus error" on OSX [crash @CFStringGetLength]
what changed? This did work a couple months ago.
Darin, what version of the OS are you running? What I fear is that this comment: http://lxr.mozilla.org/seamonkey/source/xpcom/io/nsDirectoryService.cpp#194 may no longer be true for unbundled execs?
conrad: i've only tested 10.3... i should probably try other versions of OSX... but, this worked fine -- even on 10.3 -- in the older moz build.
Darin, can you try this after this line:

CFBundleRef appBundle0 = CFBundleGetMainBundle();

Assuming appBundle0 != NULL (can you confirm that?) you can dump the bundle to the console using:

(gdb) call (void)CFShow(appBundle0)

and see what it looks like.
(In reply to comment #5)
> Darin, can you try this after this line:
> CFBundleRef appBundle0 = CFBundleGetMainBundle();
> Assuming appBundle0 != NULL (can you confirm that?) you can dump the bundle to
> the console using:
> (gdb) call (void)CFShow(appBundle0)
> and see what it looks like.

hmm... looks like CFShow throws a SIGTRAP. i tried invoking it from GDB, and i also tried modifying the code. both result in a SIGTRAP. i'll try moving the call to CFBundleGetMainBundle earlier in NS_InitXPCOM2...
that didn't help :( any other suggestions?
regxpcom doesn't link with libxpcom. It loads it via glue. The problem with .dylibs on Mach-O is that they're either linked with the -dynamiclib flag (in which case they can be linked against) or -bundle (in which case they can be loaded manually). XPCOM glue depends on the latter, but libxpcom is not linked in such a way to allow that. I'm surprised it even loads. That CFBundle functions fail is probably just the beginning of problems.

> A trunk build from Nov 14, 2003 works fine

This is really surprising. I thought that longer ago than that, we were building libxpcom in both forms, but that had problems and caused bloat since the application without glue didn't need the -bundle form. I'd look to see if definitions of XPCOM_GLUE_NO_DYNAMIC_LOADING or XPCOM_GLUE have been changed for Mac in that time frame.
thanks for the clues Conrad.
OK, I have a partial fix for this coming up. We *can* dynamically link against .dylibs on OSX, but we have to use the low-level Mach-O function NSAddImage instead of the higher level CF* functions. I have a mostly-done patch that breaks the dependency of the XPCOM glue on NSPR. *However*: I can't get this to work reliably on OS10.1, because of the XPCOM dependencies on NSPR. 10.1 does not support the flag NSADDIMAGE_OPTION_MATCH_FILENAME_BY_INSTALLNAME, and otherwise the dynamic linker will try to find NSPR in all the wrong places, and die. So, we have a couple options: we can ship static seamonkey/firefox/whatevers for 10.1, and use a GRE/xpcom glue on 10.2+, or we can ditch the glue entirely on mac, at least until 10.1 is dead. Suggestions?
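For readers unfamiliar with what the glue has to do, here is a minimal sketch of binding to a library at run time by path rather than at link time. It is not taken from the patch: it uses the portable POSIX dlopen/dlsym calls as a stand-in for the Mach-O NSAddImage approach described above, and the helper name ResolveSymbol is hypothetical.

```cpp
#include <dlfcn.h>   // dlopen, dlsym, dlclose, dlerror (POSIX)
#include <cassert>
#include <cstdio>

// Hypothetical helper (not from the Mozilla tree).
// Opens the library at `path` (or the running program itself when `path`
// is null) and looks up `name` in it. Returns nullptr on failure. On
// success, *outHandle receives the handle that the caller must later pass
// to dlclose(); the returned pointer is only valid until then.
void* ResolveSymbol(const char* path, const char* name, void** outHandle)
{
    assert(name != nullptr && outHandle != nullptr);
    *outHandle = nullptr;

    // Bind everything up front so a missing dependency fails here,
    // not at the first call through the resolved pointer.
    void* handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (!handle) {
        const char* err = dlerror();
        std::fprintf(stderr, "dlopen failed: %s\n", err ? err : "(unknown)");
        return nullptr;
    }

    void* sym = dlsym(handle, name);
    if (!sym) {
        const char* err = dlerror();
        std::fprintf(stderr, "dlsym failed: %s\n", err ? err : "(unknown)");
        dlclose(handle);
        return nullptr;
    }

    *outHandle = handle;
    return sym;
}
```

A caller would pass the full path of the library plus an entry-point name, cast the result to the matching function-pointer type, and keep the handle open for as long as the pointer is in use. The install-name problem described above for 10.1 concerns how the dynamic linker locates the library's own dependencies (NSPR here), which this sketch does not address.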
Assignee: dougt → bsmedberg
Nothing much to add here, but at some point, someone's going to have to decide when to quit supporting Mac OS 10.1.x, and announce "Mozilla x.x is the last for Mac OS 10.1.x." It's been done already for the Classic Mac OS. Perhaps Mike Pinkerton could give some thoughts on this, possibly not here, but somewhere?
FWIW, there's also a patch on bug 202450 to build Gecko as an OS X framework, which may be preferable to using GRE and XPCOM glue. If there's interest in doing something like that to be shared between FireFox/TBird, comment there.
Is there any status update on this? I'm running into this problem also. Can the patch at least be posted so I can test it out? Thanks.
Depends on: 297923
Fixed by bug 297923
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED