Closed Bug 234020 Opened 21 years ago Closed 19 years ago

nsTestSample & regxpcom result in "bus error" on OSX [crash @CFStringGetLength]

Product/Component: Core :: XPCOM
Type: defect
Platform: PowerPC, macOS
Priority: Not set
Severity: normal
Status: RESOLVED FIXED
Reporter: darin.moz
Assigned: benjamin
Keywords: crash, regression

nsTestSample results in a "bus error" on OSX.  This seems like some kind of
regression.  A trunk build from Nov 14, 2003 works fine, but a trunk build from
Feb 12, 2004 fails.  I first noticed this on the STRING_20040119_BRANCH, so that
helps narrow the interval in which this regression was introduced.
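
The narrowing described above can be continued mechanically. The following is only a sketch, assuming an ordered list of builds in which the first entry is known good, the last is known bad, and every build after the first failing one also fails. `FirstBadBuild` and the `crashes` callback are hypothetical names, not part of any Mozilla tooling.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Returns the index of the first failing build in `builds`.
// Assumptions (not checked here): builds.size() >= 2, builds.front() works,
// builds.back() crashes, and once a build crashes all later builds crash too.
std::size_t FirstBadBuild(
    const std::vector<std::string>& builds,
    const std::function<bool(const std::string&)>& crashes) {
    std::size_t good = 0;                 // last build known to work
    std::size_t bad = builds.size() - 1;  // earliest build known to crash
    while (bad - good > 1) {
        const std::size_t mid = good + (bad - good) / 2;
        if (crashes(builds[mid])) {
            bad = mid;   // regression is at or before mid
        } else {
            good = mid;  // regression is after mid
        }
    }
    return bad;
}
```

In this bug the known-good end would be the Nov 14, 2003 trunk build and the known-bad end the point where STRING_20040119_BRANCH was cut; the callback would build and run nsTestSample for each candidate.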

I also noticed some link time warnings about prebinding.  Have we changed the
build significantly on OSX recently?  Could prebinding be related?

Here's the stack trace:

Program received signal EXC_BAD_ACCESS, Could not access memory.
0x00000000 in ?? ()
(gdb) bt
#0  0x00000000 in ?? ()
Cannot access memory at address 0x0
Cannot access memory at address 0x0
#1  0x90196aac in CFStringGetLength ()
#2  0x901b2e84 in _resolveFileSystemPaths ()
#3  0x901c8c6c in _CFURLCopyAbsoluteFileURL ()
#4  0x901aecfc in CFURLCopyAbsoluteURL ()
#5  0x901ae458 in _CFURLInit ()
#6  0x901b32d4 in _CFURLInitWithString ()
#7  0x901bcdfc in CFURLCreateWithString ()
#8  0x901be560 in _CFBundleCopyInfoDictionaryInDirectoryWithVersion ()
#9  0x901c8484 in CFBundleGetInfoDictionary ()
#10 0x901ccb10 in _CFBundleCopyExecutableURLInDirectoryWithAllocator ()
#11 0x02b59a5c in nsDirectoryService::GetCurrentProcessDirectory(nsILocalFile**) (this=Cannot access memory at address 0x4f9
) at /builds/darinf/mozilla/xpcom/io/nsDirectoryService.cpp:198
#12 0x02b5b278 in nsDirectoryService::GetFile(char const*, int*, nsIFile**) (this=0x4029f0, prop=0x2c213cc "XCurProcD", persistent=0xbffff828, _retval=0xbffff824) at /builds/darinf/mozilla/xpcom/io/nsDirectoryService.cpp:763
#13 0x02b5a784 in FindProviderFile(nsISupports*, void*) (aElement=0x4029f8, aData=0xbffff820) at /builds/darinf/mozilla/xpcom/io/nsDirectoryService.cpp:617
#14 0x02b5aae4 in nsDirectoryService::Get(char const*, nsID const&, void**) (this=0x4029f0, prop=0x2c213cc "XCurProcD", uuid=@0x2c7ffdc, result=0xbffff8b0) at /builds/darinf/mozilla/xpcom/io/nsDirectoryService.cpp:657
#15 0x02b1fd14 in NS_InitXPCOM2 (result=0xbffffa90, binDirectory=0x0, appFileLocationProvider=0x0) at /builds/darinf/mozilla/xpcom/build/nsXPComInit.cpp:490
#16 0x00009500 in NS_InitXPCOM2 (result=0xbffffa90, binDirectory=0x0, appFileLocationProvider=0x0) at /builds/darinf/mozilla/xpcom/glue/standalone/nsXPCOMGlue.cpp:170
#17 0x00008b48 in main () at /builds/darinf/mozilla/xpcom/sample/nsTestSample.cpp:71

I tried moving the call to CFBundleCopyExecutableURL earlier so as to minimize
the amount of our code before this function is called.  In fact, I was able to
cut NS_InitXPCOM2 down to nothing more than the following:

    {
        nsCOMPtr<nsIDirectoryService> ds;
        nsresult rv = nsDirectoryService::Create(nsnull,
                                    NS_GET_IID(nsIDirectoryService),
                                    getter_AddRefs(ds));
        ds->Init();
    }

And, then I changed nsDirectoryService::Init to the following:

    nsresult 
    nsDirectoryService::Init()
    {
        nsresult rv;
        printf(">>> calling NS_NewISupportsArray...\n");
        rv = NS_NewISupportsArray(getter_AddRefs(mProviders));
        if (NS_FAILED(rv)) return rv;
        {  
            CFBundleRef appBundle0 = CFBundleGetMainBundle();
            if (appBundle0 != nsnull)
            {
                CFURLRef bundleURL0 = CFBundleCopyExecutableURL(appBundle0);
            }
        }
    ...

This call to CFBundleCopyExecutableURL ends up crashing.  I also removed the
mHashtable member variable from nsDirectoryService to minimize action in the
constructor.

From the looks of things, it seems unlikely that our code is the one trashing
memory.

This bug means that invoking the GRE via the XPCOM glue is horked on OSX.

regxpcom is also borked on OSX as a result of this bug.

Summary: nsTestSample results in "bus error" on OSX [crash @CFStringGetLength] → nsTestSample & regxpcom result in "bus error" on OSX [crash @CFStringGetLength]

What changed?  This did work a couple of months ago.

Darin, what version of the OS are you running? What I fear is that this comment:
http://lxr.mozilla.org/seamonkey/source/xpcom/io/nsDirectoryService.cpp#194
may no longer be true for unbundled execs?

conrad: i've only tested 10.3... i should probably try other versions of OSX...
but, this worked fine -- even on 10.3 -- in the older moz build.

Darin, can you try this after this line:
CFBundleRef appBundle0 = CFBundleGetMainBundle();
Assuming appBundle0 != NULL (can you confirm that?) you can dump the bundle to
the console using:
(gdb) call (void)CFShow(appBundle0)
and see what it looks like.

(In reply to comment #5)
> Darin, can you try this after this line:
> CFBundleRef appBundle0 = CFBundleGetMainBundle();
> Assuming appBundle0 != NULL (can you confirm that?) you can dump the bundle to
> the console using:
> (gdb) call (void)CFShow(appBundle0)
> and see what it looks like.

hmm... looks like CFShow throws a SIGTRAP.  i tried invoking it from GDB, and i
also tried modifying the code.  both result in a SIGTRAP.  i'll try moving the
call to CFBundleGetMainBundle earlier in NS_InitXPCOM2...

that didn't help :(

any other suggestions?

regxpcom doesn't link with libxpcom. It loads it via the glue. The problem with
.dylibs on Mach-O is that they're either linked with the -dynamiclib flag (in
which case they can be linked against) or with -bundle (in which case they can
be loaded manually). The XPCOM glue depends on the latter, but libxpcom is not
linked in such a way as to allow that. I'm surprised it even loads. That the
CFBundle functions fail is probably just the beginning of the problems.

> A trunk build from Nov 14, 2003 works fine
This is really surprising. I thought that longer ago than that we were building
libxpcom in both forms, but that had problems and caused bloat, since the
application without glue didn't need the -bundle form. I'd look to see whether
the definitions of XPCOM_GLUE_NO_DYNAMIC_LOADING or XPCOM_GLUE have been changed
for Mac in that time frame.

thanks for the clues Conrad.

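
For readers unfamiliar with the distinction drawn above between linking against a library and loading it manually, here is a minimal sketch of run-time loading. It uses the POSIX `dlopen`/`dlsym` calls purely as a stand-in; it is not the glue's actual code, and it does not use the Mach-O-specific APIs discussed in this bug. The helper name `LoadEntryPoint` is hypothetical.

```cpp
#include <dlfcn.h>

#include <cstdio>

// Opens the library at `path` at run time and looks up `symbol` in it.
// Returns nullptr if either step fails. On success the library handle is
// intentionally left open so that the returned address stays valid.
void* LoadEntryPoint(const char* path, const char* symbol) {
    void* handle = dlopen(path, RTLD_NOW | RTLD_LOCAL);
    if (handle == nullptr) {
        const char* err = dlerror();
        std::fprintf(stderr, "dlopen failed: %s\n", err ? err : "unknown error");
        return nullptr;
    }
    void* address = dlsym(handle, symbol);
    if (address == nullptr) {
        const char* err = dlerror();
        std::fprintf(stderr, "dlsym failed: %s\n", err ? err : "unknown error");
        dlclose(handle);
        return nullptr;
    }
    return address;
}
```

A caller would cast the returned address to the function-pointer type it expects before invoking it. The executable has no link-time dependency on the library, so a missing or incompatible library shows up as a failed load rather than a failure to launch.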
OK, I have a partial fix for this coming up. We *can* dynamically link against
.dylibs on OSX, but we have to use the low-level Mach-O function NSAddImage
instead of the higher-level CF* functions.  I have a mostly-done patch that
breaks the dependency of the XPCOM glue on NSPR.

*However*: I can't get this to work reliably on OS X 10.1, because of the XPCOM
dependencies on NSPR. 10.1 does not support the flag
NSADDIMAGE_OPTION_MATCH_FILENAME_BY_INSTALLNAME, and without it the dynamic
linker will try to find NSPR in all the wrong places, and die.

So, we have a couple of options: we can ship static seamonkey/firefox/whatevers
for 10.1 and use a GRE/xpcom glue on 10.2+, or we can ditch the glue entirely on
mac, at least until 10.1 is dead. Suggestions?

Assignee: dougt → bsmedberg

Nothing much to add here, but at some point, someone's going to have to decide
when to quit supporting Mac OS 10.1.x, and announce "Mozilla x.x is the last for
Mac OS 10.1.x."

It's been done already for the Classic Mac OS.  Perhaps Mike Pinkerton could
give some thoughts on this, possibly not here, but somewhere?

FWIW, there's also a patch on bug 202450 to build Gecko as an OS X framework,
which may be preferable to using GRE and XPCOM glue. If there's interest in
doing something like that to be shared between Firefox/TBird, comment there.

Is there any status update on this?  I'm running into this problem also.  Can
the patch at least be posted so I can test it out?  Thanks.

Depends on: 297923

Fixed by bug 297923

Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED