Last Comment Bug 325260 - XPCOMGlueLoadXULFunctions should search library's symbol table
: XPCOMGlueLoadXULFunctions should search library's symbol table
Status: RESOLVED FIXED
[nvn-dl]
: fixed1.8.0.2, fixed1.8.1
Product: Core
Classification: Components
Component: XPCOM (show other bugs)
: Trunk
: PowerPC Mac OS X
: -- normal (vote)
: ---
Assigned To: jhp (no longer active)
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2006-01-30 12:39 PST by jhp (no longer active)
Modified: 2006-03-01 13:55 PST (History)
3 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch (2.65 KB, patch)
2006-01-30 13:34 PST, jhp (no longer active)
no flags Details | Diff | Splinter Review
patch v1.1 (3.61 KB, patch)
2006-01-30 13:48 PST, jhp (no longer active)
mark: superreview-
Details | Diff | Splinter Review
patch v1.2 (4.01 KB, patch)
2006-01-31 09:24 PST, jhp (no longer active)
mark: review+
benjamin: review-
Details | Diff | Splinter Review
patch v1.3 (6.73 KB, patch)
2006-01-31 12:42 PST, jhp (no longer active)
benjamin: review+
mark: superreview+
benjamin: approval‑branch‑1.8.1+
benjamin: approval1.8.0.2+
Details | Diff | Splinter Review

Description jhp (no longer active) 2006-01-30 12:39:34 PST
|XPCOMGlueLoadXULFunctions| on Mac OS X currently uses |NSLookupAndBindSymbol|, which searches for a symbol in the global symbol table (across all loaded libraries).  When using JavaXPCOM to embed XULRunner in Java, I get an error where |XPCOM_NATIVE(getComponentManager)| in nsJavaXPCOMGlue.cpp keeps calling itself, rather than calling the similarly named function in the XUL library.  This isn't a problem on other platforms, since they only look for the symbols in the given library (libXUL).  For Mac OS X, we should at least be using |NSLookupAndBindSymbolWithHint|, or better yet, |NSLookupSymbolInImage|.

Not sure why I didn't see this earlier; I'm guessing that in those cases, |NSLookupAndBindSymbol| was pulling the right symbol from the global symbol table, but today it's getting the wrong one.
Comment 1 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2006-01-30 12:46:23 PST
Are the functions the same name? We typically make sure that the functions are named differently (e.g. NS_InitXPCOM2 in libxpcom.dylib forwards to NS_InitXPCOM_P in XUL). Though the InImage variant might also solve this issue.
Comment 2 jhp (no longer active) 2006-01-30 12:53:07 PST
Yes, in my case, the functions are named the same.  Didn't think that would be a problem.  However, even if I did fix this, I still think we should fix this to only look in the XUL library, to prevent this issue from happening to others.  Working on a patch now.
Comment 3 jhp (no longer active) 2006-01-30 13:34:24 PST
Created attachment 210172 [details] [diff] [review]
patch
Comment 4 jhp (no longer active) 2006-01-30 13:36:05 PST
Hmm, actually, we should probably be looking for |NS_GetFrozenFunctions| in the same way, right?
Comment 5 jhp (no longer active) 2006-01-30 13:48:03 PST
Created attachment 210174 [details] [diff] [review]
patch v1.1

According to the docs, this should be faster, since we are only looking for the images in the appropriate libs, rather than looking in the global symbol table.
Comment 6 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2006-01-30 13:50:59 PST
Comment on attachment 210174 [details] [diff] [review]
patch v1.1

Mento I'd like you to look at this also.
Comment 7 Mark Mentovai 2006-01-30 15:18:39 PST
Comment on attachment 210174 [details] [diff] [review]
patch v1.1

>Index: nsGlueLinkingOSX.cpp
>===================================================================
>RCS file: /cvsroot/mozilla/xpcom/glue/standalone/nsGlueLinkingOSX.cpp,v
>retrieving revision 1.1.2.1
>diff -u -6 -p -r1.1.2.1 nsGlueLinkingOSX.cpp
>--- nsGlueLinkingOSX.cpp	9 Jan 2006 19:03:09 -0000	1.1.2.1
>+++ nsGlueLinkingOSX.cpp	30 Jan 2006 21:46:05 -0000
>@@ -42,70 +42,84 @@
> #include <mach-o/dyld.h>
> #include <sys/param.h>
> #include <stdlib.h>
> #include <string.h>
> #include <stdio.h>
> 
>+static const mach_header* sXULLibImage;

Initialize?

> GetFrozenFunctionsFunc
> XPCOMGlueLoad(const char *xpcomFile)
> {
>+    const mach_header* lib = nsnull;
>+
>     if (xpcomFile[0] != '.' || xpcomFile[1] != '\0') {
>         char xpcomDir[PATH_MAX];
>         if (realpath(xpcomFile, xpcomDir)) {
>             char *lastSlash = strrchr(xpcomDir, '/');
>             if (lastSlash) {
>                 *lastSlash = '\0';
> 
>                 XPCOMGlueLoadDependentLibs(xpcomDir, ReadDependentCB);
>+
>+                sprintf(lastSlash, "/" XUL_DLL);

Buffer overflow waiting to happen?

>+                sXULLibImage = NSAddImage(xpcomDir,
>+                              NSADDIMAGE_OPTION_RETURN_ON_ERROR |
>+                              NSADDIMAGE_OPTION_WITH_SEARCHING |
>+                              NSADDIMAGE_OPTION_MATCH_FILENAME_BY_INSTALLNAME);

Return nsnull if !sXULLibImage - or explain if you had something else in mind.

Is XPCOMGlueLoad called more than once per process?

>             }
>         }
> 
>-        (void) NSAddImage(xpcomFile,
>-                          NSADDIMAGE_OPTION_RETURN_ON_ERROR |
>-                          NSADDIMAGE_OPTION_WITH_SEARCHING |
>-                          NSADDIMAGE_OPTION_MATCH_FILENAME_BY_INSTALLNAME);
>-        // We don't really care if this fails, as long as we can get
>-        // NS_GetFrozenFunctions below.
>+        lib = NSAddImage(xpcomFile,
>+                         NSADDIMAGE_OPTION_RETURN_ON_ERROR |
>+                         NSADDIMAGE_OPTION_WITH_SEARCHING |
>+                         NSADDIMAGE_OPTION_MATCH_FILENAME_BY_INSTALLNAME);

You can return nsnull here if !lib rather than trying to pass it to NSLookupSymbolInImage below.

>     }
> 
>-    if (!NSIsSymbolNameDefined("_NS_GetFrozenFunctions"))
>-      return nsnull;
>+    NSSymbol sym = NSLookupSymbolInImage(lib, "_NS_GetFrozenFunctions",
>+                                 NSLOOKUPSYMBOLINIMAGE_OPTION_BIND |
>+                                 NSLOOKUPSYMBOLINIMAGE_OPTION_RETURN_ON_ERROR);
>+    if (!sym)
>+        return nsnull;
> 
>-    NSSymbol sym = NSLookupAndBindSymbol("_NS_GetFrozenFunctions");
>     return (GetFrozenFunctionsFunc) NSAddressOfSymbol(sym);
> }
> 
> void
> XPCOMGlueUnload()
> {
>   // nothing to do, since we cannot unload dylibs on OS X
> }
> 
> nsresult
> XPCOMGlueLoadXULFunctions(const nsDynamicFunctionLoad *symbols)
> {
>+    // We don't null-check sXULLibHandle because this might work even
>+    // if it is null

Wait.  What?  Do you mean sXULLibImage?  Are you working with flat namespaces?

>     nsresult rv = NS_OK;
>     while (symbols->functionName) {
>         char buffer[512];
>         snprintf(buffer, sizeof(buffer), "_%s", symbols->functionName);
> 
>-        if (!NSIsSymbolNameDefined(buffer)) {
>-            rv = NS_ERROR_LOSS_OF_SIGNIFICANT_DATA;
>-        }
>-        else {
>-            NSSymbol sym = NSLookupAndBindSymbol(buffer);
>+        NSSymbol sym = NSLookupSymbolInImage(sXULLibImage, buffer,
>+                                 NSLOOKUPSYMBOLINIMAGE_OPTION_BIND |
>+                                 NSLOOKUPSYMBOLINIMAGE_OPTION_RETURN_ON_ERROR);
>+        if (sym)
>             *symbols->function = (NSFuncPtr) NSAddressOfSymbol(sym);
>-        }
>+        if (!*symbols->function)
>+            rv = NS_ERROR_LOSS_OF_SIGNIFICANT_DATA;

This seems weird to me.  Are you expecting *symbols->function to be set to something meaningful (like null) even before calling NSLookupSymbolInImage?
Comment 8 jhp (no longer active) 2006-01-30 20:42:02 PST
(In reply to comment #7)
> You can return nsnull here if !lib rather than trying to pass it to
> NSLookupSymbolInImage below.
> Wait.  What?  Do you mean sXULLibImage?  Are you working with flat namespaces?

Turns out I was confused about the documentation for |NSLookupSymbolInImage|.  I thought if the first argument was null, it would handle it like |dlsym| does (search global symbol table).  I'll fix this, and the rest of the issues, tomorrow.
Comment 9 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2006-01-31 05:50:26 PST
> >+static const mach_header* sXULLibImage;
> 
> Initialize?

statics are auto-initialized to 0, and per brendan it's good form to not specify the initializer explicitly.

> Return nsnull if !sXULLibImage - or explain if you had something else in mind.

Yeah, we definitely *don't* want to fail if !sXULLibImage, but continue gracefully.

> Is XPCOMGlueLoad called more than once per process?

Well, not normally, but it's possible that multiple sharedlibs which statically link libpcomglue.a could call it independently (and would have separate statics).

> You can return nsnull here if !lib rather than trying to pass it to
> NSLookupSymbolInImage below.

We need to make sure that the "xpcomLib" argument passed can be null or "."; that is an old use-case of the glue which must be preserved. See http://lxr.mozilla.org/mozilla/source/xpcom/glue/standalone/nsGlueLinkingDlopen.cpp#108

In this case we should fallback to getting NS_GetFrozenFunctions using the flat namespace.
Comment 10 Mark Mentovai 2006-01-31 06:30:11 PST
OK, in that case, you'll need to call NSLookupSymbolInImage if you have a non-null struct mach_header in hand and NSLookupAndBindSymbol otherwise.  NSLookupAndBindSymbolWithHint is another option, but I'd prefer to avoid it.  The search through the entire symbol table is expensive, so I'd prefer not doing it if possible when you know what library you want the symbol from.

Unfortunately, the dlopen family isn't an option because it doesn't work before 10.3.

NSLookupSymbolInImage only searches globally when a flat namespace is in use.  Flat isn't the default and should be avoided, it really only exists for compatibility with older code that depends on it.
Comment 11 jhp (no longer active) 2006-01-31 09:24:18 PST
Created attachment 210247 [details] [diff] [review]
patch v1.2
Comment 12 Mark Mentovai 2006-01-31 11:18:52 PST
Comment on attachment 210247 [details] [diff] [review]
patch v1.2

>+
>+                snprintf(lastSlash, PATH_MAX - strlen(xpcomDir),"/" XUL_DLL);
----->                                               nit: space ^^

Would you mind fixing the dlopen and other implementations here too?

>+    NSSymbol sym;
>+    if (lib) {
>+        sym = NSLookupSymbolInImage(lib, "_NS_GetFrozenFunctions",
>+                                 NSLOOKUPSYMBOLINIMAGE_OPTION_BIND |
>+                                 NSLOOKUPSYMBOLINIMAGE_OPTION_RETURN_ON_ERROR);

Comparing this to the dlopen implementation, I think you should set sXULLibImage = nsnull here, or call XPCOMGlueUnload() and augment that to set the static for you.  This probably doesn't make a difference for well-behaved callers, but that might be my ignorance of libxpcom_glue speaking.

>+        NSSymbol sym;
>+        if (sXULLibImage) {
>+            sym = NSLookupSymbolInImage(sXULLibImage, buffer,
>+                                 NSLOOKUPSYMBOLINIMAGE_OPTION_BIND |
>+                                 NSLOOKUPSYMBOLINIMAGE_OPTION_RETURN_ON_ERROR);
>+        } else {
>+            sym = NSLookupAndBindSymbol(buffer);
>         }

Since you do this twice now, I think this would be better as a static function, letting the compiler do the inlining.

r=me with those changes, over to Benjamin.
Comment 13 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2006-01-31 11:49:32 PST
> Comparing this to the dlopen implementation, I think you should set
> sXULLibImage = nsnull here, or call XPCOMGlueUnload() and augment that to set

This isn't necessary (on mac) because .dylibs cannot be unloaded, so we don't have to worry about "state" changes in XPCOMGlueUnload.

Comment 14 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2006-01-31 11:55:10 PST
Comment on attachment 210247 [details] [diff] [review]
patch v1.2

>-    if (!NSIsSymbolNameDefined("_NS_GetFrozenFunctions"))
>-      return nsnull;

You removed this hunk without adding it back below. You need it, because NSLookupAndBindSymbol traps if the symbol is not found.

>+    // the global symbol table.  If we couldn't get a mach_header pointer
>+    // for the XPCOM dylib, then use |NSLookupAndBindSymbol| to search the
>+    // global symbol table (this shouldn't normally happen).

This is not quite true. It is legal to call XPCOMGlueStartup(".") and rely on libxpcom.dylib already being loaded.

>+        if (sXULLibImage) {

You don't need to do flat-namespace lookup here... if we haven't found the XUL lib, you can just hard-fail (return NS_ERROR_FAILURE). If you want to keep flat-namespace lookup you need the same NSIsSymbolNameDefined code as above.
Comment 15 Mark Mentovai 2006-01-31 12:08:29 PST
> > Comparing this to the dlopen implementation, I think you should set
> > sXULLibImage = nsnull here, or call XPCOMGlueUnload() and augment that to set
> 
> This isn't necessary (on mac) because .dylibs cannot be unloaded, so we don't
> have to worry about "state" changes in XPCOMGlueUnload.

You're right, I was reading that snippet as plucking a symbol from sXULLibImage instead of lib.

>>-    if (!NSIsSymbolNameDefined("_NS_GetFrozenFunctions"))
>>-      return nsnull;
>
>You removed this hunk without adding it back below. You need it, because
>NSLookupAndBindSymbol traps if the symbol is not found.

I know that's the documented behavior, but when I tested it, I found that NSLookupAndBindSymbol doesn't trap on failure as advertised, but instead returns null.  This is on 10.4.4 ppc and x86, your mileage may vary.
Comment 16 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2006-01-31 12:17:19 PST
I can guarantee that it trapped when I originally wrote the code on 10.3.something.
Comment 17 jhp (no longer active) 2006-01-31 12:42:52 PST
Created attachment 210260 [details] [diff] [review]
patch v1.3

|NSLookupAndBindSymbol| also doesn't trap for me (10.4.4PPC), but better safe than sorry (for older OSes).  Actually, I don't like also using |NSIsSymbolNameDefined|, because that means it looks through the global symbol table twice.  However, based on an internet search, it looks like |NSInstallLinkEditErrorHandlers| is broken in 10.4.
Comment 18 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2006-01-31 12:48:45 PST
Comment on attachment 210260 [details] [diff] [review]
patch v1.3

Be careful with nsGlueLinkingWin, you will probably need to #define snprintf _snprintf (or just use _snprintf directly).
Comment 19 Mark Mentovai 2006-01-31 15:52:47 PST
Comment on attachment 210260 [details] [diff] [review]
patch v1.3

r=me with _snprintf
Comment 20 jhp (no longer active) 2006-02-01 08:52:17 PST
Checked in to trunk.
Comment 21 jhp (no longer active) 2006-02-01 11:31:11 PST
Comment on attachment 210260 [details] [diff] [review]
patch v1.3

Trunk tinderboxen are all green, so asking for approval on the branches.
Comment 22 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2006-02-06 08:54:04 PST
Comment on attachment 210260 [details] [diff] [review]
patch v1.3

Recommending approval1.8.0.2 based on the fact that this is standalone-glue-only and therefore won't affect ffox/tbird, seamonkey doesn't use the standalone glue on mac, and this is verified on trunk, and fixes a JavaXPCOM embedding bug on mac.
Comment 23 Daniel Veditz [:dveditz] 2006-02-22 11:59:57 PST
Comment on attachment 210260 [details] [diff] [review]
patch v1.3

approved for 1.8.0 branch, a=dveditz for drivers
Comment 24 Benjamin Smedberg AWAY UNTIL 2-AUG-2016 [:bsmedberg] 2006-02-22 12:20:53 PST
Comment on attachment 210260 [details] [diff] [review]
patch v1.3

Correcting dveditz typo.
Comment 25 jhp (no longer active) 2006-02-22 15:03:19 PST
Checked in to both branches.
Comment 26 Dave Liebreich [:davel] 2006-03-01 13:55:33 PST
marking [nvn-dl], which removes this bug from the "to be verified by QA" list
for Firefox 1.5.0.2.

Note You need to log in before you can comment on or make changes to this bug.