Closed Bug 297802 Opened 19 years ago Closed 19 years ago

linker flags for shared libraries are set improperly on unix

Categories

(NSS :: Libraries, defect, P2)

3.10
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: julien.pierre, Assigned: julien.pierre)

Details

Attachments

(1 file, 3 obsolete files)

Currently, the build for executables and shared libraries on Solaris does not
check that all dependencies have been resolved.

For example, if I add :
extern void dummy_function(void);

and a call to dummy_function() within libssl, the compile/relink works without
complaining. It should.
Currently, such errors for libraries are only found at runtime, when shlibsign
gets to run. If the error occurs in certain executables, they may not be found
at all.

The coreconf rules need to be changed. It is probably the linker flags. I'm not
sure what the correct flags are.
Taking bug.  I have this fixed in my workarea for libSSL.
The fix unfortunately is not (just) in coreconf.
It requires linking with every system shared library that is 
needed to resolve any symbols used directly or indirectly by the
shared library (that is, it requires that you list the shared libs
on which your code depends, AND the shared libs upon which those
shared libraries depend, in turn, requiring a full transitive closure
of the call graph.  The list of dependencies will vary by instruction
set artchitecture, e.g. different for 32 bit than for 64, and different 
for x86_64 than sparc.  So, it's not a small task to do for all of NSS.
Assignee: wtchang → nelson
Priority: -- → P2
Target Milestone: --- → 3.11
Version: unspecified → 3.10
Attached patch patch for solaris (obsolete) — Splinter Review
Here's the patch I use for libSSL on Solaris 10.
Nelson, my testing shows that only the immediate dependencies need to be specified.

We already have a list of shared libraries in coreconf called OS_LIBS which
appears like it could be linked to every shared library safely. I'm not sure why
we aren't already using that with -z defs . Wan-Teh, do you know ?
Attachment #186388 - Attachment is obsolete: true
Comment on attachment 186400 [details] [diff] [review]
working patch for all shared libs and executables in NSS on solaris

Julien, I don't know why we aren't using -z defs in coreconf.
NSPR uses -z defs.  I seem to recall it was suggested by a
Solaris engineer or Sun's shared library documentation.

>Index: SunOS5.mk
...
>+LDFLAGS += -z defs
>+DSO_LDOPTS += -z defs

-z defs is implied when linking an executable program.	So we only need
to add -z defs to DSO_LDOPTS.

By the way, NSPR uses -z combreloc -z defs.  So we may want to do
the same in coreconf.

>+EXTRA_SHARED_LIBS += $(OS_LIBS)

Instead of this, we should add $(OS_LIBS) to the makefile rule
for $(SHARED_LIBRARY) (at the end of the $(MKSHLIB) command line)
in mozilla/security/coreconf/rules.mk.	This will require testing
on all platforms but is the proper way to fix this.  Since $(OS_LIBS)
is already used in the makefile rule for $(PROGRAM), I don't anticipate
any problem.
Attachment #186400 - Flags: review-
I found that the problem also exists if random symbols are added to nss.def .
In this case, the linker will only generate a warning, and happily link
libnss3.so . The error will only be found later, when linking addbuiltin, at
which point there is an error that "random" is missing. I believe the solution
is the same - to use -z defs and link with $(OS_LIBS).
We should get this build issue resolved ASAP.
Assignee: nelson → julien.pierre.bugs
I found that AIX already has the proper option setup.
Linux needs -z defs .
I can't figure out the proper option for HP-UX at the moment. There seems to be
a whole bunch of new -B option in 11.11i, like -B direct and -B immediate, which
might work, but they aren't supported on the PA-RISC machine.
I found that the linker option for HP-UX is +noallowunsats . But it is not
supported in 32-bit PA-RISC builds. The option gets ignored as follows :

ld: (Warning) Ignoring +noallowunsats. The option is currently not supported in
a 32-bit PA link.

In a 64-bit PA-RISC build, the option works, but I get more than I bargained
for. The following  happens :


ld -b +h libsoftokn3.sl +noallowunsats -c HP-UXB.11.11_64_DBG.OBJ/softokn.def -o
HP-UXB.11.11_64_DBG.OBJ/libsoftokn3.sl HP-UXB.11.11_64_DBG.OBJ/dbinit.o
HP-UXB.11.11_64_DBG.OBJ/dbmshim.o HP-UXB.11.11_64_DBG.OBJ/ecdecode.o
HP-UXB.11.11_64_DBG.OBJ/fipstest.o HP-UXB.11.11_64_DBG.OBJ/fipstokn.o
HP-UXB.11.11_64_DBG.OBJ/keydb.o HP-UXB.11.11_64_DBG.OBJ/lowcert.o
HP-UXB.11.11_64_DBG.OBJ/lowkey.o HP-UXB.11.11_64_DBG.OBJ/lowpbe.o
HP-UXB.11.11_64_DBG.OBJ/padbuf.o HP-UXB.11.11_64_DBG.OBJ/pcertdb.o
HP-UXB.11.11_64_DBG.OBJ/pk11db.o HP-UXB.11.11_64_DBG.OBJ/pkcs11.o
HP-UXB.11.11_64_DBG.OBJ/pkcs11c.o HP-UXB.11.11_64_DBG.OBJ/pkcs11u.o
HP-UXB.11.11_64_DBG.OBJ/rsawrapr.o HP-UXB.11.11_64_DBG.OBJ/softkver.o
HP-UXB.11.11_64_DBG.OBJ/tlsprf.o  
../../../../dist/HP-UXB.11.11_64_DBG.OBJ/lib/libfreebl.a
../../../../dist/HP-UXB.11.11_64_DBG.OBJ/lib/libsecutil.a
../../../../dist/HP-UXB.11.11_64_DBG.OBJ/lib/libdbm.a 
-L../../../../dist/HP-UXB.11.11_64_DBG.OBJ/lib -lplc4 -lplds4 -lnspr4  -lpthread
-lm -lrt
ld: Unsatisfied symbol "_end" in file /usr/lib/pa20_64/libc.2
ld: Unsatisfied symbol "crap" in file HP-UXB.11.11_64_DBG.OBJ/rsawrapr.o
ld: Unsatisfied symbol "main" in file /usr/lib/pa20_64/libc.2
3 errors.
gmake: *** [HP-UXB.11.11_64_DBG.OBJ/libsoftokn3.sl] Error 1

"crap" is a symbol I referenced to test the behavior, so the option is working.
However, the linker shouldn't need to resolve "_end" and "main" when linking a
shared library. Is there an HP-UX expert we could ask ?
In general we should use the compiler as the front end to the
linker when linking shared libraries or executable programs,
because the compiler knows what C or C++ runtime libraries it
needs to link with.  This means in our makefiles we need to
use $(CC) instead of $(LD) in the definition of MKSHLIB.  This
may resolve the unsatisfied symbols.  Note that when you change
$(LD) to $(CC), some of the linker options need to passed via
a special flag -Wl like -Wl,-foo,-bar.
I have tested that this doesn't break any Unix platform.
I still haven't found a working fix for HP-UX . If I do, I will attach a
separate patch.
Attachment #186400 - Attachment is obsolete: true
Attachment #193783 - Flags: review?(wtchang)
Comment on attachment 193783 [details] [diff] [review]
Working fix for Solaris and Linux

r=wtc.	Bob, Nelson, please review the following issue.

Issue: this patch will cause the NSS shared libraries to
depend on $(OS_LIBS).  It is very likely that some of
the system libraries in $(OS_LIBS) are not needed, partly
because NSPR hides the system library dependency.  (On
Linux I can build NSS with -z defs and without $(OS_LIBS).)
I think having unnecessary system library dependency
is a minor problem.  Those system libraries are still
needed by libnspr4.so.	They will just delay the loading
of libnss3.so, etc. a little bit.

Julien, please fix the following when you check in.

In SunOS5.mk

> # -KPIC generates position independent code for use in shared libraries.
> # (Similarly for -fPIC in case of gcc.)
> ifdef NS_USE_GCC
> 	DSO_CFLAGS += -fPIC
> else
> 	DSO_CFLAGS += -KPIC
> endif
> 
> NOSUCHFILE   = /solaris-rm-f-sucks
>+DSO_LDOPTS += -z combreloc -z defs

Please put DSO_LDOPTS right below the block for DSO_CFLAGS
because these two DSO_xxx variables are related.

We should also add $(OS_LIBS) to the OS/2 case in rules.mk.
Attachment #193783 - Flags: superreview?(nelson)
Attachment #193783 - Flags: review?(wtchang)
Attachment #193783 - Flags: review?(rrelyea)
Attachment #193783 - Flags: review+
Wan-Teh,

There is no lazy loading on OS/2, just as on Windows, so there is no need to add
the libraries for that case. If some libraries were missing, the linking of
shared libraries would already be failing.

On most other platforms, I believe the linker is smart enough to notice if the
libraries aren't needed, so that unnecessary dependencies are not added to
shared libraries unless a symbol from that library is needed by the target.
I experimented on Solaris and Windows with adding unneeded shared libraries to
OS_LIBS . It turns out Solaris does not eliminate unneeded dependencies, at
least by default, but Windows does, as shown by ldd and DUMPBIN /DEPENDENTS .

The "-z ignore" option can be used to remove unnecessary dependencies on Solaris
. The default is "-z record" ... This shrinks the ldd output quite a bit for
libnss3 !

With -z record :

        libsoftokn3.so =>        (file not found)
        libplc4.so =>    (file not found)
        libplds4.so =>   (file not found)
        libnspr4.so =>   (file not found)
        libthread.so.1 =>        /lib/64/libthread.so.1
        libnsl.so.1 =>   /lib/64/libnsl.so.1
        libsocket.so.1 =>        /lib/64/libsocket.so.1
        librt.so.1 =>    /lib/64/librt.so.1
        libdl.so.1 =>    /lib/64/libdl.so.1
        libc.so.1 =>     /lib/64/libc.so.1
        libgcc_s.so.1 =>         /usr/sfw/lib/64/libgcc_s.so.1
        libmp.so.2 =>    /lib/64/libmp.so.2
        libmd5.so.1 =>   /lib/64/libmd5.so.1
        libscf.so.1 =>   /lib/64/libscf.so.1
        libaio.so.1 =>   /lib/64/libaio.so.1
        libdoor.so.1 =>  /lib/64/libdoor.so.1
        libuutil.so.1 =>         /lib/64/libuutil.so.1
        libm.so.2 =>     /lib/64/libm.so.2

With -z ignore :

        libsoftokn3.so =>        (file not found)
        libplc4.so =>    (file not found)
        libplds4.so =>   (file not found)
        libnspr4.so =>   (file not found)
        libc.so.1 =>     /lib/64/libc.so.1
        libgcc_s.so.1 =>         /usr/sfw/lib/64/libgcc_s.so.1
        libm.so.2 =>     /lib/64/libm.so.2

Should we switch to -z ignore ?
Yes, using -z ignore is a good idea.  Make sure
it's supported on all the Solaris versions you
care about.

The GNU linker we use on Linux doesn't support
-z ignore.
I changed my mind about OS/2 and added OS_LIBS to its rule for consistency.
This change doesn't hurt anything because OS_LIBS is currently not set on OS/2
. Also, I think the OS/2 linker works the same as Windows and eliminated
unnecessary dependencies by default.

I added -z ignore to the Solaris linker options. Wan-Teh, I think the same
should be done for the NSPR autoconf build .

I'm amazed GNU ld doesn't support -z ignore and doesn't have any equivalent .

Checking in Linux.mk;
/cvsroot/mozilla/security/coreconf/Linux.mk,v  <--  Linux.mk
new revision: 1.25; previous revision: 1.24
done
Checking in SunOS5.mk;
/cvsroot/mozilla/security/coreconf/SunOS5.mk,v	<--  SunOS5.mk
new revision: 1.18; previous revision: 1.17
done
Checking in rules.mk;
/cvsroot/mozilla/security/coreconf/rules.mk,v  <--  rules.mk
new revision: 1.64; previous revision: 1.63
Attachment #193783 - Attachment is obsolete: true
I forgot to mention that I tested -z ignore on Solaris 8, 9, and 10, and it
works. It is in the ld man pages for all of them .
Re: HP-UX and comment #9,

Using the compiler as the front-end for the linker doesn't help. I have tried
the following test case.

test.c :

#include <stdlib.h>

void test(void)
{
    int i = atoi("1");
    unsat();
}

mkshlib (build script) :

#!/bin/tcsh
cc +DA2.0W +z -c test.c
ld +noallowunsats -b -o test.sl test.o -lc
cc -b -o test.sl test.o -Wl,+noallowunsats,-L,/usr/lib/pa20_64,-lc
cc +DA2.0W -b -o test.sl test.c -Wl,+noallowunsats,-L,/usr/lib/pa20_64,-lc

All three link methods (last 3 lines) produce the exact same errors about
unsatisfied symbols unsat, main, and _end.

ld: Unsatisfied symbol "unsat" in file test.o
ld: Unsatisfied symbol "_end" in file /usr/lib/pa20_64/libc.sl
ld: Unsatisfied symbol "main" in file /usr/lib/pa20_64/libc.sl
3 errors.
FYI, I tried the same test with an executable program . All 3 linking methods
worked. What's interesting is that the compiler did not create and "_end" symbol
in main.o . But _end ends up being present in the "main" executable, as shown by
nm output :

[120]    |   9223372041149743528|       0|NOTYP|GLOB |0|     ABS|_end

Therefore, somehow the linker always generates this _end symbol when creating an
executable; but not when creating a shared library. Seems like a bug to me. Or
I'm missing some option in the shared library case. But which ?

#!/bin/tcsh

# the following works - compile and link in one step
cc -o main main.c -lc +DA2.0W -Wl,+noallowunsats

# the following succeeds  - compile with cc, and link with ld
cc +DA2.0W +z -c main.c
ld +noallowunsats -o main main.o -lc

# the following succeeds - compile with cc, and link with cc
cc +DA2.0W -o main main.c -Wl,+noallowunsats,-L,/usr/lib/pa20_64,-lc
Summary: linker flags are set improperly on unix → linker flags for shared libraries are set improperly on unix
The -z ignore linker flag is less important for
NSPR because NSPR has a custom OS_LIBS value for
each of its three shared libraries.
Comment on attachment 193783 [details] [diff] [review]
Working fix for Solaris and Linux

assuming the use of -z ignore. We should make sure the smae option exists for
Linux (or such an option is needed).
Attachment #193783 - Flags: review?(rrelyea) → review+
The GNU linker used on Linux ignores the -z xxx options
that it doesn't implement for Solaris compatibility.

The GNU linker doesn't support -z ignore; it silently
ignores it.  It does support -z defs though.

One thing we can do to address this problem is for
coreconf to define OS_LIBS with the absolute miminum
system libraries, and then do OS_LIBS += ... in each
library's directory if additional system libraries are
needed for that library.  This is essentially what NSPR
does.
Comment on attachment 193783 [details] [diff] [review]
Working fix for Solaris and Linux

(two reviews are enough :)
Attachment #193783 - Flags: superreview?(nelson)
This fix was made primarily to help developers ensure that they are building
complete libraries while making code changes - ie. to catch the error earlier
than when linking an executable. Even though HP-UX isn't fixed, missing symbols
in libraries still get found during the build on HP-UX in the executable link.
I'm going to close this bug as fixed, despite HP-UX remaining unfixed, because
nobody on the NSS team uses HP-UX as their primary development platform, and
thus nobody is inconvenienced by the lack of fix on HP-UX. If that changes, we
can reopen the bug.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
While reviewing a patch for bug 303508, I found that
we are using the +k +vshlibunsats flags on HP-UX.
These flags may be what we are looking for.

+k causes the linker to not produce the shared library
if there were unresolved references.

+vshlibunsats causes the linker to print a list of
unsatisfied symbols used by the shared library.
Wan-Teh,

I tried with +k +vshlibunsats . I was able to link a library with an unresolved
symbol with this combination .

+vshlibunsats caused me to get warnings about main and _end, but not about the
actual unresolved symbol I was calling !
I learned about the (new?) --as-needed GNU linker flag today.
I opened bug 311236 to look into using that on Linux and other
platforms that use GNU ld.
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: