Closed
Bug 297802
Opened 19 years ago
Closed 19 years ago
linker flags for shared libraries are set improperly on unix
Categories
(NSS :: Libraries, defect, P2)
Tracking
(Not tracked)
RESOLVED
FIXED
3.11
People
(Reporter: julien.pierre, Assigned: julien.pierre)
Details
Attachments
(1 file, 3 obsolete files)
2.66 KB,
patch
|
Details | Diff | Splinter Review |
Currently, the build for executables and shared libraries on Solaris does not check that all dependencies have been resolved. For example, if I add : extern void dummy_function(void); and a call to dummy_function() within libssl, the compile/relink works without complaining. It should. Currently, such errors for libraries are only found at runtime, when shlibsign gets to run. If the error occurs in certain executables, they may not be found at all. The coreconf rules need to be changed. It is probably the linker flags. I'm not sure what the correct flags are.
Comment 1•19 years ago
|
||
Taking bug. I have this fixed in my workarea for libSSL. The fix unfortunately is not (just) in coreconf. It requires linking with every system shared library that is needed to resolve any symbols used directly or indirectly by the shared library (that is, it requires that you list the shared libs on which your code depends, AND the shared libs upon which those shared libraries depend, in turn, requiring a full transitive closure of the call graph. The list of dependencies will vary by instruction set artchitecture, e.g. different for 32 bit than for 64, and different for x86_64 than sparc. So, it's not a small task to do for all of NSS.
Assignee: wtchang → nelson
Priority: -- → P2
Target Milestone: --- → 3.11
Version: unspecified → 3.10
Comment 2•19 years ago
|
||
Here's the patch I use for libSSL on Solaris 10.
Assignee | ||
Comment 3•19 years ago
|
||
Nelson, my testing shows that only the immediate dependencies need to be specified. We already have a list of shared libraries in coreconf called OS_LIBS which appears like it could be linked to every shared library safely. I'm not sure why we aren't already using that with -z defs . Wan-Teh, do you know ?
Assignee | ||
Comment 4•19 years ago
|
||
Attachment #186388 -
Attachment is obsolete: true
Comment 5•19 years ago
|
||
Comment on attachment 186400 [details] [diff] [review] working patch for all shared libs and executables in NSS on solaris Julien, I don't know why we aren't using -z defs in coreconf. NSPR uses -z defs. I seem to recall it was suggested by a Solaris engineer or Sun's shared library documentation. >Index: SunOS5.mk ... >+LDFLAGS += -z defs >+DSO_LDOPTS += -z defs -z defs is implied when linking an executable program. So we only need to add -z defs to DSO_LDOPTS. By the way, NSPR uses -z combreloc -z defs. So we may want to do the same in coreconf. >+EXTRA_SHARED_LIBS += $(OS_LIBS) Instead of this, we should add $(OS_LIBS) to the makefile rule for $(SHARED_LIBRARY) (at the end of the $(MKSHLIB) command line) in mozilla/security/coreconf/rules.mk. This will require testing on all platforms but is the proper way to fix this. Since $(OS_LIBS) is already used in the makefile rule for $(PROGRAM), I don't anticipate any problem.
Attachment #186400 -
Flags: review-
Assignee | ||
Comment 6•19 years ago
|
||
I found that the problem also exists if random symbols are added to nss.def . In this case, the linker will only generate a warning, and happily link libnss3.so . The error will only be found later, when linking addbuiltin, at which point there is an error that "random" is missing. I believe the solution is the same - to use -z defs and link with $(OS_LIBS). We should get this build issue resolved ASAP.
Assignee | ||
Updated•19 years ago
|
Assignee: nelson → julien.pierre.bugs
Assignee | ||
Comment 7•19 years ago
|
||
I found that AIX already has the proper option setup. Linux needs -z defs . I can't figure out the proper option for HP-UX at the moment. There seems to be a whole bunch of new -B option in 11.11i, like -B direct and -B immediate, which might work, but they aren't supported on the PA-RISC machine.
Assignee | ||
Comment 8•19 years ago
|
||
I found that the linker option for HP-UX is +noallowunsats . But it is not supported in 32-bit PA-RISC builds. The option gets ignored as follows : ld: (Warning) Ignoring +noallowunsats. The option is currently not supported in a 32-bit PA link. In a 64-bit PA-RISC build, the option works, but I get more than I bargained for. The following happens : ld -b +h libsoftokn3.sl +noallowunsats -c HP-UXB.11.11_64_DBG.OBJ/softokn.def -o HP-UXB.11.11_64_DBG.OBJ/libsoftokn3.sl HP-UXB.11.11_64_DBG.OBJ/dbinit.o HP-UXB.11.11_64_DBG.OBJ/dbmshim.o HP-UXB.11.11_64_DBG.OBJ/ecdecode.o HP-UXB.11.11_64_DBG.OBJ/fipstest.o HP-UXB.11.11_64_DBG.OBJ/fipstokn.o HP-UXB.11.11_64_DBG.OBJ/keydb.o HP-UXB.11.11_64_DBG.OBJ/lowcert.o HP-UXB.11.11_64_DBG.OBJ/lowkey.o HP-UXB.11.11_64_DBG.OBJ/lowpbe.o HP-UXB.11.11_64_DBG.OBJ/padbuf.o HP-UXB.11.11_64_DBG.OBJ/pcertdb.o HP-UXB.11.11_64_DBG.OBJ/pk11db.o HP-UXB.11.11_64_DBG.OBJ/pkcs11.o HP-UXB.11.11_64_DBG.OBJ/pkcs11c.o HP-UXB.11.11_64_DBG.OBJ/pkcs11u.o HP-UXB.11.11_64_DBG.OBJ/rsawrapr.o HP-UXB.11.11_64_DBG.OBJ/softkver.o HP-UXB.11.11_64_DBG.OBJ/tlsprf.o ../../../../dist/HP-UXB.11.11_64_DBG.OBJ/lib/libfreebl.a ../../../../dist/HP-UXB.11.11_64_DBG.OBJ/lib/libsecutil.a ../../../../dist/HP-UXB.11.11_64_DBG.OBJ/lib/libdbm.a -L../../../../dist/HP-UXB.11.11_64_DBG.OBJ/lib -lplc4 -lplds4 -lnspr4 -lpthread -lm -lrt ld: Unsatisfied symbol "_end" in file /usr/lib/pa20_64/libc.2 ld: Unsatisfied symbol "crap" in file HP-UXB.11.11_64_DBG.OBJ/rsawrapr.o ld: Unsatisfied symbol "main" in file /usr/lib/pa20_64/libc.2 3 errors. gmake: *** [HP-UXB.11.11_64_DBG.OBJ/libsoftokn3.sl] Error 1 "crap" is a symbol I referenced to test the behavior, so the option is working. However, the linker shouldn't need to resolve "_end" and "main" when linking a shared library. Is there an HP-UX expert we could ask ?
Comment 9•19 years ago
|
||
In general we should use the compiler as the front end to the linker when linking shared libraries or executable programs, because the compiler knows what C or C++ runtime libraries it needs to link with. This means in our makefiles we need to use $(CC) instead of $(LD) in the definition of MKSHLIB. This may resolve the unsatisfied symbols. Note that when you change $(LD) to $(CC), some of the linker options need to passed via a special flag -Wl like -Wl,-foo,-bar.
Assignee | ||
Comment 10•19 years ago
|
||
I have tested that this doesn't break any Unix platform. I still haven't found a working fix for HP-UX . If I do, I will attach a separate patch.
Attachment #186400 -
Attachment is obsolete: true
Attachment #193783 -
Flags: review?(wtchang)
Comment 11•19 years ago
|
||
Comment on attachment 193783 [details] [diff] [review] Working fix for Solaris and Linux r=wtc. Bob, Nelson, please review the following issue. Issue: this patch will cause the NSS shared libraries to depend on $(OS_LIBS). It is very likely that some of the system libraries in $(OS_LIBS) are not needed, partly because NSPR hides the system library dependency. (On Linux I can build NSS with -z defs and without $(OS_LIBS).) I think having unnecessary system library dependency is a minor problem. Those system libraries are still needed by libnspr4.so. They will just delay the loading of libnss3.so, etc. a little bit. Julien, please fix the following when you check in. In SunOS5.mk > # -KPIC generates position independent code for use in shared libraries. > # (Similarly for -fPIC in case of gcc.) > ifdef NS_USE_GCC > DSO_CFLAGS += -fPIC > else > DSO_CFLAGS += -KPIC > endif > > NOSUCHFILE = /solaris-rm-f-sucks >+DSO_LDOPTS += -z combreloc -z defs Please put DSO_LDOPTS right below the block for DSO_CFLAGS because these two DSO_xxx variables are related. We should also add $(OS_LIBS) to the OS/2 case in rules.mk.
Attachment #193783 -
Flags: superreview?(nelson)
Attachment #193783 -
Flags: review?(wtchang)
Attachment #193783 -
Flags: review?(rrelyea)
Attachment #193783 -
Flags: review+
Assignee | ||
Comment 12•19 years ago
|
||
Wan-Teh, There is no lazy loading on OS/2, just as on Windows, so there is no need to add the libraries for that case. If some libraries were missing, the linking of shared libraries would already be failing. On most other platforms, I believe the linker is smart enough to notice if the libraries aren't needed, so that unnecessary dependencies are not added to shared libraries unless a symbol from that library is needed by the target.
Assignee | ||
Comment 13•19 years ago
|
||
I experimented on Solaris and Windows with adding unneeded shared libraries to OS_LIBS . It turns out Solaris does not eliminate unneeded dependencies, at least by default, but Windows does, as shown by ldd and DUMPBIN /DEPENDENTS . The "-z ignore" option can be used to remove unnecessary dependencies on Solaris . The default is "-z record" ... This shrinks the ldd output quite a bit for libnss3 ! With -z record : libsoftokn3.so => (file not found) libplc4.so => (file not found) libplds4.so => (file not found) libnspr4.so => (file not found) libthread.so.1 => /lib/64/libthread.so.1 libnsl.so.1 => /lib/64/libnsl.so.1 libsocket.so.1 => /lib/64/libsocket.so.1 librt.so.1 => /lib/64/librt.so.1 libdl.so.1 => /lib/64/libdl.so.1 libc.so.1 => /lib/64/libc.so.1 libgcc_s.so.1 => /usr/sfw/lib/64/libgcc_s.so.1 libmp.so.2 => /lib/64/libmp.so.2 libmd5.so.1 => /lib/64/libmd5.so.1 libscf.so.1 => /lib/64/libscf.so.1 libaio.so.1 => /lib/64/libaio.so.1 libdoor.so.1 => /lib/64/libdoor.so.1 libuutil.so.1 => /lib/64/libuutil.so.1 libm.so.2 => /lib/64/libm.so.2 With -z ignore : libsoftokn3.so => (file not found) libplc4.so => (file not found) libplds4.so => (file not found) libnspr4.so => (file not found) libc.so.1 => /lib/64/libc.so.1 libgcc_s.so.1 => /usr/sfw/lib/64/libgcc_s.so.1 libm.so.2 => /lib/64/libm.so.2 Should we switch to -z ignore ?
Comment 14•19 years ago
|
||
Yes, using -z ignore is a good idea. Make sure it's supported on all the Solaris versions you care about. The GNU linker we use on Linux doesn't support -z ignore.
Assignee | ||
Comment 15•19 years ago
|
||
I changed my mind about OS/2 and added OS_LIBS to its rule for consistency. This change doesn't hurt anything because OS_LIBS is currently not set on OS/2 . Also, I think the OS/2 linker works the same as Windows and eliminated unnecessary dependencies by default. I added -z ignore to the Solaris linker options. Wan-Teh, I think the same should be done for the NSPR autoconf build . I'm amazed GNU ld doesn't support -z ignore and doesn't have any equivalent . Checking in Linux.mk; /cvsroot/mozilla/security/coreconf/Linux.mk,v <-- Linux.mk new revision: 1.25; previous revision: 1.24 done Checking in SunOS5.mk; /cvsroot/mozilla/security/coreconf/SunOS5.mk,v <-- SunOS5.mk new revision: 1.18; previous revision: 1.17 done Checking in rules.mk; /cvsroot/mozilla/security/coreconf/rules.mk,v <-- rules.mk new revision: 1.64; previous revision: 1.63
Attachment #193783 -
Attachment is obsolete: true
Assignee | ||
Comment 16•19 years ago
|
||
I forgot to mention that I tested -z ignore on Solaris 8, 9, and 10, and it works. It is in the ld man pages for all of them .
Assignee | ||
Comment 17•19 years ago
|
||
Re: HP-UX and comment #9, Using the compiler as the front-end for the linker doesn't help. I have tried the following test case. test.c : #include <stdlib.h> void test(void) { int i = atoi("1"); unsat(); } mkshlib (build script) : #!/bin/tcsh cc +DA2.0W +z -c test.c ld +noallowunsats -b -o test.sl test.o -lc cc -b -o test.sl test.o -Wl,+noallowunsats,-L,/usr/lib/pa20_64,-lc cc +DA2.0W -b -o test.sl test.c -Wl,+noallowunsats,-L,/usr/lib/pa20_64,-lc All three link methods (last 3 lines) produce the exact same errors about unsatisfied symbols unsat, main, and _end. ld: Unsatisfied symbol "unsat" in file test.o ld: Unsatisfied symbol "_end" in file /usr/lib/pa20_64/libc.sl ld: Unsatisfied symbol "main" in file /usr/lib/pa20_64/libc.sl 3 errors.
Assignee | ||
Comment 18•19 years ago
|
||
FYI, I tried the same test with an executable program . All 3 linking methods worked. What's interesting is that the compiler did not create and "_end" symbol in main.o . But _end ends up being present in the "main" executable, as shown by nm output : [120] | 9223372041149743528| 0|NOTYP|GLOB |0| ABS|_end Therefore, somehow the linker always generates this _end symbol when creating an executable; but not when creating a shared library. Seems like a bug to me. Or I'm missing some option in the shared library case. But which ? #!/bin/tcsh # the following works - compile and link in one step cc -o main main.c -lc +DA2.0W -Wl,+noallowunsats # the following succeeds - compile with cc, and link with ld cc +DA2.0W +z -c main.c ld +noallowunsats -o main main.o -lc # the following succeeds - compile with cc, and link with cc cc +DA2.0W -o main main.c -Wl,+noallowunsats,-L,/usr/lib/pa20_64,-lc
Summary: linker flags are set improperly on unix → linker flags for shared libraries are set improperly on unix
Comment 19•19 years ago
|
||
The -z ignore linker flag is less important for NSPR because NSPR has a custom OS_LIBS value for each of its three shared libraries.
Comment 20•19 years ago
|
||
Comment on attachment 193783 [details] [diff] [review] Working fix for Solaris and Linux assuming the use of -z ignore. We should make sure the smae option exists for Linux (or such an option is needed).
Attachment #193783 -
Flags: review?(rrelyea) → review+
Comment 21•19 years ago
|
||
The GNU linker used on Linux ignores the -z xxx options that it doesn't implement for Solaris compatibility. The GNU linker doesn't support -z ignore; it silently ignores it. It does support -z defs though. One thing we can do to address this problem is for coreconf to define OS_LIBS with the absolute miminum system libraries, and then do OS_LIBS += ... in each library's directory if additional system libraries are needed for that library. This is essentially what NSPR does.
Comment 22•19 years ago
|
||
Comment on attachment 193783 [details] [diff] [review] Working fix for Solaris and Linux (two reviews are enough :)
Attachment #193783 -
Flags: superreview?(nelson)
Assignee | ||
Comment 23•19 years ago
|
||
This fix was made primarily to help developers ensure that they are building complete libraries while making code changes - ie. to catch the error earlier than when linking an executable. Even though HP-UX isn't fixed, missing symbols in libraries still get found during the build on HP-UX in the executable link. I'm going to close this bug as fixed, despite HP-UX remaining unfixed, because nobody on the NSS team uses HP-UX as their primary development platform, and thus nobody is inconvenienced by the lack of fix on HP-UX. If that changes, we can reopen the bug.
Status: NEW → RESOLVED
Closed: 19 years ago
Resolution: --- → FIXED
Comment 24•19 years ago
|
||
While reviewing a patch for bug 303508, I found that we are using the +k +vshlibunsats flags on HP-UX. These flags may be what we are looking for. +k causes the linker to not produce the shared library if there were unresolved references. +vshlibunsats causes the linker to print a list of unsatisfied symbols used by the shared library.
Assignee | ||
Comment 25•19 years ago
|
||
Wan-Teh, I tried with +k +vshlibunsats . I was able to link a library with an unresolved symbol with this combination . +vshlibunsats caused me to get warnings about main and _end, but not about the actual unresolved symbol I was calling !
Comment 26•19 years ago
|
||
I learned about the (new?) --as-needed GNU linker flag today. I opened bug 311236 to look into using that on Linux and other platforms that use GNU ld.
You need to log in
before you can comment on or make changes to this bug.
Description
•