Open Bug 1213698 Opened 9 years ago Updated 2 years ago

error: undefined reference to 'dlsym' if building with ASan and GCC (Tor 17509)

Categories

(Firefox Build System :: General, defect, P3)

x86_64
Linux
defect

Tracking

(Not tracked)

REOPENED

People

(Reporter: gk, Unassigned)

References

(Blocks 1 open bug)

Details

(Whiteboard: [tor][tor-standalone])

Attachments

(1 file)

Trying to build Firefox with GCC and Address Sanitizer breaks with

/path/to/firefox/intl/icu/source/common/putil.cpp:2103: error: undefined reference to 'dlsym'
collect2: error: ld returned 1 exit status

That is not an ICU-only issue, as building without it breaks the build later with the same error message. There is no fancy .mozconfig involved. Basically only

export CFLAGS="-fsanitize=address -Dxmalloc=myxmalloc"
export CXXFLAGS="-fsanitize=address -Dxmalloc=myxmalloc"
export LDFLAGS="-fsanitize=address"
ac_add_options --enable-address-sanitizer
ac_add_options --disable-jemalloc
ac_add_options --disable-elf-hack
ac_add_options --enable-optimize
ac_add_options --disable-strip
ac_add_options --disable-install-strip
ac_add_options --disable-tests
ac_add_options --disable-debug
Attached is the config.log. For some reason -ldl is said to be not needed (contrary to a non-ASan build). But the compilation error seems to indicate the opposite.
This does not happen with GCC 4.9.x.
Summary: error: undefined reference to 'dlsym' if building with ASan and GCC 5.2.1 → error: undefined reference to 'dlsym' if building with ASan and GCC 5
So, with GCC revision 215527 the check whether we need to specify -ldl explicitly, which tests with dlopen(), no longer works if one is building with ASan support. What patch would Mozilla merge to fix that? For instance, testing with dlsym() instead seems to solve the issue for me. Would that be an acceptable option?
Flags: needinfo?(mh+mozilla)
Why is the dlopen() test failing?
Flags: needinfo?(mh+mozilla)
(In reply to Mike Hommey [:glandium] from comment #4)
> Why is the dlopen() test failing?

You mean why -ldl does not get added to the linker flags although it fails later with

/path-to/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp:167: error: undefined reference to 'dlerror'
/path-to/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp:176: error: undefined reference to 'dlsym'
/path-to/mozilla-central/xpcom/glue/standalone/nsXPCOMGlue.cpp:176: error: undefined reference to 'dlsym'
collect2: error: ld returned 1 exit status

? Good question. I could not come up with an answer yet. Bisecting the ASan changes is not really working, as the problem does not happen with clang and all the changes got squashed into one commit, rev 215527. I looked over them a couple of times but nothing immediately jumped out at me. If you or anybody else has some ideas why dlopen() does not do the trick here while dlsym() (e.g.) does, and especially on what kind of patch you would accept, I'd be happy to do the (remaining) work.
Whiteboard: [tor]
Did you try to run the configure test independently and see how/why it fails?
(In reply to Mike Hommey [:glandium] from comment #6)
> Did you try to run the configure test independently and see how/why it fails?

Yes. Here are the results (commands run + output):

conftest.c
----------
char dlopen();
int main() {
dlopen()
; return 0; }

no ASan:
--------
gcc -o conftest -std=gnu99 -fgnu89-inline -fno-strict-aliasing -fno-math-errno -Wl,-z,noexecstack -Wl,-z,text -Wl,--build-id -B /path/to/obj-x86_64-unknown-linux-gnu/build/unix/gold -Wl,-Bsymbolic conftest.c
/tmp/ccEvighn.o:conftest.c:function main: error: undefined reference to 'dlopen'
collect2: error: ld returned 1 exit status

ASan:
-----
gcc -o conftest -fsanitize=address -Dxmalloc=myxmalloc -std=gnu99 -fgnu89-inline -fno-strict-aliasing -fno-math-errno -fsanitize=address -Wl,-z,noexecstack -Wl,-z,text -Wl,--build-id -B /path/to/obj-x86_64-unknown-linux-gnu/build/unix/gold conftest.c

If I change conftest.c to

char dlsym();
int main() {
dlsym()
; return 0; }

I get

no ASan:
--------
gcc -o conftest -std=gnu99 -fgnu89-inline -fno-strict-aliasing -fno-math-errno -Wl,-z,noexecstack -Wl,-z,text -Wl,--build-id -B /path/to/obj-x86_64-unknown-linux-gnu/build/unix/gold -Wl,-Bsymbolic conftest.c
/tmp/ccFf85Zg.o:conftest.c:function main: error: undefined reference to 'dlsym'
collect2: error: ld returned 1 exit status

ASan:
-----
gcc -o conftest -fsanitize=address -Dxmalloc=myxmalloc -std=gnu99 -fgnu89-inline -fno-strict-aliasing -fno-math-errno -fsanitize=address -Wl,-z,noexecstack -Wl,-z,text -Wl,--build-id -B /path/to/obj-x86_64-unknown-linux-gnu/build/unix/gold conftest.c
/tmp/ccTBNSTL.o:conftest.c:function main: error: undefined reference to 'dlsym'
collect2: error: ld returned 1 exit status

Looking at the verbose output does not give me any hints either. The ASan one basically differs only in the added ASan-specific bits.
There should be a -ldl on both commands, added by LIBS="-l$i $ac_func_search_save_LIBS" where $i is "dl".
(In reply to Mike Hommey [:glandium] from comment #8)
> There should be a -ldl on both commands, added by
> LIBS="-l$i $ac_func_search_save_LIBS"
>
> where $i is "dl".

I totally agree with that, and adding them (manually) is working fine, but that is not what is happening with the configure script (and why there is this bug report):

ASan case:
----------

configure:11799: checking for library containing dlopen
configure:11817: gcc -o conftest -fsanitize=address -Dxmalloc=myxmalloc -std=gnu99 -fgnu89-inline -fno-strict-aliasing -fno-math-errno -fsanitize=address -Wl,-z,noexecstack -Wl,-z,text -Wl,--build-id -B /home/thomas/Arbeit/Tor/tor-browser/obj-x86_64-unknown-linux-gnu/build/unix/gold conftest.c 1>&5
configure:11857: checking for dlfcn.h

non-ASan case:
--------------

configure:11799: checking for library containing dlopen
configure:11817: gcc -o conftest -std=gnu99 -fgnu89-inline -fno-strict-aliasing -fno-math-errno -Wl,-z,noexecstack -Wl,-z,text -Wl,--build-id -B /home/thomas/Arbeit/Tor/tor-browser/obj-x86_64-unknown-linux-gnu/build/unix/gold conftest.c 1>&5
/tmp/cc9U7CRo.o:conftest.c:function main: error: undefined reference to 'dlopen'
collect2: error: ld returned 1 exit status
configure: failed program was:
#line 11806 "configure"
#include "confdefs.h"
/* Override any gcc2 internal prototype to avoid an error. */
/* We use char because int might match the return type of a gcc2
   builtin and then its argument prototype would still apply. */
char dlopen();
int main() {
dlopen()
; return 0; }
configure:11839: gcc -o conftest -std=gnu99 -fgnu89-inline -fno-strict-aliasing -fno-math-errno -Wl,-z,noexecstack -Wl,-z,text -Wl,--build-id -B /home/thomas/Arbeit/Tor/tor-browser/obj-x86_64-unknown-linux-gnu/build/unix/gold conftest.c -ldl 1>&5
configure:11857: checking for dlfcn.h

If I convince the configure script to use dlsym for testing, then it works as expected, hence my question whether such a patch would be acceptable.
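The probing idea discussed above (require dlsym to link as well, not just dlopen) could be sketched roughly as follows. This is only an illustration of the search logic, not the configure code Mozilla ships; the function names try_link, all_link and search_libs are hypothetical, and the sketch assumes a C compiler is reachable as $CC (or cc).

```shell
#!/bin/sh
# Hypothetical sketch of an AC_SEARCH_LIBS-style probe that requires
# every listed symbol to link, rather than dlopen alone.

# try_link SYMBOL [EXTRA_LINK_ARGS...]
# Returns 0 if a tiny program calling SYMBOL links successfully.
try_link() {
    tl_sym=$1; shift
    tl_dir=$(mktemp -d) || return 1
    printf 'char %s();\nint main(void) { %s(); return 0; }\n' \
        "$tl_sym" "$tl_sym" > "$tl_dir/conftest.c"
    # CFLAGS/LDFLAGS are deliberately unquoted so they split into words.
    ${CC:-cc} ${CFLAGS-} ${LDFLAGS-} -o "$tl_dir/conftest" \
        "$tl_dir/conftest.c" "$@" >/dev/null 2>&1
    tl_status=$?
    rm -rf "$tl_dir"
    return $tl_status
}

# all_link "SYM1 SYM2 ..." [EXTRA_LINK_ARGS...]
# Succeeds only if every symbol links with the given extra arguments.
all_link() {
    al_syms=$1; shift
    for al_sym in $al_syms; do
        try_link "$al_sym" "$@" || return 1
    done
    return 0
}

# search_libs "SYM1 SYM2 ..." "LIB1 LIB2 ..."
# Prints "none required", the first "-lLIB" that satisfies all symbols,
# or "no" if nothing works.
search_libs() {
    sl_syms=$1
    sl_libs=$2
    if all_link "$sl_syms"; then
        echo "none required"
        return 0
    fi
    for sl_lib in $sl_libs; do
        if all_link "$sl_syms" "-l$sl_lib"; then
            echo "-l$sl_lib"
            return 0
        fi
    done
    echo "no"
    return 1
}
```

Calling `search_libs "dlopen dlsym" "dl"` with the ASan flags in CFLAGS and LDFLAGS would, if the behaviour reported in this bug holds, fail the first pass because of dlsym and then fall through to trying -ldl.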
(In reply to Georg Koppen from comment #9)
> ASan case:
> ----------
>
> configure:11799: checking for library containing dlopen
> configure:11817: gcc -o conftest -fsanitize=address -Dxmalloc=myxmalloc -std=gnu99 -fgnu89-inline -fno-strict-aliasing -fno-math-errno -fsanitize=address -Wl,-z,noexecstack -Wl,-z,text -Wl,--build-id -B /home/thomas/Arbeit/Tor/tor-browser/obj-x86_64-unknown-linux-gnu/build/unix/gold conftest.c 1>&5

So, in fact, the question is why does this command *not* fail without -ldl?
Yes. As I said above that behavior changed with GCC's r215527 but I am not sure how to debug that further.
(In reply to Mike Hommey [:glandium] from comment #10)
> (In reply to Georg Koppen from comment #9)
> > ASan case:
> > ----------
> >
> > configure:11799: checking for library containing dlopen
> > configure:11817: gcc -o conftest -fsanitize=address -Dxmalloc=myxmalloc -std=gnu99 -fgnu89-inline -fno-strict-aliasing -fno-math-errno -fsanitize=address -Wl,-z,noexecstack -Wl,-z,text -Wl,--build-id -B /home/thomas/Arbeit/Tor/tor-browser/obj-x86_64-unknown-linux-gnu/build/unix/gold conftest.c 1>&5
>
> So, in fact, the question is why does this command *not* fail without -ldl?

It looks like libasan defines a weak dlopen symbol, but not a weak dlsym symbol.
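One way to check this kind of claim on a given system is to list the weak dynamic symbols a library exports. The following is a minimal sketch, assuming GNU nm; the helper name weak_exports and the LIBASAN variable are hypothetical, and the location of libasan differs between distributions.

```shell
#!/bin/sh
# weak_exports: read `nm -D --defined-only` output on stdin and print the
# names of weak definitions (nm type letters W and V), with any
# "@VERSION" suffix removed.
weak_exports() {
    awk '$2 == "W" || $2 == "V" { sub(/@.*/, "", $3); print $3 }'
}

# Example invocation (LIBASAN must point at the libasan shared object):
#   nm -D --defined-only "$LIBASAN" | weak_exports | grep '^dl'
```

If dlopen shows up in that list and dlsym does not, that would match the observation above and explain why a dlopen-only link test passes without -ldl.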
so something that uses both dlopen and dlsym, which is a rightful combination, gets dlopen from libasan and dlsym from libdl? That sounds awful... why are they doing this?
(In reply to Mike Hommey [:glandium] from comment #13)
> so something that uses both dlopen and dlsym, which is a rightful
> combination, gets dlopen from libasan and dlsym from libdl? That sounds
> awful... why are they doing this?

It looks like they want to intercept dlopen, which I would expect is in llvm's version too? So I'm curious why we don't see a problem there, but I don't have an llvm checkout to poke at that. What libasan has is a weak alias, so this probably works fine. I'm curious why it's publicly visible, though; it's explicitly __attribute__((visibility(default))), so I assume there is a good reason.
(In reply to Mike Hommey [:glandium] from comment #13)
> so something that uses both dlopen and dlsym, which is a rightful
> combination, gets dlopen from libasan and dlsym from libdl? That sounds
> awful... why are they doing this?

(In reply to Trevor Saunders (:tbsaunde) from comment #14)
> It looks like they want to intercept dlopen, which I would expect is in
> llvm's version too? so I'm curious why we don't see a problem there, but I
> don't have an llvm checkout to poke at that.

Agreed: it's to intercept dlopen() and dlclose() in order to track what's loaded and where; I assume it's just passing through the library handles and doesn't need to know about symbol lookups, so doesn't intercept dlsym. Clang's ASan does the same thing.

As for why Clang works while GCC 5 doesn't: empirically, it looks like Clang adds a bunch of extra libraries when linking with -fsanitize=address, namely, -lpthread -lrt -lm -ldl -lgcc_s. Or, as found by searching the Clang source for "ldl":

static void linkSanitizerRuntimeDeps(const ToolChain &TC,
                                     ArgStringList &CmdArgs) {
  // Force linking against the system libraries sanitizers depends on
  // (see PR15823 why this is necessary).
  CmdArgs.push_back("--no-as-needed");
  CmdArgs.push_back("-lpthread");
  CmdArgs.push_back("-lrt");
  CmdArgs.push_back("-lm");
  // There's no libdl on FreeBSD.
  if (TC.getTriple().getOS() != llvm::Triple::FreeBSD)
    CmdArgs.push_back("-ldl");
}

So, see also https://llvm.org/bugs/show_bug.cgi?id=15823 (which I think GCC solved differently; I've already shaved enough yaks for this comment, but I notice that GCC uses a shared libasan.so by default).
In the meantime, I simply added -ldl to the LDFLAGS environment variable for a local build of the ASAN version of C-C TB with GCC5 (and GCC6). I think for a local build this is fine (?)
Priority: -- → P3
Summary: error: undefined reference to 'dlsym' if building with ASan and GCC 5 → error: undefined reference to 'dlsym' if building with ASan and GCC 5 (Tor 17509)
Blocks: meta_tor
Whiteboard: [tor] → [tor][tor-standalone]
We'd like to get a fix for this bug into ESR52. Mike, what kind of fix would be your preferred one?
Flags: needinfo?(mh+mozilla)
Add a check for dlsym?
Flags: needinfo?(mh+mozilla)
Assignee: nobody → gk
Hi,

Again, automatic detection and configuration is very good.
In the meantime, I have worked around the issue by manually adding "-ldl" to my MOZCONFIG file thusly:

# Mandatory flags for ASan
export ASANFLAGS="-fsanitize=address -Dxmalloc=myxmalloc -fPIC"
export CFLAGS="$ASANFLAGS $CFLAGS -fno-delete-null-pointer-checks "
export CXXFLAGS="$ASANFLAGS $CXXFLAGS -fno-delete-null-pointer-checks "
export LDFLAGS="-fsanitize=address -ldl"    <=== NOTE the addition.

The addition of -fno-delete-null-pointer-checks is a workaround for a GCC6 issue with comm-central TB.

(I believe that these exported environment variables also take effect during configuration. Come to think of it, since the build invokes makefiles and other files that are created during configuration, the MOZCONFIG setup is the place to do it.)

TIA
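Since a mozconfig is processed as shell, the manual workaround above can be made idempotent, so that -ldl is not appended twice when LDFLAGS already carries it from the environment. A minimal sketch follows; the helper name append_flag_once is hypothetical and not part of the Mozilla build system.

```shell
#!/bin/sh
# append_flag_once "EXISTING FLAGS" FLAG
# Prints EXISTING FLAGS with FLAG appended, unless FLAG is already
# present as a whole word.
append_flag_once() {
    case " $1 " in
        *" $2 "*)
            printf '%s\n' "$1"
            ;;
        *)
            if [ -n "$1" ]; then
                printf '%s %s\n' "$1" "$2"
            else
                printf '%s\n' "$2"
            fi
            ;;
    esac
}

# Usage: keep whatever LDFLAGS already holds and make sure -ldl is there.
LDFLAGS=$(append_flag_once "${LDFLAGS-}" "-ldl")
export LDFLAGS
```

This only automates the workaround; it does not address why the configure check decides that -ldl is unnecessary.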
We're not using gcc 5 anymore, going to wontfix this
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
(In reply to Tom Ritter [:tjr] from comment #20)
> We're not using gcc 5 anymore, going to wontfix this

Yes, but the issue is not gone. :) With 6.4.0 and without an explicit -ldl I still get errors like

/home/thomas/Arbeit/Tor/tor-browser/xpcom/glue/standalone/nsXPCOMGlue.cpp:116: error: undefined reference to 'dlsym'

I've removed the GCC version information in case that's what confused you.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Summary: error: undefined reference to 'dlsym' if building with ASan and GCC 5 (Tor 17509) → error: undefined reference to 'dlsym' if building with ASan and GCC (Tor 17509)
Product: Core → Firefox Build System

I am afraid this problem may be back.
/NEW-SSD/NREF-COMM-CENTRAL/mozilla/mozglue/misc/StackWalk.cpp:794: error: undefined reference to 'dladdr'
and several other places.
Oh, wait. I am getting this error WITH ordinary build using GCC, not ASAN build (!)

(In reply to ISHIKAWA, Chiaki from comment #22)
> I am afraid this problem may be back.
> /NEW-SSD/NREF-COMM-CENTRAL/mozilla/mozglue/misc/StackWalk.cpp:794: error: undefined reference to 'dladdr'
> and several other places.
> Oh, wait. I am getting this error WITH ordinary build using GCC, not ASAN build (!)

I added

# suddenly dlopen link fails. Aug 17, 2022 (maybe after apt-get dist-upgrade)
export LDFLAGS="-dl"

to my MOZCONFIG, and it solved many link errors, but the following errors still remain.

gmake[4]: Leaving directory '/NEW-SSD/moz-obj-dir/objdir-tb3/xpcom/tests'
gmake[4]: Entering directory '/NEW-SSD/moz-obj-dir/objdir-tb3/security/manager/ssl/ipcclientcerts'
security/manager/ssl/ipcclientcerts/force-cargo-library-build
/home/ishikawa/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/bin/cargo rustc  --offline --manifest-path /NEW-SSD/NREF-COMM-CENTRAL/mozilla/security/manager/ssl/ipcclientcerts/Cargo.toml -vv --color=always  --lib --target=x86_64-unknown-linux-gnu  --
/NEW-SSD/NREF-COMM-CENTRAL/mozilla/nsprpub/pr/src/linking/prlink.c:97: error: undefined reference to 'dlerror'
/NEW-SSD/NREF-COMM-CENTRAL/mozilla/nsprpub/pr/src/linking/prlink.c:783: error: undefined reference to 'dlsym'
/NEW-SSD/NREF-COMM-CENTRAL/mozilla/nsprpub/pr/src/malloc/prmem.c:99: error: undefined reference to 'dlopen'
/NEW-SSD/NREF-COMM-CENTRAL/mozilla/nsprpub/pr/src/malloc/prmem.c:103: error: undefined reference to 'dlsym'
/NEW-SSD/NREF-COMM-CENTRAL/mozilla/nsprpub/pr/src/malloc/prmem.c:104: error: undefined reference to 'dlclose'
/NEW-SSD/NREF-COMM-CENTRAL/mozilla/nsprpub/pr/src/linking/prlink.c:960: error: undefined reference to 'dladdr'
/NEW-SSD/NREF-COMM-CENTRAL/mozilla/nsprpub/pr/src/linking/prlink.c:140: error: undefined reference to 'dlopen'
/NEW-SSD/NREF-COMM-CENTRAL/mozilla/nsprpub/pr/src/linking/prlink.c:584: error: undefined reference to 'dlopen'
/NEW-SSD/NREF-COMM-CENTRAL/mozilla/nsprpub/pr/src/linking/prlink.c:676: error: undefined reference to 'dlclose'
/NEW-SSD/NREF-COMM-CENTRAL/mozilla/nsprpub/pr/src/pthreads/ptthread.c:1696: error: undefined reference to 'dlsym'
collect2: error: ld returned 1 exit status
gmake[4]: *** [/NEW-SSD/NREF-COMM-CENTRAL/mozilla/config/rules.mk:540: libnspr4.so] Error 1
gmake[4]: Leaving directory '/NEW-SSD/moz-obj-dir/objdir-tb3/config/external/nspr/pr'
gmake[3]: *** [/NEW-SSD/NREF-COMM-CENTRAL/mozilla/config/recurse.mk:72: config/external/nspr/pr/target] Error 2
gmake[3]: *** Waiting for unfinished jobs....

I have upgraded Debian GNU/linux packages before the source tree update, and that may have something to do with
the issue. More investigation continues.
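When a log like the one above gets long, it can help to reduce it to the distinct missing symbols before deciding which library is absent from the link line. A small sketch, assuming the linker prints messages in the `undefined reference to 'name'` form seen in this report (other linkers quote symbols differently); the function name undefined_symbols is hypothetical.

```shell
#!/bin/sh
# undefined_symbols: read build output on stdin and print each distinct
# symbol reported as an undefined reference, one per line, sorted.
undefined_symbols() {
    sed -n "s/.*undefined reference to '\([^']*\)'.*/\1/p" | sort -u
}

# Example invocation (build.log is a hypothetical saved build log):
#   undefined_symbols < build.log
```

If the resulting list consists only of dl* functions such as dlopen, dlsym, dlclose, dlerror and dladdr, the missing piece is most likely libdl on the link line of that particular target.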

Again, I had a problem today with non-ASAN build with gcc-10.

Adding -ldl to CFLAGS and CXXFLAGS, as in the ASAN case, fixed the issue.
But it had not been necessary for a LONG TIME.

Relevant part from my shell script to invoke |mach build|, etc. after setting up various environment variables.

# ASAN=-fsanitize=address -ldl
# We need -ldl even for non-asan build?           
ASAN=-ldl                                                       #   <---- NEW
#

#
# -fPIC -mcmodel=large caused cpuid() asm definition in
# mozilla/media/libvpx/vpx_ports/x86.h to blow up.
# 
# see the discussion in https://gcc.gnu.org/ml/gcc-patches/2012-12/msg01484.html
# But I don't understand the GCC in-line asm of late.
#
MEMORYMODEL="-fPIC -mcmodel=large"
MEMORYMODEL="-mcmodel=large"
MEMORYMODEL=

# July, 2015 after gtk+-3 (?)
# MEMORYMODEL=-fPIC
# Again in Jan 2016 when
# /usr/local/bin/ld: error: /NREF-COMM-CENTRAL/objdir-tb3/toolkit/library/../../mailnews/import/src/nsImportScanFile.o: requires dynamic R_X86_64_PC32 reloc against '_Z15FullyReadStreamP14nsIInputStreamPvjPj' which may overflow at runtime; recompile with -fPIC
# 
# MEMORYMODEL=-fPIC

# undefine DEBUG

# April 23, 2016
# break CC Macro into main CC macro for binary and  CFLAGS
#

# added -Werror=sign-compare on Sept 15, 2016
# added -Werror=unused-result on Sept 22, 2016


WARNFLAGS="-Werror=sign-compare -Werror=unused-result -Werror=unused-variable -Werror=format"

NULLPTRCHK=-fno-delete-null-pointer-checks
NULLPTRCHK=

# took out -DDEBUG=1 from the command line on Jan 24, 2014
# but it is obviously back as -DDEBUG in CFLAGS.

#
SPLITDWARF=-gsplit-dwarf
SPLITDWARF=
export CFLAGS="$CFLAGS $MEMORYMODEL $ASAN -fno-builtin-strlen -Dfdatasync=fdatasync  -DDEBUG_4GB_CHECK -DUSEHELGRIND=1 -DUSEVALGRIND=1 -DDEBUG  $NULLPTRCHK -g ${SPLITDWARF} ${WARNFLAGS} -fuse-ld=gold"
export CXXFLAGS="$CXXFLAGS $MEMORYMODEL $ASAN  -fno-builtin-strlen  -Dfdatasync=fdatasync -DDEBUG_4GB_CHECK -DUSEHELGRIND=1 -DUSEVALGRIND=1 -DDEBUG $NULLPTRCHK  -g ${SPLITDWARF} ${WARNFLAGS} -fuse-ld=gold"

... eventually |mach "$*" | is invoked.
Severity: normal → S3

The bug assignee is inactive on Bugzilla, so the assignee is being reset.

Assignee: gk → nobody