Closed
Bug 1489443
Opened 6 years ago
Closed 6 years ago
Linking error on libfreebl3.so in Fennec x86 build (read-only segment has dynamic relocations failure linking)
Categories
(Firefox Build System :: Android Studio and Gradle Integration, defect)
Firefox Build System
Android Studio and Gradle Integration
Tracking
(firefox64 fixed)
RESOLVED
FIXED
mozilla64
Tracking | Status | |
---|---|---|
firefox64 | --- | fixed |
People
(Reporter: edgar, Assigned: nalexander)
Details
Attachments
(1 file, 1 obsolete file)
Gecko revision: 0c947d96e8f3 NDS version: android-ndk-r16b (I have tried android-ndk-r15c, but got same error) I got the following error, ... 0:26.35 js/src/gdb 0:26.38 js/src/jsapi-tests 0:26.39 libfreebl3.so 0:26.56 /Users/edgarchen/.mozbuild/android-ndk-r16b/toolchains/x86-4.9/prebuilt/darwin-x86_64/lib/gcc/i686-linux-android/4.9.x/../../../../i686-linux-android/bin/ld: error: read-only segment has dynamic relocations 0:26.58 clang: error: linker command failed with exit code 1 (use -v to see invocation) 0:26.58 make[4]: *** [libfreebl3.so] Error 1 0:26.58 make[3]: *** [security/nss/lib/freebl/freebl_freebl3/target] Error 2 0:26.58 make[3]: *** Waiting for unfinished jobs.... 0:32.17 gdb-tests 0:45.83 jsapi-tests 0:49.66 make[2]: *** [compile] Error 2 0:49.67 make[1]: *** [default] Error 2 0:49.67 make: *** [build] Error 2 0:49.72 280 compiler warnings present.
Comment 1•6 years ago
|
||
Hm I don't think we have seen this error. Have you tried NDK r17b or r18 beta?
Reporter | ||
Comment 2•6 years ago
|
||
I saw this with NDK r17b as well. Haven't yet tried r18 beta, trying now.
Assignee | ||
Comment 3•6 years ago
|
||
This is a dupe of https://bugzilla.mozilla.org/show_bug.cgi?id=1449356, AFAICT. I really don't have any idea what's going on; I think in #build we concluded this was a compiler bug.
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → DUPLICATE
Assignee | ||
Comment 4•6 years ago
|
||
nfroyd: we're seeing this a lot trying to build target Android x86 on host macOS. Does the solution from FreeBSD (!) at https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=230239 sound like a sensible thing to do, perhaps only if we're using `--enable-linker=lld`?
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 5•6 years ago
|
||
Relevant error log is: 22:03.02 security/nss/lib/freebl/libfreebl3.so 22:03.02 /Users/k/.mozbuild/macosx64-clang/bin/ld.lld: error: can't create dynamic relocation R_386_PC32 against symbol: s_mpi_is_sse2 in readonly segment; recompile object files with -fPIC 22:03.02 >>> defined in mpcpucache.o 22:03.02 >>> referenced by mpi_x86.s:71 22:03.02 >>> mpi_x86.o:(.text+0xE) 22:03.02 /Users/k/.mozbuild/macosx64-clang/bin/ld.lld: error: can't create dynamic relocation R_386_PC32 against symbol: s_mpi_is_sse2 in readonly segment; recompile object files with -fPIC 22:03.05 >>> defined in mpcpucache.o 22:03.05 >>> referenced by mpi_x86.s:170 22:03.05 >>> mpi_x86.o:(.text+0x9F) 22:03.05 /Users/k/.mozbuild/macosx64-clang/bin/ld.lld: error: can't create dynamic relocation R_386_PC32 against symbol: s_mpi_is_sse2 in readonly segment; recompile object files with -fPIC 22:03.05 >>> defined in mpcpucache.o 22:03.05 >>> referenced by mpi_x86.s:267 22:03.05 >>> mpi_x86.o:(.text+0x13D) 22:03.05 /Users/k/.mozbuild/macosx64-clang/bin/ld.lld: error: can't create dynamic relocation R_386_PC32 against symbol: s_mpi_is_sse2 in readonly segment; recompile object files with -fPIC 22:03.05 >>> defined in mpcpucache.o 22:03.06 >>> referenced by mpi_x86.s:394 22:03.06 >>> mpi_x86.o:(.text+0x203) 22:03.06 clang-6.0: error: linker command failed with exit code 1 (use -v to see invocation) 22:03.06 make[4]: *** [libfreebl3.so] Error 1 22:03.06 make[3]: *** [security/nss/lib/freebl/freebl_freebl3/target] Error 2 (That's with lld, hence the informative error messages that we didn't see with GNU ld and/or gold.)
Assignee | ||
Comment 6•6 years ago
|
||
Wow... also: https://searchfox.org/mozilla-central/rev/65f9687eb192f8317b4e02b0b791932eff6237cc/Makefile.in#320-325 -- this issue has _history_. So that suggests that we have a badly configured toolchain somehow that's getting a system header that it should, that I don't see locally but (most?) others do.
Assignee | ||
Comment 7•6 years ago
|
||
Comparing my config.status against a failing config.status, I see the following changes that seem relevant: - 'DSO_LDOPTS': '-shared -Wl,-z,defs', + 'DSO_LDOPTS': '-shared', - 'GCC_USE_GNU_LD': '1', + 'GCC_USE_GNU_LD': '', - 'LLVM_CONFIG': '/usr/local/opt/llvm/bin/llvm-config', + 'LLVM_CONFIG': '/opt/local/bin/llvm-config-mp-4.0', - 'MOZ_CLANG_PATH': '/usr/local/Cellar/llvm/5.0.1/bin/clang', + 'MOZ_CLANG_PATH': '/opt/local/libexec/llvm-4.0/bin/clang', - 'MOZ_LIBCLANG_PATH': b'/usr/local/Cellar/llvm/5.0.1/lib', + 'MOZ_LIBCLANG_PATH': b'/opt/local/libexec/llvm-4.0/lib', I don't understand what DSO_LDOPTS controls. But what I think this all means is that /usr/bin/clang (which is HOST_CC) is different for me and for people seeing this. Although maybe all those MOZ_* clang things are different again? I get: $ /usr/bin/clang --version Apple LLVM version 9.1.0 (clang-902.0.39.2) Target: x86_64-apple-darwin17.7.0 Thread model: posix InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin and $ /usr/local/Cellar/llvm/5.0.1/bin/clang --version clang version 5.0.1 (tags/RELEASE_501/final) Target: x86_64-apple-darwin17.7.0 Thread model: posix InstalledDir: /usr/local/Cellar/llvm/5.0.1/bin lina: could you figure out what the versions of clang in the snippets listed above are for you? froyd: can we override all of these using HOST_CC/HOST_CXX?
Flags: needinfo?(lina)
Comment 8•6 years ago
|
||
Oh, sure enough, mine are different! > $ /usr/bin/clang --version > Apple LLVM version 10.0.0 (clang-1000.11.45.2) > Target: x86_64-apple-darwin17.7.0 > Thread model: posix > InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin > $ /opt/local/libexec/llvm-4.0/bin/clang --version > clang version 4.0.1 (tags/RELEASE_401/final) > Target: x86_64-apple-darwin17.7.0 > Thread model: posix > InstalledDir: /opt/local/libexec/llvm-4.0/bin The one I'm using now, that produced the log in comment 5: > $ /Users/k/.mozbuild/macosx64-clang/bin/clang --version > clang version 6.0.1 (tags/RELEASE_601/final) (llvm/tags/RELEASE_601/final 335538) > Target: x86_64-apple-darwin17.7.0 > Thread model: posix > InstalledDir: /Users/k/.mozbuild/macosx64-clang/bin I can try upgrading the clang in `/opt/local` tomorrow, and seeing if that helps; that's the one installed by MacPorts. I'm surprised it shows up at all, though; shouldn't it be using the one I downloaded into `$HOME/.mozbuild/macosx64-clang` instead?
Flags: needinfo?(lina)
Comment 9•6 years ago
|
||
For folks whose builds work and whose builds don't, what is your CC (from config.status) and what do you get executing: $CC -print-prog-name=ld and then, what do you get for: `$CC -print-prog-name=ld` -v (the backticks are important there). I swear we've looked at this on #build and figured out that: 1. Android builds on Macs fail. 2. Android builds on Linux work. 3. Android builds on Mac, but with the freebl objects from Linux, work. 4. Android builds on Linux, but with the freebl objects from Mac, work. which strongly suggested that the Mac linker was somehow busted, but if some people can get Android builds on Mac to work, maybe there's something very specific going wrong. We *could* make things work by using the FreeBSD fix, but that is a hack, and if we *really* had to, it would only be available on developer, not release, builds. (You don't want relocations in your .text section/segment in general, because that requires your shared library's .text section/segment to be writable, which makes it both not shareable between multiple processes and read-write-execute code segments are a Bad Thing. Android might even yell at you nowadays if you try to load objects with read-write-execute code segments.)
Flags: needinfo?(nfroyd)
Comment 10•6 years ago
|
||
Thanks for looking into this, Nathan! From config.status: > 'CC': '/Users/k/.mozbuild/macosx64-clang/bin/clang -std=gnu99 --target=i686-linux-android' Without backticks: > $ /Users/k/.mozbuild/macosx64-clang/bin/clang -std=gnu99 --target=i686-linux-android -print-prog-name=ld -v > clang version 6.0.1 (tags/RELEASE_601/final) (llvm/tags/RELEASE_601/final 335538) > Target: i686--linux-android > Thread model: posix > InstalledDir: /Users/k/.mozbuild/macosx64-clang/bin > Found candidate GCC installation: /Users/k/.mozbuild/macosx64-clang/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.4 > Selected GCC installation: /Users/k/.mozbuild/macosx64-clang/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.4 > Candidate multilib: .;@m64 > Candidate multilib: 32;@m32 > Selected multilib: 32;@m32 > /opt/local/bin/ld With backticks: > $ `/Users/k/.mozbuild/macosx64-clang/bin/clang -std=gnu99 --target=i686-linux-android -print-prog-name=ld` -v > @(#)PROGRAM:ld PROJECT:ld64-274.2 > configured to support archs: i386 x86_64 x86_64h armv6 armv7 armv7s armv7m armv7k arm64 (tvOS) > LTO support using: LLVM version 3.9.1 > 1. Android builds on Macs fail. > 2. Android builds on Linux work. This matches my experience, too; Android builds fine in an Ubuntu VM. :-)
Assignee | ||
Comment 11•6 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #9) > For folks whose builds work and whose builds don't, what is your CC (from > config.status) and what do you get executing: > > $CC -print-prog-name=ld > > and then, what do you get for: > > `$CC -print-prog-name=ld` -v > > (the backticks are important there). My builds work, both with and without --enable-linker=lld. bash-3.2$ export CC='/Users/nalexander/.mozbuild/macosx64-clang-6/bin/clang -std=gnu99 --target=i686-linux-android' bash-3.2$ $CC -v clang version 6.0.1 (tags/RELEASE_601/final) (llvm/tags/RELEASE_601/final 335538) Target: i686--linux-android Thread model: posix InstalledDir: /Users/nalexander/.mozbuild/macosx64-clang-6/bin Found candidate GCC installation: /Users/nalexander/.mozbuild/macosx64-clang-6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.4 Selected GCC installation: /Users/nalexander/.mozbuild/macosx64-clang-6/bin/../lib/gcc/x86_64-unknown-linux-gnu/4.9.4 Candidate multilib: .;@m64 Candidate multilib: 32;@m32 Selected multilib: 32;@m32 clang-6.0: warning: argument unused during compilation: '-std=gnu99' [-Wunused-command-line-argument] bash-3.2$ $CC -print-prog-name=ld /Users/nalexander/.mozbuild/android-sdk-macosx/build-tools/23.0.3/i686-linux-android-ld bash-3.2$ `$CC -print-prog-name=ld` -v GNU gold (GNU Binutils 2.22.90.20120727) 1.11 bash-3.2$ `$CC -print-prog-name=ld.lld` -v LLD 6.0.1 (tags/RELEASE_601/final 335538) (compatible with GNU linkers) bash-3.2$ $CC -print-prog-name=ld.lld /Users/nalexander/.mozbuild/macosx64-clang-6/bin/ld.lld Is it possible that somehow gold is getting worked into the mix here in some way that avoids the issue for me? > I swear we've looked at this on #build and figured out that: > > 1. Android builds on Macs fail. > 2. Android builds on Linux work. > 3. Android builds on Mac, but with the freebl objects from Linux, work. > 4. Android builds on Linux, but with the freebl objects from Mac, work. This agrees with my memory. > which strongly suggested that the Mac linker was somehow busted, but if some > people can get Android builds on Mac to work, maybe there's something very > specific going wrong. > > We *could* make things work by using the FreeBSD fix, but that is a hack, > and if we *really* had to, it would only be available on developer, not > release, builds. (You don't want relocations in your .text section/segment > in general, because that requires your shared library's .text > section/segment to be writable, which makes it both not shareable between > multiple processes and read-write-execute code segments are a Bad Thing. > Android might even yell at you nowadays if you try to load objects with > read-write-execute code segments.) Thanks for this context. And in fact, yes, Android should yell (post Android 23) about this, IIUC. froyd: can you explain why GCC_USE_GNU_LD differs for me and for lina?
Flags: needinfo?(nfroyd)
Assignee | ||
Comment 12•6 years ago
|
||
To follow up: for Lina, $ /Users/k/.mozbuild/macosx64-clang/bin/clang -std=gnu99 --target=i686-linux-android -print-prog-name=ld /opt/local/bin/ld $ /Users/k/.mozbuild/macosx64-clang/bin/clang -std=gnu99 --target=i686-linux-android -print-prog-name=ld.lld /Users/k/.mozbuild/macosx64-clang/bin/ld.lld So it seems that somehow we’re invoking `ld`, and for Lina that’s a (busted?) macOS linker, and for me that’s a (non-busted?) gold.
Assignee | ||
Comment 13•6 years ago
|
||
agi: could you provide the outputs in https://bugzilla.mozilla.org/show_bug.cgi?id=1489443#c9 for your busted macOS Android x86 objdir? If you don't have the build, you should be able to just set up the mozconfig and then ./mach configure to get the details _without_ doing the actual build.
Flags: needinfo?(agi)
Comment 14•6 years ago
|
||
Nick, I think my setup is similar to Lina's My config status has: > $ cat obj-i686-linux-android/config.status | grep \'CC\' > 'CC': '/Users/asferro/clang/clang+llvm-6.0.0-x86_64-apple-darwin/bin/clang -std=gnu99 --target=i686-linux-android', This is the output of $CC --print-prog-name=ld > /Users/asferro/clang/clang+llvm-6.0.0-x86_64-apple-darwin/bin/clang -std=gnu99 --target=i686-linux-android --print-prog-name=ld -v > clang version 6.0.0 (tags/RELEASE_600/final) > Target: i686--linux-android > Thread model: posix > InstalledDir: /Users/asferro/clang/clang+llvm-6.0.0-x86_64-apple-darwin/bin > /usr/bin/ld and this is the output of `$CC --print-prog-name=ld` -v > `/Users/asferro/clang/clang+llvm-6.0.0-x86_64-apple-darwin/bin/clang -std=gnu99 --target=i686-linux-android --print-prog-name=ld` -v > @(#)PROGRAM:ld PROJECT:ld64-409.12 > BUILD 02:04:28 Aug 14 2018 > configured to support archs: armv6 armv7 armv7s arm64 i386 x86_64 x86_64h armv6m armv7k armv7m armv7em arm64e arm64_32 > LTO support using: LLVM version 10.0.0, (clang-1000.11.45.2) (static support for 21, runtime is 21) > TAPI support using: Apple TAPI version 10.0.0 (tapi-1000.11.8.2) Thanks for looking at this!
Flags: needinfo?(agi)
Assignee | ||
Comment 15•6 years ago
|
||
So, what's really interesting is that I get GNU gold and everybody else gets some system linker. GNU gold sets the GCC_USE_GNU_LD. I think I get this 'cuz the build-tools are on my $PATH. I'm trying a build with them not on my path and we'll see if I can reproduce the failure.
Comment 16•6 years ago
|
||
If I put my android sdk folder in the path > export PATH=$PATH:$HOME/.mozbuild/android-sdk-macosx/build-tools/28.0.2/ I get the right linker! > $CC --print-prog-name=ld > /Users/asferro/.mozbuild/android-sdk-macosx/build-tools/28.0.2/i686-linux-android-ld
Assignee | ||
Comment 17•6 years ago
|
||
(In reply to Nick Alexander :nalexander [he/him] from comment #15) > So, what's really interesting is that I get GNU gold and everybody else gets > some system linker. GNU gold sets the GCC_USE_GNU_LD. I think I get this > 'cuz the build-tools are on my $PATH. I'm trying a build with them not on > my path and we'll see if I can reproduce the failure. I can reproduce the failure! So the issue is definitely that in some way we're invoking a macOS system linker, which is not the specified linker, since I have --enable-linker=lld. I notice that freebl is built from gyp definitions; is it possible that our conversion from gyp isn't doing the right thing vis-a-vis linker definitions? I will try to figure out how to force the linker that our clang toolchain finds to be a more sensible option.
Comment 18•6 years ago
|
||
You may want to completely remove the garbage that we have for determining GCC_USE_GNU_LD and instead add the necessary logic to select_linker in toolchain.configure, which has a more complete picture of what toolchain we're using.
Flags: needinfo?(nfroyd)
Comment 19•6 years ago
|
||
(select_linker also adds in Android-specific flags to pick the correct tools--as/ld/etc.--which may address the situation of people not having things on their PATH)
Assignee | ||
Comment 20•6 years ago
|
||
The failing invocation is: 0:07.07 security/nss/lib/freebl/libfreebl3.so 0:07.07 rm -f libfreebl3.so 0:07.07 /usr/local/Cellar/icecream/1.1/libexec/icecc/bin/clang -std=gnu99 --target=i686-linux-android -Qunused-arguments -isystem /Users/nalexander/.mozbuild/android-ndk-r15c/platforms/android-16/arch-x86/usr/include -isystem /Users/nalexander/.mozbuild/android-ndk-r15c/platforms/android-16/arch-x86/usr/include -gcc-toolchain /Users/nalexander/.mozbuild/android-ndk-r15c/toolchains/x86-4.9/prebuilt/darwin-x86_64 -D__ANDROID_API__=16 -fno-short-enums -fno-exceptions -fno-strict-aliasing -fno-math-errno -pipe -g -Oz -fno-omit-frame-pointer -funwind-tables -Qunused-arguments -Wall -Wempty-body -Wignored-qualifiers -Wpointer-arith -Wshadow-field-in-constructor-modified -Wsign-compare -Wtype-limits -Wunreachable-code -Wunreachable-code-return -Wclass-varargs -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wloop-analysis -Wstring-conversion -Wtautological-overlap-compare -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wformat -Wformat-security -Wno-gnu-zero-variadic-macro-arguments -fPIC -shared -Wl,-h,libfreebl3.so -o libfreebl3.so /Users/nalexander/Mozilla/objdirs/objdir-droid-icecc3/security/nss/lib/freebl/freebl_freebl3/libfreebl3_so.list -L/Users/nalexander/.mozbuild/android-ndk-r15c/platforms/android-16/arch-x86/usr/lib -Wl,-rpath-link=/Users/nalexander/.mozbuild/android-ndk-r15c/platforms/android-16/arch-x86/usr/lib --sysroot=/Users/nalexander/.mozbuild/android-ndk-r15c/platforms/android-16/arch-x86 -Wl,--allow-shlib-undefined -gcc-toolchain /Users/nalexander/.mozbuild/android-ndk-r15c/toolchains/x86-4.9/prebuilt/darwin-x86_64 -fuse-ld=lld -Wl,-z,noexecstack -Wl,-z,text -Wl,-z,relro -Wl,-z,nocopyreloc -Wl,--build-id=sha1 -Wl,--hash-style=sysv -Wl,-rpath-link,/Users/nalexander/Mozilla/objdirs/objdir-droid-icecc3/dist/bin -Wl,-rpath-link,/usr/local/lib ../../../../libnss3.so Just FYI, icecream doesn't seem to make any difference -- I can succeed with and without it, and I can fail with it -- so I've been using it, 'cuz it's so much faster.
Updated•6 years ago
|
Status: RESOLVED → REOPENED
Resolution: DUPLICATE → ---
Assignee | ||
Comment 21•6 years ago
|
||
The successful invocation is: security/nss/lib/freebl/libfreebl3.so rm -f libfreebl3.so /usr/local/Cellar/icecream/1.1/libexec/icecc/bin/clang -std=gnu99 --target=i686-linux-android -Qunused-arguments -isystem /Users/nalexander/.mozbuild/android-ndk-r15c/platforms/android-16/arch-x86/usr/include -isystem /Users/nalexander/.mozbuild/android-ndk-r15c/platforms/android-16/arch-x86/usr/include -gcc-toolchain /Users/nalexander/.mozbuild/android-ndk-r15c/toolchains/x86-4.9/prebuilt/darwin-x86_64 -D__ANDROID_API__=16 -fno-short-enums -fno-exceptions -fno-strict-aliasing -fno-math-errno -pipe -g -Oz -fno-omit-frame-pointer -funwind-tables -Qunused-arguments -Wall -Wempty-body -Wignored-qualifiers -Wpointer-arith -Wshadow-field-in-constructor-modified -Wsign-compare -Wtype-limits -Wunreachable-code -Wunreachable-code-return -Wclass-varargs -Wfloat-overflow-conversion -Wfloat-zero-conversion -Wloop-analysis -Wstring-conversion -Wtautological-overlap-compare -Wno-error=deprecated-declarations -Wno-error=array-bounds -Wformat -Wformat-security -Wno-gnu-zero-variadic-macro-arguments -fPIC -shared -Wl,-z,defs -Wl,-h,libfreebl3.so -o libfreebl3.so /Users/nalexander/Mozilla/objdirs/objdir-droid-icecc2/security/nss/lib/freebl/freebl_freebl3/libfreebl3_so.list -L/Users/nalexander/.mozbuild/android-ndk-r15c/platforms/android-16/arch-x86/usr/lib -Wl,-rpath-link=/Users/nalexander/.mozbuild/android-ndk-r15c/platforms/android-16/arch-x86/usr/lib --sysroot=/Users/nalexander/.mozbuild/android-ndk-r15c/platforms/android-16/arch-x86 -Wl,--allow-shlib-undefined -gcc-toolchain /Users/nalexander/.mozbuild/android-ndk-r15c/toolchains/x86-4.9/prebuilt/darwin-x86_64 -fuse-ld=lld -Wl,-z,noexecstack -Wl,-z,text -Wl,-z,relro -Wl,-z,nocopyreloc -Wl,--build-id=sha1 -Wl,--hash-style=sysv -Wl,-rpath-link,/Users/nalexander/Mozilla/objdirs/objdir-droid-icecc2/dist/bin -Wl,-rpath-link,/usr/local/lib ../../../../libnss3.so -Wl,--version-script,out.freebl.def The changes are an additional -Wl,-z,defs and -Wl,--version-script,out.freebl.def on success. If I understand correctly -Wl,-z,defs just complains about undefined functions, so I doubt there's anything significant there.
Assignee | ||
Comment 22•6 years ago
|
||
The mpi_x86.s compilation invocation is identical. The resulting mpi_x86.o files differ by a single byte: they have the objdir encoded in them, which has a single character difference.
Assignee | ||
Comment 23•6 years ago
|
||
This is a cheap way to address a simple problem that has a complicated ideal solution. The underlying issue is that in some situations, when targeting Android, a macOS system `ld` is interrogated to determine if a cross-compiling linker "is GNU ld" and a particular linker feature is set in that situation. The macOS system `ld` doesn't pass the "is GNU ld", and the linker feature isn't set; that causes link failures, even though the actual linker has nothing to do with the system `ld`. The ideal solution is to test for linker capabilities dynamically. We do a lot of that in old-configure.in, and we don't do any of that in toolchain.configure. Rather than start testing in toolchain.configure, we hard-code: a cheap solution to the immediate problem.
Updated•6 years ago
|
Attachment #9016467 -
Attachment description: Bug 1489443 - Use -Wl,--version-script based on linker kind. r?froydnj → Bug 1489443 - Set GCC_USE_GNU_LD based on linker kind.
Comment 24•6 years ago
|
||
Pushed by nalexander@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/91300d29898b Set GCC_USE_GNU_LD based on linker kind. r=froydnj
Comment 25•6 years ago
|
||
Backed out for MinGW build bustages. Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=success,testfailed,busted,exception&classifiedState=unclassified&fromchange=b8b3597d7702e4e6d7c3358f19384b49d8c220c0&tochange=a06d50af52573db2ac6265ee169fc8039fbf8b04&searchStr=mingw,all&selectedJob=205190773&group_state=expanded Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=205190773&repo=autoland&lineNumber=41408 Backout link: https://hg.mozilla.org/integration/autoland/rev/a06d50af52573db2ac6265ee169fc8039fbf8b04
Flags: needinfo?(nalexander)
Assignee | ||
Comment 26•6 years ago
|
||
I thought MinGW was tier 2. I guess I said "back me out if that's the right call on IRC", but was it the right call? In any case: tjr, can you explain what the correct fix here is for mingw? This patch changes linkers that are "command line compatible with GNU ld (bfd)" to actually behave like that, even if they don't say "GNU" in their version information. Clearly whatever is building MinGW has a linker.KIND that is GNU ld compatible, but is not in fact GNU ld compatible.
Flags: needinfo?(tom)
Comment 27•6 years ago
|
||
(In reply to Nick Alexander :nalexander [he/him] from comment #26) > In any case: tjr, can you explain what the correct fix here is for mingw? > This patch changes linkers that are "command line compatible with GNU ld > (bfd)" to actually behave like that, even if they don't say "GNU" in their > version information. Clearly whatever is building MinGW has a linker.KIND > that is GNU ld compatible, but is not in fact GNU ld compatible. It's the opposite: we are ld compatible, but this isn't detecting it as such. We had several NSS patches that added mingw support to NSS, and they all altered the build scripts to detect MinGW as "OS == win and cc_use_gnu_ld == 1". You can see comments to this effect in the gyp files: https://searchfox.org/mozilla-central/search?q=mingw&case=false®exp=false&path=*.gyp Your patch stops setting GCC_USE_GNU_LD for MinGW. The problem is that select_linker is never run because is_linker_option_enabled is false. (Because it detects the target kernel as WINNT.) Since select_linker is never run, we don't have a KIND, so we don't set GCC_USE_GNU_LD and we fail all over the place. In theory the fix is simple: > -@depends(target) > -def is_linker_option_enabled(target): > +@depends(target, c_compiler) > +def is_linker_option_enabled(target, compiler): > if target.kernel not in ('WINNT', 'SunOS'): > return True > + if target.kernel == 'WINNT' and compiler.type == 'clang': > + return True In practice this doesn't work: > /builds/worker/workspace/build/src/sccache2/sccache /builds/worker/workspace/build/src/clang/bin/x86_64-w64-mingw32-clang -std=gnu99 -Wl,--version > lld-link: error: undefined symbol: WinMain > >>> referenced by libmingw32.a(lib64_libmingw32_a-crt0_c.o):(main) > clang-7: error: linker command failed with exit code 1 (use -v to see invocation) But we can do something dirty.... > @@ -1768,6 +1770,12 @@ def select_linker(linker, c_compiler, de > if retcode == 1 and 'Logging ld64 options' in stderr: > kind = 'ld64' > > + # MinGW can't just print you out a version without giving it a lot of > + # extra stuff, but we can detect it and just fudge our way through: > + if retcode == 1 and 'referenced by libmingw32.a' in stderr and \ > + 'lld-link: error' in stderr: > + kind = 'lld' > + Here's a try run, hopefully it works: https://treeherder.mozilla.org/#/jobs?repo=try&revision=cc34ededc45166ecd8e409694d86c5491d4c06a7
Flags: needinfo?(tom)
Comment 28•6 years ago
|
||
(In reply to Tom Ritter [:tjr] from comment #27) > Here's a try run, hopefully it works: > https://treeherder.mozilla.org/#/ > jobs?repo=try&revision=cc34ededc45166ecd8e409694d86c5491d4c06a7 Try 2: https://treeherder.mozilla.org/#/jobs?repo=try&revision=cc60af1c227ff591922802ed3b1294131b3ec008
Assignee | ||
Comment 29•6 years ago
|
||
The desired outcome of this change is that we'll set `-Wl,--version-script` based on linker kind and not on the output of `$LINKER -v`. This is a cheap way to address a simple problem that has a complicated ideal solution. The underlying issue is that in some situations, when targeting Android, a macOS system `ld` is interrogated to determine if a cross-compiling linker "is GNU ld" and a particular linker feature is set in that situation. The macOS system `ld` doesn't pass the "is GNU ld" test, and the linker feature isn't set; that causes link failures, even though the actual linker has nothing to do with the system `ld`. The ideal solution is to test for linker capabilities dynamically. We do a lot of that in old-configure.in, and we don't do any of that in toolchain.configure. Rather than start testing in toolchain.configure, we hard-code: a cheap solution to the immediate problem. MinGW suffers somewhat from the opposite problem: the linker "is GNU ld" (compatible), but the checks aren't happening at all. We enable `select_linker` for MinGW to follow the flow here as well.
Comment 30•6 years ago
|
||
It would be better if you reopened D8471 and updated it, so that interdiff works.
Assignee | ||
Updated•6 years ago
|
Attachment #9017396 -
Attachment is obsolete: true
Flags: needinfo?(nalexander)
Assignee | ||
Comment 31•6 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #30) > It would be better if you reopened D8471 and updated it, so that interdiff > works. I have done this. Mike, something demands `--help`, and it looks like actually making that happen infects many, many layers of meta-programming in `toolchain.configure`. Can you provide guidance, preferably in the form of a patch that makes the configure lint checks pass?
Assignee | ||
Updated•6 years ago
|
Assignee: nobody → nalexander
Comment 32•6 years ago
|
||
Yeah... it's better to not have any indirect dependency on c_compiler in a `when`. Keeping that would mean configure --help would run the compiler checks... (which is why the --help dependencies are explicit: to make it apparent what bad things are going to happen if you do) Your best bet is to change the definition of GCC_USE_GNU_LD itself to be select_linker.KIND in ('bfd', 'gold', 'lld') or (target.kernel == 'WINNT' and compiler.type == 'clang') or None
Assignee | ||
Comment 33•6 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #32) > Yeah... it's better to not have any indirect dependency on c_compiler in a > `when`. Keeping that would mean configure --help would run the compiler > checks... (which is why the --help dependencies are explicit: to make it > apparent what bad things are going to happen if you do) I did not know this. Thanks. > Your best bet is to change the definition of GCC_USE_GNU_LD itself to be > select_linker.KIND in ('bfd', 'gold', 'lld') or (target.kernel == 'WINNT' > and compiler.type == 'clang') or None I think that tjr explains why this doesn't work in https://bugzilla.mozilla.org/show_bug.cgi?id=1489443#c27: specifically, select_linker isn't invoked, for $REASONS. Any other approach?
Flags: needinfo?(mh+mozilla)
Comment 34•6 years ago
|
||
I doesn't matter that select_linker is not invoked, since you'd be basing the choise on select_linker *or* target/compiler.
Flags: needinfo?(mh+mozilla)
Assignee | ||
Comment 35•6 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #34) > I doesn't matter that select_linker is not invoked, since you'd be basing > the choise on select_linker *or* target/compiler. This works well enough to reland, I think: https://treeherder.mozilla.org/#/jobs?repo=try&revision=77ba5c8dfea5a928e9a636d2f145d2c819b35f35 Thanks Mike, thanks tjr.
Comment 36•6 years ago
|
||
Pushed by nalexander@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/63248a593eae Set GCC_USE_GNU_LD based on linker kind. r=froydnj
Comment 37•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/63248a593eae
Status: REOPENED → RESOLVED
Closed: 6 years ago → 6 years ago
status-firefox64:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 64
Updated•5 years ago
|
Product: Firefox for Android → Firefox Build System
Target Milestone: Firefox 64 → mozilla64
You need to log in
before you can comment on or make changes to this bug.
Description
•