Closed Bug 1690353 Opened 5 years ago Closed 5 years ago

Linker issue when building with MOZILLA_OFFICIAL

Categories

(Firefox Build System :: General, defect, P2)

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: agi, Assigned: mhentges)

Details

Attachments

(3 files)

I'm trying to generate a MOZILLA_OFFICIAL build for some testing. I have this mozconfig:

ac_add_options --enable-application=mobile/android
ac_add_options --target=arm-linux-androideabi

# Mozilla Official
ac_add_options --with-branding=mobile/android/branding/nightly
export FENNEC_NIGHTLY=1
export MOZILLA_OFFICIAL=1
ac_add_options --with-android-min-sdk=16

mk_add_options MOZ_OBJDIR=./objdir-arm

ac_add_options --enable-crashreporter
ac_add_options --with-macos-sdk=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX.sdk

# ICECC
ac_add_options --with-compiler-wrapper="/usr/local/bin/icecc"
mk_add_options MOZ_MAKE_FLAGS="-j40"

# Rust cc library doesn't work well without this
export HOST_CC="/Users/asferro/.mozbuild/clang/bin/clang --target=x86_64-apple-darwin19.5.0"
export HOST_CXX="/Users/asferro/.mozbuild/clang/bin/clang --target=x86_64-apple-darwin19.5.0"

If I build with the above, mach tries to use my local clang instead of the one in .mozbuild for some reason (MOZILLA_OFFICIAL somehow triggers this behavior; with the flag removed, Gecko compiles fine), so I added:

export CC="/Users/asferro/.mozbuild/clang/bin/clang"
export CXX="/Users/asferro/.mozbuild/clang/bin/clang"
export NASM="/Users/asferro/.mozbuild/nasm/nasm"
ac_add_options --enable-linker=lld

And now I get linker errors related to -lm:

 0:21.42 /Users/asferro/workspace/mozilla-central/objdir-arm/_virtualenvs/init_py3/bin/python -m mozbuild.action.check_binary --target libmodules-test.so
 0:21.42 ld.lld: error: undefined symbol: round
 0:21.42 >>> referenced by Replay.cpp:631 (/Users/asferro/workspace/mozilla-central/memory/replace/logalloc/replay/Replay.cpp:631)
 0:21.42 >>>               /Users/asferro/workspace/mozilla-central/objdir-arm/memory/replace/logalloc/replay/Replay.o:(Replay::jemalloc_stats(Buffer&, Buffer&))
 0:21.42 >>> referenced by Replay.cpp:631 (/Users/asferro/workspace/mozilla-central/memory/replace/logalloc/replay/Replay.cpp:631)
 0:21.42 >>>               /Users/asferro/workspace/mozilla-central/objdir-arm/memory/replace/logalloc/replay/Replay.o:(Replay::jemalloc_stats(Buffer&, Buffer&))
 0:21.42 >>> referenced by Replay.cpp:631 (/Users/asferro/workspace/mozilla-central/memory/replace/logalloc/replay/Replay.cpp:631)
 0:21.42 >>>               /Users/asferro/workspace/mozilla-central/objdir-arm/memory/replace/logalloc/replay/Replay.o:(Replay::jemalloc_stats(Buffer&, Buffer&))
 0:21.42 >>> referenced 6 more times
 0:21.42 ld.lld: error: undefined symbol: ceil
 0:21.42 >>> referenced by FdPrintf.cpp:71 (/Users/asferro/workspace/mozilla-central/memory/replace/logalloc/FdPrintf.cpp:71)
 0:21.42 >>>               /Users/asferro/workspace/mozilla-central/objdir-arm/memory/replace/logalloc/replay/../FdPrintf.o:(FdPrintf(int, char const*, ...))
 0:21.42 ld.lld: error: undefined symbol: pow
 0:21.43 >>> referenced by math.h:1001 (/Users/asferro/.mozbuild/android-ndk-r20/sources/cxx-stl/llvm-libc++/include/math.h:1001)
 0:21.43 >>>               /Users/asferro/workspace/mozilla-central/objdir-arm/memory/replace/logalloc/replay/../FdPrintf.o:(FdPrintf(int, char const*, ...))
 0:21.43 clang-11: error: linker command failed with exit code 1 (use -v to see invocation)
 0:21.43 make[4]: *** [../../../../dist/bin/logalloc-replay] Error 1
 0:21.43 make[3]: *** [memory/replace/logalloc/replay/target] Error 2
 0:21.43 make[3]: *** Waiting for unfinished jobs....
Assignee: nobody → mhentges
Status: NEW → ASSIGNED
Priority: -- → P2
Attached patch agi.patch

Heh, we currently don't distinguish between [developer machine vs CI/downstream machine] and [debug build/release build].

I've attached a little test patch here. If you apply it, you can do a release build while allowing ~/.mozbuild tools to be used by adjusting your mozconfig:

export MOZILLA_OFFICIAL=1
export MOZ_DEVELOPER_MACHINE=1

@Agi, can you give that a shot to see if that's sufficient to reproduce your Firefox bug?

Flags: needinfo?(agi)

Thanks for looking at this!

I get further into the build but it fails here:

 9:22.63 Warning: $SRCDIR/mobile/android/installer/package-manifest.in:23: Missing file(s): bin/dictionaries/*
 9:22.64 Warning: $SRCDIR/mobile/android/installer/package-manifest.in:82: Missing file(s): bin/package-name.txt
 9:22.64 Warning: $SRCDIR/mobile/android/installer/package-manifest.in:94: Missing file(s): bin/components/components.manifest
 9:22.69 Warning: $SRCDIR/mobile/android/installer/package-manifest.in:129: Missing file(s): bin/features/*
 9:22.72 Warning: $SRCDIR/mobile/android/installer/package-manifest.in:144: Missing file(s): bin/defaults/pref/channel-prefs.js
 9:22.72 Warning: $SRCDIR/mobile/android/installer/package-manifest.in:200: Missing file(s): bin/crashreporter-override.ini
 9:23.97 ../../../dist/geckoview/lib/x86_64/liblgpllibs.so: Couldn't find .bss. Skipping
 9:23.98 ../../../dist/geckoview/lib/x86_64/libmozavutil.so: Can't grow .dynamic section to set DT_INIT. Skipping
 9:24.00 ../../../dist/geckoview/lib/x86_64/libmozavcodec.so: Can't grow .dynamic section to set DT_INIT. Skipping
 9:25.89 ../../../dist/geckoview/lib/x86_64/libxul.so: Reduced by 10006272 bytes
 9:25.93 ../../../dist/geckoview/lib/x86_64/libnssckbi.so: terminate called after throwing an instance of 'std::runtime_error'
 9:25.93   what():  Segments overlap
 9:26.05 Traceback (most recent call last):
 9:26.05   File "/home/agi/workspace/mozilla-central/toolkit/mozapps/installer/packager.py", line 300, in <module>
 9:26.05     main()
 9:26.05   File "/home/agi/workspace/mozilla-central/toolkit/mozapps/installer/packager.py", line 295, in main
 9:26.05     copier.copy(args.destination)
 9:26.05   File "/home/agi/workspace/mozilla-central/python/mozbuild/mozpack/copier.py", line 434, in copy
 9:26.05     copy_results.append((destfile, f.copy(destfile, skip_if_older)))
 9:26.05   File "/home/agi/workspace/mozilla-central/python/mozbuild/mozpack/files.py", line 343, in copy
 9:26.05     elfhack(dest)
 9:26.05   File "/home/agi/workspace/mozilla-central/python/mozbuild/mozpack/executables.py", line 136, in elfhack
 9:26.05     errors.fatal("Error executing " + " ".join(cmd))
 9:26.05   File "/home/agi/workspace/mozilla-central/python/mozbuild/mozpack/errors.py", line 104, in fatal
 9:26.05     self._handle(self.FATAL, msg)
 9:26.05   File "/home/agi/workspace/mozilla-central/python/mozbuild/mozpack/errors.py", line 99, in _handle
 9:26.05     raise ErrorMessage(msg)
 9:26.05 mozpack.errors.ErrorMessage: Error: Error executing /home/agi/workspace/mozilla-central/objdir-release/build/unix/elfhack/elfhack ../../../dist/geckoview/lib/x86_64/libnssckbi.so
 9:26.06 /home/agi/workspace/mozilla-central/toolkit/mozapps/installer/packager.mk:25: recipe for target 'stage-package' failed
 9:26.06 make[4]: *** [stage-package] Error 1
 9:26.06 /home/agi/workspace/mozilla-central/mobile/android/build.mk:14: recipe for target 'stage-package' failed
 9:26.06 make[3]: *** [stage-package] Error 2
 9:26.07 /home/agi/workspace/mozilla-central/config/recurse.mk:32: recipe for target 'android-stage-package' failed
 9:26.07 make[2]: *** [android-stage-package] Error 2
 9:26.07 /home/agi/workspace/mozilla-central/config/rules.mk:355: recipe for target 'default' failed
 9:26.07 make[1]: *** [default] Error 2
 9:26.07 client.mk:89: recipe for target 'build' failed
 9:26.07 make: *** [build] Error 2
 9:26.07 271 compiler warnings present.
Flags: needinfo?(agi)

The above was on my Linux machine. I'm now trying on my Mac, which is where I had this problem in the first place.

I just tried on my mac and with the patch in Comment 1 I can successfully build a MOZILLA_OFFICIAL build! The failure on linux might be unrelated?

Can you try --enable-bootstrap instead? (without the patch)

Thanks for looking at this!

Thanks for the good vibes 😎

I get further into the build but it fails here:

On your Linux machine, would you mind updating to the newest central, doing a fresh ./mach bootstrap, and trying again?
If it still fails, would you mind sending me:

  • Your mozconfig
  • The contents of $objdir/config.status

I did a local Android build here on Linux, and it successfully built, so I'm hoping that the above steps will track down our discrepancy or resolve the failure.

Flags: needinfo?(agi)
Attached file config.status

Just tried, I get the same error.

This is my mozconfig:

ac_add_options --enable-application=mobile/android
ac_add_options --target=x86_64

# Mozilla Official
ac_add_options --with-branding=mobile/android/branding/nightly
export FENNEC_NIGHTLY=1
export MOZILLA_OFFICIAL=1
ac_add_options --with-android-min-sdk=16
ac_add_options --enable-crashreporter
export MOZ_DEVELOPER_MACHINE=1

mk_add_options MOZ_OBJDIR=./objdir-release
Flags: needinfo?(agi)

(In reply to Mike Hommey [:glandium] from comment #5)

Can you try --enable-bootstrap instead? (without the patch)

 0:01.78 mozbuild.configure.options.InvalidOptionError: --enable-bootstrap is not available in this configuration

(In reply to Agi Sferro | :agi | ni? for questions | ⏰ PST | he/him from comment #8)

(In reply to Mike Hommey [:glandium] from comment #5)

Can you try --enable-bootstrap instead? (without the patch)

 0:01.78 mozbuild.configure.options.InvalidOptionError: --enable-bootstrap is not available in this configuration

You're not on a recent central, are you?

Erf, the change I was thinking of hadn't landed yet. Can you try applying the patch from bug 1690712 and retry with --enable-bootstrap?

I've got a local repro here with MOZ_DEVELOPER_MACHINE, digging in now.

Update: minimal mozconfig to repro the failure:

ac_add_options --enable-application=mobile/android
ac_add_options --target=x86_64
export MOZILLA_OFFICIAL=1
export MOZ_DEVELOPER_MACHINE=1
Attached patch agi.patch

@Agi two things:

  1. That failure was due to elfhack running on the local build. I don't know specifically why it failed, but I can imagine it not expecting developer-machine-specific results. Either way, you can try a build with my new patch here.
  2. The other day in the #build channel, you mentioned:

👋 team! is there a way to generate a MOZILLA_OFFICIAL build on try? I have a bug that reproduces only on the nightly build but not on a try (or local) build

I just did a try job here, and looking at the logs, we can see that they have MOZILLA_OFFICIAL=1. (If you look at their raw logs and search for MOZILLA_OFFICIAL=1, you'll find the section where the mozconfig is printed.) So, you should be able to do ./mach try fuzzy and pick build-android-x86_64/debug or build-android-x86_64/opt, and hopefully that will give you the build that you want.
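The log check described above can also be scripted. This is only a minimal sketch, assuming the raw log of the try build task has already been saved to a local file; the helper name log_has_official and the file name try-build.log are hypothetical and do not come from the build system.

```shell
# Hypothetical helper: succeeds (exit status 0) when the saved raw log
# contains the line showing that MOZILLA_OFFICIAL was set for the build.
log_has_official() {
  # -q: print nothing, only set the exit status
  # -F: treat the pattern as a fixed string, not a regular expression
  grep -qF 'MOZILLA_OFFICIAL=1' "$1"
}

# Example usage (try-build.log is an assumed local file name):
#   if log_has_official try-build.log; then
#     echo "this try build was configured with MOZILLA_OFFICIAL=1"
#   fi
```

To see the surrounding mozconfig section rather than just a yes/no answer, running grep -n with the same pattern prints the matching line numbers to jump to.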

Would you mind trying either/both of these options to see if you can reproduce your bug you're tracking down?

Flags: needinfo?(agi)

Thank you so much :mhentges! with your patch I'm able to successfully build on linux too.

I just did a try job here, and looking at the logs, we can see that they have MOZILLA_OFFICIAL=1. (If you look at their raw logs and search for MOZILLA_OFFICIAL=1, you'll find the section where the mozconfig is printed.) So, you should be able to do ./mach try fuzzy and pick build-android-x86_64/debug or build-android-x86_64/opt, and hopefully that will give you the build that you want.

Yeah this worked for me! It would be nice to have a way to build locally too so I don't have to wait 3 hours for each change but yeah :)

(In reply to Mike Hommey [:glandium] from comment #10)

Erf, the change I was thinking of hadn't landed yet. Can you try applying the patch from bug 1690712 and retry with --enable-bootstrap?

Sure, I'll try that now!

Flags: needinfo?(agi)

(In reply to Mike Hommey [:glandium] from comment #10)

Erf, the change I was thinking of hadn't landed yet. Can you try applying the patch from bug 1690712 and retry with --enable-bootstrap?

Sure, I'll try that now!

This works too!

Sweet, that's good news, thanks Agi!

Heh, we currently don't distinguish between [developer machine vs CI/downstream machine] and [debug build/release build].

I'm going to leave this ticket open until our next Build Peers meeting where we decide how we'd like to expose that^ behaviour.

Flags: needinfo?(mhentges)

I'm going to call this closed here.
Had some discussion with Glandium about adjusting how we decide whether or not to use tools from ~/.mozbuild.
From central, it'll be possible to explicitly opt-in with the --enable-bootstrap argument, which is good.
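
For reference, a mozconfig along these lines is roughly what that explicit opt-in looks like. It is a sketch assembled from the minimal mozconfig earlier in this bug plus the --enable-bootstrap option Agi confirmed above; it assumes a tree where that option is available (it was rejected with InvalidOptionError on the older tree used earlier in this bug), and other builds will likely need different --enable-application and --target values.

```shell
# Sketch of a release-style (MOZILLA_OFFICIAL) build on a developer machine.
# Application and target are copied from the minimal repro mozconfig above.
ac_add_options --enable-application=mobile/android
ac_add_options --target=x86_64

# Release-build flag discussed in this bug.
export MOZILLA_OFFICIAL=1

# Explicitly opt in to the toolchains under ~/.mozbuild, which are
# otherwise not picked up when MOZILLA_OFFICIAL is set.
ac_add_options --enable-bootstrap
```

This replaces the MOZ_DEVELOPER_MACHINE variable from the test patches, which was never landed.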

IMO, I'd rather make using ~/.mozbuild the default and have CI and distro builds explicitly opt out, instead of the current situation where you sometimes (e.g. when doing release builds) have to opt in. That would improve usability for developers.
However, as Glandium mentioned in the meeting, developers will generally be in situations where they're implicitly opted in, and even then, we can improve the manual opt-in situation with documentation. Besides, switching from opt-in to opt-out would be a breaking change, and there's a nontrivial cost to communicating that change downstream.

Anyways, closing this issue.

Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Flags: needinfo?(mhentges)
Resolution: --- → INVALID