Closed Bug 1336471 Opened 7 years ago Closed 6 years ago

consider using the NDK's make-standalone-toolchain.sh to setup CC et al rather than hoops we go through now

Categories

(Firefox Build System :: General, defect)

All
Android
defect
Not set
normal

Tracking

(firefox54 affected)

RESOLVED WONTFIX
Tracking Status
firefox54 --- affected

People

(Reporter: froydnj, Unassigned)

Details

I think we should seriously consider using the NDK's make-standalone-toolchain.sh to set up CC/CXX.  There are several reasons for this:

1. Building Rust programs in a cross configuration requires passing the "linker" (i.e. gcc) into Cargo via an environment variable (bug 1329737).  The "linker", however, needs to know where various sysroot-y things are and our current setup requires passing in flags:

http://dxr.mozilla.org/mozilla-central/source/build/autoconf/android.m4#22

to make this happen.  Cargo has no facilities for passing in additional flags in this case; acrichto suggests that we could hack in RUSTFLAGS='-C link-args=...' (bug 1329737 comment 6), but, um, that's a bit gross.  Having CC/CXX automatically know where to find include files, libraries, etc. would be much more convenient.

2. As a nice side effect, this would make our Android compiler more like our desktop compilers: we don't have to pass in all sorts of wild options to specify include paths and such.

3. We would like to move to clang for our compiler for Fennec, or at least make it an option for people so they can enjoy better error messages and/or so we can stand up a taskcluster clang-android build job so we can ensure that things don't break until we make the switch.  Making clang the default is problematic right now because building Fennec with clang incurs a large space penalty (~10%) for libxul compared to GCC: https://github.com/android-ndk/ndk/issues/133

In any event, getting configure to support clang with our current setup is difficult: several options have slightly different meanings than GCC, clang requires different options, etc.  The one thing that I don't think is easily resolvable is:

http://dxr.mozilla.org/mozilla-central/source/build/moz.configure/android-ndk.configure#91

The NDK's clang doesn't want you to use -idirafter; it wants you to use something else for silly reasons.  But we don't know which flags to pick at that point because we don't know the compiler we're using, and we can't determine the compiler we're using until we have those flags.  Classic.  (Maybe we could unravel this knot with a bit of hacking, but we'd still face issue #1.)  Using the NDK script to produce a `clang` for CC means that we wouldn't have to deal with that.

4. There are probably other good reasons (maybe it makes life simpler for IDEs?) that I have not thought of.

How would this actually happen?

This is the handwavy part.  I don't think we want this to happen as part of configure, since the whole point of this is to set things up for configure.  But setting it up prior to configure means that something outside of configure/mozconfigs needs to know details about what we're building, which does not seem so nice.  This would be bootstrap in the case of local builds, I think, and some mozharness/taskcluster config thing in the case of automation.  The latter is not so bad, but sticking details like that into bootstrap leaves an unpleasant taste in my mouth.  Maybe we're OK with that, though.

*Possibly* for local builds, we could ask people to manually run the make-standalone-toolchain script, on the theory that not many people actually develop the C++ side of Android and a little pain is OK in that case.  But that does not seem very friendly.

Anyway, thoughts, Nick and Mike?  (And anybody else who wants to chime in?)
Flags: needinfo?(nalexander)
Flags: needinfo?(mh+mozilla)
NI to snorp, who has more compiler toolchain experience than me.

(In reply to Nathan Froyd [:froydnj] from comment #0)
> I think we should seriously consider using the NDK's
> make-standalone-toolchain.sh to set up CC/CXX.  There are several reasons
> for this:
> 
> 1. Building Rust programs in a cross configuration requires passing the
> "linker" (i.e. gcc) into Cargo via an environment variable (bug 1329737). 
> The "linker", however, needs to know where various sysroot-y things are and
> our current setup requires passing in flags:
> 
> http://dxr.mozilla.org/mozilla-central/source/build/autoconf/android.m4#22
> 
> to make this happen.  Cargo has no facilities for passing in additional
> flags in this case; acrichto suggests that we could hack in RUSTFLAGS='-C
> link-args=...' (bug 1329737 comment 6), but, um, that's a bit gross.  Having
> CC/CXX automatically know where to find include files, libraries, etc. would
> be much more convenient.

An alternative: yet another layer of indirection.  Encode the flags into $TOPOBJDIR/cc{x} and roll on.  I'm sure we've done this in other places; I haven't a clue if it's a good idea or a bad idea.

> 2. As a nice side effect, this would make our Android compiler more like our
> desktop compilers: we don't have to pass in all sorts of wild options to
> specify include paths and such.

This would be nice.

> 3. We would like to move to clang for our compiler for Fennec, or at least
> make it an option for people so they can enjoy better error messages and/or
> so we can stand up a taskcluster clang-android build job so we can ensure
> that things don't break until we make the switch.  Making clang the default
> is problematic right now because building Fennec with clang incurs a large
> space penalty (~10%) for libxul compared to GCC:
> https://github.com/android-ndk/ndk/issues/133
> 
> In any event, getting configure to support clang with our current setup is
> difficult: several options have slightly different meanings than GCC, clang
> requires different options, etc.  The one thing that I don't think is easily
> resolvable is:
> 
> http://dxr.mozilla.org/mozilla-central/source/build/moz.configure/android-
> ndk.configure#91
> 
> The NDK's clang doesn't want you to use -idirafter; it wants you to use
> something else for silly reasons.  But we don't know which flags to pick at
> that point because we don't know the compiler we're using, and we can't
> determine the compiler we're using until we have those flags.  Classic. 
> (Maybe we could unravel this knot with a bit of hacking, but we'd still face
> issue #1.)  Using the NDK script to produce a `clang` for CC means that we
> wouldn't have to deal with that.
> 
> 4. There are probably other good reasons (maybe it makes life simpler for
> IDEs?) that I have not thought of.

There are some small wins using clang and the integrated JDK/NDK debugger in AS, but I don't think anybody in the Android platform group actually does this -- I think they all use jimdb.  (I'd like to see more hybrid debugger use, since it's pretty awesome, but that's not a good reason to do or not do the work in this ticket.)

> How would this actually happen?
> 
> This is the handwavy part.  I don't think we want this to happen as part of
> configure, since the whole point of this is to set things up for configure. 
> But setting it up prior to configure means that something outside of
> configure/mozconfigs needs to know details about what we're building, which
> does not seem so nice.  This would be bootstrap in the case of local builds,
> I think, and some mozharness/taskcluster config thing in the case of
> automation.  The latter is not so bad, but sticking details like that into
> bootstrap leaves an unpleasant taste in my mouth.  Maybe we're OK with that,
> though.
> 
> *Possibly* for local builds, we could ask people to manually run the
> make-standalone-toolchain script, on the theory that not many people
> actually develop the C++ side of Android and a little pain is OK in that
> case.  But that does not seem very friendly.

I think not using prebuilt toolchains as a matter of course is a seriously high bar for developing Gecko on Android.  That'll mean folks like kats and botond, who do a little Android work every few months, will have to build a potentially expensive Android toolchain every time they want to address an Android-specific issue.  I think building the toolchain is very slow -- hours, last I looked -- but please correct me if I'm wrong.  Perhaps Mozilla can build and host these toolchains (in tooltool or whatever), and the bootstrapper can pick them up?

Basically, I don't think we should require to compile clang (hours) and then Firefox (hours on slow hardware) to work on the Gecko layer.  snorp may have other opinions.

Sorry I don't have much relevant experience or a really strong opinion here!
Flags: needinfo?(nalexander) → needinfo?(snorp)
(In reply to Nick Alexander :nalexander from comment #1)
> (In reply to Nathan Froyd [:froydnj] from comment #0)
> > *Possibly* for local builds, we could ask people to manually run the
> > make-standalone-toolchain script, on the theory that not many people
> > actually develop the C++ side of Android and a little pain is OK in that
> > case.  But that does not seem very friendly.
> 
> I think not using prebuilt toolchains as a matter of course is a seriously
> high bar for developing Gecko on Android.  That'll mean folks like kats and
> botond, who do a little Android work every few months, will have to build a
> potentially expensive Android toolchain every time they want to address an
> Android-specific issue.

Just to clarify, make-standalone-toolchain.sh copies/symlinks files from the NDK into a coherent directory structure.  It does *not* require rebuilding the NDK from scratch.

On my machine (i7-6850k, SSD):

froydnj@thor:~$ time ./.mozbuild/android-ndk-r11c/build/tools/make-standalone-toolchain.sh --arch=arm --platform=android-9 --install-dir=~/tmp/ndk-r11c-arm --toolchain=arm-linux-androideabi-4.9
HOST_OS=linux
HOST_EXE=
HOST_ARCH=x86_64
HOST_TAG=linux-x86_64
HOST_NUM_CPUS=8
BUILD_NUM_CPUS=16
Copying prebuilt binaries...
Copying sysroot headers and libraries...
Copying c++ runtime headers and libraries...
Copying files to: ~/tmp/ndk-r11c-arm
Cleaning up...
Done.

real	0m0.707s
user	0m0.084s
sys	0m0.544s

So not an onerous burden.  Does copy about 300M of files, apparently, assuming `du` isn't following symlinks.

I note that the script actually has a --package-dir argument for creating packages, so maybe we should just be sticking that in tooltool instead, and downloading that from mach bootstrap?  That would solve a lot of things in one fell swoop.
Oh, duh, we probably can't have people download the (repackaged) NDK from tooltool because of legal reasons...
If we had `mach bootstrap` run `make-standalone-toolchain` into the ~/.mozbuild dir, and then tweaked configure so it was easy to point things there (maybe --with-android-toolchain=whatever?) that seems like it wouldn't be too bad.
Yeah, I think putting this in bootstrap is fine. You already have to do some mozconfig tweaks by hand with current bootstrap, so pasting the output as Ted says is not worse than what we have now.

For automation, we already have to run a script to repackage the NDK for that environment, so I think we (uh, Nathan) would just modify the existing script to also do the make-standalone-toolchain.sh bits.
Flags: needinfo?(snorp)
We have a configure flag to pass the path to the ndk, why not run `make-standalone-toolchain` from there to a convenient place in the objdir?
Flags: needinfo?(mh+mozilla)
(In reply to Mike Hommey [:glandium] from comment #6)
> We have a configure flag to pass the path to the ndk, why not run
> `make-standalone-toolchain` from there to a convenient place in the objdir?

Because I don't know a good way to make configure do something like:

1. Check for Android.
1a. If Android, run make-standalone-toolchain and set all the necessary environment variables/configure options, and try to avoid conflicting with things that the user might have already set.
1b. If not, continue as normal.
2. Run all the configure checks.

I guess we could have a @depends function somewhere in toolchain.configure that would have the side-effect of performing step 1a above, and complain if anything that it set turned out to be in conflict with the predetected settings?
Flags: needinfo?(mh+mozilla)
(In reply to Nathan Froyd [:froydnj] from comment #7)
> 1a. If Android, run make-standalone-toolchain and set all the necessary
> environment variables/configure options, and try to avoid conflicting with
> things that the user might have already set.

When I was trying to moz.configure-ify the iOS build, this was basically impossible. I think you'd need to hack toolchain.configure directly -- or add some kind of hook there for this kind of thing.
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #8)
> (In reply to Nathan Froyd [:froydnj] from comment #7)
> > 1a. If Android, run make-standalone-toolchain and set all the necessary
> > environment variables/configure options, and try to avoid conflicting with
> > things that the user might have already set.
> 
> When I was trying to moz.configure-ify the iOS build, this was basically
> impossible. I think you'd need to hack toolchain.configure directly -- or
> add some kind of hook there for this kind of thing.

Things have changed since we talked about ios.m4. The main problems, from a cursory look at my irc logs were windows.configure and imply_option needing to appear before the Option it sets. The former was fixed in bug 1316957, and the latter is not really true anymore, depending what you do (and I think the linter doesn't catch the cases where you shoot yourself in the foot, so I should probably fix that)

Anyways, I think it's possible to add some imply_option's to android-ndk.configure to set e.g. CC and CXX, and that would barf if the user has some values of CC and CXX in their mozconfig at the same time.
Flags: needinfo?(mh+mozilla)
Aside: The Android NDK team is hoping to make standalone toolchains obsolete later in 2018. The goal is that the NDK itself would function in the same way that an NDK standalone toolchain functions today. See this roadmap entry: https://android.googlesource.com/platform/ndk/+/master/docs/Roadmap.md#make-standalone-toolchains-obsolete.
(In reply to Ryan Prichard from comment #10)
> Aside: The Android NDK team is hoping to make standalone toolchains obsolete
> later in 2018. The goal is that the NDK itself would function in the same
> way that an NDK standalone toolchain functions today. See this roadmap
> entry:
> https://android.googlesource.com/platform/ndk/+/master/docs/Roadmap.md#make-
> standalone-toolchains-obsolete.

Hi Ryan, thanks for reaching out to us.  As I understand now, Firefox for Android doesn't use standalone toolchains -- we use the NDK toolchain directly from tar.bz provided by Google.  Nathan might counter me; he knows much more than I do.
Product: Core → Firefox Build System
(In reply to Nick Alexander :nalexander from comment #11)
> (In reply to Ryan Prichard from comment #10)
> > Aside: The Android NDK team is hoping to make standalone toolchains obsolete
> > later in 2018. The goal is that the NDK itself would function in the same
> > way that an NDK standalone toolchain functions today. See this roadmap
> > entry:
> > https://android.googlesource.com/platform/ndk/+/master/docs/Roadmap.md#make-
> > standalone-toolchains-obsolete.
> 
> Hi Ryan, thanks for reaching out to us.  As I understand now, Firefox for
> Android doesn't use standalone toolchains -- we use the NDK toolchain
> directly from tar.bz provided by Google.  Nathan might counter me; he knows
> much more than I do.

Nope, you are correct.  I think given the NDK plans, we are just going to WONTFIX this.

Thanks for the explanation, Ryan!
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.