Investigate Icecream for distributed compilation

Status: NEW (Unassigned)
Component: Core :: Build Config
Reported: 4 years ago; last modified: 6 months ago
Reporter: gps
Version: Trunk
Depends on: 2 bugs; Blocks: 1 bug

Description (Reporter, 4 years ago)
Icecream (https://github.com/icecc/icecream) is a distributed compilation tool (like distcc). It seems to have some compelling advantages over distcc. Notable are a central scheduler with prioritization and support for managing toolchains. This effectively means builds are more reliable.

We should investigate the feasibility of using Icecream to build the tree.
Comment 1

Does it support something similar to the distcc pipe mode? Also, does it have decent first class Windows support?
(Reporter)

Comment 2

4 years ago
I don't believe it has Windows support.

I think you mean "pump" mode. I'm not sure.
Comment 3

(In reply to comment #2)
> I don't believe it has Windows support.

That would make it unsuitable for our needs (fast Windows builds).
(Reporter)

Comment 4

4 years ago
As much as I want to make Windows builds faster, not everyone builds on Windows.

Looking at icecream's source code, the code base seems pretty clean. If it doesn't support Windows, perhaps that's something we contribute upstream.
Comment 5

Definitely interesting to look at, but I don't think we're at a place where we can even consider individual distributed build solutions until we have a build that can easily distribute. Windows support is a must, but we can discuss that elsewhere :)
(Reporter)

Comment 6

4 years ago
We are there today. The MOZ_PSEUDO_DERECURSE work will fan out to infinite cores. If you build with -j128, you will get 128 compiler processes. We are ready to make use of distributed compilation.
(Reporter)

Comment 7

4 years ago
I've got icecream working between a few virtual machines \o/

Clients use network broadcast to discover the scheduler node. If you are in the SF office and run the iceccd daemon, it should automatically discover my icecc scheduler server and join the party. This is awesome because it means developers get distributed compilation benefits automagically.

The icecc compiler wrapper program is pretty intelligent. If the iceccd daemon isn't running on the local machine, it falls back to compiling locally. If the iceccd daemon isn't connected to a scheduler, it compiles locally, but through icecream, so the toolchain specified is honored.

The scheduler seems reasonably intelligent. If you are compiling with j < max jobs supported by the local iceccd daemon, all compilation appears to occur locally. But as soon as you start feeding more jobs than can locally be processed simultaneously, it starts feeding them off to other nodes. Furthermore, the scheduler is aware of the load on each node and will farm off jobs intelligently.

The mechanism for transferring toolchains is pretty amazing. You can specify a .tar.gz containing the compiler binary and related files. That archive is shipped to every node, uncompressed, and icecream chroots into that environment to perform the build. Pretty nifty.

icecream is pretty amazing. I think it's worth investing in. While Windows support and the "pump" feature from distcc may not be available, I imagine they could be implemented if people were thrown at the problem.

OK. So next steps.

I'd like to start baking support for icecream into the build system.

I would like to have configure detect the presence of the icecream wrapper binaries. If they are present, we point CC and CXX at them.

Because stable and consistent builds are amazing, I'd like support for specifying the toolchain archive to work from the beginning. Here's what I'm thinking. Somewhere inside the repo we have a file mapping architectures to archive filenames. At the top of the build we open this file and check if the archive is present on the local system (we can use ~/.mozbuild for storage). If it isn't, we can fetch that archive from a well-known public server (possibly people.mozilla.org until we have something more official). This would be similar to how we use tooltool to manage Clang archives.
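A rough sketch of that lookup in shell — the mapping file name, its format, the archive names, and the fetch step are all illustrative placeholders, not an actual in-tree convention:

```shell
# Hypothetical mapping file: one "<arch> <archive>" pair per line.
mapping="icecream-toolchains.txt"
printf 'x86_64 clang-linux64.tar.gz\naarch64 clang-linux-aarch64.tar.gz\n' > "$mapping"

arch="x86_64"                                # normally: arch=$(uname -m)
archive=$(awk -v a="$arch" '$1 == a { print $2 }' "$mapping")
cache="$HOME/.mozbuild/icecream/$archive"
if [ ! -f "$cache" ]; then
    # Here we would fetch the archive, tooltool-style, from the well-known server.
    echo "would fetch $archive into $cache"
fi
export ICECC_VERSION="$cache"
```

The real version would presumably also verify a digest before trusting the archive, the way tooltool does for the Clang archives.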

What do people think?
Comment 8

I think it's not configure's job to try to find random CC wrappers. If you want to use a CC wrapper, set it in your mozconfig. I wouldn't oppose a mach command to fill mozconfig with that, though.
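A mozconfig fragment along those lines might look like this (hypothetical; it assumes the icecc wrapper is on PATH):

```shell
# Hypothetical mozconfig fragment: route compilation through icecream.
export CC="icecc gcc"
export CXX="icecc g++"
# And raise parallelism past the local core count, e.g.:
# mk_add_options MOZ_MAKE_FLAGS="-j64"
```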
(Reporter)

Comment 9

4 years ago
Icecream for OS X could be "fun." It appears that even when you don't specify the archive of a custom toolchain via ICECC_VERSION, icecream goes through the same process of distributing an archive to other nodes. Behind the scenes, it appears to run `icecc --build-native` on the first compiler invocation. This would be fine and dandy... if it worked. `icecc --build-native` is currently failing for me on OS X. I suspect it needs to be taught how to deal with modern versions of Xcode.

But the real kicker is that AFAICT icecream *always* goes through the chroot magic to perform compilations. For OS X, this means that the chroot needs some of the OS X frameworks (at least for building Firefox). This means legal issues will likely prevent us from making OS X chroot environment archives available on the public internet. Since this legal and technical issue is roughly the same as what's necessary in bug 921040, I believe we'll be able to work around this internally. But it is slightly annoying.

Since icecream supports cross-compiling, bug 921040 also means that we *should* be able to re-use the archive they are producing to have icecream build OS X binaries on Linux hosts! So, we won't need an OS X compiler farm: just Linux hosts!
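If that pans out, the setup would presumably be expressed through ICECC_VERSION's cross-compiler list syntax (native archive first, then platform:archive pairs). Everything below — the paths and the platform token — is a placeholder, not a tested Darwin configuration:

```shell
# Hypothetical: native toolchain archive, plus a <platform>:<archive> entry
# for the cross compiler; names and the platform token are illustrative only.
export ICECC_VERSION="$HOME/.mozbuild/linux-native.tar.gz,x86_64-darwin:$HOME/.mozbuild/darwin-cross.tar.gz"
```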
Comment 10

(In reply to Gregory Szorc [:gps] from comment #9)
> Since icecream supports cross-compiling, bug 921040 also means that we
> *should* be able to re-use the archive they are producing to have icecream
> build OS X binaries on Linux hosts! So, we won't need an OS X compiler farm:
> just Linux hosts!

What does icecream do for debugging information?  I sure don't want /some/random/path/to/my/file.cpp in the debug info for my compile; I want the path on my local machine.

I know you can handle this sort of case with GCC; I don't think you can handle it with clang.
(Reporter)

Comment 11

4 years ago
(In reply to Nathan Froyd (:froydnj) from comment #10)
> What does icecream do for debugging information?  I sure don't want
> /some/random/path/to/my/file.cpp in the debug info for my compile; I want
> the path on my local machine.
> 
> I know you can handle this sort of case with GCC; I don't think you can
> handle it with clang.

Does https://github.com/icecc/icecream/blob/master/client/local.cpp#L153 cover it? If not, perhaps we should land support for this in Clang.
Comment 12

(In reply to Gregory Szorc [:gps] from comment #7)
> OK. So next steps.
> 
> I'd like to start baking support for icecream into the build system.

Please no :( I have nothing against icecream (other than its non-support for non-gcc-like compilers and the fact that it's written in C/C++), but please let's not bake anything into the build system until we have a build system to bake things into. While there are still makefiles in the tree, it makes me sad to see any effort going anywhere other than moving everything to moz.build and getting to the One True Build Graph.
Comment 13

(In reply to Gregory Szorc [:gps] from comment #11)
> (In reply to Nathan Froyd (:froydnj) from comment #10)
> > What does icecream do for debugging information?  I sure don't want
> > /some/random/path/to/my/file.cpp in the debug info for my compile; I want
> > the path on my local machine.
> > 
> > I know you can handle this sort of case with GCC; I don't think you can
> > handle it with clang.
> 
> Does https://github.com/icecc/icecream/blob/master/client/local.cpp#L153
> cover it? If not, perhaps we should land support for this in Clang.

Oof, that's pretty crazy, but no, I don't think that covers it.  Getting better error messages is always nice, though. :)
(Reporter)

Comment 14

4 years ago
Vlad: C++ compilation in the build system today, while not a unified graph, can be 90%+ concurrent due to the |make compile| work. Only about 3-5 minutes of our total 15 minutes of wall build time on modern machines is spent outside of the compiler. Therefore distributed compilation will have a larger impact on build times than the remaining optimization work. That tells me we should be working on it in some capacity. Keep in mind that your daily pull-central-and-rebuild cycle is effectively a full rebuild because of churn due to #include hell.

That being said, distributed compilation isn't in the build system's Q4 goals, so I'm going to stop working on it. This bug is just something I was doing while waiting for WebIDL test builds to finish. I'm ready to pass the torch. Perhaps a next step can be for someone to generate archives of toolchains. Bug 896023, perhaps?
Comment 15

AIUI, distributed compilation is potentially great if you're in a big Mozilla office with lots of other people, but less so for remoties in far-flung locations. Is that correct?
Comment 16

(In reply to comment #14)
> Vlad: C++ in the build system today, while not a unified graph, can be 90%+
> concurrent due to the |make compile| work. Only about 3-5 of our total 15
> minutes of wall build times on modern machines is outside of the compiler.
> Therefore distributed compilation will have a larger impact on build times than
> the remaining optimization work. That tells me we should be working on it in
> some capacity. Keep in mind that your daily pull central and rebuild cycle is
> effectively a no-op build because of churn due to #include hell.

This has changed as a result of the unified builds project.  Now, based on anecdotal evidence from my machine, we spend about 30% of our time building outside of C/C++ compilation, mostly not utilizing all of the cores.
(Reporter)

Comment 17

3 years ago
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #16)
> This has changed as a result of the unified builds project.  Now, based on
> anecdotal evidence from my machine, we spend about 30% of our time building
> outside of C/C++ compilation, mostly not utilizing all of the cores.

30% seems about right to me. Once linking is non-recursive, expect that percent to go down again. While distributed compilation will still result in wall time wins, unified sources has made it less important. I'd still like to look into this someday, but that day likely isn't today.

Comment 18

3 years ago
I don't know much about Mozilla, but Icecream doesn't really need any special build system support. If it's possible to get every compiler invocation prefixed with 'icecc' (e.g. by setting CC='icecc gcc'), or to replace the compilers used with Icecream's wrappers (e.g. export PATH=/usr/lib/icecc/bin:$PATH), then any parallel build should simply use Icecream. As long as this works, there's not much more needed, except perhaps picking this up by default, and making sure that a largely parallel build doesn't overload the local machine: anything that cannot be distributed (building Java, generating documentation, etc.) can be prefixed with 'icerun' to handle this.

If you want a reference, building LibreOffice works just fine with Icecream (and its build system has support for all the things listed above).
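Concretely, the PATH-based approach described above (the wrapper directory varies by distribution; the path shown is Debian/Ubuntu's):

```shell
# Put icecream's compiler wrappers first in PATH; any sufficiently parallel
# build then distributes automatically.
export PATH="/usr/lib/icecc/bin:$PATH"
# Steps that cannot be distributed can be throttled through the local
# scheduler by prefixing them, e.g.: icerun javadoc ...
```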


Comment 19

(In reply to :Ehsan Akhgari (needinfo? me!) (slow responsiveness, emailapocalypse) from comment #1)
> Does it support something similar to the distcc pipe mode?

No. Tests have so far shown that it doesn't bring any noticeable performance gain. Preprocessors in recent GCC (and Clang) seem to be rather fast, and Icecream's usage of chroot has some overhead compared to distcc's hackish handling of headers in pump mode.

> Also, does it have decent first class Windows support?

I'm not sure what that is supposed to mean, but the answer is presumably no. Icecream is designed for Unix systems and the GCC or Clang compilers, and I don't have enough experience with Windows to judge the possibility of Windows support. (Does this mean distcc works with Windows? I don't quite see how it would, unless the question is about Cygwin, in which case probably nobody has tried that with Icecream yet.)

(In reply to Gregory Szorc [:gps] from comment #7)
> The icecc compiler wrapper program is pretty intelligent. If the iceccd
> daemon isn't running on the local machine, it falls back to compiling
> locally.

Note that you do not want to use icecc if you do not have iceccd running, as that'll result in everything being serialized (i.e. one job at a time, while with iceccd running it'll be NCPU jobs at a time).

> The mechanism for transferring toolchains is pretty amazing. You can specify
> a .tar.gz containing the compiler binary and related files.

Note that you don't have to though. The documentation until recently was written in a way that made many people believe ICECC_VERSION was necessary, but it's in fact better to do without it in the normal case.

(In reply to Gregory Szorc [:gps] from comment #9)
> Icecream for OS X could be "fun."
...
> ... if it worked. `icecc --build-native` is currently failing for me on OS X. I suspect it needs to
> be taught how to deal with modern versions of Xcode.

There have been some fixes for OS X from contributors, so maybe you could try again. Or you could simply make a pull request on GitHub with a fix.

> But the real kicker is that AFAICT icecream *always* goes through the chroot
> magic to perform compilations. For OS X, this means that the chroot needs
> some of the OS X frameworks (at least for building Firefox).

I don't have any experience with OS X, but I don't understand this. The only thing that happens in the remote chroot is roughly a .i->.o compiler run (i.e. a compilation that doesn't need any includes and produces an object file). I don't see how that should need anything more than the compiler itself.

(In reply to Nathan Froyd (:froydnj) from comment #10)
> What does icecream do for debugging information?  I sure don't want
> /some/random/path/to/my/file.cpp in the debug info for my compile; I want
> the path on my local machine.

It does nothing special; you should get the same paths. Includes are preprocessed locally, so the remote compilation already sees line directives. In practice you should never notice a difference between locally and remotely built sources.

(In reply to Nicholas Nethercote [:njn] from comment #15)
> AIUI, distributed compilation is potentially great if you're in a big
> Mozilla office with lots of other people, but less so for remoties in
> far-flung locations.  Is that correct?

Yes and no. Having a number of computers in an Icecream cluster can do wonders for compile times :), but even cutting compile time almost in half is noticeable, e.g. for people who have two computers at home (or a laptop to work on and a larger, more powerful machine). People use Icecream in both scenarios.
(Reporter)

Updated

3 years ago
Duplicate of this bug: 875843

Comment 20

11 months ago
We ran:
$ sudo apt-get install icecc

on about 8 desktop machines in Toronto. Now, with 40 to 70 jobs, we get 4:30 Linux builds, compared to about 15-20 minutes on a single machine.
(Reporter)

Updated

8 months ago
Depends on: 1296752
We've got documentation on how to use this, we've had good success with it, and it looks like more offices have picked it up:
https://developer.mozilla.org/en-US/docs/Mozilla/Developer_guide/Using_Icecream

Perhaps we can close this or morph it into a tracking bug for future improvements.
Depends on: 1313519