Closed Bug 1095293 Opened 10 years ago Closed 7 years ago

Using multiple processes building provided by MSVC compiler

Categories

(Firefox Build System :: General, defect)

x86_64
Windows 8.1
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: xidorn, Unassigned)

References

(Blocks 1 open bug)

Details

MSVC provides its own version of multiple processes building via opinion "/MP". With that, the compiler spawns limited number of processes and reuse them to compile source files passed in. http://msdn.microsoft.com/en-us/library/vstudio/bb385193%28v=vs.120%29.aspx

Given the fact that process creation is expensive on Windows, I guess that mechanism could save the build time.
We looked at this in the past. You need to pass multiple source files to the compiler at once for this option to be useful. I don't think this will win us anything nowadays because our preferred method is "unified compilation", where we generate a single source file that #includes all the other source files in the directory, and then compile that one source file. This has been shown to decrease compile times quite a bit.
There are still lots of files which are not included in unified compilation. Probably we can use multiple processes for them?
We already use multiple processes for multiple sources. I'm not sure what having cl do it instead of make is going to change much... In fact, it would make things worse because make would then try to run multiple cls that would run multiple compilations, so CPU would be overcommitted.
The advantage of using cl's MP is that, that seems to be able to reuse existing compiler processes rather than creating a process for each file. It can be a big win on Windows as creating process is expensive there.
Probably not really worth investigating. In a full Windows build, we run cl on about 3000 source files. Assuming it takes Windows 30ms to create each process, it adds up to only 90s in total, which doesn't seem to be a big factor of the whole build.
(In reply to Xidorn Quan [:xidorn] (UTC+10) from comment #5)
> Probably not really worth investigating. In a full Windows build, we run cl
> on about 3000 source files. Assuming it takes Windows 30ms to create each
> process, it adds up to only 90s in total, which doesn't seem to be a big
> factor of the whole build.

Interestingly on my Windows 10 machine we're spawning at least 5 processes per |cl.exe| invocation:
> - sh.exe
>   - sh.exe
>     - sh.exe
>       - python.exe
>         - cl.exe

(It might actually be 6, I can't quite tell if the mozmake.exe instances are sticking around).

From your estimate we're now talking 90s X 5 = 450s overhead. This seems like a pretty big win.
Oh, why do we spawn python and sh for every cl.exe?! I guess that is something should be fixed...

But, well, 30ms is the worst case possible. This estimation is based on http://stackoverflow.com/a/26515727, which shows 27ms on Windows XP and 15ms on Windows 7 at worst. So the estimation may be cut by half. And since this overhead would be spread into different processors, the impact on overall time consumption is probably not that big.
The excessive process spawning to obtain cl.exe is known.

The recipe of a GNU make rule is executed in a shell. That's 1 of the sh.exe.

python.exe is so we can wrap the cl.exe execution to parse /showIncludes output to write make dependency files.

IIRC most sh.exe processes have a child sh.exe process just because. Some msys funkiness most likely.

The 2nd sh.exe is likely the GNU make recipe / shell script invoking a process that requires a new process.

Shells require lots of new processes. This is why configure is so slow on Windows.
sh can probably be skipped with some effort rewriting the commands used to compile such that make calls CreateProcess on its own instead of going through the shell. Python can't be because that's what gets us the dependency files for make to figure out when to rebuild something (although sccache can do this too, and the rust rewrite should have less overhead than python).
Oh, 1 of the sh.exe is the shell spawned by GNU make to execute the temporary file / shell script containing the recipe. So, you have:

mozmake.exe
  sh.exe (to exec a temporary file)
    sh.exe (the script containing the GNU make recipe)
      python.exe (our CC wrapper)
        cl.exe

If you use sysinternals process explorer, you can see more process details.
Blocks: 1326328
(In reply to Xidorn Quan [:xidorn] (UTC+10) from comment #7)
> Oh, why do we spawn python and sh for every cl.exe?! I guess that is
> something should be fixed...

Filed bug 1326329 to track this.

> But, well, 30ms is the worst case possible. This estimation is based on
> http://stackoverflow.com/a/26515727, which shows 27ms on Windows XP and 15ms
> on Windows 7 at worst. So the estimation may be cut by half. And since this
> overhead would be spread into different processors, the impact on overall
> time consumption is probably not that big.

Sure, but that's still 225s (ignoring multiple cores, I'm not sure how much that would actually be spread out). I also wouldn't be surprised if we have more overhead per-process due to our msys/cygwin/mozmake setup.
FWIW, mozmake is a native win32 application. It doesn't have a msys fork() overhead for each subprocess it spawns, and it only spawns itself for subdirectories.
(In reply to Xidorn Quan [:xidorn] (UTC+10) from comment #2)
> There are still lots of files which are not included in unified compilation.
> Probably we can use multiple processes for them?

So, I looked at implementing this ages ago in bug 428532. It was a win for -j1 builds but not -jN back then, but the build system parallelism was terrible at that time. I think if we're only passing a single source file to the compiler at a time then -MP isn't going to show any improvement. If we pass multiple source files then we've got two issues:
1) We rely on parsing the output of -showIncludes to generate dependency files, and I don't think that would actually work in this scenario.
2) MSVC is not going to honor make's jobserver pipeline, so we can easily wind up spawning too many processes for the number of available cores.

I think we'd probably be better served by fixing whatever issues prevent us from using UNIFIED_SOURCES for the remaining source files in SOURCES.
In fact, yes, the docs explicitly say that /showIncludes doesn't work with /MP:
https://msdn.microsoft.com/en-us/library/bb385193(v=vs.100).aspx

That seems like a dealbreaker, I don't know how else we could usefully get dependency info out of MSVC.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #13)
> I think we'd probably be better served by fixing whatever issues prevent us
> from using UNIFIED_SOURCES for the remaining source files in SOURCES.

"Fixing" is the wrong term here, "hackily working around bustage caused by breaking assumptions about the isolation of compilation units" is probably more accurate. Either way, do we have a bug for this?

(In reply to Ted Mielczarek [:ted.mielczarek] from comment #14)
> In fact, yes, the docs explicitly say that /showIncludes doesn't work with
> /MP:
> https://msdn.microsoft.com/en-us/library/bb385193(v=vs.100).aspx
> 
> That seems like a dealbreaker, I don't know how else we could usefully get
> dependency info out of MSVC.

Supporting proper visual studio project files in bug 1326333 is probably a better goal (and would subsume this).
Flags: needinfo?(ted)
(In reply to Eric Rahm [:erahm] from comment #15)
> "Fixing" is the wrong term here, "hackily working around bustage caused by
> breaking assumptions about the isolation of compilation units" is probably
> more accurate. Either way, do we have a bug for this?

Sure, call it what you will. I don't know offhand, but maybe the open deps hanging off of bug 939583 have some information.

> Supporting proper visual studio project files in bug 1326333 is probably a better goal (and would subsume this).

I'm still not sure I agree, but I'm also not opposed to someone doing that work.
Flags: needinfo?(ted)
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.