Open
Bug 347592
(PCH)
Opened 18 years ago
Updated 2 years ago
Use precompiled headers (GCC, clang, and VC)
Categories
(Firefox Build System :: General, enhancement)
Firefox Build System
General
Tracking
(Not tracked)
NEW
mozilla2.0
People
(Reporter: wbrana, Unassigned)
References
(Blocks 1 open bug)
Details
Attachments
(3 files, 4 obsolete files)
159.24 KB, text/plain
2.27 KB, text/x-chdr
20.21 KB, patch (Splinter Review)
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1b1) Gecko/20060805 BonEcho/2.0b1
Build Identifier: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.8.1b1) Gecko/20060805 BonEcho/2.0b1

GCC 3.4 and newer support precompiled headers:
http://gcc.gnu.org/onlinedocs/gcc/Precompiled-Headers.html

Reproducible: Always

Actual Results:
Precompiled headers aren't used during building.

Expected Results:
Precompiled headers could be used to speed up compilation.
Updated•18 years ago
OS: All → Linux
Summary: using precompiled headers with GCC → Use precompiled headers with GCC
Comment 1•16 years ago
We actually will want to do this for Visual C++ as well as GCC.
OS: Linux → All
Product: Firefox → Core
QA Contact: build.config → build-config
Hardware: PC → All
Summary: Use precompiled headers with GCC → Use precompiled headers (GCC and VC)
Target Milestone: --- → mozilla2.0
Updated•16 years ago
Assignee: nobody → ted.mielczarek
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 2•16 years ago
bsmedberg and I discussed this a while ago, our thought was to make one header per module that included everything for that module, so like <xpcom.h>. Then per-compile directory we would generate a PCH from the contents of REQUIRES.
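The per-module umbrella header described here can be sketched as follows. This is purely illustrative: the `make_umbrella_header` helper, the guard naming, and the header lists are hypothetical, not part of the actual build system.

```python
# Hypothetical sketch of "one header per module that includes everything
# for that module" (e.g. <xpcom.h>), generated from a module's list of
# exported headers. Names and data are illustrative only.

def make_umbrella_header(module, headers):
    """Return the text of an umbrella header that includes every
    exported header of `module`, wrapped in a simple include guard."""
    guard = module.upper() + "_PCH_H"
    lines = [f"#ifndef {guard}", f"#define {guard}"]
    lines += [f'#include "{h}"' for h in sorted(headers)]
    lines.append(f"#endif  // {guard}")
    return "\n".join(lines) + "\n"

print(make_umbrella_header("xpcom", ["nsISupports.h", "nsCOMPtr.h"]))
```

A generated file like this would then be the input that gets precompiled once per compile directory.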
Comment 3•16 years ago
I'm for this... Having seen PCHs speed up builds from 5 minutes to 1 minute (with VC++, but I assume it's the same for gcc), I know they can be a huge improvement. I'm willing to help if any help is needed or desired.
Updated•16 years ago
Blocks: build-perf
Last night I was motivated to experiment with this after noticing that files like layout/base/nsPresShell.cpp include ridiculous numbers of files. Although nsPresShell.cpp is 8000 lines of code, the preprocessed input that gets compiled for it is almost 900,000 lines of code that comes from ultimately including 1725 (not distinct) headers.

I hacked together the following:
- A makefile defines PCH_EXPORT = <module> to export its headers to a stub header that can later be pulled in by defining PCH_IMPORT = <module>.
- A makefile defines PCH_HEADERS = <headers> to export the given headers. All IDL-generated headers are added automatically.
- A makefile defines PCH_IMPORT = <modules> to import the given modules and compile a PCH for that directory.
- Some guards are present to ensure that if compile flags change or weirdness happens we don't hose the build.

So I added PCH_EXPORT = xpcom to everything relevant in xpcom/ and also exported nsCOMPtr.h and a few others (in addition to the automatically exported IDL-generated headers), and then added PCH_IMPORT = xpcom to xpcom/io and timed a complete build of that directory with and without PCH. That is, I did

    cd xpcom/io
    <disable PCH>
    make clean
    make  # timed this
    <enable PCH>
    make clean
    make  # timed this

The time for a total build of the directory (including IDL processing, XPT generation, etc.) dropped 20%. I would expect to see even more substantial gains in directories like content/ and to a lesser extent layout/ which include those absurd numbers of headers.
1724 headers. I didn't count unique headers but there is a LOT of duplication (nsCOMPtr.h gets included 52 times) which balloons the time to preprocess these files. I didn't do any measurements of this but making nsPresShell.i takes a substantial fraction (feels like at least 50%) of the time it takes to make the object file outright.
Attachment #449889 - Attachment mime type: application/octet-stream → text/plain
(All of these measurements are for a debug build)
More timing measurements (opt build this time, on Win32).

layout/base (no PCH)
real 1m41.992s
user 0m1.521s
sys 0m3.643s

layout/base (PCH)
real 1m32.650s
user 0m1.380s
sys 0m3.790s

10% win just from xpcom; probably a lot more win to be had with PCHifying dom/, content/, and widget/. We also appear to spend a ridiculous amount of time building dependencies. I wonder if we could instead grab the output of showincludes and process that to make our .pp files.
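The showincludes idea above could look roughly like this. A hedged sketch: the `showincludes_to_pp` helper and the sample output are made up for illustration, though English-locale cl.exe does prefix each reported header with "Note: including file:".

```python
# Sketch: turn MSVC /showIncludes output into a make-style dependency
# (.pp) file instead of running a separate dependency pass.
# The sample cl.exe output below is illustrative.

PREFIX = "Note: including file:"

def showincludes_to_pp(obj, cl_output):
    """Collect every header cl.exe reports and emit one make rule."""
    headers = []
    for line in cl_output.splitlines():
        if line.startswith(PREFIX):
            # Leading spaces after the prefix encode nesting depth;
            # for a flat dependency list we just strip them.
            headers.append(line[len(PREFIX):].strip())
    deps = " \\\n  ".join(headers)
    return f"{obj}: \\\n  {deps}\n"

sample = (
    "nsPresShell.cpp\n"
    "Note: including file: c:\\src\\nsCOMPtr.h\n"
    "Note: including file:  c:\\src\\nsISupports.h\n"
)
print(showincludes_to_pp("nsPresShell.obj", sample))
```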
Alias: PCH
Comment 8•14 years ago
Unsurprisingly, there's a bug filed on that (bug 518136). There's also bug 462463 about dependencies.
Updated•13 years ago
Assignee: ted.mielczarek → nobody
Comment 9•12 years ago
I experimented with this a bit in clang. As an extremely simple test, I moved all of nsGlobalWindow.cpp's includes into a PCH. Build time for that file went from 5s to 3s, a speedup of 1.7. The PCH takes 4s to compile. Crucially, I found that commenting out the includes in nsGlobalWindow.cpp when using PCH had no effect on build speed. This means that we don't need to modify every source file in order to take advantage of PCH. It would be cool if we could do something here.
Updated•12 years ago
Summary: Use precompiled headers (GCC and VC) → Use precompiled headers (GCC, clang, and VC)
Comment 10•12 years ago
I took a look at trying to get these included. Initial results can look promising, but I don't think that managing them is quite as easy as I had hoped. If they are used incorrectly, they can result in worse incremental build performance than without them, since extra files may be unnecessarily re-compiled. Granted, I'm not an expert in using PCH, so if I missed something obvious here please let me know.

First, I am able to reproduce some of the results above. I tried to start out small by making a PCH to be shared among files in xpcom/base, and manually ran some tests with a shell script to compile *.cpp and compare timings. In order to figure out which .h files should be listed in the PCH, I grepped the *.pp files and pulled out a list of header files included by every .cpp file (~30 .h files). On Linux, this resulted in a ~6% speedup (7.753s vs 7.330s), whereas Windows fared much better with a ~20% speedup (17.290s vs 14.365s).

Here is where I ran into the first problem: how do we really decide what goes into the PCH? I was manually creating it by grepping *.pp, but that doesn't lend itself to automation, since to create *.pp we have to first pre-process every file, which is what we're trying to avoid. Another option is to let an expert in the subsystem manually decide what goes into the PCH (essentially list what they think is needed by *.cpp in the subsystem, perhaps by grepping *.pp :), and check that file in and keep it up-to-date. However, this list may bit-rot, which has the following downsides:

1) The PCH may list a .h file that is not actually needed by all of the .cpp files. Take jlebar's example: if we put all of nsGlobalWindow.cpp's headers into the PCH for that directory, then when changing nsGlobalWindow.cpp there is a net gain of 2s. However, if we were to instead change nsGlobalWindow.h, there is now a net loss. The first effect is that we have to re-generate the PCH and then build nsGlobalWindow.cpp (3s + 4s == 7s, a loss of 2 seconds). The second, and more significant, effect is that we are now re-building all 29 files in the directory, instead of just the 12 that actually include nsGlobalWindow.h. Clearly this is not as simple as it might seem at first.

2) Listing "foo.h" in the PCH means it is automatically included by every .cpp. So now a new .cpp file may start using things from foo.h, even though it doesn't explicitly include it. Updating the PCH to remove foo.h due to 1) is now more difficult, since you have to go back and re-add the headers everywhere they are needed.

3) We would also need to include the PCH even if it's not actually pre-compiled, otherwise the list of headers included in a PCH build will differ from a non-PCH build. Without this, someone may check something in that compiles fine with PCH on but is broken without PCH, since the .cpp file is missing required headers. So we either pay a price of a higher number of broken commits, or pay the price of unnecessary re-compiles in 1) even when PCH is disabled.

Assuming there is a way to get a correct PCH listing and keep it up-to-date, I continued testing on a larger scale. I repeated this test for xpcom/io by generating a new list of .h files used by xpcom/io/*.cpp and pre-compiling it. Here things were even better, as re-compiling *.cpp on Linux is ~10% faster (6.563s vs 5.955s), and Windows up to ~25% faster (15s vs 12s).

I now tried to merge these two directories so that they could share a single "xpcom.h" file, by listing only the headers included by all files in both directories. Since the list of headers being pre-compiled was now a bit smaller, the gain was a little smaller as well (xpcom/base fell from 6% to 5% on Linux). However, this is where I hit the second problem: PCHs can't be shared across files where -D flags are different. I built the PCH using xpcom/base's flags, so the PCH was ignored when I tried to use it in xpcom/io. To get an idea of the scale of this problem, grep through Makefile.in's for DEFINES (find . -name Makefile.in | xargs grep 'DEFINES\s*+=' or some such). Although we might create an xpcom.h, it would need to be compiled separately in at least 23 directories under xpcom. (Or, some poor soul could go through and try to consolidate all of the defines...) This means each PCH is not compiled just once, but once per directory.

In summary, I believe that to implement PCH we would need to do the following:

A) Come up with a way to figure out which files should go into the PCH list. This list needs to be pre-compiled for each directory that uses it, given the scope of the -D flag problem. So we get xpcom/base/xpcom.pch and xpcom/io/xpcom.pch, which are both derived from xpcom/xpcom.h. Note that each .cpp file can only have one PCH, so there's no point in trying to share them across subsystems.

B) Use -include to force including the header list first (even if PCH is disabled) to avoid the problem of PCH builds behaving differently from non-PCH builds due to missing #includes.

C) Keep these lists up-to-date to avoid hurting incremental build performance. I am concerned that by introducing PCH, people will be tempted to include more files in the list than should actually be there, because it makes a clobber build faster. I know that clobber build times seem like an attractive thing to optimize, but they are only really necessary because of the deficiencies of make. Given my focus on incremental build performance, I am worried this will be more damaging long-term, to the point where changing any .h file results in many more files being re-compiled than are strictly necessary. I would much prefer that the number of things that need to be rebuilt when a .h file changes goes down, not up.

Some questions I have: Are there strategies for managing or automating what goes into the PCH, both initially and over time as people forget about them? Is there a way to mitigate the -D flags added by the various Makefile.ins so a PCH could be shared at a higher level without being compiled multiple times? Is there an argument for how this won't negatively affect incremental builds?

I was hoping this would end up being a --build-faster flag, but I think it is not so simple.
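The selection step described in this comment — keep only the headers included by every .cpp file in a directory — amounts to a set intersection over the per-file dependency lists pulled from *.pp files. A rough sketch, with entirely made-up dependency data and a hypothetical `pch_candidates` helper:

```python
# Sketch: given per-.cpp dependency lists (as grepped from *.pp files),
# keep only the headers that every .cpp includes, since only those are
# safe candidates for a shared PCH. Data is illustrative.

def pch_candidates(deps_by_cpp):
    """Intersect the header sets of all .cpp files in a directory."""
    sets = [set(headers) for headers in deps_by_cpp.values()]
    common = set.intersection(*sets) if sets else set()
    return sorted(common)

deps = {
    "nsFile.cpp":   ["nsCOMPtr.h", "nsISupports.h", "nsIFile.h"],
    "nsPipe.cpp":   ["nsCOMPtr.h", "nsISupports.h", "nsIPipe.h"],
    "nsStream.cpp": ["nsCOMPtr.h", "nsISupports.h"],
}
print(pch_candidates(deps))  # only the headers shared by all three files
```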
Comment 11•12 years ago
A lot of good things to reply to here.

> Although we might create an xpcom.h, it would need to be compiled separately
> in at least 23 directories under xpcom. (Or, some poor soul could go through
> and try to consolidate all of the defines...). [...]
> This means each PCH is not compiled just once, but once per directory.

I think figuring this out is pretty key to getting PCH to work. If we have to do a new PCH for each directory, we're likely missing key gains in top-level directories where we spend a lot of time (dom and content come to mind).

I wonder if things are not as bad as they seem, and if the vast majority of source files share the same -Ds. (Or, if they're nearly the same, if they could be made to be the same.) One way you could check this is by doing a clobber build, recording all of the command-line invocations, and counting how many sets of -Ds are used in how many different invocations of CC.

> A) Come up with a way to figure out which files should go into the PCH list.
> This list needs to be pre-compiled for each directory that uses it, given the
> scope of the -D flag problem. So we get xpcom/base/xpcom.pch and
> xpcom/io/xpcom.pch which are both derived from xpcom/xpcom.h. Note that each
> .cpp file can only have one PCH, so there's no point in trying to share them
> across subsystems.

Clang does support multiple PCHs per cpp file (the dependency graph can be either a list or a DAG). Given that the vast majority of developers are on Linux or Mac, I'd be quite happy with a solution which required clang.

http://clang.llvm.org/docs/PCHInternals.html

But I recognize this doesn't get us around the -D problem.

> C) Keep these lists up-to-date to avoid hurting incremental build performance.
>
> I am concerned that by introducing PCH, people will be tempted to include more
> files in the list than should actually be there because it makes a clobber
> build faster. I know that clobber build times seem like an attractive thing to
> optimize, but they are only really necessary because of the deficiencies of
> make.

gps said something like this in his recent build-faster presentation, so you're in good company. But I don't think it is true at all. I think the misconception may stem from choosing the dichotomy of "clobber" versus "incremental" builds. But I don't think that's the relevant dichotomy here. Instead, we should think about the differences between builds which recompile many files and builds which recompile few files. It's my contention that builds which recompile many files will be common even if we have a build system which never requires us to rm -rf the objdir.

To wit, you end up compiling many files when you

* change compiler flags (e.g., I usually make only debug builds, but sometimes I need a release build, a valgrind build, or a trace-malloc build);

* change a header which is included by many files (we have many such headers, and some, such as nsContentUtils.h, are changed frequently);

* pull a change from upstream which modifies such a header — in practice, I end up rebuilding much of the tree every time I pull from upstream;

* switch between branches which include a change to such a header — in practice, nearly all of my substantial branches include such a change.

PCH helps us in all of these cases, and no build system change is going to substantially reduce the number of files we have to re-compile in these cases. In addition, our automated builders end up re-compiling most object files on most builds, due to the above. So PCH should be a build-faster win on our automated builders. Previous attempts I'm aware of to reduce the number of headers that our cpp files depend on have not yielded great success, in part because many of these dependencies are true dependencies.

> Given my focus on incremental build performance I am worried this will
> be more damaging long-term, to the point where changing any .h file results in
> many more files being re-compiled than are strictly necessary. I would much
> prefer that the number of things that need to be rebuilt when a .h file
> changes goes down, not up.

I agree this is a concern; I don't have a good answer. One way to address this is to ensure that everything builds both with and without PCH. Then you can have the option of not using PCH for your incremental builds, or of using the dependencies generated from a non-PCH invocation of GCC to decide which files to rebuild. But in practice, people break builds unless they're on tinderbox. So we'd have to have non-PCH builds on tinderbox, which would result in more, not less, load on builders.

Running non-PCH builds on tinderbox doesn't entirely defeat our purpose here, because we wouldn't need to wait for non-PCH builds to finish before running tests, so tests could still complete faster, and developers could still get tryserver results faster. And of course builds on devs' machines would be faster. But it's not a great solution.
Comment 12•12 years ago
(In reply to Justin Lebar [:jlebar] from comment #11)

> Clang does support multiple PCH's per cpp file (the dependency graph can be
> either a list or a DAG). Given that the vast majority of developers are on
> Linux or Mac, I'd be quite happy with a solution which required clang.

This is a feedback loop that results in the vast majority of developers using a different product (non-Windows Firefox) from what 95% of our users are using (Firefox on Windows). I've heard many times that we'd like to have more developers developing on Windows so we have more "genuine" dogfooding of the product, but we're making nearly everything painful and painfully slow on Windows (including, especially, B2G and Fennec, since they require the build to be done in a VM). It's wrong to hold back improvements on other platforms that Windows can't get as easily, but it's also wrong to suggest we can just avoid making those improvements on Windows.

> Instead, we should think about the differences between builds which recompile
> many files and builds which recompile few files. It's my contention that
> builds which recompile many files will be common even if we have a build
> system which never requires us to rm -rf the objdir.

[snip examples]

In my experience, none of your examples are nearly as common as the case where I make a change to a single .cpp file and I want to recompile just that file and relink libxul. My second most common type of build is where I update a header file used by a few dozen files only in my own module. I do dozens of those builds a day. However, I pull in changes that require me to clobber only about once a day. Consequently, I'd guess that it is extremely important to make incremental builds as incremental as possible, as often as possible, and we should be extremely careful (at least) about avoiding regressing this aspect.
Comment 13•12 years ago
> This is a feedback loop that results in the vast majority of developers using a
> different product (non-Windows Firefox) from what 95% of our users are using (Firefox
> on Windows).

I don't think there's any disagreement between us that if we can make Windows builds fast, that's great, and if we can't without regressing other things, oh well. But I would /really/ like to avoid getting into the more general issue of how we incentivise developers to work on particular platforms here, if you don't mind.

> In my experience, none of your examples are nearly as common as the case where I make a
> change to a single .cpp file and I want to recompile just that file and relink libxul.
> [...]
> Consequently, I'd guess that it is extremely important to make incremental builds as
> incremental as possible, as often as possible, and we should be extremely careful (at
> least) about avoiding regressing this aspect.

I don't disagree. What I object to is the notion that full rebuilds "are only really necessary because of the deficiencies of make," so therefore, given that our long-run plan is to eliminate make, we don't care about full-rebuild time.

Note that we already have hacks to make recompiling one file and relinking libxul fast. I don't disagree that those hacks are harmful and shouldn't have to exist, but the fact remains that hacks exist for getting around this pain to some degree. In contrast, there are no hacks for speeding up the cases I listed, short of modifying your hardware in some way (*). Other than our bugs to entirely rewrite the build, this bug is the only one I'm aware of with a large potential speedup for full rebuilds.

(*) Well, okay, ccache helps with some of them.
Comment 14•12 years ago
Once we have moz.build files, we should more easily be able to obtain the sets of all files sharing the same set of compiler flags across the tree. We should then be able to construct chained PCH files and share PCH files across directories.

There is an initial seeding problem: identifying the full set of .h files for each PCH "bundle." One way to solve that is with Clang's API. We could use the Clang Python bindings or write a small C/C++ program to traverse the set of #include'd files. Unfortunately, this only works where we use Clang.

We should also minimize the variance in compiler arguments across the tree to minimize the number of PCH combinations. As Mike said, there are many directory-local defines. I'm willing to bet a lot of these are cruft, cargo-culted, or could be pulled out into global defines. Minimizing the variance here could also allow us to reduce the number of compiler invocations if we ever start compiling multiple source files per invocation.
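The include-graph seeding step could be prototyped without Clang at all. A much cruder stand-in, shown only to illustrate the traversal: the `include_closure` helper and regex below are hypothetical, and unlike a real preprocessor this ignores angle-bracket includes, #ifdef guards, and include search paths, so it both over- and under-counts.

```python
# Crude sketch of traversing the set of #include'd files: recursively
# scan quoted #include lines with a regex, resolving names against a
# single root directory. Illustrative only; a real tool would use
# Clang's API (or the compiler's dependency output) instead.

import os
import re

INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.M)

def include_closure(path, root):
    """Return the set of headers transitively included from `path`."""
    seen = set()
    stack = [path]
    while stack:
        current = stack.pop()
        try:
            text = open(current).read()
        except OSError:
            continue  # header outside `root`; skip it
        for name in INCLUDE_RE.findall(text):
            header = os.path.join(root, name)
            if header not in seen:
                seen.add(header)
                stack.append(header)
    return seen
```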
Comment 15•12 years ago
Here's evidence of someone going to great lengths to optimize their full-rebuild times on Linux and Mac: http://glandium.org/blog/?p=2908
Comment 16•12 years ago
(In reply to Justin Lebar [:jlebar] from comment #11)
> A lot of good things to reply to here.
>
> > Although we might create an xpcom.h, it would need to be compiled separately
> > in at least 23 directories under xpcom. (Or, some poor soul could go through
> > and try to consolidate all of the defines...). [...]
> > This means each PCH is not compiled just once, but once per directory.
>
> I think figuring this out is pretty key to getting PCH to work. If we have to
> do a new PCH for each directory, we're likely missing key gains in top-level
> directories where we spend a lot of time (dom, content come to mind).
>
> I wonder if things are not as bad as they seem, and if the vast majority of
> source files share the same -Ds. (Or, if they're nearly the same, if they
> could be made to be the same.)
>
> One way you could check this is by doing a clobber build, recording all of the
> command-line invocations, and counting how many sets of -Ds are used in how
> many different invocations of CC.

Ok, I did this on OSX. I used the following perl script to process the output:

    #! /usr/bin/perl
    while(<STDIN>) {
        if(m,^/usr/bin/clang,) {
            @args = split(/ /);
            $filename = $args[$#args];
            chomp($filename);
            $filename =~ m,(.*/)[^/]*,;
            $dirs{$1}++;
            $argstring = "";
            foreach $arg (sort @args) {
                if($arg =~ /^-D/) {
                    $argstring .= "$arg ";
                }
            }
            $myargs{$argstring}++;
            $numfiles++;
        }
    }
    foreach $dir (keys %dirs) {
        print "In dir $dir built " . $dirs{$dir} . " files\n";
    }
    foreach $arg (keys %myargs) {
        print "Args '$arg' used " . $myargs{$arg} . " times\n";
        if($myargs{$arg} > $max) {
            $max = $myargs{$arg};
            $maxarg = $arg;
        }
    }
    print "Max: $max, Arg = $maxarg\n";
    print "Total number of directories : " . scalar(keys %dirs) . "\n";
    print "Total number of argument strings: " . scalar(keys %myargs) . "\n";
    print "Total number of files compiled: $numfiles\n";

Basically it just grabs all of the lines that call clang, picks out the -D options, sorts them into a string, and counts the number of times that string of -D options appears. Here's the output summary:

    # snip: lots of data about each -D set
    Max: 457, Arg = -DEXPORT_XPTC_API -DEXPORT_XPT_API -DIMPL_NS_NET -DIMPL_THEBES -DIMPL_XREAPI -DMOZILLA_CLIENT -DMOZILLA_INTERNAL_API -DNDEBUG -DNO_NSPR_10_SUPPORT -DNO_X11 -DSTATIC_EXPORTABLE_JS_API -DTRIMMED -D_IMPL_NS_COM -D_IMPL_NS_GFX -D_IMPL_NS_LAYOUT -D_IMPL_NS_WIDGET
    Total number of directories : 512
    Total number of argument strings: 145
    Total number of files compiled: 5359

So 457 files share that set of -D flags, which is the most widely used set. Some other sets of flags are only used by a handful of files: there are 145 different sets of flags used across the 512 directories that contain compiled files. Unless these are consolidated, that means we'd need 145 sets of PCHs (fewer than the one-per-directory I was suggesting, but figuring out where they can be shared by settings in the Makefiles may be more difficult than a simple one-per-directory model).

> > A) Come up with a way to figure out which files should go into the PCH list.
> > This list needs to be pre-compiled for each directory that uses it, given the
> > scope of the -D flag problem. So we get xpcom/base/xpcom.pch and
> > xpcom/io/xpcom.pch which are both derived from xpcom/xpcom.h. Note that each
> > .cpp file can only have one PCH, so there's no point in trying to share them
> > across subsystems.
>
> Clang does support multiple PCH's per cpp file (the dependency graph can be
> either a list or a DAG). Given that the vast majority of developers are on
> Linux or Mac, I'd be quite happy with a solution which required clang.
>
> http://clang.llvm.org/docs/PCHInternals.html
>
> But I recognize this doesn't get us around the -D problem.

Ahh, I wasn't aware of that!
I naively assumed it was similar to gcc and MSVC. I believe that would address my concerns of adding unnecessary header dependencies, though as you point out we'd really need to slim down the -D mess to take advantage of it. Then it might be as simple as pre-processing .h files after they are installed in dist/include (I'm speculating here...)

> Instead, we should think about the differences between builds which recompile
> many files and builds which recompile few files. It's my contention that
> builds which recompile many files will be common even if we have a build system
> which never requires us to rm -rf the objdir.
>
> To wit, you end up compiling many files when you
>
> * change compiler flags (e.g., I usually make only debug builds, but
>   sometimes I need a release build, a valgrind build, or a trace-malloc
>   build).
>
> * change a header which is included by many files (we have many such
>   headers, and some, such as nsContentUtils.h, are changed frequently).
>
> * pull a change from upstream which modifies such a header. In practice, I
>   end up rebuilding much of the tree every time I pull from upstream.
>
> * switch between branches which include a change to such a header. In
>   practice, nearly all of my substantial branches include such a change.
>
> PCH helps us in all of these cases, and no build system change is going to
> substantially reduce the number of files we have to re-compile in these cases.

But what I'm pointing out is that PCH (the gcc/MSVC way) can also hurt in many of these cases. Someone may put nsContentUtils.h in the PCH, for example, because it makes a clobber build some percentage faster. Currently, without PCH, changing nsContentUtils.h will re-compile only the files that include it (e.g. in content/base/src, there are 68 that include it according to my *.o.pp files). However, if it's in the PCH, we'll re-compile everything in that directory (since every .cpp file includes the PCH), for a total of 86 files -- now you're ~26% slower.
Comment 17•12 years ago
> Unless these are consolidated, that means we'd need 145 sets of PCHs

Not necessarily, because we don't need to apply PCH to every file in the tree. But it does sound like we'd want to consolidate the defines. It's hard to tell from your data whether that's feasible.

> But what I'm pointing out is that PCH (the gcc/MSVC way) can also hurt in many of these
> cases.

Understood. I suggested one way we could try to work around this, at the end of comment 11, but it's not a great solution. If we can't figure it out, maybe we have to bail on PCH. :(
Comment 18•12 years ago
(In reply to Michael Shal [:mshal] from comment #16)
> #! /usr/bin/perl

use strict;
use warnings FATAL => 'all';

;)

> # snip: lots of data about each -D set
> Max: 457, Arg = -DEXPORT_XPTC_API -DEXPORT_XPT_API -DIMPL_NS_NET
> -DIMPL_THEBES -DIMPL_XREAPI -DMOZILLA_CLIENT -DMOZILLA_INTERNAL_API -DNDEBUG
> -DNO_NSPR_10_SUPPORT -DNO_X11 -DSTATIC_EXPORTABLE_JS_API -DTRIMMED
> -D_IMPL_NS_COM -D_IMPL_NS_GFX -D_IMPL_NS_LAYOUT -D_IMPL_NS_WIDGET
> Total number of directories : 512
> Total number of argument strings: 145
> Total number of files compiled: 5359

We need to minimize the number of unique argument strings. We should be able to start this project today, even with Makefile.in's. The problem isn't going to go away with the mechanical transition to moz.build.

As far as "to PCH or not to PCH," wouldn't it be awesome if the build backend could make intelligent choices depending on the number of invalidated targets and remembered resource usage from previous builds?
Comment 19•12 years ago
(In reply to Gregory Szorc [:gps] from comment #18)
> (In reply to Michael Shal [:mshal] from comment #16)
> > #! /usr/bin/perl
>
> use strict;
> use warnings FATAL => 'all';
>
> ;)

It was a quick hack! Avert your eyes! :)

> As far as "to PCH or not to PCH," wouldn't it be awesome if the build
> backend could make intelligent choices depending on the number of
> invalidated targets and remembered resource usage from previous builds?

I'm not sure what exactly you mean here - can you give an example?
Comment 20•12 years ago
(In reply to Michael Shal [:mshal] from comment #19)
> > As far as "to PCH or not to PCH," wouldn't it be awesome if the build
> > backend could make intelligent choices depending on the number of
> > invalidated targets and remembered resource usage from previous builds?
>
> I'm not sure what exactly you mean here - can you give an example?

The build backend would be able to attribute a cost to:

a) rebuild the PCH file and rebuild everything that depends on it
b) rebuild individual .cpp files without rebuilding the PCH
c) some variant of the above

and then automatically select the option at build time with the lowest cost. I suspect we would use PCH for clobber builds and less so for incremental builds. But that's just wild speculation which I haven't spent many brain cycles thinking through.
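The cost comparison gps describes can be made concrete with a toy model. Everything here is hypothetical: the `choose_strategy` helper and default timings are made up (the 3s/4s/5s defaults loosely echo the nsGlobalWindow numbers from comment 9, not measured data).

```python
# Toy cost model for "to PCH or not to PCH": given how many .cpp files a
# change actually invalidates, predict whether (a) rebuilding the PCH plus
# every PCH dependent or (b) rebuilding just the invalidated files without
# PCH is cheaper. All timings are illustrative defaults.

def choose_strategy(n_invalidated, n_pch_dependents,
                    t_pch=4.0, t_cpp_with_pch=3.0, t_cpp_no_pch=5.0):
    """Return ('pch'|'no-pch', predicted seconds) for this rebuild."""
    cost_pch = t_pch + n_pch_dependents * t_cpp_with_pch   # option (a)
    cost_no_pch = n_invalidated * t_cpp_no_pch             # option (b)
    if cost_pch <= cost_no_pch:
        return ("pch", cost_pch)
    return ("no-pch", cost_no_pch)

# A header inside the PCH changed: only 12 files truly depend on it,
# but all 29 files in the directory include the PCH (comment 10's case).
print(choose_strategy(n_invalidated=12, n_pch_dependents=29))
```

With these numbers the model picks the non-PCH rebuild for the incremental case, matching the intuition in comments 10 and 20, while a clobber-sized invalidation flips the decision toward PCH.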
Comment 21•11 years ago
> Currently without PCH, changing nsContentUtils.h will re-compile only the files that
> include it (eg: in content/base/src, there are 68 that include it according to my *.o.pp
> files). However if it's in the PCH, we'll re-compile everything in that directory (since
> every .cpp files includes the PCH), for a total of 86 files -- now you're ~26% slower.
Thinking about this more, I don't think this is a problem: We can use PCH and also get exact dependencies without running the preprocessor on every file.
Consider: ccache does exactly this, by searching through files for "#include". If we cached the "#include"s inside each header file, this tool could be made very fast; in the common case you'd just have to stat the relevant headers to ensure that they haven't been modified. (Using mtime here is safe, since that's what make itself relies on for correctness. In contrast, ccache has to hash files to ensure that they haven't changed, which increases its overhead here.)
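The cached-include idea above can be sketched directly. A hedged illustration: the `IncludeCache` class and its regex are hypothetical (real ccache is far more thorough), but the core trick is as described — parse each header's #includes once, then on later builds just stat() and compare mtimes instead of re-preprocessing.

```python
# Sketch of the proposed dependency shortcut: remember the #includes found
# inside each file, keyed by mtime, so a later build only re-parses files
# whose mtime changed — relying on mtime for correctness just as make does.

import os
import re

INCLUDE_RE = re.compile(r'^\s*#\s*include\s+"([^"]+)"', re.M)

class IncludeCache:
    def __init__(self):
        self._cache = {}  # path -> (mtime, [included names])

    def includes(self, path):
        """Quoted #includes of `path`, re-parsed only if the file changed."""
        mtime = os.stat(path).st_mtime
        cached = self._cache.get(path)
        if cached is not None and cached[0] == mtime:
            return cached[1]  # fast path: just a stat(), no parsing
        names = INCLUDE_RE.findall(open(path).read())
        self._cache[path] = (mtime, names)
        return names
```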
Comment 22•11 years ago
I just ran a test:
- threw in some very common includes and some GFX headers for fun;
- added CXXFLAGS += -include-pch moz-global-pch.h.pch in a few dirs.

I got a saving of 30 CPU seconds. It seems a bit less than what we're getting from unification, but noteworthy nevertheless. I think we could precompile a generic PCH for mozilla. Then each dir by default would not use PCH, could request the global PCH, or (for large modules) could specify its own PCH that the module owner would maintain. We can add something to the moz.build file to specify PCH for that directory.
Comment 23•11 years ago
(In reply to Benoit Girard (:BenWa) from comment #22)
> I got a saving of 30 CPU seconds.

Define 30 CPU seconds. If it's 30 seconds out of the 70+ minutes of CPU time used during a build, it's worthless.
Comment 24•11 years ago
I did a targeted test for gfx/layers. With the PCH header posted above, the time went from 17 to 13 CPU seconds. That's a 25% saving in CPU time, which is worth doing. This is on a directory where we unify sources, so this confirms that PCH headers are still useful with unified sources.
Comment 25•11 years ago
(In reply to Mike Hommey [:glandium] from comment #23)
> Define 30 CPU seconds. If it's 30 seconds on the 70+ minutes of cpu time
> used during a build, it's worthless.

This statement is very premature. Benoit's results are quite worth pursuing, and they will only get better as we use PCH in more places in the tree.
Comment 26•11 years ago
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #25)
> This statement is very premature. Benoit's results are quite worth
> pursuing, and they will only get better as we use PCH in more places in the
> tree.

The comment didn't mention how many directories were involved or how long those directories normally took to build, hence the question. Comment 24 is more useful for getting an idea.

That being said, I'd rather not venture into precompiled headers territory without a *very* careful plan. For example, we need to be very careful about defines (thankfully, they are now in moz.build). Maybe we should just go forward with the bug I can't find that's essentially about moving defines to specialized headers (so instead of -D foo, we'd e.g. #include "foo.h").
Comment 27•11 years ago
Found it: bug 902825.
Comment 28•11 years ago
(In reply to comment #26)
> With that being said, I'd rather not venture in precompile headers territory
> without a *very* careful plan. For example, we need to be very careful about
> defines (thankfully, they are now in moz.build).

The defines are not going to be a problem for now. I did some experiments today, and clang only cares about the defines which are actually used in the headers that you precompile; other defines are ignored during AST validation. In addition to the defines, we need to be careful with things such as LOCAL_INCLUDES which change the include search order, flags affecting the codegen target such as -msse2 (because precompiled header files are not target agnostic), and perhaps other things which we are going to find out about along the way.

Here is my proposed plan:

1. Start small: create a mozilla-precompiled.h which is included *everywhere* and add things such as libc headers to it. Work on that patch and get it landed.
2. Get a list of _highly_ popular headers everywhere in our tree, verify that they do not depend on defines etc., and add them to mozilla-precompiled.h.
3. Once we're there, and only then, start to worry about other headers.

How does this sound?
Comment 29•11 years ago
(In reply to Mike Hommey [:glandium] from comment #26)
> For example, we need to be very careful about
> defines (thankfully, they are now in moz.build).

Yes, of course.

> Maybe we should just go
> forward with the bug that i can't find that's essentially about moving
> defines to specialized headers (so instead of -D foo, we'd e.g. #include
> "foo.h").

Bug 902825 has a much larger scope than that, which I won't go into here. Moving defines to headers is an interesting idea, but at this point we have no solid reason to block PCH on having that piece.
Comment 30•11 years ago
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #28)
> How does this sound?

This sounds reasonable, but we need to ensure that those headers do not ever depend on defines.
Comment 31•11 years ago
That is, not just when we add them to mozilla-precompiled.h.
Comment 32•11 years ago
(In reply to comment #30)
> This sounds reasonable, but we need to ensure that those headers do not ever
> depend on defines.

Yes. I still don't have a good solution for that...
Comment 33•11 years ago
(In reply to Mike Hommey [:glandium] from comment #30)
> This sounds reasonable, but we need to ensure that those headers do not ever
> depend on defines.

I guess these defines are not only from the command line, but also added by moz.build files? Can't we do something like:

if ! cmp -s <($CC -E $ALL_THE_OPTIONS -o - $header) <($CC -E -o - $header); then
  echo header depends on defines, fixme
fi

in any necessary directory?
Comment 34•11 years ago
(In reply to Nathan Froyd (:froydnj) (AFK 23-30 November) from comment #33)
> Can't we do something like:
>
> if ! cmp -s <($CC -E $ALL_THE_OPTIONS -o - $header) <($CC -E -o - $header); then
>   echo header depends on defines, fixme
> fi
>
> in any necessary directory?

It's not that easy. See prlog.h for example. You can have the following in a translation unit:

#define FORCE_PR_LOG
#include "prlog.h"

Therefore we cannot precompile that header. It's not clear to me how your proposal would detect that situation.
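A small sketch that makes the failure mode concrete. The header here is mocked up in the spirit of prlog.h, and the -DDEBUG=1 define stands in for $ALL_THE_OPTIONS; gcc is assumed to be available and the script skips otherwise.

```shell
set -e
d=$(mktemp -d)
# Mocked-up header: its expansion depends on a macro the *consumer* may
# define before including it, exactly like FORCE_PR_LOG and prlog.h.
cat > "$d/mylog.h" <<'EOF'
#if defined(FORCE_LOG)
int log_enabled = 1;
#else
int log_enabled = 0;
#endif
EOF

if command -v gcc >/dev/null 2>&1; then
  # The comment 33 check: same preprocessed output with and without the
  # build's command-line defines (-DDEBUG=1 stands in for the real flags)...
  gcc -E -DDEBUG=1 "$d/mylog.h" > "$d/with_flags.i"
  gcc -E "$d/mylog.h" > "$d/plain.i"
  if cmp -s "$d/with_flags.i" "$d/plain.i"; then
    verdict="looks define-free"
  else
    verdict="depends on defines"
  fi
  # ...yet a translation unit can still flip the header's expansion:
  tu_effect=$(gcc -E -DFORCE_LOG "$d/mylog.h" | grep -c 'log_enabled = 1')
else
  verdict="gcc unavailable"
  tu_effect=1
fi
```

So the -E comparison only catches sensitivity to the build's own command-line defines; it cannot see defines a .cpp file sets before its first #include, which is why headers like prlog.h would have to be excluded by hand.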
Comment 35•11 years ago
Comment 36•11 years ago
This patch implements PCH support for the entire tree, and has only been tested with clang on OS X. I went through the most popular included headers across the code base and precompiled most of them. Here's the size of the database with this patch:

$ ls -lah obj-ff-dbg/mozilla-config.h.precompiledheader
-rw-r--r--  1 ehsan  staff   3.4M  1 Dec 16:06 obj-ff-dbg/mozilla-config.h.precompiledheader

And here is my timing of the effect of this patch:

Before:
real 7m24.288s
user 37m20.551s
sys 4m26.605s

After:
real 7m22.874s
user 36m43.693s
sys 4m24.324s

The effect of this patch for me is between 40-60 seconds of CPU time, but always a negligible amount of wall time. I guess a big enough part of my build is so far from fully parallel that CPU time improvements of this size get diminished in the wall time. (And indeed, according to mach resource-usage, I'm now spending around 2:30 of this build with very poor parallelism.)

I'd like to ask for feedback on the patch, and also that other people please grab the patch, use it on other machines, and measure their build times before and after. Then we can decide whether we should pursue this path.
Assignee: nobody → ehsan
Attachment #8340794 -
Attachment is obsolete: true
Comment 37•11 years ago
I'd suggest trying again with some large but less common headers like jsapi.h, layers.h and the like. When I tried including simple but common headers like you have here, I didn't see a win either (back when I was testing for comment 22). You need to be more aggressive with large headers.
Comment 38•11 years ago
Comment on attachment 8340800 [details] [diff] [review]
Experimental support for precompiled headers with clang

Hang on: a very weird, seemingly context-sensitive behavior in the expansion of $< made me overlook a bug in my patch which caused the entire tree to be built *without* the precompiled header. :( So my timings above are basically useless. I need to fix up the patch to make the tree build again and then submit actual numbers.
Updated•11 years ago
Attachment #8340800 -
Attachment is obsolete: true
Comment 39•11 years ago
You need CXXPCHFLAGS =, not CXXPCHFLAGS :=
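For anyone else hitting this: the distinction is GNU make's immediate (`:=`) vs. deferred (`=`) expansion. A hypothetical sketch below (variable names are made up, not from the actual patch; make is assumed to be installed):

```shell
# := expands its right-hand side at assignment time, so a variable set
# later in the Makefile (or in an included config.mk) isn't seen yet;
# = defers expansion until the variable is used. Names are illustrative.
set -e
d=$(mktemp -d)
{
  printf '%s\n' 'IMMEDIATE := -include-pch $(PCH_BASENAME).pch'
  printf '%s\n' 'DEFERRED = -include-pch $(PCH_BASENAME).pch'
  printf '%s\n' 'PCH_BASENAME := mozilla-config.h'
  printf '%s\n' 'show:'
  printf '\t%s\n' '@echo "immediate: [$(IMMEDIATE)]"'
  printf '\t%s\n' '@echo "deferred: [$(DEFERRED)]"'
} > "$d/Makefile"

if command -v make >/dev/null 2>&1; then
  out=$(make -s -C "$d" show)
else
  out="make unavailable"
fi
```

Since PCH_BASENAME is only assigned after the `:=` line, the immediate variable ends up as "-include-pch .pch" while the deferred one expands correctly when the recipe runs; in the tree, where config.mk is included after many assignments, only `=` works.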
Comment 40•11 years ago
OK, here's an actual patch which builds, and here is the build time:

real 7m2.328s
user 34m51.262s
sys 4m7.972s

So this gives us around 5-6% improvement across the tree. I should also note that I stopped precompiling some of the headers because they were depending on MOZILLA_INTERNAL_API.
Comment 41•11 years ago
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #40)
> So this gives us around 5-6% improvement across the tree. I should also
> note that I stopped precompiling some of the headers because they were
> depending on MOZILLA_INTERNAL_API.

Very few C++ files are built without MOZILLA_INTERNAL_API. Maybe the right thing to do here is to use precompiled headers only for C++ files that are built with MOZILLA_INTERNAL_API (which essentially is whenever LIBXUL_LIBRARY is set).
Comment 42•11 years ago
Also, please measure the build time difference with gcc too, if you can, because I suspect the wins would be bigger there.
Comment 43•11 years ago
(In reply to Mike Hommey [:glandium] from comment #41)
> Very few c++ files are built without MOZILLA_INTERNAL_API. Maybe the right
> thing to do here is to use precompiled headers only for C++ files that are
> built with MOZILLA_INTERNAL_API (which essentially is whenever
> LIBXUL_LIBRARY is set).

Is there an easy build system level trick which I can employ to differentiate those files? I agree that limiting PCH to those files makes a lot of sense.

(In reply to Mike Hommey [:glandium] from comment #42)
> Also, please measure the build time difference with gcc too, if you can,
> because i suspect the wins would be bigger there.

Will do. For now I've only focused on my local build environment, but extending this to GCC should be fairly simple.
Comment 44•11 years ago
(In reply to :Ehsan Akhgari (needinfo? me!) from comment #43)
> Is there an easy build system level trick which I can employ to
> differentiate those files? I agree that limiting PCH to those files makes a
> lot of sense.

ifdef LIBXUL_LIBRARY in rules.mk or Makefile.in (but only after config.mk is included).
Comment 45•11 years ago
This patch adds gcc support, but it doesn't actually work yet. The gcc PCH docs are a disaster: apparently there is no way to tell gcc to use a given PCH file directly; you have to do it indirectly by placing a header.h.gch file in the same directory as the header. Except that gcc doesn't pick it up during the build at all. I tested this by sticking a #error in mozilla-config.h after the PCH was generated. gcc has warnings that it issues when it cannot use a PCH file it found, and even with those turned into errors they're never triggered, so I suspect that gcc doesn't even try to pick up the .gch file at all. And to make things worse, all of this works when I try out the commands on a small test.cpp/pch.h that I keep around for testing purposes; it's not clear to me why it stops working in our build system.
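For comparison, here is the minimal standalone flow that should work, using the same #error poisoning trick described above to prove which copy of the header gcc actually read. The file names are made up for illustration, and gcc is assumed to be installed.

```shell
# gcc has no "use this PCH" flag; it silently picks up header.h.gch when
# it would otherwise open header.h. The #error trick below proves which
# copy was used. File names are illustrative only.
set -e
d=$(mktemp -d)
printf '#include <stdio.h>\n' > "$d/pch.h"
printf '#include "pch.h"\nint main(void){ puts("hi"); return 0; }\n' > "$d/main.c"

if command -v gcc >/dev/null 2>&1; then
  # Precompile the header next to itself, as gcc requires.
  gcc -x c-header "$d/pch.h" -o "$d/pch.h.gch"
  # Poison the textual header; if compilation still succeeds, the .gch won.
  printf '#error textual header was used\n' > "$d/pch.h"
  if gcc -Winvalid-pch -c "$d/main.c" -o "$d/main.o" 2>/dev/null; then
    result="gch used"
  else
    result="textual header used"
  fi
else
  result="gcc unavailable"
fi
```

If this succeeds standalone but not in the tree, the usual suspects are mismatched flags between the precompile and the real compiles (gcc refuses a .gch built with different options) or tokens appearing before the #include in the translation unit.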
Attachment #8340817 -
Attachment is obsolete: true
Comment 46•11 years ago
Note that I haven't tried comment 44 yet; I would like to figure out how we can convince gcc to use the PCH first.
Comment 47•11 years ago
This patch only uses PCH on internal API code, and precompiles a few more headers. With this patch, I get the following times:

Before:
real 7m1.261s
user 35m28.100s
sys 4m2.561s

After:
real 6m37.862s
user 34m14.911s
sys 3m53.180s

The gcc support still doesn't work. glandium, what do you think about this patch? (Note that if we don't figure out the gcc problem, we can disable it there and land the initial support for clang only.)
Attachment #8341100 -
Attachment is obsolete: true
Attachment #8344414 -
Flags: feedback?(mh+mozilla)
For Windows:

To create the PCH:
cl -FIforced.h -Ycforced.h -Fphdrs.pch -c -TP -Foignored.obj forced.h
(-TP: compile everything as C++; -Fo: use this output .obj file)

To use the PCH:
cl -FIforced.h -Yuforced.h -Fphdrs.pch hello.cpp
Comment 49•11 years ago
With comment 48, this should be very straightforward to port to MSVC, where it is presumably the most useful.
Comment 50•9 years ago
Comment on attachment 8344414 [details] [diff] [review]
Experimental support for precompiled headers with clang and gcc

Clearing the flag, since even if I got feedback after 2 years, I wouldn't be motivated to address it.
Attachment #8344414 -
Flags: feedback?(mh+mozilla)
Updated•7 years ago
Assignee: ehsan → nobody
Comment hidden (spam)
Updated•6 years ago
Product: Core → Firefox Build System
Updated•2 years ago
Severity: normal → S3