Missing dependency information prevents parallelized building of NSS
Categories
(NSS :: Build, defect, P3)
Tracking
(Not tracked)
People
(Reporter: georg.schwarz, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: helpwanted)
Attachments
(12 files, 6 obsolete files)
484 bytes,
patch
|
Details | Diff | Splinter Review | |
588 bytes,
patch
|
Details | Diff | Splinter Review | |
196 bytes,
patch
|
Details | Diff | Splinter Review | |
5.02 KB,
patch
|
Details | Diff | Splinter Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review | |
47 bytes,
text/x-phabricator-request
|
Details | Review |
User-Agent: iCab/2.9.8 (Macintosh; U; PPC) Build Identifier: When building nss 3.9.2 on a Dual-Pentium system running Debian Linux using make -j2 (GNU make 3.79.1) I end up like this: schwarz@sendai:~/pkgsrc/devel/nss/work/nss-3.9.2/mozilla/security/nss$ env BSD_LDOPTS="-L/usr/local/pkg/lib/nspr -Wl,-R/usr/local/pkg/lib/nspr -Wl,-R/usr/local/pkg/lib/nss" LIBRUNPATH=/usr/local/pkg/lib BUILD_OPT=1 make -j2 nss_build_all cd ../coreconf ; make cd ../dbm ; make export libs make[1]: Entering directory `/usr/src/pkgsrc/pkgsrc/devel/nss/work/nss-3.9.2/mozilla/security/coreconf' make[1]: Entering directory `/usr/src/pkgsrc/pkgsrc/devel/nss/work/nss-3.9.2/mozilla/security/dbm' cd nsinstall; make export cd include; make export make[2]: Entering directory `/usr/src/pkgsrc/pkgsrc/devel/nss/work/nss-3.9.2/mozilla/security/coreconf/nsinstall' make[2]: Entering directory `/usr/src/pkgsrc/pkgsrc/devel/nss/work/nss-3.9.2/mozilla/security/dbm/include' make[2]: Nothing to be done for `export'. make[2]: Leaving directory `/usr/src/pkgsrc/pkgsrc/devel/nss/work/nss-3.9.2/mozilla/security/coreconf/nsinstall' cd nsinstall; make libs ../../coreconf/nsinstall/Linux2.4_x86_glibc_PTH_OPT.OBJ/nsinstall -R -m 444 ../../../dbm/include/nsres.h ../../../dbm/include/cdefs.h ../../../dbm/include/mcom_db.h ../../../dbm/include/ncompat.h ../../../dbm/include/winfile.h ../../../dist/public/dbm make[2]: ../../coreconf/nsinstall/Linux2.4_x86_glibc_PTH_OPT.OBJ/nsinstall: Command not found make[2]: *** [export] Error 127 make[2]: Leaving directory `/usr/src/pkgsrc/pkgsrc/devel/nss/work/nss-3.9.2/mozilla/security/dbm/include' make[1]: *** [export] Error 2 make[1]: Leaving directory `/usr/src/pkgsrc/pkgsrc/devel/nss/work/nss-3.9.2/mozilla/security/dbm' make: *** [build_dbm] Error 2 make: *** Waiting for unfinished jobs.... make[2]: Entering directory `/usr/src/pkgsrc/pkgsrc/devel/nss/work/nss-3.9.2/mozilla/security/coreconf/nsinstall' gcc -o Linux2.4_x86_glibc_PTH_OPT.OBJ/nsinstall.o -c -O2 -fPIC -DLINUX1_2 -Di386 -D_XOPEN_SOURCE -DLINUX2_1 -ansi -Wall -pipe -DLINUX -Dlinux -D_POSIX_SOURCE -D_BSD_SOURCE -DHAVE_STRERROR -DXP_UNIX -UDEBUG -DNDEBUG -D_REENTRANT -I../../../dist/Linux2.4_x86_glibc_PTH_OPT.OBJ/include -I../../../dist/public/coreconf -I../../../dist/private/coreconf nsinstall.c mkdir: cannot create directory `Linux2.4_x86_glibc_PTH_OPT.OBJ': File exists make[2]: *** [Linux2.4_x86_glibc_PTH_OPT.OBJ/pathsub.o] Error 1 make[2]: *** Waiting for unfinished jobs.... make[2]: Leaving directory `/usr/src/pkgsrc/pkgsrc/devel/nss/work/nss-3.9.2/mozilla/security/coreconf/nsinstall' make[1]: *** [libs] Error 2 make[1]: Leaving directory `/usr/src/pkgsrc/pkgsrc/devel/nss/work/nss-3.9.2/mozilla/security/coreconf' make: *** [build_coreconf] Error 2 omitting the -j option makes the build process work fine. Reproducible: Always
Comment 1•18 years ago
|
||
This is a known problem. We don't plan to fix this. Our makefiles were written a long time ago, and it's a lot of work to parallelize makefiles after they were written. We would appreciate any help though.
Updated•17 years ago
|
Updated•17 years ago
|
Updated•12 years ago
|
Comment 3•12 years ago
|
||
Except for one hunk (which creates a symlink from .so.1 to .so), the changes have to do with dependency information: `all' depends on `libs', which depends on `export'. If any PRIVATE_EXPORTS are defined, then `export' also depends on `private_export'. Various files (such as objects) explicitly depend on the OBJDIR -- which has a rule of its own. This not only helps build-parallelization, but is faster -- because make can remember, that it already created the OBJDIR and does not need to check, whether it exists (and is a directory) for every target.
Comment 4•12 years ago
|
||
This patch ensures, coreconf bits (nsinstall) are built before `export' is attempted.
Comment 5•12 years ago
|
||
Comment 6•12 years ago
|
||
Comment 7•12 years ago
|
||
In addition to the four submitted patches, the explicit export:: private_export needs to be removed from all Makefiles, where it is found, as it breaks things. Something like: find mozilla/security/nss -name Makefile | xargs /usr/bin/grep -l '^export:: private_export' | xargs /usr/bin/sed -i.bak -e '/^export:: private_export/d' After that I was able to build nss with -j2, -j4, and -j11 on two different FreeBSD machines.
Comment 8•12 years ago
|
||
Mikhail, thank you for your patches. I am surprised how simple they are. This weekend, I experimented on getting NSS to build in parallel and by using non-recursive make for NSS. My changes were much bigger than your changes (I completely replaced coreconf/rules.mk with my own). I found that if I add dependencies on $(OBJDIR) then, at least on Windows, the next rebuild would rebuild everything that depended on it. I guess this is because the directory's timestamp changes during the build? For that reason, I had to write dependencies as $(OBJDIR)/directory.exists, and write: $(OBJDIR)/directory.exists: mkdir $(OBJDIR) touch $@ Perhaps I was doing something wrong though. How much faster is the build with your patches, "make -j1" vs "make -j4" vs "make j11"? Can you post the timings from the "time" command?
Comment 9•12 years ago
|
||
(In reply to Brian Smith (:bsmith) from comment #8) > I found that if I add dependencies on $(OBJDIR) then, at least on Windows, > the next rebuild would rebuild everything that depended on it. You are right. I'm working with the FreeBSD port of nss -- where the code is only built once anyway so I did not notice this effect. But it, likely, is not acceptable for ongoing development. Your method (of depending on a file inside $(OBJDIR)) seems find, though. > How much faster is the build with your patches, "make -j1" vs > "make -j4" vs "make j11"? Here is from my quad Xeon FreeBSD/amd64 machine: mi@narawntapu:ports/security/nss (1067) rm -rf work/ ; time csh -c "make MAKE_JOBS_NUMBER=11 > & m11" 74.584u 17.719s 0:44.58 207.0% 6052+2127k 138+1195io 144pf+0w mi@narawntapu:ports/security/nss (1068) rm -rf work/ ; time csh -c "make MAKE_JOBS_NUMBER=4 > & m4" 72.791u 16.854s 0:44.33 202.2% 6025+2121k 2+1189io 144pf+0w mi@narawntapu:ports/security/nss (1069) rm -rf work/ ; time csh -c "make MAKE_JOBS_NUMBER=1 > & m1" 69.714u 14.179s 1:22.40 101.7% 5876+2084k 0+1191io 144pf+0w Not much difference between 11 and 4, but m1 is much slower...
Comment 10•12 years ago
|
||
Can anyone elaborate on the reason, `nsinstall -D' is used to create directories instead of mkdir (or md on Windows)?
Comment 11•12 years ago
|
||
(In reply to Brian Smith (:bsmith) from comment #8) > $(OBJDIR)/directory.exists: > mkdir $(OBJDIR) > touch $@ A better choice for a file inside $(OBJDIR) would've been $(OBJDIR)/depend.mk, which is supposed to be regenerated by mkdepend. Unfortunately, it seems, that content of coreconf/mkdepend has bit-rotted (does not build out-of-the-box, time-stamps are from Feb 2009)... Does it make sense to fix the tool and make the build use it every time, or is no one relying on it anyway -- and it should be ripped-out from rules.mk to make the file more readable? Please, advise. Thanks!
Comment 12•12 years ago
|
||
This incorporates Brian's suggestion for depending on a file inside $(OBJDIR). Timings are the same as before, but subsequent invocations of gmake do not rebuild everything over again, because "directory.exists" files are already there.
Comment 13•12 years ago
|
||
(In reply to Mikhail Teterin from comment #11) > Does it make sense to fix the tool and make the build use it every time, or > is no one relying on it anyway -- and it should be ripped-out from rules.mk > to make the file more readable? Please, advise. Thanks! I would say that we should not rely on the mkdepend tool for an initial patch, and that we should also avoid removing the rules for it in rules.mk. In my modified version of the build system, I simply make every *.obj and every *.res file depend on every header--i.e. it basically forces a full rebuild when any header changes. That isn't ideal, but it is basically what we have to do now when we touch coreconf/coreconf.dep after editing a header. I will post my patches for the de-recursified build system for NSS soon. Then we can decide which path to go down, regarding optimizing the build system. If I understand your numbers correctly, you ended up making the full build finish in about half the time of the current build system. In my de-recursified version, for far, the build finishes in about one fourth the time of the current build system. But, my patch is much larger than yours and it isn't finished yet.
Comment 14•3 years ago
|
||
If these files exist and aren't directories, there might be other
problems. Trying to fix them just will break the build.
Comment 15•3 years ago
|
||
Depends on D68692
Comment 16•3 years ago
|
||
This way we can separate shared and static build much easier.
And IMPORT_LIB_SUFFIX if now really a WIN32 platform specific.
Comment 17•3 years ago
|
||
I have no real clue, why PROGRAMS is actually working in the
sequence build. There is no special make code really handling it,
except for the install target.
This patches code is inspired by the $(eval ...) example in the
GNU make documentation. It generates a program specific make
target and maps the programs objects based on the defined
variables.
Depends on D69015
Comment 18•3 years ago
|
||
Stuff actually depends on the generated target to be copied into
the dist/ directory. And order-depend on the SOURCE_* directories,
so they are created before any copy attempts.
Depends on D69016
Comment 19•3 years ago
|
||
If these files exist and aren't directories, there might be other
problems. Trying to "fix" them by removing will break the build.
Depends on D69017
Comment 20•3 years ago
|
||
This still serializes many targets, but at least these targets
themself run their build in parallel. The main serialization
happens in nss/Makefile and nss/coreconf/rules.mk's all target.
We can't add these as real dependencies, as all Makefile snippets
use the same variable names. I tried to always run sub-makes to
hack in the depndencies, but these don't know of each other, so
targets very often run twice, and this breaks the build.
Having a tests:: target and a tests directory leads to misery (and
doesn't work), so it's renamed to check.
Depends on D69018
Comment 21•3 years ago
|
||
Copying private headers is now simply included in the exports
target, as these headers use an extra directory anyway.
Depends on D69019
Comment 22•3 years ago
|
||
Gets rid of a lot of duplicate Makefile code.
Depends on D69021
Comment 23•3 years ago
|
||
Double-colon rule behaviour isn't really compatible with parallel
build. This gets rid of all of them, so we can codify the
directory dependencies.
This leaves just three problems, which aren't really fixable with
the current build system without completely replacing it:
- everything depends on nsinstall
- everything depends on installed headers
- ckfw child directories depend on the build parent libs
This is handled by the prepare_build target.
Overall this allows most if the build to run in parallel.
Depends on D69022
Updated•3 years ago
|
Comment 24•3 years ago
|
||
In my build none of the dlls have the "32" included in their name.
Comment 25•3 years ago
|
||
Bob,
I've done a first pass through these patches but still need to rekindle my Make-fu. In the meantime, could you suggest someone at RH to add as a reviewer for these Makefile changes, too? They look to be quite valuable! Thanks!
Comment 26•3 years ago
|
||
either Daiki or I would be fine. You can add me if you want. (serves me right for passing on a gmake patch review to you yesterday).
bob
Comment 27•3 years ago
|
||
- Drop "32" default DLL suffix
- Generalize resource file handling
- Generate IMPORT_LIBRARY based on IMPORT_LIB_SUFFIX
Comment 28•3 years ago
|
||
I just realized, that all the patches from Phabricator are mirrored in Bugzilla and there are some additional comments. Maybe I should have added the following info here, before I starting submitting my patches.
My main motivation for these patches is making the LibreOffice Windows CI builds a few minutes faster, because they have to run the NSS build with -j1, even if the CI VM has a multitude of CPUs / cores. For Linux and Mac it's a smaller problem, as these platforms use ccache normally, but the first build is still slow.
FWIW: the original work was done in https://gerrit.libreoffice.org/c/core/+/74751 and there even was my - also reverted - predecessor in https://gerrit.libreoffice.org/c/core/+/62011
LO tested gyp + ninja last month (some info in https://lists.freedesktop.org/archives/libreoffice/2020-February/084461.html), but had too many different failures on Windows, so that change was reverted in general after a week of fixes in https://gerrit.libreoffice.org/c/core/+/91199.
I didn't look into this in detail, but AFAIK we had some problems with crashing python binaries for Win32 builds, rpath problems, utf-8 encoding, etc.
And there are a few more NSS build patches in https://git.libreoffice.org/core/+/refs/heads/master/external/nss/, but I guess most are a result of the way LO builds NSS on all the different platforms.
Updated•3 years ago
|
Comment 29•3 years ago
|
||
The google_test gtest build doesn't provide any exports for the
shared library on Windows and the gyp build also builds just a
static library. So build gtest and gtestutil libraries as static.
For whatever reason, the Windows linker doesn't find the main
function inside the gtestutil library, if we don't tell it to
build a console executable. But linking works fine, if the object
file is used directly. But since we can have different main()
objects based on build flags, we enforce building console
applications binaries.
Depends on D69021
Comment 30•3 years ago
|
||
I just pushed a stripped down patchset. I removed all the cleanup patches, except for the private_exports patch, which saves one complete walk of the whole NSS source tree. It now passes the LibreOffice CI build on Linux, Windows and MacOS with debug and release config, including the previously failing gtests build. I'll open a new bug for all the patches I've currently removed / couldn't push, because of the missing bug number.
My main "problem" is still the freebl Makefile. While the current solution in D69023 seem to work, I don't like it. OTOH it's the already existing solution without the double-colon libs target. I kind of understand why it works, but I don't like overwriting the libs: rule. What I don't understand is the fact, that using TARGETS += and .PHONY += instead didn't work for me in my LibreOffice environment, but it worked in my separate NSS only environment :-(
The missing patches remove a few thousand lines of Makefile comments, duplicate empty lines, a further reworked LIBRARY_NAME rule and remove a few now basically empty config.mk snippets. There is also a minor VERSION define change in the pk11wrap code, to get rid of the naming "clash" with the LIBRARY_VERSION in the Makefile (not really a clash, as the library is actually static, but it's confusing to read). Current diffstat of the remaining patches is "239 files changed, 213 insertions(+), 4757 deletions(-)".
Thanks for the review and comments so far.
Comment 31•3 years ago
|
||
Comment on attachment 9137161 [details]
Bug 290526 Use eval templates for export and freebl
Revision D69022 was moved to bug 1629553. Setting attachment 9137161 [details] to obsolete.
Comment 32•3 years ago
|
||
Comment on attachment 9139466 [details]
Bug 290526 Cleanup WIN shared library rules
Revision D70369 was moved to bug 1629553. Setting attachment 9139466 [details] to obsolete.
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Updated•3 years ago
|
Comment 33•3 years ago
|
||
https://hg.mozilla.org/projects/nss/rev/d30a6953b897a8c8beff5ac5e29c7d75d71530ff
https://hg.mozilla.org/projects/nss/rev/585942b1d556a689b72e6a9f84c6ee23413f07a4
https://hg.mozilla.org/projects/nss/rev/fb377d36262de7bd204187491dc2e3455b97fee0
https://hg.mozilla.org/projects/nss/rev/5d0bfa092e0fc37be1791132fa00f7f2e74f5a9b
https://hg.mozilla.org/projects/nss/rev/989ecbd870f3c2f81f0e2c559e277b11aa0c577b
https://hg.mozilla.org/projects/nss/rev/a82a55886c1d84ac17c9fd6c31ce03f682e93cfe
https://hg.mozilla.org/projects/nss/rev/f3a0ef69c0569c8512845075a48e796bc08bf636
Comment 34•3 years ago
|
||
While the dependency information is now fixed, the parallel debug build will still fail on Windows, because the NSPR patch isn't merged yet.
And will someone adapt the CI, so it'll test the parallel build? Should I file an additional bug for that?
Comment 35•3 years ago
|
||
and NSPR .. oops, thanks Jan-Marek!
https://hg.mozilla.org/projects/nspr/rev/23940b78e965762287b772a8b6e7acf314d1aa2f
For parallel building, I think we should file a companion bug to https://bugzilla.mozilla.org/show_bug.cgi?id=1637102. IMO we should just make all Make builds parallel for CI.
If you want to file the bug and take a crack at editing the build config - it should just be this line and the equiv ones to set MAKEOPTS
to -j16
or similar: https://searchfox.org/nss/rev/81ca42b7138b63ec48eec02fea6f76d87a099c60/automation/taskcluster/graph/src/extend.js#196
Comment 36•3 years ago
|
||
How soon do you need to pick up the NSPR change? FF 78 or 79?
Updated•3 years ago
|
Comment 37•3 years ago
|
||
AFAIK Firefox uses gyp to build NSS (from bug 1638289), so probably also NSPR. So while the NSS patches are in 3.53, the included NSPR version is still 4.25, which doesn't include the patch. In the end that NSPR change just matters for NSPR, configure + make, parallel, Windows, MS VS, debug
builds. But since the FF builds aren't constantly failing and AFAIK nobody ever complained about this failure, I seem to be the only one running this kind of builds. Mainly - I guess - because everyone had to build NSS sequential since "ever". And I'll just patch LibreOffice's NSS build with this minimal change, until it lands upstream.
So from my POV the NSPR patch doesn't matter for FF at all.
Comment 38•3 years ago
|
||
The NSPR patch should be in this next release set. Clearing my needinfo -- and Ben made some progress in Bug 1640328, just have to figure out some gcc-4 issues.
Description
•