Investigate clang LTO parallelism
Categories
(Firefox Build System :: General, enhancement)
Tracking
(Not tracked)
People
(Reporter: glandium, Unassigned)
Details
Comment 1•6 years ago
Comment 2•5 years ago
I am wondering whether the jobs defaults for all of our linkers get set properly with LTO. I think the right thing magically happens for ld64 on our Mac cross-compiles. I am less certain whether it happens correctly for binutils ld on our Linux jobs, and judging by resource usage graphs on our shippable Windows jobs, it doesn't happen at all there (COFF lld doesn't seem to expose --thinlto-jobs). Maybe we should investigate a little harder? (A separate "link" tier to clearly delineate what CPU usage looks like during linking?)
judging by resource usage graphs on our shippable Windows jobs, it doesn't happen at all there (COFF lld doesn't seem to expose --thinlto-jobs).
COFF spells it as /opt:lldltojobs=N.
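For anyone following along, the two spellings mentioned so far can be put side by side in a small sketch. This is only an illustration, not the build system's actual code: the helper name thinlto_jobs_flag and the linker labels "ld.lld" and "lld-link" are hypothetical, and it emits raw linker arguments (going through the clang driver would need the usual wrapping). Only the two flags named in this bug are used.

```python
def thinlto_jobs_flag(linker, jobs):
    """Return the lld argument that sets the number of ThinLTO jobs.

    `linker` is a hypothetical label for the lld driver in use:
    "ld.lld" for the ELF driver, "lld-link" for the COFF driver.
    """
    if jobs < 1:
        raise ValueError("jobs must be a positive integer")
    if linker == "lld-link":
        # COFF lld spells the option as /opt:lldltojobs=N.
        return "/opt:lldltojobs=%d" % jobs
    if linker == "ld.lld":
        # ELF lld spells the option as --thinlto-jobs=N.
        return "--thinlto-jobs=%d" % jobs
    raise ValueError("unknown linker: %r" % (linker,))
```

Passing an explicit positive N, rather than relying on whatever the linker picks by default, is what the Windows experiment discussed in this bug amounts to.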
Comment 4•5 years ago
(In reply to :dmajor from comment #3)
judging by resource usage graphs on our shippable Windows jobs, it doesn't happen at all there (COFF lld doesn't seem to expose --thinlto-jobs).
COFF spells it as /opt:lldltojobs=N.
Ah, indeed. But looking through lib/LTO/LTO.cpp and lib/Support/ThreadPool.cpp, I think the default of 0 gives completely bogus results.
Comment 5•5 years ago
To see whether it makes any difference, the trivial patch for Windows:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1014bed3cb2562b771c479c1146462c51681f110
Hmm, apparently there's also an lldltopartitions? It's not entirely clear to me what the difference is, although this may be a starting point: https://reviews.llvm.org/D29059#665077.
Comment 7•5 years ago
(In reply to Nathan Froyd [:froydnj] from comment #5)
To see whether it makes any difference, the trivial patch for Windows:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=1014bed3cb2562b771c479c1146462c51681f110
OK, I'm not completely sure if this made a difference. The 64-bit shippable build completed in 70 minutes, with a resource utilization graph that looks like:
https://taskcluster-artifacts.net/F7DtTqx4QeyJyzoBV65uDA/0/public/build/build_resources.html
Three out of the last four builds on central, as of this writing, look like:
https://taskcluster-artifacts.net/MBnV5UPoSzqFPELxRAfbsA/0/public/build/build_resources.html
https://taskcluster-artifacts.net/AjmRGX-7QQqDrONLcolAPw/0/public/build/build_resources.html
https://taskcluster-artifacts.net/LtE55tpVRnu_P9qD0_qnSA/0/public/build/build_resources.html
and the times are somewhere in the 80+ minute range. So, similar graphs, with slightly more CPU usage before we drop off 100% utilization...I think we are winning?
There are other jobs on central that look like:
https://taskcluster-artifacts.net/QjfviU_0QTuoLHWYKS7kyQ/0/public/build/build_resources.html
and have build times similar to what the try push did. I don't know how the build times can vary so much; these builds shouldn't be sccached or anything like that, so we're just doing "how fast does a clean build go", and 20%ish variation on that seems...not great.
So I think something improved? I might be pushing from an old tree without some of glandium's recent build improvements, though.
(In reply to :dmajor from comment #6)
Hmm, apparently there's also an lldltopartitions? It's not entirely clear to me what the difference is, although this may be a starting point: https://reviews.llvm.org/D29059#665077.
I would like to understand what this option does too.
Comment 8•5 years ago
New try push adding in lldltopartitions:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=8d2897ebc2e7166d1ab1b77157ebddb67a4850e8
Updated•2 years ago