Open
Bug 521435
Opened 15 years ago
Updated 2 years ago
teach gcc builds to use LTO
Categories
(Firefox Build System :: General, defect)
Tracking
(Not tracked)
NEW
People
(Reporter: graydon, Unassigned)
Attachments
(4 files)
Last Saturday, gcc acquired the ability to do link-time optimization (LTO), the moral equivalent of msvc's /LTCG option. We should support this -- or at least give it a try -- on our gcc platforms (mac, linux, etc.) as it's likely to give a substantial across-the-board speedup. http://gcc.gnu.org/wiki/LinkTimeOptimization
Comment 1•15 years ago
(Er, note, this would be a speedup in execution time. Build time will probably slow down, perhaps significantly.)
Comment 2•15 years ago
I assume this would require a GCC upgrade?
Comment 3•15 years ago
It's only been merged into the GCC trunk so far. AIUI it will be in GCC 4.5.0. Judging from prior releases 4.5.0 will come out some time in Q4. I've seen a few bugs cropping up on the GCC mailing list so it might be worth holding off until 4.5.0 is out, unless you're feeling optimistic.
Comment 4•15 years ago
Moving to future until a stable release of GCC happens.
Component: Release Engineering → Release Engineering: Future
Comment 5•15 years ago
This should go to Core:Build Config to get support in our build system. If we get to the point of wanting to switch to a new stable version of GCC for nightlies/releases then please file a bug against RelEng for that.
Component: Release Engineering: Future → Build Config
Product: mozilla.org → Core
QA Contact: release → build-config
Version: other → Trunk
Comment 6•14 years ago
Graydon, you were unusually optimistic in filing this bug. I've been working on this in my idle time with Jan Hubicka for the past few weeks, and he got gcc trunk to link and start up :)
Assignee: nobody → tglek
Updated•14 years ago
Comment 7•14 years ago
Finally got some talos numbers for gcc trunk with and without LTO. LTO is a 1% win on sunspider and dromaeo_css, and a 1% regression on tp_dist. Note these are very preliminary: I'm compiling with -O1 to start with (-O2 is broken on x86, -Os seems broken in general). C LTO busts on nspr, so I'm not using LTO on C code. I didn't run the full talos yet, just what I felt was most interesting. LTO libxul is 33MB, non-LTO is 31MB. This is on 64bit.
Comment 8•14 years ago
Firefox should now mostly work with GCC LTO. The GCC tracking bug is http://gcc.gnu.org/pr45375 and the paper is at http://arxiv.org/abs/1010.2196. The most promising configuration seems to be a build with LTO at -O3 --param inline-unit-growth=5, which is 28.2MB; the non-LTO -Os build is 28.2MB, too. Performance of the -O3 build is about 1% better with LTO according to Taras's benchmark.
Comment 9•14 years ago
OOPS, tracking bug URL is http://gcc.gnu.org/PR45375
It seems there have been many LTO improvements to GCC in recent years. We should look into this again.
Comment 11•9 years ago
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #10)
> It seems there have been many LTO improvements to GCC in recent years. We should look into this again.

When I tried with 5.1, enabling LTO regressed talos.
(In reply to Mike Hommey [:glandium] from comment #11)
> (In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #10)
> > It seems there have been many LTO improvements to GCC in recent years. We should look into this again.
>
> When I tried with 5.1, enabling LTO regressed talos.

Bummer. Was libxul any smaller?
Comment 13•9 years ago
I didn't look. Note it was desktop, not mobile. And it was PGO+LTO that was slower than PGO alone.
Comment 14•9 years ago
I'd like to share my results with GCC 4.9.3 and Firefox 39.0, benchmarked with Peacekeeper.

No PGO, no LTO: ~4400 points
LTO only: ~4600 points
PGO only: ~5000 points, xul 68MB
LTO + PGO: ~5500 points, xul 64MB

LTO caused a few crashes; backtraces showed that they all had a common cause. Compiler options include 64-bit, -O2 and -march=native.
Comment 15•9 years ago
-march=native can't be used on Mozilla's builds.
Comment 16•9 years ago
IMHO, -march=native gives a constant speedup, so the results should be the same minus some constant value.
Comment 20•6 years ago
I got a successful opt build of LTO, with Talos runs, here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=12ce14a5bcac9975b41a1f901bfc3a8dcb2d791b&selectedJob=165424387 I attached the three patches I used to make that happen. I'm trying to get a PGO run of it for performance comparisons.
Updated•6 years ago
Product: Core → Firefox Build System
Comment 25•6 years ago
I got a successful LTO build; it only needed a gcc patch to succeed. Everything needed is in the attached four patches, although these are illustrative patches, not the actual changes we would apply if we wanted to pursue this. Perfherder shows a near-universal 3-9% performance win, except for ARES6. I wonder if that test is correctly configured for up/down gain/loss. (Adding Joel just in case.) https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=7e5bd52e36fcc1703ced01fe87e831a716677295&framework=2&showOnlyImportant=1&selectedTimeRange=172800
Flags: needinfo?(jmaher)
Comment 26•6 years ago
Wrong link? This only shows 3 results. One of which is a > 1000% increase in warnings.
Comment 27•6 years ago
(In reply to Mike Hommey [:glandium] from comment #26)
> Wrong link? This only shows 3 results. One of which is a > 1000% increase in warnings.

Right link, bad options. That's the build metrics showing only important results. (Build metrics are not the normal view, but are accessible from the dropdown.) Here are the non-filtered performance metrics: https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=7e5bd52e36fcc1703ced01fe87e831a716677295&framework=1&selectedTimeRange=172800

The warning increase comes from two things:
- LTO outputs One Definition Rule and LTO Type Mismatch warnings, which we didn't have before and which are numerous. They may indicate issues, not sure.
- I turned on final suggestions (for Bug 1332680); those are suggestions-in-the-form-of-warnings.
Comment 28•6 years ago
I am happy we are making progress on this! The benchmark results look quite good. There are some incremental things we could do on top of that. For example, for PGO builds it would be nice to drop the difference between -Os and -Ofast/-O3. That difference prevents cross-module inlining of comdats, and the compiler already optimizes for size all parts that are not executed in the train run.

I am trying to benchmark with talos locally and have issues with the runs sometimes producing results and sometimes not. Anything I could look into?

Concerning ODR warnings: I looked into them briefly, and those I analyzed are real issues (gcc might report some false positives, and I would like to know about them). The warnings are not the easiest to analyze, even though I tried to make them informative. The ODR mismatches often happen because a named class uses some ifdef or type that is different in different units.
Comment 29•6 years ago
Also note that adding -flto=9 will make the LTO link step always use 9 processes for the final compilation stage. It would be better to use -flto=jobserver and then add "+" to each Makefile rule that executes linking. This allows the GCC sub-processes to be controlled by the toplevel make according to its -j option. The "+" is necessary to tell GNU make to pass down the pipe needed to contact the jobserver.
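A minimal sketch of what such a link rule could look like in plain GNU make. The target, object list, and flags other than -flto=jobserver are hypothetical and are not taken from the Mozilla build system:

```make
# Sketch only: target and file names are hypothetical.
CXXFLAGS += -O2 -flto -fPIC
OBJS     := a.o b.o

# The leading "+" makes GNU make hand its jobserver to this command,
# so gcc's LTO sub-processes share the job slots granted by "make -jN"
# instead of using a fixed count such as -flto=9.
libexample.so: $(OBJS)
	+$(CXX) -shared -flto=jobserver -o $@ $(OBJS)

%.o: %.cpp
	$(CXX) $(CXXFLAGS) -c -o $@ $<
```

With a rule like this, running "make -j8" would let the link-time compilation stage use whatever job slots the toplevel make has free, rather than a number hard-coded in the flags.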
Comment 30•6 years ago
Thanks for the mention about ARES6. That is in fact an improvement, and I have a patch already filed to handle the reverse direction: https://bugzilla.mozilla.org/show_bug.cgi?id=1443239
Flags: needinfo?(jmaher)
Comment 31•6 years ago
Thanks for the patches, I'm going to enable PGO+LTO for Fedora Firefox builds if it's feasible.
Comment 32•6 years ago
Unfortunately it fails soon at cargo-linker:

"/home/komat/tmp676-trunk-gtk3/src2/build/cargo-linker" "-Wl,--as-needed" "-Wl,-z,noexecstack" "-m64" "-L" ...
= note: /home/komat/tmp676-trunk-gtk3/src2/objdir-optimized/toolkit/library/release/deps/liblibloading-48cd981c731eb1bf.rlib(libloading-48cd981c731eb1bf.libloading0.rcgu.o): In function `core::ptr::drop_in_place':
libloading0-dae1b3bfe92793ed548dd7814337f5a0.rs:(.text._ZN4core3ptr13drop_in_place17h3cf1af1fed787f33E+0x1): undefined reference to `rust_libloading_dlerror_mutex_unlock'
/home/komat/tmp676-trunk-gtk3/src2/objdir-optimized/toolkit/library/release/deps/liblibloading-48cd981c731eb1bf.rlib(libloading-48cd981c731eb1bf.libloading0.rcgu.o): In function `libloading::os::unix::DlerrorMutexGuard::new':
libloading0-dae1b3bfe92793ed548dd7814337f5a0.rs:(.text._ZN10libloading2os4unix17DlerrorMutexGuard3new17h25f0dea1ba8750daE+0x1): undefined reference to `rust_libloading_dlerror_mutex_lock'
/home/komat/tmp676-trunk-gtk3/src2/objdir-optimized/toolkit/library/release/deps/liblibloading-48cd981c731eb1bf.rlib(libloading-48cd981c731eb1bf.libloading0.rcgu.o): In function `<libloading::os::unix::DlerrorMutexGuard as core::ops::drop::Drop>::drop':
libloading0-dae1b3bfe92793ed548dd7814337f5a0.rs:(.text._ZN81_$LT$libloading..os..unix..DlerrorMutexGuard$u20$as$u20$core..ops..drop..Drop$GT$4drop17h34dc679f28968fa3E+0x1): undefined reference to `rust_libloading_dlerror_mutex_unlock'
/home/komat/tmp676-trunk-gtk3/src2/objdir-optimized/toolkit/library/release/deps/liblibloading-48cd981c731eb1bf.rlib(libloading-48cd981c731eb1bf.libloading0.rcgu.o): In function `<libloading::os::unix::Library as core::ops::drop::Drop>::drop':
libloading0-dae1b3bfe92793ed548dd7814337f5a0.rs:(.text._ZN71_$LT$libloading..os..unix..Library$u20$as$u20$core..ops..drop..Drop$GT$4drop17h4d62911e45826704E+0x9): undefined reference to `rust_libloading_dlerror_mutex_lock'
libloading0-dae1b3bfe92793ed548dd7814337f5a0.rs:(.text._ZN71_$LT$libloading..os..unix..Library$u20$as$u20$core..ops..drop..Drop$GT$4drop17h4d62911e45826704E+0xcb): undefined reference to `rust_libloading_dlerror_mutex_unlock'
collect2: error: ld returned 1 exit status
Comment 33•6 years ago
Hello, the problem here is that the cargo linker is not enabling the gcc LTO plugin. Either it needs to be called through the gcc wrapper, or the proper plugin parameter needs to be passed to the linker, or ./toolkit/library/release/build/libloading-d78baa5b18daaadf/out/src/os/unix/global_static.o needs to be built without LTO. I think the last is the easiest to arrange, but I got a bit lost in the build machinery. Honza
Comment 34•6 years ago
Hi, as Martin Liska pointed out, one should no longer add -flto=64 to cflags but should instead use ac_add_options --enable-lto. I just built yesterday's checkout of firefox git using gcc 8.2 with no problems. Honza
Updated•6 years ago
Assignee: taras.mozilla → nobody
Updated•2 years ago
Severity: normal → S3