Consider doing PGO on rust code
Categories
(Firefox Build System :: General, enhancement)
Tracking
(firefox70 fixed)
| Tracking | Status | |
|---|---|---|
| firefox70 | --- | fixed |
People
(Reporter: emilio, Assigned: mwoerister)
References
(Blocks 4 open bugs)
Details
(Whiteboard: [stylo:p3])
Attachments
(1 file)
Comment 1•7 years ago
|
||
| Reporter | ||
Comment 2•7 years ago
|
||
Comment 3•7 years ago
|
||
| Reporter | ||
Comment 4•7 years ago
|
||
| Reporter | ||
Comment 5•7 years ago
|
||
Comment 6•7 years ago
|
||
Comment 7•7 years ago
|
||
| Reporter | ||
Comment 8•7 years ago
|
||
| Reporter | ||
Comment 9•7 years ago
|
||
Comment 10•7 years ago
|
||
| Reporter | ||
Comment 11•7 years ago
|
||
Updated•7 years ago
|
| Reporter | ||
Comment 12•7 years ago
|
||
Updated•7 years ago
|
| Reporter | ||
Comment 13•7 years ago
|
||
Comment 14•7 years ago
|
||
| Reporter | ||
Comment 15•7 years ago
|
||
Comment 16•7 years ago
|
||
Comment 17•7 years ago
|
||
| Reporter | ||
Comment 18•7 years ago
|
||
Comment 19•7 years ago
|
||
Comment 20•7 years ago
|
||
| Assignee | ||
Comment 22•7 years ago
|
||
Comment 23•7 years ago
|
||
Comment 24•7 years ago
|
||
| Reporter | ||
Comment 25•7 years ago
|
||
| Reporter | ||
Comment 26•7 years ago
|
||
| Reporter | ||
Comment 27•7 years ago
|
||
Comment 28•7 years ago
|
||
Comment 29•7 years ago
|
||
Comment 30•7 years ago
|
||
| Assignee | ||
Comment 31•6 years ago
|
||
From what I've been able to gather over the past few days it looks like the PGO method that FF currently uses (front-end based PGO) and the method that rustc supports (IR-level PGO) might not be compatible. They use the same runtime infrastructure from clang_rt.profile.lib but there are subtle differences and Clang's documentation seems to imply that either one or the other should be used.
I think it would be a good idea to test whether we can switch to IR-level PGO for C++ code too, that is, use -fprofile-generate instead of -fprofile-instr-generate. Supporting front-end based instrumentation in rustc would potentially be a lot of work and in theory IR-level instrumentation should be more flexible anyway.
(I'm also a bit surprised that the current setup in FF works at all because, from what I've seen so far, it seems like profiling data in default.profraw might be overwritten multiple times instead of being appended to).
So it would be good to benchmark the following combinations against each other:
- The current setup with
-fprofile-instr-generate. - An adapted version with
-fprofile-instr-generate="default-%m.profraw"with data merging explicitly enabled. - A version with
-fprofile-generate(i.e. IR-level profiling) - A version with
-fprofile-generateand-mllvm -disable-preinline(because Emilio found the pre-inlining pass to be detrimental sometimes)
Some background:
- LLVM's profiling instrumentation works via adding certain instrinsics to LLVM IR
- Both front-end and IR-level instrumentation use the same instrinsics and runtime but for the former Clang emits the instrinsics while for the latter they are emitted by an internal LLVM pass during optimization
- IR-level instrumentation can lead to instrumented binaries that run with less overhead because it can run an inlining pass before adding the intrinsics (see http://lists.llvm.org/pipermail/llvm-dev/2015-August/089044.html)
Comment 32•6 years ago
|
||
(In reply to Michael Woerister from comment #31)
I think it would be a good idea to test whether we can switch to IR-level
PGO for C++ code too, that is, use-fprofile-generateinstead of
-fprofile-instr-generate.
Last I tried, Windows clang-cl only supports the "-instr" variety. I haven't looked closely enough to see whether this is just a lack of flag plumbing, or something deeper.
Perhaps you could start out trying your experiments on Linux, to see whether switching to -fprofile-gen is an improvement there. If so, we can cross the Windows bridge at that point.
Another thing to consider is that the recently-landed context-sensitive PGO (aka "5 tier") only supports -fprofile-gen currently. Again I haven't looked closely enough to see whether it's just flag plumbing or something more architectural. But if you can switch us over to its profiling style, all the better. :-)
| Assignee | ||
Comment 33•6 years ago
|
||
Last I tried, Windows clang-cl only supports the "-instr" variety.
Looks like your right :)
https://treeherder.mozilla.org/#/jobs?repo=try&revision=471e6757d94adebc9817b101a04682f4c368d1ae&selectedJob=234137475
I haven't looked closely enough to see whether this is just a lack of flag plumbing
I suspect it is. But I've been wrong before.
Another thing to consider is that the recently-landed context-sensitive PGO (aka "5 tier")
Oh that looks interesting. I noticed the -fcs-profile-generate flag showing up in the documentation recently.
Comment 34•6 years ago
|
||
I just remembered that https://reviews.llvm.org/D53457 exists, and clang-cl seems to work with -clang:-fprofile-generate!
I tried it on a very simple program and I got a file named "default_[huge-number]_0.profraw" which suggests that we are indeed taking the IR-level profiling paths.
Comment 35•6 years ago
|
||
Well that didn't last long. clang-cl crashed on process_thread_interception.cc during the instrumentation phase, likely due to that file's use of SEH. I'll attempt to reduce it...
| Assignee | ||
Comment 36•6 years ago
|
||
I just remembered that https://reviews.llvm.org/D53457 exists, and clang-cl seems to work with -clang:-fprofile-generate!
Ah, that's good so we are not blocked an that.
Well that didn't last long. clang-cl crashed on process_thread_interception.cc during the instrumentation phase, likely due to that file's use of SEH. I'll attempt to reduce it...
It looks like FF is being built with -C panic=abort, so this shouldn't be a problem for the Rust part of the code at least.
Comment 37•6 years ago
|
||
Except libstd is not built with -C panic=abort...
Comment 38•6 years ago
|
||
For code coverage, we worked around the problem by disabling instrumentation in functions using SEH.
| Assignee | ||
Comment 39•6 years ago
|
||
Except libstd is not built with -C panic=abort...
I wonder if the fat-LTO step we are already doing helps here: https://github.com/rust-lang/rust/blob/f47ec2ad5b6887b3d400aee49e2294bd27733d18/src/rustllvm/PassWrapper.cpp#L709
This pass is run after libstd has been merged in.
| Assignee | ||
Comment 40•6 years ago
|
||
Hm, it seems that the maybe_clobber_profiledbuild step does not delete Rust artefacts. And surprisingly cargo doesn't pick up on the changed RUSTFLAGS either.
Comment 41•6 years ago
|
||
I filed https://bugs.llvm.org/show_bug.cgi?id=41279 for the SEH crash, but it turns out that it doesn't really matter to us. We need to disable PGO for the sandbox interceptors anyway, because they can be called before the child process has initialized its import table, so the instrumentation code crashes.
Comment 42•6 years ago
|
||
I tried reproducing Michael's results (with an older base tree, so maybe an older Rust?) and got:
error: /builds/worker/workspace/build/src/obj-firefox/dom/serviceworkers/test/gtest/Unified_cpp_test_gtest0.cpp: Function control flow change detected (hash mismatch) main [-Werror,-Wbackend-plugin]
on Linux builds. I mean, adding PGO instrumentation to our gtest code seems pretty pointless, but I don't know why the error appears on my builds and not Michael's.
| Assignee | ||
Comment 43•6 years ago
|
||
Here's a successful Linux64 build for reference: https://treeherder.mozilla.org/#/jobs?repo=try&revision=9071c3379f95e2087ae500b1f964cfff96239460
Comment 44•6 years ago
|
||
The good news is that disabling pgo for some sandbox files and using the /clang trick dmajor mentioned in comment 34 works to get us through the profile-generate phase. The bad news is that Windows builds now fall over with a similar error as comment 42:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=a546ec8b70d13331c4720dbfa0a83d4cf95e6c3e
error: z:/task_1554908694/build/src/obj-firefox/xpcom/base/Unified_cpp_xpcom_base1.cpp: Function control flow change detected (hash mismatch) printf_stderr [-Werror,-Wbackend-plugin]
That was based off of m-c as of an hour or so ago, so we should be using the latest clang 8. No results yet from the Linux builds.
I am not sure what is going on. Maybe we need to disable -Wbackend-plugin on pgo builds, just like we already disable profile-instr-*?
Comment 45•6 years ago
|
||
OK, so adding -Wno-error=backend-plugin to PGO use flags seems to have silenced the errors. It's still not clear to me why I was seeing these and Michael was not; maybe will have to dig into compiler versions and/or exact compiler patches.
Removing bits about setting MOZ_PROFILE_ORDER_FILE gives us a successful Windows build, too! So that takes care of knowing that we can switch to IR-level instrumentation; now to integrating all the necessary Rust bits on all platforms.
| Assignee | ||
Comment 46•6 years ago
|
||
This might also be interesting: The following build was a test for Windows (which failed as expected) and macOS (which succeeded):
https://treeherder.mozilla.org/#/jobs?repo=try&revision=086c9004d2d86a549579c10c3b4978f84d5511bf
The interesting point about it is that it used Rust 1.34-beta instead of nightly, so that already is compatible with Clang 8.
| Assignee | ||
Comment 47•6 years ago
|
||
Huh, I just realized that the macOS build should have failed because Rust beta doesn't accept unstable flags like -Zpgo-gen. But it looks like there is no PGO done at all for shippable macOS builds. Is that correct?
| Reporter | ||
Updated•6 years ago
|
Comment 48•6 years ago
|
||
(In reply to Michael Woerister from comment #47)
Huh, I just realized that the macOS build should have failed because Rust beta doesn't accept unstable flags like
-Zpgo-gen. But it looks like there is no PGO done at all for shippable macOS builds. Is that correct?
That's correct, we're working on macOS PGO support in bug 1528374 -- it needs to be 3-stage due to cross-compiling on Linux, so it's a bit more involved.
Do you know if there are plans to stabilize -Zpgo-gen? I found a rust-lang RFC but it hasn't been updated in a while.
| Assignee | ||
Comment 49•6 years ago
|
||
I just realized that I never posted the link to the PGO stabilization tracking issue, so here goes: https://github.com/rust-lang/rust/issues/59913
Comment 50•6 years ago
|
||
Current Status
All of our identified blockers are now fixed. I've done try runs (linux, win) using nightly rust with somewhat unimpressive results (ignore the mac numbers, we don't pgo mac), which leads me to believe we're not actually consuming the rust profile data properly.
Next Steps
Firefox
- Verify that rust profiling data is generated
- Verify that rust profiling data is consumed
- Put together proper patches to enable Rust PGO (example of what I tried)
Rust
- PGO needs to be stabilized so that we can use the PGO flag (our policy doesn't allow us to use nightly rust builds right now). The compiler team has approved the change and it's just pending landing.
- An overly aggressive compiler error needs to be fixed
- We need to patch various libraries that now
Werroron Rust 1.37 due to new warnings.
Release Planning
We currently have a policy that requires that end-user builds of Firefox use release builds of Rust -- we're allowed to use beta rust on one platform. Additionally Rust has a requirement that unstable flags (pgo in this case) can only be used in Nightly Rust. This means that we have to wait for a stabilized PGO flag to hit Rust Beta before we can test this on our Nightly users.
Rust 1.37 will hit beta on July 4th, 2019, release on August 15th. Our next Firefox 69 will move to beta on July 8th, release on September 2nd. Firefox 70 hits beta on September 2nd, release on October 21st. My proposal is:
- July 8th - Update our Windows nighly builds to use Rust 1.37 beta. If we haven't had a chance to fix all new Rust warnings we can disable warnings as errors for shippable windows builds as a temporary workaround.
- August 15th - Update all builds to use Rust 1.37 release.
We can either let this ride the trains on 70 or push for an uplift to 69 beta.
Comment 51•6 years ago
|
||
which leads me to believe we're not actually consuming the rust profile data properly.
FWIW, the profile data does contain information for rust code, and the profile-use build does pass the right flag with the right path to the merged profile.
Comment 52•6 years ago
|
||
which leads me to believe we're not actually consuming the rust profile data properly.
A couple ideas to check that - Disassemble some panic codepaths before and after, since PGO ought to have high confidence about them being cold; or do a SymbolSort diff to look for dramatic changes in function size due to different inlining decisions.
Comment 53•6 years ago
|
||
A couple ideas to check that - Disassemble some panic codepaths before and
after, since PGO ought to have high confidence about them being cold; or do
a SymbolSort diff to look for dramatic changes in function size due to
different inlining decisions.
I didn't do this fully rigorously, but informally: in a before-and-after view of your try pushes, there is a lot of shuffling-about of Rust code, and servo glue functions did generally get bigger, for example by inlining xul!alloc::string::{{impl}}::to_string<nsstring::nsAString> in several places.
It seems safe to say that your flags weren't just ignored (which agrees with glandium's Comment 51), though I couldn't say whether or not PGO is really achieving its full potential.
Come to think of it, is any Rust code besides stylo even exercised in our PGO training?
Comment 54•6 years ago
|
||
(In reply to David Major [:dmajor] from comment #53)
A couple ideas to check that - Disassemble some panic codepaths before and
after, since PGO ought to have high confidence about them being cold; or do
a SymbolSort diff to look for dramatic changes in function size due to
different inlining decisions.I didn't do this fully rigorously, but informally: in a before-and-after view of your try pushes, there is a lot of shuffling-about of Rust code, and servo glue functions did generally get bigger, for example by inlining
xul!alloc::string::{{impl}}::to_string<nsstring::nsAString>in several places.
Thanks for checking, this is really helpful!
It seems safe to say that your flags weren't just ignored (which agrees with glandium's Comment 51), though I couldn't say whether or not PGO is really achieving its full potential.
I feels like we need to get someone from the rust compiler team involved unless you have other thoughts on how to verify rust PGO is behaving as expected.
Come to think of it, is any Rust code besides stylo even exercised in our PGO training?
I filed bug 1543852, bug 1543853, and bug 1543854 to increase our training set. We could possibly do a local profiling run that ad hoc loads some of those to see if we get better numbers if we just want to see if they help before doing the full integration with automation.
Comment 55•6 years ago
|
||
This change is similar to what we do for Rust LTO, except that we
don't have a MOZ_PGO=cross or similar setting. We assume that if you
want to do PGO, you also want to PGO the Rust code.
Comment 56•6 years ago
|
||
Nathan, is it possible to do a linux-only version of your patch that we could land alongside updating our linux builds to Rust 1.37 beta?
Comment 57•6 years ago
|
||
(In reply to Eric Rahm [:erahm] from comment #56)
Nathan, is it possible to do a linux-only version of your patch that we could land alongside updating our linux builds to Rust 1.37 beta?
Sure, we can limit the addition of the flags to Linux platforms only. Would that be more desirable for the time being?
Comment 58•6 years ago
|
||
(In reply to Nathan Froyd [:froydnj] from comment #57)
(In reply to Eric Rahm [:erahm] from comment #56)
Nathan, is it possible to do a linux-only version of your patch that we could land alongside updating our linux builds to Rust 1.37 beta?
Sure, we can limit the addition of the flags to Linux platforms only. Would that be more desirable for the time being?
Yeah, perhaps as another blocker to this bug dependent on bug 1564638. I'd like to get rust PGO enabled and tested on at least one platform, once we get Linux builds on the 1.37 beta we'll be able to use the stabilized PGO flag, but passing that flag to other platforms would break.
Updated•6 years ago
|
Comment 59•6 years ago
|
||
Comment 60•6 years ago
|
||
| bugherder | ||
Description
•