Closed Bug 361343 Opened 13 years ago Closed 12 years ago

build config for profile-guided optimization on Windows

Categories

(Firefox Build System :: General, enhancement)

x86
Windows XP
enhancement
Not set

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jruderman, Assigned: ted)

References

Details

(Keywords: perf)

Attachments

(2 files, 2 obsolete files)

If the IE team got an 8% perf improvement from this, it's worth trying, IMO :)

<Jesse_> preed: does the firefox build process use profile-guided optimization
<Jesse_> preed: according to http://weblogs.mozillazine.org/doron/archives/2006/10/ie_7_benchmarking.html the IE team got an 8% perf improvement from using it
<biesi> didn't it use to use it at one point?
<biesi> see the profiledbuild target
<biesi> though, that wouldn't be on windows
<preed> Jesse_: I do not believe so, no
<Jesse_> know if there's a bug on doing it for windows?
<Jesse_> or if anyone has tried?
<preed> nope
<preed> I'm assuming that profile-driven optimization requires running the build at some point
<preed> and then tweaking things?
<bz> sorta
<bz> At least for gcc, it involves compiling with some extra options, running the resulting builds on your test set and saving the data generated by that, and then feeding the data set to the compiler with some more options
<bz> Then the compiler more aggressively optimizes whatever things came up a lot in the test set
<bz> So you need a good test set.  ;)
<biesi> Tp?
<shebs_> profile-driven opts is potentially dangerous for an app that does lots of different things and/or is unpredictable
<dbaron> it's mainly just better guesses for branch prediction, IIRC
<shebs_> do we really know which side of a conditional reflow is used most often?
<dbaron> what we really care about are the cases where we take one branch 99%+ of the time, and most simple tests that hit a given branch at all will probably get that right
<dbaron> hopefully the compiler is smart about loops, though
<shebs_> the profile-guided opts don't really let you choose which to do and which to ignore
<shebs_> so unrepresentative test set could be misleading
<shebs_> for instance, does Tp exercise window resizing? Maybe different reflow paths than for first layout
<shebs_> compiler is already smart about loops actually - probably the safest use of a prediction bit
<shebs_> you could validate a test set by instrumenting a browser, using it in daily life for a week, then compare accumulated results to test set results
<shebs_> dunno if anybody goes to that much trouble :)
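
For reference, the gcc cycle bz describes above (build with extra options, run on a test set, feed the data back) can be sketched roughly as follows. This is a minimal illustration, not the Mozilla build logic: `-fprofile-generate` and `-fprofile-use` are the standard gcc options, but the source file, binary name, and test-set argument are hypothetical placeholders.

```python
# Sketch of the instrument / train / optimize cycle described for gcc.
# Source, binary, and test-set names are hypothetical placeholders.

def gcc_pgo_commands(sources, output, training_args):
    """Return the three command lines of a gcc PGO cycle, in order."""
    base = ["gcc", "-O2"]
    # Pass 1: build an instrumented binary that records execution counts.
    instrument = base + ["-fprofile-generate"] + sources + ["-o", output]
    # Pass 2: run the instrumented binary on the test set to save profile data.
    train = ["./" + output] + training_args
    # Pass 3: rebuild, letting the compiler read the saved profile data.
    optimize = base + ["-fprofile-use"] + sources + ["-o", output]
    return [instrument, train, optimize]


if __name__ == "__main__":
    for command in gcc_pgo_commands(["app.c"], "app", ["testset.txt"]):
        print(" ".join(command))
```

The quality of the result depends entirely on pass 2, which is the "good test set" concern raised in the log.
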
I've been building PGO builds for quite some time. Two of the unofficial builders
in Japan have been building them for a few months too. I'd say that 8% is about
right.

A few points:

- You need VS2005. VS2003 is the platform for 1.5 and 2.0.
- You need to build instrumented images, run the instrumented images against a bunch of webpages, and then build optimized images from the statistics gathered in the instrumentation. I typically go to about 200 web pages doing various things on various sites. Play around with the menus, sidebars, etc. This part is fairly time-consuming as I do it manually.
- I run a normal build and then run a script which relinks about five to seven images for instrumentation. I then run the resulting browser and support images, run another script which creates the final images, and then delete the instrumentation files as they can get quite large.
- The manifest files have to be played around with or else the images won't run on certain Windows operating systems.
- If I instrument mail.google.com, a cursor bug arises. I think there's an existing bug behind the cursor problem, but you don't see it in builds without the level of optimization that PGO uses. So I have to take some care in the instrumentation phase. It would be nice to fix the bug, but my guess is that debugging it would be fairly hard: it only shows up with PGO, so you probably wouldn't see it in a debug image unless the debug approach was printf statements.

------------------------------

One issue with going with VS2005 is that it breaks some extensions that have
DLLs (or maybe EXEs) built with VS2003. I haven't looked into this very much
but there are users of VS2005 builds that complain of this. Contact with the
extension writers usually amounts to a polite request to pound sand. I guess
the extension writers have enough to do until Mozilla standardizes on VS2005.
Depends on: vs2005
Trunk is already built with VC2005, so that's certainly not an issue.  Extension authors will just have to get with the program when Firefox 3 is released.
Flags: blocking1.9?
The only thing that worries me about this bug is the actual profile collection part.  Actually building with the profiling flags and applying the profile afterwards doesn't sound really hard, but collecting the profile data is a question mark to me.

(In reply to comment #4)
> The only thing that worries me about this bug is the actual profile collection
> part.  Actually building with the profiling flags and applying the profile
> afterwards doesn't sound really hard, but collecting the profile data is a
> question mark to me.
> 

In terms of the work involved, or the risk that the profile is not representative of actual usage?
Both really.  Ideally you'd just record someone interacting with the browser in common ways and visiting top sites, I suppose.  I don't really know if it's possible to prove that we made the experience better without very specific benchmarks.  If we had a way to automate that kind of interaction and gather perf benchmarks at the same time, that might be feasible, but I haven't heard of anybody doing anything like that, so that worries me.
(In reply to comment #6)
> Both really.  Ideally you'd just record someone interacting with the browser in
> common ways and visiting top sites, I suppose.  I don't really know if it's
> possible to prove that we made the experience better without very specific
> benchmarks.  If we had a way to automate that kind of interaction and gather
> perf benchmarks at the same time, that might be feasible, but I haven't heard
> of anybody doing anything like that, so that worries me.
> 

Why don't we try something like comment #1, check the changes in, and see how they affect the Talos numbers and the nightly testing experience?
Ah, ok, I misread some of the PGO docs.  Just to note:
"Updating source code. It's important to note that if the source code of the compiled application changed after the .PGD files were generated, then /LTCG:PGO will revert to simply doing an /LTCG build, and not use any of the profile information. So what do you do if you've spent considerable generating profile from your instrumented code, and then realize that you need to make a small change to the code, but would like to reuse the profiles that you've generated? In this case you can specify /LTCG:PGUPDATE (or /LTCG:PGU). PGUPDATE allows the linker to compile modified source code, while using the original .PGD file"
From http://msdn2.microsoft.com/en-us/library/aa289170(VS.71).aspx

So it looks like we could:
1) Do a build with PGO on
2) Have one or more people browse with it to get a representative sample
3) Take their pgc outputs and merge into a PGD
4) Check that in somewhere, and tell the nightly builds to use it.

That is a tractable set of steps.  :)
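
A rough sketch of how those steps map onto MSVC tooling for a single module. It only assembles command lines and runs nothing. The `-LTCG:PGINSTRUMENT`, `-LTCG:PGOPTIMIZE`, and `-LTCG:PGUPDATE` linker modes and `pgomgr /merge` are the documented MSVC tools; the module, object, and `.pgc` file names are hypothetical, and the objects are assumed to have been compiled with `-GL` already.

```python
# Sketch of the MSVC link steps for one module. Module and file names are
# hypothetical; object files are assumed to have been compiled with -GL.

def msvc_pgo_steps(module, objects, pgc_files, source_changed=False):
    """Return the link and merge command lines for one PGO cycle."""
    pgd = module + ".pgd"
    link = ["link", "-DLL", "-OUT:" + module + ".dll", "-PGD:" + pgd]
    # Step 1: link an instrumented image; running it produces .pgc files.
    instrument = link + ["-LTCG:PGINSTRUMENT"] + objects
    # Step 3: merge the collected .pgc files into the .pgd database.
    merge = ["pgomgr", "/merge"] + pgc_files + [pgd]
    # Step 4: relink using the profile. PGUPDATE tolerates source changes
    # made after the profile was collected; PGOPTIMIZE does not.
    mode = "-LTCG:PGUPDATE" if source_changed else "-LTCG:PGOPTIMIZE"
    optimize = link + [mode] + objects
    return {"instrument": instrument, "merge": merge, "optimize": optimize}


if __name__ == "__main__":
    steps = msvc_pgo_steps("xul", ["a.obj", "b.obj"], ["xul!1.pgc"],
                           source_changed=True)
    for name in ("instrument", "merge", "optimize"):
        print(name + ": " + " ".join(steps[name]))
```

Checking in a `.pgd` and reusing it for nightlies, as proposed in step 4, would correspond to the `source_changed=True` path, since the tree keeps moving after the profile is recorded.
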

Flags: blocking1.9? → wanted1.9+
(In reply to comment #8)
> So it looks like we could:
> 1) Do a build with PGO on
> 2) Have one or more people browse with it to get a representative sample
> 3) Take their pgc outputs and merge into a PGD
> 4) Check that in somewhere, and tell the nightly builds to use it.
> That is a tractable set of steps.  :)

I've been doing PGO builds for two years and it typically takes me 30 to 90 minutes to do the instrumentation by hand. The level of instrumentation depends on how much time I have available at the moment. If nothing else, just running it once should improve Firefox startup time.

It would be nice if mozilla.org had a testing tool that simulates clicks on a screen for Windows. I was trained on such a package back in the 1990s but I don't remember the name. It might be useful to run Firefox from a command line with the output going to a window for a URL followed by a terminate message to the application.

One other note about PGO. PGO tends to exacerbate existing bugs.

If I run google mail in the instrumentation phase, I get a build with crashes now and then on random websites. So of course the solution is to not instrument with google mail.

A few other things:

- PGO should be used on the Javascript DLL and several other DLLs.
- The process should be that the nightly instrumentation build runs along with a regular build, people grab the build, run it on their machines, and then send back the pgc files and then an optimization build is done.
- Someone should check to see whether any PGO DLLs need to be redistributed for the instrumentation.
- I thought that mozilla had a bunch of automated regression tests. Couldn't those be used for instrumentation?
One additional comment: I've seen comments that Mozilla doesn't use -O2 and -GL because of code size issues. PGO generally makes the image larger, as one of its means of optimization is inlining functions for better locality of code. Modules need to be compiled with -GL to be considered for PGO.
Attachment #295995 - Attachment is patch: true
Attachment #295995 - Attachment mime type: application/octet-stream → text/plain
Attachment #295995 - Flags: review?(ted.mielczarek)
Comment on attachment 295995
Try WPO on windows

This is a prerequisite for PGO, so let's try building with it.
Comment on attachment 295995
Try WPO on windows

+export LD_FLAGS="-LTCG"

I think you mean LDFLAGS.

r=me
Attachment #295995 - Flags: review?(ted.mielczarek) → review+
Attachment #295995 - Flags: approval1.9?
Comment on attachment 295995
Try WPO on windows

a+ schrep - yea want this.
Attachment #295995 - Flags: approval1.9? → approval1.9+
Depends on: 411369
Depends on: 412888
Could you move this into the default CFLAGS in configure now that we know we're keeping it?  I'd like to keep the default config as close to our official config as possible.  Also we just screwed build because they didn't update the release mozconfig to match.  :-(
Assignee: nobody → sayrer
Build config patch coming up to allow profiledbuild to work.
Assignee: sayrer → ted.mielczarek
Getting screwed by bug 416571 right now.
Depends on: 416571
Attached patch: win32 profiledbuild fixes (obsolete)
This makes profiledbuild almost work on Win32/MSVC.  Still getting screwed by the previously mentioned bug, but I think aside from that it's good to go.
With the patch from bug 416571, I'm a little closer, but I still fail linking xul.dll:

fatal error C1307: program has been edited since profile data was collected
LINK : fatal error LNK1257: code generation failed
I manually ran the link and substituted -LTCG:PGUPDATE for PGOPTIMIZE, and after chewing through over 30 minutes of CPU time and a gig of RAM, the linker tells me:
3117 of 79313 (  3.93%) profiled functions will be compiled for speed
I think it's just bad build mojo, probably whatever is causing the build bustage from my previous comment is making the linker ignore a good portion of our code in its PGO pass.  It then helpfully warns about every single function that got linked without profiling data, and finally tells me:
79313 of 81683 functions (97.1%) were optimized using profile data
678345931 of 700552252 instructions (96.8%) were optimized using profile data

(FWIW, from previously in the build log, re-linking js3250.dll:
1831 of 1831 (100.00%) profiled functions will be compiled for speed)
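
To compare these summaries across builds, the lines can be pulled out of a build log with a small parser. This is a sketch that assumes the linker output always has one of the two shapes quoted above; the function name and the regular expression are illustrative and not part of any Mozilla tooling.

```python
import re

# Matches both summary shapes quoted above:
#   "3117 of 79313 (  3.93%) profiled functions will be compiled for speed"
#   "79313 of 81683 functions (97.1%) were optimized using profile data"
SUMMARY = re.compile(r"(\d+) of (\d+)(?: \w+)? \(\s*([\d.]+)%\)")


def parse_pgo_summary(line):
    """Return (count, total, reported_percent, computed_percent) or None."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    count, total = int(match.group(1)), int(match.group(2))
    reported = float(match.group(3))
    # Recompute the percentage so truncated or odd log lines stand out.
    computed = 100.0 * count / total if total else 0.0
    return count, total, reported, computed


if __name__ == "__main__":
    log = [
        "3117 of 79313 (  3.93%) profiled functions will be compiled for speed",
        "79313 of 81683 functions (97.1%) were optimized using profile data",
        "LINK : fatal error LNK1257: code generation failed",
    ]
    for line in log:
        print(parse_pgo_summary(line))
```

A low "compiled for speed" share next to a high "optimized using profile data" share, as in the xul.dll numbers here, means the profile covered most functions but marked few of them as hot.
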
The PGO sqlite3.dll makes me crash on startup, rebuilding it without PGO makes it ok.
(In reply to comment #21)
> The PGO sqlite3.dll makes me crash on startup, rebuilding it without PGO makes
> it ok.
> 

http://crash-stats.mozilla.com/?do_query=1&query_search=signature&query_type=contains&query=sqlite3

seeing a recent spike in sqlite crashes on trunk.
Attached patch: better (obsolete)
I'm able to build with PGO completely automated (except for closing the browser in the profiling run) with this patch (+ the patch from bug 416571).  Only downside is that it aggravates a sqlite shutdown crash, so we consistently crash on shutdown.  Also, I updated to today's source and built with --enable-jemalloc, so this build should be comparable to today's nightlies:
http://people.mozilla.com/~tmielczarek/firefox-3.0b4pre.en-US.win32.zip
Attachment #304093 - Attachment is obsolete: true
Depends on: 418502
perhaps someone should ping the sqlite folks about this?  note - a lot of those crashes aren't recent in the link posted in this bug.
(In reply to comment #24)
> perhaps someone should ping the sqlite folks about this?  note - a lot of those
> crashes aren't recent in the link posted in this bug.
> 

Yes - Shawn can you?
(In reply to comment #25)
> Yes - Shawn can you?
It'll be at least four or five days before I'll have the cycles to get the relevant background information, so I was hoping someone with more background information would be willing.
I've had a number of problems with PGO in the past including incomplete images and crashes on startup. What I usually did with these was to do two builds: one with -GL (whole program optimization or WPO) and one without -GL. If the image has multiple libraries, then I'd do builds (instrument + optimize) using combinations of -GL and without -GL until I found the offending libraries and then just build using the libraries without -GL to avoid the problem.

The same process could be used at the module level and I do this for my x64 builds. It's a rather painstaking process to go through. But it might be a way to isolate code bugs as I think that PGO tends to exacerbate bugs.

The 1 GB RAM usage on the optimize/update links is normal. It goes up to about 1.6 GB for x64 builds.
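
The isolation process described in this comment can be sketched as a loop over libraries. This is a simplified illustration that assumes each library misbehaves independently of the others, which the comment does not guarantee. `build_and_run_ok` is a hypothetical callback standing in for a full instrument + optimize build followed by a smoke test.

```python
# Sketch of isolating libraries that break under -GL/PGO.
# `build_and_run_ok` is a hypothetical callback: it receives the set of
# libraries to build with -GL/PGO (all others are built without it) and
# returns True if the resulting product starts and runs correctly.

def find_offending_libraries(libraries, build_and_run_ok):
    """Return the libraries whose PGO build breaks the product."""
    offending = []
    for lib in libraries:
        # Enable PGO for this one library only, so a failure points at it.
        if not build_and_run_ok({lib}):
            offending.append(lib)
    return offending


if __name__ == "__main__":
    # Stand-in for real builds: pretend only sqlite3 misbehaves under PGO.
    def fake_build(pgo_libs):
        return "sqlite3" not in pgo_libs

    libs = ["xul", "js3250", "sqlite3"]
    bad = find_offending_libraries(libs, fake_build)
    print("build without PGO:", bad)
```

Each call costs a complete build cycle, which is why the comment calls the process painstaking. Trying one library at a time keeps the number of builds linear in the number of libraries.
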
If we can't get a handle on the sqlite crash, I can easily disable PGO when building sqlite.

mmoy: yeah, we've gone through some of that pain already since we turned on -GL last month.  We wound up disabling it in a few places (bug 411369).  That reminds me to make sure I don't regress that with this patch!
Ok, I made it possible to do depend builds with this (by forcing relink of DLL/EXE files on both passes) and added a makefile variable that can be used to switch off PGO per-module in case we need that.  My latest build doesn't crash on shutdown, so maybe I got lucky!  (It's at the same URL as above, if you're interested.)
Attachment #304317 - Attachment is obsolete: true
Attachment #304508 - Flags: review?(benjamin)
Attachment #304508 - Flags: review?(benjamin) → review+
Attachment #304508 - Flags: approval1.9?
Comment on attachment 304508
awesomeness [checked in]

a=beltzner for 1.9
Attachment #304508 - Flags: approval1.9? → approval1.9+
Comment on attachment 304508
awesomeness [checked in]

Still needs the fix from bug 416571 to land (and NSPR tag to get bumped to pick it up) to be usable.
Attachment #304508 - Attachment description: awesomeness → awesomeness [checked in]
Depends on: 418772
Blocks: 418865
Filed bug 418865 on actually turning this on on the tinderbox.  This isn't completely fixed (need bug 416571 and bug 418772 landed), but I think this bug has served its purpose.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Summary: Use profile-guided optimization on Windows → build config for profile-guided optimization on Windows
No longer blocks: 418866
Product: Core → Firefox Build System