Closed Bug 1030706 Opened 5 years ago Closed 5 years ago

asm.js compilation 35% slower with Windows PGO builds

Categories

(Core :: JavaScript Engine: JIT, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla33

People

(Reporter: jandem, Assigned: jandem)

References

(Blocks 2 open bugs)

Details

Attachments

(2 files, 1 obsolete file)

I was testing Windows PGO builds today for some PGO bustage, and I decided to compare asm.js compilation time with PGO vs non-PGO.

As bug 977658 explains, PGO builds are usually slower on code that's not exercised during the profile run. Asm.js compilation isn't part of the profile, so I expected it to be slower, and it is.

For www.unrealengine.com/html5 I get the following asm.js compilation times (in ms):

m-c PGO build    : 9928, 10072 
m-c non-PGO build: 7311,  7308

That's at least a 35% difference. A patch that'd magically make compilation 35% faster would be considered amazing, but we can get that on Windows just by disabling PGO.

We could try adding an asm.js demo (or Octane, as it includes zlib) to the PGO profile run or we could disable PGO for jit/AsmJS*.

I *really* wish we'd (re)consider disabling PGO though. Sure, it wins on Sunspider, but is it still worth the perf cliffs (see also bug 977658), weird regressions/crashes, ultra-slow compilation times?
(In reply to Jan de Mooij [:jandem] from comment #0)
> is it still worth the perf cliffs (see also bug 977658),
> weird regressions/crashes, ultra-slow compilation times?

To be clear, I'm talking about Firefox compilation time here. PGO builds take 4-5 hours, non-PGO builds less than 1.5... I'm all for making JS faster, but there's no way we'd r+ a patch with those trade-offs in 2014.

Anyway, we have to fix this one way or another. It'd be interesting to add asm.js compilation to the profile run and see how much that buys us, but it may be simpler to disable PGO for some files.
To wit, only a small fraction (maybe 1/16) of asm.js compilation time is in AsmJS.cpp itself.  Most of the time is in the parser, MIR optimization and, most of all, register allocation.  Thus, disabling PGO in AsmJS* would, I expect, not have an effect.  Adding an asm.js workload to our PGO set, otoh, makes tons of sense.  In particular, I expect it would help general parsing and JIT compilation.

Also, what you're saying about PGO also makes sense, but I assume you're talking about turning off PGO just for SM, right?  (I think it helps general browser workloads a great deal.)  This is a pretty stark example of how dangerous it is to have perf cliffs like these.  I bet there are 10 other hot paths (for certain workloads) that are similarly effected.  Are any other benchmarks affected by turning of PGO?  What is the SS drop?
(In reply to Luke Wagner [:luke] from comment #2)
> To wit, only a small fraction (maybe 1/16) of asm.js compilation time is in
> AsmJS.cpp itself.  Most of the time is in the parser, MIR optimization and,
> most of all, register allocation.  Thus, disabling PGO in AsmJS* would, I
> expect, not have an effect.

Ah, that's true. But pretty weird as well, because browser startup and Sunspider should definitely exercise the more interesting paths in the parser, tokenizer and register allocator... Maybe that's why it's "only" 35%.

> Also, what you're saying about PGO also makes sense, but I assume you're
> talking about turning off PGO just for SM, right?  (I think it helps general
> browser workloads a great deal.) 

AFAIK we mainly started using PGO for SM (Sunspider). Also, I wouldn't be surprised if there are a ton of browser APIs that we don't run during the profile run and hence are considered "cold". Just like the TypedArray.set function becomes 2x slower with PGO (bug 977658). Element.getAttribute, say, should be part of the profile, but there are a lot of other functions...

> This is a pretty stark example of how
> dangerous it is to have perf cliffs like these.  I bet there are 10 other
> hot paths (for certain workloads) that are similarly effected.  Are any
> other benchmarks affected by turning of PGO?  What is the SS drop?

Last time I measured this, SS was about 8-10% and Kraken/Octane 3-5% (IIRC). We should probably measure this again and see on which tests we lose the most; maybe there are other things we can do to make non-PGO faster.
(In reply to Jan de Mooij [:jandem] from comment #3)
> > Also, what you're saying about PGO also makes sense, but I assume you're
> > talking about turning off PGO just for SM, right?  (I think it helps general
> > browser workloads a great deal.) 
> 
> AFAIK we mainly started using PGO for SM (Sunspider).

I think this has been measured a few times (during the whole pgo-fails-during-linking days) and PGO is a pretty big win on general browser things.  I attribute that to bad old crufty COM-taminated code.

> Last time I measured this, SS was about 8-10% and Kraken/Octane 3-5% (IIRC).
> We should probably measure this again and see on which tests we lose the
> most; maybe there are other things we can do to make non-PGO faster.

Yeah, that's a good idea; it's probably just a few natives that SS/Kraken are pounding on (Kraken has a JSON component, iirc), so perhaps just a bit of micro-optimization will fix 'em up.  Also, that will help non-Windows.  If we can turn off PGO in JS w/o any benchmark regressions, it seems like there would be no reason not to, given the high cost.
Hm wow, for some reason Sunspider is also about 20% slower with a PGO build on my machine.

It looks like PGO builds regressed a lot last week or so. Bisecting now...
Regression range for the SS PGO regression:

http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=f78e532e8a10&tochange=bdac18bd6c74

Will see if I can narrow it down with inbound builds...
Here's the inbound regression range for the SS slowdown. Who landed those patches? :(

http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=3087b33b8037&tochange=e589c195f61d

One of my patches made Windows PGO angry. Bug 1029422 likely has the same cause.

Yesterday I was backed out for mysterious PGO orange, today we have this performance regression. Seriously, it's hard to get any work done this way :(
I compared the SS numbers @ the last good inbound revision (3087b33b8037), PGO vs non-PGO:

PGO:        148.3, 150.1, 149.7, 152.4
non-PGO:    152.6, 151.8, 150.4, 152.3

I also downloaded an Aurora PGO build:

Aurora PGO: 149.1, 146.4, 150.5

Kraken:

PGO:     1039.0, 1042.8
non-PGO: 1045.0, 1033.9

I'll also look at Octane, but it tends to be more noisy so we'd likely need more runs.

asm.js compilation time on www.unrealengine.com/html5 (these numbers are lower than the ones in comment 0, it's a different/faster machine):

PGO:     7195, 7202
non-PGO: 5850, 5820 

So even if we fix the (Sunspider) PGO regression introduced last week:

(1) It won't fix the asm.js issue I filed this bug for.
(2) The difference between PGO and non-PGO on the JS benchmarks is super small. I'm pretty sure PGO was measurably faster than non-PGO not that long ago, but something must have changed that...

It'd be great if somebody could double-check these numbers! Unless somebody objects, tomorrow I'll post a patch to disable PGO in js/src.
Started a PGO build with PGO disabled in js/src:

https://tbpl.mozilla.org/?tree=Try&rev=a13f6beefd57

More benchmarking tomorrow.
Pretty unbelievable: when we disable PGO in js/src, we get the same bc1 perma-orange I was backed out for two days ago (bug 1028872, bug 1028866)...

The build is a lot faster though: from 233 minutes for another PGO Try push to 188 minutes.

The Numbers (I changed the Windows 8 power settings from "Balanced" to "Best performance", this improves everything a bit compared to the numbers in comment 8).

Sunspider 1.0.2 (ms):

Aurora PGO       : 143.2, 143.4, 144.7, 141.7
PGO last week    : 147.9, 144.0, 147.5, 146.0
PGO this week    : 181.7, 182.6, 182.4, 182.9
PGO, not js/src  : 150.9, 147.7, 151.7, 149.9

Kraken 1.1 (ms):

Aurora PGO:      : 1036.0, 1046.4, 1036.8
PGO last week    : 1061.5, 1043.2, 1042.0
PGO this week    : 1176.5, 1189.2, 1195.4
PGO, not js/src  : 1051.0, 1042.3, 1033.0

Octane 2.0 (points):

Aurora PGO:      : 24050, 27686, 27438
PGO last week    : 28064, 27574, 27971
PGO this week    : 23801, 23384, 24160
PGO, not js/src  : 26908, 27245, 26661

unrealengine.com/html5 compilation time (ms):

Aurora PGO:      : 6831, 6873
PGO last week    : 7145, 7116
PGO this week    : 8729, 8773
PGO, not js/src  : 5948, 5971

Note:

* The difference between Aurora and Nightly on SS could be from --enable-profiling on Nightly builds.
* If we do nothing, Nightly will have a terrible regression on SS and other benchmarks.
* "PGO last week" is inbound rev 3087b33b8037, see comment 7 and 8.

Pros/cons for turning off PGO in js/src:

+ asm.js compilation time will improve. Other parts of the codebase that are not in the profile will also get faster (bug 977658).
+ The performance regression (on pretty much all benchmarks) from last week will be gone.
+ SS and Kraken won't really be affected. SS may lose a few ms, Kraken may win some.
+ We'll avoid security bugs, crash bugs and correctness bugs introduced by PGO. I've fixed Windows PGO bugs in all of these categories; there are likely others that we're unaware of.
+ PGO builds will be significantly faster. Can't say much based on one build, but I think 188 minutes for the Try build is an improvement of at least 40 minutes.

- Octane may get a bit slower. I'll investigate this more.
- A bc1 perma-orange; the same one we got with bug 1028866 part 5. This will have to be fixed of course.
(In reply to Jan de Mooij [:jandem] from comment #10)
> Octane 2.0 (points):
> 
> Aurora PGO:      : 24050, 27686, 27438
> PGO last week    : 28064, 27574, 27971
> PGO this week    : 23801, 23384, 24160
> PGO, not js/src  : 26908, 27245, 26661

> - Octane may get a bit slower. I'll investigate this more.

I reran the octane scores. For me they look ok. I actually don't expect a regression by disabling pgo. But octane is quite noisy, so hard to make definite predictions about.

1 week old PGO build: 18327, 19162, 19367, 19540, 18743
try non-PGO build: 18475, 18535, 19078, 18782, 19499
Thanks for checking that Hannes!

(In reply to Jan de Mooij [:jandem] from comment #10)
> Pretty unbelievable: when we disable PGO in js/src, we get the same bc1
> perma-orange I was backed out for two days ago (bug 1028872, bug 1028866)...
...
> - A bc1 perma-orange; the same one we got with bug 1028866 part 5. This will
> have to be fixed of course.

This is bug 1028872 and a regression from bug 867317. Apparently when I landed bug 1028866 it somehow made this fail for the first time, but we're still seeing the same thing on other trees and it's a problem with the test.
Attached patch Patch (obsolete) — Splinter Review
See comment 10.

Flagging :glandium for the build system changes, Luke for the let's-disable-PGO decision.
Assignee: nobody → jdemooij
Status: NEW → ASSIGNED
Attachment #8447142 - Flags: superreview?(luke)
Attachment #8447142 - Flags: review?(mh+mozilla)
Comment on attachment 8447142 [details] [diff] [review]
Patch

I think your analysis of the pros/cons above make it pretty clear that turning off PGO is a win.  More generally, it seems crazy to me for a module massively focused on performance to have a compilation mode that can radically and unexpectedly degrade performance on our biggest platform, in a compilation mode that no-one runs locally.

Even if there are some real regressions, it seems like we should be able to apply good old fashioned profiling and optimization and the result will be faster on all platforms.

I'd sr+, but as not-module-owner, I probably shouldn't.  Forwarding to jorendorff.
Attachment #8447142 - Flags: superreview?(luke) → superreview?(jorendorff)
One thing that might be worth keeping in mind: We compile with -O1 on Windows, which optimizes for code size rather than performance. Imposing a different optimization level on parts of the codebase probably isn't a good idea, but it might be interesting to look into what -O2 would do for performance if we're turning off PGO. In bug 1004519, Benjamin Smedberg mentioned that -O1 produced better results, but I don't know if those results still apply today (with MSVC 2010 and PGO applied selectively).
(In reply to Emanuel Hoogeveen [:ehoogeveen] from comment #15)
> One thing that might be worth keeping in mind: We compile with -O1 on
> Windows, which optimizes for code size rather than performance. Imposing a
> different optimization level on parts of the codebase probably isn't a good
> idea, but it might be interesting to look into what -O2 would do for
> performance if we're turning off PGO. In bug 1004519, Benjamin Smedberg
> mentioned that -O1 produced better results, but I don't know if those
> results still apply today (with MSVC 2010 and PGO applied selectively).

js/src/ compiles its files with -O2: http://mxr.mozilla.org/mozilla-central/source/js/src/configure.in#1675
Attachment #8447142 - Flags: superreview?(jorendorff) → superreview+
What's missing from comment 10 is analysis is what would happen if we improved the PGO profile data.  The last time that we did this was in bug 472706 more than 5 years ago, so these unfortunate results are expected more or less.  :/  But aren't we approaching the problem backwards here?  There are well known advantages to PGOing code, we just need to examine the JS engine properly during the profile phase, I think.
That doesn't address the ongoing maintenance problems with PGO or the weak speedups we see, even on workloads PGO is optimized for.  However, it does seem useful to measure the effect of adding octane/kraken to the PGO workload now, before we disable PGO and (since PGO builds will surely break within a week) lose the ability to easily measure in the future.  But given the high cost, it seems like PGO has to provide significant measurable wins to justify itself in JS.
(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #17)
> What's missing from comment 10 is analysis is what would happen if we
> improved the PGO profile data.  The last time that we did this was in bug
> 472706 more than 5 years ago, so these unfortunate results are expected more
> or less.  :/

As Luke said, the PGO profile run *does* include Sunspider and even there it wins only 1-3 ms if we're lucky *before it regressed last week*. In fact, some pretty harmless patches I landed last week to prepare some string natives for Latin1 chars regressed JS performance across the board (even in unrelated code like asm.js compilation), see comments 7-10. We can't have that kind of sudden performance regression and unpredictability. How am I going to fix this? Backout? What if we uplift a last-minute fix to beta that has the same disastrous effect on PGO builds?

> But aren't we approaching the problem backwards here?  There
> are well known advantages to PGOing code, we just need to examine the JS
> engine properly during the profile phase, I think.

Another problem with MSVC PGO is that it doesn't use the profile data simply to reorder branches or do more heavy inlining in hot code. It also does the opposite: your code is not in the profile? Then let's deoptimize it completely for you. As this bug and bug 977658 show, that can be pretty bad.

Also, we've had multiple security/crash/correctness bugs caused by PGO. I've spent days tracking down these issues; I could have spent that time improving JS performance on all platforms.
Sorry, I hit submit too soon, but I'll add some Octane/Kraken tests to the profile run and see what happens. It can't hurt to measure :)
Adds Kraken to the PGO profile:

https://tbpl.mozilla.org/?tree=Try&rev=123ee263f7be

I'll add Octane as well, but those tests take a lot more time and are more complicated to convert, so there's a bigger risk the profile run will timeout or fail somehow. Will do a separate Try push for that.
(In reply to Jan de Mooij [:jandem] from comment #19)
> (In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment
> #17)
> > What's missing from comment 10 is analysis is what would happen if we
> > improved the PGO profile data.  The last time that we did this was in bug
> > 472706 more than 5 years ago, so these unfortunate results are expected more
> > or less.  :/
> 
> As Luke said, the PGO profile run *does* include Sunspider and even there it
> wins only 1-3 ms if we're lucky *before it regressed last week*.

Right...  Could that be because we spend the majority of our time in JIT generated code there (which would obviously be unaffected by PGO)?

> In fact,
> some pretty harmless patches I landed last week to prepare some string
> natives for Latin1 chars regressed JS performance across the board (even in
> unrelated code like asm.js compilation), see comments 7-10. We can't have
> that kind of sudden performance regression and unpredictability. How am I
> going to fix this? Backout? What if we uplift a last-minute fix to beta that
> has the same disastrous effect on PGO builds?

I understand this pain and agree that this is not great.  But this is a trade-off, right?  So far with other parts of the code that does get coverage during the PGO profile phase, we have evidence that PGO improves the performance of the generated code by 10-20% (depending on the workload), and that is basically why we've dealt with all of the pain it causes us for years.  All I was asking here was for more measurements and thank you for doing that!  It might very well turn out to be the case that because the JS workload is mostly running JIT code, PGO even with good profile data is not very effective for it.  And I ultimately do trust your judgement here.

> > But aren't we approaching the problem backwards here?  There
> > are well known advantages to PGOing code, we just need to examine the JS
> > engine properly during the profile phase, I think.
> 
> Another problem with MSVC PGO is that it doesn't use the profile data simply
> to reorder branches or do more heavy inlining in hot code. It also does the
> opposite: your code is not in the profile? Then let's deoptimize it
> completely for you. As this bug and bug 977658 show, that can be pretty bad.

Yeah, this is a pretty bad side-effect of the PGO implementation in MSVC IMO.  There is no easy way to say exactly what happens, but in my experience it seems like it computes a relative hotness level for function (or some other kind of unit) and then it uses different optimization flags when codegening after the profile gathering phase depending on the hotness factor, which is really surprising to me.  In the past I have searched for ways to opt out of this behavior to no avail. :(

> Also, we've had multiple security/crash/correctness bugs caused by PGO. I've
> spent days tracking down these issues; I could have spent that time
> improving JS performance on all platforms.

Yes, I do share your pain here, I have also dealt with miscompilation MSVC PGO bugs.
BTW I wonder if bug 1029968 is related to PGO as well...
bug 1029968 deals with all PGO builds/perf tests on the 3 windows platforms, so I am pretty certain this is related.
Another argument in favor of turning off PGO in JS is that it help us fold libmozjs back into libxul without a massive PGO memory spike and that, in turn, would allow us to make JSAPI symbols private.
Also add Octane to the profile:

https://tbpl.mozilla.org/?tree=Try&rev=32189370b526

(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #22)
> Right...  Could that be because we spend the majority of our time in JIT
> generated code there (which would obviously be unaffected by PGO)?

Good point, we do spend a lot of time in JIT code these days, on all these benchmarks (especially on asm.js code of course). There's still quite a lot of time in the VM, GC, frontend and compiler though, but for instance Generational GC helps reduce the GC time, lazy parsing and stuff makes the frontend faster and new JIT optimizations often reduce time spent in the VM :)

> Yeah, this is a pretty bad side-effect of the PGO implementation in MSVC
> IMO.  There is no easy way to say exactly what happens, but in my experience
> it seems like it computes a relative hotness level for function (or some
> other kind of unit) and then it uses different optimization flags when
> codegening after the profile gathering phase depending on the hotness
> factor, which is really surprising to me.  In the past I have searched for
> ways to opt out of this behavior to no avail. :(

Ah, I was wondering whether there was a compiler flag/pragma for this, now I don't have to search for it.

Thanks for the info/advice!
(In reply to Luke Wagner [:luke] from comment #25)
> Another argument in favor of turning off PGO in JS is that it help us fold
> libmozjs back into libxul without a massive PGO memory spike and that, in
> turn, would allow us to make JSAPI symbols private.

The last time I tried doing that the issue was that it increased the linking time for libxul by so much that the buildbot by timeout for our jobs would kick in and kill the build job.  :(
See Also: → 1029968
(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #27)
Are you sure it wasn't the PGO timing out?
Both builds finished, so I downloaded the Octane + Kraken build, created a new profile and compared it to the build that turns off PGO in JS (Try link in comment 9). PGO++ is the new PGO build with Octane and Kraken added to the profile:

Sunspider 1.0.2 (ms):

PGO, not js/src: 151.0, 152.7, 151.0, 150.5
PGO++          : 149.6, 148.5, 150.6, 149.5

Kraken 1.1 (ms):

PGO, not js/src: 1052.1, 1057.5, 1054.8, 1052.7, 1053.0
PGO++          : 1059.1, 1041.1, 1057.5, 1037.8, 1050.1

Octane 2 (points):

PGO, not js/src: 26566, 27688, 26069, 26830
PGO++          : 27487, 27624, 27404, 28061

unrealengine.com/html5 compilation time (ms):

PGO, not js/src: 5896, 6001
PGO++          : 7181, 7162

Notes:

* The PGO++ build is on top of inbound revision (3087b33b8037), the one before the PGO regression last week. It wouldn't be fair to use newer revisions as we know they have a PGO perf regression.

* Octane-zlib is an asm.js benchmark. I'm surprised it doesn't affect our Unreal compilation time.

* I don't see much of a difference, so to be really sure the profile includes Kraken and Octane, I'll kick off a PGO build locally and keep an eye on the Firefox window it opens to make sure the numbers it reports make sense. I ran pgo/index.html on Mac before I pushed to Try, but the profile environment is different (it uses a web server etc).
(In reply to Jan de Mooij [:jandem] from comment #29)
> * I don't see much of a difference, so to be really sure the profile
> includes Kraken and Octane, I'll kick off a PGO build locally and keep an
> eye on the Firefox window it opens to make sure the numbers it reports make
> sense.

I just did this and the benchmarks run without any problems. It took about 10 minutes to run Sunspider/Kraken/Octane because the profile build is at least 10x slower on some tests, but it does run all tests.
For posterity, here are the patches to add Kraken 1.1 and Octane 2 to the profile run (compressed because they are ~20 MB).
That is surprising that Octane-zlib didn't fix the asm.js compile-time regression, zlib definitely hits the full compilation path.
Two things I'd suggest trying:

- Create a PGO workload with *only* an asm.js compilation and execution.  See if it makes a difference.

- Redo a number of the PGO tests with Visual Studio 2013.  We're using a 4 year old version of Visual Studio, with compilers that they're actively working on; would be good to know what changes may have happened in the meantime.  What we're seeing might just be a bug that was fixed.
For an asm.js-only workload, I'd recommend headless bananabench.  However, if asm.js-only fixes the perf problem (and not the newer msvc version) then that's a really strong point against PGO.
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #33)
> - Redo a number of the PGO tests with Visual Studio 2013.  We're using a 4
> year old version of Visual Studio, with compilers that they're actively
> working on; would be good to know what changes may have happened in the
> meantime.  What we're seeing might just be a bug that was fixed.

Does anybody here have VS 2013 and can test this for me (PGO enabled vs. PGO disabled in js/src vs. PGO enabled + Kraken/Octane added)? Do we have a timeframe for a VS update?
(In reply to Jan de Mooij [:jandem] from comment #35)
> Does anybody here have VS 2013 and can test this for me (PGO enabled vs. PGO
> disabled in js/src vs. PGO enabled + Kraken/Octane added)?

Nevermind, I'm downloading VS 2013 now so hopefully I can measure this myself.
(In reply to Luke Wagner [:luke] from comment #28)
> (In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment
> #27)
> Are you sure it wasn't the PGO timing out?

Yes!
I tested with VS 2013 and the results (see below) are promising. The perf cliffs on "cold" code seem to be gone for the most part (see the Unreal numbers). There's little difference on asm.js compilation with/without PGO, but on the JS benchmarks PGO is about a 4-7% win.

The PGO implementation in VS 2013 seems to be way more mature (and give more predictable performance) than in VS 2010, see the numbers below. However, I think using PGO with VS 2010 is a mistake; it doesn't win much and it can cause serious perf problems.

I think the right thing to do is disable PGO (in JS) for VS 2010 (or VS < 2013). What do people think?

Sunspider 1.0.2 (ms):

PGO       : 137.6, 136.1, 135.6
PGO++     : 143.7, 137.0, 136.9
PGO not JS: 146.8, 149.1, 145.8

Kraken 1.1 (ms):

PGO       : 1039.2, 1021.9, 1032.1
PGO++     :  991.8,  993.7,  995.4
PGO not JS: 1042.2, 1054.5, 1038.7

Octane 2 (points):

PGO       : 29504, 29287, 28662
PGO++     : 29213, 30482, 30653
PGO not JS: 27476, 27796, 27414

unrealengine.com/html5 compilation time (ms):

PGO       : 5836, 5773, 5765
PGO++     : 5535, 5449, 5443
PGO not JS: 5652, 5616, 5657
What are the different compilers used in the charts above?
"PGO" is VS2010, "PGO++" is VS2013, and "PGO not JS" is VS2013 without PGO?
(In reply to Sean Stangl [:sstangl] from comment #39)
> What are the different compilers used in the charts above?
> "PGO" is VS2010, "PGO++" is VS2013, and "PGO not JS" is VS2013 without PGO?

Sorry should have mentioned that, but all numbers are with VS 2013:

PGO      : no changes to the source code
PGO++    : Kraken/Octane added to the profile
PGO no JS: PGO disabled in js/src
I like tables. So here's all the numbers above:

https://docs.google.com/spreadsheets/d/1WMbOIqT2cxvOPi-2AD6UqvZWfZ89CKbfrevSzYNYBG4/edit?usp=sharing

Light green is 2013, light yellow is 2010.  There's a few 2010 regular PGO runs missing, but I think there are enough numbers there.  The percentages are based on speedup from a VS2010 build with PGO disabled in js/src.

From this, it looks like 2013 is a big win across the board -- we could switch to 2013 and disable PGO and still be at similar speeds to 2010 with PGO.  We could also enable PGO + Octane/Kraken and get >5% in every benchmark (compared to 2010 with PGO+Octane/Kraken).
When is everything moving to vs2013?
Comment on attachment 8447142 [details] [diff] [review]
Patch

Review of attachment 8447142 [details] [diff] [review]:
-----------------------------------------------------------------

r+ if and only if the decision is to stop PGOing JS.
Attachment #8447142 - Flags: review?(mh+mozilla) → review+
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #41)
> I like tables. So here's all the numbers above:
> 
> https://docs.google.com/spreadsheets/d/1WMbOIqT2cxvOPi-
> 2AD6UqvZWfZ89CKbfrevSzYNYBG4/edit?usp=sharing

Thanks for doing that!

> From this, it looks like 2013 is a big win across the board -- we could
> switch to 2013 and disable PGO and still be at similar speeds to 2010 with
> PGO.  We could also enable PGO + Octane/Kraken and get >5% in every
> benchmark (compared to 2010 with PGO+Octane/Kraken).

Yes definitely. From what I saw, 2013 really looks like a much better compiler (in terms of performance, especially PGO) than 2010. (I'm also pretty sure Firefox PGO compilation time improved, but I didn't measure this explicitly.)

But we have to decide what we're going to do until we switch to 2013. Disabling PGO for 2010 would:

(1) Fix the PGO perf regression (introduced last week) on pretty much everything.
(2) Give us better performance on "cold" code (see the -17.05% in your table)

If we *don't* disable PGO for 2010, we have to spend time tracking down and fixing (1). We also have to take the perf hit from (2) until we switch to 2013.
Flags: needinfo?(vladimir)
Ehsan is working on getting us on 2013; it's currently blocked by bugs linked from bug 914596, with the biggest issue being bug 1023941.  If we need to accelerate that we should make that case and plan for it -- assuming there aren't any issues along the way, it seems like a reasonable thing to do in Q3 (i.e. over the next three months).

But yes, we do have to decide what to do in the meantime.  All of the options suck, but disabling PGO sounds like the least bad one -- gives us predictable performance and reduces bugs.
Flags: needinfo?(vladimir) → needinfo?(ehsan)
OK, it looks like the PGO regression from last week (bug 1029968) got magically fixed by another, unrelated string patch. Still scary though because I've more string patches in flight and I don't want it to break again...
(In reply to Vladimir Vukicevic [:vlad] [:vladv] from comment #45)
> Ehsan is working on getting us on 2013; it's currently blocked by bugs
> linked from bug 914596, with the biggest issue being bug 1023941.  If we
> need to accelerate that we should make that case and plan for it -- assuming
> there aren't any issues along the way, it seems like a reasonable thing to
> do in Q3 (i.e. over the next three months).

David Major has been the person who is actually doing the work, but AFAIK he has other stuff on his plate too, so it's hard to say when this will happen.  That being said, we have removed all of the technical roadblocks in our way now so the rest is just going to be work.  Also, please note that traditionally with compiler switches we've seen things such as new crashes and whatnot show up so it will be something that needs an active owner for us to be able to pull it off.  Not sure if David is looking to own this as a Q3 goal or not.

> But yes, we do have to decide what to do in the meantime.  All of the
> options suck, but disabling PGO sounds like the least bad one -- gives us
> predictable performance and reduces bugs.

If you decide to disable PGO in js/src, can you please only do it for 2010?  The current patch completely disables it on Windows without regard to the compiler version.
Flags: needinfo?(ehsan)
According to bug 1029422, the Windows PGO regression appears to be back, again after a pretty simple string-related change..

(In reply to :Ehsan Akhgari (lagging on bugmail, needinfo? me!) from comment #47)
> If you decide to disable PGO in js/src, can you please only do it for 2010? 
> The current patch completely disables it on Windows without regard to the
> compiler version.

Yes, I'll write a new patch right now.
Attached patch Patch v2Splinter Review
Only disable PGO for MSVC 2010.
Attachment #8447142 - Attachment is obsolete: true
Attachment #8456045 - Flags: review?(mh+mozilla)
(In reply to Jan de Mooij [:jandem] from comment #49)
> Created attachment 8456045 [details] [diff] [review]
> Patch v2
> 
> Only disable PGO for MSVC 2010.

Would this description be a good summary if this patch lands? :

"We won't have Windows PGO builds on any branch or release channel (where the patch would get applied - I'm guessing uplifts might follow?) until we switch to VS 2013".
(In reply to Avi Halachmi (:avih) from comment #50)
> Would this description be a good summary if this patch lands? :
> 
> "We won't have Windows PGO builds on any branch or release channel (where
> the patch would get applied - I'm guessing uplifts might follow?) until we
> switch to VS 2013".

We are not turning off PGO completely, we will just disable it in 1 directory (js/src) until we switch to VS 2013.
Attachment #8456045 - Flags: review?(mh+mozilla) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/f93457a01cf7

Windows PGO Try push: https://tbpl.mozilla.org/?tree=Try&rev=8e89e911813c

I also verified this still improves performance a lot on SS and the Unreal Engine demo.
https://hg.mozilla.org/mozilla-central/rev/f93457a01cf7
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
You need to log in before you can comment on or make changes to this bug.