Closed Bug 494095 Opened 11 years ago Closed 11 years ago

Use -O3 for Mac Builds

Categories

(Firefox Build System :: General, defect)

x86
macOS
defect
Not set

Tracking

(Not tracked)

RESOLVED FIXED
mozilla1.9.2a1

People

(Reporter: damons, Assigned: vlad)

Details

Attachments

(3 files)

From a quick analysis by Josh Aas:

I finished the first round of tests, Firefox 3.5 Mac OS X release builds, the only difference being -O2 vs. -O3. I'll find a better way to publish this data soon. 


The basic rundown is this though:

Mac OS X 10.5.2 (10.4u SDK), gcc 4.01
Firefox 3.5 code from May 14, 2009

Universal binary dmg size: -O2 is 16.9 MB, -O3 is 17.8 MB

Ts: About the same.

Tp: -O3 is 3-4% faster on average, 22% faster in the worst case.

Tjss: -O3 is about 29% faster.

Tsspider: -O3 is about 2% faster.
Josh, I'm not sure what this means:

Tp: -O3 is 3-4% faster on average, 22% faster in the worst case.

???
Juggling components, giving to vlad for quick patch tonight.
Assignee: nobody → vladimir
Component: MozillaBuild → Build Config
Product: mozilla.org → Core
QA Contact: mozillabuild → build-config
Target Milestone: --- → mozilla1.9.1
Version: other → unspecified
"The worst case" means worst case for each, so the cache-cold case, sorry, the
wording is confusing. In the cold case -O3 did 22% better than -O2, but that
advantage dropped to 3-4% for subsequent "hot" runs.
this is gcc 4.01 in both cases, right?
This is a zip file containing the full talos results for both builds, the builds themselves, a summary of how the test was done (yes, gcc-4.01 for both builds), and an OO.org spreadsheet with some of the numbers punched in.

http://joshaas.net/mozilla/talos-O2-vs-O3.zip
Here is the Linux x86_64 (Ubuntu 9.04, gcc-4.3.3) talos output for -Os (current default) and -O3 optimized builds and a test summary text file containing information about the host, mozconfigs, hardware.

I don't have time to analyze this tonight, here it is if anyone else cares to compare the results.
Use -O3 for MOZ_OPTIMIZE_FLAGS for both mozilla and js.  Note that there's -O2 in nspr as well, but I don't know what the rules etc. are for modifying NSPR's configure.in.
Attachment #378794 - Flags: review?(shaver)
Attachment #378794 - Flags: review+
Comment on attachment 378794 [details] [diff] [review]
use -O3 for darwin
[Checkin: Comment 24]

shaver is gone for now, I'll r+ to try this out asap.
Pushed.  Maybe a separate bug on nspr?
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Talos Improvement: DHTML decrease 0.65% on Tiger Firefox
Improvement: SunSpider decrease 1.34% on Leopard Firefox
Improvement: DHTML decrease 1.94% on Leopard Firefox
Improvement: Tp3 decrease 1.28% on Leopard Firefox

Looks like you moved the needle a little bit. I guess we need -fomit-frame-pointer to really win (bug 492688).
Has anyone researched how invasive the /O3 optimizations are or how the changes might tickle our particular codebase?   To paraphase from http://compilers.iecc.com/comparch/article/09-05-073  "...compiler optimizations are *designed* to to make buggy code crash, in the hopes that it will turn up crash bugs that can get fixed in a 'compiler evaluation phase'" 

How confident are we that a talos and a few automated test runs are going to provide the full breadth of test coverage needed to turn up race condition, timing, and crash bugs across the entire app, and allow them to get fixed in an RC cycle?

Its great to be doing this work, but should it really get a beta cycle?
I want to be clear that I'm not necessarily endorsing this change, I just provided data. I don't know enough about the consequences of -O3 to be confident that this change is safe so late in the game. We should get opinions from people who are more knowledgeable about compilers before making this change.
Yeah, let's see how it goes on m-c for a bit.
Whiteboard: [DO NOT COMMIT TO 1.9.1 BRANCH YET]
are there specs for what changes between O2 and O3 that can be added to this
bug?

would it be worthwhile for super-revier-ish and compiler-developer-ish people
to glance over the specs to see if they notice optimizations that might tickle
areas of our code base?

If we take the change the goal is to preempt a situation and timeline like in
https://bugzilla.mozilla.org/show_bug.cgi?id=475178

2009-01-13 optimizations added
2009-01-24 bug found
2009-02-16 Test case minimized after lots of hard work.
2009-02-27 patch

Ted thinks running unit tests would have caught this bug.  dbaron has tracked down similar bugs in the past and I think there are others and is also generally confident about taking the change, so that's good.

I guess I'm just more concerned about the amount of code coverage we have for UI interaction and 'the web'.  

It sounds like we should also examine all the test logs pretty closely for any suspicious changes that do turn up after optimizations are added.  That might be another mitigation strategy.

We have some new tools and process to catch and reproduce 1-click drive-by web page crashes out of the socorro data and bclary/tomcat automation, but that requires getting the changes out to 100k users.

Some analysis on the soccoro & hendrix data also shows lots of undiagnosed crashes on complicated web apps like live mail, gmail and a bunch of facebok apps, and problems are made worse with introduction of addons.   pounding on these with adhoc testing needs to be done anyway to get more data on what's going on but might be an espcially good thing to do now if we take this change.

anyone see changes in the memory use profile over pageloading cycles or other tests?
-O3 turns on the following optimizations on top of -O2

-finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload and -ftree-vectorize . 

Ref: http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

This 'may' impact the branch coverage [-funswitch-loops option ] and function coverage [ -finline-functions] results.
(In reply to comment #14)
> Ted thinks running unit tests would have caught this bug.  dbaron has tracked
> down similar bugs in the past and I think there are others and is also
> generally confident about taking the change, so that's good.

I'm certainly don't know much about compiler optimisations, and this may not be relevant here, however I think it may be worth pointing out:

Thunderbird's unit tests on trunk have already picked up two regressions from tracemonkey landings that Firefox's unit tests didn't. Some of this is probably because we're pushing the limits of js a bit more, however it does indicate to me that Firefox has less coverage in some areas (and even combined we may be missing quite a bit still).

Whether or not this applies in this situation, I don't know. So please use or throw away as appropriate.
-O2 doesn't have -finline-functions? aroo?  I think Apple's GCC is a special snowflake, though, so we should compare the specs entries or use -verbose to make sure we're talking about the right things anyway.

Looking at the reports of the "optimized builds" in the wild that also have -O3, there are indeed additional anecdotes of things not working, or being more crashy.  Not enough to investigate, but enough to put me over the edge on this issue for 3.5.  Let's not block or take for 3.5.0, but re-evaluate for a future 3.5.x.
Flags: blocking1.9.1+ → wanted1.9.1.x?
Whiteboard: [DO NOT COMMIT TO 1.9.1 BRANCH YET]
(In reply to comment #16)
 
> Thunderbird's unit tests on trunk have already picked up two regressions from
> tracemonkey landings that Firefox's unit tests didn't. 

Were there bugs filed for those?
(In reply to comment #18)
> (In reply to comment #16)
> 
> > Thunderbird's unit tests on trunk have already picked up two regressions from
> > tracemonkey landings that Firefox's unit tests didn't. 
> 
> Were there bugs filed for those?

Yes, both were fixed and now have tests - Bug 494045 and bug 485889.
(In reply to comment #19)

> 
> Yes, both were fixed and now have tests - Bug 494045 and bug 485889.

Perfect.  Thanks!
(In reply to comment #17)
> Looking at the reports of the "optimized builds" in the wild that also have
> -O3, there are indeed additional anecdotes of things not working, or being more
> crashy. 

Yes, I found out the same with my own experiences. In the past I've done a lot of testings with different flags, but only for Thunderbird. My knowledge is only based on trial and error. In my experiences -O3 have brought me a build that was a bit crashy, especially if I try to quit it. Therefore I've tried out -O2 and options that are additionally enabled with -O3. You can use e.g.
$ echo 'int main(){return 0;}' > test.c && gcc-4.2 -v -Q -O3 test.c -o test && rm test.c test
to see which options are enabled with -O3 (or -O2). Than you will see that -O3 also enables -fgcse-after-reload,  -finline-functions-called-once and -funswitch-loops (in Apples GCC 4.2).
At the moment I use "-O2 -fgcse-after-reload", this gives me a build that is faster than -O2, but not crashy like -O3. In my experiences it is the -finline-functions that makes the build crashy. -funswitch-loops is also stable, but I don't use it, because it makes the builds size bigger.
Tested on Mac OS X 10.5 with Apples GCC 4.2 and only with Thunderbird 3.

(My complete flag I use at the moment is "-O2 -fstack-protector -param=ssp-buffer-size=4 -march=core2 -msse4.1 -fpeel-loops -fgcse-after-reload -ftree-vectorize", but this is architecture dependent and contains options that are only beneficial on my machine, for whatever reason).
For OSX intel, we should absolutely be using the architecture/etc. flags, though perhaps not tree-vectorize -- we know the minimum architecture that will be supported there.  Trying gcc 4.4 with -O3 may also be interesting.
(In reply to comment #22)
> For OSX intel, we should absolutely be using the architecture/etc. flags,
> though perhaps not tree-vectorize -- we know the minimum architecture that will
> be supported there.  Trying gcc 4.4 with -O3 may also be interesting.

I thought mac gcc already used some decent minimum arch flag? Also, josh has noted that gcc 4.2+ builds will only work on 10.5+, so we need to cross that bridge first.
(In reply to comment #9)
> Pushed.

http://hg.mozilla.org/mozilla-central/rev/26748fc758c8
Flags: in-testsuite-
Target Milestone: mozilla1.9.1 → mozilla1.9.2a1
Version: unspecified → Trunk
Comment on attachment 427643 [details] [diff] [review]
(Bv1) Copy it to comm-central
[Checkin: Comment 27]

What's wrong with sticking this in a mailnews core bug so that we can track what we're fixing in mailnews?
Attachment #427643 - Flags: review?(bugzilla)
Attachment #427643 - Flags: review?(bugzilla)
Attachment #427643 - Flags: review?(bugzilla) → review+
Comment on attachment 427643 [details] [diff] [review]
(Bv1) Copy it to comm-central
[Checkin: Comment 27]


http://hg.mozilla.org/comm-central/rev/a697b40649a3
Attachment #427643 - Attachment description: (Bv1) Copy it to comm-central → (Bv1) Copy it to comm-central [Checkin: Comment 27]
Attachment #378794 - Attachment description: use -O3 for darwin → use -O3 for darwin [Checkin: Comment 24]
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.