Closed Bug 494095 Opened 16 years ago Closed 16 years ago

Use -O3 for Mac Builds

Tracking

(Not tracked)

Status:

RESOLVED FIXED

Milestone:

mozilla1.9.2a1

People

(Reporter: damons, Assigned: vlad)

Details

Attachments

(3 files)

linux x86_64 -Os vs. -O3 16 years ago Josh Aas 13.55 KB, application/octet-stream		Details
use -O3 for darwin [Checkin: Comment 24] 16 years ago Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond) 938 bytes, patch	sayrer : review+	Details \| Diff \| Splinter Review
(Bv1) Copy it to comm-central [Checkin: Comment 27] 15 years ago Serge Gautherie (:sgautherie) 790 bytes, patch	standard8 : review+	Details \| Diff \| Splinter Review

Damon Sicore (:damons)

Reporter

Description

•

16 years ago

From a quick analysis by Josh Aas: I finished the first round of tests, Firefox 3.5 Mac OS X release builds, the only difference being -O2 vs. -O3. I'll find a better way to publish this data soon. The basic rundown is this though: Mac OS X 10.5.2 (10.4u SDK), gcc 4.01 Firefox 3.5 code from May 14, 2009 Universal binary dmg size: -O2 is 16.9 MB, -O3 is 17.8 MB Ts: About the same. Tp: -O3 is 3-4% faster on average, 22% faster in the worst case. Tjss: -O3 is about 29% faster. Tsspider: -O3 is about 2% faster.

Damon Sicore (:damons)

Reporter

Comment 1

•

16 years ago

Josh, I'm not sure what this means: Tp: -O3 is 3-4% faster on average, 22% faster in the worst case. ???

Mike Shaver (:shaver emeritus)

Comment 2

•

16 years ago

Juggling components, giving to vlad for quick patch tonight.

Assignee: nobody → vladimir

Component: MozillaBuild → Build Config

Product: mozilla.org → Core

QA Contact: mozillabuild → build-config

Target Milestone: --- → mozilla1.9.1

Version: other → unspecified

Josh Aas

Comment 3

•

16 years ago

"The worst case" means worst case for each, so the cache-cold case, sorry, the wording is confusing. In the cold case -O3 did 22% better than -O2, but that advantage dropped to 3-4% for subsequent "hot" runs.

Robert Sayre

Comment 4

•

16 years ago

this is gcc 4.01 in both cases, right?

Josh Aas

Comment 5

•

16 years ago

This is a zip file containing the full talos results for both builds, the builds themselves, a summary of how the test was done (yes, gcc-4.01 for both builds), and an OO.org spreadsheet with some of the numbers punched in. http://joshaas.net/mozilla/talos-O2-vs-O3.zip

Mike Shaver (:shaver emeritus)

Updated

•

16 years ago

Flags: blocking1.9.1+

Josh Aas

Comment 6

•

16 years ago

Attached file linux x86_64 -Os vs. -O3 — Details

Here is the Linux x86_64 (Ubuntu 9.04, gcc-4.3.3) talos output for -Os (current default) and -O3 optimized builds and a test summary text file containing information about the host, mozconfigs, hardware. I don't have time to analyze this tonight, here it is if anyone else cares to compare the results.

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Comment 7

•

16 years ago

Attached patch use -O3 for darwin [Checkin: Comment 24] — Details — Splinter Review

Use -O3 for MOZ_OPTIMIZE_FLAGS for both mozilla and js. Note that there's -O2 in nspr as well, but I don't know what the rules etc. are for modifying NSPR's configure.in.

Attachment #378794 - Flags: review?(shaver)

Robert Sayre

Updated

•

16 years ago

Attachment #378794 - Flags: review+

Robert Sayre

Comment 8

•

16 years ago

Comment on attachment 378794 [details] [diff] [review] use -O3 for darwin [Checkin: Comment 24] shaver is gone for now, I'll r+ to try this out asap.

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Comment 9

•

16 years ago

Pushed. Maybe a separate bug on nspr?

Status: NEW → RESOLVED

Closed: 16 years ago

Resolution: --- → FIXED

(not currently active) Ted Mielczarek

Comment 10

•

16 years ago

Talos Improvement: DHTML decrease 0.65% on Tiger Firefox Improvement: SunSpider decrease 1.34% on Leopard Firefox Improvement: DHTML decrease 1.94% on Leopard Firefox Improvement: Tp3 decrease 1.28% on Leopard Firefox Looks like you moved the needle a little bit. I guess we need -fomit-frame-pointer to really win (bug 492688).

chris hofmann

Comment 11

•

16 years ago

Has anyone researched how invasive the /O3 optimizations are or how the changes might tickle our particular codebase? To paraphase from http://compilers.iecc.com/comparch/article/09-05-073 "...compiler optimizations are *designed* to to make buggy code crash, in the hopes that it will turn up crash bugs that can get fixed in a 'compiler evaluation phase'" How confident are we that a talos and a few automated test runs are going to provide the full breadth of test coverage needed to turn up race condition, timing, and crash bugs across the entire app, and allow them to get fixed in an RC cycle? Its great to be doing this work, but should it really get a beta cycle?

Josh Aas

Comment 12

•

16 years ago

I want to be clear that I'm not necessarily endorsing this change, I just provided data. I don't know enough about the consequences of -O3 to be confident that this change is safe so late in the game. We should get opinions from people who are more knowledgeable about compilers before making this change.

Mike Shaver (:shaver emeritus)

Comment 13

•

16 years ago

Yeah, let's see how it goes on m-c for a bit.

Mike Shaver (:shaver emeritus)

Updated

•

16 years ago

Whiteboard: [DO NOT COMMIT TO 1.9.1 BRANCH YET]

chris hofmann

Comment 14

•

16 years ago

are there specs for what changes between O2 and O3 that can be added to this bug? would it be worthwhile for super-revier-ish and compiler-developer-ish people to glance over the specs to see if they notice optimizations that might tickle areas of our code base? If we take the change the goal is to preempt a situation and timeline like in https://bugzilla.mozilla.org/show_bug.cgi?id=475178 2009-01-13 optimizations added 2009-01-24 bug found 2009-02-16 Test case minimized after lots of hard work. 2009-02-27 patch Ted thinks running unit tests would have caught this bug. dbaron has tracked down similar bugs in the past and I think there are others and is also generally confident about taking the change, so that's good. I guess I'm just more concerned about the amount of code coverage we have for UI interaction and 'the web'. It sounds like we should also examine all the test logs pretty closely for any suspicious changes that do turn up after optimizations are added. That might be another mitigation strategy. We have some new tools and process to catch and reproduce 1-click drive-by web page crashes out of the socorro data and bclary/tomcat automation, but that requires getting the changes out to 100k users. Some analysis on the soccoro & hendrix data also shows lots of undiagnosed crashes on complicated web apps like live mail, gmail and a bunch of facebok apps, and problems are made worse with introduction of addons. pounding on these with adhoc testing needs to be done anyway to get more data on what's going on but might be an espcially good thing to do now if we take this change. anyone see changes in the memory use profile over pageloading cycles or other tests?

Murali Nandigama [:murali]

Comment 15

•

16 years ago

-O3 turns on the following optimizations on top of -O2 -finline-functions, -funswitch-loops, -fpredictive-commoning, -fgcse-after-reload and -ftree-vectorize . Ref: http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html This 'may' impact the branch coverage [-funswitch-loops option ] and function coverage [ -finline-functions] results.

Mark Banner (:standard8)

Comment 16

•

16 years ago

(In reply to comment #14) > Ted thinks running unit tests would have caught this bug. dbaron has tracked > down similar bugs in the past and I think there are others and is also > generally confident about taking the change, so that's good. I'm certainly don't know much about compiler optimisations, and this may not be relevant here, however I think it may be worth pointing out: Thunderbird's unit tests on trunk have already picked up two regressions from tracemonkey landings that Firefox's unit tests didn't. Some of this is probably because we're pushing the limits of js a bit more, however it does indicate to me that Firefox has less coverage in some areas (and even combined we may be missing quite a bit still). Whether or not this applies in this situation, I don't know. So please use or throw away as appropriate.

Mike Shaver (:shaver emeritus)

Comment 17

•

16 years ago

-O2 doesn't have -finline-functions? aroo? I think Apple's GCC is a special snowflake, though, so we should compare the specs entries or use -verbose to make sure we're talking about the right things anyway. Looking at the reports of the "optimized builds" in the wild that also have -O3, there are indeed additional anecdotes of things not working, or being more crashy. Not enough to investigate, but enough to put me over the edge on this issue for 3.5. Let's not block or take for 3.5.0, but re-evaluate for a future 3.5.x.

Flags: blocking1.9.1+ → wanted1.9.1.x?

Mike Shaver (:shaver emeritus)

Updated

•

16 years ago

Whiteboard: [DO NOT COMMIT TO 1.9.1 BRANCH YET]

Damon Sicore (:damons)

Reporter

Comment 18

•

16 years ago

(In reply to comment #16) > Thunderbird's unit tests on trunk have already picked up two regressions from > tracemonkey landings that Firefox's unit tests didn't. Were there bugs filed for those?

Mark Banner (:standard8)

Comment 19

•

16 years ago

(In reply to comment #18) > (In reply to comment #16) > > > Thunderbird's unit tests on trunk have already picked up two regressions from > > tracemonkey landings that Firefox's unit tests didn't. > > Were there bugs filed for those? Yes, both were fixed and now have tests - Bug 494045 and bug 485889.

Damon Sicore (:damons)

Reporter

Comment 20

•

16 years ago

(In reply to comment #19) > > Yes, both were fixed and now have tests - Bug 494045 and bug 485889. Perfect. Thanks!

Nomis101

Comment 21

•

16 years ago

(In reply to comment #17) > Looking at the reports of the "optimized builds" in the wild that also have > -O3, there are indeed additional anecdotes of things not working, or being more > crashy. Yes, I found out the same with my own experiences. In the past I've done a lot of testings with different flags, but only for Thunderbird. My knowledge is only based on trial and error. In my experiences -O3 have brought me a build that was a bit crashy, especially if I try to quit it. Therefore I've tried out -O2 and options that are additionally enabled with -O3. You can use e.g. $ echo 'int main(){return 0;}' > test.c && gcc-4.2 -v -Q -O3 test.c -o test && rm test.c test to see which options are enabled with -O3 (or -O2). Than you will see that -O3 also enables -fgcse-after-reload, -finline-functions-called-once and -funswitch-loops (in Apples GCC 4.2). At the moment I use "-O2 -fgcse-after-reload", this gives me a build that is faster than -O2, but not crashy like -O3. In my experiences it is the -finline-functions that makes the build crashy. -funswitch-loops is also stable, but I don't use it, because it makes the builds size bigger. Tested on Mac OS X 10.5 with Apples GCC 4.2 and only with Thunderbird 3. (My complete flag I use at the moment is "-O2 -fstack-protector -param=ssp-buffer-size=4 -march=core2 -msse4.1 -fpeel-loops -fgcse-after-reload -ftree-vectorize", but this is architecture dependent and contains options that are only beneficial on my machine, for whatever reason).

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Comment 22

•

16 years ago

For OSX intel, we should absolutely be using the architecture/etc. flags, though perhaps not tree-vectorize -- we know the minimum architecture that will be supported there. Trying gcc 4.4 with -O3 may also be interesting.

(not currently active) Ted Mielczarek

Comment 23

•

16 years ago

(In reply to comment #22) > For OSX intel, we should absolutely be using the architecture/etc. flags, > though perhaps not tree-vectorize -- we know the minimum architecture that will > be supported there. Trying gcc 4.4 with -O3 may also be interesting. I thought mac gcc already used some decent minimum arch flag? Also, josh has noted that gcc 4.2+ builds will only work on 10.5+, so we need to cross that bridge first.

Vladimir Vukicevic [:vlad] [:vladv] (needinfo me, slow to respond)

Assignee

Updated

•

16 years ago

Attachment #378794 - Flags: review?(shaver)

Serge Gautherie (:sgautherie)

Comment 24

•

15 years ago

(In reply to comment #9) > Pushed. http://hg.mozilla.org/mozilla-central/rev/26748fc758c8

Flags: in-testsuite-

Target Milestone: mozilla1.9.1 → mozilla1.9.2a1

Version: unspecified → Trunk

Serge Gautherie (:sgautherie)

Comment 25

•

15 years ago

Attached patch (Bv1) Copy it to comm-central [Checkin: Comment 27] — Details — Splinter Review

Attachment #427643 - Flags: review?(bugzilla)

Mark Banner (:standard8)

Comment 26

•

15 years ago

Comment on attachment 427643 [details] [diff] [review] (Bv1) Copy it to comm-central [Checkin: Comment 27] What's wrong with sticking this in a mailnews core bug so that we can track what we're fixing in mailnews?

Attachment #427643 - Flags: review?(bugzilla)

Serge Gautherie (:sgautherie)

Updated

•

15 years ago

Attachment #427643 - Flags: review?(bugzilla)

Mark Banner (:standard8)

Updated

•

15 years ago

Attachment #427643 - Flags: review?(bugzilla) → review+

Serge Gautherie (:sgautherie)

Comment 27

•

15 years ago

Comment on attachment 427643 [details] [diff] [review] (Bv1) Copy it to comm-central [Checkin: Comment 27] http://hg.mozilla.org/comm-central/rev/a697b40649a3

Attachment #427643 - Attachment description: (Bv1) Copy it to comm-central → (Bv1) Copy it to comm-central [Checkin: Comment 27]

Serge Gautherie (:sgautherie)

Updated

•

15 years ago

Attachment #378794 - Attachment description: use -O3 for darwin → use -O3 for darwin [Checkin: Comment 24]

Mike Shaver (:shaver emeritus)

Updated

•

15 years ago

Flags: wanted1.9.1.x?

BMO Automation

Updated

•

7 years ago

Product: Core → Firefox Build System

You need to log in before you can comment on or make changes to this bug.