Use -O3 -fomit-frame-pointer instead of -Os on Mac OS X

RESOLVED FIXED

Status


Core
JavaScript Engine
P1
normal
RESOLVED FIXED
9 years ago
7 years ago

People

(Reporter: gal, Assigned: Paul Biggar)

Tracking

Trunk
x86
Mac OS X
Bug Flags:
blocking1.9.2 -

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: fixed-in-tracemonkey)

Attachments

(1 attachment)

(Reporter)

Description

9 years ago
We get a 5.5% (50ms) speedup with gcc on MacOSX when using -O3 and -fomit-frame-pointer instead of -Os. Peak speedup is 16% for md5. Everything seems to get faster. spectral-norm is probably a measurement outlier.
(Reporter)

Comment 1

9 years ago
TEST                   COMPARISON            FROM                 TO             DETAILS

=============================================================================

** TOTAL **:           1.055x as fast    985.8ms +/- 0.7%   934.0ms +/- 0.2%     significant

=============================================================================

  3d:                  1.067x as fast    149.1ms +/- 0.5%   139.7ms +/- 0.5%     significant
    cube:              1.047x as fast     44.1ms +/- 0.8%    42.1ms +/- 1.6%     significant
    morph:             1.040x as fast     30.2ms +/- 0.5%    29.1ms +/- 0.5%     significant
    raytrace:          1.092x as fast     74.8ms +/- 0.7%    68.5ms +/- 0.4%     significant

  access:              1.054x as fast    136.4ms +/- 1.0%   129.4ms +/- 0.3%     significant
    binary-trees:      1.055x as fast     41.1ms +/- 0.3%    38.9ms +/- 0.4%     significant
    fannkuch:          1.053x as fast     56.4ms +/- 1.3%    53.5ms +/- 0.3%     significant
    nbody:             1.050x as fast     25.8ms +/- 1.3%    24.6ms +/- 0.7%     significant
    nsieve:            1.061x as fast     13.2ms +/- 2.2%    12.4ms +/- 1.1%     significant

  bitops:              1.094x as fast     37.6ms +/- 7.9%    34.4ms +/- 0.7%     significant
    3bit-bits-in-byte: -                   1.7ms +/- 7.7%     1.6ms +/- 9.1% 
    bits-in-byte:      -                   9.4ms +/- 28.6%     7.8ms +/- 1.6% 
    bitwise-and:       1.084x as fast      2.8ms +/- 4.2%     2.6ms +/- 5.3%     significant
    nsieve-bits:       1.054x as fast     23.6ms +/- 1.2%    22.4ms +/- 0.6%     significant

  controlflow:         1.010x as fast     33.1ms +/- 0.8%    32.7ms +/- 0.4%     significant
    recursive:         1.010x as fast     33.1ms +/- 0.8%    32.7ms +/- 0.4%     significant

  crypto:              1.118x as fast     61.2ms +/- 0.5%    54.7ms +/- 0.5%     significant
    aes:               1.108x as fast     36.7ms +/- 0.8%    33.1ms +/- 0.7%     significant
    md5:               1.164x as fast     16.1ms +/- 0.5%    13.8ms +/- 0.8%     significant
    sha1:              1.082x as fast      8.4ms +/- 1.7%     7.8ms +/- 1.7%     significant

  date:                1.063x as fast    138.9ms +/- 0.3%   130.7ms +/- 0.2%     significant
    format-tofte:      1.059x as fast     68.0ms +/- 0.3%    64.2ms +/- 0.2%     significant
    format-xparb:      1.066x as fast     70.9ms +/- 0.3%    66.5ms +/- 0.2%     significant

  math:                -                  39.5ms +/- 5.8%    38.6ms +/- 0.5% 
    cordic:            -                  20.0ms +/- 11.3%    19.1ms +/- 0.6% 
    partial-sums:      1.030x as fast     13.6ms +/- 1.0%    13.2ms +/- 0.9%     significant
    spectral-norm:     *1.061x as slow*    5.9ms +/- 1.1%     6.3ms +/- 2.1%     significant

  regexp:              -                  45.6ms +/- 0.4%    45.4ms +/- 0.4% 
    dna:               -                  45.6ms +/- 0.4%    45.4ms +/- 0.4% 

  string:              1.049x as fast    344.4ms +/- 0.2%   328.4ms +/- 0.1%     significant
    base64:            1.042x as fast     17.0ms +/- 0.7%    16.3ms +/- 0.8%     significant
    fasta:             1.082x as fast     76.1ms +/- 0.2%    70.3ms +/- 0.3%     significant
    tagcloud:          1.055x as fast    102.1ms +/- 0.2%    96.8ms +/- 0.2%     significant
    unpack-code:       1.021x as fast    118.0ms +/- 0.3%   115.6ms +/- 0.2%     significant
    validate-input:    1.064x as fast     31.3ms +/- 0.8%    29.4ms +/- 0.5%     significant
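
For anyone re-deriving the COMPARISON column: the ratio appears to be simply FROM divided by TO. Below is a minimal Python sketch of that arithmetic. The function name is mine and the figures are copied by hand from the TOTAL and md5 rows; none of this is part of the SunSpider harness. Ratios recomputed from the rounded means shown in the table can differ in the last digit from what the harness prints, since it presumably works from unrounded samples.

```python
def speedup(from_ms, to_ms):
    """Return how many times faster the TO run is than the FROM run."""
    return from_ms / to_ms

# Figures copied from the ** TOTAL ** and md5 rows of the table above.
total = speedup(985.8, 934.0)
md5 = speedup(16.1, 13.8)
saved_ms = 985.8 - 934.0  # absolute time saved across the whole suite

print(f"total: {total:.3f}x as fast, {saved_ms:.1f}ms saved")
print(f"md5:   {md5:.3f}x as fast")
```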
Summary: TM: Use -O3 -fomit-frame-pointer instead of -Os → Use -O3 -fomit-frame-pointer instead of -Os

Updated

9 years ago
Flags: blocking1.9.2+
(Reporter)

Comment 2

9 years ago
Ted, how are we doing with breakpad and omit-frame-pointer?
Holy bl33p!

Ted, I'll help hack breakpad if it comes to that (gdb/gcc hacker alum, retired undefeated, 18 tko 2 ko 1992-1995 MicroUnity ;-).

/be
(In reply to comment #2)
> Ted, how are we doing with breakpad and omit-frame-pointer?

We're not doing anything, I'm not working on it. See bug 464750. Happy to help someone with it, but I don't know enough about DWARF to do the work myself.
Depends on: 464750
Is Google doing anything upstream?  I'd be surprised if the Chrome-on-Mac team was willing to eat the frame-pointer cost, at least in the long term.
Who knows? They don't talk about what they're working on, they just show up with patches.
Do you know anyone there who we could ask?  Are they building Chrome-on-Mac with -fomit-frame-pointer?
More or less dup of bug 484275?
(In reply to comment #7)
> Do you know anyone there who we could ask?  Are they building Chrome-on-Mac
> with -fomit-frame-pointer?

I asked mento today. He's no longer actively working on breakpad but has his ear to the ground. He is not aware of any plans to support -fomit-frame-pointer. I think we're going to have to do that work if we want this win.

Comment 10

9 years ago
Mozilla has two folks (Graydon Hoare and myself) who are interested in doing it.  It's on the schedule.
We switched to -O3 in bug 494095.

Comment 12

9 years ago
Does -fomit-frame-pointer really work for you? If I build a Mozilla application with CFLAGS and CXXFLAGS=-fomit-frame-pointer (in my case Thunderbird), I get an application that won't start (it crashes at startup) every time. This happens on PPC and Intel, on Mac OS X 10.5, with Apple's gcc 4.0 and 4.2. I've tried this several times now.
(Reporter)

Comment 13

9 years ago
I only built the JS VM with -fomit-frame-pointer and that seems to work pretty well.

Comment 14

9 years ago
IIRC, -fomit-frame-pointer will break xptcall, which is why it'd be fine for a standalone SpiderMonkey build but, on the whole, a disaster for Gecko.
If we are going to do this, it must go in the beta.  Also, assigning to Sayre.
Assignee: general → sayrer
Priority: -- → P1
Seems like it'd be tough to get this into the beta at this point, unless Jim is basically done with bug 464750.

Comment 17

8 years ago
I'm not basically done with 464750; been giving priority to ES5 strict mode.  There's still substantial work needed there.

Updated

8 years ago
Depends on: 517832
Doesn't look like this is going to make 1.9.2.
Flags: blocking1.9.2+ → blocking1.9.2-

Comment 19

8 years ago
For what it's worth: I've broken down 464750 into tasks, and tried to get the dependencies right.  So if we felt this sufficiently high-priority to warrant assigning more than one person to the task, there are opportunities for parallelism there.  In addition to myself, Graydon Hoare is familiar with the appropriate parts of DWARF, as is (I think) Julian Seward.

Comment 20

8 years ago
This can be enabled on Linux now. I'm still working on the Mac dumper.
(Reporter)

Comment 21

8 years ago
++jimb

Comment 22

8 years ago
(In reply to comment #20)
> This can be enabled on Linux now. I'm still working on the Mac dumper.

Um, to be clear: we need to import the new upstream sources, and then it can be enabled on Linux.
Depends on: 554024

Comment 23

8 years ago
The upstream sources have been imported (bug 548113).  We need to get the Socorro processor code updated (bug
Depends on: 554019

Comment 24

8 years ago
Um, I was saying... We need to get the Socorro processor code updated (bug 554019) and switch to DWARF on Linux (bug 554024), before we can do this.
Status: NEW → ASSIGNED

Comment 25

8 years ago
One consequence of this change will be that crashes like this one:

http://crash-stats.mozilla.com/report/index/d751e554-9356-4b76-bd2e-7e6de2100322

for which we are unable to find symbol files, will probably become less reliable. At the moment, we can trace through those frames using the saved frame pointers, but those will be gone.  We will probably fall back on scanning the stack for words that look like return addresses; Ted says this works well enough on Windows, with caveats.
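
To make the difference between the two unwinding strategies concrete, here is a toy Python model. It is only a sketch under stated assumptions and is not Breakpad's implementation: the stack is a dict of 4-byte slots, the addresses and code range are made up, and both function names are hypothetical. The first function follows the saved-frame-pointer chain (what works today without symbols); the second scans every stack word and keeps those that fall inside a known code range (the fallback once frame pointers are gone).

```python
WORD = 4  # assumed 32-bit x86 stack slot size

def walk_frame_pointers(memory, fp):
    """Follow the chain: [fp] holds the caller's fp, [fp + WORD] the return address."""
    frames = []
    while fp != 0 and fp in memory and fp + WORD in memory:
        frames.append(memory[fp + WORD])
        fp = memory[fp]
    return frames

def scan_for_return_addresses(memory, sp, stack_top, code_ranges):
    """Heuristic: treat any stack word inside a known code range as a return address."""
    found = []
    for addr in range(sp, stack_top, WORD):
        word = memory.get(addr, 0)
        if any(lo <= word < hi for lo, hi in code_ranges):
            found.append(word)
    return found

# Made-up stack contents; the code segment is assumed to span 0x08048000-0x08049000.
memory = {
    0x1000: 7,           # a local variable
    0x1004: 0x08048999,  # stale data that merely looks like a code address
    0x1008: 0x1018,      # saved frame pointer of the caller
    0x100C: 0x08048123,  # return address into the caller
    0x1010: 42,
    0x1014: 0xDEADBEEF,
    0x1018: 0,           # end of the frame-pointer chain
    0x101C: 0x08048456,  # return address into the outermost frame
}
code_ranges = [(0x08048000, 0x08049000)]

fp_frames = walk_frame_pointers(memory, 0x1008)
scanned = scan_for_return_addresses(memory, 0x1000, 0x1020, code_ranges)
print([hex(a) for a in fp_frames])
print([hex(a) for a in scanned])
```

In this toy setup the scan recovers both real return addresses but also reports the stale word at 0x1004 as a frame, which is the kind of caveat that stack scanning carries.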

Updated

8 years ago
Summary: Use -O3 -fomit-frame-pointer instead of -Os → Use -O3 -fomit-frame-pointer instead of -Os on Mac OS X

Updated

8 years ago
Blocks: 554364

Updated

8 years ago
No longer blocks: 554364
(Assignee)

Comment 26

7 years ago
Created attachment 465234 [details] [diff] [review]
Enable -fomit-frame-pointer on OS X

According to the comment in Makefile.in, it looks like we're waiting on bug 517832, now fixed, to enable -fomit-frame-pointer on mac.

Here's an untested patch for it.
(Assignee)

Comment 27

7 years ago
Comment on attachment 465234 [details] [diff] [review]
Enable -fomit-frame-pointer on OS X

I've tested and timed this on buildmonkey-left. No errors with ref- or trace-tests in optimized build.

The timing results are pretty good. On SS, it's a 3% improvement overall: most tests improve by 3-4%, a few are unchanged, none regress, and fannkuch improves by 8%. On v8 there's not much change: a 6% improvement on crypto, a 2% slowdown on earley-boyer, and a 3% improvement on regexp.
Attachment #465234 - Flags: review?(sayrer)
Comment on attachment 465234 [details] [diff] [review]
Enable -fomit-frame-pointer on OS X

Please also put this into the default optimization flags in configure:
http://mxr.mozilla.org/mozilla-central/source/js/src/configure.in#1888
We also enabled it for the entire browser build on Linux, and we should do the same on OS X:
http://mxr.mozilla.org/mozilla-central/source/configure.in#2196
(I'd like to clear out all these MODULE_OPTIMIZE_FLAGS out of js/src/Makefile since JS has its own configure nowadays, that's bug 464328.)
Attachment #465234 - Flags: review?(sayrer) → review-
Assignee: sayrer → pbiggar
(Assignee)

Comment 29

7 years ago
(In reply to comment #28)
> Comment on attachment 465234 [details] [diff] [review]
> Enable -fomit-frame-pointer on OS X
> 
> Please also put this into the default optimization flags in configure:
> http://mxr.mozilla.org/mozilla-central/source/js/src/configure.in#1888

What's the interaction here? This stuff is a bit of a mess, and I don't understand which takes precedence and why.


> we also enabled it for the entire browser build on Linux, we should do the same
> on OS X
> http://mxr.mozilla.org/mozilla-central/source/configure.in#2196
> (I'd like to clear out all these MODULE_OPTIMIZE_FLAGS out of js/src/Makefile
> since JS has its own configure nowadays, that's bug 464328.)

This assumes it will work and is a net win, which I can't guarantee. I've only tested the JS module.
(In reply to comment #29)
> What's the interaction here? This stuff is a bit of a mess, and I don't
> understand which takes precedence and why.

MODULE_OPTIMIZE_FLAGS override the global settings on a per-Makefile basis. The JS makefile is littered with them because it used to not have its own configure file where it could set the defaults it wanted where they differed from the top-level browser Makefile. Now that it has one, defaults should live there instead.
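
As a rough illustration of that precedence, here is a small Python sketch. It assumes the per-Makefile setting replaces the global default outright rather than being appended to it; the function name and the flag values are hypothetical and chosen only to mirror this bug, not taken from the actual build system.

```python
def effective_optimize_flags(global_flags, module_flags=None):
    """Pick the optimization flags used for one directory.

    Assumption: a module-level setting, when present, replaces the
    global default entirely instead of being merged with it.
    """
    return module_flags if module_flags is not None else global_flags

# Hypothetical values for illustration only.
GLOBAL_DEFAULT = ["-Os"]
js_flags = effective_optimize_flags(GLOBAL_DEFAULT, ["-O3", "-fomit-frame-pointer"])
other_flags = effective_optimize_flags(GLOBAL_DEFAULT)

print("js/src:", " ".join(js_flags))
print("elsewhere:", " ".join(other_flags))
```

Under this model, moving the defaults into js/src's own configure would make the module-level override unnecessary, since the "global" value for that tree would already be the one JS wants.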

> This assumes it will work and is a net win, which I can't guarantee. I've only
> tested the JS module.

What does it matter? Just fix it globally and if it sucks we can back it out like we would with your patch. We've enabled it globally on Linux, I'm fairly certain we'll see a win on Mac as well.

Comment 31

7 years ago
We should file a separate bug for enabling -O3 on Linux.

Comment 32

7 years ago
Comment on attachment 465234 [details] [diff] [review]
Enable -fomit-frame-pointer on OS X

We're going to take this on TM. Global build changes should not come in TM merges.

I filed bug 590179 to enable -fomit-frame-pointer globally on Mac.

I filed bug 590181 to switch Linux to -O3.
Attachment #465234 - Flags: review+

Comment 33

7 years ago
http://hg.mozilla.org/tracemonkey/rev/e2c21045e316
Whiteboard: fixed-in-tracemonkey
(Assignee)

Comment 34

7 years ago
(In reply to comment #30)
> > What's the interaction here? This stuff is a bit of a mess, and I don't
> > understand which takes precedence and why.
> 
> MODULE_OPTIMIZE_FLAGS override the global settings on a per-Makefile basis. The
> JS makefile is littered with them because it used to not have its own configure
> file where it could set the defaults it wanted where they differed from the
> top-level browser Makefile. Now that it has one, defaults should live there
> instead.

I wouldn't be happy to do this now. js/src/configure.in is a mess, and I think the simplest thing for it is to keep it as close to the root configure.in as we can. We're already fighting a losing battle there, and MOZ_OPTIMIZE_FLAGS is already diverging, but I think adding more local changes to js/src/configure.in makes it worse, not better.

I think the best outcome is to commit this change now, close the bug, and leave fixing the mess in configure.in to another bug.
I'm assuming that bug 590179 should update js/src/configure.in too, right?  At that point, should I touch the MODULE_OPTIMIZE_FLAGS here in any way?  Or just leave it be?
Yeah, it should update both configures. Leave the Makefile be, we'll clean it up in another bug.

Comment 37

7 years ago
http://hg.mozilla.org/mozilla-central/rev/e2c21045e316
Status: ASSIGNED → RESOLVED
Last Resolved: 7 years ago
Resolution: --- → FIXED