Closed Bug 590181 Opened 14 years ago Closed 13 years ago

Use smarter optimization flags on Linux

Categories

(Firefox Build System :: General, defect)

x86
Linux
defect
Not set
normal

Tracking

(blocking2.0 -)

RESOLVED FIXED
mozilla6
Tracking Status
blocking2.0 --- -

People

(Reporter: sayrer, Assigned: glandium)

References

Details

Attachments

(2 files, 13 obsolete files)

8.56 KB, patch
glandium
: review+
Details | Diff | Splinter Review
2.73 KB, patch
glandium
: review+
Details | Diff | Splinter Review
As noted in bug 492688, Linux should build with -O3. We're switching gcc versions right now, so it's probably a bad time to experiment.
Do we have numbers?  The only numbers I see in that bug are for Mac.
(In reply to comment #1)
> Do we have numbers?  The only numbers I see in that bug are for Mac.

O2 has a huge win for me on Linux too. I'd be happy to switch to either O2/O3. The only hurt comes from startup(cos paging in huge binaries is inefficient as hell, but that just means I need to deploy icegrind sooner)

Bloat effects of high -O-levels are offset by PGO compiling cold code in -Os manner. Unfortunately 4.5 is still a bit too fail-prone for our purposes, switching to 4.4 might be safer(to get PGO).
Summary: Switch Linux to -O3 → Use smarter compile flags on Linux
Taras, do you have numbers offhand for how much of a startup hit -O3 is on Linux?

Perhaps there's a middle ground in terms of flags if we can't use -O3.
Our previous -Os -freorder... flags were a compromise between -Os and -O2 for older GCC versions (I think 4.2 at the time).
(In reply to comment #4)
> Taras, do you have numbers offhand for how much of a startup hit -O3 is on
> Linux?
> 
> Perhaps there's a middle ground in terms of flags if we can't use -O3.

I think we should just take the startup hit on Linux. The hit is proportional to size difference in libxul. The Chrome binary is already a few times bigger than ours, we are in good shape in that regard.
The hit is due to lack optimizations in kernel + linker. We can address those later.
Wow, -O3 is  a lot faster than -Os -falign-the-world.  I'm seeing a 2x improvement in some parts of my benchmark.
Attached patch Patch v1 (obsolete) — Splinter Review
Patch.  Will push to try.

This will only apply cleanly on top of my --enable-profiling patch (bug 592923), but you get the idea.
-O3 will pretty much win by definition on microbenchmarks, as long as they're micro enough.

The question is what whole-system perf looks like.
(In reply to comment #9)
> -O3 will pretty much win by definition on microbenchmarks, as long as they're
> micro enough.

Sure.  But my microbenchmark is the hot path for an awesomebar search.  :)

I'll post talos results once they're up.
blocking2.0: --- → ?
Talos results from Linux-64 are up.  Linux-32 should be up in an hour or so; there was an ifra failure on my last push.

http://bit.ly/b1fqlT

This is good for a 9% improvement on Drameo DOM, but is an 8% TS regression.
It's weird to see a regression on Ts. Try with -O2. I was only expecting Ts_cold to regress.
Sorry, yes.  ts_cold regressed by 8%, but ts improved by 1.6%.
Linux 32 results are up.  Same link as above.

11% ts_cold regression, 4% ts improvement, 13.5% Dromaeo DOM improvement.
That sounds like the Ts_cold effects are as expected, as are the microbenchmark results.  The Ts win is nice.
Attachment #472808 - Flags: review?(ted.mielczarek)
So do we want to take this?
(In reply to comment #16)
> So do we want to take this?

yes
Assignee: nobody → justin.lebar+bug
Comment on attachment 472808 [details] [diff] [review]
Patch v1

># HG changeset patch
># Parent 49001b53c9888896888c2b6357a610a441e257fe
>
>diff --git a/configure.in b/configure.in
>--- a/configure.in
>+++ b/configure.in
>@@ -2210,23 +2210,25 @@ ia64*-hpux*)
>         # while; Intel recommends against using it.
>         MOZ_OPTIMIZE_FLAGS="-O2"
>         MOZ_DEBUG_FLAGS="-g"
>     elif test "$GNU_CC" -o "$GNU_CXX"; then
>         GCC_VERSION=`$CC -v 2>&1 | awk '/^gcc version/ { print $3 }'`
>         case $GCC_VERSION in
>         4.1.*|4.2.*|4.5.*)
>             # -Os is broken on gcc 4.1.x 4.2.x, 4.5.x we need to tweak it to get good results.
>+            # XXX is this still needed?
>             MOZ_OPTIMIZE_SIZE_TWEAK="-finline-limit=50"

Without evidence to the contrary I guess this stays. Feel free to disprove! (But the comment is unnecessary.)
Attachment #472808 - Flags: review?(ted.mielczarek) → review+
Comment on attachment 472808 [details] [diff] [review]
Patch v1

>         case $GCC_VERSION in
>         4.1.*|4.2.*|4.5.*)
>             # -Os is broken on gcc 4.1.x 4.2.x, 4.5.x we need to tweak it to get good results.
>+            # XXX is this still needed?

I don't think so. Old compilers can barely compile moz as is. 4.5 doesn't seem to be helped enough. Since we are switching away from -Os anyway, lets drop this.

>             MOZ_OPTIMIZE_SIZE_TWEAK="-finline-limit=50"
>         esac
>         # If we're building with --enable-profiling, we need a frame pointer.
>+        MOZ_OPTIMIZE_FLAGS="-O3 $MOZ_OPTIMIZE_SIZE_TWEAK"
>         if test -z "$MOZ_PROFILING"; then
>-            MOZ_OPTIMIZE_FLAGS="-Os -freorder-blocks -fomit-frame-pointer $MOZ_OPTIMIZE_SIZE_TWEAK"
>+            MOZ_OPTIMIZE_FLAGS+="-fomit-frame-pointer"
>         else
>-            MOZ_OPTIMIZE_FLAGS="-Os -freorder-blocks -fno-omit-frame-pointer $MOZ_OPTIMIZE_SIZE_TWEAK"
>+            MOZ_OPTIMIZE_FLAGS+="-fno-omit-frame-pointer"

Was -Os intentionally not replaced with -O3 here? -freorder-blocks is useful for pgo, don't drop that.
Also, I think -Os might still be a win on arm for us. We'll have to see if regresses mobile :)
I'm confused.  Are you sure you're reading this diff correctly?  The patched version doesn't have -Os:

>             MOZ_OPTIMIZE_SIZE_TWEAK="-finline-limit=50"
>         esac
>         # If we're building with --enable-profiling, we need a frame pointer.
>         MOZ_OPTIMIZE_FLAGS="-O3 $MOZ_OPTIMIZE_SIZE_TWEAK"
>         if test -z "$MOZ_PROFILING"; then
>             MOZ_OPTIMIZE_FLAGS+="-fomit-frame-pointer"
>         else
>             MOZ_OPTIMIZE_FLAGS+="-fno-omit-frame-pointer"

> Also, I think -Os might still be a win on arm for us. We'll have to see if
> regresses mobile :)

I left the Android flags as-is -- I think these are just for desktop Linux.  That said, I'd be Very Happy to switch all our -Os flags to -O3.
We should loop the android devs in before we do that, the cost of increasing code size might be higher there.
When you say "Desktop Linux" you probably mean "Desktop Linux and Maemo".
(In reply to comment #23)
> When you say "Desktop Linux" you probably mean "Desktop Linux and Maemo".

On IRC, stuart and stechz say they think this is worth a try on Maemo.  Should be easy enough to disable if it breaks anything.
Attached patch Part 2 - Enable -O3 (obsolete) — Splinter Review
Rebasing onto top (as opposed to my --enable-profiling patch), and removing MOZ_OPTIMIZE_SIZE_TWEAK.

Also making the same change to js/src/configure.in.  It's not so important there since they override these flags in js/src/Makefile.in.

Ted, since I changed just about every line in this patch, would you mind having another look?
Attachment #472808 - Attachment is obsolete: true
Attachment #474135 - Flags: review?(ted.mielczarek)
Comment on attachment 474135 [details] [diff] [review]
Part 2 - Enable -O3

>-        MOZ_OPTIMIZE_FLAGS="-Os -freorder-blocks -fomit-frame-pointer $MOZ_OPTIMIZE_SIZE_TWEAK"
>+        MOZ_OPTIMIZE_FLAGS="-O3 -fomit-frame-pointer"

Keep -freorder-blocks, it's helpful for pgo.
(In reply to comment #26)
> Comment on attachment 474135 [details] [diff] [review]
> Patch v2
> 
> >-        MOZ_OPTIMIZE_FLAGS="-Os -freorder-blocks -fomit-frame-pointer $MOZ_OPTIMIZE_SIZE_TWEAK"
> >+        MOZ_OPTIMIZE_FLAGS="-O3 -fomit-frame-pointer"
> 
> Keep -freorder-blocks, it's helpful for pgo.

-O2 and -O3 imply -freorder-blocks.

http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
Comment on attachment 474135 [details] [diff] [review]
Part 2 - Enable -O3

r+ if you make taras' requested change.
Attachment #474135 - Flags: review?(ted.mielczarek) → review+
> -O2 and -O3 imply -freorder-blocks.

great.
(In reply to comment #27)
> > Keep -freorder-blocks, it's helpful for pgo.
> 
> -O2 and -O3 imply -freorder-blocks.
> 
> http://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html

Wouldn't it be worth adding a comment about -freorder-blocks for the case -Os gets better and reinstated in the future?
Landed, turned two Linux-32 boxes orange, bug 578880, backed out.

Mobile people: You'll get your Talos runs for free.  :)
This also caused some apparently-legitimate oranges (that either didn't happen on try or went unnoticed);

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1284151393.1284152081.21293.gz

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1284151393.1284152080.21226.gz

...guess there's more work to be done.
57622 ERROR TEST-UNEXPECTED-FAIL | /tests/layout/style/test/test_transitions_per_property.html | interpolation of transitions: none to rotate(90deg) translate(20%, 20%) rotate(-90deg) - got "matrix(1, 0, 1.28002e-8, 1, 12.9px, 8.05px)", expected "matrix(1, 0, 1.05316e-8, 1, 12.9px, 8.05px)"
57623 ERROR TEST-UNEXPECTED-FAIL | /tests/layout/style/test/test_transitions_per_property.html | interpolation of transitions: none to rotate(-90deg) translate(20%, 20%) rotate(90deg) - got "matrix(1, 0, -1.28002e-8, 1, 14.8167px, -3.43333px)", expected "matrix(1, 0, -1.05316e-8, 1, 14.8167px, -3.43333px)"

REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/mozilla-central_fedora_test-reftest/build/reftest/tests/layout/reftests/css-calc/border-radius-1.html |
dbaron, this looks like rounding error to me.  From the test's source, it looks like these tests are set up to require exact matches.  Do you think that's right?
It is interesting that -O3 affects how we do math, though.  Maybe something is getting auto-vectorized.
All the differences in the rounded corners test appear to be off-by-one.  Looks like rounding, at least in this case.
I just pushed to try with -O2.  We'll see how that goes.
Interesting.  I get the same interpolation error (comment 33) with -O2.  No word yet on the border-radius reftest.
Linux x64 builds still fail make-check (bug 578880) with -O2.  I thought this was intermittent, but it's apparently not.
Comment on attachment 474135 [details] [diff] [review]
Part 2 - Enable -O3

>diff --git a/js/src/Makefile.in b/js/src/Makefile.in

>-MODULE_OPTIMIZE_FLAGS = -O3 -fstrict-aliasing -fomit-frame-pointer $(MOZ_OPTIMIZE_SIZE_TWEAK)
>+MODULE_OPTIMIZE_FLAGS = -O3 -fstrict-aliasing -fomit-frame-pointer
> # Special optimization flags for jsinterp.c
> INTERP_OPTIMIZER = -O3 -fstrict-aliasing

I think we want INTERP_OPTIMIZER to be the same as MODULE_OPTIMIZE_FLAGS now. Looking at blame, it was originally added to override -Os with -O3, which is now unnecessary. And -fomit-frame-pointer was never added to it, which is bad. I'll have numbers in a bit.

Regardless, we definitely want to remove $(MOZ_OPTIMIZE_SIZE_TWEAK) from the js/src flags, because that only made sense with -Os! If this patch doesn't land again soon, we should split out that change and land it separately. (Getting numbers here too.)

>diff --git a/xpcom/io/Makefile.in b/xpcom/io/Makefile.in

> # work around bug 408258
> ifdef GNU_CC 
> ifneq ($(OS_ARCH), Darwin)
>-MODULE_OPTIMIZE_FLAGS = -Os -fno-strict-aliasing $(MOZ_OPTIMIZE_SIZE_TWEAK)
>+MODULE_OPTIMIZE_FLAGS = -Os -fno-strict-aliasing

I think you want -O3 and -fomit-frame-pointer here too... based on blame, the relevant bit here is -fno-strict-aliasing.
My patch in bug 593116 fixes the js/src/Makefile.in issues.

I was going to switch xpcom/io in a followup bug, but I'm happy to do it here.

We have -fno-strict-aliasing on globally (bug 414641) so we could just kill xpcom/io's special flags.
Since this bug is currently blocked (test failures aside, we're still waiting on bug 578880), I'll see if I can make some progress on bug 593116.
(In reply to comment #41)
> My patch in bug 593116 fixes the js/src/Makefile.in issues.

Cool.

> I was going to switch xpcom/io in a followup bug, but I'm happy to do it here.

Yeah, might as well do it here. Less work. :)

> We have -fno-strict-aliasing on globally (bug 414641) so we could just kill
> xpcom/io's special flags.

Hmm. Your patch there doesn't touch xpcom/io, so if we kill the override, we'll need a fix there. Anyway, since your patch there makes aliasing warnings errors, we'll be forced to fix it, so killing the special flag seems fine to me.

dvander kindly ran perf tests for removing $(MOZ_OPTIMIZE_SIZE_TWEAK) from the js/src flags (with the new -O3). It was neutral, so no harm removing it. Removing INTERP_OPTIMIZER (i.e. adding -fomit-frame-pointer) *might* have won 1-2% on V8. So we should definitely do that.
If http://bit.ly/aXat4C is to be believed, -O2 is significantly slower than -O3, pretty much across the board.  Looks like -O2 isn't even a clear ts_cold win.
Hmmm...I can't reproduce the border-radius failure using -O3 on Linux x64 with GCC 4.4.  I'll try with gcc 4.3 now.
If you confirm you see it on 4.3 but not 4.5, we can just block this on 4.5 deployment. :)
I can't reproduce with gcc 4.3 either.  Maybe this is a 32- vs 64-bit issue.  I'll try that now.
My own 32-bit builds with gcc 4.3 succeed, but I've verified that tryserver's build fails when I run it myself.
I'm using gcc 4.3.4 locally, and the build machines are using 4.3.3.  I'm going to see if I can dig up a 4.3.3 binary to see if that's the problem.
By the way. You can also try setting CXX to CXX="/tools/gcc-4.5/bin/g++ -static-libstdc++" and CC to /tools/gcc-4.5/bin/gcc to test 4.5.1
Anyone have gcc 4.3.3 on Linux x86-32 handy?  I'm trying to spin a VM, but Ubuntu is being difficult.
Just finished my build with 4.3.3; I don't get an error on the border-radius reftest.

jlebar@jaunty-vm:~/code/moz/ff-1$ c++ --version
c++ (Ubuntu 4.3.3-5ubuntu4) 4.3.3

I can try following Ted's steps and get the exact GCC they have on the build machines.  It might actually be easier (and more informative) to update the GCC on one of the builds machines, though.
Seeing the same issue with GCC 4.5.

Something very strange is going on here.  By far the simplest explanation is that I'm doing something wrong, and that things are not behaving as I think they are.  I'm going to double-check everything in the morning.
Attached patch Part 1a - Test fixes (obsolete) — Splinter Review
IRL dbaron and I decided to just accept the floating point differences we were seeing and modify the tests that broke.

This patch fixes the build I got from try when I run it locally.  I'll push to try.

We're still blocked on bug 578880.
Attachment #476113 - Flags: review?(dbaron)
Comment on attachment 476113 [details] [diff] [review]
Part 1a - Test fixes

r=dbaron
Attachment #476113 - Flags: review?(dbaron) → review+
This looks good on try, modulo bug 578880.
Depends on: gcc4.5
blocking2.0: ? → -
I get more test failures with gcc 4.5 plus -O3.  On Linux-32 (not -64):

61343 ERROR TEST-UNEXPECTED-FAIL | /tests/layout/style/test/test_transitions_per_property.html | radius-valued property border-bottom-left-radius: interpolation of radius with mixed units - got "21.9833px 22.9833px", expected "22px 23px"
61361 ERROR TEST-UNEXPECTED-FAIL | /tests/layout/style/test/test_transitions_per_property.html | radius-valued property border-bottom-right-radius: interpolation of radius with mixed units - got "21.9833px 22.9833px", expected "22px 23px"
61379 ERROR TEST-UNEXPECTED-FAIL | /tests/layout/style/test/test_transitions_per_property.html | radius-valued property border-top-left-radius: interpolation of radius with mixed units - got "21.9833px 22.9833px", expected "22px 23px"
61397 ERROR TEST-UNEXPECTED-FAIL | /tests/layout/style/test/test_transitions_per_property.html | radius-valued property border-top-right-radius: interpolation of radius with mixed units - got "21.9833px 22.9833px", expected "22px 23px"
61456 ERROR TEST-UNEXPECTED-FAIL | /tests/layout/style/test/test_transitions_per_property.html | radius-valued property -moz-outline-radius-bottomleft: interpolation of radius with mixed units - got "21.9833px 22.9833px", expected "22px 23px"
61474 ERROR TEST-UNEXPECTED-FAIL | /tests/layout/style/test/test_transitions_per_property.html | radius-valued property -moz-outline-radius-bottomright: interpolation of radius with mixed units - got "21.9833px 22.9833px", expected "22px 23px"
61492 ERROR TEST-UNEXPECTED-FAIL | /tests/layout/style/test/test_transitions_per_property.html | radius-valued property -moz-outline-radius-topleft: interpolation of radius with mixed units - got "21.9833px 22.9833px", expected "22px 23px"
61510 ERROR TEST-UNEXPECTED-FAIL | /tests/layout/style/test/test_transitions_per_property.html | radius-valued property -moz-outline-radius-topright: interpolation of radius with mixed units - got "21.9833px 22.9833px", expected "22px 23px"

http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1285714922.1285715549.1120.gz

I wonder why specifically transitions and border radius are sensitive to -O3.
this sucks, are fails still there in -O2?

Would be nice to figure out what's going on this week. How is talos data?
> got "21.9833px 22.9833px", expected "22px 23px"

sounds like accumulated rounding errors.
Right.  The unanswered question is, why does -O2/-O3 cause this, while -Os doesn't?   It would be good to understand the code that's responsible for computing these numbers and to understand why whatever gcc is doing is a legal optimization.
Interestingly, it doesn't fail on my machine (and yes, I tried on x86 (not -64) with the patch applied)
(In reply to comment #62)
> Interestingly, it doesn't fail on my machine (and yes, I tried on x86 (not -64)
> with the patch applied)

I should mention I tried with debian's latest gcc-4.5 (svn r164618)
(In reply to comment #62)
> Interestingly, it doesn't fail on my machine (and yes, I tried on x86 (not -64)
> with the patch applied)

/me digs a hole and hides. I was not in the right chroot... unfortunately i get "too much recursion" when starting the httpd.js server.
(In reply to comment #63)
> > Interestingly, it doesn't fail on my machine (and yes, I tried on x86 (not -64)
> > with the patch applied)

This is (or, I guess, would be, since you didn't actually run it) consistent with what we were seeing with the failures caused by gcc-4.3: Locally-compiled builds worked fine, but the builds from try (when run locally) would fail.  Weird, I know.
Hm...dbaron
... dbaron, is the test in comment 58 one where roundoff error is acceptable?  There's no fudge factor built into that code, but I could add one (or try switching to numbers that are powers of two).
Talos results: http://bit.ly/92akzb

Still waiting for full 32-bit results.  On 64-bit, see the expected Dromaeo improvements (7% Dromaeo DOM, 4.5% Dromaeo CSS.  It's also good for a small sunspider win (1.5% on 32-bit).

Interestingly, I *don't* see a ts_cold regression on either 32- or 64-bit.
Whoa.  32-bit Dromaeo results:

17% DOM improvement,
9% CSS improvement,
7% jslib improvement.
(In reply to comment #69)
> Whoa.  32-bit Dromaeo results:
> 
> 17% DOM improvement,
> 9% CSS improvement,
> 7% jslib improvement.

I think these numbers meant to go into bug 559964 :)

Should be even better with pgo.
(In reply to comment #67)
> ... dbaron, is the test in comment 58 one where roundoff error is acceptable? 
> There's no fudge factor built into that code, but I could add one (or try
> switching to numbers that are powers of two).

Maybe try switching the width and height set in test_radius_transition from 200px to 256px?
(In reply to comment #71)
> (In reply to comment #67)
> > ... dbaron, is the test in comment 58 one where roundoff error is acceptable? 
> > There's no fudge factor built into that code, but I could add one (or try
> > switching to numbers that are powers of two).
> 
> Maybe try switching the width and height set in test_radius_transition from
> 200px to 256px?

Justin, do you have time to make a patch for that?
I pushed a patch this morning to try, and it looks good on Linux x86-32.  I'll attach here in a moment.
Attached patch Part 1b - More test fixes (obsolete) — Splinter Review
Attachment #480255 - Flags: review?(dbaron)
Attachment #476113 - Attachment description: Test fixes v1 → Part 1a - Test fixes
Attachment #474135 - Attachment description: Patch v2 → Part 2 - Enable -O3
Comment on attachment 480255 [details] [diff] [review]
Part 1b - More test fixes

The middle case (of the three that you modified), you changed such that there's no animation for the y component.  Could you fix that so there's still animation for both components?  Otherwise this looks fine.
Attached patch Part 1b - More test fixes (v2) (obsolete) — Splinter Review
Like this?  Interdiff (v1 -> v2):

diff --git a/layout/style/test/test_transitions_per_property.html b/layout/style/test/test_transitions_per_property.
--- a/layout/style/test/test_transitions_per_property.html
+++ b/layout/style/test/test_transitions_per_property.html
@@ -530,30 +530,30 @@ function test_radius_transition(prop) {
   div.style.setProperty(prop, "3px 8px", "");
   is(cs.getPropertyValue(prop), "3px 8px",
      "radius-valued property " + prop + ": computed value before transition");
   div.style.setProperty("-moz-transition-property", prop, "");
   div.style.setProperty(prop, "15px 12px", "");
   is(cs.getPropertyValue(prop), "6px 9px",
      "radius-valued property " + prop + ": interpolation of radius");
   div.style.setProperty("-moz-transition-property", "none", "");
-  div.style.setProperty(prop, "12.5% 25%", "");
-  is(cs.getPropertyValue(prop), "32px 64px",
+  div.style.setProperty(prop, "12.5% 6.25%", "");
+  is(cs.getPropertyValue(prop), "32px 16px",
      "radius-valued property " + prop + ": computed value before transition");
   div.style.setProperty("-moz-transition-property", prop, "");
   div.style.setProperty(prop, "25%", "");
-  is(cs.getPropertyValue(prop), "40px 64px",
+  is(cs.getPropertyValue(prop), "40px 28px",
      "radius-valued property " + prop + ": interpolation of radius");
   div.style.setProperty("-moz-transition-property", "none", "");
   div.style.setProperty(prop, "6.25% 12.5%", "");
   is(cs.getPropertyValue(prop), "16px 32px",
      "radius-valued property " + prop + ": computed value before transition");
   div.style.setProperty("-moz-transition-property", prop, "");
-  div.style.setProperty(prop, "64px 32px", "");
-  is(cs.getPropertyValue(prop), "28px 32px",
+  div.style.setProperty(prop, "64px 16px", "");
+  is(cs.getPropertyValue(prop), "28px 28px",
      "radius-valued property " + prop + ": interpolation of radius with mixed units");
 
   test_length_percent_calc_transition(prop);
 
   div.style.removeProperty("width");
   div.style.removeProperty("height");
   div.style.removeProperty("border");
   div.style.removeProperty("padding");
Attachment #480255 - Attachment is obsolete: true
Attachment #480270 - Flags: review?(dbaron)
I think you reattached the old patch, but the interdiff looks like it fixes the problem, except you might have introduced a new one in that the very last line should now expect "28px" instead of "28px 28px".
Ah, oops -- I qref'ed the wrong patch.

Interestingly, "28px 28px" seems to work.
Attached patch Part 1b - More test fixes (v3) (obsolete) — Splinter Review
Attachment #480270 - Attachment is obsolete: true
Attachment #480270 - Flags: review?(dbaron)
Comment on attachment 480297 [details] [diff] [review]
Part 1b - More test fixes (v3)

"28px 28px" shouldn't work in my tree (post bug 595650 and bug 595651); I guess we'll see who lands first.
Attachment #480297 - Flags: review+
The new test is green on try, so this should be good to land, unless dbaron gets there first.  :)
Keywords: checkin-needed
The test fixes fail to apply in test_transitions_per_property.html.
Keywords: checkin-needed
(In reply to comment #82)
> The test fixes fail to apply in test_transitions_per_property.html.

I'll rebase once we're ready to check in.
Attached patch Part 1 - Test fixes (v4) (obsolete) — Splinter Review
Test fixes, rolled up into one patch.

dbaron, can you please review the change I made to the one check_distance() call?  Just that one line is the only substantial difference between this patch and the ones it obsoletes.

I've pushed to try: http://hg.mozilla.org/try/rev/2384e33af4a0
Attachment #476113 - Attachment is obsolete: true
Attachment #480297 - Attachment is obsolete: true
Attachment #489301 - Flags: review?(dbaron)
Same as previous part 2 patch, just now with a checkin comment.
Attachment #474135 - Attachment is obsolete: true
Comment on attachment 489301 [details] [diff] [review]
Part 1 - Test fixes (v4)

I think you're probably better off keeping the check_distance calls in sync with the values just above that they're testing.
I admit to not fully understanding this test, but I don't see what the 10% in the original call corresponds to.  Is it (25-5)/4 + 5?  That's how I got 15.625%.
check_distance expects three arguments:
 * the start of an interval
 * the value 1/4 of the way from the start to the end.  (1/2 is bad since then the test wouldn't catch bugs where start and end got swapped)
 * the end of the interval
Oh, you mean I should change the other calls to check_interval?  That I can do.
No, just change the numbers in both places where you're changing them in one.
> No, just change the numbers in both places where you're changing them in one.

I think we may be talking past each other.  Do you mean that, because I made this change:

-  div.style.setProperty(prop, "5% 15%", "");
-  is(cs.getPropertyValue(prop), "10px 30px",
+  div.style.setProperty(prop, "12.5% 6.25%", "");
+  is(cs.getPropertyValue(prop), "32px 16px",
      "radius-valued property " + prop + ": computed value before transition");
   div.style.setProperty("-moz-transition-property", prop, "");
   div.style.setProperty(prop, "25%", "");
-  is(cs.getPropertyValue(prop), "20px 35px",
+  is(cs.getPropertyValue(prop), "40px 28px",
      "radius-valued property " + prop + ": interpolation of radius");
   check_distance(prop, "5% 15%", "10% 17.5%", "25%");

I should change the arguments to the check_distance call above to ("12.5% 6.25%", "...", "25%"), where "..." meets the criteria from comment 88?  That's what I meant in comment 89.
In an effort to get this landed on schedule, Taras and I think we should take these test fixes as a followup.
That wasn't particularly clear.  I meant that we're planning to land the test fixes in this bug as-is, and then clean them up per the above comments as a followup.
(In reply to comment #91)
> > No, just change the numbers in both places where you're changing them in one.
> 
> I think we may be talking past each other.  Do you mean that, because I made
> this change:
> 
> -  div.style.setProperty(prop, "5% 15%", "");
> -  is(cs.getPropertyValue(prop), "10px 30px",
> +  div.style.setProperty(prop, "12.5% 6.25%", "");
> +  is(cs.getPropertyValue(prop), "32px 16px",
>       "radius-valued property " + prop + ": computed value before transition");
>    div.style.setProperty("-moz-transition-property", prop, "");
>    div.style.setProperty(prop, "25%", "");
> -  is(cs.getPropertyValue(prop), "20px 35px",
> +  is(cs.getPropertyValue(prop), "40px 28px",
>       "radius-valued property " + prop + ": interpolation of radius");
>    check_distance(prop, "5% 15%", "10% 17.5%", "25%");
> 
> I should change the arguments to the check_distance call above to ("12.5%
> 6.25%", "...", "25%"), where "..." meets the criteria from comment 88?  That's
> what I meant in comment 89.

yes.
Pushed to m-c:
http://hg.mozilla.org/mozilla-central/rev/d2db7be6d3b4 (-O3 switch)
http://hg.mozilla.org/mozilla-central/rev/f634808dcae9 (test fixes)
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla2.0b8
Should this bug be reopened as the patch has been backed out in http://hg.mozilla.org/mozilla-central/rev/e9e55a9460bb ?
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
So I just tried reproducing the sunspider hang on try, all of the benchmarks complete successfully.
(In reply to comment #97)
> So I just tried reproducing the sunspider hang on try, all of the benchmarks
> complete successfully.

Let's get some additional testing on these builds just to be sure.

Can you file a bug (http://bit.ly/test_request), and include the <username>-<revision> for the builds (that helps whoever is submitting the extra builds), and ask for a few extra test and talos runs?
Attached patch Part 1 - Test fixes (v5) (obsolete) — Splinter Review
Interdiff in a moment
Attachment #489301 - Attachment is obsolete: true
Attachment #491126 - Flags: review?(dbaron)
Attachment #489301 - Flags: review?(dbaron)
Attachment #491128 - Attachment mime type: application/octet-stream → text/plain
This sucks, running talos multiple times results in a hang at -O3. -O2 causes sunspider to hang too. I'm really hoping that bug 609543 is the same issue.
Comment on attachment 491126 [details] [diff] [review]
Part 1 - Test fixes (v5)

>-  check_distance(prop, "8% 12%", "-moz-calc(6% + 10px) -moz-calc(5px + 9%)",
>-                       "40px 20px");
>+  check_distance(prop, "6.25% 12.5%", "16px 4px", "64px 16px");

I don't see where the middle numbers here came from or why they should even work.  You should use the correct -moz-calc() expression.

r=dbaron with that fixed
Attachment #491126 - Flags: review?(dbaron) → review+
And once you fix that, can you undo the accuracy reduction that you made as well?
Aha.  I *finally* understand this test.  Fix in a moment.

Sorry for the runaround, dbaron.
Attachment #491126 - Attachment is obsolete: true
Attachment #491128 - Attachment is obsolete: true
Two new test failures on try which appear to be from gcc 4.5 plus -O3:

I probably won't have time to take a closer look until the weekend.

Linux64: ERROR TEST-UNEXPECTED-PASS | chrome://mochitests/content/a11y/accessible/text/test_{singleline,whitespace}.html

http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1291181801.1291183038.24230.gz

Linux32: REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/tryserver_fedora_test-reftest/build/reftest/tests/layout/reftests/bugs/368020-5.html | image comparison (==)

http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1291180522.1291181422.17207.gz
Just ran through tryserver again.  Got the same two failures as in comment 107, plus a ts_cold crash on linux-32:

http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1292478670.1292480244.6506.gz
(In reply to comment #108)
> Just ran through tryserver again.  Got the same two failures as in comment 107,
> plus a ts_cold crash on linux-32:
> 
> http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1292478670.1292480244.6506.gz

Sounds like 4.5 is too buggy at O3. Is O2 ok?
With -O2, same reftest failure on linux-32 plus a js-ctypes crash:

http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1292527302.1292528084.32576.gz

Still awaiting linux-64 results.
The reftest failure is kind of interesting.  It looks like the gradients are being cut off a pixel or two too soon at the bottom.
Hi um could the default compile flags also include: -D_FORTIFY_SOURCE=2 -Wformat -Wformat-security  -fstack-protector

Also --> -pie -fPIE 

See http://wiki.debian.org/Hardening (among other sites) for as to why these should be added.
For the default firefox - from http://www.mozilla.com/en-US/firefox/ (32bit)

 Position Independent Executable: no, normal executable!
 Stack protected: no, not found!
 Fortify Source functions: no, not found!
 Read-only relocations: no, not found!
 Immediate binding: no, not found!

Here is firefox compiled on amd64 with hardening flags enabled:
 Position Independent Executable: yes
 Stack protected: yes
 Fortify Source functions: yes
 Read-only relocations: yes
 Immediate binding: yes
Summary: Use smarter compile flags on Linux → Use smarter optimization flags on Linux
(In reply to comment #112)
> Hi um could the default compile flags also include: -D_FORTIFY_SOURCE=2
> -Wformat -Wformat-security  -fstack-protector
> 
> Also --> -pie -fPIE 
> 
> See http://wiki.debian.org/Hardening (among other sites) for as to why these
> should be added.

Could you please file a separate bug on this?  This bug's summary was pretty vague, but I've updated it.  :)
I could... but this bug seemed to be about compile flags already and I didn't want to file a duplicate bug.
This bug is specifically about optimization flags; that's why I changed the summary.  Your bug is about security flags, so it's a separate issue.
(In reply to comment #115)
> I could... but this bug seemed to be about compile flags already and I didn't
> want to file a duplicate bug.

Yes, the summary was lying :-)
No longer depends on: 578880
Depends on: 578880
Comment on attachment 474135 [details] [diff] [review]
Part 2 - Enable -O3

(bookkeeping)
Attachment #474135 - Flags: approval2.0+
Depends on: 641842
With the two patches from here applied, and building with gcc 4.5 with -static-libstdc++ on try (off revision 4866be78732f), I only get one failure:

NEXT ERROR REFTEST TEST-UNEXPECTED-FAIL | file:///home/cltbld/talos-slave/test/build/reftest/tests/layout/reftests/bugs/368020-5.html | image comparison (==)
REFTEST   IMAGE 1 (TEST): 
REFTEST   IMAGE 2 (REFERENCE): 
REFTEST number of differing pixels: 126
(In reply to comment #120)
> With the two patches from here applied, and building with gcc 4.5 with
> -static-libstdc++ on try (off revision 4866be78732f), I only get one failure:

Can you please attach the log as a text file which can be used with reftest-analyzer.xhtml?
Attached file reftest failure log (obsolete) —
Note this only fails on x86, and doesn't fail on x86-64. (so that could very well be a double precision problem)
(In reply to comment #122)
> Created attachment 519800 [details]
> reftest failure log
> 
> Note this only fails on x86, and doesn't fail on x86-64. (so that could very
> well be a double precision problem)

Looks like the height of the boxes are incorrectly calculated.  CCing layout folks.
Attached patch Additional test fix (obsolete) — Splinter Review
Reading bug 368020, it doesn't look like the use of percentage in the test case is useful for what is being tested. Replacing them with absolute sizes fix the failure. I get an all green on try with this patch + the two others on this bug with gcc 4.5 -static-libstdc++.
Attachment #519873 - Flags: review?
Attachment #519873 - Flags: review? → review?(dbaron)
Comment on attachment 519873 [details] [diff] [review]
Additional test fix

If we can't accurately compute 2% or 3% of 500px, that's a problem.  It should come out to exactly 10px and 15px; floating point error shouldn't enter the picture, since even if there's slight error we should round right back to the correct value.
Attachment #519873 - Flags: review?(dbaron) → review-
(In reply to comment #124)
> Reading bug 368020, it doesn't look like the use of percentage in the test case
> is useful for what is being tested.

Tests aren't always strictly written to test only the bug being tested.  In this case I wrote tests for percentages to make sure we don't break percentages.
Comment on attachment 519873 [details] [diff] [review]
Additional test fix

Actually, I guess this is understandable if the normal line-height comes out to exactly a half-pixel value, and then the float multiplication comes out very slightly low, so r=dbaron.
Attachment #519873 - Flags: review- → review+
BTW, when applying part 1, I noticed this in js/src/configure.in (and changed it locally):
-          MOZ_OPTIMIZE_FLAGS="-Os -freorder-blocks -fno-reorder-functions -fno-omit-frame-pointer $MOZ_OPTIMIZE_SIZE_TWEAK"
+          MOZ_OPTIMIZE_FLAGS="-Os -fno-omit-frame-pointer"

I guess that's a typo, right, Justin ?
OS: Linux → Windows Server 2003
(In reply to comment #128)
> BTW, when applying part 1

part 2, that is.
OS: Windows Server 2003 → Linux
(In reply to comment #128)
> BTW, when applying part 1, I noticed this in js/src/configure.in (and changed
> it locally):
> -          MOZ_OPTIMIZE_FLAGS="-Os -freorder-blocks -fno-reorder-functions
> -fno-omit-frame-pointer $MOZ_OPTIMIZE_SIZE_TWEAK"
> +          MOZ_OPTIMIZE_FLAGS="-Os -fno-omit-frame-pointer"
> 
> I guess that's a typo, right, Justin ?

Yes, nice catch!  It should be -03 -fno-omit-frame-pointer.
Refreshed patch. I'll take this bug to completion (or not, it appears there might be problems with -O3 + PGO).
Assignee: justin.lebar+bug → mh+mozilla
Attachment #521150 - Flags: review+
Attachment #489302 - Attachment is obsolete: true
Attachment #494276 - Attachment is obsolete: true
Attachment #519800 - Attachment is obsolete: true
Attachment #519873 - Attachment is obsolete: true
Attachment #521151 - Flags: review+
No longer depends on: gcc4.5
No longer depends on: 578880
Blocks: gcc4.5
This can safely be done before actually switching to gcc 4.5, though we'll probably do both at (roughly) the same time.
http://hg.mozilla.org/mozilla-central/rev/e4ffaa64e1fb
http://hg.mozilla.org/mozilla-central/rev/bc80c46f185d
Status: REOPENED → RESOLVED
Closed: 14 years ago13 years ago
Resolution: --- → FIXED
Target Milestone: mozilla2.0b8 → mozilla6
That certainly wasn't intentional on my part!
Depends on: 654049
This might have broken Linux crash reports.  See bug 654595.
Turned out it was PGO that busted the crash reports.  However, when we disabled PGO, we got perma-orange in 4 reftests, due to this bug here: bug 654858

Backed out "part 2" of this bug to fix the perma-orange:
 http://hg.mozilla.org/mozilla-central/rev/e5c5ea4d03ed
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 654858
OK, since that backout Mochitest-other is orange on Linux opt, both 32-bit and 64-bit.  I've closed the tree until we sort that out, since at this point I don't know how to get it to a known-good state.
bz suggested that we might need a clobber, which seems reasonable, given that my backout constituted a build config change.  Requested clobbers, and hopefully that fixes the mochitest-oth issues..
Nope, the clobber didn't fix it. (As I later noticed, the builds for my backout were actually free-space clobber builds themselves)

glandium, ball's in your court... I'm not sure what we need to do to get the tree back in a green state. :)
And for what it's worth, if I push current m-c to try I get the failures there too.
And note that the failures are unexpected passes on the a11y tests...
Note also that the a11y tests haven't changed in 3 months (since Feb 3 2011).

The unexpected passes all seem to be of the form
> "" should equal ""
> 0 should equal 0
so I don't think they're *legitimate* passes - rather, it looks like we're messing up the expected value somehow.  (The test isn't super skimmable, so I'm not absolutely sure of that, but it seems likely.)

philor pointed out that it looks like we hit this exact issue last week in bug 652459 (the log in bug 652459 comment 0 exactly matches the unexpected passes we're seeing now).  That bug mysteriously stopped being a problem -- perhaps that was when this bug here's cset landed? (and now that this is backed out, it's a problem again)
(In reply to comment #144)
> philor pointed out that it looks like we hit this exact issue last week in bug
> 652459

See in particular bug 652459 comment 3, which basically says that the GCC 4.5 upgrade caused the a11y orange.

So, AFAICT, here's the situation:
(a) gcc-4.5 upgrade causes a11y orange (bug 652459 & comment 139 here)
(b) This bug (-O3) fixes the a11y orange, but breaks reftests (bug 654858)
(c) PGO fixes the reftests, but breaks crash reports (bug 654595)

We've reverted (c) and (b) already.

IIUC, it looks like we should revert (a) in order to get back to a known-good state.
Alternately, if reverting the GCC upgrade is a hassle & if we anticipate having bug 654595 resolved within a few days so that (a)+(b)+(c) can all be applied again, then we could potentially disable these a11y tests for a few days until we get to that point. (if possible, it'd be nice to have someone from the a11y team dig in deeper to figure out what's going on)
I think we have several problems in one, and they might grant separate bugs.

- This is something I wanted to file before all this mess appears, we can't set -O3 unconditionally on all gcc versions. People may be building with older versions and may have problems as a consequence (others than those we have). (and O3 without PGO makes libxul.so 7MB bigger). So all in all, maybe we should only enable O3 if we are building with PGO.

- We really need to figure out what this a11y orange is because if it happens for us with gcc 4.5 and -Os, it means it likely happens for distros. On Debian, we've been using gcc 4.5 for a while without regressions in reftests, xpcshell-tests, crashtests, jstestbrowser, and make check, but I don't run mochitest (because they take too much time). I will try running that particular test.

- Likewise for the reftests breakage with -O3, because while currently PGO apparently optimizes the involved code at -Os, it may decide to optimize at -O3 some day.

Note that I may have a workaround for (c), which is going through try right now.
I'll add that I didn't run in the reftest regression when I first tried gcc 4.5 + -O3 without PGO on try a few weeks ago.
(In reply to comment #147)
> - We really need to figure out what this a11y orange is because if it happens
> for us with gcc 4.5 and -Os, it means it likely happens for distros. On
> Debian, we've been using gcc 4.5 for a while without regressions in reftests,
> xpcshell-tests, crashtests, jstestbrowser, and make check, but I don't run
> mochitest (because they take too much time). I will try running that
> particular test.

Note that it was also happening on tm with PGO.
We definitely need separate bugs on the three items in comment 147.
(In reply to comment #150)
> We definitely need separate bugs on the three items in comment 147.

- I reopened bug 652459 for the second item.
- We could use bug 654858 for the third item.
- As for the first one, it could either be treated in this bug or a new separate one.
And I filed bug 654975 as a possible workaround for bug 654595
Depends on: 655003
Depends on: 652459
Depends on: 655020
No longer depends on: 655020
No longer depends on: 652459
Depends on: 655020
No longer depends on: 655020
http://hg.mozilla.org/mozilla-central/rev/3abe2f8c592e
Status: REOPENED → RESOLVED
Closed: 13 years ago13 years ago
Resolution: --- → FIXED
Keywords: regression
Keywords: regression
Product: Core → Firefox Build System
You need to log in before you can comment on or make changes to this bug.