Closed Bug 53486 (O2) Opened 22 years ago Closed 17 years ago

Default linux MOZ_OPTIMIZE_FLAG is -O !!


(SeaMonkey :: Build Config, defect, P2)
(Not tracked)
(Reporter: dougt, Assigned: leaf)
(Keywords: topperf)
(6 files)

In mozilla/configure, the default optimization flag passed to gcc is -O.  This
needs a value of at least -O1, probably -O2.

Leaf, can you check if the build machines have this set in the environment?
Severity: normal → critical
Keywords: nsbeta3
I think cls was looking at this. I just finished a build with -O2 that
worked...trying -O3 next...
verification builds, and I believe tinderboxen, use -O
From egcs docs...

Options That Control Optimization

   These options control various sorts of optimizations:

`-O'
`-O1'
     Optimize.  Optimizing compilation takes somewhat more time, and a
     lot more memory for a large function.

     Without `-O', the compiler's goal is to reduce the cost of
     compilation and to make debugging produce the expected results.
     Statements are independent: if you stop the program with a
     breakpoint between statements, you can then assign a new value to
     any variable or change the program counter to any other statement
     in the function and get exactly the results you would expect from
     the source code.

     Without `-O', the compiler only allocates variables declared
     `register' in registers.  The resulting compiled code is a little
     worse than produced by PCC without `-O'.

     With `-O', the compiler tries to reduce code size and execution
     time.

     When you specify `-O', the compiler turns on `-fthread-jumps' and
     `-fdefer-pop' on all machines.  The compiler turns on
     `-fdelayed-branch' on machines that have delay slots, and
     `-fomit-frame-pointer' on machines that can support debugging even
     without a frame pointer.  On some machines the compiler also turns
     on other flags.

`-O2'
     Optimize even more.  GNU CC performs nearly all supported
     optimizations that do not involve a space-speed tradeoff.  The
     compiler does not perform loop unrolling or function inlining when
     you specify `-O2'.  As compared to `-O', this option increases
     both compilation time and the performance of the generated code.

     `-O2' turns on all optional optimizations except for loop unrolling
     and function inlining.  It also turns on the `-fforce-mem' option
     on all machines and frame pointer elimination on machines where
     doing so does not interfere with debugging.

`-O3'
     Optimize yet more.  `-O3' turns on all optimizations specified by
     `-O2' and also turns on the `inline-functions' option.

As an aside, I have a suspicion that specifying -O2 with -pedantic generates bad
code, mostly because I've never been able to get a build to work with both flags
turned on (it crashes in layout somewhere). Haven't taken the time to prove it,
My -O2 builds are also crashing.  The last known module to be touched was
editor.  Now, the weird thing is that if I rebuild just editor/base adding -g to
CFLAGS/CXXFLAGS, the build doesn't crash on start up.  I'm going to try to
narrow it to a particular file.

Here's the beginning of the optimized trace:
(gdb) bt
#0  0x8361411 in ?? ()
#1  0x2c0270e2 in nsEditor::GetPriorNode ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#2  0x2c06c23b in nsHTMLEditor::GetPriorHTMLNode ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#3  0x2c037d88 in nsTextEditRules::WillInsert ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#4  0x2c038841 in nsTextEditRules::WillInsertText ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#5  0x2c0379cb in nsTextEditRules::WillDoAction ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#6  0x2c05a202 in nsHTMLEditor::InsertText ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#7  0x2c094954 in nsHTMLEditorLog::InsertText ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#8  0x2b9f4d76 in nsGfxTextControlFrame2::SetTextControlFrameState ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#9  0x2b9f56a1 in nsGfxTextControlFrame2::SetProperty ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#10 0x2b955659 in nsHTMLInputElement::SetValue ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#11 0x2aed3589 in SetHTMLInputElementProperty ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/./

   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#24 0x2b1e6790 in nsDocLoaderImpl::FireOnLocationChange ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#25 0x2b59fc75 in nsDocShell::SetCurrentURI ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#26 0x2b59f55e in nsDocShell::OnNewURI ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#27 0x2b59fba3 in nsDocShell::OnLoadingSite ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#28 0x2b59c49b in nsDocShell::CreateContentViewer ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#29 0x2b5a6c59 in nsDSURIContentListener::DoContent ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#30 0x2b1e3906 in nsDocumentOpenInfo::DispatchContent ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#31 0x2b1e34c1 in nsDocumentOpenInfo::OnStartRequest ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#32 0x2b0fa84d in nsHTTPFinalListener::OnStartRequest ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#33 0x2b0d9d40 in InterceptStreamListener::OnStartRequest ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/
#34 0x2b0faa09 in nsHTTPServerListener::FinishedResponseHeaders ()
   from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/

Both my -O2 and -O3 builds worked fine. cls, are you using --enable-pedantic?
Yes as --enable-pedantic is the default. 

I also discovered that if I compile _just_ editor/base/nsEditor.cpp with the
additional -g, the crash goes away.
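
The narrowing-down described above (changing the flags for part of the tree until the crash disappears) could be automated as a bisection. This is only a sketch under stated assumptions: the crash is deterministic and a single source file is responsible, neither of which is established in this bug. The helper name `find_culprit`, the `crashes` callback and every file name other than nsEditor.cpp are hypothetical; in practice the callback would rebuild the listed files with the suspect flags and start the browser.

```python
from typing import Callable, Sequence


def find_culprit(files: Sequence[str],
                 crashes: Callable[[Sequence[str]], bool]) -> str:
    """Bisect `files` down to the one that triggers the crash.

    `crashes(subset)` must return True when building only `subset` with the
    suspect flags (and everything else with known-good flags) still crashes.
    Assumes exactly one culprit and a reproducible crash.
    """
    candidates = list(files)
    if not candidates:
        raise ValueError("no files to bisect")
    while len(candidates) > 1:
        middle = len(candidates) // 2
        first_half = candidates[:middle]
        if crashes(first_half):
            candidates = first_half          # culprit is in the first half
        else:
            candidates = candidates[middle:]  # otherwise it is in the rest
    return candidates[0]


# Hypothetical stand-in for "rebuild and run": pretend nsEditor.cpp is at fault.
editor_files = ["nsEditor.cpp", "fileA.cpp", "fileB.cpp", "fileC.cpp", "fileD.cpp"]
suspect = find_culprit(editor_files, lambda subset: "nsEditor.cpp" in subset)
```

With an intermittent crash, the callback would have to repeat each run several times before answering, which the sketch does not attempt.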

The attached patch makes --enable-optimize=-O9 work as expected.  It also
changes the optimize defaults to -O3 for mozilla if using gcc and to -O3 for
nspr if using linux & autoconf.  
I would recommend using -O2 as the default optimization level. The only
difference between O2 and O3 is that O3 inlines much more. This often bloats the
executables and even makes them slower due to cache issues.
I compiled the ns6 branch with -O2. It seems to work well.
I thought -O3 was supposed to make binaries faster, with bigger binaries as a
side effect of the inlining.  Could you elaborate on the slowdown due to
cache issues?

Currently, I'm building using:
env CFLAGS='-pipe -O2' CXXFLAGS='-pipe -O2' ../mozilla/configure 
--enable-nspr-autoconf --enable-mathml --enable-svg --disable-debug
--disable-tests --disable-mailnews
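
For readers who want to repeat this at several optimization levels, the command line above can be generated rather than retyped. A minimal sketch: the function name `configure_command` is hypothetical, and the options are simply the ones quoted in this comment.

```python
import shlex
from typing import Sequence

# Options copied from the configure invocation quoted in the comment.
COMMON_OPTIONS = [
    "--enable-nspr-autoconf", "--enable-mathml", "--enable-svg",
    "--disable-debug", "--disable-tests", "--disable-mailnews",
]


def configure_command(opt_level: str,
                      options: Sequence[str] = COMMON_OPTIONS) -> str:
    """Build a shell-quoted configure command for one optimization level."""
    flags = f"-pipe {opt_level}"
    argv = ["env", f"CFLAGS={flags}", f"CXXFLAGS={flags}",
            "../mozilla/configure", *options]
    return shlex.join(argv)  # quotes the arguments that contain spaces


if __name__ == "__main__":
    for level in ("-O", "-O2", "-O3"):
        print(configure_command(level))
```

Each level would normally get its own object directory so that the builds do not overwrite each other.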

This weekend, I was seeing intermittent crashes without recompiling but it was
crashing in gklayout.
Back trace from -O2 build with gklayout crash:

#0  0x83eda0b in ?? ()
#1  0x2bb19e6e in nsIBox::AddCSSPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#2  0x2bb1d371 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#3  0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#4  0x2bb1fbee in nsSprocketLayout::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#5  0x2bb1d3a6 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#6  0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#7  0x2bb1fbee in nsSprocketLayout::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#8  0x2bb1d3a6 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#9  0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#10 0x2bb1fbee in nsSprocketLayout::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#11 0x2bb1d3a6 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#12 0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#13 0x2bb1fbee in nsSprocketLayout::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#14 0x2bb1d3a6 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#15 0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#16 0x2bb1fbee in nsSprocketLayout::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#17 0x2bb1d3a6 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#18 0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#19 0x2bb1fbee in nsSprocketLayout::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#20 0x2bb1d3a6 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#21 0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#22 0x2bb1fbee in nsSprocketLayout::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#23 0x2bb1d3a6 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#24 0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#25 0x2bb1fbee in nsSprocketLayout::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#26 0x2bb1d3a6 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#27 0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#28 0x2bb1fbee in nsSprocketLayout::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#29 0x2bb1d3a6 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#30 0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#31 0x2bb1fbee in nsSprocketLayout::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#32 0x2bb1d3a6 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#33 0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#34 0x2bb1fbee in nsSprocketLayout::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#35 0x2bb1d3a6 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#36 0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#37 0x2bb1fbee in nsSprocketLayout::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#38 0x2bb1d3a6 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#39 0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#40 0x2bb1fbee in nsSprocketLayout::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#41 0x2bb1d3a6 in nsContainerBox::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#42 0x2bb27dfb in nsBoxFrame::GetPrefSize ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#43 0x2bb1f1bf in nsSprocketLayout::PopulateBoxSizes ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#44 0x2bb1e2e3 in nsSprocketLayout::Layout ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#45 0x2bb1d670 in nsContainerBox::DoLayout ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#46 0x2bb2813f in nsBoxFrame::DoLayout ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#47 0x2bb19a18 in nsBox::Layout ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#48 0x2bb20e67 in nsStackLayout::Layout ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#49 0x2bb1d670 in nsContainerBox::DoLayout ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#50 0x2bb2813f in nsBoxFrame::DoLayout ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#51 0x2bb19a18 in nsBox::Layout ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#52 0x2bb27c52 in nsBoxFrame::Reflow ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#53 0x2bb18240 in nsRootBoxFrame::Reflow ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#54 0x2b9687a6 in nsContainerFrame::ReflowChild ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#55 0x2b9a2799 in ViewportFrame::Reflow ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#56 0x2b977ef6 in nsHTMLReflowCommand::Dispatch ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#57 0x2b992de8 in PresShell::ProcessReflowCommands ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#58 0x2b991ab1 in PresShell::FlushPendingNotifications ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#59 0x2b991b0d in PresShell::EndReflowBatching ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#60 0x2bd4fd1a in nsEditor::EndUpdateViewBatch ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#61 0x2bd451de in nsEditor::EndPlaceHolderTransaction ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#62 0x2bd7d889 in nsHTMLEditor::InsertText ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#63 0x2bdb47c0 in nsHTMLEditorLog::InsertText ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#64 0x2ba7c17a in nsGfxTextControlFrame2::SetTextControlFrameState ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#65 0x2ba7a71d in nsGfxTextControlFrame2::SetProperty ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#66 0x2b9e3fe9 in nsHTMLInputElement::SetValue ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#67 0x2aea4aa9 in ?? ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/./
#68 0x2abe07e6 in js_SetProperty ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/./
#69 0x2abd5a4c in js_Interpret ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/./
#70 0x2abcf23d in js_Invoke ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/./
#71 0x2abcf42f in js_InternalInvoke ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/./
#72 0x2abe03eb in js_SetProperty ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/./
#73 0x2abd5a4c in js_Interpret ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/./
#74 0x2abcf23d in js_Invoke ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/./
#75 0x2b4a4202 in nsXPCWrappedJSClass::CallMethod ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#76 0x2b4a2bf1 in nsXPCWrappedJS::CallMethod ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#77 0x2ab80f2e in PrepareAndDispatch ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/./
#78 0x2ab8107e in nsXPTCStubBase::Stub8 ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/./
#79 0x2b5a3d52 in ?? ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#80 0x2b18fb90 in nsDocLoaderImpl::FireOnLocationChange ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#81 0x2b5424b5 in nsDocShell::SetCurrentURI ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#82 0x2b541d9b in nsDocShell::OnNewURI ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#83 0x2b5423e3 in nsDocShell::OnLoadingSite ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#84 0x2b53eb93 in nsDocShell::CreateContentViewer ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#85 0x2b5481a0 in nsDSURIContentListener::DoContent ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#86 0x2b18caee in nsDocumentOpenInfo::DispatchContent ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#87 0x2b18c5e7 in nsDocumentOpenInfo::OnStartRequest ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#88 0x2b0b6a59 in nsHTTPFinalListener::OnStartRequest ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#89 0x2b097570 in InterceptStreamListener::OnStartRequest ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#90 0x2b0b67b5 in nsHTTPServerListener::FinishedResponseHeaders ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#91 0x2b0b537d in nsHTTPServerListener::OnDataAvailable ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#92 0x2b073638 in nsOnDataAvailableEvent::HandleEvent ()
#93 0x2b072f10 in nsStreamListenerEvent::HandlePLEvent ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#94 0x2ab6f543 in PL_HandleEvent ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/./
#95 0x2ab6f463 in PL_ProcessPendingEvents ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/./
#96 0x2ab701ad in nsEventQueueImpl::ProcessPendingEvents ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/./
#97 0x2b1d32af in event_processor_callback ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#98 0x2b1d305d in our_gdk_io_invoke ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#99 0x2b36baca in g_io_unix_dispatch () from /usr/lib/
#100 0x2b36d186 in g_main_dispatch () from /usr/lib/
#101 0x2b36d751 in g_main_iterate () from /usr/lib/
#102 0x2b36d8f1 in g_main_run () from /usr/lib/
#103 0x2b2955b9 in gtk_main () from /usr/lib/
#104 0x2b1d377e in nsAppShell::Run ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#105 0x2af1b926 in nsAppShellService::Run ()
   from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/
#106 0x804de62 in main1 ()

The only difference between -O2 and -O3 is that -O3 turns on -finline-functions,
which inlines all possible "small" functions. This makes the code larger and
possibly faster. I say possibly, because on modern machines inlining isn't an
obvious win like in the bad old days. Duplicating code (which is what inlining
does) defeats the purpose of caches. This is a thing which will need to be

For my machine (a dual PIII) I hypothesize that "-O2 -fomit-frame-pointer
-march=i486 -mcpu=pentiumpro" will generate good code. One could possibly add
-fstrict-aliasing there too, but I don't know if it is safe.

I've personally added asm() constructs to the code that run on >=486 only, so
-march=i486 lets the compiler use 486 instructions elsewhere. -mcpu=pentiumpro
should schedule the code well for modern PII-like processors, but still be
backwards compatible (in this case to 486).

I'll try to build such a beast and see if it works.
Ok. So -fomit-frame-pointer was really bad. It crashed badly somewhere in
XPConnect, I think. Let's scrap that; rebuilding without it.
The -O2 -march=i486 -mcpu=pentiumpro seems to hang on startup. Investigating.
reassigning to cls, our unix build guru.
Assignee: leaf → cls
Target Milestone: --- → mozilla0.9
Um, yeah.  If someone cares to review the patch, I can check in only the portion
that allows you to tweak the -O level via --enable-optimize .  Based upon my
experience, the egcs compiler has issues compiling certain parts of Mozilla with
anything above -O.  Once I upgraded back to gcc 2.95.2, I was able to run a
build that used -O2 & -O3 without a problem. So, since our official build
platform is stock RH 6.x, I guess we're going to leave the default optimization
level at -O.

so you aren't doing
@@ -2871,11 +2877,11 @@
@@ -576,11 +582,11 @@
Assuming you don't check in those portions, you can take an r=timeless, although
I would think an r=leaf (cc) or r=granrose would be preferred.

Is this bug critical? [no?]
Is this bug platform PC? [no? there's alpha code in one of the patch blocks]
Keywords: approval, patch, review
to the footprint keyword with ya mate!

lets get this reviewed so cls can get us some 
experimental builds for folks to pound on, and
if all goes well start to get compiler and optimization
flags updated to be the best that they can be for
the next round of development
Keywords: footprint
The --enable-optimize=val patch has been checked in.

I've put up a couple of egcs 1.1.2 & gcc 2.95.2 builds that were made from the
same tree at .  They were built on a RH6.2 +
updates box.  I've also made available the gcc 2.95.2 rpms, which install
themselves with the /usr/gcc295 prefix.  In addition to the mozconfig options,
the builds were made with --disable-debug --enable-optimize=-O{2} --disable-tests .
If the patch has been checked in, can we close this bug?

Depends on: 61501
Target Milestone: mozilla0.9 → mozilla1.0
The original bug is on changing the default to use at least -O2.  The last time
I tried building with -O2 & gcc 2.95.2, it crashed closing a dialog window (bug
61501).  I don't think an egcs 1.1.2 -O2 build will get that far.  So unless
we're going to punt on changing the defaults, the bug should probably remain open.

-fomit-frame-pointer will crash because of the xpconnect asm stuff.

I've been running --enable-debug -O2 builds with gcc-2.96 for a while without
problems. I haven't quantified the speed differences, but it's noticeable compared
to -O.
jrgm,  can you spin a build and see if it makes
the page loader time drop...

We really ought to run all the perf tests that we have
against an -O2 build.  How about we flip the switch for
a day in the release builds  or create some experimental
builds and get lots of people pounding on things all over the
product..  leafer,seawood, good idea?   

lets get this on the hot list for the performance
meeting cuz it's relatively easy to turn on/off and
might be a good perf win.

dogut/valeski/waterson, are the gamera folks using -O2?
Worksforme.  I'd suggest throwing the switch for Thursday or Friday's builds so
that we avoid the couple of days of milestone breakage.  I'm also throwing in my
vote for upgrading the release compiler to gcc 2.95.2+ while we are changing things.

I made two builds, both pulled from the trunk at approx. 8pm this evening,
setting --disable-debug, --disable-tests and --enable-optimize=-O1 or 
--enable-optimize=-O2. I ran them both through a couple of page loading 
test series, getting similar results with each build for each run.

There appears to be about a 9% gain from setting '-O2' for "already cached"
loads of a page. For the "initial visit", I don't see any significant change
in page load times.

I'll attach a graph of "already cached" load times, comparing -O1 and -O2.
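
For anyone repeating the comparison, the relative gain quoted above is just the drop in load time divided by the baseline. A small sketch; the raw timings behind the 9% figure are not given in this bug, so the 1000 ms and 910 ms values below are invented purely for illustration, and `percent_gain` is a hypothetical helper name.

```python
def percent_gain(baseline_ms: float, optimized_ms: float) -> float:
    """Relative improvement of `optimized_ms` over `baseline_ms`, in percent."""
    if baseline_ms <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline_ms - optimized_ms) / baseline_ms


# Made-up numbers, NOT measurements from this bug: an -O1 load of 1000 ms
# against an -O2 load of 910 ms corresponds to a 9% gain.
example_gain = percent_gain(1000.0, 910.0)
```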
very cool..  maybe we should flip the optimization level for 0.9...
chofmann, yes, let's do that. what's the safest way? carpool? I think we'd want
to be able to identify regressions caused by the optimizer, if any.
Don't forget that changes to don't cause a rebuild on unix (bug
72018, WONTFIX), so to see the improvement (and pick up any regressions) people
(and the tinderboxes) doing optimised builds will have to clobber.
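
A build script could detect this situation itself by remembering the flags used for the previous build. The following is a sketch under assumptions, not part of the Mozilla build system: the stamp file name and the `needs_clobber` helper are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path

STAMP_NAME = ".opt-flags-stamp"  # hypothetical; not a real Mozilla build file


def needs_clobber(objdir: Path, flags: str) -> bool:
    """Record `flags` in `objdir` and report whether they changed.

    A missing stamp counts as "changed", so a tree whose history is unknown
    gets clobbered rather than trusted.
    """
    digest = hashlib.sha256(flags.encode("utf-8")).hexdigest()
    stamp = objdir / STAMP_NAME
    previous = stamp.read_text().strip() if stamp.exists() else None
    stamp.write_text(digest + "\n")
    return previous != digest


if __name__ == "__main__":
    objdir = Path(tempfile.mkdtemp())  # stand-in for a real object directory
    for flags in ("-O", "-O", "-O2"):
        print(flags, needs_clobber(objdir, flags))
```

Hashing keeps the stamp small; storing the flag string itself would work just as well and would be easier to inspect by hand.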
Note that jrgm is using gcc 2.96 for his tests.  I'd be interested to see the
results on a RH6.2 + egcs 1.1.2 system since that's what the release boxes are.
put this dude on the landing plan

seawood,  should we try and get all the release build
systems on gcc2.96 rh7.0 in the same step..  we need
to get closer to the latest compilers as soon as possible
to start picking up the ron guilmette fixes..
I've been building with -O2 on a RH6.2/egcs1.1.2 system for weeks with no trouble...
I've made builds for weeks now, gcc 2.95.3, -O2 and -O3, without problems.

I hope this will be included in 0.9.
It would be awesome if we could get this into 0.9. What has to happen?
roc, time travel.   Given past, non-immediately obvious problems when compiling
with egcs -O2, I don't feel comfortable making the switch the day after we've
closed for the 0.9 stabilization period.  But if drivers feels otherwise, then
we'll throw the switch (and pray for stability).

Hofmann, I know we want to upgrade to a more recent compiler but I think jumping
to rh 2.96 would be a bit hasty.  Afaik, there is no other vendor supporting
that particular compiler/libstdc++ combination so we would bear the full brunt
of making the runtime libstdc++ library available to the end-users.  RH2.96's
libstdc++ is not binary compatible with any other released version of libstdc++.

Gcc 2.95.3 would be a better choice from a user standpoint imo.  We'd still have
to roll our own rpms for the build systems but since libstdc++ is still binary
compatible with the one from gcc 2.95.2, which multiple vendors distribute
and/or support, we would not have the additional library distribution burden.  

And of course, I'd suggest waiting until after the 0.9 release before upgrading
the boxes.
I agree with Chris Seawood 100% on both issues.
Makes sense to me.

I would also vote for gcc-2.95.3. I have been using it since it came out for
building Mozilla, and I haven't yet found a problem that I can relate to the
Agreement again; the 2.95.3 release has given me the
fewest (zero so far) funky compiler-attributable problems
of the 2.9x series to date.  Perhaps a bit too bleeding-edge
to declare it the new official Mozilla-happy compiler though?
> are the gamera folks using -O2?

I asked -- they say they are using -O2 
Has anyone tried playing with adding -funroll-loops to -O2?  Seems like that
could be an interesting experiment...
I made a build with -O3 -funroll-loops -march=pentium

It works and seems to be very speedy, but I didn't take any performance
measurements. Can somebody do that, please?
I made another build with all these on:
-O3 -funroll-loops -fexpensive-optimizations -march=pentium -mcpu=pentiumpro
-fstrength-reduce -fschedule-insns2 -frerun-cse-after-loop -fthread-jumps
-fcse-follow-jumps -fcse-skip-blocks

Compiles well and runs REALLY fast.
WRT compiler runtimes, would it be good to build the nightlies with a statically
linked C++ library?  I don't know how much this would bloat the code, but it
should make it possible for anyone to run mozilla w/o having to worry about
which version of g++ we used.  I know you can do this by building gcc with
the --disable-shared flag.  I don't know how to do it if you did not build gcc
with that flag, but I bet cls does.
Yesterday i clobbered, pulled, built with -O2 and suddenly trying to compose a
message always crashed. ("New Msg" and "Reply" or whatever.)

Pulled again, rebuilt to make sure it wasn't mid-checkin mess: same thing.
No bug resembling this was reported lately.
Clobbered yet again and built *without* -O2 and the crash vanished.
In between pulls i could see no checkin that would have affected this.
R.K.Aa: could you post disassembly of the destructor code in both -O2 and -O1
versions?  It'd be interesting to see if we can nail down the compiler error, or
see if it is, in fact, just sloppy Mozilla code that we've been lucky with so far.
clobbered again and building with -O1 now
no crash with -O1

Shaver: Mailed you a Q. Reply?
I'll clobber and rebuild again with -O2 to do this thoroughly, but need advice.
Does anyone know if the gcc295 nightlies are built with -O or -O2 ?
It would be nice if at least those would switch to -O2 soon - and they would
also be a nice way to test how well that works...
(I know I would be one of the first ppl that would see them crashing as I'm
downloading those builds daily...)
Yeah, it would be great if one of the tinderboxen were configured to use -O2 (or
even higher, why not?) and the builds could be pushed out for people to try.
I talked to chofmann earlier this week and we're going to wait until after the
0.9 bins have been built to upgrade the release machines to 2.95.3.  All
optimized nightly builds use the default --enable-optimize.  It's not worth
hacking the nightly automation to switch to -O2 for specific builds only and
remove the potential for conflicting --enable-optimize options when we're going
to throw the switch anyways next week.

The tinderboxes builds are not delivered to anyone.  That's a topic for a
separate bug if people are interested in doing that. 
re gcc2.95.x:

Beonex Communicator 0.6-pre Linux release builds are compiled with gcc2.95.2
with --enable-optimize. I had reports that the binary crashed on some systems at
startup. Presumably, they were all RH systems. I have the following in the install
instructions <>:

"Redhat 6.1 or later: Install the std. C++ libraries for gcc 2.95 from Mandrake
(100 kB). On RedHat 7.0, you have to force installation (rpm -i --force filename)."

This seems to fix the problem.

The --force is because otherwise, rpm will complain about downgrading (libstdc++
2.96 is already installed).
Mozilla doesn't work on Redhat 6.0 and earlier at all, IIRC.
I couldn't even find a Redhat libstdc++ 2.95 package at that time.
> I had reports that the binary crashed on some systems

(before I put up the install instructions)

> otherwise, rpm will complain about downgrading

Note that rpm will not downgrade if you force installation. Both libstdc++ 2.95
and libstdc++ 2.96 will be installed. If you don't force, installation will be
rejected. So, this procedure most likely will *not* hork the RH7.0 installation.
> The only difference between -O2 and -O3 is that -O3 turns on
-finline-functions, which inlines all possible "small" functions.

In gcc-3.0 prereleases (anybody has tested mozilla with them? The more we stress
test the new gcc before it's released, the better.), -O3 also turns on

From the info file:

     Attempt to avoid false dependencies in scheduled code by making use
     of registers left over after register allocation.  This
     optimization will most benefit processors with lots of registers.
     It can, however, make debugging impossible, since variables will
     no longer stay in a "home register".

I've been building with -O2 on x86 with the official
GCC 2.95.3 for about a month now -- no problems seen.
How about introducing some "gcc" keyword?

Ever more people will be building with gcc 2.95 / 2.96. I use 2.96 and see a few
crashes others don't. One crash I did report, but later resolved as invalid
because I didn't use the official egcs, later turned out to be a blocker.
The crashes may not always be that obvious, but until further notice I've stopped
reporting them.

Question is: Should I stick to the nightlies and shaddap about what I see in my own
CVS builds - or not? If not: some policy and means of sorting these bugs might
be an idea.
gcc 2.96 is not an official gcc compiler (there never was a real gcc 2.96); it is
just something like a nightly build taken by RedHat, who also added some bug fixes.
Because of that, IMHO gcc 2.96 is by definition buggy, so we shouldn't care too
much about it.
On the other hand, we should try the gcc 3 builds, which should be pretty close
to what the real gcc3 will be. If a gcc bug is found, the bug could be posted to
the developers and we make sure that Mozilla's potential problems with gcc3 will
be solved much earlier. I hope that gcc3 gives quite an additional performance
boost on Linux, which Mozilla always enjoys :-)
OK - then i won't file gcc2.96RH bugs.
This morning's builds had a bug (bug 80746) which may have led to
a Bugzilla user inadvertently
changing this bug from the Assigned/Accepted status to the New status.  If you
are the owner of this bug please check to see that it is in the correct Status.
I know that there were problems when we tried to upgrade compilers, but I'm
nominating for 0.9.2 anyway. A 9% improvement is nothing to sneeze at...
Keywords: mozilla0.9.2
Mandrake is using gcc 2.96 ;)
9% is nothing to sneeze at on my p233 w/ 96M.
Any major changes in the optimization scheme from 2.95.x to 3?
Recent builds are showing problems (password dialogs return a blank password,
somehow) when compiled with -O2 on RedHat 7.1.  Goes away if I use -O instead.

I'm seeing no problems in my -O2 builds on Linux, compiling on Debian Unstable
with gcc-2.95.4.

I can say there's a definite performance increase - the UI speeds up
considerably; although it's not enough to match native GTK menus (with -O2 we're
still twice as slow as X-Chat).
Blocks: 71874
Ok, so can we get some feedback from people using builds made with egcs 1.1.2 &
-O2?  The release boxes are still stuck with that compiler and given the last
upgrade fiasco, it'll be some time before that changes (someone prove me wrong,
please :-P).    
cls: isn't bug 79681 (statically linking libstdc++) the only real obstacle to
upgrading the compilers for real?  And it looks like much of the work on that
bug has been done...
That and the legal issue that goes along with bug 79681 (just updated).

Depends on: 79681
With -O2 on rh7.1, I can confirm bryner's issue with the mailnews password
dialog. Sigh.
The password dialog problem is now bug 83388. There's a workaround in that
Depends on: 83388
I see a crash when clicking on "View" in the History window's menu, if Mozilla
is compiled with gcc 2.95.4 and -O[23]. -O works.

I've been building with -O3 for some time now without experiencing other problems.
Reproducible crash with gcc2.95.2 built with -O2:

  Go to
  Switch stylesheet

Produces segfault.

I investigated a bit with different optimisation settings.  Here's the result:

From gcc toplev.c
  if (optimize >= 2)
    {
      flag_cse_follow_jumps = 1;
      flag_cse_skip_blocks = 1;
      flag_gcse = 1;
      flag_expensive_optimizations = 1;
      flag_strength_reduce = 1;
      flag_rerun_cse_after_loop = 1;
      flag_rerun_loop_opt = 1;
      flag_caller_saves = 1;
      flag_force_mem = 1;
      flag_schedule_insns = 1;
      flag_schedule_insns_after_reload = 1;
      flag_regmove = 1;
    }
although flag_schedule_insns is switched off in i386.c.  So I built with -O1 and
an increasing number of flags switched on.  This led to a surprising result:
you can enable all of the -O2 flags and not see the crash.  Similarly, you can
enable -O2 and then disable all of the optimisations with -fno-* and the crash
will still occur.

Unfortunately I couldn't find anywhere where additional options are enabled. 
However, this may suggest that it may be worthwhile explicitly switching on
flags that are known to work rather than using -O2.
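
The experiment described above (start from -O1 and switch on the -O2 flags one
at a time) could be scripted roughly as follows. This is only a sketch: the -f
option names are assumed to be the command-line spellings of the toplev.c
variables listed above, and cumulative_flag_sets is a made-up helper name, not
part of any build tool.

```python
# Option names assumed to correspond to the toplev.c flag_* variables quoted
# above (gcc 2.95-era spellings); check them against your compiler's docs.
O2_FLAGS = [
    "-fcse-follow-jumps",
    "-fcse-skip-blocks",
    "-fgcse",
    "-fexpensive-optimizations",
    "-fstrength-reduce",
    "-frerun-cse-after-loop",
    "-frerun-loop-opt",
    "-fcaller-saves",
    "-fforce-mem",
    "-fschedule-insns",
    "-fschedule-insns2",
    "-fregmove",
]


def cumulative_flag_sets(base="-O1", flags=O2_FLAGS):
    """Return one option string per build: base, base + 1 flag, base + 2 flags, ..."""
    return [" ".join([base] + flags[:n]) for n in range(len(flags) + 1)]


if __name__ == "__main__":
    for opts in cumulative_flag_sets():
        # Each line is one candidate value for --enable-optimize.
        print('./configure --enable-optimize="%s"' % opts)
```

Building once per printed line and noting the first option set that crashes
narrows the problem to a single flag, if a single flag is responsible at all
(the result reported above suggests it may not be).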

Running some test builds with gcc-3.0 next.
I'll look into the segfault.  However, note that just because a certain
optimization level causes a crash it _doesn't_ mean there's a bug in the
compiler.  Certain application bugs can be hidden or exposed by differing
optimization levels.  It is a useful data point, though.
From the GCC 2.95.2 documentation at

-mcpu=cpu type 
    Assume the defaults for the machine type cpu type when scheduling
instructions. The choices for cpu type are: 
     `i386' `i486' `i586' `i686' `pentium' `pentiumpro' `k6'

    While picking a specific cpu type will schedule things appropriately for
that particular chip, the compiler will not generate any code that does not run
on the i386 without the `-march=cpu type' option being used. `i586' is
equivalent to `pentium' and `i686' is equivalent to `pentiumpro'. `k6' is the
AMD chip as opposed to the Intel ones. 

-march=cpu type 
    Generate instructions for the machine type cpu type. The choices for cpu
type are the same as for `-mcpu'. Moreover, specifying `-march=cpu type'
implies  `-mcpu=cpu type'. 

We should be using (IMHO) -O2 -mcpu=pentiumpro -march=pentium

This will select instructions for a pentium (not a 486, and not the default
386!).  It will schedule them for best performance on a pentiumpro (which is
what the PII and PIII are based on).  Alternatively, we could use just -O2
-march=pentium (which implies -mcpu=pentium).

Note: using these options will make use on a 486 probably impossible.  If
someone wants a 486 build, they should make a separate build.  I'm not even
going to think about 386.
If you look at the output of egcs -fverbose-asm -S hello.c , you'll see that it
already defaults to using -march=pentium.  I get a similar result with gcc
2.95.3.  If you want to change the scheduling based upon ${target_cpu}, that's
fine but we shouldn't hardcode it across the board.

# GNU C version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release)
(i386-redhat-linux) compiled by GNU C version egcs-2.91.66 19990314/Linux
(egcs-1.1.2 release).
# options passed:  -fverbose-asm
# options enabled:  -fpeephole -ffunction-cse -fkeep-static-consts
# -fpcc-struct-return -fcommon -fverbose-asm -fgnu-linker -fargument-alias
# -m80387 -mhard-float -mno-soft-float -mieee-fp -mfp-ret-in-387
# -mschedule-prologue -mcpu=i386 -march=pentium
Well, that has -mcpu=i386 -march=pentium.  That generates instructions that
(may) only run on a pentium, but optimizes for running on a __386__.  Either the
documentation is majorly fubar (possible), or that's a silly set of options.
For x86 machines with GCC, I still think -O2 -mcpu=pentiumpro -march=pentium
makes lots of sense.

If you think it doesn't matter, here's the difference between -O2 and -O2
-mcpu=pentiumpro for a truly trivial function.  My point is that the
optimization IS different (vastly in some cases).

void hello() { }

        pushl %ebp
        movl %esp,%ebp

        pushl %ebp
        movl %esp,%ebp
        movl %ebp,%esp
        popl %ebp

FYI, 2.95.2 under freebsd 4.1 yields:
# GNU C version 2.95.2 19991024 (release) (i386-unknown-freebsd) compiled by GNU
version 2.95.2 19991024 (release).
# options passed:  -fverbose-asm
# options enabled:  -fpeephole -ffunction-cse -fkeep-static-consts
# -freg-struct-return -fsjlj-exceptions -fcommon -fverbose-asm -fgnu-linker
# -fargument-alias -fident -m80387 -mhard-float -mno-soft-float -mieee-fp
# -mfp-ret-in-387 -mno-fancy-math-387 -mschedule-prologue -mcpu=i386
# -march=pentium
cls: "If you look at the output of egcs -fverbose-asm -S hello.c , you'll see
that it already defaults to using -march=pentium."

On Red Hat (and therefore the Linux build boxen), yes. Not everywhere,

Regarding "-mcpu=pentiumpro", this will reportedly produce code running slower
on non-Intel processors than code compiled with "-mcpu=i386". YMMV.

Generally, the -mcpu savings in real world applications similar to mozilla are
slim, and the issues tangled. I'd expect someone proposing this change to give
hard numbers on the usual page load time benchmarks for different -mcpu settings
running on at least the 3 most popular subarches.

In the meantime, let's constrain ourselves to -O2, and ironing out the bugs
uncovered by it (in mozilla or gcc). This provides a much more definite win
across the board.
Very good point - using -O2 is far more important than lesser quibbles,
especially if the primary systems using gcc are defaulting to -march=pentium.
(I'd assumed it was defaulting to -march=i386).
General note as it came up: recent snapshots of GCC-3.0 won't compile mozilla
(amongst other things).  It's a general problem with c++ headers that's
discussed at
Contacted fcrozat to find out the flags used for Mandrake builds, which I believe
are optimized with at least -O3 ntm for i586 and using some recent form of
gcc (the infamous 2.96-[something indicating a recent cvs snapshot]) and glibc
223 and everything else under the sun which I don't care to ramble on about.  PS
I speak of cooker for those mdk users who are bewildered at why they see no 091.
 And, uh, do you want smoketest/other test results or a trace if it crashes?  Can
I even do an effective trace with such opt?
Anyway... I'll shut up now
Scratch that, I'm stupid.  Just found out it's NOT optimized beyond the -O default,
since a build with the standard mdk opt (now pretty sure it's -O3) compiled, but
crashed on startup :).  Sorry for the confusion, but I still want to know what
tests and data would help.
cc'ing... hey cls, how'd you make that spiffy graph? I have a P2/266 & 256M, I
can do some other comparisons on a lower-end machine if that'd help at all.
Actually, that's jrgm's graph and I believe the tools are only available inside
the netscape firewall. 

If you want graphs like that, try out Grace (formerly xmgr or ACE/gr) from Its a bit quirky to get used to, but
extremely powerful once you get over the initial learning hump. Produces similar
graphs to the 04/15/01 01:46 image attachment. Oh - and you need Lesstif
is bug 82962, "Crash on switching character coding", related to this?


    ------- Additional Comments From  2001-06-14 14:42 ----
    I'm pretty sure that this is a -O2 problem.   My gcc 2.95.3 -g build can 
    switch encodings without a problem.  My gcc 2.95.3 -O2 build crashes.  I 
    can't see the stack because it locks my X session.  The gcc295 nightlies 
    are built using gcc 2.95.3 -O2 as well.  I'll fire off gcc 2.95.3 -O & 
    egcs 1.1.2 -O2 builds to verify.
Depends on: 82962
When I'm debugging an X program that breakpoints when a menu is posted I
run the debugger on one X server and send the app to a different X server
that is nearby. I use the gdb command: 

  set env DISPLAY othersys:0

before run.
Even better (if you have one machine):
Start a second X server (such as xinit -- :1 -bpp 24)
setenv DISPLAY :1.0
Depends on: 80988
> Regarding "-mcpu=pentiumpro", this will reportedly produce code running slower
> on non-Intel processors than code compiled with "-mcpu=i386". YMMV.

I'd like to test that. I use an AMD Duron.

> Generally, the -mcpu savings in real world applications similar to mozilla are
> slim, and the issues tangled. I'd expect someone proposing this change to give
> hard numbers on the usual page load time benchmarks for different -mcpu
> settings running on at least the 3 most popular subarches.

How can I benchmark Mozilla objectively?

General note: Thanks to all here for the comments, many of which were very
interesting for me, even if strictly offtopic (not about the -O2 flag).
WRT debugging X programs.  Xnest and Xvnc are also helpful programs.  In
particular Xvnc is great if you have to debug on a remote machine and the
network link is slow.
Has anyone dared the -O9 with gcc-2.95.3 or gcc-3.0?
I tried once.  Barely got past the first file and that's with some recoding as i
FYI, AFAIK nothing over -O3 is actually any different from -O3. -O[4-9] appear
to be there for forward compatibility.
which reminds me, it was actually (probably) a compiler problem that i was having...
what is up with this bug?  anything new? gonna switch default flag to -O3 or
something? :P  mcafee has been fooling around with comet and now coffee in
relation to optimization but i dunno what it is he's been up to, ns and 'active'
mozilla partners probably have some idea...
Moving to -O2 (not -O3 or other various funky compiler options) is waiting on
verification of the libstdc++ license for 2.95.x so that we can link statically
against libstdc++ (bug 79681).  Once we can statically link against libstdc++,
then we can upgrade the release compilers to gcc 2.95.x. 

And since I know someone is going to ask, moving to gcc3 creates another shared
library dependency problem; this time with .  I'm not sure if we
would be able to work around that with the same hack proposed in bug 79681 .

I'm posting from 0.9.1 compiled under gcc 3.0. I used -O3 -march=athlon
-mcpu=athlon. The only file to complain (segv'd gcc) was
layout/html/forms/src/nsFormControlHelper.cpp. It worked fine when I did this
file with -O2.
The gcc error was:
nsFormControlHelper.cpp: In static member function `static void 
   nsFormControlHelper::PaintFixedSizeCheckMark(nsIRenderingContext&, float)':
nsFormControlHelper.cpp:731: Internal error: Segmentation fault

I've raised a bug with the gcc peeps.

The performance of this thing seems pretty snappy - though it's a bit unfair for
me to compare with my last build of 0.9.1, as I've recompiled my entire machine
from scratch with athlon optimized binaries :-)

I've done some testing on my gcc 3 -O3 build and it crashes on viewer demos 4
and 9 (simple tables and frames). 
The errors for #4 are:
Gdk-ERROR **: BadValue (integer parameter out of range for operation)
  serial 3279 error_code 2 request_code 12 minor_code 0
Gdk-ERROR **: BadValue (integer parameter out of range for operation)
  serial 3280 error_code 2 request_code 12 minor_code 0
Going to create the event queue
         gModeSwitchBit = 0x0

And for #9 (which crashes at the second attempt):
Gdk-ERROR **: BadValue (integer parameter out of range for operation)
  serial 17077 error_code 2 request_code 12 minor_code 0
Going to create the event queue
         gModeSwitchBit = 0x0

If anyone wants more diagnostics and can tell me how to get them I'm happy to oblige


Loz benchmarked these two Mozillas/systems with the Viewer demos in the debug
menu. He found speed improvements between -4% and 190%, average 32%, plus 2
reproducible crashes.
I felt a bit guilty about the benchmarks comparing a -O3 build on a gcc3 system
with a -O2 build on a pgcc system. So I knocked up a -O2 build on my gcc3 system
to do a real comparison. (Both builds are now 0.9.2)
I tested by taking the times for each of the debug->viewer demos off the status bar.
Both builds also had -mcpu=athlon -march=athlon set.
Both builds crashed on test 4 and 9 (I'm building an i386 build to see if that is
the problem, I'll try a -O build after - if that still crashes I'm guessing it's
either gcc3 or some other external program on this new system)
Tests 14 and 15 were ignored, as they use a net connection.
The -O3 build sometimes came up with non numeric characters (;:<) in its time
(always for the first run of test 10).
I ran each test three times in a row, then moved on to the next. The only
exception to this was moving from test 10 to 11, where I cleared the screen
(800x600), as test 10 uses 100% cpu while in action.

The results were:
O2 tests are on the left, O3 on the right. The % was calculated as
(avg(O2) - avg(O3)) / avg(O2)

The conclusions?
The testing is too imprecise to really show anything..., but it looks like
swings and roundabouts.
Ack well, back to my compiling
> > Regarding "-mcpu=pentiumpro", this will reportedly produce code running slower
> > on non-Intel processors than code compiled with "-mcpu=i386". YMMV.
> I'd like to test that. I use an AMD Duron.

I did so.

1. Test setup
1.1. Procedure
I "reload"ed (via a simple click on the button)
<> 8 times. Each time, I
wrote down the time displayed in the status bar.
Before that, I set the cache to "Once per session" and reloaded the document 3 times
to eliminate network delay.
1.2. Config
Mozilla 0.9.2 as Beonex Communicator, OFFICIAL, gcc 2.95.4, -O2, disable-debug,
normal: CFLAGS=""
ppro: CFLAGS="-mcpu=pentiumpro -march=pentiumpro"
1.3. Machine
AMD Duron 650, Via KT133, 256MB PC100
Note: Not an Intel!

normal: 5.181 ms
ppro: 4.493 ms
Standard deviation is < 0.01 ms in both cases.
I tried -O3 (which "inlines functions") too. I also have some additional remarks
to the test.

1.1. Procedure
Mozilla seems to do DNS lookups nevertheless, so I reran some of the tests with
a local HTML file. Most of the results were similar.
I also measured the size of a distribution, uncompressed and bzip2 compressed.

1.2. Config
O3: CFLAGS="-mcpu=pentiumpro -march=pentiumpro" --enable-optimize=-O3

1.3 Machine
Debian unstable

2. Results

2.1. Performance
ppro is 13.27% faster than normal.
O3: 4.478 ms average (0.83% faster than ppro)
With local file, O3 is 1.71% faster than ppro.

2.2. Distribution size
u = uncompressed, b = bzipped
normal: u: 23848960, b: 9004239
ppro: u: 24289280, b: 9121543 (+1.85%, +1.30%)
O3: u: 25077760, b: 9357161 (+3.25%, +2.58% vs. ppro; +5.15%, +3.92% vs. normal)
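
For anyone who wants to recheck the size percentages, here is a small sketch
that recomputes them from the byte counts above. The names SIZES, pct_increase
and compare are made up for this illustration; only the numbers come from the
measurements.

```python
# Byte counts copied from the figures above (u = uncompressed, b = bzipped).
SIZES = {
    "normal": {"u": 23848960, "b": 9004239},
    "ppro": {"u": 24289280, "b": 9121543},
    "O3": {"u": 25077760, "b": 9357161},
}


def pct_increase(new, old):
    """Percentage by which `new` exceeds `old`."""
    return 100.0 * (new - old) / old


def compare(build, baseline):
    """Return (uncompressed %, bzipped %) growth of `build` over `baseline`."""
    return tuple(
        pct_increase(SIZES[build][k], SIZES[baseline][k]) for k in ("u", "b")
    )


if __name__ == "__main__":
    for build, baseline in [("ppro", "normal"), ("O3", "ppro"), ("O3", "normal")]:
        u, b = compare(build, baseline)
        print("%s vs. %s: u %+.2f%%, b %+.2f%%" % (build, baseline, u, b))
```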

4. Additional notes
- It is to be expected that a genuine Intel Pentium 2/3/4 sees a similar speed
- Somebody said that ppro (as defined above) runs on i586 (Intel Pentium, AMD
K6) machines, just maybe a tiny bit slowed. I couldn't test, because I don't
have an i586 system that can run Mozilla.

5. Summary
5.1. Longer
ppro brings considerably (13%) faster code, but only a small (~1.5%) increase in
distribution size.
An additional -O3 brings only a small perf improvement (additional 1-2%), but a
larger dist size increase (additional ~3%).

5.2. Executive
Use ppro optimization.
> CFLAGS="-mcpu=pentiumpro -march=pentiumpro" --enable-optimize=-O3
> - Somebody said that ppro (as defined above) runs on i586 (Intel Pentium, AMD
> K6) machines, just maybe a tiny bit slowed.
> Use ppro optimization.


-march=pentiumpro code does not run on i586 or K6/K6-2, at least as
generated by GCC 2.95.3.  -mcpu=pentiumpro is fine, but -march
must be i586 or lower.
march is the architecture to compile for; mcpu is to provide optimizations for
that architecture.

-march=i486 -mcpu=pentiumpro would run a little slower on a 486 but would be
optimized for a pentiumpro class cpu

-march=pentiumpro -mcpu=pentiumpro more than likely will not run on a pentium or
lower CPU
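
To make the split concrete: in this model only -march decides where the binary
can run, while -mcpu only changes scheduling. The sketch below assumes a simple
linear ordering of the Intel levels named in the gcc 2.95 docs quoted earlier
(k6 is left out because it does not fit a linear order); LEVELS, can_run and
cflags are names invented for this illustration.

```python
# Assumed linear ordering of the Intel levels from the gcc docs quoted earlier.
LEVELS = ["i386", "i486", "pentium", "pentiumpro"]
# Aliases as stated in those docs: i586 == pentium, i686 == pentiumpro.
ALIASES = {"i586": "pentium", "i686": "pentiumpro"}


def level(cpu):
    """Map a cpu name (or alias) to its position in LEVELS."""
    return LEVELS.index(ALIASES.get(cpu, cpu))


def can_run(march, machine):
    """True if code built with -march=<march> should run on <machine>.

    -mcpu is deliberately not a parameter: it only affects scheduling.
    """
    return level(machine) >= level(march)


def cflags(march, mcpu):
    """Build the option string for a given instruction set and tuning target."""
    return "-march=%s -mcpu=%s" % (march, mcpu)


if __name__ == "__main__":
    print(cflags("i486", "pentiumpro"), "runs on i486:", can_run("i486", "i486"))
    print(cflags("pentiumpro", "pentiumpro"), "runs on i586:",
          can_run("pentiumpro", "i586"))
```

Note that these are the gcc 2.95/3.0 meanings of the options; later gcc
releases on x86 use -mtune for what -mcpu does here.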
Wait a even if you do -O3 it isn't going to fomit-frame-pointer if
your machine won't be able to debug?  Getting this impression from waterson's
post from the egcs docs about opt flags.  Obviously the flag would be left off
for debugging purposes on most machines then, but on releases (guess not on
talkback builds), would there be any significant gain by adding the flag?
and btw, someone wanna take off that moz092 keyword?  I wouldn't dare.
Does this require that that static gcc linking thing go through?
I've tried 0.9.3 against gcc-3.0 with -mcpu=athlon -march=athlon and with -O2
and -O3. I did similar timings to those I did previously against 0.9.2 (Based on
the viewer demos on the debug menu). I'm happy to say that things are faster.
Even better, tests 4 and 9 no longer crash the browser. I'm still getting an
issue with -O3 where I sometimes get a garbage character in the second decimal
point of the time displayed in the status bar. c++ is not my language, but this
sounds like a race condition (unless -O3 is broken). There is still apparently
little difference in performance between -O2 and -O3, but this problem makes it
hard to be sure.
I'll not post the times after what happened last time...
I tried to generate Mandrake mozilla rpm with 
CFLAGS="-O3 -fomit-frame-pointer -pipe -mcpu=pentiumpro -march=i586 -ffast-math
-fno-strength-reduce" BUILD_OFFICIAL=1 ./configure --enable-optimize=-O3 
with gcc 2.96-0.60mdk (based on gcc RH 2.96-95)
and I got very strange problems : preferences weren't loaded correctly, some
attributes of radiobuttons were not found anymore :((
My preferences problem was caused by '-ffast-math'..
I've opened up another bug that is tracking the problem I was having with the
status bar (94375). I can now get reliable timings with my O3 build (if I
compile js/src/jsstr.c and js/src/jsinterp.c as -O2). From the numbers below you
can see that there is not a lot of difference between the builds. The testing is
certainly not accurate enough to justifiably say that one is better than the other.

test    avg(O2) avg(O3) 100*(O2-O3)/O2
0       .1510   .1650   -9.2715
1       .1076   .1103   -2.5092
2       .1120   .1056   5.7142
3       .1180   .1106   6.2711
4       .2160   .2113   2.1759
5       .3556   .3403   4.3025
6       .1566   .1553   .8301
7       .1070   .1086   -1.4953
8       .5270   .5220   .9487
9       .9333   .9453   -1.2857
10      .2576   .2500   2.9503
11      .0966   .0990   -2.4844
12      .1120   .1120   0
13      .1813   .1796   .9376
16      .1586   .1593   -.4413
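
The last column can be recomputed from the two averages with the formula in the
header, 100*(O2-O3)/O2. A minimal sketch, with the data copied from the table
above; pct_gain is a made-up helper name, and the last printed digit may differ
from the table, which appears to truncate rather than round.

```python
# (test, avg(O2), avg(O3)) copied from the table above.
TIMES = [
    (0, .1510, .1650), (1, .1076, .1103), (2, .1120, .1056),
    (3, .1180, .1106), (4, .2160, .2113), (5, .3556, .3403),
    (6, .1566, .1553), (7, .1070, .1086), (8, .5270, .5220),
    (9, .9333, .9453), (10, .2576, .2500), (11, .0966, .0990),
    (12, .1120, .1120), (13, .1813, .1796), (16, .1586, .1593),
]


def pct_gain(o2, o3):
    """100*(O2-O3)/O2: positive means the -O3 build was faster."""
    return 100.0 * (o2 - o3) / o2


if __name__ == "__main__":
    gains = [pct_gain(o2, o3) for _, o2, o3 in TIMES]
    for (test, _, _), g in zip(TIMES, gains):
        print("%-4d %8.4f" % (test, g))
    # A mean close to zero supports "not a lot of difference".
    print("mean %8.4f" % (sum(gains) / len(gains)))
```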

Out of interest the size of my -O3 build is much larger than my -O2 build. If I
run $tar czhf dist.jar dist then my -O2 jar is 26646709 bytes whereas my -O3
build is 36626264  bytes. I checked that there are no different file names
between the two. A lot of files get included twice using this method (links from
bin and lib). The main offending files are (1.6MB bigger in
-O3), libgklayout (1.5MB bigger in O3) libgkconhtmlstyle_s.a (670kB bigger in
O3). I've seen a message on the gcc lists that may back this behaviour of
gcc-3.0. Ah well
gcc-3.0.1 is slated for release Wednesday (things going well).
there seems to be a huge size difference between gcc3.0 and gcc3.0.1, so I am
interested in seeing the numbers of 3.0.1 when it is out :-)
The latest results from Loz's unreliable short tests are available:

build A: gcc-3.0   -O2 size=26646709
build B: gcc-3.0   -O3 size=36626264
build C: gcc-3.0.1 -O2 size=27025387
build D: gcc-3.0.1 -O3 size=29721253

As expected the build size has come down considerably during the point release
of gcc. Now the average times:

Test    A       B       C       D 
0     .1510   .1650   .1630   .1580
1     .1076   .1103   .1193   .1113
2     .1120   .1056   .1100   .1090
3     .1180   .1106   .1213   .1140
4     .2160   .2113   .2176   .2180
5     .3556   .3403   .3470   .3583
6     .1566   .1553   .1626   .1593
7     .1070   .1086   .1130   .1100
8     .5270   .5220   .5346   .5263
9     .9333   .9453   .9546   .9576
10    .2576   .2500   .3056   .2640
11    .0966   .0990   .1026   .1000
12    .1120   .1120   .1173   .1136
13    .1813   .1796   .2000   .1850
16    .1586   .1593   .1676   .1630

Hmm, the following appears to be true:
-O3 is now marginally quicker than -O2
gcc-3.0.1 is very slightly slower than gcc-3.0
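
One rough way to check those two statements is to add up each build's column.
This is only a sketch: summing raw times lets the slow tests dominate, and
BUILDS, ROWS and totals are names made up here; the numbers are copied from the
table above.

```python
# Average times per test for builds A-D, copied from the table above.
BUILDS = ("A", "B", "C", "D")
ROWS = {
    0: (.1510, .1650, .1630, .1580),
    1: (.1076, .1103, .1193, .1113),
    2: (.1120, .1056, .1100, .1090),
    3: (.1180, .1106, .1213, .1140),
    4: (.2160, .2113, .2176, .2180),
    5: (.3556, .3403, .3470, .3583),
    6: (.1566, .1553, .1626, .1593),
    7: (.1070, .1086, .1130, .1100),
    8: (.5270, .5220, .5346, .5263),
    9: (.9333, .9453, .9546, .9576),
    10: (.2576, .2500, .3056, .2640),
    11: (.0966, .0990, .1026, .1000),
    12: (.1120, .1120, .1173, .1136),
    13: (.1813, .1796, .2000, .1850),
    16: (.1586, .1593, .1676, .1630),
}


def totals(rows=ROWS):
    """Sum each build's column across all tests."""
    return {b: sum(r[i] for r in rows.values()) for i, b in enumerate(BUILDS)}


if __name__ == "__main__":
    for build, total in sorted(totals().items()):
        print("build %s: total %.4f" % (build, total))
```

With these numbers the totals order as B < A < D < C, which matches both
statements: under gcc-3.0.1 the -O3 build (D) beats the -O2 build (C), and each
gcc-3.0.1 build is slower than its gcc-3.0 counterpart.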

Some quality patches to the gcc inliner are becoming available, so we should
hopefully see some marked improvements at the next point release (or earlier if
I get the inclination to test the patches).

I still get garbage characters on the status bar, now with both builds - see for details.
loz, are you sure that you didn't mix up the labels? It seems that
- O2 build size got larger
- O2 got slower
- gcc 3.0.0 O2 is as fast as 3.0.1 O3, with considerably smaller build size.

So, it looks like 3.0.1 is all around worse, apart from fixing the excessively
large O3.
3.0.1 turned down the inlining cutoff - see the gcc archive for more details
than you want to know :) That would explain why -O3 code is now smaller.
The labels are correct.
I was only referring to the -O3 builds with the size comment (after I'd said how
big they'd been in a previous post). I didn't make this in any way clear,
however :-).

Also the drift in performance could be due to my upgrade to the 2.4.9 kernel,
which has some issues with a new VM algorithm, if I get time I'll retest the
gcc-3.0 builds with the new kernel, and if I get even more inclined I may try
the kernel patches that fix this and maybe even the gcc inliner patch.
Depends on: 96911
See observations in bug 97649 regarding preferences horkage under gcc 3.0.1
snapshot, and also which conditions caused it. I resolved it as invalid, but it
has relevance to other comments in this bug. Vcls: please verify or reopen as
seems fit.
Does this _really_ depend on 79681?  I can't figure out why, and the various
responses i get seem to indicate that it really doesn't.  Just politics?

<Dauphin> Blue: some people prefer the static build change to the -O2 change
which involves changes compilers & license issues but the 2 changes are
I had to look up that word:
orthogonal adj. [from mathematics] Mutually independent; well separated;
sometimes, irrelevant to.
Who can guess which keyword is outdated again?
Actually, it does.  I thought that you were talking about the "static build"
which is bug 46775.  

Stepping through this mess one more time:

The desired goal is to move to using -O2 by default as it has a perceived
performance increase.
egcs 1.1.2 is the current default compiler used by the build automation
-O2 egcs 1.1.2 builds are reported to be flaky at best.
-O2 gcc 2.95.x builds are much more stable
gcc 2.95.2 uses a different libstdc++ causing builds to have a different shared
library dependency
Since this new libstdc++ isn't on most platforms (where RedHat has majority
share of the market), people would have to install this new libstdc++ and that
is perceived, by some, to be too much to ask.
So we're looking into statically linking libstdc++ into mozilla, but that brings
up questions about the exact license of 2.95.2's libstdc++ and whether the
linking will cause problems for Mozilla license-wise (this may be a moot point
after the tri-licensing stuff is finished).
gcc 3.0.x's libstdc++ has a clear & compatible license but gcc3 also introduces
an additional dependency (see ) .  
It may be possible to statically link against the libgcc but the webpage implies
that we should only do that for a completely static binary.

So, in summary, we're waiting until the 2.95.2's libstdc++ license issue is
resolved or the market changes so that newer libstdc++/libgcc are standard on
most machines.

FWIW, we have had test gcc295x -O2 builds on the ftp site for months.  

bug 104653 is about NaN on all downloads on a fresh gcc3 build.
Just for your information. I have been using the gcc nightlies for months
without problems and I just compiled milestone 0.9.5 with -O4 and -march=k6,
runs without a hitch so far. I did notice a nice speedup compared to the
standard builds.
No longer depends on: 79681
My belief is that, for the default builds, adding a dependency on a
new shared library package is worth it to get the performance gains that will
come with bumping up -O to a higher level.  We can document this and provide
prebuilt versions of the appropriate libstdc++ so that it's easy for users to
meet this dependency.  How do other folks feel about this?  Would Netscape also
be interested in doing this for their default builds?
As an interim measure, how about releasing special versions for milestones with
the new library or whatever is needed to do -O2, with special instructions about
what type of system you need to use it. Of course the current type of builds
would remain the default until everything is sorted out.
Didn't bundling system libraries go out of style with win95?  There's no need
for to become a distributor of nor as
we don't do anything special to those libraries to warrant the bundling.  Users
need to acquire,, etc before they can run mozilla. 
Just add these new libraries to the list.

The gcc30 builds have replaced the gcc295 builds on the ftpsite.  They are built
using -O2.  Also, if you're planning on building using gcc 3.0.x, you might want
to make sure that you have a newer binutils installed.  I kept seeing a crash
loading on RH6.2 until I upgraded binutils to

*** Bug 109934 has been marked as a duplicate of this bug. ***
WFM (linux, gcc-2.96-96, mozilla-cvs-20011121, --enable-optimize=-O3)

starts up easily 40% faster than my -O build.

tested on a 850MHz PIII w/ 256Mb RAM.
did you run it through chofmann's browser buster and smoketests and other 
debug tests?
0.9.6 will currently build a -O3 under gcc-3.0.2, but it segvs whenever it tries
to render. Backtrace (in case this isn't a dup - I couldn't find it - though I'd
be surprised if it wasn't)
#0  0x40b81da7 in _ZN12imgContainer11AppendFrameEP14gfxIImageFrame ()
#1  0x40b9112c in _Z14HaveDecodedRowPvPhiiiihi ()
#2  0x40b90258 in _Z10output_rowP10gif_struct ()
#3  0x40b8fefb in _Z6do_lzwP10gif_structPKh ()
#4  0x40b8ef2c in _Z9gif_writeP10gif_structPKhj ()
#5  0x40b90a88 in _ZN13nsGIFDecoder211ProcessDataEPhj ()
#6  0x40b91230 in _Z11ReadDataOutP14nsIInputStreamPvPKcjjPj ()
#7  0x4015714d in
() from /home/loz/work/mozilla/dist/bin/
#8  0x40b90aeb in _ZN13nsGIFDecoder29WriteFromEP14nsIInputStreamjPj ()
#9  0x40b858ab in
_ZN10imgRequest15OnDataAvailableEP10nsIRequestP11nsISupportsP14nsIInputStreamjj ()
#10 0x40b8446d in
#11 0x4096273f in
_ZN12nsJARChannel15OnDataAvailableEP10nsIRequestP11nsISupportsP14nsIInputStreamjj ()
   from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/
#12 0x40926c26 in _ZN22nsOnDataAvailableEvent11HandleEventEv ()
   from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/
#13 0x4091773a in _ZN23nsARequestObserverEvent13HandlePLEventEP7PLEvent ()
   from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/
#14 0x4016ff22 in PL_HandleEvent ()
   from /home/loz/work/mozilla/dist/bin/
#15 0x4016f2c7 in PL_ProcessPendingEvents ()
   from /home/loz/work/mozilla/dist/bin/
#16 0x401714a4 in _ZN16nsEventQueueImpl20ProcessPendingEventsEv ()
   from /home/loz/work/mozilla/dist/bin/
#17 0x40cfb666 in _Z24event_processor_callbackPvi17GdkInputCondition ()
#18 0x40cfb5ef in _Z17our_gdk_io_invokeP11_GIOChannel12GIOConditionPv ()
#19 0x403ae56a in g_io_unix_dispatch () from /opt/gnome/lib/
#20 0x403afcf0 in g_main_dispatch () from /opt/gnome/lib/
#21 0x403afff8 in g_main_iterate () from /opt/gnome/lib/
#22 0x403b04bc in g_main_run () from /opt/gnome/lib/
#23 0x402c838f in gtk_main () from /opt/gnome/lib/
#24 0x40cfb253 in _ZN10nsAppShell3RunEv ()
#25 0x41388c12 in _ZN17nsAppShellService3RunEv ()
#26 0x804c884 in _Z5main1iPPcP11nsISupports ()
#27 0x804bb6c in main ()
#28 0x404f1861 in __libc_start_main () at soinit.c:56

Target Milestone: mozilla1.0 → mozilla0.9.8
If somebody gets a chance to test gcc-3.0.3 with mozilla that would be interesting.
I built Mozilla using gcc 3.0.3 from a CVS build pulled around 2:30
PST, December 24.  I compiled using -O2, but it was still a debug
build, so that might have covered up some of the crashes that might
have happened with a non-debug build.  My mozconfig file is:

ac_add_options --enable-jsd
ac_add_options --with-java-supplement
ac_add_options --with-extensions=all
ac_add_options --enable-mathml
ac_add_options --enable-crypto
ac_add_options --enable-chrome-format=flat
ac_add_options --enable-meta-components
ac_add_options --disable-logrefcnt
ac_add_options --disable-detect-webshell-leaks
ac_add_options --disable-dtd-debug
ac_add_options --disable-tests
ac_add_options --disable-xpctools
ac_add_options --disable-reflow-perf
ac_add_options --disable-perf-metrics
ac_add_options --disable-jprof
ac_add_options --without-profile-modules
ac_add_options --enable-leaky
ac_add_options --enable-debug-modules
ac_add_options --with-system-jpeg
ac_add_options --with-system-zlib
ac_add_options --with-system-png
ac_add_options --with-system-mng
ac_add_options --without-system-nspr
ac_add_options --with-gtk
ac_add_options --disable-verbose-config-defs

My setup is a i686, Linux 2.4.17 (compiled with gcc 3.0.2), glibc
2.2.4-2, and XFree86 4.1.0

I did the following without getting any crashes:

- Normal browsing, with full use of cookies (,

- Browsing with HTTPS (Sourceforge).

- Sent and received mail.

- Started up the editor on the Slashdot homepage.

- Used Chatzilla.

- Used the address book.

- Edited my bookmarks.

- All of the non-site Debug:Verification entries, except for the
  Java applet and JavaScript tests.

- All of View Demo, XBL Test Suite and XUL Test suite tests.
I redid my previous build, but with all debugging turned off.  I then redid
all of the tests which were done on the previous test, plus I viewed all of
the panels in the (non-mail) preferences window, plus popped up most of the
"extra/additional/advanced" windows from the prefs panels.  Nothing crashed
(except for a few things that also crashed with the 2001122408 build).
I've just come from bug 118783.

Jan09 2002 cvs built and ran fine for me on linux this way:

./configure --disable-debug --enable-optimize="-O -mcpu=k6"

I realize -O is the default now, but thought I'd drop this in here. I had tons
of compile successes, but segfaults on run, with a mixed bag of -O3 and
-fomit-frame-pointer both of which are said to cause problems in bug 118783.
Drop kicking this one off of 0.9.8 radar.
Target Milestone: mozilla0.9.8 → Future
This page may be useful in determining extra optimization flags that could give
a speedboost:

I plan on running some tests with gcc 3.0.3 to see which ones help/work with
Keywords: mozilla1.0+
Reassigning to asa per my conversation with him. We need some decision to be
made regarding this for 1.0.
Assignee: cls → asa
Removing from netscape landing page due to high risk of introducing an optimizer
change without a couple of milestones of testing. I suggest the mozilla1.0+ be
re-evaluated for this reason.
People have been testing -O2 builds for about a year now, and all issues seen
there had been fixed by around August 2001, it seems.
We have several people (including me) who have been building and running gcc
2.95.2/.3 -O2 builds on a daily basis for more than half a year. We also have a
bunch of people doing the same -O2 builds with gcc 3.0.x and I don't remember
that I have heard any big problem reports of them (-O3 may have more problems

As Comment #107 From Loz Hygate 2001-08-05 12:59 states, 0.9.3 was building and
running on both (gcc 2.95.x and gcc 3.0.x) compilers without problems, and that
never changed for milestones since that date.

I don't understand your argument of "high risk of introducing an optimizer
change without a couple of milestones of testing" for this reason, as I believe
that 7 milestone releases building and running without problems should count as
"a couple of milestones of testing", right?
-O2 has been in heavy use on Linux (and FreeBSD) for a long time.  I've been
using it for all my browsing and testing since 0.9.1-ish days.
I agree, I have been building with -O2 on Linux for ages and my builds have been
just as solid as nightlies - just much faster.

If we want this in 1.0 we should change this flag ASAP to prove to the sceptics
that this is really solid now. It should be trivial to switch off optimizations
in the unlikely event that it should cause problems for people. It does not seem
fair to all of us who have been trying these builds for such a long time to claim
that it is untested.
It appears that the Debian packages have been using -O2 since at least 0.9.7-3.
Before that they used -O3 from M18-1 onward, and -O2 for the single release of
M17-3. Here are the changelog entries that I'm drawing these conclusions from.
There is no mention of "-O" or "O1", "O2", or "O3" in any other changelog
entries, which is a very strong indication that the optimization level remained
the same in between.

mozilla (2:0.9.7-3) unstable; urgency=high

  * debian/rules:
    - Downgrade optimize level to 2 (-O3 to -O2)
      It will fix segfault with flashplugin, I've checked. (closes: #121404)
      Perhaps, also fix strange crashes. (closes: #126805, #126418, #127346)


mozilla (M18-1) unstable; urgency=low

  * ...
  * Changes CFLAGS again, to -O3 -pipe in hopes to make it faster


mozilla (M17-3) unstable; urgency=low

  * ...
  * Changed CFLAGS to -pipe -O2 to comply with policy 

It seems to me that "every user of the Debian mozilla package since M17" also
constitutes a fairly broad base of testers :) I've also used every Debian
mozilla package released since then (I stopped using the packages for a while,
but only because Debian wasn't releasing any at all) and I can personally
testify that I've never seen any apparently Debian-specific or
optimization-related issues.
Where can we read these logs?
Debian users can read them in /usr/share/doc/mozilla/changelog.Debian.gz (this
is a standard location for Debian changelogs).

If you aren't a Debian user, I don't think they're publicly available in a
terribly easy to use form, but you can find it in the following (huge) patch
document, which mozilla views with no real problems:

(search for mozilla-0.9.9/debian/changelog to get the part of the patch that
inserts the debian changelog - the whole changelog is in there)

Hope this helps.
The keyword "topperf" is certainly justified for this bug.
Keywords: perf → topperf
Adding syd to the cc:.
Christopher Seawood stopped putting up gcc2.95 -O2 builds on the ftp-server in
favor of gcc3.0 builds.  Wasn't this because 2.95 and -O2 was no longer
considered experimental?  FWIW I always build with the following without special

ac_add_options --enable-optimize="-O4 -finline -fno-omit-frame-pointer -march=k6
I've been using
ac_add_options --enable-optimize="-O3 -march=athlon -mcpu=athlon
-fno-omit-frame-pointer -maccumulate-outgoing-args"
with gcc 3.0 ever since 3.0.1 with no problems, except that plugins don't work.
I think that a reasonable approach to this bug for 1.0 is to offer an
'experimental' build with -O2. I certainly don't think that we'd hold 1.0 for
this change. 
Asa - that sounds like a reasonable compromise. Can you try to make sure such a
build is actually done? Also, will it be easy to check in a talkback report if
such a build was used? It would be nice if an official -O2 build was done daily
Just for the record, I've been using "-O3 -march=i686 -fno-omit-frame-pointer
-funroll-loops" for a couple of months without problems. Someone on the
newsgroups recommended it after having used those options for a while.
I wrote in to the gcc mailing list and asked if there was a way to avoid the gcc
3.0 requirement and got the following response:

>Yes, see the -static-libgcc option. Be aware that this is asking for trouble if 
>you ever want to throw exceptions across shared libraries, because there could be 
>different incompatible versions of the EH runtime linked in. But, IIRC, mozilla 
>does not use exceptions anyway so that may not be an issue.

Post 1.0/gcc 3.1 release we should try this flag and see how well it works.
Now using gcc 3.1 with -O3 -march=athlon -mcpu=athlon -fno-omit-frame-pointer
-maccumulate-outgoing-args -falign-functions=4 -fstrict-aliasing
-fbranch-probabilities and it works excellently!
> I think that a reasonable approach to this bug for 1.0 is to offer an
> 'experimental' build with -O2.

I agree.

> I certainly don't think that we'd hold 1.0 for this change. 

In that case though, it's nonsense to have this mozilla1.0+.
I tried yesterday's nightly build with -O3 -march=k6 -fexpensive-optimizations
-ffast-math using GCC 2.95.3 and it was marginally faster rendering the index of
the JDK API but slower rendering Google Image Search pages, when compared to -O2
-march=k6. Observations are based only on Mozilla's "Document: Done(...secs)"
output. I couldn't feel the difference.
To "end" this endless discussion:

Can we now try (e.g. "carpool", change default to -O2, kick all tinderboxen and
check if the Zilla still passes the smoketests) to make the default optimisation
-O2 in "trunk" and see if it works properly, please ?

We can "undo" this change at any point later if it causes too much trouble...
Keywords: mozilla1.1
I agree with Roland, it's time to try to turn this on by default. A LOT of
people run these builds daily, and it would be nice if the rest of the crowd
also got to see the speed improvements it brings. I hope the Mozilla folks trust
all of us who have tested this for so long...
It's not an issue of testing or trust at this point, but of the fact that to
reliably turn this stuff on, the default builds need to happen on newer
compilers (the current default builds happen on egcs 1.1.2, which is ancient). 
And switching to newer compilers brings in library dependency issues.  So the
blocking thing now is that a document needs to be written up describing these
issues, and the major stakeholders need to buy off on the changes.  Most likely,
this will involve adding a requirement for specific minimum library versions,
and (at the very least) supplying up-to-date links from which folks with various
common linux distributions can download them.
There used to be nightly builds produced on gcc2.95, so some build box should
have that compiler installed, and it should have made it clear what libraries
are required. And the newer libraries will probably make it easier for people to
install mozilla since the egcs-libraries are too old to be installed by default
on most up-to-date distributions.
It's simple.. Currently mozilla depends on, which
is an old library part of the old egcs 1.1.2.
A C++ program built with a current gcc (2.95.4) will depend on, which is part of that gcc.
As stated before, the old library is more problematic, because it's from an
obsolete compiler. As an example, the "new" libstdc++ has been in Debian since
As long as we're eventually going to be requiring a new set of libraries, are
there worthwhile performance gains to be had by switching to gcc 3.x? I think
once Red Hat, Mandrake, and United ship a distrib with 3.x, it's fair game for
apps to require. Red Hat's been shipping gcc3 libs for quite some time now.
Alias: O2
this is blocked by bug 158385, yes?
Well, it depends on moving to a gcc3.2 build, and that depends on that other
Depends on: 158385
Here are my 2 cents, even though my trouble is not about Linux.

The problem is funny: on BeOS (gcc 2.9 - 2.95), -O2 breaks the URL bar drop-down
functionality. It doesn't pop up automatically with autocomplete, although it can
still be brought to the front with an explicit click on the control.
FWIW, I turned -O2 on on brad a few hours back. ZDiff was +429752/-481140, for a
net win of 51388 bytes. Which isn't really significant.

Brad (the person) changed the machine's name just afterwards, so you can't see
the log, but it's at

The +/- is all over the place - minus where the compiler can optimise some
stuff better, and + where the compiler inlined code to make stuff faster for a
modest code size increase. (And 50K for the 7-10% speed improvement mentioned
elsewhere is definitely modest.) I did look into asm diffs for a couple of the
large wins, and I don't think that there's anything useful we can do with it from
a code pov.

That box is a trace-malloc box, so no perf numbers from the change over.

(sergei, 2.95 doesn't work with -O2 - there's a known compiler bug with that)
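
As a quick sanity check on the ZDiff figures quoted above, the net number follows directly from the two totals. This is only an illustrative sketch; the variable names are made up here, and reading "+" as bytes added and "-" as bytes removed is an assumption based on the explanation in that comment.

```python
# Totals quoted in the ZDiff comment above, in bytes.
# Assumption: "+" is code size added, "-" is code size removed.
bytes_added = 429752
bytes_removed = 481140

# A positive result means the -O2 build is smaller overall.
net_win = bytes_removed - bytes_added

print(f"net code size win: {net_win} bytes ({net_win / 1024:.1f} KiB)")
```

The result matches the 51388 bytes (about 50K) mentioned in the comment.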
OK, we've finally got the compiler that we need (gcc 3.2.3) to throw the -O2
switch in the default build.  If we could get -O2 turned on by 1.5 alpha, that'd
totally rock.
Assignee: asa → leaf
Nightly builds should start being built with -O2 on July 4th. Let's
hope there aren't any fireworks.

Accepting bug, in case we want configure to automatically use -O2 optimization
for --enable-optimize for sufficiently high GCC
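
The idea of picking the default flag from the compiler version could look roughly like the sketch below. It is not the actual configure logic: the function name and the (3, 2) threshold are assumptions for illustration, chosen only because this thread names gcc 3.2.3 as the compiler needed for -O2 and reports trouble with 2.95.

```python
import re

def default_optimize_flag(gcc_version: str, threshold=(3, 2)) -> str:
    """Hypothetical helper: return "-O2" for a new enough GCC, else "-O".

    The (3, 2) threshold is an assumption taken from this thread,
    not something configure is known to implement.
    """
    match = re.match(r"(\d+)\.(\d+)", gcc_version)
    if match is None:
        # Unparseable version string: stay with the conservative flag.
        return "-O"
    major, minor = int(match.group(1)), int(match.group(2))
    return "-O2" if (major, minor) >= threshold else "-O"

for version in ("2.95.3", "3.2.3", "3.3"):
    print(version, default_optimize_flag(version))
```

Anyone can still override the result by passing an explicit value to --enable-optimize, as several commenters here already do.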
gcc 3.2 vs. gcc 3.3 shootout:
Average (10 runs):
                      gcc 3.2 gcc 3.3
getElementById()        556ms   485ms
getElementsByTagName()  369ms   304ms
createElement()         988ms   866ms
getAttribute()          437ms   381ms
setAttribute()          408ms   342ms
Sum                    2758ms  2378ms
That's a whopping 14% faster.

Version: Mozilla 1.4
Build options: moz_official, -O2, -march=pentiumpro (won't run on pentium1/k6)
System: Debian unstable, Athlon XP 1600+ (1.4GHz)
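
For anyone who wants to check the sums and the 14% figure, here is a small sketch using only the averages posted above. The dictionary layout and variable names are illustrative. Note that "14% faster" here corresponds to the reduction in total time; expressed as a throughput ratio the gain comes out slightly higher.

```python
# Per-test averages in ms (10 runs), copied from the table above:
# (gcc 3.2, gcc 3.3)
timings_ms = {
    "getElementById()": (556, 485),
    "getElementsByTagName()": (369, 304),
    "createElement()": (988, 866),
    "getAttribute()": (437, 381),
    "setAttribute()": (408, 342),
}

total_gcc32 = sum(old for old, _ in timings_ms.values())
total_gcc33 = sum(new for _, new in timings_ms.values())

# Fraction of total time saved by the gcc 3.3 build.
time_saved = (total_gcc32 - total_gcc33) / total_gcc32
# Equivalent speedup factor (old time / new time).
speedup = total_gcc32 / total_gcc33

print(f"totals: {total_gcc32}ms vs {total_gcc33}ms")
print(f"time saved: {time_saved:.1%}, speedup factor: {speedup:.2f}x")
```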
We also might want to consider -Os (optimize for size).  Here are the
results for three different --enable-optimize configurations:

   -Os -mcpu=i686  (code scheduled for i686, will run on any x86)

Average (50 runs):
                        -O2     -Os     -Os/i686
getElementById()        431ms   426ms   430ms
getElementsByTagName()  390ms   390ms   394ms
createElement()         726ms   722ms   728ms
getAttribute()          452ms   453ms   455ms
setAttribute()          432ms   432ms   433ms
Sum                    2431ms  2423ms  2440ms

Size of all .so:       16979K  16127K  16092K

System: Red Hat 9, Pentium-III (1.133GHz)
Build config: MOZ_PHOENIX --disable-tests --disable-ldap
--enable-crypto --enable-plaintext-editor-only --disable-composer

Using -Os instead of -O2 gives almost identical performance for these
tests and reduces the footprint by 5%.
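
The 5% footprint figure and the "almost identical performance" claim can be reproduced from the table above with a short sketch like this one. The helper name pct_change is made up for illustration; the numbers are the posted sums and .so sizes.

```python
# Baseline (-O2) figures from the table above: total time in ms, .so size in K.
baseline_ms, baseline_kb = 2431, 16979

# The two -Os configurations from the same table.
configs = {
    "-Os": (2423, 16127),
    "-Os -mcpu=i686": (2440, 16092),
}

def pct_change(new: float, old: float) -> float:
    """Percentage change from old to new (negative means smaller/faster)."""
    return (new - old) / old * 100

for name, (total_ms, size_kb) in configs.items():
    print(
        f"{name}: time {pct_change(total_ms, baseline_ms):+.2f}%, "
        f"size {pct_change(size_kb, baseline_kb):+.2f}%"
    )
```

Both -Os variants stay within half a percent of the -O2 timings while shrinking the libraries by about 5%.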
Severity: critical → major
Priority: P3 → P2
Target Milestone: Future → mozilla1.5beta
Blocks: majorbugs
What is the current status of this?
Is -O2 enabled now for the nightly Linux builds?
about:buildconfig shows Firebird trunk nightly is using -Os (FB 20040101 Linux)
What about the 'Suite'? -Os or -O2 or -O1??
Compare bug 225433 (use -Os)
resolving in favor of 225433

*** This bug has been marked as a duplicate of 225433 ***
Closed: 17 years ago
Resolution: --- → DUPLICATE
Product: Browser → Seamonkey
No longer blocks: majorbugs