Closed Bug 53486 (O2) Opened 24 years ago Closed 20 years ago

Default linux MOZ_OPTIMIZE_FLAG is -O !!

Categories

(SeaMonkey :: Build Config, defect, P2)

x86
Linux

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 225433
mozilla1.5beta

People

(Reporter: dougt, Assigned: leaf)

References

Details

(Keywords: topperf)

Attachments

(6 files)

In mozilla/configure, the default optimization flag passed to gcc is -O. This needs a value of at least -O1, probably -O2. Leaf, can you check if the build machines have this set in the environment?
Severity: normal → critical
Keywords: nsbeta3
I think cls was looking at this. I just finished a build with -O2 that worked...trying -O3 next...
verification builds, and I believe tinderboxen, use -O
From egcs docs...

Options That Control Optimization
=================================

These options control various sorts of optimizations:

`-O'
`-O1'
    Optimize. Optimizing compilation takes somewhat more time, and a lot more memory for a large function.

    Without `-O', the compiler's goal is to reduce the cost of compilation and to make debugging produce the expected results. Statements are independent: if you stop the program with a breakpoint between statements, you can then assign a new value to any variable or change the program counter to any other statement in the function and get exactly the results you would expect from the source code.

    Without `-O', the compiler only allocates variables declared `register' in registers. The resulting compiled code is a little worse than produced by PCC without `-O'.

    With `-O', the compiler tries to reduce code size and execution time.

    When you specify `-O', the compiler turns on `-fthread-jumps' and `-fdefer-pop' on all machines. The compiler turns on `-fdelayed-branch' on machines that have delay slots, and `-fomit-frame-pointer' on machines that can support debugging even without a frame pointer. On some machines the compiler also turns on other flags.

`-O2'
    Optimize even more. GNU CC performs nearly all supported optimizations that do not involve a space-speed tradeoff. The compiler does not perform loop unrolling or function inlining when you specify `-O2'. As compared to `-O', this option increases both compilation time and the performance of the generated code.

    `-O2' turns on all optional optimizations except for loop unrolling and function inlining. It also turns on the `-fforce-mem' option on all machines and frame pointer elimination on machines where doing so does not interfere with debugging.

`-O3'
    Optimize yet more. `-O3' turns on all optimizations specified by `-O2' and also turns on the `inline-functions' option.
As an aside, I have a suspicion that specifying -O2 with -pedantic generates bad code, mostly because I've never been able to get a build to work with both flags turned on (it crashes in layout somewhere). Haven't taken the time to prove it, though...
My -O2 builds are also crashing. The last known module to be touched was editor. Now, the weird thing is that if I rebuild just editor/base adding -g to CFLAGS/CXXFLAGS, the build doesn't crash on start up. I'm going to try to narrow it to a particular file. Here's the beginning of the optimized trace:
(gdb) bt
#0 0x8361411 in ?? ()
#1 0x2c0270e2 in nsEditor::GetPriorNode () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libeditor.so
#2 0x2c06c23b in nsHTMLEditor::GetPriorHTMLNode () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libeditor.so
#3 0x2c037d88 in nsTextEditRules::WillInsert () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libeditor.so
#4 0x2c038841 in nsTextEditRules::WillInsertText () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libeditor.so
#5 0x2c0379cb in nsTextEditRules::WillDoAction () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libeditor.so
#6 0x2c05a202 in nsHTMLEditor::InsertText () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libeditor.so
#7 0x2c094954 in nsHTMLEditorLog::InsertText () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libeditor.so
#8 0x2b9f4d76 in nsGfxTextControlFrame2::SetTextControlFrameState () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libgklayout.so
#9 0x2b9f56a1 in nsGfxTextControlFrame2::SetProperty () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libgklayout.so
#10 0x2b955659 in nsHTMLInputElement::SetValue () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libgklayout.so
#11 0x2aed3589 in SetHTMLInputElementProperty () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/./libjsdom.so
    from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libmozbrwsr.so
#24 0x2b1e6790 in nsDocLoaderImpl::FireOnLocationChange () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/liburiloader.so
#25 0x2b59fc75 in nsDocShell::SetCurrentURI () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libdocshell.so
#26 0x2b59f55e in nsDocShell::OnNewURI () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libdocshell.so
#27 0x2b59fba3 in nsDocShell::OnLoadingSite () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libdocshell.so
#28 0x2b59c49b in nsDocShell::CreateContentViewer () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libdocshell.so
#29 0x2b5a6c59 in nsDSURIContentListener::DoContent () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libdocshell.so
#30 0x2b1e3906 in nsDocumentOpenInfo::DispatchContent () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/liburiloader.so
#31 0x2b1e34c1 in nsDocumentOpenInfo::OnStartRequest () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/liburiloader.so
#32 0x2b0fa84d in nsHTTPFinalListener::OnStartRequest () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libnecko.so
#33 0x2b0d9d40 in InterceptStreamListener::OnStartRequest () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libnecko.so
#34 0x2b0faa09 in nsHTTPServerListener::FinishedResponseHeaders () from /usr/cls/moz/opt-test/obj-opt-O3/dist/bin/components/libnecko.so
Both my -O2 and -O3 builds worked fine. cls, are you using --enable-pedantic?
Yes as --enable-pedantic is the default. I also discovered that if I compile _just_ editor/base/nsEditor.cpp with the additional -g, the crash goes away.
The attached patch makes --enable-optimize=-O9 work as expected. It also changes the optimize defaults to -O3 for mozilla if using gcc and to -O3 for nspr if using linux & autoconf.
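The flag handling described in that comment can be sketched roughly as follows. This is not the code from the attached patch: `resolve_optimize_flag` is a hypothetical helper, and it assumes the usual autoconf convention that a bare `--enable-optimize` arrives as `yes` and `--disable-optimize` as `no`. Any other value is passed through unchanged, which is what would let something like `-O9` reach the compiler as given.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the attached patch.
# $1: value given to --enable-optimize ("yes" when no value was supplied,
#     "no" for --disable-optimize, otherwise a literal flag such as -O9).
# $2: default flag to use when no explicit value was supplied.
resolve_optimize_flag() {
    value=$1
    default=$2
    case "$value" in
        no)     printf '%s\n' "" ;;           # optimization disabled
        yes|"") printf '%s\n' "$default" ;;   # bare --enable-optimize
        *)      printf '%s\n' "$value" ;;     # explicit value, passed through
    esac
}

resolve_optimize_flag yes -O    # prints -O
resolve_optimize_flag -O9 -O    # prints -O9
```

Keeping the default as a separate argument means the per-compiler choice (for example a different default for gcc than for other compilers) stays outside the parsing logic.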
I would recommend using -O2 as the default optimization level. The only difference between O2 and O3 is that O3 inlines much more. This often bloats the executables and even makes them slower due to cache issues.
I compiled the ns6 branch with -O2. It seems to work well.
I thought -O3 was supposed to make binaries faster, with bigger binaries as a side effect of the inlining. Could you elaborate on the slowdown due to cache issues? Currently, I'm building using:
env CFLAGS='-pipe -O2' CXXFLAGS='-pipe -O2' ../mozilla/configure --enable-nspr-autoconf --enable-mathml --enable-svg --disable-debug --disable-tests --disable-mailnews
This weekend, I was seeing intermittent crashes without recompiling, but it was crashing in gklayout.
Back trace from -O2 build with gklayout crash:
#0 0x83eda0b in ?? ()
#1 0x2bb19e6e in nsIBox::AddCSSPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#2 0x2bb1d371 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#3 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#4 0x2bb1fbee in nsSprocketLayout::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#5 0x2bb1d3a6 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#6 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#7 0x2bb1fbee in nsSprocketLayout::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#8 0x2bb1d3a6 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#9 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#10 0x2bb1fbee in nsSprocketLayout::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#11 0x2bb1d3a6 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#12 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#13 0x2bb1fbee in nsSprocketLayout::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#14 0x2bb1d3a6 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#15 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#16 0x2bb1fbee in nsSprocketLayout::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#17 0x2bb1d3a6 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#18 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#19 0x2bb1fbee in nsSprocketLayout::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#20 0x2bb1d3a6 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#21 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#22 0x2bb1fbee in nsSprocketLayout::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#23 0x2bb1d3a6 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#24 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#25 0x2bb1fbee in nsSprocketLayout::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#26 0x2bb1d3a6 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#27 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#28 0x2bb1fbee in nsSprocketLayout::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#29 0x2bb1d3a6 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#30 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#31 0x2bb1fbee in nsSprocketLayout::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#32 0x2bb1d3a6 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#33 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#34 0x2bb1fbee in nsSprocketLayout::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#35 0x2bb1d3a6 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#36 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#37 0x2bb1fbee in nsSprocketLayout::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#38 0x2bb1d3a6 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#39 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#40 0x2bb1fbee in nsSprocketLayout::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#41 0x2bb1d3a6 in nsContainerBox::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#42 0x2bb27dfb in nsBoxFrame::GetPrefSize () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#43 0x2bb1f1bf in nsSprocketLayout::PopulateBoxSizes () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#44 0x2bb1e2e3 in nsSprocketLayout::Layout () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#45 0x2bb1d670 in nsContainerBox::DoLayout () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#46 0x2bb2813f in nsBoxFrame::DoLayout () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#47 0x2bb19a18 in nsBox::Layout () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#48 0x2bb20e67 in nsStackLayout::Layout () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#49 0x2bb1d670 in nsContainerBox::DoLayout () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#50 0x2bb2813f in nsBoxFrame::DoLayout () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#51 0x2bb19a18 in nsBox::Layout () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#52 0x2bb27c52 in nsBoxFrame::Reflow () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#53 0x2bb18240 in nsRootBoxFrame::Reflow () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#54 0x2b9687a6 in nsContainerFrame::ReflowChild () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#55 0x2b9a2799 in ViewportFrame::Reflow () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#56 0x2b977ef6 in nsHTMLReflowCommand::Dispatch () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#57 0x2b992de8 in PresShell::ProcessReflowCommands () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#58 0x2b991ab1 in PresShell::FlushPendingNotifications () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#59 0x2b991b0d in PresShell::EndReflowBatching () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#60 0x2bd4fd1a in nsEditor::EndUpdateViewBatch () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libeditor.so
#61 0x2bd451de in nsEditor::EndPlaceHolderTransaction () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libeditor.so
#62 0x2bd7d889 in nsHTMLEditor::InsertText () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libeditor.so
#63 0x2bdb47c0 in nsHTMLEditorLog::InsertText () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libeditor.so
#64 0x2ba7c17a in nsGfxTextControlFrame2::SetTextControlFrameState () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#65 0x2ba7a71d in nsGfxTextControlFrame2::SetProperty () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#66 0x2b9e3fe9 in nsHTMLInputElement::SetValue () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libgklayout.so
#67 0x2aea4aa9 in ?? () from /usr/cls/moz/main/obj-opt-O2/dist/bin/./libjsdom.so
#68 0x2abe07e6 in js_SetProperty () from /usr/cls/moz/main/obj-opt-O2/dist/bin/./libmozjs.so
#69 0x2abd5a4c in js_Interpret () from /usr/cls/moz/main/obj-opt-O2/dist/bin/./libmozjs.so
#70 0x2abcf23d in js_Invoke () from /usr/cls/moz/main/obj-opt-O2/dist/bin/./libmozjs.so
#71 0x2abcf42f in js_InternalInvoke () from /usr/cls/moz/main/obj-opt-O2/dist/bin/./libmozjs.so
#72 0x2abe03eb in js_SetProperty () from /usr/cls/moz/main/obj-opt-O2/dist/bin/./libmozjs.so
#73 0x2abd5a4c in js_Interpret () from /usr/cls/moz/main/obj-opt-O2/dist/bin/./libmozjs.so
#74 0x2abcf23d in js_Invoke () from /usr/cls/moz/main/obj-opt-O2/dist/bin/./libmozjs.so
#75 0x2b4a4202 in nsXPCWrappedJSClass::CallMethod () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libxpconnect.so
#76 0x2b4a2bf1 in nsXPCWrappedJS::CallMethod () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libxpconnect.so
#77 0x2ab80f2e in PrepareAndDispatch () from /usr/cls/moz/main/obj-opt-O2/dist/bin/./libxpcom.so
#78 0x2ab8107e in nsXPTCStubBase::Stub8 () from /usr/cls/moz/main/obj-opt-O2/dist/bin/./libxpcom.so
#79 0x2b5a3d52 in ?? () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libmozbrwsr.so
#80 0x2b18fb90 in nsDocLoaderImpl::FireOnLocationChange () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/liburiloader.so
#81 0x2b5424b5 in nsDocShell::SetCurrentURI () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libdocshell.so
#82 0x2b541d9b in nsDocShell::OnNewURI () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libdocshell.so
#83 0x2b5423e3 in nsDocShell::OnLoadingSite () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libdocshell.so
#84 0x2b53eb93 in nsDocShell::CreateContentViewer () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libdocshell.so
#85 0x2b5481a0 in nsDSURIContentListener::DoContent () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libdocshell.so
#86 0x2b18caee in nsDocumentOpenInfo::DispatchContent () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/liburiloader.so
#87 0x2b18c5e7 in nsDocumentOpenInfo::OnStartRequest () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/liburiloader.so
#88 0x2b0b6a59 in nsHTTPFinalListener::OnStartRequest () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libnecko.so
#89 0x2b097570 in InterceptStreamListener::OnStartRequest () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libnecko.so
#90 0x2b0b67b5 in nsHTTPServerListener::FinishedResponseHeaders () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libnecko.so
#91 0x2b0b537d in nsHTTPServerListener::OnDataAvailable () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libnecko.so
#92 0x2b073638 in nsOnDataAvailableEvent::HandleEvent ()
#93 0x2b072f10 in nsStreamListenerEvent::HandlePLEvent () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libnecko.so
#94 0x2ab6f543 in PL_HandleEvent () from /usr/cls/moz/main/obj-opt-O2/dist/bin/./libxpcom.so
#95 0x2ab6f463 in PL_ProcessPendingEvents () from /usr/cls/moz/main/obj-opt-O2/dist/bin/./libxpcom.so
#96 0x2ab701ad in nsEventQueueImpl::ProcessPendingEvents () from /usr/cls/moz/main/obj-opt-O2/dist/bin/./libxpcom.so
#97 0x2b1d32af in event_processor_callback () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libwidget_gtk.so
#98 0x2b1d305d in our_gdk_io_invoke () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libwidget_gtk.so
#99 0x2b36baca in g_io_unix_dispatch () from /usr/lib/libglib-1.2.so.0
#100 0x2b36d186 in g_main_dispatch () from /usr/lib/libglib-1.2.so.0
#101 0x2b36d751 in g_main_iterate () from /usr/lib/libglib-1.2.so.0
#102 0x2b36d8f1 in g_main_run () from /usr/lib/libglib-1.2.so.0
#103 0x2b2955b9 in gtk_main () from /usr/lib/libgtk-1.2.so.0
#104 0x2b1d377e in nsAppShell::Run () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libwidget_gtk.so
#105 0x2af1b926 in nsAppShellService::Run () from /usr/cls/moz/main/obj-opt-O2/dist/bin/components/libnsappshell.so
#106 0x804de62 in main1 ()
The only difference between -O2 and -O3 is that -O3 turns on -finline-functions, which inlines all possible "small" functions. This makes the code larger and possibly faster. I say possibly, because on modern machines inlining isn't an obvious win like in the bad old days. Duplicating code (which is what inlining does) defeats the purpose of caches. This is something that will need to be benchmarked. For my machine (a dual PIII) I hypothesize that "-O2 -fomit-frame-pointer -march=i486 -mcpu=pentiumpro" will generate good code. One could possibly add -fstrict-aliasing there too, but I don't know if it is safe. I've personally added asm() constructs to the code that run on >=486 only, so -march=i486 lets the compiler use 486 instructions elsewhere. -mcpu=pentiumpro should schedule the code well for modern PII-like processors, but still be backwards compatible (in this case to 486). I'll try to build such a beast and see if it works.
Ok. So -fomit-frame-pointer was really bad. Crashed badly somewhere in XPConnect, I think. Let's scrap that; rebuilding without it.
The -O2 -march=i486 -mcpu=pentiumpro seems to hang on startup. Investigating.
reassigning to cls, our unix build guru.
Assignee: leaf → cls
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9
Um, yeah. If someone cares to review the patch, I can check in only the portion that allows you to tweak the -O level via --enable-optimize . Based upon my experience, the egcs compiler has issues compiling certain parts of Mozilla with anything above -O. Once I upgraded back to gcc 2.95.2, I was able to run a build that used -O2 & -O3 without a problem. So, since our official build platform is stock RH 6.x, I guess we're going to leave the default optimization level at -O.
So you aren't doing the @@ -2871,11 +2877,11 @@ and @@ -576,11 +582,11 @@ portions? Assuming you don't check in those portions, you can take an r=timeless, although I would think an r=leaf (cc) or r=granrose would be preferred. Is this bug critical? [no?] Is this bug platform PC? [no? there's alpha code in one of the patch blocks]
Keywords: approval, patch, review
to the footprint keyword with ya mate! lets get this reviewed so cls can get us some experimental builds for folks to pound on, and if all goes well start to get compiler and optimization flags updated to be the best that they can be for the next round of development
Keywords: footprint
The --enable-optimize=val patch has been checked in. I've put up a couple of egcs 1.1.2 & gcc 2.95.2 builds that were made from the same tree at http://www.seawood.org/mozilla/ . They were built on a RH6.2 + updates box. I've also made available the gcc 2.95.2 rpms, which install themselves with the /usr/gcc295 prefix. In addition to the mozconfig options, the builds were made with --disable-debug --enable-optimize=-O{2} --disable-tests .
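For anyone wanting to reproduce a set of comparison builds like these, the per-level configure invocations can be sketched as a dry run. The configure options are the ones quoted in this bug and the `obj-opt-*` directory naming follows the paths in the backtraces above; the helper name `print_configure_commands` and the loop are only an assumption about how one might script it, and the commands are printed rather than executed.

```shell
#!/bin/sh
# Dry-run sketch: print, rather than run, one configure command per
# optimization level, each intended for its own object directory.
# print_configure_commands is a hypothetical helper name.
print_configure_commands() {
    for level in "$@"; do
        printf 'mkdir -p obj-opt%s && (cd obj-opt%s && ../mozilla/configure --disable-debug --disable-tests --enable-optimize=%s)\n' \
            "$level" "$level" "$level"
    done
}

print_configure_commands -O -O2 -O3
```

Separate object directories keep the builds from sharing object files, so each tree reflects exactly one optimization level.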
cls@seawood.org - if the patch has been checked in, can we close this bug? Gerv
Depends on: 61501
Target Milestone: mozilla0.9 → mozilla1.0
The original bug is on changing the default to use at least -O2. The last time I tried building with -O2 & gcc 2.95.2, it crashed closing a dialog window (bug 61501). I don't think an egcs 1.1.2 -O2 build will get that far. So unless we're going to punt on changing the defaults, the bug should probably remain open.
-fomit-frame-pointer will crash because of the xpconnect asm stuff. I've been running --enable-debug -O2 builds with gcc-2.96 for a while without problems. I haven't quantified the speed differences, but it's noticeable compared to -O.
jrgm, can you spin a build and see if it makes the page loader time drop... We really ought to run all the perf tests that we have against an -O2 build. How about we flip the switch for a day in the release builds or create some experimental builds and get lots of people pounding on things all over the product.. leafer,seawood, good idea? cathleen, let's get this on the hot list for the performance meeting cuz it's relatively easy to turn on/off and might be a good perf win. dogut/valeski/waterson, are the gamera folks using -O2?
Worksforme. I'd suggest throwing the switch for Thursday or Friday's builds so that we avoid the couple of days of milestone breakage. I'm also throwing in my vote for upgrading the release compiler to gcc 2.95.2+ while we are changing things.
temp/MyHTML1.gif I made two builds, both pulled from the trunk at approx. 8pm this evening, setting --disable-debug, --disable-tests and --enable-optimize=-O1 or --enable-optimize=-O2. I ran them both through a couple of page loading test series, getting similar results with each build for each run. There appears to be about a ~9% gain from setting '-O2' for "already cached" loads of a page. For the "initial visit", I don't see any significant change in page load times. I'll attach a graph of "already cached" load times, comparing -O1 and -O2.
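For reference, a gain figure of this kind is the relative drop in load time between the two builds. A minimal sketch of that arithmetic, assuming a hypothetical helper `percent_gain` and made-up timings (the raw numbers behind the ~9% are not given in this bug):

```shell
#!/bin/sh
# Hypothetical helper: percentage reduction in load time going from a
# baseline build ($1) to a candidate build ($2), in any consistent unit.
percent_gain() {
    LC_ALL=C awk -v base="$1" -v cand="$2" \
        'BEGIN { printf "%.1f\n", (base - cand) / base * 100 }'
}

# Made-up numbers for illustration only, not measurements from this bug.
percent_gain 1000 910    # prints 9.0
```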
very cool.. maybe we should flip the optimization level for 0.9...
chofmann, yes, let's do that. what's the safest way? carpool? I think we'd want to be able to identify regressions caused by the optimizer, if any.
Don't forget that changes to autoconf.mk don't cause a rebuild on unix (bug 72018, WONTFIX), so to see the improvement (and pick up any regressions) people (and the tinderboxes) doing optimised builds will have to clobber.
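One way to avoid forgetting the clobber is to record the flags an object directory was last built with and compare them before the next build. The sketch below is only an illustration: `needs_clobber`, `record_flags`, and the `.optimize-flags` stamp file are hypothetical names, not part of the Mozilla build system.

```shell
#!/bin/sh
# Hypothetical sketch: detect that the optimization flags changed since the
# last build of an object directory, which means a clobber is needed.
# $1: object directory, $2: flags for the build about to start.
needs_clobber() {
    objdir=$1
    flags=$2
    stamp="$objdir/.optimize-flags"    # assumed stamp file name
    # No stamp yet: nothing was built with known flags, so no clobber.
    [ -f "$stamp" ] || return 1
    # Clobber only when the recorded flags differ from the requested ones.
    [ "x$(cat "$stamp")" != "x$flags" ]
}

# Record the flags used for the build in the object directory.
record_flags() {
    mkdir -p "$1" && printf '%s\n' "$2" > "$1/.optimize-flags"
}
```

A build wrapper would call `needs_clobber "$objdir" "$flags"` first, clobber if it succeeds, and call `record_flags "$objdir" "$flags"` once the build starts.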
Note that jrgm is using gcc 2.96 for his tests. I'd be interested to see the results on a RH6.2 + egcs 1.1.2 system since that's what the release boxes are.
put this dude on the landing plan http://komodo.mozilla.org/planning/branches.cgi seawood, should we try and get all the release build systems on gcc2.96 rh7.0 in the same step.. we need to get closer to the latest compilers as soon as possible to start picking up the ron guilmette fixes..
I've been building with -O2 on a RH6.2/egcs1.1.2 system for weeks with no trouble...
I've made builds for weeks now with gcc 2.95.3, -O2 and -O3, without problems. I hope this will be included in 0.9.
It would be awesome if we could get this into 0.9. What has to happen?
roc, time travel. Given past, non-immediately obvious problems when compiling with egcs -O2, I don't feel comfortable making the switch the day after we've closed for the 0.9 stabilization period. But if drivers feels otherwise, then we'll throw the switch (and pray for stability). Hofmann, I know we want to upgrade to a more recent compiler but I think jumping to rh 2.96 would be a bit hasty. Afaik, there is no other vendor supporting that particular compiler/libstdc++ combination so we would bear the full brunt of making the runtime libstdc++ library available to the end-users. RH2.96's libstdc++ is not binary compatible with any other released version of libstdc++. Gcc 2.95.3 would be a better choice from a user standpoint imo. We'd still have to roll our own rpms for the build systems but since libstdc++ is still binary compatible with the one from gcc 2.95.2, which multiple vendors distribute and/or support, we would not have the additional library distribution burden. And of course, I'd suggest waiting until after the 0.9 release before upgrading the boxes.
I agree with Chris Seawood 100% on both issues.
Makes sense to me.
I would also vote for gcc-2.95.3. I have been using it since it came out for building Mozilla, and I haven't yet found a problem that I can relate to the compiler.
Agreement again; the 2.95.3 release has given me the fewest (zero so far) funky compiler-attributable problems of the 2.9x series to date. Perhaps a bit too bleeding-edge to declare it the new official Mozilla-happy compiler though?
> are the gamera folks using -O2? I asked -- they say they are using -O2
Has anyone tried playing with adding -funroll-loops to -O2? Seems like that could be an interesting experiment...
I made a build with -O3 -funroll-loops -march=pentium. It works and seems to be very speedy, but I didn't take any performance measurements. Can somebody do that, please?
I made another build with all these on: -O3 -funroll-loops -fexpensive-optimizations -march=pentium -mcpu=pentiumpro -fstrength-reduce -fschedule-insns2 -frerun-cse-after-loop -fthread-jumps -fcse-follow-jumps -fcse-skip-blocks Compiles well and runs REALLY fast.
WRT compiler runtimes, would it be good to build the nightlies with a statically linked C++ library? I don't know how much this would bloat the code, but it should make it possible for anyone to run mozilla w/o having to worry about which version of g++ we used. I know you can do this by building gcc with the --disable-shared flag. I don't know how to do it if you did not build gcc with that flag, but I bet cls does.
Yesterday i clobbered, pulled, built with -O2 and suddenly trying to compose a message always crashed. ("New Msg" and "Reply" or whatever.) Pulled again, rebuilt to make sure it wasn't mid-checkin mess: same thing. No bug resembling this was reported lately. Clobbered yet again and built *without* -O2 and the crash vanished. In between pulls i could see no checkin that would have affected this.
R.K.Aa: could you post disassembly of the destructor code in both -O2 and -O1 versions? It'd be interesting to see if we can nail down the compiler error, or see if it is, in fact, just sloppy Mozilla code that we've been lucky with so far.
clobbered again and building with -O1 now
No crash with -O1. Shaver: Mailed you a Q. Reply? I'll clobber and rebuild again with -O2 to do this thoroughly, but need advice.
Does anyone know if the gcc295 nightlies are built with -O or -O2? It would be nice if at least those would switch to -O2 soon, and they would also be a nice way to test how well that works... (I know I would be one of the first people to see them crashing, as I'm downloading those builds daily...)
Yeah, it would be great if one of the tinderboxen were configured to use -O2 (or even higher, why not?) and the builds could be pushed out for people to try.
I talked to chofmann earlier this week and we're going to wait until after the 0.9 bins have been built to upgrade the release machines to 2.95.3. All optimized nightly builds use the default --enable-optimize. It's not worth hacking the nightly automation to switch to -O2 for specific builds only and remove the potential for conflicting --enable-optimize options when we're going to throw the switch anyway next week. The tinderbox builds are not delivered to anyone. That's a topic for a separate bug if people are interested in doing that.
re gcc2.95.x: Beonex Communicator 0.6-pre Linux release builds are compiled with gcc2.95.2 with --enable-optimize. I had reports that the binary crashed on some systems at startup. Presumably, they were all RH systems. I have the following in the install instructions <http://www.beonex.com/communicator/version/0.6/install/unix>: "Redhat 6.1 or later: Install the std. C++ libraries for gcc 2.95 from Mandrake <http://www.beonex.de/download/communicator/0.6/req/libstdc++-2.95.2-7mdk.i586.rpm> (100 kB). On RedHat 7.0, you have to force installation (rpm -i --force filename)." This seems to fix the problem. The --force is because otherwise, rpm will complain about downgrading (libstdc++ 2.96 is already installed). Mozilla doesn't work on Redhat 6.0 and earlier at all, IIRC. I couldn't even find a Redhat libstdc++ 2.95 package at that time.
> I had reports that the binary crashed on some systems (before I put up the install instructions) > otherwise, rpm will complain about downgrading Note that rpm will not downgrade if you force installation. Both libstdc++ 2.95 and libstdc++ 2.96 will be installed. If you don't force, installation will be rejected. So, this procedure most likely will *not* hork the RH7.0 installation.
> The only difference between -O2 and -O3 is that -O3 turns on -finline-functions, which inlines all possible "small" functions. In gcc-3.0 prereleases (has anybody tested mozilla with them? The more we stress test the new gcc before it's released, the better.), -O3 also turns on -frename-registers. From the info file: `-frename-registers' Attempt to avoid false dependancies in scheduled code by making use of registers left over after register allocation. This optimization will most benefit processors with lots of registers. It can, however, make debugging impossible, since variables will no longer stay in a "home register".
I've been building with -O2 on x86 with the official GCC 2.95.3 for about a month now -- no problems seen.
How about introducing some "gcc" keyword? Ever more people will be building with gcc 2.95 / 2.96. I use 2.96 and see a few crashes others don't. One crash I did report was resolved as invalid because I didn't use the official egcs, but later turned out to be a blocker. The crashes may not always be that obvious, but until further notice I've stopped reporting them. Question is: Should I stick to the nightlies and shaddap about what I see in my own CVS builds, or not? If not: some policy and means of sorting these bugs might be an idea.
gcc 2.96 is not an official gcc compiler (there never was a real gcc 2.96), just something like a nightly build taken by RedHat, who also added some bug fixes. Because of that, IMHO gcc2.96 is by definition buggy, so we shouldn't care too much about it. On the other hand, we should try the gcc 3 builds, which should be pretty close to what the real gcc3 will be. If a gcc bug is found, it could be posted to the developers, and we make sure that Mozilla's potential problems with gcc3 will be solved much earlier. I hope that gcc3 gives quite an additional performance boost on Linux, which mozilla always enjoys :-)
OK - then i won't file gcc2.96RH bugs.
Status: ASSIGNED → NEW
This morning's builds had a bug ( bug 80746 ) which may have led to a Bugzilla user inadvertently changing this bug from the Assigned/Accepted status to the New status. If you are the owner of this bug please check to see that it is in the correct Status. Thanks.
I know that there were problems when we tried to upgrade compilers, but I'm nominating for 0.9.2 anyway. A 9% improvement is nothing to sneeze at...
Keywords: mozilla0.9.2
Mandrake is using gcc 2.96 ;) 9% is nothing to sneeze at on my p233 w/ 96M. Any major changes in the optimization scheme from 2.95.x to 3?
Recent builds are showing problems (password dialogs return a blank password, somehow) when compiled with -O2 on RedHat 7.1. Goes away if I use -O instead.
I'm seeing no problems in my -O2 builds on Linux, compiling on Debian Unstable with gcc-2.95.4. I can say there's a definite performance increase - the UI speeds up considerably; although it's not enough to match native GTK menus (with -O2 we're still twice as slow as X-Chat).
Blocks: 71874
Ok, so can we get some feedback from people using builds made with egcs 1.1.2 & -O2? The release boxes are still stuck with that compiler and given the last upgrade fiasco, it'll be some time before that changes (someone prove me wrong, please :-P).
cls: isn't bug 79681 (statically linking libstdc++) the only real obstacle to upgrading the compilers for real? And it looks like much of the work on that bug has been done...
That and the legal issue that goes along with bug 79681 (just updated).
Depends on: 79681
With -O2 on rh7.1, I can confirm bryner's issue with the mailnews password dialog. Sigh.
The password dialog problem is now bug 83388. There's a workaround in that bug.
Depends on: 83388
I see a crash when clicking on "View" in the History window's menu, if Mozilla is compiled with gcc 2.95.4 and -O[23]. -O works. I've been building with -O3 for some time now without experiencing other problems.
Reproducible crash with gcc2.95.2 built with -O2: Go to www.w3c.org/Style/CSS, switch stylesheet. Produces segfault.

I investigated a bit with different optimisation settings. Here's the result. From gcc toplev.c:

    if (optimize >= 2)
      {
        flag_cse_follow_jumps = 1;
        flag_cse_skip_blocks = 1;
        flag_gcse = 1;
        flag_expensive_optimizations = 1;
        flag_strength_reduce = 1;
        flag_rerun_cse_after_loop = 1;
        flag_rerun_loop_opt = 1;
        flag_caller_saves = 1;
        flag_force_mem = 1;
    #ifdef INSN_SCHEDULING
        flag_schedule_insns = 1;
        flag_schedule_insns_after_reload = 1;
    #endif
        flag_regmove = 1;
      }

although flag_schedule_insns is switched off in i386.c. So I built with -O1 and an increasing number of flags switched on. This led to a surprising result: you can enable all of the -O2 flags and not see the crash. Similarly, you can enable -O2 and then disable all of the optimisations with -fno-* and the crash will still occur. Unfortunately I couldn't find anywhere else where additional options are enabled. However, this suggests that it may be worthwhile explicitly switching on flags that are known to work rather than using -O2. Running some test builds with gcc-3.0 next.
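For anyone who wants to repeat the "-O1 plus an increasing number of flags" experiment, here is a rough Python sketch that just prints the option strings to try, one per build. The option spellings are an assumption on my part (derived from the usual flag_foo_bar to -ffoo-bar naming of the toplev.c variables quoted above), the two scheduling flags are left out for brevity, and `incremental_flag_sets` is a made-up helper name, not anything in the Mozilla build system.

```python
# Hypothetical helper for the flag-bisection experiment described above.
# ASSUMPTION: option names are derived from the toplev.c variable names
# (flag_foo_bar -> -ffoo-bar); verify them against your GCC's documentation.
O2_FLAGS = [
    "-fcse-follow-jumps",
    "-fcse-skip-blocks",
    "-fgcse",
    "-fexpensive-optimizations",
    "-fstrength-reduce",
    "-frerun-cse-after-loop",
    "-frerun-loop-opt",
    "-fcaller-saves",
    "-fforce-mem",
    "-fregmove",
]


def incremental_flag_sets(base="-O1", flags=O2_FLAGS):
    """Yield base, base + first flag, base + first two flags, and so on."""
    for count in range(len(flags) + 1):
        yield " ".join([base] + flags[:count])


if __name__ == "__main__":
    # One line per test build; the first build that crashes points at
    # the flag that was added last.
    for options in incremental_flag_sets():
        print(options)
```

Each printed line would be the optimization setting for one test build. Given the result reported above (all flags on, still no crash), a clean run of the whole list would not be surprising.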
I'll look into the segfault. However, note that just because a certain optimization level causes a crash it _doesn't_ mean there's a bug in the compiler. Certain application bugs can be hidden or exposed by differing optimization levels. It is a useful data point, though.
From the GCC 2.95.2 documentation at http://gcc.gnu.org/onlinedocs/gcc-2.95.2/gcc_2.html#SEC31 ------------- -mcpu=cpu type Assume the defaults for the machine type cpu type when scheduling instructions. The choices for cpu type are: `i386' `i486' `i586' `i686' `pentium' `pentiumpro' `k6' While picking a specific cpu type will schedule things appropriately for that particular chip, the compiler will not generate any code that does not run on the i386 without the `-march=cpu type' option being used. `i586' is equivalent to `pentium' and `i686' is equivalent to `pentiumpro'. `k6' is the AMD chip as opposed to the Intel ones. -march=cpu type Generate instructions for the machine type cpu type. The choices for cpu type are the same as for `-mcpu'. Moreover, specifying `-march=cpu type' implies `-mcpu=cpu type'. --------------- We should be using (IMHO) -O2 -mcpu=pentiumpro -march=pentium This will select instructions for a pentium (not a 486, and not the default 386!). It will schedule them for best performance on a pentiumpro (which is what the PII and PIII are based on). Alternatively, we could use just -O2 -march=pentium (which implies -mcpu=pentium). Note: using these options will make use on a 486 probably impossible. If someone wants a 486 build, they should make a separate build. I'm not even going to think about 386.
If you look at the output of egcs -fverbose-asm -S hello.c , you'll see that it already defaults to using -march=pentium. I get a similar result with gcc 2.95.3. If you want to change the scheduling based upon ${target_cpu}, that's fine but we shouldn't hardcode it across the board.

    # GNU C version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release) (i386-redhat-linux) compiled by GNU C version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release).
    # options passed: -fverbose-asm
    # options enabled: -fpeephole -ffunction-cse -fkeep-static-consts
    # -fpcc-struct-return -fcommon -fverbose-asm -fgnu-linker -fargument-alias
    # -m80387 -mhard-float -mno-soft-float -mieee-fp -mfp-ret-in-387
    # -mschedule-prologue -mcpu=i386 -march=pentium
Well, that has -mcpu=i386 -march=pentium. That generates instructions that (may) only run on a pentium, but optimizes for running on a __386__. Either the documentation is majorly fubar (possible), or that's a silly set of options. For x86 machines with GCC, I still think -O2 -mcpu=pentiumpro -march=pentium makes lots of sense. If you think it doesn't matter, here's the difference between -O2 and -O2 -mcpu=pentiumpro for a truly trivial function. My point is that the optimization IS different (vastly in some cases).

    void hello() { }

    -mcpu=i386:
    hello:
            pushl %ebp
            movl %esp,%ebp
            leave
            ret

    -mcpu=pentiumpro:
    hello:
            pushl %ebp
            movl %esp,%ebp
            movl %ebp,%esp
            popl %ebp
            ret

FYI, 2.95.2 under freebsd 4.1 yields:

    # GNU C version 2.95.2 19991024 (release) (i386-unknown-freebsd) compiled by GNU C version 2.95.2 19991024 (release).
    # options passed: -fverbose-asm
    # options enabled: -fpeephole -ffunction-cse -fkeep-static-consts
    # -freg-struct-return -fsjlj-exceptions -fcommon -fverbose-asm -fgnu-linker
    # -fargument-alias -fident -m80387 -mhard-float -mno-soft-float -mieee-fp
    # -mfp-ret-in-387 -mno-fancy-math-387 -mschedule-prologue -mcpu=i386
    # -march=pentium
cls: "If you look at the output of egcs -fverbose-asm -S hello.c , you'll see that it already defaults to using -march=pentium." On Red Hat (and therefore the Linux build boxen), yes. Not everywhere, though. Regarding "-mcpu=pentiumpro", this will reportedly produce code running slower on non-Intel processors than code compiled with "-mcpu=i386". YMMV. Generally, the -mcpu savings in real world applications similar to mozilla are slim, and the issues tangled. I'd expect someone proposing this change to give hard numbers on the usual page load time benchmarks for different -mcpu settings running on at least the 3 most popular subarches. In the meantime, let's constrain ourselves to -O2, and ironing out the bugs uncovered by it (in mozilla or gcc). This provides a much more definite win across the board.
Very good point - using -O2 is far more important than lesser quibbles, especially if the primary systems using gcc are defaulting to -march=pentium. (I'd assumed it was defaulting to -march=i386).
General note as it came up: recent snapshots of GCC-3.0 won't compile mozilla (amongst other things). It's a general problem with c++ headers that's discussed at http://gcc.gnu.org/ml/gcc/2001-06/threads.html#00197
Contacted fcrozat to find out the flags used for mandrake builds which i believe are optimized with at least -O3 ntm for i586 and using eh some recent form of gcc (the infamous 2.96-[something indicating a recent cvs snapshot]) and glibc 223 and everything else under the sun which i dont care to ramble on about. PS I speak of cooker for those mdk users who are bewildered at why they see no 091. And, uh, do u want smoketest/other test results or a trace if it crashes? Can i even do an effective trace with such opt? Anyway...i shutup now -Blue
Scratch that, im stupid. Just found out NOT optimized beyond the -O default, since standard mdk opt (now pretty sure it's -O3) compiled, but crashed on startup :). Sorry for confusion but still want to know what tests and data would help.
cc'ing... hey cls, how'd you make that spiffy graph? I have a P2/266 & 256M, I can do some other comparisons on a lower-end machine if that'd help at all.
Actually, that's jrgm's graph and I believe the tools are only available inside the netscape firewall.
Offtopic: If you want graphs like that, try out Grace (formerly xmgr or ACE/gr) from http://plasma-gate.weizmann.ac.il/Grace/ It's a bit quirky to get used to, but extremely powerful once you get over the initial learning hump. Produces similar graphs to the 04/15/01 01:46 image attachment. Oh, and you need Lesstif.
is bug 82962, "Crash on switching character coding", related to this? From http://bugzilla.mozilla.org/show_bug.cgi?id=82962 ------- Additional Comments From cls@seawood.org 2001-06-14 14:42 ---- I'm pretty sure that this is a -O2 problem. My gcc 2.95.3 -g build can switch encodings without a problem. My gcc 2.95.3 -O2 build crashes. I can't see the stack because it locks my X session. The gcc295 nightlies are built using gcc 2.95.3 -O2 as well. I'll fire off gcc 2.95.3 -O & egcs 1.1.2 -O2 builds to verify.
Depends on: 82962
When I'm debugging an X program that hits a breakpoint while a menu is posted, I run the debugger on one X server and send the app to a different X server that is nearby. I use the gdb command: set env DISPLAY othersys:0 before run.
Even better (if you have one machine): Start a second X server (such as xinit -- :1 -bpp 24) setenv DISPLAY :1.0
Depends on: 80988
> Regarding "-mcpu=pentiumpro", this will reportedly produce code running slower > on non-Intel processors than code compiled with "-mcpu=i386". YMMV. I'd like to test that. I use an AMD Duron. > Generally, the -mcpu savings in real world applications similar to mozilla are > slim, and the issues tangled. I'd expect someone proposing this change to give > hard numbers on the usual page load time benchmarks for different -mcpu > settings running on at least the 3 most popular subarches. How can I benchmark Mozilla objectively? General note: Thanks to all here for the comments, many of which were very interesting for me, even if strictly offtopic (not about the -O2 flag).
WRT debugging X programs: Xnest and Xvnc are also helpful programs. In particular, Xvnc is great if you have to debug on a remote machine and the network link is slow.
Has anyone dared the -O9 with gcc-2.95.3 or gcc-3.0?
hehehe I tried once. Barely got past the first file and that's with some recoding as i went.
FYI, AFAIK nothing over -O3 is actually any different from -O3. -O[4-9] appear to be there for forward compatibility.
which reminds me, it was actually (probably) a compiler problem that i was having... what is up with this bug? anything new? gonna switch default flag to -O3 or something? :P mcafee has been fooling around with comet and now coffee in relation to optimization but i dunno what it is he's been up to, ns and 'active' mozilla partners probably have some idea...
Moving to -O2 (not -O3 or other various funky compiler options) is waiting on verification of the libstdc++ license for 2.95.x so that we can link statically against libstdc++ (bug 79681). Once we can statically link against libstdc++, then we can upgrade the release compilers to gcc 2.95.x. And since I know someone is going to ask, moving to gcc3 creates another shared library dependency problem; this time with libgcc_s.so.1 . I'm not sure if we would be able to work around that with the same hack proposed in bug 79681 .
I'm posting from 0.9.1 compiled under gcc 3.0. I used -O3 -march=athlon -mcpu=athlon. The only file to complain (segv'd gcc) was layout/html/forms/src/nsFormControlHelper.cpp. It worked fine when I did this file with -O2. The gcc error was:

    nsFormControlHelper.cpp: In static member function `static void nsFormControlHelper::PaintFixedSizeCheckMark(nsIRenderingContext&, float)':
    nsFormControlHelper.cpp:731: Internal error: Segmentation fault

I've raised a bug with the gcc peeps. The performance of this thing seems pretty snappy, though it's a bit unfair for me to compare with my last build of 0.9.1, as I've recompiled my entire machine from scratch with athlon optimized binaries :-) Loz
I've done some testing on my gcc 3 -O3 build and it crashes on viewer demos 4 and 9 (simple tables and frames). The errors for #4 are:

    Gdk-ERROR **: BadValue (integer parameter out of range for operation)
      serial 3279 error_code 2 request_code 12 minor_code 0
    Gdk-ERROR **: BadValue (integer parameter out of range for operation)
      serial 3280 error_code 2 request_code 12 minor_code 0
    Going to create the event queue
    gModeSwitchBit = 0x0

And for #9 (which crashes at the second attempt):

    Gdk-ERROR **: BadValue (integer parameter out of range for operation)
      serial 17077 error_code 2 request_code 12 minor_code 0
    Going to create the event queue
    gModeSwitchBit = 0x0

If anyone wants more diagnostics and can tell me how to get them, I'm happy to oblige. cheers Loz
Loz benchmarked these two Mozillas/systems with the Viewer demos in the debug menu. He found speed improvements between -4% and 190%, average 32%, plus 2 reproducible crashes.
I felt a bit guilty about the benchmarks comparing a -O3 build on a gcc3 system with a -O2 build on a pgcc system. So I knocked up a -O2 build on my gcc3 system to do a real comparison. (Both builds are now 0.9.2.) I tested by taking the times for each of the debug->viewer demos off the status bar. Both builds also had -mcpu=athlon -march=athlon set.

Both builds crashed on tests 4 and 9. (I'm building an i386 build to see if that is the problem; I'll try a -O build after. If that still crashes I'm guessing it's either gcc3 or some other external program on this new system.) Tests 14 and 15 were ignored, as they use a net connection. The -O3 build sometimes came up with non-numeric characters (;:<) in its time (always for the first run of test 10).

I ran each test three times in a row, then moved on to the next. The only exception to this was moving from test 10 to 11, where I cleared the screen (800x600), as test 10 uses 100% cpu while in action.

The results were: O2 tests are on the left, O3 on the right. The % was calculated as (avg(O2) - avg(O3)) / avg(O2).

            ---------- O2 -----------   ---------- O3 -----------
    test    time1  time2  time3  avg    time1  time2  time3  avg         %
    0       0.229  0.138  0.122  .1630  0.131  0.127  0.125  .1276   21.00
    1       0.177  0.089  0.089  .1183  0.196  0.094  0.094  .1280   -8.00
    2       0.144  0.108  0.084  .1120  0.179  0.086  0.085  .1166   -4.00
    3       0.125  0.106  0.104  .1116  0.149  0.126  0.126  .1336  -19.00
    5       0.418  0.352  0.361  .3770  0.492  0.36   0.358  .4033   -6.00
    6       0.226  0.149  0.172  .1823  0.205  0.153  0.152  .1700    6.00
    7       0.103  0.113  0.082  .0993  0.109  0.083  0.081  .0910    8.00
    8       0.536  0.575  0.557  .5560  0.557  0.574  0.566  .5656   -1.00
    10      0.163  0.104  0.103  .1233  0.14   0.097  0.097  .1113    9.00
    11      0.116  0.089  0.088  .0976  0.092  0.09   0.09   .0906    7.00
    12      0.123  0.104  0.127  .1180  0.132  0.126  0.128  .1286   -8.00
    13      0.334  0.097  0.09   .1736  0.344  0.095  0.096  .1783   -2.00
    16      0.168  0.153  0.169  .1633  0.159  0.162  0.155  .1586    2.00

The conclusions? The testing is too imprecise to really show anything..., but it looks like swings and roundabouts. Ack well, back to my compiling.
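In case anyone wants to recheck the % column, here is a small Python sketch of the calculation described in the comment above. Only the first two rows are copied in; the function names are made up for illustration. Comparing against the posted column, the figures look like they were truncated to whole percent rather than rounded, but that is my reading of the numbers, not something stated in the comment.

```python
import math


def mean(values):
    """Arithmetic mean of a sequence of timings."""
    return sum(values) / len(values)


def percent_faster(o2_times, o3_times):
    """Improvement of -O3 over -O2 as a percentage of the -O2 average.

    Positive means the -O3 build was faster, negative means slower.
    """
    avg_o2 = mean(o2_times)
    avg_o3 = mean(o3_times)
    return 100.0 * (avg_o2 - avg_o3) / avg_o2


# Timings for tests 0 and 1, copied from the comment above:
# (O2 runs, O3 runs)
runs = {
    0: ((0.229, 0.138, 0.122), (0.131, 0.127, 0.125)),
    1: ((0.177, 0.089, 0.089), (0.196, 0.094, 0.094)),
}

if __name__ == "__main__":
    for test, (o2, o3) in runs.items():
        pct = percent_faster(o2, o3)
        # Show both the computed value and its whole-percent truncation.
        print(test, round(pct, 2), math.trunc(pct))
```

With only three runs per test and a first run that is often much slower than the other two, the average is dominated by warm-up effects, which fits the "too imprecise" conclusion.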
> > Regarding "-mcpu=pentiumpro", this will reportedly produce code running slower
> > on non-Intel processors than code compiled with "-mcpu=i386". YMMV.
>
> I'd like to test that. I use an AMD Duron.

I did so.

1. Test setup

1.1. Procedure
I "reload"ed (via a simple click on the button) <http://gcc.gnu.org/onlinedocs/gcc-2.95.3/gcc_2.html#SEC2> 8 times. Each time, I wrote down the time displayed in the status bar. Before that, I set the cache to "Once per session" and reloaded the document 3 times to eliminate network delay.

1.2. Config
Mozilla 0.9.2 as Beonex Communicator, OFFICIAL, gcc 2.95.4, -O2, disable-debug, enable-strip-libs
normal: CFLAGS=""
ppro: CFLAGS="-mcpu=pentiumpro -march=pentiumpro"

1.3. Machine
AMD Duron 650, Via KT133, 256MB PC100
Note: Not an Intel!

Results:
normal: 5.181 ms
ppro: 4.493 ms
Standard deviation is < 0.01 ms in both cases.
I tried -O3 (which "inlines functions") too. I also have some additional remarks on the test.

1.1. Procedure
Mozilla seems to do DNS lookups nevertheless, so I reran some of the tests with a local HTML file. Most of the results were similar. I also measured the size of a distribution, uncompressed and bzip2 compressed.

1.2. Config
O3: CFLAGS="-mcpu=pentiumpro -march=pentiumpro" --enable-optimize=-O3

1.3. Machine
Debian unstable

2. Results

2.1. Performance
ppro is 13.27% faster than normal.
O3: 4.478 ms average (0.83% faster than ppro)
With local file, O3 is 1.71% faster than ppro.

2.2. Distribution size
u = uncompressed, b = bzipped
normal: u: 23848960, b: 9004239
ppro: u: 24289280, b: 9121543 (+1.85%, +1.30%)
O3: u: 25077760, b: 9357161 (+3.25%, +2.58% vs. ppro; +5.15%, +3.92% vs. normal)

4. Additional notes
- It is to be expected that a genuine Intel Pentium 2/3/4 sees a similar speed improvement.
- Somebody said that ppro (as defined above) runs on i586 (Intel Pentium, AMD K6) machines, just maybe a tiny bit slowed. I couldn't test, because I don't have an i586 system that can run Mozilla.

5. Summary

5.1. Longer
ppro brings considerably (13%) faster code, but only a small (~1.5%) increase in distribution size. An additional -O3 brings only a small perf improvement (additional 1-2%), but a larger dist size increase (additional ~3%).

5.2. Executive
Use ppro optimization.
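The distribution-size percentages in the comment above can be rechecked with a short Python sketch like the one below. The byte counts are copied from that comment; `percent_increase` and the `sizes` layout are just illustrative names.

```python
def percent_increase(old_size, new_size):
    """Size growth from old_size to new_size, in percent of old_size."""
    return 100.0 * (new_size - old_size) / old_size


# (uncompressed bytes, bzip2-compressed bytes) per build, from the comment.
sizes = {
    "normal": (23848960, 9004239),
    "ppro": (24289280, 9121543),
    "O3": (25077760, 9357161),
}

if __name__ == "__main__":
    # Compare each build against the baseline named on the left.
    for base, build in (("normal", "ppro"), ("ppro", "O3"), ("normal", "O3")):
        u = percent_increase(sizes[base][0], sizes[build][0])
        b = percent_increase(sizes[base][1], sizes[build][1])
        print("%s vs. %s: u %+.2f%%, b %+.2f%%" % (build, base, u, b))
```

The recomputed values agree with the posted ones to two decimal places, so the size side of the comparison is at least arithmetically consistent.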
> CFLAGS="-mcpu=pentiumpro -march=pentiumpro" --enable-optimize=-O3 ... > - Somebody said that ppro (as defined above) runs on i586 (Intel Pentium, AMD > K6) machines, just maybe a tiny bit slowed. ... > Use ppro optimization. Nope... -march=pentiumpro code does not run on i586 or K6/K6-2, at least as generated by GCC 2.95.3. -mcpu=pentiumpro is fine, but -march must be i586 or lower.
-march is the architecture to generate instructions for; -mcpu is the CPU to optimize (schedule) for.
-march=i486 -mcpu=pentiumpro would run a little slower on a 486 but would be optimized for a pentiumpro class cpu.
-march=pentiumpro -mcpu=pentiumpro more than likely will not run on a pentium or lower CPU.
Wait a minute...so even if you do -O3 it isn't going to use -fomit-frame-pointer if your machine won't be able to debug? I'm getting this impression from waterson's post from the egcs docs about opt flags. Obviously the flag would be left off for debugging purposes on most machines then, but on releases (guess not on talkback builds), would there be any significant gain by adding the flag? And btw, someone wanna take off that moz092 keyword? I wouldn't dare. Does this require that that static gcc linking thing go through?
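To make the -march / -mcpu distinction concrete, here is a deliberately simplified Python model. It assumes the Intel x86 generations form a simple chain (i386, i486, pentium/i586, pentiumpro/i686) where each one can run code generated for the earlier ones; `k6` is left out because it does not fit a linear ordering. The names `GENERATION` and `can_run` are made up for illustration and are not part of GCC or the Mozilla build.

```python
# ASSUMPTION: a linear ordering of x86 generations, where a CPU can run
# code generated for its own generation or any earlier one. The aliases
# follow the GCC 2.95.2 docs quoted earlier in this bug
# (i586 == pentium, i686 == pentiumpro).
GENERATION = {
    "i386": 3,
    "i486": 4,
    "i586": 5,
    "pentium": 5,
    "i686": 6,
    "pentiumpro": 6,
}


def can_run(march, host_cpu):
    """True if code built with -march=<march> should run on host_cpu.

    Only -march matters here: -mcpu changes instruction scheduling,
    not which instructions are allowed.
    """
    return GENERATION[host_cpu] >= GENERATION[march]


if __name__ == "__main__":
    for march, host in (("i486", "pentiumpro"),
                        ("pentium", "i586"),
                        ("pentiumpro", "pentium")):
        print("-march=%s on %s: %s" % (march, host, can_run(march, host)))
```

In this model, -march=pentiumpro on a plain pentium comes out as not runnable, in line with the report above that such builds do not run on i586-class machines.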
I've tried 0.9.3 against gcc-3.0 with -mcpu=athlon -march=athlon and with -O2 and -O3. I did similar timings to those I did previously against 0.9.2 (Based on the viewer demos on the debug menu). I'm happy to say that things are faster. Even better, tests 4 and 9 no longer crash the browser. I'm still getting an issue with -O3 where I sometimes get a garbage character in the second decimal point of the time displayed in the status bar. c++ is not my language, but this sounds like a race condition (unless -O3 is broken). There is still apparently little difference in performance between -O2 and -O3, but this problem makes it hard to be sure. I'll not post the times after what happened last time...
I tried to generate Mandrake mozilla rpm with CFLAGS="-O3 -fomit-frame-pointer -pipe -mcpu=pentiumpro -march=i586 -ffast-math -fno-strength-reduce" BUILD_OFFICIAL=1 ./configure --enable-optimize=-O3 with gcc 2.96-0.60mdk (based on gcc RH 2.96-95) and I got very strange problems : preferences weren't loaded correctly, some attributes of radiobuttons were not found anymore :((
My preferences problem was caused by '-ffast-math'..
I've opened up another bug that is tracking the problem I was having with the status bar (94375). I can now get reliable timings with my O3 build (if I compile js/src/jsstr.c and js/src/jsinterp.c as -O2). From the numbers below you can see that there is not a lot of difference between the builds. The testing is certainly not accurate enough to justifiably say that one is better than the other.

    test  avg(O2)  avg(O3)  100*(O2-O3)/O2
    0     .1510    .1650    -9.2715
    1     .1076    .1103    -2.5092
    2     .1120    .1056     5.7142
    3     .1180    .1106     6.2711
    4     .2160    .2113     2.1759
    5     .3556    .3403     4.3025
    6     .1566    .1553      .8301
    7     .1070    .1086    -1.4953
    8     .5270    .5220      .9487
    9     .9333    .9453    -1.2857
    10    .2576    .2500     2.9503
    11    .0966    .0990    -2.4844
    12    .1120    .1120     0
    13    .1813    .1796      .9376
    16    .1586    .1593     -.4413

Out of interest, the size of my -O3 build is much larger than my -O2 build. If I run

    $ tar czhf dist.jar dist

then my -O2 jar is 26646709 bytes whereas my -O3 build is 36626264 bytes. I checked that there are no different file names between the two. A lot of files get included twice using this method (links from bin and lib). The main offending files are libgkcontent.so (1.6MB bigger in -O3), libgklayout (1.5MB bigger in O3) and libgkconhtmlstyle_s.a (670kB bigger in O3). I've seen a message on the gcc lists that may back up this behaviour of gcc-3.0: http://gcc.gnu.org/ml/gcc/2001-07/msg01523.html . Ah well, gcc-3.0.1 is slated for release Wednesday (things going well).
there seems to be a huge size difference between gcc3.0 and gcc3.0.1, so I am interested in seeing the numbers of 3.0.1 when it is out :-)
The latest results from Loz's unreliable short tests are available:

    build A: gcc-3.0   -O2  size=26646709
    build B: gcc-3.0   -O3  size=36626264
    build C: gcc-3.0.1 -O2  size=27025387
    build D: gcc-3.0.1 -O3  size=29721253

As expected the build size has come down considerably during the point release of gcc. Now the average times:

    Test  A      B      C      D
    0     .1510  .1650  .1630  .1580
    1     .1076  .1103  .1193  .1113
    2     .1120  .1056  .1100  .1090
    3     .1180  .1106  .1213  .1140
    4     .2160  .2113  .2176  .2180
    5     .3556  .3403  .3470  .3583
    6     .1566  .1553  .1626  .1593
    7     .1070  .1086  .1130  .1100
    8     .5270  .5220  .5346  .5263
    9     .9333  .9453  .9546  .9576
    10    .2576  .2500  .3056  .2640
    11    .0966  .0990  .1026  .1000
    12    .1120  .1120  .1173  .1136
    13    .1813  .1796  .2000  .1850
    16    .1586  .1593  .1676  .1630

Hmm, the following appears to be true:
- -O3 is now marginally quicker than -O2
- gcc-3.0.1 is very slightly slower than gcc-3.0

Some quality patches to the gcc inliner are becoming available, so we should hopefully see some marked improvements at the next point release (or earlier if I get the inclination to test the patches). I still get garbage characters on the status bar, now with both builds; see http://bugzilla.mozilla.org/show_bug.cgi?id=94375 for details.
loz, are you sure that you didn't mix up the labels? It seems that
- O2 build size got larger
- O2 got slower
- gcc 3.0.0 O2 is as fast as 3.0.1 O3, with considerably smaller build size.
So, it looks like 3.0.1 is all around worse, apart from fixing the excessively large O3.
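One rough way to sanity-check those two observations is to add up the average times per build, as in the Python sketch below. The numbers are copied from the comment above; summing across tests is my own choice of summary (it weights the slow tests most heavily), not something the original poster did, and `total_time` is an illustrative name.

```python
# Average times per test for builds A-D, copied from the comment above.
# Columns: A = gcc-3.0 -O2, B = gcc-3.0 -O3,
#          C = gcc-3.0.1 -O2, D = gcc-3.0.1 -O3.
TIMES = {
    0: (.1510, .1650, .1630, .1580),
    1: (.1076, .1103, .1193, .1113),
    2: (.1120, .1056, .1100, .1090),
    3: (.1180, .1106, .1213, .1140),
    4: (.2160, .2113, .2176, .2180),
    5: (.3556, .3403, .3470, .3583),
    6: (.1566, .1553, .1626, .1593),
    7: (.1070, .1086, .1130, .1100),
    8: (.5270, .5220, .5346, .5263),
    9: (.9333, .9453, .9546, .9576),
    10: (.2576, .2500, .3056, .2640),
    11: (.0966, .0990, .1026, .1000),
    12: (.1120, .1120, .1173, .1136),
    13: (.1813, .1796, .2000, .1850),
    16: (.1586, .1593, .1676, .1630),
}
BUILDS = ("A", "B", "C", "D")


def total_time(build):
    """Sum of the per-test average times for one build label."""
    column = BUILDS.index(build)
    return sum(row[column] for row in TIMES.values())


if __name__ == "__main__":
    for build in BUILDS:
        print(build, round(total_time(build), 4))
```

On these totals, each -O3 build comes out slightly ahead of the -O2 build from the same compiler, and both gcc-3.0.1 builds are slower than their gcc-3.0 counterparts. The differences are a few percent at most, so the "unreliable short tests" caveat still applies.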
3.0.1 turned down the inlining cutoff - see the gcc archive for more details than you want to know :) That would explain why -O3 code is now smaller.
The labels are correct. I was only referring to the -O3 builds with the size comment (after I'd said how big they'd been in a previous post). I didn't make this in any way clear, however :-). Also the drift in performance could be due to my upgrade to the 2.4.9 kernel, which has some issues with a new VM algorithm, if I get time I'll retest the gcc-3.0 builds with the new kernel, and if I get even more inclined I may try the kernel patches that fix this and maybe even the gcc inliner patch.
Depends on: 96911
See observations in bug 97649 regarding preferences horkage under gcc 3.0.1 snapshot, and also which conditions caused it. I resolved it as invalid, but it has relevance to other comments in this bug. Vcls: please verify or reopen as seems fit.
Does this _really_ depend on 79681? I can't figure out why, and the various responses i get seem to indicate that it really doesn't. Just politics? <Dauphin> Blue: some people prefer the static build change to the -O2 change which involves changes compilers & license issues but the 2 changes are orthoganal. I had to look up that word: orthogonal adj. [from mathematics] Mutually independent; well separated; sometimes, irrelevant to. Who can guess which keyword is outdated again?
Actually, it does. I thought that you were talking about the "static build" which is bug 46775. Stepping through this mess one more time:
- The desired goal is to move to using -O2 by default as it has a perceived performance increase.
- egcs 1.1.2 is the current default compiler used by the build automation.
- -O2 egcs 1.1.2 builds are reported to be flaky at best.
- -O2 gcc 2.95.x builds are much more stable.
- gcc 2.95.2 uses a different libstdc++, causing builds to have a different shared library dependency.
- Since this new libstdc++ isn't on most platforms (where RedHat has majority share of the market), people would have to install this new libstdc++ and that is perceived, by some, to be too much to ask.
- So we're looking into statically linking libstdc++ into mozilla, but that brings up questions about the exact license of 2.95.2's libstdc++ and whether the linking will cause problems for Mozilla license-wise (this may be a moot point after the tri-licensing stuff is finished).
- gcc 3.0.x's libstdc++ has a clear & compatible license but gcc3 also introduces an additional libgcc_s.so.1 dependency (see http://gcc.gnu.org/gcc-3.0/libgcc.html ). It may be possible to statically link against the libgcc but the webpage implies that we should only do that for a completely static binary.
So, in summary, we're waiting until the 2.95.2's libstdc++ license issue is resolved or the market changes so that newer libstdc++/libgcc are standard on most machines. FWIW, we have had test gcc295x -O2 builds on the ftp site for months.
bug 104653 is about NaN on all downloads on a fresh gcc3 build.
Just for your information. I have been using the gcc nightlies for months without problems and I just compiled milestone 0.9.5 with -O4 and -march=k6, runs without a hitch so far. I did notice a nice speedup compared to the standard builds.
No longer depends on: 79681
My belief is that, for the mozilla.org default builds, adding a dependency on a new shared library package is worth it to get the performance gains that will come with bumping up -O to a higher level. We can document this and provide prebuilt versions of the appropriate libstdc++ so that it's easy for users to meet this dependency. How do other folks feel about this? Would Netscape also be interested in doing this for their default builds?
As an interim measure, how about releasing special versions for milestones with the new library or whatever is needed to do -O2, with special instructions about what type of system you need to use it? Of course the current type of builds would remain the default until everything is sorted out.
Didn't bundling system libraries go out of style with win95? There's no need for mozilla.org to become a distributor of libstdc++.so.3 nor libgcc_s.so.1 as we don't do anything special to those libraries to warrant the bundling. Users need to acquire libX11.so.6, libglib-1.2.so, etc before they can run mozilla. Just add these new libraries to the list. The gcc30 builds have replaced the gcc295 builds on the ftpsite. They are built using -O2. Also, if you're planning on building using gcc 3.0.x, you might want to make sure that you have a newer binutils installed. I kept seeing a crash loading libtransformiix.so on RH6.2 until I upgraded binutils to 2.11.90.0.8.
*** Bug 109934 has been marked as a duplicate of this bug. ***
WFM (linux, gcc-2.96-96, mozilla-cvs-20011121, --enable-optimize=-O3) starts up easily 40% faster than my -O build. tested on a 850MHz PIII w/ 256Mb RAM.
did you run it through chofmann's browser buster and smoketests and other debug tests?
0.9.6 will currently build a -O3 under gcc-3.0.2, but it segvs whenever it tries to render. Backtrace (in case this isn't a dup; I couldn't find it, though I'd be surprised if it wasn't):

    #0 0x40b81da7 in _ZN12imgContainer11AppendFrameEP14gfxIImageFrame () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libimglib2.so
    #1 0x40b9112c in _Z14HaveDecodedRowPvPhiiiihi () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libimggif.so
    #2 0x40b90258 in _Z10output_rowP10gif_struct () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libimggif.so
    #3 0x40b8fefb in _Z6do_lzwP10gif_structPKh () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libimggif.so
    #4 0x40b8ef2c in _Z9gif_writeP10gif_structPKhj () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libimggif.so
    #5 0x40b90a88 in _ZN13nsGIFDecoder211ProcessDataEPhj () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libimggif.so
    #6 0x40b91230 in _Z11ReadDataOutP14nsIInputStreamPvPKcjjPj () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libimggif.so
    #7 0x4015714d in _ZN6nsPipe17nsPipeInputStream12ReadSegmentsEPFjP14nsIInputStreamPvPKcjjPjES3_jS6_ () from /home/loz/work/mozilla/dist/bin/libxpcom.so
    #8 0x40b90aeb in _ZN13nsGIFDecoder29WriteFromEP14nsIInputStreamjPj () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libimggif.so
    #9 0x40b858ab in _ZN10imgRequest15OnDataAvailableEP10nsIRequestP11nsISupportsP14nsIInputStreamjj () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libimglib2.so
    #10 0x40b8446d in _ZN13ProxyListener15OnDataAvailableEP10nsIRequestP11nsISupportsP14nsIInputStreamjj () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libimglib2.so
    #11 0x4096273f in _ZN12nsJARChannel15OnDataAvailableEP10nsIRequestP11nsISupportsP14nsIInputStreamjj () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libnecko.so
    #12 0x40926c26 in _ZN22nsOnDataAvailableEvent11HandleEventEv () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libnecko.so
    #13 0x4091773a in _ZN23nsARequestObserverEvent13HandlePLEventEP7PLEvent () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libnecko.so
    #14 0x4016ff22 in PL_HandleEvent () from /home/loz/work/mozilla/dist/bin/libxpcom.so
    #15 0x4016f2c7 in PL_ProcessPendingEvents () from /home/loz/work/mozilla/dist/bin/libxpcom.so
    #16 0x401714a4 in _ZN16nsEventQueueImpl20ProcessPendingEventsEv () from /home/loz/work/mozilla/dist/bin/libxpcom.so
    #17 0x40cfb666 in _Z24event_processor_callbackPvi17GdkInputCondition () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libwidget_gtk.so
    #18 0x40cfb5ef in _Z17our_gdk_io_invokeP11_GIOChannel12GIOConditionPv () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libwidget_gtk.so
    #19 0x403ae56a in g_io_unix_dispatch () from /opt/gnome/lib/libglib-1.2.so.0
    #20 0x403afcf0 in g_main_dispatch () from /opt/gnome/lib/libglib-1.2.so.0
    #21 0x403afff8 in g_main_iterate () from /opt/gnome/lib/libglib-1.2.so.0
    #22 0x403b04bc in g_main_run () from /opt/gnome/lib/libglib-1.2.so.0
    #23 0x402c838f in gtk_main () from /opt/gnome/lib/libgtk-1.2.so.0
    #24 0x40cfb253 in _ZN10nsAppShell3RunEv () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libwidget_gtk.so
    #25 0x41388c12 in _ZN17nsAppShellService3RunEv () from /home/loz/work/mozilla-0.9.6-3.0.2-athlon-O3/dist/bin/components/libnsappshell.so
    #26 0x804c884 in _Z5main1iPPcP11nsISupports ()
    #27 0x804bb6c in main ()
    #28 0x404f1861 in __libc_start_main () at soinit.c:56

Loz
Target Milestone: mozilla1.0 → mozilla0.9.8
If somebody gets a chance to test gcc-3.0.3 with mozilla that would be interesting.
I built Mozilla using gcc 3.0.3 from a CVS build pulled around 2:30 PST, December 24. I compiled using -O2, but it was still a debug build, so that might have covered up some of the crashes that might have happened with a non-debug build.

My mozconfig file is:

ac_add_options --enable-jsd
ac_add_options --with-java-supplement
ac_add_options --with-extensions=all
ac_add_options --enable-mathml
ac_add_options --enable-crypto
ac_add_options --enable-chrome-format=flat
ac_add_options --enable-meta-components
ac_add_options --disable-logrefcnt
ac_add_options --disable-detect-webshell-leaks
ac_add_options --disable-dtd-debug
ac_add_options --disable-tests
ac_add_options --disable-xpctools
ac_add_options --disable-reflow-perf
ac_add_options --disable-perf-metrics
ac_add_options --disable-jprof
ac_add_options --without-profile-modules
ac_add_options --enable-leaky
ac_add_options --enable-debug-modules
ac_add_options --with-system-jpeg
ac_add_options --with-system-zlib
ac_add_options --with-system-png
ac_add_options --with-system-mng
ac_add_options --without-system-nspr
ac_add_options --with-gtk
ac_add_options --disable-verbose-config-defs

My setup is an i686, Linux 2.4.17 (compiled with gcc 3.0.2), glibc 2.2.4-2, and XFree86 4.1.0.

I did the following without getting any crashes:
- Normal browsing, with full use of cookies (Slashdot.org, Kuro5hin.org).
- Browsing with HTTPS (Sourceforge).
- Sent and received mail.
- Started up the editor on the Slashdot homepage.
- Used Chatzilla.
- Used the address book.
- Edited my bookmarks.
- All of the non-site Debug:Verification entries, except for the Java applet and JavaScript tests.
- All of View Demo, XBL Test Suite and XUL Test suite tests.
I redid my previous build, but with all debugging turned off. I then redid all of the tests which were done on the previous test, plus I viewed all of the panels in the (non-mail) preferences window, plus popped up most of the "extra/additional/advanced" windows from the prefs panels. Nothing crashed (except for a few things that also crashed with the 2001122408 build).
I've just come from bug 118783. Jan09 2002 cvs built and ran fine for me on linux this way: ./configure --disable-debug --enable-optimize="-O -mcpu=k6" I realize -O is the default now, but thought I'd drop this in here. I had tons of compile successes, but segfaults on run, with a mixed bag of -O3 and -fomit-frame-pointer, both of which are said to cause problems in bug 118783.
Drop kicking this one off of 0.9.8 radar.
Target Milestone: mozilla0.9.8 → Future
This page may be useful in determining extra optimization flags that could give a speedboost: <a href="http://www.suse.de/~aj/SPEC/compare-flags.html">http://www.suse.de/~aj/SPEC/compare-flags.html</a> . I plan on running some tests with gcc 3.0.3 to see which ones help/work with Mozilla.
Keywords: mozilla1.0+
Reassigning to asa per my conversation with him. We need some decision to be made regarding this for 1.0.
Assignee: cls → asa
Removing from netscape landing page due to high risk of introducing an optimizer change without a couple of milestones of testing. I suggest the mozilla1.0+ be re-evaluated for this reason.
syd: people have been testing -O2 builds for about a year now, and it seems all issues seen there had been fixed by around August 2001. We have several people (including me) who have been building and running gcc 2.95.2/.3 -O2 builds on a daily basis for more than half a year. We also have a bunch of people doing the same -O2 builds with gcc 3.0.x and I don't remember hearing any big problem reports from them (-O3 may have more problems though). As Comment #107 From Loz Hygate 2001-08-05 12:59 states, 0.9.3 was building and running on both (gcc 2.95.x and gcc 3.0.x) compilers without problems, and that never changed for milestones since that date. I don't understand your argument of "high risk of introducing an optimizer change without a couple of milestones of testing" for this reason, as I believe that 7 milestone releases building and running without problems should count as "a couple of milestones of testing", right?
-O2 has been in heavy use on Linux (and FreeBSD) for a long time. I've been using it for all my browsing and testing since 0.9.1-ish days.
I agree, I have been building with -O2 on Linux for ages and my builds have been just as solid as nightlies - just much faster. If we want this in 1.0 we should change this flag ASAP to prove to the sceptics that this is really solid now. It should be trivial to switch off optimizations in the unlikely event that it should cause problems for people. It does not seem fair to all of us who have been trying these builds for such a long time to claim that it is untested.
It appears that the Debian packages have been using -O2 since at least 0.9.7-3, prior to which they used -O3 since M18-1, and before that -O2 for the single release of M17-3. Here are the changelog entries that I'm drawing these conclusions from. There is no mention of "-O" or "O1", "O2", or "O3" in any other changelog entries, which is a very strong indication that the optimization level remained the same in between.

mozilla (2:0.9.7-3) unstable; urgency=high

  * debian/rules:
    - Downgrade optimize level to 2 (-O3 to -O2)
      It will fix segfault with flashplugin, I've checked. (closes: #121404)
      Perhaps, also fix strange crashes. (closes: #126805, #126418, #127346)

...

mozilla (M18-1) unstable; urgency=low

  * ...
  * Changes CFLAGS again, to -O3 -pipe in hopes to make it faster

...

mozilla (M17-3) unstable; urgency=low

  * ...
  * Changed CFLAGS to -pipe -O2 to comply with policy

It seems to me that "every user of the Debian mozilla package since M17" also constitutes a fairly broad base of testers :) I've also used every Debian mozilla package released since then (I stopped using the packages for a while, but only because Debian wasn't releasing any at all) and I can personally testify that I've never seen any apparently Debian-specific or optimization-related issues.
sballard@netreach.net - where can we read these logs ?
Debian users can read them in /usr/share/doc/mozilla/changelog.Debian.gz (this is a standard location for Debian changelogs). If you aren't a Debian user, I don't think they're publicly available in a terribly easy to use form, but you can find it in the following (huge) patch document, which mozilla views with no real problems: http://non-us.debian.org/debian-non-US/pool/non-US/main/m/mozilla/mozilla_0.9.9-1.diff.gz (search for mozilla-0.9.9/debian/changelog to get the part of the patch that inserts the debian changelog - the whole changelog is in there) Hope this helps.
The keyword "topperf" is certainly justified for this bug.
Keywords: perf → topperf
Adding syd to the cc:.
Christopher Seawood stopped putting up gcc2.95 -O2 builds on the ftp-server in favor of gcc3.0 builds. Wasn't this because 2.95 and -O2 was no longer considered experimental? FWIW I always build with the following without special problems: ac_add_options --enable-optimize="-O4 -finline -fno-omit-frame-pointer -march=k6 -mcpu=k6"
I've been using ac_add_options --enable-optimize="-O3 -march=athlon -mcpu=athlon -fno-omit-frame-pointer -maccumulate-outgoing-args" with gcc 3.0 ever since 3.0.1 with no problems, except that plugins don't work.
I think that a reasonable approach to this bug for 1.0 is to offer an 'experimental' build with -O2. I certainly don't think that we'd hold 1.0 for this change.
Asa - that sounds like a reasonable compromise. Can you try to make sure such a build is actually done? Also, will it be easy to check in a talkback report if such a build was used? It would be nice if an official -O2 build was done daily already.
Just for the record, I've been using "-O3 -march=i686 -fno-omit-frame-pointer -funroll-loops" for a couple of months without problems. Someone on the newsgroups recommended it after having used those options for a while.
I wrote in to the gcc mailing list and asked if there was a way to avoid the gcc 3.0 libgcc_s.so requirement and got the following response:

>Yes, see the -static-libgcc option. Be aware that this is asking for trouble if
>you ever want to throw exceptions across shared libraries, because there could be
>different incompatible versions of the EH runtime linked in. But, IIRC, mozilla
>does not use exceptions anyway so that may not be an issue.

Post 1.0/gcc 3.1 release we should try this flag and see how well it works.
Now using gcc 3.1 with -O3 -march=athlon -mcpu=athlon -fno-omit-frame-pointer -maccumulate-outgoing-args -falign-functions=4 -fstrict-aliasing -fbranch-probabilities and it works excellent!
#147
> I think that a reasonable approach to this bug for 1.0 is to offer an
> 'experimental' build with -O2.

I agree.

> I certainly don't think that we'd hold 1.0 for this change.

In that case though, it's nonsense to have this mozilla1.0+.
I tried yesterday's nightly build with -O3 -march=k6 -fexpensive-optimizations -ffast-math using GCC 2.95.3 and it was marginally faster rendering the index of the JDK API but slower rendering Google Image Search pages, when compared to -O2 -march=k6. Observations are based only on Mozilla's "Document: Done(...secs)" output. I couldn't feel the difference.
To "end" this endless discussion: Can we now try (e.g. "carpool", change default to -O2, kick all tinderboxen and check if the Zilla still passes the smoketests) to make the default optimisation -O2 in "trunk" and see if it works properly, please ? We can "undo" this change at any point later if it causes too much trouble...
Keywords: mozilla1.1
I agree with Roland, it's time to try to turn this on by default. A LOT of people run these builds daily, and it would be nice if the rest of the crowd also got to see the speed improvements it brings. I hope the Mozilla folks trust all of us who have tested this for so long...
It's not an issue of testing or trust at this point, but of the fact that to reliably turn this stuff on, the default builds need to happen on newer compilers (the current default builds happen on egcs 1.1.2, which is ancient). And switching to newer compilers brings in library dependency issues. So the blocking thing now is that a document needs to be written up describing these issues, and the major stakeholders need to buy off on the changes. Most likely, this will involve adding a requirement for specific minimum library versions, and (at the very least) supplying up-to-date links from which folks with various common linux distributions can download them.
There used to be nightly builds produced on gcc2.95, so some build box should have that compiler installed, and it should have made it clear what libraries are required. And the newer libraries will probably make it easier for people to install mozilla since the egcs-libraries are too old to be installed by default on most up-to-date distributions.
It's simple.. Currently mozilla depends on libstdc++-libc6.1-1.so.2, which is an old library part of the old egcs 1.1.2. A C++ program built with a current gcc (2.95.4) will depend on libstdc++-libc6.2-2.so.3, which is part of that gcc. As stated before, the old library is more problematic, because it's from an obsolete compiler. As an example, the "new" libstdc++ has been in Debian since 1999-07-14.
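One way to see which libstdc++ a given build pulls in is to filter `ldd` output. The following is only a sketch: `list_libstdcxx` is a hypothetical helper name, and the sample input is made up in the shape of `ldd` output, reusing the library name from the comment above.

```shell
#!/bin/sh
# Hypothetical helper: read ldd-style output on stdin and print the
# distinct libstdc++ sonames that appear in the first column.
list_libstdcxx() {
    awk 'index($1, "libstdc++") == 1 { print $1 }' | sort -u
}

# Made-up sample input in the shape of ldd output (not a real log).
printf '\tlibstdc++-libc6.1-1.so.2 => /usr/lib/libstdc++-libc6.1-1.so.2 (0x40018000)\n\tlibc.so.6 => /lib/libc.so.6 (0x40060000)\n' | list_libstdcxx
```

Against a real build this would be run as something like `ldd dist/bin/mozilla-bin | list_libstdcxx`, where the binary path is an assumption about the local build tree.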
As long as we're eventually going to be requiring a new set of libraries, are there worthwhile performance gains to be gained by switching to gcc 3.x? I think once Red Hat, Mandrake, and United ship a distrib with 3.x, it's fair game for apps to require. Red Hat's been shipping gcc3 libs for quite some time now.
Alias: O2
this is blocked by bug 158385, yes?
Well, it depends on moving to a gcc3.2 build, and that depends on that other bug....
Depends on: 158385
Here are my 2 cents, even though my trouble is not about Linux. The problem is a funny one: on BeOS (gcc 2.9 - 2.95), -O2 breaks the URL bar drop-down functionality. It doesn't open automatically with autocomplete, although it can still be brought to the front with an explicit click on the control.
FWIW, I turned -O2 on on brad a few hours back. ZDiff was +429752/-481140, for a net win of 51388 bytes, which isn't really significant. Brad (the person) changed the machine's name just afterwards, so you can't see the log, but it's at http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1050125340.9102.gz&fulltext=1 The +/- is all over the place - minus where the compiler can optimise some stuff better, and + where the compiler inlined code to make stuff faster for a modest code size increase. (And 50K for the 7-10% speed improvement mentioned elsewhere is definitely modest.) I did look into asm diffs for a couple of the large wins, and I don't think that there's anything useful we can do with it from a code pov. That box is a trace-malloc box, so no perf numbers from the changeover. (sergei, 2.95 doesn't work with -O2 - there's a known compiler bug with that)
OK, we've finally got the compiler that we need (gcc 3.2.3) to throw the -O2 switch in the default build. If we could get -O2 turned on by 1.5 alpha, that'd totally rock.
Assignee: asa → leaf
nightly builds should start getting built with -O2 starting on july 4th. Let's hope there aren't any fireworks. Accepting bug, in case we want configure to automatically use -O2 optimization for --enable-optimize for sufficiently high GCC
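As a rough illustration of what "automatically use -O2 for sufficiently high GCC" could look like, here is a minimal sketch. `pick_optimize_flag` is a hypothetical name, and the 3.2 cutoff is an assumption drawn from this thread (gcc 3.2.3 being the compiler the default build moved to), not something taken from the actual configure script.

```shell
#!/bin/sh
# Hypothetical helper: map a gcc version string to a default optimization flag.
# Assumption: 3.2 and later get -O2, anything older keeps plain -O.
pick_optimize_flag() {
    version="${1#egcs-}"          # assumed: tolerate an "egcs-" prefix on old compilers
    major="${version%%.*}"
    rest="${version#*.}"
    if [ "$rest" = "$version" ]; then
        minor=0                   # no dot at all, e.g. "4"
    else
        minor="${rest%%.*}"
    fi
    if [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 2 ]; }; then
        echo "-O2"
    else
        echo "-O"
    fi
}

pick_optimize_flag "2.95.3"
pick_optimize_flag "3.2.3"
```

In a real build the argument would come from the compiler itself, for example `pick_optimize_flag "$(gcc -dumpversion)"`.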
Status: NEW → ASSIGNED
gcc 3.2 vs. gcc 3.3 shootout:
<http://www.world-direct.com/mozilla/dhtml/funo/domtestcases/>

Average (10 runs):        gcc 3.2   gcc 3.3
getElementById()          556ms     485ms
getElementsByTagName()    369ms     304ms
createElement()           988ms     866ms
getAttribute()            437ms     381ms
setAttribute()            408ms     342ms
Sum                       2758ms    2378ms

That's a whopping 14% faster.

Version: Mozilla 1.4
Build options: moz_official, -O2, -march=pentiumpro (won't run on pentium1/k6)
System: Debian unstable, Athlon XP 1600+ (1.4GHz)
We also might want to consider -Os (optimize for size). Here are the results for three different --enable-optimize configurations (gcc-3.2.2-5):

-O2
-Os
-Os -mcpu=i686 (code scheduled for i686, will run on any x86)

<http://www.world-direct.com/mozilla/dhtml/funo/domtestcases/>

Average (50 runs):        -O2       -Os       -Os/i686
getElementById()          431ms     426ms     430ms
getElementsByTagName()    390ms     390ms     394ms
createElement()           726ms     722ms     728ms
getAttribute()            452ms     453ms     455ms
setAttribute()            432ms     432ms     433ms
Sum                       2431ms    2423ms    2440ms
Size of all .so:          16979K    16127K    16092K

System: Red Hat 9, Pentium-III (1.133GHz)
Build config: MOZ_PHOENIX --disable-tests --disable-ldap --disable-mailnews --enable-extensions=default,-inspector,-irc,-venkman,-content-packs,-help --enable-crypto --enable-plaintext-editor-only --disable-composer

Using -Os instead of -O2 gives almost identical performance for these tests and reduces the footprint by 5%.
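The percentages quoted in the last two comments can be re-derived from the Sum and size rows. A small sketch, where `percent_reduction` is a hypothetical helper name and the inputs are the figures posted above:

```shell
#!/bin/sh
# Hypothetical helper: percentage by which "new" is smaller than "old",
# printed to one decimal place.
percent_reduction() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (old - new) * 100 / old }'
}

percent_reduction 2758 2378     # gcc 3.2 vs gcc 3.3, total time in ms
percent_reduction 16979 16127   # -O2 vs -Os, size of all .so in K
```

The first call gives 13.8 and the second 5.0, in line with the roughly 14% and 5% figures stated in the comments.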
Severity: critical → major
Priority: P3 → P2
Target Milestone: Future → mozilla1.5beta
Blocks: majorbugs
What is the current status of this? Is -O2 enabled now for the nightly Linux builds?
about:buildconfig shows Firebird trunk nightly is using -Os (FB 20040101 Linux)
What about the 'Suite'? -Os or -O2 or -O1??
Compare bug 225433 (use -Os)
resolving in favor of 225433 *** This bug has been marked as a duplicate of 225433 ***
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → DUPLICATE
Product: Browser → Seamonkey
No longer blocks: majorbugs