Closed Bug 122974 Opened 24 years ago Closed 23 years ago

coreconf should use the -xO3 optimization flag on Solaris SPARC

Categories

(NSS :: Build, defect, P2)

3.3.1
Sun
Solaris
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: wtc, Assigned: kirk.erickson)

Details

Attachments

(3 files, 2 obsolete files)

Right now coreconf is using the -O (equivalent to -xO2) optimization flag on Solaris. We received several suggestions yesterday that coreconf should use an optimization level higher than -xO2. (-xO5 is the highest level.) Most of the opinions recommend that we use -xO3 by default because it strikes the best balance of performance gain and binary size for most applications (with a bias towards performance), and use -xO5 on select performance-critical files. One opinion suggests that we try -xO5 unless the binary size increase is unacceptable or we run into optimization bugs. Since coreconf is shared by multiple products, I suggest that we use -xO3 in coreconf. However, I'm also interested in finding out the performance and binary size of NSS compiled with -xO5.
Per our discussion in the conference call this afternoon, we must proceed carefully when experimenting with more aggressive compiler optimization. Set priority to P2, with no target milestone.
Priority: -- → P2
Saw a modest gain with this patch on a 4-CPU (400MHz) box:

soupnazi 020415 full       3.4 Beta Apr 15 2002 11:07:36 100 enabled 171.55 0%
soupnazi 020415 full-O3    3.4 Beta Apr 15 2002 12:16:56 100 enabled 172.54 0%
soupnazi 020415 restart    3.4 Beta Apr 15 2002 11:07:36 100 enabled 331.85 1%
soupnazi 020415 restart-O3 3.4 Beta Apr 15 2002 12:16:56 100 enabled 334.59 1%

This patch addresses 5.8 only.
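For readers who want to put a number on "modest", here is a minimal sketch that computes the relative gain of each -O3 run over its baseline. It assumes the second-to-last column of each result row is a throughput figure where higher is better; the comment does not label its columns, so that reading is an assumption, as are the names `RUNS` and `relative_gain`.

```python
# Figures copied from the result rows in the comment above.
# Assumption: they are throughput numbers (higher is better).
RUNS = {
    "full": {"baseline": 171.55, "O3": 172.54},
    "restart": {"baseline": 331.85, "O3": 334.59},
}


def relative_gain(baseline: float, candidate: float) -> float:
    """Return the gain of candidate over baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


if __name__ == "__main__":
    for name, run in RUNS.items():
        gain = relative_gain(run["baseline"], run["O3"])
        print(f"{name}: {gain:.2f}% over baseline")
```

Both gains come out below one percent, which is consistent with the "modest gain" wording.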
Target NSS 3.6.
Target Milestone: --- → 3.6
Let's resolve this in 3.7. It's fine to resolve it WONTFIX if the performance gain is not worth the risk.
Target Milestone: 3.6 → 3.7
Moved to target milestone 3.8 because the original NSS 3.7 release has been renamed 3.8.
Target Milestone: 3.7 → 3.8
I spent several days benchmarking -xO3. I stressed iws-perf:

System Configuration: Sun Microsystems sun4u Sun Enterprise 450 (4 X UltraSPARC-II 400MHz)
System clock frequency: 100 MHz
Memory size: 512 Megabytes

from 9 client machines, all running strsclnt from NSS_3_7_BRANCH.

On the selfserv side, I did two baseline runs: first NSS_3_7_BRANCH stock (importing NSPR 4.2.2), then a noimport build using NSPR_4_2_2_RC1. All runs are with NSPR_USE_ZONE_ALLOCATOR set to 1.

full            NSS with no changes (imported NSPR baseline)
full-noimport   noimport build, no changes (baseline)
full-xO3        only NSS compiled with -xO3
full-xO3-nspr   only NSPR compiled with -xO3
full-xO3-both   both NSPR and NSS compiled with -xO3

All runs were over 30 minutes in length.

File sizes are about the same with -xO3:

741132 libnss3.so
741136 libnss3.so noimport
742748 libnss3.so compiled with -xO3
934572 libnss3.so compiled with -fast
------------------------------------------------
392596 libnspr4.so
388204 libnspr4.so noimport
387500 libnspr4.so compiled with -xO3
420960 libnspr4.so compiled with -fast

Compiling with -xO3 netted no gain on full handshake runs. Compiling NSS with -xO3 benefited restart (reuse) runs:

612.79 - 602.84 = 9.95 (1.65%)

The "both" runs won earlier trials. I recommend that we change the TIP of both NSS and NSPR. There is possibly as much as a 2% gain on restarts.

That said, I saw -fast beat -xO3 consistently, and performed a pair of runs with both NSS and NSPR compiled with -fast for grins. This is not considered safe, though. These options are turned on by -fast:

-dalign (SPARC) -fns -fsimple=2 (SPARC) -fsingle -ftrap=%none -nofstore (x86) -xbuiltin=%all -xlibmil -xtarget=native -xO5

Wan-Teh pointed out that -xtarget=native ties the executables linked with the library to the architecture we build on. I guess -fast is not germane to this bug, but it consistently yielded the biggest gain in all my runs. I saw a 4% gain, again only on restarts.
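The size and restart figures above can be checked with a short script. This is only a sketch: it takes the noimport builds as the size baseline (an assumption, since the comment does not say which baseline the -xO3 and -fast builds should be compared against) and treats 602.84 as the unmodified restart result and 612.79 as the -xO3 result. The names `SIZES`, `percent_change` and `size_report` are illustrative.

```python
# Library sizes in bytes, copied from the comment above.
# Assumption: the noimport builds are the right baseline for comparison.
SIZES = {
    "libnss3.so": {"noimport": 741136, "-xO3": 742748, "-fast": 934572},
    "libnspr4.so": {"noimport": 388204, "-xO3": 387500, "-fast": 420960},
}


def percent_change(baseline: float, candidate: float) -> float:
    """Return the change of candidate relative to baseline, in percent."""
    return (candidate - baseline) / baseline * 100.0


def size_report(sizes):
    """Return (library, flag, byte delta, percent delta) for each optimized build."""
    rows = []
    for lib, builds in sizes.items():
        base = builds["noimport"]
        for flag in ("-xO3", "-fast"):
            delta = builds[flag] - base
            rows.append((lib, flag, delta, percent_change(base, builds[flag])))
    return rows


if __name__ == "__main__":
    for lib, flag, delta, pct in size_report(SIZES):
        print(f"{lib:12s} {flag:6s} {delta:+8d} bytes ({pct:+.2f}%)")
    # Restart (reuse) result quoted in the comment: 612.79 vs 602.84.
    print(f"restart gain: {percent_change(602.84, 612.79):.2f}%")
```

Under these assumptions, -xO3 changes either library by well under one percent, while -fast grows libnss3.so by roughly a quarter, which matches the point that only the -xO3 sizes are "about the same".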
Benchmark results.
Comment on attachment 109986 Proposed patch for the TIP of NSS and NSPR

Hi Kirk,

Thanks for the patch and the performance results.

The NSPR change is incorrect. Your change will affect all platforms. You should find the "solaris" section and change _OPTIMIZE_FLAGS there. Also, you should change mozilla/nsprpub/configure.in and then run "autoconf" in the mozilla/nsprpub directory to regenerate mozilla/nsprpub/configure.

The NSS coreconf change is correct, but I think it is better to use = instead of += to change OPTIMIZER, as we discussed last Wednesday.
Attachment #109986 - Flags: review-
Benchmarked the TIP of NSS with the TIP of NSPR. Saw again that compiling NSS with -xO3 (rather than -O, which is equivalent to -xO2) speeds up restarts somewhat. As in my previous -xO3 results (with 3.7/4.2.2 - see id=109987), the biggest gain on restarts came from compiling only NSS with -xO3 and leaving NSPR as is. Full handshakes are a wash. I'll attach results next. The "-nss" runs correspond to when the proposed patch was applied. The "-both" runs are with both NSS and NSPR compiled with -xO3. I suggest we leave NSPR as is, and check in this change at the TIP of NSS.
The results for full-nss and restart-nss correspond to when the suggested patch was applied. It's targeted for the TIP of NSS. The "-both" runs are where both NSS and NSPR were compiled with -xO3. The "full" and "restart" runs were in a noimport build of the TIP of NSS and the TIP of NSPR with no changes.
Comment on attachment 110601 Proposed patch for coreconf

Wan-Teh, could you review this patch for checkin at the TIP of NSS?
Attachment #110601 - Flags: review?(kirk.erickson)
Comment on attachment 110601 Proposed patch for coreconf

r=wtc. This patch is good.
Attachment #110601 - Flags: review?(kirk.erickson) → review+
Attachment #79454 - Attachment is obsolete: true
Attachment #109986 - Attachment is obsolete: true
Checked in approved patch (id=110602) at the TIP of NSS.
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED