Closed Bug 77788 Opened 23 years ago Closed 23 years ago

PSM2: non-debug freebl build fails on Solaris

Categories

(NSS :: Build, defect, P2)

3.2.1
Sun
Solaris
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rich.burridge, Assigned: wtc)

Attachments

(5 files)

Chris, I've put this one under "Build config" this time, rather than 
NSS or PSM. Please adjust if I'm wrong.

We are trying to build PSM2 with Sun native compilers (Forte 6 Update 1).

I.e.:

...
% gmake -f client.mk checkout BUILD_MODULES=psm2
...
% gmake -f client.mk build_all BUILD_MODULES=psm2
...

In .../mozilla/security/nss/lib/freebl/SunOSpure32
when it tries to build libfreebl_pure32_3.so, the following occurs:

...
ld -G -h libfreebl_pure32_3.so -B symbolic -z defs -z now -z text -M
mapfile.Solaris -o SunOS5.8_OPT.OBJ/libfreebl_pure32_3.so
SunOS5.8_OPT.OBJ/ldvector.o SunOS5.8_OPT.OBJ/prng_fips1861.o
SunOS5.8_OPT.OBJ/sha_fast.o SunOS5.8_OPT.OBJ/md2.o SunOS5.8_OPT.OBJ/md5.o
SunOS5.8_OPT.OBJ/alg2268.o SunOS5.8_OPT.OBJ/arcfour.o SunOS5.8_OPT.OBJ/arcfive.o
SunOS5.8_OPT.OBJ/desblapi.o SunOS5.8_OPT.OBJ/des.o SunOS5.8_OPT.OBJ/rijndael.o
SunOS5.8_OPT.OBJ/dh.o SunOS5.8_OPT.OBJ/pqg.o SunOS5.8_OPT.OBJ/dsa.o
SunOS5.8_OPT.OBJ/rsa.o SunOS5.8_OPT.OBJ/mpprime.o SunOS5.8_OPT.OBJ/mpmontg.o
SunOS5.8_OPT.OBJ/mplogic.o SunOS5.8_OPT.OBJ/mpi.o  
../../../../../dist/SunOS5.8_OPT.OBJ/lib/libsecutil.a 
-L../../../../../dist/SunOS5.8_OPT.OBJ/lib/ -lplc4 -lplds4 -lnspr4 -lc
Undefined           first referenced
 symbol                 in file
s_mpv_mul_d                         SunOS5.8_OPT.OBJ/mpmontg.o
conv_i32_to_d32_and_d16             SunOS5.8_OPT.OBJ/mpmontg.o
s_mpv_mul_d_add                     SunOS5.8_OPT.OBJ/mpi.o
mont_mulf_noconv                    SunOS5.8_OPT.OBJ/mpmontg.o
s_mpv_mul_d_add_prop                SunOS5.8_OPT.OBJ/mpmontg.o
conv_i32_to_d32                     SunOS5.8_OPT.OBJ/mpmontg.o
conv_i32_to_d16                     SunOS5.8_OPT.OBJ/mpmontg.o
ld: fatal: Symbol referencing errors. No output written to
SunOS5.8_OPT.OBJ/libfreebl_pure32_3.so
gmake[5]: *** [SunOS5.8_OPT.OBJ/libfreebl_pure32_3.so] Error 1
gmake[5]: Leaving directory
`/spare/ws/mozilla/mozilla-sparc-nightly/mozilla/security/nss/lib/freebl/SunOSpure32'

According to Margaret Chan, if you are building with debug enabled
then this problem doesn't occur.

I've done a bit of research on this (and will continue to look at it).

Now if we remove the following from the
...mozilla/security/nss/lib/freebl/Makefile (at about line 164):
 
-ifdef USE_64
-# this builds for Sparc v9a pure 64-bit architecture
-    MPI_SRCS += mpi_sparc.c
-    ASFILES   = mpv_sparcv9.s montmulfv9.s
-    DEFINES  += -DMP_ASSEMBLY_MULTIPLY -DMP_USING_MONT_MULF
-    DEFINES  += -DMP_USE_UINT_DIGIT
-#   MPI_SRCS += mpv_sparc.c
-# removed -xdepend from the following line
-    SOLARIS_FLAGS = -fast -xO5 -xrestrict=%all -xchip=ultra -xarch=v9a -KPIC -mt
-    SOLARIS_AS_FLAGS = -xarch=v9a -K PIC
-else
-# this builds for Sparc v8+a hybrid architecture, 64-bit registers, 32-bit ABI
-    MPI_SRCS += mpi_sparc.c
-    ASFILES  = mpv_sparcv8.s montmulfv8.s
-    DEFINES  += -DMP_NO_MP_WORD -DMP_ASSEMBLY_MULTIPLY -DMP_USING_MONT_MULF
-    DEFINES  += -DMP_USE_UINT_DIGIT
-    SOLARIS_AS_FLAGS = -xarch=v8plusa -K PIC
-#   ASM_SUFFIX = .S
-endif

and do:

% gmake clean
% gmake

those symbols are picked up from ...mozilla/security/nss/lib/freebl/mpi/mpi.c

In mpi.c, each of the routines for the unresolved symbols is surrounded by:

#if !defined(MP_ASSEMBLY_MULTIPLY) 
... 
#endif
 
So if we don't remove those lines from the .../freebl/Makefile, then 
MP_ASSEMBLY_MULTIPLY is apparently being defined, but the assembly file
isn't. Still investigating this.

Now obviously we would prefer to use the specially crafted assembler 
routine rather than the generic .c routines for performance reasons.
I've added a transcript of the output of doing a "gmake -p -n" on my build
machine. Hopefully this should tell us why it's failed to build the assembler
files.
Rich, if you're not building the NSS_CLIENT_BRANCH (autoconf branch) of NSS,
then the bugs need to be filed against the NSS product. Sorry.

Assignee: cls → wtc
Component: Build Config → Libraries
Product: Browser → NSS
QA Contact: granrose → sonmi
Version: other → 3.2.1
Chris, fair enough. Perhaps you can still help though.

I think I have an inkling of what's going on here. 

It seems that this Makefile is used twice. First it tries
to build libfreebl_hybrid_3.so. For this run, USE_PURE_32
is not defined, so it follows the #else clause in the
Makefile (about line 168) and it sets ASFILES to the two
assembly files, and DEFINES includes -DMP_ASSEMBLY_MULTIPLY
The two assembler files are compiled, and the library is
built successfully.

Now the second time it uses the Makefile is when it's trying
to build libfreebl_pure32_3.so. It creates a SunOSpure32 
directory, and a load of symbolic links and sets USE_PURE_32. 

This means that the Makefile takes the #ifdef USE_PURE_32 clause 
at about line 165, and adds three definitions to DEFINES. 
Unfortunately, the old definitions (including -DMP_ASSEMBLY_MULTIPLY) 
appear to still be defined, therefore when it compiles mpi.c, the 
routines that are surrounded by #if !defined(MP_ASSEMBLY_MULTIPLY) 
are not compiled, therefore generating unresolved symbols at ld 
build time.

Can you think of a simple fix for this?
Rich, are you able to explain why the debug build does not
have this problem?
In a word, no. Margaret, can you chip in here, as you were the person who
successfully built this?

I've just thought of a potential solution though (haven't tried it yet, 
so I might be talking crap).

What if right at the very top of the .../freebl/Makefile we had:

DEFINES =

This would (hopefully) reset the DEFINES definition. I'm going into
a meeting now, but I'll try this in an hour or so and see if it works.
The "DEFINES=" didn't work. Back to the drawing board.
As I understand it, the working hypothesis here is that makefile 
variables from PSM/mozilla makefiles are leaking through into,
and interfering with, the NSS makes.   Recall that the NSS makes
work OK on Solaris, both OPT and DBG, when not invoked via PSM's
make system.

One way to isolate the two sets of makefile variables is to use a 
shell script.  That is, have psm makefiles invoke a shell script 
that invokes the NSS make, rather than invoking make on the NSS 
makefiles directly.

Variables that need to be passed can be passed explicitly, e.g.
on the command line.
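
A minimal sketch of that isolation idea, assuming a POSIX shell. The
function name run_without_build_flags and the list of variables it clears
are illustrative assumptions, not part of PSM or NSS, and `sh -c` stands in
for the NSS make so the effect can be observed:

```shell
#!/bin/sh
# Hypothetical wrapper: run a command with the parent's build-flag variables
# removed from its environment. The variable list is an assumption.
run_without_build_flags() {
    (
        # The subshell keeps the unset local to this one invocation.
        unset CFLAGS CXXFLAGS MAKEFLAGS MFLAGS
        "$@"
    )
}

# Simulate a parent build that has exported CFLAGS.
CFLAGS=-xO3
export CFLAGS

# `sh -c` stands in for the NSS make; it reports the CFLAGS it inherits.
inner=$(run_without_build_flags sh -c 'printf %s "${CFLAGS-unset}"')
outer=$CFLAGS

echo "inside wrapper:  $inner"
echo "outside wrapper: $outer"
```

In a real build the wrapped command would be the make invocation itself,
with anything the sub-build needs passed explicitly as NAME=value arguments.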
I was building using NSS_CLIENT_BRANCH.  Checking out psm2 by

gmake -f client.mk checkout BUILD_MODULES=psm2 

checks out code from the above branch for me:

% cvs status mpi.c
===================================================================
File: mpi.c             Status: Up-to-date

   Working revision:    1.32
   Repository revision: 1.32   
/cvsroot/mozilla/security/nss/lib/freebl/mpi/mpi.c,v
   Sticky Tag:          NSS_CLIENT_TAG (revision: 1.32)
   Sticky Date:         (none)
   Sticky Options:      (none)

My build with --enable-debug --disable-optimize --enable-modules=psm2 does not
have this problem, but I can't say for sure if this problem is simply triggered
by enabling optimize.  

Checking on Rich's build problem, I kind of feel that the build should not be
going into this section:  ifdef USE_64 as we're not enabling 64 bit, but I have
not figured out where it is being set/defined.
Margaret, were you using NSS_CLIENT_BRANCH or NSS_CLIENT_TAG?
I've added two more attachments to the bug. There is a transcript of a
build done by our RE team, with debug turned on (a Solaris 7 SPARC machine).
There are no build errors there, and you will see that -DMP_ASSEMBLY_MULTIPLY
is not defined when compiling the second library. The other attachment is
for the build on my machine, with no debug and no optimization (i.e.
--disable-optimize in .mozconfig, which apparently is ignored). This one
has -DMP_ASSEMBLY_MULTIPLY defined for the building of the second library.
Where is that coming from? Find that, and I think we can solve this.
Rich, every compile in your second transcript ("no debug and
no optimize", what does that mean?) has this compiler warning:
cc: Warning: multiple use of -K option, previous one discarded.

Many compiler options are repeated on the command line.  Something
is wrong here.
Sorry, clarification:  it was NSS_CLIENT_TAG that psm2 used,  NOT
NSS_CLIENT_BRANCH.
> ("no debug and no optimize", what does that mean?)

It means I had --disable-debug and --disable-optimize in my .mozconfig.
Having said that, it looks like the libraries are being compiled with -O
so I'm not sure where it's picking this up from.

> has this compiler warning:
> cc: Warning: multiple use of -K option, previous one discarded.
> 
> Many compiler options are repeated on the command line.  Something
> is wrong here.

Yes. This could be related to the problem we are seeing. 
I think that nss & psm can only take one option or the other, i.e.
--enable-debug --disable-optimize or --disable-debug --enable-optimize.
Disabling/enabling both may confuse the build.  This should not be the problem
though, as Shirley & Kevin (our RE)'s builds with --disable-debug
--enable-optimize at optimization level O3 fail the same way.
Hopefully the attached log from a successful nightly optimized NSS build
will show you what the command lines ought to look like.  HTH.
Summary: PSM2: freebl does not build correctly on Solaris when not debug. → PSM2: non-debug freebl build fails on Solaris
Thanks Nelson. Interesting. This is on a 5.6 machine. We should compare
the SunOS5.[6,7,8].mk files to see if there is anything funkily
different there. 
Nelson, could you attach a "gmake -p -n" for a build in the freebl directory
from your Solaris 5.6 machine please? I'd like to compare that with the 
gmake -p -n that I attached for my failed build. Thanks.
Rich, I emailed you the 1/2 megabyte output of "gmake -p -n libs"
Please don't attach it here.

BTW, I build NSS on solaris 6, 7, and 8 all the time.
There shouldn't be any differences between those builds, other than
in the object directory names.  If you find there are other 
differences, please let me know.
OK, I think I have tracked down this bug.

Rich: I bet that you have the environment variable CFLAGS set.  You can
try unsetting CFLAGS and rebuilding.

Nelson: You can try setting the environment variable CFLAGS (for example,
to -O) and you should get the same build failure.

If CFLAGS (actually any make variable) is set in the environment, GNU make
passes the *current value* (which may have been modified by the makefile)
of CFLAGS to sub-makes.  In mozilla/security/coreconf/command.mk, we
define CFLAGS with "+=".  So if CFLAGS is set in the environment, sub-makes
will append to the value of CFLAGS in the parent makefile.

If CFLAGS is not set in the environment, GNU make does not pass it to
sub-makes.

I am going to attach a fix that defines CFLAGS with "=".
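
The behavior described above can be reproduced with a small sketch,
assuming GNU make is installed as gmake or make. The file names parent.mk
and child.mk and the -DPARENT_ONLY / -DCHILD_ONLY flags are made up for
illustration and do not come from the NSS tree:

```shell
#!/bin/sh
# Sketch: an environment CFLAGS leaks from a parent make into a sub-make
# when both makefiles build CFLAGS with "+=". Assumes GNU make.
MAKE_BIN=$(command -v gmake || command -v make)
demo=$(mktemp -d)
unset CFLAGS MAKEFLAGS

# Parent makefile: appends a flag, then runs a sub-make.
cat > "$demo/parent.mk" <<'EOF'
CFLAGS += -DPARENT_ONLY
all: ; @$(MAKE) -s -f child.mk
EOF

# Child makefile: appends its own flag and prints the result.
cat > "$demo/child.mk" <<'EOF'
CFLAGS += -DCHILD_ONLY
all: ; @echo "CFLAGS=$(CFLAGS)"
EOF

# CFLAGS not in the environment: the parent's value is not passed down.
clean_env=$(cd "$demo" && "$MAKE_BIN" -s -f parent.mk)

# CFLAGS in the environment: the parent's modified value is exported,
# and the child appends to it.
leaky_env=$(cd "$demo" && CFLAGS=-O "$MAKE_BIN" -s -f parent.mk)

echo "without CFLAGS in environment: $clean_env"
echo "with CFLAGS=-O in environment: $leaky_env"
rm -rf "$demo"
```

In the second run the child sees -DPARENT_ONLY even though its own makefile
never asked for it, which is the same way -DMP_ASSEMBLY_MULTIPLY reached the
pure32 build.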
Status: NEW → ASSIGNED
wtc: Yes, I did have CFLAGS environment variable set 
(also CXXFLAGS). Am trying a rebuild now (without those
two set, and without your proposed patch), just to confirm 
that this will build. Thanks.
wtc: yes, this was indeed the problem. I unsetenv'ed my
CFLAGS, and the build built just fine. On Monday, I'll
apply your patch, set a CFLAGS, and see if that works
too. Thanks.
Hi Wan-Teh:

In other words, does it mean that by changing it to CFLAGS= as opposed to
CFLAGS+=, it will no longer append the CFLAGS that's set in the environment to
CFLAGS?  Just trying to understand it better.  Thanks, Margaret 
Margaret Chan wrote:
> In other words, does it mean that by changing it to CFLAGS= as opposed to
> CFLAGS+=, it will no longer append the CFLAGS that's set in the environment to
> CFLAGS?  Just trying to understand it better.

Well, it's actually more complicated than that.  If you want to fully
understand how GNU make behaves in this regard, send me email or give
me a call.
wtc: I've asked our RE team to try out your fix with their builds tonight,
and to update this bug with the results. Hopefully that should be sufficient
to indicate whether the problem has been fixed or not. Thanks.
From Bug #77324:

Given when the bugs were opened I'd say bug #77788 is a duplicate of this one
;-)

Anyway - removing CFLAGS and downloading 20010430 has changed the failure mode
for me to:

make[2]: Leaving directory
`/work/gyles/apps/mozilla/DISTRIBUTION/20010430/mozilla/security/nss/lib'
../../config/nsinstall -R -m 755
../../dist/SunOS5.6_sparc_OPT.OBJ/lib/libnssckbi.so ../../dist/bin
../../config/nsinstall: cannot access
../../dist/SunOS5.6_sparc_OPT.OBJ/lib/libnssckbi.so: No such file or directory
make[1]: *** [install] Error 1
make[1]: Leaving directory
`/work/gyles/apps/mozilla/DISTRIBUTION/20010430/mozilla/security/manager'
make: *** [install] Error 2

This looks like some sort of path problem because the library has built:

whars07h{gyles}24: cd SunOS5.6_OPT.OBJ/
/work/gyles/apps/mozilla/DISTRIBUTION/20010430/mozilla/security/nss/lib/ckfw/builtins/SunOS5.6_OPT.OBJ
whars07h{gyles}25: l
total 300
-rw-rw-r--   1 gyles    passport    15392 May  1 11:03 anchor.o
-rw-rw-r--   1 gyles    passport    77556 May  1 11:03 certdata.o
-rw-rw-r--   1 gyles    passport     1344 May  1 11:03 constants.o
-rw-rw-r--   1 gyles    passport     2180 May  1 11:03 find.o
-rw-rw-r--   1 gyles    passport     1892 May  1 11:03 instance.o
-rwxrwxr-x   1 gyles    passport   182440 May  1 11:03 libnssckbi.so*
-rw-rw-r--   1 gyles    passport     1924 May  1 11:03 object.o
-rw-rw-r--   1 gyles    passport     1036 May  1 11:03 session.o
-rw-rw-r--   1 gyles    passport     1792 May  1 11:03 slot.o
-rw-rw-r--   1 gyles    passport     2276 May  1 11:03 token.o
whars07h{gyles}26: file libnssckbi.so 
libnssckbi.so:  ELF 32-bit MSB dynamic lib SPARC Version 1, dynamically linked,
not stripped
I've tried the CFLAGS= instead of CFLAGS+= patch in our solaris 7 builds with
Forte6 Update1 compilers and this solved the undefined symbols build problem.
Thank you for verifying my fix.  I checked it in on the trunk
and the NSS_3_2_BRANCH.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Priority: -- → P3
Resolution: --- → FIXED
Target Milestone: --- → 3.2.2
FYI. I can't get your putback yet since PSM 2.0 is still using NSS_CLIENT_TAG.
So for now I will have to continue to apply the patch until pull tags for PSM
2.0 are changed.
By the way, this is just a note of the behavior.  I've checked out Shirley's
build log.  We have set the CFLAGS and CXXFLAGS to -xO3 in the build
environment, but it appears that the nss build invoked by PSM is ignoring it. 
See example below:

cd nss; gmake libs
gmake[4]: Entering directory `/net/crumple.eng/export/nc-re/release/build/SUN-MO
Z/5.7/sparc-opt/mozilla/security/nss/lib/nss'
cc -o SunOS5.7_OPT.OBJ/nssinit.o -c -O -KPIC -DSVR4 -DSYSV -D__svr4 -D__svr4__ -
DSOLARIS -DSOLARIS2_7 -D_SVID_GETTOD -xarch=v8 -DXP_UNIX -UDEBUG -DNDEBUG -I/usr
/dt/include -I/usr/openwin/include -I/net/crumple.eng/export/nc-re/release/build
/SUN-MOZ/5.7/sparc-opt/mozilla/dist/include/nspr  -I../../../../dist/public/secu
rity -I../../../../dist/private/security -I../../../../dist/include -I../../../.
./dist/public/security -I../../../../dist/public/dbm  nssinit.c

It is using -O instead.  However, for now, -O is sufficient for those files,
so we aren't too worried about that.  This is just a note for the record.
My fix (changing CFLAGS += to CFLAGS =) breaks the NSS build on
Windows NT.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I backed out my fix.  I'm going to mark this bug
WONTFIX.

You can work around this problem by unsetting CFLAGS
in your environment.  If you need to specify a C
compiler flag for a configure script, do that on the
command line.  For example, with sh:
    CC=cc CFLAGS=-xO3 ./configure
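
A short sketch of why the command-line form is safe, assuming a POSIX
shell. Here `sh -c` is only a stand-in for ./configure, and the variable
names seen_by_command and left_in_shell are illustrative:

```shell
#!/bin/sh
# Start from an environment with no CC or CFLAGS, as the workaround asks.
unset CC CFLAGS

# Assignments written before a command apply to that command only.
# `sh -c` stands in for ./configure and reports what it was given.
seen_by_command=$(CC=cc CFLAGS=-xO3 sh -c 'printf "%s %s" "$CC" "$CFLAGS"')

# The invoking shell, and any make started from it later, is unaffected.
left_in_shell=${CFLAGS-unset}

echo "command saw: $seen_by_command"
echo "shell has:   $left_in_shell"
```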
Status: REOPENED → RESOLVED
Closed: 23 years ago → 23 years ago
Resolution: --- → WONTFIX
The reason that the fix for this bug breaks our NT build
is that mozilla/security/nss/lib/fortcrypt/swfort/pkcs11/manifest.mn
defines -DSWFORT in CFLAGS:
    CFLAGS += -DSWFORT
With our fix, this macro define would be lost.  One way to avoid
this problem is to define -DSWFORT in DEFINES, which is more
appropriate anyway:
    DEFINES = -DSWFORT
I checked in this change.  However, I still haven't checked in the
fix for this bug.
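
The difference can be seen in a small sketch, assuming GNU make is
installed as gmake or make. The files manifest.mk, command-append.mk and
command-assign.mk are simplified stand-ins for manifest.mn and
coreconf/command.mk, and the -DFROM_CFLAGS / -DFROM_DEFINES flags are made
up for illustration:

```shell
#!/bin/sh
# Sketch: a flag added to CFLAGS by an earlier makefile is lost once a later
# makefile assigns CFLAGS with "=", while a flag added to DEFINES survives.
MAKE_BIN=$(command -v gmake || command -v make)
demo=$(mktemp -d)
unset CFLAGS DEFINES MAKEFLAGS

# Stand-in for a manifest that adds one flag each way.
cat > "$demo/manifest.mk" <<'EOF'
CFLAGS  += -DFROM_CFLAGS
DEFINES += -DFROM_DEFINES
EOF

# Two stand-ins for the common rules file: old "+=" and new "=".
echo 'CFLAGS += -O $(DEFINES)' > "$demo/command-append.mk"
echo 'CFLAGS = -O $(DEFINES)'  > "$demo/command-assign.mk"

# The manifest is read first, then whichever rules file is selected.
cat > "$demo/Makefile" <<'EOF'
include manifest.mk
include $(COMMAND_MK)
all: ; @echo "CFLAGS=$(CFLAGS)"
EOF

with_append=$(cd "$demo" && "$MAKE_BIN" -s COMMAND_MK=command-append.mk)
with_assign=$(cd "$demo" && "$MAKE_BIN" -s COMMAND_MK=command-assign.mk)

echo "CFLAGS += ... : $with_append"
echo "CFLAGS  = ... : $with_assign"
rm -rf "$demo"
```

With "=", only the flag routed through DEFINES reaches the compiler command
line, which matches the reason given for moving -DSWFORT.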
*** Bug 98578 has been marked as a duplicate of this bug. ***
*** Bug 100556 has been marked as a duplicate of this bug. ***
why can't we make this change inside an if !defined(XP_WIN) or something?
Now that the manifest.mn file that the fix broke has been
modified to avoid the problem, the fix may be checked in
again.  To be safe, I verified that no manifest.mn under
mozilla/security sets CFLAGS.

So I checked in the fix on the tip of NSS.  By checking in
this fix, we are requiring that clients of coreconf not set
CFLAGS in manifest.mn.  This is not a serious limitation
because they can still achieve the same goal by setting the
components of CFLAGS, such as DEFINES and INCLUDES.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Target Milestone: 3.2.2 → 3.4
This fix will go into NSS 3.4.

Note that the Mozilla client is using a stable
snapshot of the NSS 3.3 branch now, so it won't
pick up this fix until it upgrades to NSS 3.4
(mid-November at the earliest).
Status: REOPENED → RESOLVED
Closed: 23 years ago → 23 years ago
Component: Libraries → Build
Priority: P3 → P2
Resolution: --- → FIXED
*** Bug 107644 has been marked as a duplicate of this bug. ***
Today, the third duplicate of this bug was filed,
so I checked in the fix on the NSS_3_3_BRANCH
and will move the NSS_CLIENT_TAG.  The Mozilla
client will pick up this fix later today when
I move the NSS_CLIENT_TAG on security/coreconf/command.mk.
Target Milestone: 3.4 → 3.3.2