Closed Bug 331110 Opened 18 years ago Closed 15 years ago

NSS testing must detect a wrong version of glibc on Linux

Categories

(NSS :: Test, defect, P2)

3.11
x86
Linux
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: nelson, Assigned: slavomir.katuscak+mozilla)

Details

Attachments

(1 file)

NSS Tinderbox is now reporting more-or-less continuous failures on Linux.
The errors generally occur in the same place.  Log file output excerpt below.

The error message that comes out is bogus.  It says:
ssl.sh: Stress TLS  RC4 128 with MD5 produced a returncode of 143
That doesn't tell us what program produced that return code.
Was it selfserv?  Was it strsclnt?  

What does 143 mean, and what causes it?  
Looking at strsclnt.c (assuming the return value comes from strsclnt),
it appears that all calls to exit pass either 0, 1, or 2, but not 143.

Alexei, you're our Linux guru.  Please have a look.  
This bug has the highest severity, because it makes Tinderbox useless.

Note: if there's a problem with the actual crypto, that needs to be fixed,
too.  But this bug is about the bizarre return code 143, and the bogus
error message that doesn't identify the offending program, not about
the underlying NSS library problem (if any).


ssl.sh: Stress TLS  RC4 128 with MD5 ----
selfserv -D -p 8443 -d ../ext_server -n nsssvr.red.iplanet.com -B -s \
          -w nss   -i ../tests_pid.11135  &
selfserv started at Sun Mar 19 22:32:45 PST 2006
tstclnt -p 8443 -h nsssvr.red.iplanet.com  -q \
        -d ../ext_client < /export/tinderbox/Linux_2.4.20-8_Depend/mozilla/security/nss/tests/ssl/sslreq.dat
strsclnt -q -p 8443 -d ../ext_client  -w nss -c 1000 -C c \
          nsssvr.red.iplanet.com
strsclnt started at Sun Mar 19 22:32:45 PST 2006
strsclnt: -- SSL: Server Certificate Validated.
strsclnt: 0 cache hits; 1 cache misses, 0 cache not reusable
./all.sh: line 730: 29821 Terminated              strsclnt -q -p ${PORT} -d ${P_R_CLIENTDIR} ${CLIENT_OPTIONS} -w nss $cparam $verbose ${HOSTADDR}
strsclnt completed at Sun Mar 19 22:41:04 PST 2006
ssl.sh: Stress TLS  RC4 128 with MD5 produced a returncode of 143, expected is 0.  FAILED
./all.sh: line 730: 29801 Terminated              selfserv -D -p ${PORT} -d ${P_R_SERVERDIR} -n ${HOSTADDR} ${SERVER_OPTIONS} ${ECC_OPTIONS} -w nss ${sparam} -i ${R_SERVERPID} $verbose
ssl.sh: skipping  Stress SSL3 ECDHE-ECDSA AES 128 CBC with SHA (no reuse) (ECC only)
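
One way the failure line could name the offending program is to wrap each foreground command in a small helper. This is only a minimal sketch, not the actual ssl.sh code: `run_and_report` is a hypothetical name, and it covers foreground commands such as strsclnt only, not a backgrounded selfserv.

```shell
# Hypothetical helper, not taken from ssl.sh: run a command and, on failure,
# print a message that names the program that produced the return code.
run_and_report() {
    testname=$1
    shift
    "$@"
    rc=$?
    if [ "$rc" -ne 0 ]; then
        echo "${testname}: $1 produced a returncode of ${rc}, expected is 0.  FAILED"
    fi
    return "$rc"
}

# Example with a stand-in command that exits with status 3.
# "|| true" only keeps this demonstration from ending with a failing status.
run_and_report "Stress TLS  RC4 128 with MD5" sh -c 'exit 3' || true
```

The helper returns the original status, so a caller could still compare it against the expected value.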
P1 for 3.11.1
Priority: -- → P1
Nelson,

I'm sure it wasn't selfserv that returned 143, because the script currently never checks return code from selfserv which is started in the background.
So, it's got to be from strsclnt .
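
On what 143 means: shells such as bash report a command that was killed by a signal as 128 plus the signal number, so 143 is 128 + 15, i.e. SIGTERM. That matches the "Terminated" lines in the log excerpt. A minimal sketch of decoding such a status follows; `describe_status` is a hypothetical helper, not part of the NSS test scripts.

```shell
# Hypothetical helper: explain a shell exit status.
describe_status() {
    rc=$1
    if [ "$rc" -gt 128 ]; then
        # A status above 128 means the process died from signal (status - 128).
        signum=$((rc - 128))
        echo "killed by signal ${signum} (SIG$(kill -l "${signum}"))"
    else
        echo "exited with code ${rc}"
    fi
}

describe_status 143
# prints: killed by signal 15 (SIGTERM)
```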
The reason for the failures was an old glibc installed on the system.

Recommended version of glibc > 2.3.2-27.

Related bug that was fixed in this version:
  unable to wake up threads waiting on a cv with a pthread_cond_signal call.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → INVALID
Alexei, I interpret your comment above as saying that NSS was built with
a broken version of glibc.  Assuming that is so, then I agree that this is 
a bug in glibc libraries, not in NSS libraries.  But it can also be 
perceived as a problem with NSS builds.  Apparently NSS builds need to 
ensure that they are being linked with a certain version of glibc.

We cannot consider this bug fixed until our own nightly builds are being
done with a fixed version of glibc.  

I think our build scripts should test the version of glibc, and the entire
build should fail if a bad glibc is being used.  
Status: RESOLVED → REOPENED
Component: Test → Build
QA Contact: jason.m.reid → build
Resolution: INVALID → ---
(In reply to comment #4)

> We cannot consider this bug fixed until our own nightly builds are being
> done with a fixed version of glibc.  

And tinderbox, too.

Alexei + Christophe, please make the build script detect the wrong version
of glibc and fail.

Sandeep, ensure that tinderbox systems build with the right version of glibc.

Summary: ssl.sh: Stress TLS RC4 128 with MD5 produced a returncode of 143 → NSS being built with wrong version of glibc on Linux
Do we have to check the glibc on the build machine or on the test machine?

I believe that if the fixed glibc does not change the API, the version of glibc on the build machine does not matter (it is not statically linked).
What makes the test fail is the glibc used to run the test on the test machine.

Am I correct?
I need to clarify a couple of things: when I mention glibc, I mean the rpm package, not a particular library. Most of the libs in the package are dynamically linked and, since it is glibc, they reside in /usr/lib. BTW, lib nss is a part of it.

Now, the problem with the glibc package that we hit here is a run-time problem. In particular, the problem is dynamic linking with a libpthread.so that has this bug. Building on a machine with the old package is not a problem as long as the bits are not run on that machine.

glibc 2.3.2-11 (the one that we had) is ancient. I believe the bug that was filed against glibc dates from 2003, and we still had that version on the machine.

So, instead of modifying the Makefile in NSS, I feel that changes should be made to the configure part of NSPR, if they should be made at all. But that will not provide a solution for running the bits on a machine with an old set of libraries. So we still have an issue.

The better but more difficult solution is to check which dynamic libraries we are linking with when we start NSS.

The JDK has the same dependencies on OS libraries on all platforms. They track some key links dynamically, but they also publish the minimum requirements for the systems, and it is the user's responsibility to verify them.
OK, so maybe this is a libraries or test bug.
Our QA must ensure that it is running on a machine with good libs.
Perhaps not every lib on every machine, but when we know there is a 
known problem with a known version of a known lib on a known OS,
then I think we must check for that, at least in the test scripts.
I understand your point.
Now we can take care of our test machine and check if the right version of glibc is present on this machine to avoid having this failure. If this test is included in all.sh, it will probably save future dev/QA a lot of troubleshooting time. I'll work on this.

What about our customers? Do we have a way to pass them this information about the minimum version of glibc that works with NSS?
Assignee: alexei.volkov.bugs → christophe.ravel.bugs
Status: REOPENED → NEW
Component: Build → Test
QA Contact: build → test
Summary: NSS being built with wrong version of glibc on Linux → NSS testing must detect a wrong version of glibc on Linux
I don't think it should be a P1 or a blocker bug at this point.
Hmm, I thought I had changed this bug during our test bug meeting on Wednesday. But apparently not. This is not a P1 blocker. We know about the issue for our own QA machines, and corrective action is being taken or has already been taken.

Christophe, we do need to document the minimum version of glibc / Linux that are needed for the bits that we produce. But I thought you were already doing that.

We should also check in the test script for an unsupported version if we have such a test available. That's a P3 test issue. The tests already fail because the OS is unsupported. The lack of this test could only be a P1 blocker if the tests were giving us false positives, but that's not the case.
Severity: blocker → normal
Priority: P1 → P3
Assignee: christophe.ravel.bugs → slavomir.katuscak
Target Milestone: 3.11.1 → Future
Target Milestone: Future → ---
Increasing priority to P2 (based on priorities set at the meeting with Nelson in
September).
Priority: P3 → P2
Alexei, in comment #3 you wrote: Recommended version of glibc > 2.3.2-27.

The question is, how can I detect this minor number (27)?

I checked FAQ on libc page:

--
4.9. How can I find out which version of glibc I am using in the moment?

{UD} If you want to find out about the version from the command line simply run the libc binary. This is probably not possible on all platforms but where it is simply locate the libc DSO and start it as an application. On Linux like

	/lib/libc.so.6
--

With this I got:
GNU C Library stable release version 2.3.2, by Roland McGrath et al.
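
For use in a test script, the release number can be pulled out of that banner line. The following is a minimal sketch that assumes the banner keeps the "release version X.Y[.Z]" wording shown here; `glibc_release_from_banner` is a hypothetical helper name.

```shell
# Hypothetical helper: read a libc banner on stdin and print the release number.
glibc_release_from_banner() {
    sed -n 's/^GNU C Library .*release version \([0-9][0-9.]*[0-9]\).*/\1/p' | head -n 1
}

# Sample banner line quoted above in this bug.
printf '%s\n' 'GNU C Library stable release version 2.3.2, by Roland McGrath et al.' \
    | glibc_release_from_banner
# prints 2.3.2
```

On a live system the input would come from running the libc DSO itself, e.g. `/lib/libc.so.6 | glibc_release_from_banner`; the path varies by distribution. This yields only the release (2.3.2), not the package build number.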
libc does not carry a minor version in its version strings. There are three more or less reliable ways to detect the minor version: 1. query the package db using rpm; 2. check the compilation date (minor releases go in chronological order); 3. write code to test for a feature of a minor release.
Attached patch: Patch.
Attachment #305716 - Flags: review?(alexei.volkov.bugs)
Comment on attachment 305716: Patch.

During the review, it became clear to me that making the NSS test scripts dependent on a particular package system is a mistake. Unfortunately, glibc does not carry information about its release build. It has only release information and possibly a date. The latter is optional and not many systems have it.

In my opinion, we are out of options, and I'm voting to drop this bug and close it as "will not fix".
Attachment #305716 - Flags: review?(alexei.volkov.bugs) → review-
Is there NO version information in the glibc shared object in a 
"release build"?  
Does the ident command find nothing?
How about a command such as
    strings glibc.so | grep -c "expected version" 
No, there is no information about the last number (see comment #13). You will not get more information using the strings command.
(In reply to comment #17)
> Is there NO version information in the glibc shared object in a 
> "release build"?
I think you mean this:
goa1:/home/volkov # /lib/libc-2.4.so 
GNU C Library development release version 2.4 (20060505), by Roland McGrath et al.
Copyright (C) 2006 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Configured for i686-suse-linux.
Compiled by GNU CC version 4.1.0 (SUSE Linux).
Compiled on a Linux 2.6.16 system on 2006-05-05.

As you see, it is not as detailed as we need it to be.

> Does the ident command find nothing?
ident works with RCS files. There are no libc sources on most systems. The library is shipped in binary form.
> How about a command such as
>     strings glibc.so | grep -c "expected version"
The command does not yield much more useful info than what I printed in the answer to the first question.

Slavo, Alexei,

Even if it was a manual process, how did we determine what version of glibc we had on our Linux systems when we ran into the problem?
(In reply to comment #20)
> Slavo, Alexei,
> 
> Even if it was a manual process, how did we determine what version of glibc we
> had on our Linux systems when we ran into the problem ?

For RH we can use rpm -q glibc. 
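
A package-based check along these lines could look like the sketch below. It is only valid on rpm-based systems where glibc was installed from a package, and the helper names are hypothetical. It also assumes that 2.3.2-27 itself is acceptable (a ">=" comparison) and that GNU `sort -V` is available on the test host.

```shell
# Assumed minimum, taken from the recommendation earlier in this bug.
MIN_GLIBC="2.3.2-27"

# Hypothetical helper: succeed when version-release $1 is at least $2.
# Relies on GNU sort's -V (version sort) option.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical check: 0 = new enough, 1 = too old, 2 = cannot tell
# (rpm missing, or glibc not installed from an rpm package).
check_glibc_package() {
    installed=$(rpm -q --qf '%{VERSION}-%{RELEASE}\n' glibc 2>/dev/null) || return 2
    # Multi-arch systems may list several glibc packages; look at the first.
    installed=$(printf '%s\n' "$installed" | head -n 1)
    if version_at_least "$installed" "$MIN_GLIBC"; then
        return 0
    fi
    echo "glibc package ${installed} is older than ${MIN_GLIBC}" >&2
    return 1
}

# The comparison on its own, using the version this bug was hit with.
version_at_least "2.3.2-11" "$MIN_GLIBC" || echo "2.3.2-11 is too old"
```

Returning a separate "cannot tell" status lets a test script warn instead of fail on systems without rpm.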
This bug has been open for a long time without any progress.

Summary:
1. There is no easy way to detect the version of glibc on a system. The recommendation is to use >= 2.3.2-27, but the last number (27) is not detectable, or at least we don't know how to detect it.
2. We can look at the system package version, but this method is not reliable (we can't be sure that the glibc in use is the packaged one).

At the last staff meeting where this bug was discussed (about 11 months ago), it was suggested to close it, with some arguments against fixing it (I don't remember all the details). I would like to get rid of this bug, so I'm closing it as WONTFIX. If anybody wants to get it fixed and knows a reliable way to get the glibc version, feel free to reopen it.
Status: NEW → RESOLVED
Closed: 18 years ago → 15 years ago
Resolution: --- → WONTFIX