Closed
Bug 100447
Opened 24 years ago
Closed 24 years ago
safe_popen fails under BSD/OS 4.2, BSD/OS 4.3
Categories
(NSS :: Libraries, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
3.4
People
(Reporter: lidl, Assigned: wtc)
Details
Attachments
(3 files, 3 obsolete files)
|
2.97 KB,
patch
|
Details | Diff | Splinter Review | |
|
6.81 KB,
patch
|
Details | Diff | Splinter Review | |
|
2.32 KB,
patch
|
Details | Diff | Splinter Review |
It was pointed out to me that the builds of mozilla that I have made for
BSD/OS4.2 core dump on exit. This was rather curious, I thought, but after
some investigation, I was able to trace back the failure to the safe_popen()
function in mozilla/security/nss/lib/util/unix_rand.c
After looking at this problem a little closer, it finally became clear to me
that this was also the problem that prevented mozilla from running on BSD/OS 4.3
if I compiled with --enable-crypto!
The safe_popen() function takes a program to run, makes a pipe, forks and
then closes down the majority of the fds for the child process. It execs
the command and sends back the results to the parent -- results which are
used to try to garner some randomness from the machine.
The command that is attemtped to run is "netstat -in", which produces a
more or less static output for a given machine under BSD/OS. If you
wanted something that counts packets, so it changes often, it would
have to be "netstat -sin". Here's what a "netstat -in" looks like on
BSD/OS 4.2:
lidl@sloar-40: netstat -in
Name Index MTU Speed Mtrc Address Network
exp0 1 1500 100M 0 00:90:27:c1:bb:3c
exp0 1 0 192.111.45.25 192.111.45.0/26
exp0 1 0 fe80::290:27ff:fec1:bb3c%exp0 fe80::/64
lo0 2 4352 0
lo0 2 0 fe80::1%lo0 fe80::/64
lo0 2 0 ::1 ::1/128
lo0 2 0 127.0.0.1 127
ppp0* 3 1500 0
tun0* 4 1500 0
vlan0* 5 0 0 00:00:00:00:00:00
vlan1* 6 0 0 00:00:00:00:00:00
gif0* 7 1280 0
gif1* 8 1280 0
gif2* 9 1280 0
gif3* 10 1280 0
Not very random.
Under BSD/OS, the safe_popen() routine causes the child process to coredump.
Under 4.2, it coredumps after running the exec'd program. This shows itself
as a completely "regular" run of mozilla, but after it exits normally,
a core file is written from the child process.
Under 4.3, the child exits before any data is written to the parent, so
the parent process hangs forever in the fread() of the expected data from
the (no longer running) child process.
Removing the attempt to call safe_popen() (heck, I removed the entire
function from my local copy) makes the program:
1) Not dump core under BSD/OS 4.2.
2) Initialize successfully under BSD/OS 4.3.
To compenstate for this lack of randomness, I added "/dev/urandom" to the
list of files that gets read, more or less applying the patch that
is attached to this bug:
http://bugzilla.mozilla.org/show_bug.cgi?id=96626
I marked this bug as "critical" in that it core-dumps on 4.2 (although
only when exiting the application) and prevents it from running on 4.3
entirely.
I believe you should be able to replicate this problem by examing the
directory where the tests on host 'otaku' (from the tinderbox-ports page)
are run -- look for the mozilla-bin.core file.
I'll attach my diffs for this problem, which disable the safe_popen()
code entirely for BSD/OS.
| Assignee | ||
Comment 2•24 years ago
|
||
Nelson, could you please take a look at this bug? Thanks.
Assignee: wtc → nelsonb
Comment 3•24 years ago
|
||
Marking NEW adding keywords.
| Assignee | ||
Comment 4•24 years ago
|
||
Instead of removing the call to safe_popen on BSD/OS,
we should find out why safe_popen causes the child
process to coredump.
Could you post the stack trace in the core file?
Priority: -- → P1
Target Milestone: --- → 3.4
Adding a traceback will be difficult for me -- I've upgraded most my
machines to BSD/OS 4.3, which as I explained in the original bug
report fails differently than under 4.2. Because the browser won't
even start under 4.3 without this patch, I've already had to modify
all my local copies of the CVS tree to have this patch included.
However, since there is a BSD/OS 4.2 tinderbox running (otaku), and
the bug manifests itself as a core dump after the main browser exits,
I expect that you will find a core dump on otaku, in whatever the
working directory of the browser is when the browser tests are run.
If not, cd to a writable directory, run the browser, exit it (you don't
even have to visit any sites) and examine the core file.
If you don't have the core file on the tinderbox machine, I can, if I
must, reload one of my workstations with 4.2 and get that trace. However,
as I pointed out, the netstat that is called doesn't give a particularly
random number under BSD/OS. You would literally be better off calling
random() and using that the netstat that is run.
Comment 6•24 years ago
|
||
On most (some? many?) Unix platforms, netstat -in gives interface statistics
for all interfaces. On BSD, it gives only static configuration info, and
therefore is useless as a source of entropy.
I suspect that there is some alternate set of flags that produces interface
statistics for all interfaces on BSD. Can you tell us what those flags are
and show some sample output of such a command?
Well, if you run 'netstat -in', you get the useless (for randomness purposes)
list of interfaces on the machine, which is pretty static across reboots.
To get the statistics, you would need to run 'netstat -sin', which looks
like the historical 'netstat -in' output:
lidl@nohost-77: uname -a
BSD/OS nohost.pix.net 4.3 BSDI BSD/OS 4.3 Kernel #2: Tue Oct 23 13:59:52 EDT
2001 root@nohost.pix.net:/usr/src/sys/compile/MEKETREX i386
(mac address sanitized to protect the innocent)
lidl@nohost-78: netstat -sin
Name Index Address Ipkts Ierrs Opkts Oerrs Coll Drop
exp0 1 xx:xx:xx:xx:xx:xx 364945 0 340750 0 0 0
exp0 1 192.168.168.1 365571 336982
exp0 1 fe80::2xx:xxff:fexx:xxxx%exp0 0 1
exp1 2 xx:xx:xx:xx:xx:xx 6630526 0 5851870 1 0 0
exp1 2 192.168.169.1 6509562 5731715
exp1 2 fe80::2xx:xxff:fexx:xxxx%exp1 0 1
exp1 2 192.168.169.2 112487 141258
wx0* 3 xx:xx:xx:xx:xx:xx 0 0 0 0 0 0
lo0 4 293072 0 293072 0 0 0
lo0 4 ::1 0 0
lo0 4 fe80::1%lo0 0 0
lo0 4 127.0.0.1 267669 267669
ppp0* 5 0 0 0 0 0 0
tun0* 6 0 0 0 0 0 0
vlan0* 7 00:00:00:00:00:00 0 0 0 0 0 0
gif0* 8 0 0 0 0 0 0
Comment 8•24 years ago
|
||
The patch give above disables the use of the netstat command on ALL
unix platforms, and adds the use of /dev/urandom only on BSDI.
I don't object to replacing netstat with /dev/urandom on platforms that
support /dev/urandom, but I do object to eliminating the use of netstat
without replacing it with something that provides at least as much
entropy as is being lost.
Comment 9•24 years ago
|
||
In http://lists.insecure.org/linux-kernel/2000/Mar/1739.html the author
explains that some systems may run out of swap space on one of the first
Copy-On-Write (COW) events after a fork, causing a SIGSEGV to the child.
This seems like a very likely cause for bug 100447, since mozilla is
renowned for using lots of memory.
So, submittor, please try substituting vfork in place of fork in
safe_popen on your BSD box and see if that solves the problem.
While you're at it, please make the command use the -s flag on BSD.
If this works, please attach the patch to this bug report.
While it is likely that mozilla will begin to use /dev/urandom on some
platforms in the future, that is not the solution to this problem.
Please follow bug 96626 on the introduction of /dev/urandom to mozilla.
Whiteboard: awaiting trial of vfork by submittor
| Reporter | ||
Comment 10•24 years ago
|
||
I'll address two things in this email, then I'll go build the test
cases that I was asked to test. (Even though I'm 99% sure that
it isn't going to help.)
1) I didn't mean to disable the netstat code on all machines, just
BSD/OS. There is a missing chunk of code in the attachment that
addresses this problem.
#ifndef BSDI
#define DO_NETSTAT
#endif
2) I'm quite certain that I'm not running out of swap. The 4.2 machine
that I was testing on had in excess of 600MB of swap. Mozilla may
be a memory pig, but even it is not that big.
The two different 4.3 machines that I've been testing on have,
respectively, 512MB RAM + 2GB swap, and 1.25GB RAM + 2.25GB of swap.
I'm going to checkout a fresh tree and try the changes suggested.
I'll report back later on the results.
| Reporter | ||
Comment 11•24 years ago
|
||
Well, as I suspected, changing the call from fork() to vfork()
didn't make it work. So that's not the problem. And, as a point
of reference, even when running the debugging build under
gdb, I still had in excess of 1.5GB of available swap space,
so the running out swap problem that was referenced isn't
the problem either.
I've cleaned up my earlier patch a little so that it doesn't
disable the netstat code on most machines, and more or less
predicates whether or not safe_popen() gets used on whether or
not /dev/urandom is in play.
While this doesn't address all of the commentary in bug 96626,
I don't feel it needs to either. Some of that commentary is
quite flawed, as it assumes deep knowledge about the internals of the
/dev/urandom implementation. Those assumptions may hold true
for Linux, but they don't for many flavors of the various BSD
systems available. All one should assume about /dev/urandom
is that you can open it and read a random or pretty pseudo-random
stream of bytes for it. One should NOT assume that is does
SHA-1 internally and therefore anything more than 160 bits pointless.
I'll attach my current patch next.
| Reporter | ||
Comment 12•24 years ago
|
||
Comment 13•24 years ago
|
||
Thank you for testing vfork.
This bug reports a problem with safe_popen on BSDI.
Another bug, bug 108708, reports a similar problem on Solaris,
(although it's not clear that these problems have the same cause).
Rather than fixing the root cause of the problem, the patches in this bug
report would simply avoid using safe_popen and substitute the use of
/dev/urandom instead.
This report also points out that /dev/urandom is of varying quality
on various platforms, which is precisely why I think the use of
/dev/urandom should be in addition to, rather than instead of,
other sources of entropy.
There are two separate issues:
1) fix safe_popen on BSDI and Solaris, (bugs 100447 and 108708)
2) add /dev/urandom as another source. (bug 96626)
I have a patch for 96626 that attempts to use /dev/urandom on ANY
unix/Linux that supports it (not a compile-time decision). It addresses
the other concerns expressed in 96626 by trying /dev/urandom first,
and only trying 1KB of it. I'll add that patch to 96626 when it's ready
(probably today).
In the meantime, let's get to the bottom of what's wrong with
safe_popen on BSDI (and Solaris). I have no BSDI system on which to
investigate, so further progress on this issue depends on BSDI developers.
| Reporter | ||
Comment 14•24 years ago
|
||
Getting to the bottom of what's wrong with the code in safe_popen() is
going to be a pretty messy problem, for a variety of reasons:
1) That code is written without any thought to the reality of running
on systems that use pthreads as the threading model. Signal handling
under pthreads doesn't guarantee that any particular thread will
get a particular signal delivered to it. Remember that signals
are still process-scope, not thread scope. So, diddling the
signals with sigaction() before and after forking is at best a
potentially haphazard way of doing things. At worst, doomed to
failure.
2) The use of strtok() is a no-no in a treaded program, at least without
a mutex to protect the state of it. Given that safe_popen() makes
no attempt to protect mutexes that might be held across the fork()
(one of the few things that are carried over from parent->child
when a pthread'd program forks), I wouldn't care to guess what odd
failures might exhibit themselves due to some residual mutex being
held in libc or another library that the child process encounters
already locked by the parent.
I would guess that the safe_popen() routines should be using
pthread_atfork() to handle a reasonable cleanup of the child
space process context. Without that rewrite, I think this code
is doomed to be buggy.
All that being said, I'm building a debug version of the suspect code
under 4.2 (where the child actually drops a core file on application
exit), and I'll post a backtrace when it finishes.
Comment 15•24 years ago
|
||
safe_popen is not a public function.
It is intended for use only with the software crypto token.
Prevention of simultaneous use of safe_popen is handled at higher
levels within the software crypto token (or are supposed to be).
Additional redundant locks at this level should not be added.
Having said that, contributions for the improvement of safe_popen are
welcome.
| Reporter | ||
Comment 16•24 years ago
|
||
Your comments about safe_open() being a private function -- sure, but
that doesn't address any of the points I just raised.
Given that this function runs in the context of a threaded application,
and on many of unix-like flavors, it uses pthreads, the points that
I raised are entirely valid and haven't been addressed.
Strtok is unsafe in a multi-threaded application. Irregardless of whether
or not the safe_popen() function is only running once, any other
thread that is operating could smash the static strtok buffer, corrupting
the buffer before safe_popen is done with it. Since you cannot force
other threads to not operate while this is running, you have an
insurmountable problem with the code as it exists currently.
The signal handling code is likewise bound to give problems without
additional protection. While there may be protection with regard to
only one instance of safe_open() operating due to the higher levels
of the code, there is no protection (nor can there be) as the effects
of other threads that may be interleaved or running simultaenously
on another processor in the system.
As for contributions to the code, I've contributed twice. In both
cases the problems are avoided by effectively removing this code
from the program. Spending a huge amount of time in order to
rewrite this code to make it proof against pthreads effects is
not something that I am willing to do.
The root issue here is how to get randomness into the program, not on
how to get fork(), pipe() and a host of string functions
to operate perfectly in a pthreads environment. I believe that
my patch (removal of safe_popen and all the dangerous things it
does) and the reading of a system source of randomness is an
effective solution to the problem.
As a final concern, since part of the goals of the mozilla project is
to make it run faster (and given that it is already a very large application),
using fork() is no way to make it run faster. Much less running fork()
twice.
| Assignee | ||
Comment 17•24 years ago
|
||
I noticed the sigaction(SIGCHLD) calls in safe_popen() and
safe_pclose() and I agree with Mr. Lidl that they have many
potential problems in a pthreads program. I just opened
bug 109971 for this issue. (I don't have a solution though.)
However, I don't think those two sigaction() calls caused
this bug. Their purpose is to make waitpid() work. To
verify that they are not the cause of this bug, we can do
this experiment:
- safe_popen: comment out the sigaction stuff.
- safe_pclose: comment out the sigaction call and the waitpid
while loop.
On the issue of strtok(), yes, it's not thread-safe. Looking
at the code, I think it can be rewritten to avoid the strtok()
call. But there are other problems with our child process.
In the child process, we can only call async-signal-safe functions
between fork() and execvp(). We definitely violated this restriction.
I don't know if this is the cause of the problem though. I will
attach a patch to address this issue.
Comment 18•24 years ago
|
||
NSS uses NSPR. NSPR uses pthreads on all **ix platforms. Ergo,
I believe NSS threads are always pthreads. When a pthread thread
calls fork, the only thread that is replicated in the newly forked
process is the thread that called fork. That is, the new process
initially has only one thread. strtok is used in the only thread
in the new process. It is safe in that context. One could replace
strtok with something like strtok_r, but IMO that's pointless in
that single-threaded process environment. If BSDI's pthread
library is reproducing more than one thread in the newly forked
process, I believe that is a flaw in that pthread library.
A greater concern is that processes that use NSS MUST be able to
succesfully fork other processes. If there is a flaw in NSS/NSPR
that makes NSS-based processes from being able to fork children,
such a flaw MUST be fixed. To simply eliminate the calls to fork
rather than fixing the real flaw is to stick one's head in the sand.
I do not object, in principle, to replacing the popen with data
from /dev/urandom on those platforms where /dev/urandom is known
to be more than a trivial LFSR PRNG. ONCE the flaw(s) that affect fork
on BSDI and Solaris are resolved, then I have no problem with eliminating
the use of safe_popen on BSDI.
| Assignee | ||
Comment 19•24 years ago
|
||
| Assignee | ||
Comment 20•24 years ago
|
||
lidl@pix.net wrote:
> Under 4.3, the child exits before any data is written to the parent, so
> the parent process hangs forever in the fread() of the expected data from
> the (no longer running) child process.
Could you attach a debugger to the child process and post its backtrace?
I am wondering if its backtrace would look similar to the one in bug
108708, seen on Solaris.
Comment 21•24 years ago
|
||
Why does the parent hang, in this case, instead of getting zero from fread,
due to reading EOF?
Is it possibly the case that if the FD is already at EOF before the
parent calls fdopen, then fread doesn't detect EOF?
| Reporter | ||
Comment 22•24 years ago
|
||
Here's the stack backtrace for the failure on BSD/OS 4.2.
lidl@sloar-39: gdb ./mozilla-bin mozilla-bin.core
GNU gdb
Copyright 1998 Free Software Foundation, Inc.
[...]
Reading symbols from
/data/obj/mozilla-096/dist/bin/components/libimpText.so...done.
Reading symbols from
/data/obj/mozilla-096/dist/bin/components/libabsyncsvc.so...done.
#0 0x486bcb6d in kill () from /shlib/libc.so.2
(gdb) where
#0 0x486bcb6d in kill () from /shlib/libc.so.2
#1 0x487436c3 in abort () from /shlib/libc.so.2
#2 0x48753624 in _thread_aio_entry_destroy () from /shlib/libc.so.2
#3 0x48753e8c in _thread_sys_close () from /shlib/libc.so.2
#4 0x4a833d8c in safe_popen (cmd=0x4a885a28 "netstat -nis") at unix_rand.c:666
#5 0x4a834038 in RNG_SystemInfoForRNG () at unix_rand.c:790
#6 0x4a7896c5 in nss_Init (configdir=0x8450f40
"/homes/elves/lidl/.mozilla/lidl/vipfc2ii.slt",
certPrefix=0x4a86b47b "", keyPrefix=0x4a86b47b "", secmodName=0x4a86b49d
"secmod.db",
readOnly=0, noCertDB=0, noModDB=0, forceOpen=0) at nssinit.c:250
#7 0x4a789867 in NSS_InitReadWrite (
configdir=0x8450f40 "/homes/elves/lidl/.mozilla/lidl/vipfc2ii.slt") at
nssinit.c:307
#8 0x4a7155fd in nsNSSComponent::InitializeNSS (this=0x824e700)
at
/scratch/proj/src/mozilla-096/mozilla/security/manager/ssl/src/nsNSSComponent.cpp:547
#9 0x4a715a35 in nsNSSComponent::Init (this=0x824e700)
at
/scratch/proj/src/mozilla-096/mozilla/security/manager/ssl/src/nsNSSComponent.cpp:609
#10 0x4a71d859 in nsNSSComponentConstructor (aOuter=0x0, aIID=@0x8060c08,
aResult=0x8047510)
at
/scratch/proj/src/mozilla-096/mozilla/security/manager/ssl/src/nsNSSModule.cpp:51
#11 0x48235064 in nsGenericFactory::CreateInstance (this=0x8255ce0, aOuter=0x0,
aIID=@0x8060c08,
aResult=0x8047510)
at
/scratch/proj/src/mozilla-096/mozilla/xpcom/components/nsGenericFactory.cpp:74
#12 0x4822f686 in nsComponentManagerImpl::CreateInstanceByContractID
(this=0x806a200,
aContractID=0x8061072 "@mozilla.org/security/entropy;1", aDelegate=0x0,
aIID=@0x8060c08,
aResult=0x8047510)
at
/scratch/proj/src/mozilla-096/mozilla/xpcom/components/nsComponentManager.cpp:1673
#13 0x48230620 in nsComponentManagerImpl::GetServiceByContractID
(this=0x806a200,
aContractID=0x8061072 "@mozilla.org/security/entropy;1", aIID=@0x8060c08,
result=0x80475b4)
at
/scratch/proj/src/mozilla-096/mozilla/xpcom/components/nsComponentManager.cpp:2119
#14 0x4822ba07 in nsGetServiceByContractID::operator() (this=0x8047710,
aIID=@0x8060c08,
aInstancePtr=0x80475b4)
at
/scratch/proj/src/mozilla-096/mozilla/xpcom/components/nsComponentManager.cpp:236
#15 0x8058ea3 in nsCOMPtr<nsIEntropyCollector>::assign_from_helper
(this=0x8047720,
helper=@0x8047710, aIID=@0x8060c08) at
../../dist/include/xpcom/nsCOMPtr.h:972
#16 0x805be1b in nsCOMPtr<nsIEntropyCollector>::nsCOMPtr (this=0x8047720,
helper=@0x8047710)
at ../../dist/include/xpcom/nsCOMPtr.h:553
#17 0x8052f08 in VerifyPsmAbsentOrSane (argc=1, argv=0x804793c)
at /scratch/proj/src/mozilla-096/mozilla/xpfe/bootstrap/nsAppRunner.cpp:1029
#18 0x8053ae7 in main1 (argc=1, argv=0x804793c, nativeApp=0x0)
at /scratch/proj/src/mozilla-096/mozilla/xpfe/bootstrap/nsAppRunner.cpp:1254
#19 0x8054de3 in main (argc=1, argv=0x804793c)
at /scratch/proj/src/mozilla-096/mozilla/xpfe/bootstrap/nsAppRunner.cpp:1630
#20 0x804f28e in __start ()
(gdb) up 3
#3 0x48753e8c in _thread_sys_close () from /shlib/libc.so.2
(gdb) up
#4 0x4a833d8c in safe_popen (cmd=0x4a885a28 "netstat -nis") at unix_rand.c:666
666 for (fd = getdtablesize(); --fd > 2; close(fd))
Current language: auto; currently c
(gdb) list
661 case 0:
662 /* dup write-side of pipe to stderr and stdout */
663 if (p[1] != 1) dup2(p[1], 1);
664 if (p[1] != 2) dup2(p[1], 2);
665 close(0);
666 for (fd = getdtablesize(); --fd > 2; close(fd))
667 ;
668
669 /* clean up environment in the child process */
670 putenv("PATH=/bin:/usr/bin:/sbin:/usr/sbin:/etc:/usr/etc");
(gdb) p fd
$1 = 7
-Kurt
| Assignee | ||
Comment 23•24 years ago
|
||
Mr. Lidl, thank you for the stack backtrace for the failure
on BSD/OS 4.2.
So the child process crashed while closing fd 7. Some
observations.
1. It would be nice to find out what fd 7 was. In particular,
was it p[0] or p[1] (the pipe)?
2. All the calls made by the child process up to that point
(dup2, close, and sysconf) were async-signal-safe. So
this crash in close() could be a BSD/OS bug.
3. One possible workaround is to comment out the for loop that
closes all the unnecessary fd's in the child process:
for (fd = getdtablesize(); --fd > 2; close(fd))
;
This might introduce security vulnerability though. On the
other hand, the commands NSS executes, 'ps' and 'netstat',
are not arbitrary user code, so perhaps this security
precaution of preventing access to unnecessary fd's is not
really necessary.
| Reporter | ||
Comment 24•24 years ago
|
||
Well, I've spent quite a bit of time dinking around with various test setups
trying to see what works and what doesn't work. The long and short of it
is that I cannot make the exec() stuff work under BSD/OS, either due to an
error in how fork() is implemented, or some other interaction that I don't
understand.
The traceback that I appended showed the child process, "hung" when attempting
to close fd 7. "hung" is a bad description -- it was busy flipping between
the threads of the process, because it thought that the close call would block.
But the observant reader will note that this was in the child process, after
the fork. Why are there multiple threads in the child process after a fork?
There shouldn't be, according the rules of pthreading. But clearly there
are (see my previous debugger output).
1) I wasn't able to figure out what fd 7 was, I probably could do this
with a little more effort. But not tonight.
2) As above, it's not a *crash* in close, but rather where the thread
goes into a loop trying to switch to the other threads. They don't have
any work to do, so they switch back. Repeat, lather, rinse.
3) Commenting out the loop that closes the fds of the child process
causes the program to exit with a GDK error. What is really happening
is the process is getting a SIGPIPE writing to the X11 socket. The
application is acting as if FD_CLOEXEC is set on all the fds when
the command is run. Weird.
Modifying the code so that it only closes down to fd 8 gives the same
error. (SIGPIPE)
Finally, a small modification of changing the exec in the child
process to a call to _exit(0) shows that child process exits cleanly
and the parent process goes happily on its way. (Except, of course,
that it doesn't get the random data it was looking for.)
So, I'm drawing a blank as to what to test next -- there is definately
something squirrely with the pthreads, fork() and attaching to the
child process under BSD/OS. I don't know if it is a real problem, or just
something that looks like a problem. I just don't know.
Anyhow, I'm going to attach the latest version of my patch, which
has my debugging stuff conditionalized, rolls together my previous
patch, integrates wtc's patch to avoid certain unsafe operations and
disables the safe_popen code for just BSD/OS.
| Reporter | ||
Comment 25•24 years ago
|
||
Comment 26•24 years ago
|
||
This latest patch is apparently applied to an older version of the sources.
The code on the trunk and the code on the tip of the NSS_3_3_BRANCH
(which which PSM currently takes the sources) already use /dev/urandom.
Perhaps the PSM team has not yet moved the static NSS_CLIENT_TAG forward
to include these recent changes to unix_rand.c, so I'll add some members
of the PSM team to this cc list
I'll attach a patch that builds on the existing code that uses /dev/urandom
and that doesn't call fork on BSDI systems when data is obtained from
/dev/urandom.
Assignee: nelsonb → wtc
Comment 27•24 years ago
|
||
This patch applies to the current tip of the NSS_3_3_BRANCH.
It doesn't attempt to fork either PS or netstat on BSDI systems
when it succeeded in getting data from /dev/urandom.
| Assignee | ||
Comment 28•24 years ago
|
||
Comment on attachment 58802 [details] [diff] [review]
patch for NSS_3_3_BRANCH
I think
+#if defined(BSDI)
+ if (bytes)
+ return;
+#endif
should simply be
+#if defined(BSDI)
+ return;
+#endif
because we can't call safe_popen() no matter what. Calling safe_popen() will
not get
us more entropy; instead the child process will crash.
Comment 29•24 years ago
|
||
In the absence of both netstat and /dev/urandom input, this function does not
collect enough entropy to securely seed the PRNG, IMO. So, perhaps NSS
initialization should fail in this case.
| Assignee | ||
Comment 30•24 years ago
|
||
Mr. Lidl:
If you comment out the loop that closes the fds of the child process,
is it the parent process or the child process that gets a SIGPIPE
writing to the X11 socket? It is worrisome that the child process
can't even call exec after we removed all the unsafe functions. This
seems to mean that a process using pthreads and X11 can't spawn another
process. Can you have someone at BSDI look at this problem?
Maybe the pthreads library on BSD/OS is not ready to be used by the
Mozilla client?
Status: NEW → ASSIGNED
| Assignee | ||
Comment 31•24 years ago
|
||
This is Nelson's patch (attachment 58802 [details] [diff] [review]) plus a comment
describing this bug and the use of "netstat -nis" on
BSD/OS (when the bug that prevents us from calling
safe_popen in a pthreads environment is fixed).
Attachment #49865 -
Attachment is obsolete: true
Attachment #57608 -
Attachment is obsolete: true
Attachment #58802 -
Attachment is obsolete: true
| Assignee | ||
Comment 32•24 years ago
|
||
Fix checked into the tip of NSS.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
| Assignee | ||
Comment 33•23 years ago
|
||
Mr. Lidl,
Yesterday a customer sent me a simple test program that
can cause safe_popen to fail on Linux in a way similar
to the stack trace you provided in comment #22.
I looked into it and found that it was a shared library
linking order problem. I invoked the 'ldd' command on
the executable and found that libc.so was before libpthread.so.
When I built the test program with the -pthread option of
g++/gcc so that libpthread.so is before libc.so in the
linking order, the test program completed successfully.
If BSD/OS has 'ldd' or a similar command that can report
the shared library linking order of an executable program,
could you invoke it on the mozilla executable and see the
relative order of libc.so and libpthread.so? If the C++
compiler has a -pthread or -pthreads option, are we using
that option when linking the mozilla executable on BSD/OS?
If the C++ compiler doesn't have such an option, we will
need to link with -lpthread explicitly to ensure the correct
linking order of libc.so and libpthread.so.
| Reporter | ||
Comment 34•23 years ago
|
||
There is not a seperate pthreads library on BSD/OS.
The pthreads functions have been integrated into libc, directly.
However, as you requested, here is the ldd output:
lidl-58: export LD_LIBRARY_PATH=/usr/local/mozilla
lidl-59: cd /usr/local/mozilla
lidl-60: ldd mozilla-bin
libgkgfx.so => /usr/local/mozilla/libgkgfx.so (0x4808e000)
libjsj.so => /usr/local/mozilla/libjsj.so (0x480b8000)
libmozjs.so => /usr/local/mozilla/libmozjs.so (0x480d1000)
libxpcom.so => /usr/local/mozilla/libxpcom.so (0x48143000)
libplds4.so => /usr/local/mozilla/libplds4.so (0x4823f000)
libplc4.so => /usr/local/mozilla/libplc4.so (0x48242000)
libnspr4.so => /usr/local/mozilla/libnspr4.so (0x48247000)
libdl.so => /shlib/libdl.so (0x48274000)
libgtk-1.2.so.0 => /usr/contrib/lib/libgtk-1.2.so.0 (0x48277000)
libgdk-1.2.so.0 => /usr/contrib/lib/libgdk-1.2.so.0 (0x48395000)
libgmodule-1.2.so.0 => /usr/contrib/lib/libgmodule-1.2.so.0 (0x483c5000)
libglib-1.2.so.0 => /usr/contrib/lib/libglib-1.2.so.0 (0x483c8000)
libXext.so.6 => /usr/X11R6/lib/libXext.so.6 (0x483ea000)
libX11.so.6 => /usr/X11R6/lib/libX11.so.6 (0x483f8000)
libm.so => /shlib/libm.so.0.0 (0x484d6000)
libstdc++.so.1 => /usr/lib/libstdc++.so.1 (0x484e7000)
libgcc.so.1 => /shlib/libgcc.so.1 (0x48526000)
libc.so.2 => /shlib/libc.so.2 (0x48532000)
I will add one other comment about a possibly related problem that I
have heard of that causes problems on BSD/OS when linking pthreads
problems. Basically, the problem is that giving libc as an explicit
library on the link line (via a -lc) can cause problems that will
not occur if the program is linked such that the compiler will
implicitly add the library to the link line.
e.g.: gcc -o foo foo.o -lc (might cause pthreads problems)
gcc -o foo foo.o (will not cause pthreads problems)
I don't think that is the issue here, as the final link line on my BSD/OS
4.3 host goes like this:
g++ -o mozilla-bin -I/usr/X11R6/include -fno-rtti -fno-exceptions -Wall -Wconver
sion -Wpointer-arith -Wbad-function-cast -Wcast-align -Woverloaded-virtual -Wsyn
th -Wno-ctor-dtor-privacy -pthread -pipe -DNDEBUG -DTRIMMED -DWIDGET_DLL=\"libw
idget_gtk.so\" -DGFXWIN_DLL=\"libgfx_gtk.so\" -I/usr/contrib/lib/glib/include -I
/usr/contrib/include -I/usr/X11R6/include nsAppRunner.o nsWindowCreator.o showO
SAlert.o nsSigHandlers.o nsNativeAppSupportGtk.o nsNativeAppSupportBase.o -L..
/../dist/bin -L../../dist/lib -lgkgfx -lmpfilelocprovider_s -ljsj -L../../dist
/bin -lmozjs -L../../dist/bin -lxpcom -L/usr/obj/data/mozilla/dist/lib -lplds4 -
lplc4 -lnspr4 -ldl -L/usr/contrib/lib -L/usr/X11R6/lib -lgtk -lgdk -lgmodule -
lglib -ldl -lXext -lX11 -lm -ldl -lm
| Assignee | ||
Comment 35•23 years ago
|
||
I found that NSS (mozilla/security/nss) is not compiled with
the -pthread option. That might be a problem.
To fix this, edit mozilla/security/coreconf/BSD_OS.mk and
define CC as 'gcc -pthread'.
You need to log in
before you can comment on or make changes to this bug.
Description
•