Closed Bug 249488 Opened 20 years ago Closed 20 years ago

root cert module requires locking functions in CK_C_INITIALIZE_ARGS

Categories

(NSS :: Libraries, defect, P2)

3.9.2
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: julien.pierre, Assigned: julien.pierre)

Details

Attachments

(2 files, 1 obsolete file)

This bug was reported by the JDK 1.5 team. The JDK 1.5 PKCS#11 support can load
PKCS#11 modules directly, not through NSS.

They attempted to load our root cert module, libnssckbi.so, and it crashed.
The problem is that the the JDK provided NULL function pointers for CreateMutex,
DestroyMutex, LockMutex, and UnlockMutex.

nssckbi, unlike softoken, does not have a built-in implementation of those
functions that it can fall back to, in case the caller wants multithreaded usage
but does not supply the functions itself, as permitted by the PKCS#11 spec.

Here is a quote from the JDK team below :

"We call with CKF_OS_LOCKING_OK set and the function pointers set to null,
as allowed to the spec. This is case 2 in PKCS#11 section 11.4
C_Initialize(), page 141 in the PDF:

===
2. If the flag is set, and the function pointer fields aren’t supplied (i.e.,
they all have the value NULL_PTR), that means that the application will be
performing multi-threaded Cryptoki access, and the library needs to use the
native operating system primitives to ensure safe multi-threaded access. If the
library is unable to do this, C_Initialize should return with the value
CKR_CANT_LOCK.
===

If the CKBI module does not support this case, it must return CKR_CANT_LOCK as
per spec and we will fall back to single threaded access."

There are several possible fixes for this :

1) return CKR_CANT_LOCK, as suggested. However, we probably would have to change
other code too, in order to make sure we don't attempt to call the locking
functions at all

2) link with NSPR implicitly and use the PRLock implementation .
I'm guessing there is some reason we don't do this already - what is it ?

3) load NSPR on-demand only if we get into the situation where the caller
requests multithreaded access without providing callback functions
Priority: -- → P2
Target Milestone: --- → 3.9.3
There is an option 4) -

create a fallback OS direct locking call.

I would argue against 2 and 3. Linking loadable PKCS #11 modules with NSPR is 
something to be avoided. We have ran into problems in the past with mismatched 
version of the NSPR used by PKCS #11 modules and their host applications. 
Softoken is the exception because it is not really loaded, but linked with the 
application. In general we try to make the PKCS #11 module as standalone as 
possible. libnssckbi is very carefull to stub out NSPR calls to accomplish this.

I would suggest options 1 or 4. (Note, for JSS it's probably best for it to use 
the same scheme that NSS uses in cause the module ends up being shared).

To soften the case 2 and 3 a bit, this policy was created because we had a 
number of applications (Netscape Communicator, old Netscape Servers, etc) which 
had statically linked versions of NSPR. Loading newer PKCS #11 modules into 
these older applications invariabled caused crashes. With shared library 
versions of NSPR, this may not be quite the problem, but there still is the 
question of what happens when we go to NSPR version 5, and you again have 
mismatched NSPR interfaces.

bob
Thanks for that comment, Bob.

I don't particularly like option 4, because we will end up duplicating much code
from NSPR, and we will have to maintain this code for new platforms.

I can see why option 2 can cause problems with applications that already use
NSPR themselves, particularly static versions.

I'm not sure if the same problems apply to option 3, where NSPR is dynamically
loaded. I think it might not. We will still need to re-implement native code to
load the library and symbols, though ...

So I think option 1 is probably the preferred way.
The crash is in NSSArena_Create, which gets called in nssCKFWInstance_Create
before the arguments have been checked. Patch forthcoming.

  ---- called from signal handler with signal 4 (SIGILL) ------
  [10] 0x54ab8958(0x1927a8, 0x0, 0xffbfddd8, 0x44f9a8, 0xe8, 0x5d81e8), at
0x54ab8957
=>[11] PR_NewLock(), line 375 in "nsprstub.c"
  [12] nssArena_Create(), line 418 in "arena.c"
  [13] NSSArena_Create(), line 386 in "arena.c"
  [14] nssCKFWInstance_Create(pInitArgs = 0x16fd98, mdInstance = 0x7ec52ee4,
pError = 0xffbfdfb0), line 214 in "instance.c"
  [15] NSSCKFWC_Initialize(pFwInstance = 0x7ec50edc, mdInstance = 0x7ec52ee4,
pInitArgs = 0x16fd98), line 162 in "wrap.c"
  [16] builtinsC_Initialize(pInitArgs = 0x16fd98), line 114 in "nssck.api"
  [17] Java_sun_security_pkcs11_wrapper_PKCS11_C_1Initialize(0x36b90,
0xffbfe1e4, 0x7ec16340, 0x7f15831c, 0xfffffcf8, 0x7ec50ee0), at 0x7edd5f58
Actually, the flags in CK_C_INITIALIZE_ARGS are 0 in this case, as well as all 4
function pointers. This would mean it is case 1 in the PKCS#11 doc for
C_Initialize - which is that the application won't be performing concurrent
access . That's rather odd but perhaps it is the way the Java PKCS#11 provider
is written ...

In any case we still need to prevent this crash, which right now happens
whenever the function pointers aren't supplied by the application.
(In reply to comment #4)

To clarify, the SunPKCS11 code currently first calls C_Initialize() with an
CK_C_INITIALIZE_ARGS that has the four mutex pointers set to NULL and
CKF_OS_LOCKING_OK in flags set (case 2 in the spec).

If that fails with CKR_CANT_LOCK, we call C_Initialize() again with the same
CK_C_INITIALIZE_ARGS structure but with its flags = 0, i.e. no multi-threaded
access (case 1).
Comment on attachment 154424 [details] [diff] [review]
evaluate CK_C_INITIALIZE_ARGS only once, and allow working in single-threaded mode

Saul, Bob, could you pleae review this patch ? Thanks.
Attachment #154424 - Flags: superreview?(rrelyea0264)
Attachment #154424 - Flags: review?(saul.edwards)
I forgot to note that this patch also includes changes from Bob's first ckfw
patch in bug 248981 .
Julien-

I have looked at this code.  It looks correct to me (you can switch it to
r=mcgreer if you want).
Thanks, Ian !

I would still like to get a review from Bob or Nelson before checking in.
Attachment #154424 - Flags: review?(saul.edwards) → review+
Comment on attachment 154424 [details] [diff] [review]
evaluate CK_C_INITIALIZE_ARGS only once, and allow working in single-threaded mode

I'm not really happy with the disjoint but related globals.

The LockingState global in mutex is set for a particular pInitArgs by code in
nsprstub.c. If we have some code that passes a different pInitArgs into
NSSCKFWMutex_Create(), then Create might pick the wrong values. I believe it
would be better to change the signature of NSSCKFWMutex_Create to take locking
state as an argument. nsprstub.c could then pass it's calculated global version
of LockingState that it knows matches the pInitArgs.

Another way would be to create a datastructure with both pInitArg and the
cached LockingState in a single dataStructure.

The rest of the patch looks fine to me. I'm all in favor the the idea of
caching the calculation of LockingState.

bob
Attachment #154424 - Flags: superreview?(rrelyea0264) → superreview-
Bob, thanks for the review. Here is a new patch . The changes are :
1) keep all the globals in nsprstub.c - the locking state is
nssstub_LockingState
2) add a CryptokiLockingState argument to nssCKFWMutex_Create
3) add the CryptokiLockingState to the NSSCKFWInstance
4) add a CryptokiLockingState argument to NSSCKFWInstance_Create
5) return computed CryptokiLockingState from nssSetLockArgs
6) in NSSCKFWC_Initialize, pass the CryptokiLockingState from nssSetLockArgs to
NSSCKFWInstance_Create

These changes were all necessary in order to avoid referencing the nsprstub
globals .
Attachment #154424 - Attachment is obsolete: true
Comment on attachment 154611 [details] [diff] [review]
address Bob's concerns

Bob, please review.
Attachment #154611 - Flags: superreview?(rrelyea0264)
Comment on attachment 154611 [details] [diff] [review]
address Bob's concerns

Yes, much better. sr=relyea
Attachment #154611 - Flags: superreview?(rrelyea0264) → superreview+
Checked in to NSS_3_9_BRANCH and tip :

Checking in ckfw.h;
/cvsroot/mozilla/security/nss/lib/ckfw/ckfw.h,v  <--  ckfw.h
new revision: 1.5; previous revision: 1.4
done
Checking in ckfwm.h;
/cvsroot/mozilla/security/nss/lib/ckfw/ckfwm.h,v  <--  ckfwm.h
new revision: 1.4; previous revision: 1.3
done
Checking in instance.c;
/cvsroot/mozilla/security/nss/lib/ckfw/instance.c,v  <--  instance.c
new revision: 1.9; previous revision: 1.8
done
Checking in mutex.c;
/cvsroot/mozilla/security/nss/lib/ckfw/mutex.c,v  <--  mutex.c
new revision: 1.4; previous revision: 1.3
done
Checking in nsprstub.c;
/cvsroot/mozilla/security/nss/lib/ckfw/nsprstub.c,v  <--  nsprstub.c
new revision: 1.6; previous revision: 1.5
done
Checking in nssckfwt.h;
/cvsroot/mozilla/security/nss/lib/ckfw/nssckfwt.h,v  <--  nssckfwt.h
new revision: 1.3; previous revision: 1.2
done
Checking in wrap.c;
/cvsroot/mozilla/security/nss/lib/ckfw/wrap.c,v  <--  wrap.c
new revision: 1.11; previous revision: 1.10
done
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
Testing found that the crash was resolved, but the library still didn't work in
single-threaded mode. This was due to the error code not being set to OK in the
single threaded case. This one-line patch fixes the problem.
Since Bob is away and we have an urgent need, I have taken the liberty to check
in this fix.

Checked in to the tip :

Checking in mutex.c;
/cvsroot/mozilla/security/nss/lib/ckfw/mutex.c,v  <--  mutex.c
new revision: 1.5; previous revision: 1.4
done

And NSS_3_9_BRANCH :
Checking in mutex.c;
/cvsroot/mozilla/security/nss/lib/ckfw/mutex.c,v  <--  mutex.c
new revision: 1.2.126.2; previous revision: 1.2.126.1
done
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: