NSS treatment of CKR_CRYPTOKI_ALREADY_INITIALIZED may be incorrect

RESOLVED FIXED in 3.11

Status

P1
normal
RESOLVED FIXED
14 years ago
13 years ago

People

(Reporter: julien.pierre, Assigned: julien.pierre)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 5 obsolete attachments)

(Assignee)

Description

14 years ago
In some cases, hybrid applications, such as mixed C/Java applications, may be
loading PKCS#11 libraries from multiple pieces of code, and not just from NSS.

I was given a test application where a piece of Java code first loads the
Solaris PKCS#11 module. Then, through JNI, it calls NSS_Initialize . The same
PKCS#11 module is configured in the NSS secmod.db . When NSS tries to load the
module, C_Initialize returns CKR_CRYPTOKI_ALREADY_INITIALIZED .

I think we should just ignore this error and proceed. What do you guys think ?
(Assignee)

Comment 1

14 years ago
I forgot to mention that the current behavior is that we consider this error a
load failure, and therefore in the test case mentioned, the PKCS#11 module is
invisible to NSS . This makes application behavior dependent on the order of
loading the module, and which piece loaded it first.

The loading issue is relatively easy to fix. But there are deeper problems, such
as who is going to call C_Finalize. It's quite conceivable that if a module is
initialized twice in an application, that same app may also finalize it twice .
In that scenario, one of the finalize calls will fail .

Maybe NSS could remember if it got CKR_CRYPTOKI_ALREADY_INITIALIZED at load time
not to finalize that module in NSS_Shutdown ?

Comment 2

14 years ago
The fundamental question is all wrong. No one should be loading their own PKCS
#11 modules if they are running in the address space of NSS.

The PKCS #11 spec specifically specifies that modules must fail with
CKR_CRYPTOKI_ALREADY_INITIALIZED.

Multiple applications using the PKCS #11 module at the same time can cause
conflicts. Other applications may not know the correct locks to call to protect
non-threadsafe access. Calls into the PKCS #11 module could cause unexpected
state changes (like new slots created without the knowledge of NSS, for instance).

In general, not only is it a bad idea to allow direct access to PKCS #11 while
NSS is running, it's actually forbidden.

If you must do it, you need to split your app into separate processes.

Anyway, as you can see loading and unloading is just the tip of the iceberg.
PKCS #11 is a restricted resource. You really need one owner... either NSS or
the app (not both).

bob

Comment 3

14 years ago
Sorry Bob but the world isn't that controlled any more.

In Solaris today you could have a getpwnam() can cause libpkcs11 to get loaded.
getpwam chooses the ldap backend, ldap loads libsasl, libsasl loads the gss
plugin, gss loads the Kerberos mech the Kerberos mech loads libpkcs11 to do its
crypto.

What if the application that called getpwnam() was a user of NSS or was
a Java application using JNI calls.

What we advise is that libaries that use PKCS#11 must be able to deal with
PKCS#11 already being initalized and then must NEVER call C_Finalize because
even if you were the one to initalise it someone else might have initalised it
after you and if you finalise you break them.

Comment 4

14 years ago
Your hosed then... PKCS #11 was never intended to be loaded more than once. The
SPEC itself says so. Someone went off and allowed things that shouldn't be allowed.

NSS *NEEDS* to call Finalize when NSS finalize is called. Applications already
depend on this behavior.

Besides if kerberos is making PKCS #11 calls and NSS is making PKCS #11 calls
and they are operation on different threads, you cannot guarrentee thread safe
operations! (kerberos won't know what locks to call to protect against NSS
calling into the module).

PKCS #11 *REQUIRES* one library to on control of it. If kerberos wants to use
PKCS #11, it needs to do it through NSS, or never link with NSS.

If you are assuming that you can ever make anything other then one library the
owner of PKCS #11 access, then you will never be able to get consistant
behavior. Fixing Init and Finalize is just the tip of the iceberg!

bob

Comment 5

14 years ago
The comment about locking isn't relevant for libpkcs11 in Solaris since it does
not support applicaiton supplied locks - for exactly this reason.

Since PKCS#11 requires that each thread use separate sessions there doesn't
appear to be an issue with multiple layers in the stack creating their own session.

It is working just fine in Solaris 10 this way now.

Comment 6

14 years ago
In my work this is a potential problem, too.
I now understand why under the current PKCS #11 spec I must
handle CKR_CRYPTOKI_ALREADY_INITIALIZED as a fatal error.
But it is a real problem that *independent* groups will
produce libraries that use the same Cryptoki library.
We need to have a solution that work in this environment.

The only solution we have now is to use Cryptoki libraries
through NSS.  Perhaps we need to make this solution more
palatable by making NSS's pk11wrap layer a separate library.
Is there any other PKCS #11 arbitration layer out there?

Comment 7

14 years ago
The libpkcs11 on Solaris 10 (which will be relased under the CDDL license with
the rest of OpenSolaris), is a PKCS#11 "switch".  It doesn't do any crypto or
PKCS#11 operations itself but instead has plugins which are PKCS#11
implementations.  It deals with all the dlopening and initialisation etc.

In Solaris 10 update 1 (available in Solaris Express now) there is a feature
called MetaSlot.  This makes the first slot in the slot list be the union of the
mechanisms of all the other slots, the token objects come from a configured
keystore (system wide default or environment variable).  It will "move" objects
from one slot to another to perfrom operations.  For example:

By default we have pkcs11_kernel and pkcs11_softtoken are both pluggedin to
libpkcs11.  pkcs11_kernel provides an ioctl based bridge to the kernel.  Lets
say we plugin an SCA-1000 (Broadcom 5821A) card, (or have a machine with the
future Niagara process (which has on chip RSA/DSA)).  The driver for that card
makes  two slots available via pkcs11_kernel, the first does 3DES & MD5 the
second does RSA & DSA.  That card doesn't have any token storage so NSS wouldn't
have been able to use it, but libpkcs11's MetaSlot uses the keys stored in
pkcs11_sofftoken as token objecs and migrates them as session objects to
pkcs11_kernel's second slot to do the RSA operation.

libpkc11 also has policy that allows baning algorithms from certain provides,
this is controlled via cryptoadm(1m), which will also be in OpenSolaris.

Comment 8

14 years ago
RE: application supplied locks...

The issue isn't the locks supplied to the pkcs #11 module (though that does
exist as well), it's whether or not the application itself is providing the
correct locking around the individual PKCS #11 calls when required. NSS has
various locks it uses to provide these guarrentees. kerberos doesn't know where
those locks are.

Ok, so here's the issue:

Are we talking about making one, very carefully constructed PKCS #11 module
working or are we talking about PKCS #11 modules in general?

If we are talking the general case, there are severe restrictions to making this
happen. In the general case, there are tons of tokens out there that return
exactly one session and require the application to serialize it. In this case
there needs to be *ONE* controller of the PKCS #11 module, so NSS should not
just arbitrarily ignore CKR_CRYPTOKI_ALREADY_INITIALIZED. If you need this, then
you need to define a central aribitrator which manages all the thread and
session requirements of the PKCS #11 module. I submit that such a manager would
be approximately the size of pk11wrap (which, in fact, is exactly what pk11wrap
was implemented to do;).

If we are talking about the former, then we can deal with the semantics of that
PKCS #11 module itself. That PKCS #11 module 'knows' that it's safe to allow
multiple inits, so it should keep a reference count, rather than returning
CKR_CRYPTOKI_ALREADY_INITIALIZE. Actually, maybe it should return another error
code, which we will then feed back through the PKCS #11 working group. If this
is the case is used, then these tokens would have the following additional
requirements:

1. In addition the the PKCS #11 threadsafe requirements, the following functions
would also have to be threadsafe:
C_Initialize, C_Finalize, C_GetInfo, C_GetFunctionList, C_GetSlotList,  
C_GetSlotInfo, C_GetTokenInfo, C_GetMechanismList, C_GetMechanismInfo,
C_InitToken, C_OpenSession.

Of these C_Initialize, C_Finalize, and C_OpenSession are probably the only
tricky ones to make threadsafe on a token that only supports a fixed number of
statically defined slots. (Well maybe C_InitToken... but if you are doing a
C_InitToken while someone else is using the token at all, you are in trouble:)

2. The PKCS #11 module must not add new slots (the new PKCS #11 v2.20 allows new
slots to be added in controlled ways). NOTE: this requirement may be relaxed
with more thought... but we need to convince ourselves this is not a problem.

3. The PKCS #11 module needs to supply 'infinite' sessions (probably not a
problem the the PKCS #11 module in question here).

4. There may have to be thought put into Login/Logout semantics (though I think
this is probably manageable, and probably not applicable to the PKCS #11 module
in question here).




Comment 9

14 years ago
Hmm in my previous post I was sort of assuming that a software PKCS #11 module.

Darren's description sort of scares me a bit. In some sense it can probably be
made to work, if it can meet all the requirements I specified earlier. On the
other hand, it may be harder to meet those requirements (a threadsafe
C_GetMechanismList, for instance, may be tricker that I thought).

The module will need to do session simulation for pkcs #11 modules which only
support one session. In fact this module sounds a whole lot like what pk11wrap
does (except pk11wrap supplies a pointer interface to it's crypto operations,
rather than a handle interface).

Comment 10

14 years ago
Anyway do you understand why NSS must treat CKR_CRYPTOKI_ALREADY_INITIALIZED as
a failure?
IIRC, a PKCS11 module tells its user whether it requires the user to 
serialize access to it, or not.  Clearly for such modules we should treat
ALREADY_INITIALIZED as a fatal error.  But for those modules that support
full threading semantics, it seems like we should be able to have multiple
(attempted) initializers co-exist.

Comment 12

14 years ago
I'm confused about Nelson's last comment about the module telling the user
about serialisation requirements, which part of the spec are you referring to ?

Do you mean CKF_LIBRARY_CANT_CREATE_OS_THREADS ?  If so that is the application
telling the provider not to use threads underneath.

2.11 always has serial sessions (CKF_SERIAL_SESSION is mandatory), but again
that is the application telling the module not the other way around.
(Assignee)

Comment 13

14 years ago
Darren,

I believe Nelson was referring to the CKR_CANT_LOCK return code, which PKCS#11
modules are allowed to return from C_Initialize if the user specified the
CKF_OS_LOCKING_OK flag, or if it passed function pointers for CreateMutex ,
DestroyMutex, LockMutex and UnlockMutex. This is described in section 11.4 of
the current PKCS#11 spec (v2.20) at
ftp://ftp.rsasecurity.com/pub/pkcs/pkcs-11/v2-20/pkcs-11v2-20.pdf . Basically,
the module can return CKR_CANT_LOCK for 3 of the 4 combination of inputs
provided by the application . The one combination that must always work is the
first case, when the application tells the module it won't do multi-threaded
calls by neither setting CKF_OS_LOCKING_OK nor providing any mutex callback
functions.
(Assignee)

Comment 14

14 years ago
Re: comment #8 , #11, #13

Clearly, if the first caller of C_Initialize requested only single-threaded
access, and the PKCS#11 module as a result isn't doing any locking, then there
can only be a single "owner" for that module in the process - the first caller
of C_Initialize .

Unfortunately, the second caller of C_Initialize has no idea what parameters the
first caller passed to the PKCS#11 module . In fact, the second caller doesn't
even know there was a previous caller until the module returns
CKR_CRYPTOKI_ALREADY_INITIALIZED .

So, this problem cannot be solved by the layers above the PKCS#11 module. The
generic layers I know about as of today are the NSS pk11wrap layer, the
Sunpkcs11 provider in the JDK, and some functions in Solaris mentioned by
Darren. Per Bob's comments, safe concurrent use of these layers in a single
process with the same PKCS#11 modules cannot be guaranteed . None of these
layers can safely ignore CKR_CRYPTOKI_ALREADY_INITIALIZED .

What can be done is to fix certain modules to allow them to work with the
multiple calling layers.

For now, I like the idea suggested by Bob in comment #8 of the module keeping a
refcount and returning CKR_OK . But I think this should be made stricter - the
module should work in this mode if the multiple calling layers are supplying
compatible sets of threading-related flag/mutex callbacks in
CK_C_INITIALIZE_ARGS . Otherwise, problems may still occur .

The above suggestion however violates the PKCS#11 specification, so this is
something that should be brought up with the PKCS#11 working group . Perhaps an
extra flag to C_Initialize like CKF_SCHIZOPHRENIC_APPLICATION ;) could be
defined, and then the PKCS#11 module would know to use this behavior and not
return CKR_CRYPTOKI_ALREADY_INITIALIZED , if all the callers of C_Initialize had
specified this flag . I don't think we would ever want to have an arbitrator
between the different layers. If the module is not thread-safe (ie. it always
returns CKR_CANT_LOCK for cases 2, 3, 4 of C_Initialize in the spec), then that
module could only have one calling layer. In fact we should probably define the
new flag as valid only for cases 2, 3, 4, since it would be nearly impossible
for an application to guarantee that its multiple layers are always making the
PKCS#11 calls in the same native thread .
QA Contact: bishakhabanerjee → jason.m.reid
(Assignee)

Comment 15

13 years ago
Taking bug.
After 2 meetings, the decision for the fix is a 2-step process :

a) offer a new NSS initialization flag to tell it to only load PKCS#11 modules
with OS locking, and ignore the CKR_CRYPTOKI_INITIALIZED error

b) have an equivalent of a) in the other PKCS#11 library consumers, ie. the JDK
PKCS#11 provider . All consumers in the process must behave in this same way in
order for the consumers to work reliably.

c) introduce a new flag to be passed by the consumers to the PKCS#11 modules in
CK_C_INITIALIZE_ARGS to tell it that multiple libraries may load it. This is to
be proposed to the PKCS#11 committee before we implement it, so it is longer
term. One of the actions that the PKCS#11 module would take as a result of this
flag is to not return an error on the second and subsequent C_Initialize, and
refcount the C_Initialize calls so that a corresponding number of C_Finalize is
needed to truly shutdown the module.

I will implement a) as a fix for NSS 3.10.1 .
Assignee: wtchang → julien.pierre.bugs
Priority: -- → P1
Target Milestone: --- → 3.10.1
(Assignee)

Updated

13 years ago
Depends on: 297734
(Assignee)

Comment 16

13 years ago
Created attachment 186286 [details] [diff] [review]
add support for NSS_INIT_COOPERATE flag

If the application sets this flag with NSS_Initialize, NSS will then only
attempt to load modules in thread-safe mode, with OS locking only. No lock
pointers will be passed to the module in order to make sure it uses OS locking.
Attachment #186286 - Flags: superreview?(rrelyea)
Attachment #186286 - Flags: review?(nelson)

Comment 17

13 years ago
Comment on attachment 186286 [details] [diff] [review]
add support for NSS_INIT_COOPERATE flag

Among the cooperating consumers of PKCS #11 modules,
who is responsible for calling C_Finalize?
(Assignee)

Comment 18

13 years ago
Wan-Teh,

Nobody should call C_Finalize until the new cooperate module flag is introduced,
and the modules can do the refcounting for C_Initialize / C_Finalize themselves,
thus allowing for all the consumers to call C_Finalize in any order desired . 

Prevening NSS_Shutdown from calling C_Finalize is something I should add in the
patch in the case that NSS_INIT_COOPERATE is set for now, until the module flag
is available.
(Assignee)

Comment 19

13 years ago
Due to the difficulty of providing a short-term fix for bug 297734 (root cert
module doesn't support OS locking), I'm thinking that perhaps our recommendation
in the "cooperate" mode should be instead to :
1) try OS locking without lock functions passed in first
2) if that fails, pass lock functions that rely on OS locks (NSPR locks)
3) if that fails, don't load the module

Does anyone have objections to that ?

The downside is that all the players need to implement the same logic. In
particular, Java would have to implement OS lock functions too, if it doesn't
already.

The upside is that we don't introduce any incompatibility with any existing
PKCS#11 module, except the ones that can't multithread at all .
(Assignee)

Comment 20

13 years ago
Created attachment 186876 [details] [diff] [review]
update

- if flag is set, allow either OS locking or provided lock functions using NSPR
OS locking, but not single-threading
- if flag is set, don't call C_Finalize in SECMOD_UnloadModule and
SECMOD_LoadPKCS11Module
(Assignee)

Updated

13 years ago
Attachment #186286 - Attachment is obsolete: true
(Assignee)

Updated

13 years ago
Attachment #186876 - Flags: superreview?(rrelyea)
Attachment #186876 - Flags: review?(wtchang)
(Assignee)

Updated

13 years ago
Attachment #186286 - Flags: superreview?(rrelyea)
Attachment #186286 - Flags: review?(nelson)

Comment 21

13 years ago
Comment on attachment 186876 [details] [diff] [review]
update

This patch is incomplete.  For example, the new function
pk11_setCooperativeMode is missing.

I disagree with the short-term solution implemented by
this patch.  I think we should only implement the long-term
solution of reference-counted thread-safe PKCS #11 modules
and not implement this cooperative mode.  The cooperative
mode requires all consumers of PKCS #11 to cooperate, and it
can't call C_Finalize on the PKCS #11 modules.	Like any
option, it increases the amount of testing required.  But if
everyone else thinks we should do this, I will accept this.
Attachment #186876 - Attachment is obsolete: true
Attachment #186876 - Flags: superreview?(rrelyea)
Attachment #186876 - Flags: review?(wtchang)
(Assignee)

Comment 22

13 years ago
Created attachment 186883 [details] [diff] [review]
attach forgotten files

Wan-Teh,

I missed the pk11wrap subdirectory with cvs in the last diff. Sorry.

Regarding your other concern, we already have shipping products faced with the
problem. Waiting for the new flag to make it to the spec, 3rd party vendors to
upgrade their PKCS#11 modules, and customers to upgrade to the new modules,
will take a long time, probably a year best case. I just think we should have a
response other than "turn off all these modules" before that timeframe. Do you
know of a better interim solution ?
Attachment #186883 - Flags: superreview?(rrelyea)
Attachment #186883 - Flags: review?(wtchang)
(Assignee)

Updated

13 years ago
Attachment #186883 - Flags: review?(nelson)
(Assignee)

Comment 23

13 years ago
Comment on attachment 186883 [details] [diff] [review]
attach forgotten files

This patch is missing a couple lines of code in secmod_ModuleInit.

Also, I believe the fix which consists of preventing C_Finalize calls may have
a very effect of not allowing FIPS to work. The way we switch to FIPS is to
call SECMOD_InternalModule, which I believe calls C_Finalize in turn .
Attachment #186883 - Flags: superreview?(rrelyea)
Attachment #186883 - Flags: review?(wtchang)
Attachment #186883 - Flags: review?(nelson)

Comment 24

13 years ago
Julien,

I think the PKCS #11 modules that caused this bug report
are all controlled by us.  The example you gave in the
bug description is the Solaris PKCS#11 module.  I also
heard libnssckbi.so mentioned.

Given this, we can implement a variant of c) in comment 15
(reference-counted modules if OS locking is used), and that
will solve the problem at hand.  We don't need
to wait for the reference-counted thread-safe modules
to make it into the PKCS #11 spec because it is a very
reasonable extension of the PKCS #11 standard and no new
flag needs to be added.  Here are the details of the
proposal:

  A thread-safe module that can use OS locking should use
  OS locking if the CKF_OS_LOCKING_OK flag is passed to
  C_Initialize.

  The first C_Initialize call on such a module will increment
  the reference count to 1 and actually initialize the module.

  If a thread-safe module has been initialized with OS
  locking, subsequent C_Initialize calls with the CKF_OS_LOCKING_OK
  flag will simply increment the reference count and return
  CKR_OK.  Subsequent C_Initialize calls on this module *without* the
  CKF_OS_LOCKING_OK flag will fail and return CKR_CRYPTOKI_ALREADY_INITIALIZED.

  Each C_Finalize call on this module will decrement the reference
  count.  When the reference count becomes zero, that C_Finalize
  call will actually finalize the module.

  All PKCS #11 module consumers will continue to handle
  CKR_CRYPTOKI_ALREADY_INITIALIZED as a fatal error.
(Assignee)

Comment 25

13 years ago
Wan-Teh,

The modules that *uncovered* this problem are controlled by us. But surely you
don't mean to suggest that Sun products should only work with Sun PKCS#11 modules ?

It is still very important that we provide a way for NSS to reject modules that
don't support multithreading. This could happen with any third-party module,
although it is only likely to occur with smartcard modules. We don't intend to
drop support for third-party hardware accelerator modules or HSMs in our servers.

Thus, it is very important that we provide a way for NSS to ignore the
CKR_CRYPTOKI_ALREADY_INITIALIZED error. Otherwise, third party modules that are
compliant to the 2.20 specs will fail to load in NSS if Java is initialized
first . That's 100% of third party modules today !

It is also important that we provide a way to prevent NSS from doing
NSS_Finalize on all modules, so that it doesn't interfere with Java applications
using the module, and also allow proper shutdown, regardless of order of
shutdown routines.
(Assignee)

Comment 26

13 years ago
I'm working on a new patch that will break down the behavior into 3 flags - one
for ignoring the CKR_CRYPTOKI_ALREADY_INITIALIZED error, one for never trying to
initialize modules single-threaded, and finally one for never calling C_Initialize .

I'm having a problem with the last one. It turns out the way we dynamically go
to FIPS mode is by calling SECMOD_DeleteInternalModule, which shuts down our
softoken module with C_Finalize, then calls FC_Initialize .

I don't think it's safe for an application that has both Java and NSS to
dynamically go to FIPS mode, because one module gets shutdown. My patch would
prevent the C_Finalize from occurring. However, the failure isn't getting
propagated. The stack looks like this :

=>[1] SECMOD_UnloadModule(mod = 0x80b0e00), line 409 in "pk11load.c"
  [2] SECMOD_SlotDestroyModule(module = 0x80b0e00, fromSlot = 1), line 809 in
"pk11util.c"
  [3] PK11_DestroySlot(slot = 0x80d5b20), line 429 in "pk11slot.c"
  [4] PK11_FreeSlot(slot = 0x80d5b20), line 442 in "pk11slot.c"
  [5] SECMOD_DestroyModule(module = 0x80b0e00), line 779 in "pk11util.c"
  [6] SECMOD_DestroyModuleListElement(element = 0x80a9438), line 827 in "pk11util.c"
  [7] SECMOD_DeleteInternalModule(name = 0x80aa9a8 "NSS Internal PKCS #11
Module"), line 463 in "pk11util.c"

- PK11_DestroySlot returns void, but is internal, so it could be changed.
- SECMOD_DestroyModule and PK11_FreeSlot both return void, and are public
functions, so there is no opportunity for any error to be reported (at list with
this codepath). So, what happens is that SECMOD_UnloadModule in frame 1 skips
C_Finalize, but eventually FC_Initialize gets called anyway. And it succeeds !
I'm somewhat surprised at this. Before I try to fix this further, I'd like to
know a bit more about the softoken FIPS mode .

Bob, is it normal that we can have both the non-FIPS and the FIPS token
initialized ? Is it a safe things to do ? Will softoken continue to work without
crashing with the sequence : C_Initialize, FC_Initialize ? I was only testing
with modutil so I had no opportunity to test anything afterwards.

Should I just change SECMOD_CanDeleteInternalModule to return PR_FALSE in the
case where my new "no C_Finalize" NSS flag is set ?
(Assignee)

Comment 27

13 years ago
In the previous comment, "and finally one for never calling C_Initialize" , I
meant C_Finalize, of course .

Comment 28

13 years ago
Three flags are too complicated.  It'll be very
hard to document or understand their interactions.
(Assignee)

Comment 29

13 years ago
Created attachment 187464 [details] [diff] [review]
Updated patch

Nelson, Bob,

Please review this new approach. I added several individual options so it's
possible for an application in the future to get some of the behavior we are
introduced now, but not necessarily all. The new recommended flag to use is
still NSS_INIT_COOPERATE, which merely sets multiple bits and options .

Pending more information from Bob, I tried to take care of the FIPS case by
disallowing the switchover when NSS_INIT_DONT_FINALIZE_PKCS11_MODULES is set.
I'm not really sure if the C_Finalize is needed or not for the FIPS switchover.


I also made sure that SECMOD_CancelWait emulates the events rather than use
C_WaitForSlotEvent if NSS_INIT_DONT_FINALIZE_PKCS11_MODULES is set . This is so
that C_Finalize doesn't need to get called for interruption purposes.

I believe this patch is neutral to applications that don't set the new flags,
and thus should be low risk.
Attachment #186883 - Attachment is obsolete: true
Attachment #187464 - Flags: superreview?(rrelyea)
Attachment #187464 - Flags: review?(nelson)

Comment 30

13 years ago
Comment on attachment 187464 [details] [diff] [review]
Updated patch

These three flags are closely related and are not
entirely orthogonal.

Under what circumstances is it safe to ignore
CKR_CRYPTOKI_ALREADY_INITIALIZED?  We discussed
this and found that it is only safe when the module
can lock (i.e., is thread-safe) and all the PKCS #11
consumers initialize the module with compatible
locking functions.  So
NSS_INIT_DONT_LOAD_SINGLE_THREADED_PKCS11_MODULES
and NSS_INIT_ALLOW_ALREADY_INITIALIZED_PKCS11_MODULES
must be used together.

If NSS is not the one that initialized a PKCS #11
module, under what circumstances is it safe for NSS
to finalize the module?

It is a good idea to use three internal boolean
variables, which is set by the NSS_INIT_COOPERATE
flag, to improve readability and extensibility of
our code, but exposing these options to NSS users
is a bad idea.

The macros' names are too long and different in style
from existing NSS_INIT_XXX macros.
In review of the previous patch "attach forgotten files", I commented that 
the block comment for the NSS_INIT_COOPERATE bit didn't tell the reader in
sufficient detail precisely what the behavioral effects of this bit were,
and that determining those effects by reading code was too difficult to 
expect most NSS users to determine the right answer.  So, in effect, I 
rejected the patch on the grounds that the comments needed to be better.

The "Updated patch" has a MUCH better block comment, explaining the behavioral
changes in enough detail that I intuitively understand what they do.  But
it also separated the 3 behavioral changes into 3 flags.  

I haven't thought very much about when it is appropriate to use 1 or 2 of 
those 3 new flags, and not use all 3, so I don't have an opinion about
whether 1 flag or 3 is better.  I would be satisfied (I think) if a comment
like the one in "Updated Patch" were placed in the "forgotten files" 
patch, saying "these are the 3 things done by the COOPERATE flag."
Wan-Teh, would that be acceptable to you?

Comment 32

13 years ago
Yes, that's fine.
(Assignee)

Comment 33

13 years ago
Retargetting for 3.11, since Sun decided not to respin 3.10 for this problem in
the imminent server release.
Target Milestone: 3.10.1 → 3.11

Comment 34

13 years ago
re comment 26 from julian.

Yes, there were changes made several years ago to allow both FIPS and non-FIPS
tokens to run concurrently. The mozilla clients do this because mozilla cannot
free the references to the tokens at the point that it is trying to switch
(javascript holds some references that won't be cleaned up until the garbage
collector runs). As such, you can swith FIPS mode once, and then you must wait
until the previous token is finalized to switch back.

Comment 35

13 years ago
The patch looks mostly good, the two issues I see are (all in pk11util.c)

1) I would rather just fail the Wait and Cancel functions if we can't shutdown
the token. They are really only useful for application tokens, and if you have
to drop to using the poll interface, their utility becomes less useful. (soft
requirement)

2) Unload_Module should succeed, even if we aren't finalizing. Finalizing is
only one part of unloading the module, so trying to propogate errors in this
case is not all that useful. I think you were trying to handle the FIPS mode
switching case, but if you remove all attempts to 'fail the unload', the FIPS
mode switch code will work correctly. Basically FIPS mode switching usually
fails if you are switching to an already initialized module, but the code for
ignoring the already initialized error means you should be able to switch back
and forth with impunity even if you aren't finalizing.

bob
(Assignee)

Comment 36

13 years ago
Bob,

Re: comment #35

1) What do you mean that the functions are only useful to "application tokens" ?
Isn't it better to provide something - the poll interface - even if it's
inefficient, rather than nothing at all ?

2) Yes, I tried to propagate the errors mainly to deal with the FIPS case. If it
is OK to call C_Initialize / FC_Initialize any number of times to switch back
and forth, then that greatly simplifies that part of the issue, and I will
remove the error reporting due to C_Finalize not being called.

Comment 37

13 years ago
re PK11_WaitForAnyTokenEvent().

So there already is an interface you can call that will always poll (which works
on a specific slot PK11_WaitForTokenEvent()). The primary utility of the
PK11_WaitForAnyTokenEvent() is that it uses the modules builtin
C_WaitForSlotEvent function when available. In the case where that is not the
case, the functions utility over PK11_WaitForAnyTokenEvent is much less. The
patch seems rather complicated to preserve this function. After reviewing it,
however, I believe we can just simplify the patch as follows:

In PK11_WaitForAnyTokenEvent() Add the following at the beginning of the function:
    if (!pk11_getFinalizeModulesOption()) {
        /* if we are sharing the module with other software in our
         * address space, we can't reliably use C_WaitForSlotEvent() */
	return secmod_HandleWaitForSlotEvent(mod, flags, latency);
    }

and in PK11_CancelWait, just before the finalize:
    /* can't get here unless pk11_getFinalizeModulesOption is set */
    PORT_Assert(pk11_getFinalizeModulesOption());


Servers are unlikely to call either of these functions since they typically
don't deal with token removal (but that doesn't say that they couldn't in the
future, of course).



Updated

13 years ago
Attachment #187464 - Flags: review?(nelson) → review-
(Assignee)

Comment 38

13 years ago
I had a chat with Bob today. I just wanted to clarify and amend the last comment
#37, since the wrong function names were used.

The PK11_WaitForTokenEvent function is to be considered a legacy function, and
it only works on a particular token, not on an entire PKCS#11 module, so it
isn't a full equivalent for SECMOD_WaitForAnyTokenEvent.

Bob meant to suggest the code changes to simplify SECMOD_WaitForAnyTokenEvent
and SECMOD_CancelWait, not for PK11_WaitForTokenEvent or PK11_CancelWait (there
is no such beast).

I will implement those suggestions and remove the error propagation from
finalize as discussed in comment #36 . Then hopefully we can put this one to rest !
(Assignee)

Updated

13 years ago
Attachment #187464 - Flags: superreview?(rrelyea)
(Assignee)

Comment 39

13 years ago
Created attachment 188501 [details] [diff] [review]
hopefully the final patch

I made the following changes :

1) simplified the code in pk11util according to Bob's comments, except in
SECMOD_CancelWait, I still return an error if SECMOD_WAIT_PKCS11_EVENT has been
set together with pk11_getFinalizeModulesOption instead of just asserting.

2) I removed the logic that attempts to propagate the C_Finalize for the FIPS
switch case, since it is safe to ignore the failure.

3) I shortened the flag names and make them conform with our existing pattern
more. I also updated the documentation.

In response to Wan-Teh's comment #30 :

In answer to the first question, the two flags don't have to be used together
if the calling application is aware of the loading order. If NSS is initialized
before any other PKCS#11 consumer (eg. Java), then ignoring the error with
already initialized modules isn't required. This is application dependent.
However, disallowing single-threaded modules would still be needed, even if NSS
is the first one to load the modules. So, the answer of which flags to use
varies by application. In the general case, using the flags together is the
best solution. But an application developer may choose to use only one of them.


Similarly, the "no finalize" flag isn't needed if the application never calls
NSS_Shutdown(), SECMOD_WaitForAnyTokenEvent or SECMOD_CancelWait .

Or if the application calls NSS_Shutdown() and knows NSS is the last one to be
shutdown (after Java), but not SECMOD_WaitForAnyTokenEvent/SECMOD_CancelWait,
then the "no finalize" flag isn't needed either.

The general recommendation still would be that if the application uses Java
SunPKCS11 to just use NSS_INIT_COOPERATE which sets all the flags.
(Assignee)

Updated

13 years ago
Attachment #187464 - Attachment is obsolete: true
Attachment #188501 - Flags: superreview?(rrelyea)
Attachment #188501 - Flags: review?(wtchang)

Comment 40

13 years ago
Comment on attachment 188501 [details] [diff] [review]
hopefully the final patch

r-, only 2 comments, though:

1) This if statement after the assert in CancelWait has the wrong sense it
should be (!pk11_getFinalize). With this fix the patch gets an r+ from me
without further review.

2) The PORT_Assert(0) is clearly redundant since we have a PORT_Assert on the
condition before the 'if'. You can remove this	line without getting a further
review. (but not required for an r+).

bob
Attachment #188501 - Flags: superreview?(rrelyea) → superreview-

Comment 41

13 years ago
Comment on attachment 188501 [details] [diff] [review]
hopefully the final patch

Please document what NSS_INIT_RESERVED will be.  Is
this to pass the new flag to C_Initialize to enable
reference-counted module initialization?

Usually people pass 0 as the "reserved" arguments to
functions.  Your patch essentially asks people to pass
a nonzero value, whose meaning is not defined, as the
reserved argument.  When people upgrade to a new NSS
version that implements the reserved flag, some new
behavior will be introduced.  This will break backward
compatibility.

Comment 42

13 years ago
Just to elaborate using an analogy in SSL cipher suites.

If we ask users to enable a not-yet-implemented SSL cipher
suite, their apps will suddenly gain a new cipher suite
when they upgrade to a new NSS version that implements the
cipher suite.  We consider this a form of breaking backward
compatibility, which is why all new cipher suites in NSS are
disabled by default.

So NSS_INIT_COOPERATE should not set the not-yet-implemented
NSS_INIT_RESERVED flag, at least the effect of NSS_INIT_RESERVED
must be documented.
(Assignee)

Comment 43

13 years ago
Bob,

Thanks for catching that.

Wan-Teh,

Yes, the purpose of NSS_INIT_RESERVED was to set the forthcoming module flag
when it is defined. I thought it would be better for NSS_INIT_COOPERATE to
define it, so that applications don't have to make changes multiple times in
order to take advantage of it when it is available. While this is a change in
behavior, if implemented well (eg. with a fallback if the module doesn't
understand the flag because its PKCS#11 revision is too old), I don't think it
would actually break binary compatibility. It also wouldn't be the first time we
have defined flags prior to implement them (see SSL socket options for example).
I would be fine with documenting the proposed behavior for NSS_INIT_RESERVED and
noting that it isn't implemented yet, but keeping it as part of NSS_INIT_COOPERATE .

If you can't live with that, I will just remove all mentions of
NSS_INIT_RESERVED, and apps will have to use the new NSS init flag to turn on
the module flag when it is actually implemented, after the PKCS#11 working group
defines it.

Please let me know one way or the other so I can make the checkin today.

Comment 44

13 years ago
You can check it in first.  Would be nice to hear Nelson's
opinion.
(Assignee)

Comment 45

13 years ago
Created attachment 188635 [details] [diff] [review]
patch as checked in 

This is the fix as I checked it in to the tip for 3.11 .
Attachment #188501 - Attachment is obsolete: true
(Assignee)

Updated

13 years ago
Attachment #188501 - Flags: review?(wtchang)
(Assignee)

Comment 46

13 years ago
I have already discussed many aspects of the patch with Nelson and he gave me
verbal reviews, the feedback of which was incorporated into the recent patches,
so I don't expect changes. I'll mark this bug FIXED. If there is still something
to be changed, the bug can be reopened.
Status: NEW → RESOLVED
Last Resolved: 13 years ago
Resolution: --- → FIXED
NSS_INIT_RESERVED is not a name we want to keep, going forward.
It's not a name we want applications to begin to use.
It's not a name we want to preserve, if and when the PKCS#11 flag is defined.
When that happens, we will want this flag to be named with a meaningful 
name, and applications will then need to change to use that name.
If we're going to define this flag at this time (prior to the PKCS#11 flag
being defined), then IMO it should be named with whatever name best describes 
what we expect the new flag to do.  We can comment it that it is not yet 
implemented.  We can ifdef it out until it is implemented.  But given that 
the goal is to avoid application changes at such later time as this PKCS#11
flag is defined, it definitely should NOT be named "reserved" now.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 48

13 years ago
Nelson,

I was planning on having a better name only at the time the behavior for the
module flag is defined by the working group, not before. A name chosen today
might not apply after rounds of reviews by other companies before it goes into
the spec. At the point the name is chosen, the following would be added :

#define NSS_INIT_PK11SHAREMODULE 0x200 (the same value as NSS_INIT_RESERVED)
or even :
#define NSS_INIT_PK11SHAREMODULE NSS_INIT_RESERVED

And NSS_INIT_COOPERATE would be changed to include NSS_INIT_PK11SHAREMODULE
instead of NSS_INIT_RESERVED (which would be the same value).

I would think that since NSS_INIT_RESERVED is currently defined as having no
effect, that nobody would be using the name in their applications. If I added a
comment to the effect that this flag name shouldn't be used, would that satisfy
you ?

The goal of defining it was to prevent the 0x200 value for the flag from being
used for any other purposes before the module flag is defined by the working
group, and to include the trigger value in NSS_INIT_COOPERATE .

Aother alternative is to do the following now :

/* #define NSS_INIT_RESERVED 0x200 .
   Don't reuse this value for another flag. */
#define NSS_INIT_COOPERATE 0x200 | \
        NSS_INIT_PK11THREADSAFE | \
        NSS_INIT_PK11RELOAD | \
        NSS_INIT_NOPK11FINALIZE

And later change it to :

#define NSS_INIT_PK11SHAREMODULE 0x200
#define NSS_INIT_COOPERATE NSS_INIT_PK11SHAREMODULE | \
        NSS_INIT_PK11THREADSAFE | \
        NSS_INIT_PK11RELOAD | \
        NSS_INIT_NOPK11FINALIZE
Generally bits named "reserved" are bits that application programs are NEVER 
supposed to use.  

But I gather that you want programs to begin to compile with the value
NSS_INIT_COOPERATE now, with that value defined as the union of 4 bits,
3 of which are defined now, and one of which will be defined later (maybe).
That is, you want programs to begin to use the "reserved" bit now, but 
indirectly, as part of this mask called NSS_INIT_COOPERATE.

Any program that uses that value is taking a risk, a chance that the future
meaning of that bit will remain fully compatible with the desired behaviors
that their program has at the time it is released.  

Later when that bit is implemented, it will change the behavior of the NSS
libraries for programs that are already using that bit.  If the new 
behavior is completely desirable for customers' configurations that use it
when it is implemented, then that will be OK.  But if the new behavior 
causes any existing programs/configurations to stop working, that will be
perceived as yet another NSS binary compatibility problem.

Perhaps if you define the bit clearly now, e.g. with the name
NSS_INIT_PK11SHAREMODULE, and explain what the bit will do in the future,
so that application programers can decide for themselves if they want to 
take that risk, that would be acceptable.  OTOH, even if they do that, 
that will not let NSS off the hook if some customer later complains of 
incompatibility.  My belief is that applicationprogrammers will not use a
"reserved" bit, nor masks that use it, nor masks that contain undefined bits. 

Recall the case of SSL_NO_STEP_DOWN which was defined in NSS 3.9 but not
implemented until NSS 3.10.  AFAIK, not a single application used it until
it was implemented.  

So, I'd say your best bets are:
a) define the bit now as NSS_INIT_PK11SHAREMODULE and explain it, including
the fact that it is not yet implemented (silently ignored), or
b) just don't define it now at all, and define it later if/when PKCS11 
adopts the feature it is intended to enable.
(Assignee)

Comment 50

13 years ago
Nelson,

We make changes to NSS code and behavior all the time, and it is to our
discretion as NSS developers to create separate flags if we believe the changes
affect binary compatibility. Most NSS code changes don't come with corresponding
flags, fortunately; even those changes that affect the PKCS#11 module
initialization logic, as you can see from the CVS history, as they are harmless
to binary compatibility.

It is not possible to evaluate whether a change in NSS behavior will affect
binary compatibility before we know exactly what that change will be, ie. when
that code change is up for review . I don't think we can make that determination
before.

Having defined precisely what the behavior is for NSS_INIT_PK11THREADSAFE,
NSS_PK11RELOAD and NSS_NOPK11FINALIZE, I know that I don't wish to associate any
future behavior changes with those flags. However I mean for NSS_INIT_COOPERATE
to be more generic, and not only include the defined behavior for those 3 flags,
but also any other change we deem desirable for apps using Java and PKCS11 .
That change doesn't have to break compatibility, but that will be a topic for
discussion then, not now. If we find that the module flag change may break
binary compatibility, then we can always choose to define a separate flag for it
at that point. But if we find that it does not, then we won't have the option of
rolling in the behavior change for users of NSS_INIT_COOPERATE unless this flag
is defined to be more than just a bitflag of 3 precisely defined options. Short
of defining a new flag, the only option left would be to use the module flag all
the time regardless of flags, and I don't think that's a good choice. The reason
is that apps that set NSS_INIT_COOPERATE already make some sacrifices in NSS
functionality by enabling this flag, and additionally setting the module flag in
the future may not affect binary compatibility for them; whereas the same
probably could not be said for all NSS applications.
Julien, Essentially, the problem here is defining a bit with the name reserved
and expecting applications to use it with that name.  I suggest the following
remedy:
- rename NSS_INIT_RESERVED to NSS_INIT_PK11SHAREMODULE everywhere, at this time,
  including inside the definition of NSS_INIT_COOPERATE.  This is one of your
  proposals in a comment above, I'm just proposing you do it now.
- extend the comment that explains NSS_INIT_PK11SHAREMODULE to explain that
  it is not yet implemented, is now silently ignored, and its exact semantic
  behavior is not yet defined.

Then application developers will decide for themselves whether to use 
NSS_INIT_COOPERATE and bear the risk of binary incompatibility, or use 
 NSS_INIT_PK11THREADSAFE | NSS_INIT_PK11RELOAD | NSS_INIT_NOPK11FINALIZE
with lesser risk.  

Essentially the only difference between this proposal and the code as
checked in is that we eliminate the (bad, IMO) precedent of asking app
developers to use a bit named "RESERVED".

Comment 52

13 years ago
If NSS_INIT_PK11SHAREMODULE can be implemented
without breaking backward compatibility, I think
we should always initialize NSS that way.  Is
there any reason why someone may need to turn
off NSS_INIT_PK11SHAREMODULE?

In any case, I reiterate that it is unusual to
ask users to use a flag that's not implemented
and not specified.
(Assignee)

Comment 53

13 years ago
Wan-Teh,

Re: comment #52,

That question can't really be answered until a full description of the module
flag and its effect is available as agreed by the PKCS#11 working grouop. Going
by what we proposed so far, the flag would cause the module to refcount
C_Initialize / C_Finalize . This still leaves some questions open as to how
C_WaitForSlotEvent can get interrupted when this happens, if at all. The
applications that have set NSS_INIT_COOPERATE have also implicitly set
NSS_INIT_PK11NOFINALIZE, which means they have given up on using
C_WaitForSlotEvent (the call gets emulated). So adding this module flag for
users of NSS_INIT_COOPERATE may not break compatibility, but it might for
applications who don't set it. It may also be that another mechanism to
interrupt C_WaitForSlotEvent will be proposed at the same time as the module
flag, in which case it could be OK for NSS to always use it . But the full
proposal has not been made yet, so we don't know at this point.

I am not asking any users to use NSS_INIT_RESERVED now. I only put it there to
reserve the bitflag value for the future behavior. I didn't mean to have that
symbol used in applications now. It doesn't have to be defined as a separate
flag for now at all, as I pointed out in comment #48 - see what I suggested we
change now (commenting the definition for NSS_INIT_RESERVED, and adding | 0x200
to NSS_INIT_COOPERATE).

Nelson,

Neither of my proposals in comment #48 suggested defining
NSS_INIT_PK11SHAREMODULE in the header file today.

To restate the proposals, they were .

I

a) For today, leave the header files as they are checked in

b) When the behavior is defined by the working group and we can name the flag,
change the current definitions to :

#define NSS_INIT_PK11SHAREMODULE 0x200 (the same value as NSS_INIT_RESERVED)
or :
#define NSS_INIT_PK11SHAREMODULE NSS_INIT_RESERVED

and :

#define NSS_INIT_COOPERATE NSS_INIT_PK11SHAREMODULE | \
        NSS_INIT_PK11THREADSAFE | \
        NSS_INIT_PK11RELOAD | \
        NSS_INIT_NOPK11FINALIZE

II

a) change the current definitions now to :

/* #define NSS_INIT_RESERVED 0x200 .
   Don't reuse this value for another flag. */
#define NSS_INIT_COOPERATE 0x200 | \
        NSS_INIT_PK11THREADSAFE | \
        NSS_INIT_PK11RELOAD | \
        NSS_INIT_NOPK11FINALIZE

b) When the behavior is defined by the working group, change the above to :

#define NSS_INIT_PK11SHAREMODULE 0x200
#define NSS_INIT_COOPERATE NSS_INIT_PK11SHAREMODULE | \
        NSS_INIT_PK11THREADSAFE | \
        NSS_INIT_PK11RELOAD | \
        NSS_INIT_NOPK11FINALIZE
(Assignee)

Comment 54

13 years ago
Nelson,

To add to comment #53, I don't think the name NSS_INIT_PK11SHAREMODULE should
appear in the header files now because we don't know what behavior will come out
of the working group, if any. I think it's much better to name the flag after
the behavior is known, rather than before. This keeps our options open.

There shouldn't be a higher risk for using NSS_INIT_COOPERATE vs
NSS_INIT_PK11RELOAD | NSS_INIT_NOPK11FINALIZE | NSS_PK11_THREADSAFE, as long as
we continue to do a fine job of reviewing the changes for the module flag when
it is introduced to make sure the changes are safe. I see no indication of that
changing based on the comments in this bug ;)
Comment on attachment 188635 [details] [diff] [review]
patch as checked in 

Julien, You needn't repeat proposals in the bug.  

I object to asking applications to use bits that are named "reserved"
or that use bits (e.x. 0x200) whose behavior is undefined.  
Neither of your proposals addresses that.

The fact that such code was checked in is serious enough, IMO, to 
warrant being backed out or being replaced with all due haste with
something that asks apps to use only well defined (if unimplemented)
flags/symbols.
Attachment #188635 - Flags: review-
(Assignee)

Comment 56

13 years ago
Per our meeting today, this is fixed.
Status: REOPENED → RESOLVED
Last Resolved: 13 years ago13 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.