Closed Bug 188965 Opened 22 years ago Closed 22 years ago

certutil dumps core adding certs into a Chrysalis LunaSA HSM

Categories

(NSS :: Libraries, defect, P1)

Sun
Solaris
defect

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: Bill.Burns, Assigned: julien.pierre)

Details

Attachments

(1 file)

certutil (and the "security" CGI program, part of NES 6.1sp2) core dump when I try to install a signed certificate request into the LunaSA HSM.  I can request the cert just fine, but when I try to install it I see one of two things:

1) the cert appears to get installed correctly (e.g. there is no warning), but "manage certs" doesn't actually show the cert in the list.
2) the admin server hangs while the CGI process ("security") dumps core.

You can replicate the problem more easily by trying to install the certificate using the "certutil" program instead.

I'll attach a core dump from the version of certutil bundled with NES 6.1sp2 while I was trying to add the certificate into the HSM's database.
The VeriSign intermediate CA certificate is available at:
http://www.verisign.com/support/install/intermediate.html.

The instructions to install/configure/use the LunaSA HSM are under separate cover.
Julien, could you take a look at this?  Thanks.
Assignee: wtc → jpierre
Priority: -- → P1
Target Milestone: --- → 3.8
Here's what happens to get the core dump:
- (the CSR has already been generated)
- worf 29 # ls
https-worf.nscp.aoltw.net-worf-cert8.db
https-worf.nscp.aoltw.net-worf-key3.db*
libnssckbi.so*
secmod.db

worf 30 # certutil -d . -L -P https-worf.nscp.aoltw.net-worf-
VeriSign Intermediate CA                                     ,,   
worf 31 # certutil -d . -U -P https-worf.nscp.aoltw.net-worf-

    slot: 
   token: Builtin Object Token

    slot: LunaNet Slot
   token: OpsSec-Test

    slot: NSS Internal Cryptographic Services Version 3.4                
   token: NSS Generic Crypto Services

    slot: NSS User Private Key and Certificate Services                  
   token: NSS Certificate DB
worf 44 # certutil -d . -P https-worf.nscp.aoltw.net-worf- -A -t u,u,u -i
/tmp/worf.cert -n "worf-cert"
Segmentation fault (core dumped)
Yes, I am already working on it with Bill. However, we are running into some
other basic problems using the HSM.
I got the driver to run on Solaris.
After enumerating the objects with C_FindObjects, the token returns one object -
the server cert.
Then, NSS calls C_GetAttributeValue() to get various information about it. It does
so in a single PKCS#11 template that queries 7 attributes :
CKA_CERTIFICATE_TYPE
CKA_ID
CKA_VALUE
CKA_ISSUER
CKA_SERIAL_NUMBER
CKA_SUBJECT
CKA_NETSCAPE_EMAIL

The Chrysalis module returns a 0x30 error, which is CKR_DEVICE_ERROR .
I have found that if NSS only queries for the first 6 attributes, then the call
succeeds. What's going on is that the Chrysalis module doesn't know about the
CKA_NETSCAPE_EMAIL extension. Rather than returning CKR_DEVICE_ERROR, the
PKCS#11 spec says that it should return CKR_ATTRIBUTE_TYPE_INVALID (0x12) .
If it did, NSS would automatically try the query again without the email in the
template, and it would succeed.

The fact that it works in NES 6.0 / NSS 3.2 is coincidental - we probably did
not query all the attributes at once, but rather in individual
C_GetAttributeValue calls, which is less efficient. Therefore everything worked,
despite the bug in the Chrysalis module. Resolving the problem should be a very
simple fix in the driver; it's only a matter of returning the proper error code.
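
To make the retry behaviour described above concrete, here is a minimal simulation in Python. It is not NSS code: `make_token` and `read_cert` are hypothetical names, the token is a fake, and a real compliant module would also hand back the values it does know alongside the error. Only the three return codes and the attribute names come from PKCS#11 and this report.

```python
# Return codes as defined by PKCS#11.
CKR_OK = 0x00
CKR_ATTRIBUTE_TYPE_INVALID = 0x12
CKR_DEVICE_ERROR = 0x30

# The seven attributes queried in a single template.
TEMPLATE = [
    "CKA_CERTIFICATE_TYPE", "CKA_ID", "CKA_VALUE", "CKA_ISSUER",
    "CKA_SERIAL_NUMBER", "CKA_SUBJECT", "CKA_NETSCAPE_EMAIL",
]
STANDARD = set(TEMPLATE[:6])


def make_token(unknown_attr_error):
    """Build a fake token that fails with the given code on unknown attributes."""
    def get_attributes(template):
        if any(name not in STANDARD for name in template):
            return unknown_attr_error, {}
        return CKR_OK, {name: b"..." for name in template}
    return get_attributes


def read_cert(get_attributes):
    """Query all seven attributes; retry without the email only on the
    error code the spec prescribes for an unknown attribute type."""
    rv, values = get_attributes(TEMPLATE)
    if rv == CKR_ATTRIBUTE_TYPE_INVALID:
        rv, values = get_attributes(TEMPLATE[:-1])
    return rv, values
```

With `make_token(CKR_ATTRIBUTE_TYPE_INVALID)` the second query succeeds and six attributes come back. With `make_token(CKR_DEVICE_ERROR)` the caller has no reason to retry, so the certificate is never read, which matches the symptom seen here.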
Good job, Julien!

Should we be more tolerant of drivers returning
the wrong error code?  It seems that this is a
common problem and the time we spend debugging
is expensive.
Component: Build → Libraries
Can we fix certutil so that it won't core dump
in this situation?
I did not experience a core dump.
I had installed the cert with NES, which went fine. The problem was with NSS
enumerating the certs.

As far as our tolerance of the problem : we could try the call again without the
email to see if it succeeds. But this would have a performance impact.
The only viable choice if we want to work around the bug would be for us to
cache the information that the token is buggy, so that we don't try again to
query objects for our extensions on the same slot, i.e. this would be a property of
the slot.
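
A rough sketch of that per-slot cache idea, in Python rather than NSS's C, purely for illustration. `slots_without_extensions`, `read_cert` and `BuggyToken` are hypothetical names and nothing here reflects an actual NSS data structure. The fake token fails with CKR_DEVICE_ERROR whenever the email attribute is in the template.

```python
CKR_OK = 0x00
CKR_DEVICE_ERROR = 0x30

STANDARD = [
    "CKA_CERTIFICATE_TYPE", "CKA_ID", "CKA_VALUE",
    "CKA_ISSUER", "CKA_SERIAL_NUMBER", "CKA_SUBJECT",
]
FULL = STANDARD + ["CKA_NETSCAPE_EMAIL"]

# Hypothetical cache: slots already seen to reject the extension attributes.
slots_without_extensions = set()


def read_cert(slot_id, get_attributes):
    """Try the full template once per slot; after a failure, remember the
    slot and only ever send it the standard attributes."""
    if slot_id not in slots_without_extensions:
        rv, values = get_attributes(FULL)
        if rv == CKR_OK:
            return rv, values
        slots_without_extensions.add(slot_id)
    return get_attributes(STANDARD)


class BuggyToken:
    """Fake token that returns CKR_DEVICE_ERROR for the email attribute."""

    def __init__(self):
        self.calls = 0

    def get_attributes(self, template):
        self.calls += 1
        if "CKA_NETSCAPE_EMAIL" in template:
            return CKR_DEVICE_ERROR, {}
        return CKR_OK, {name: b"" for name in template}
```

The first read on a buggy slot costs two calls, every later read costs one. The trade-off is that a genuine device error on the first query would also mark the slot as lacking extensions.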
I'd like to discuss this with Chrysalis so they can adjust their driver to be
more tolerant of queries that it doesn't understand.  Can we get a list of the
attributes that NSS submits to the module?  Or even an output of the NSS debug
you did?
The PKCS#11 logger I used doesn't display all the information needed for this
sort of debugging. I could see that a C_GetAttributeValue call failed, and the error
returned by the module. However, I didn't have sufficient information on the
arguments in the trace to find out which attributes we were querying. There was
only a pointer value to the template array, but not the content of the array
itself. So I had to actually debug step by step to identify the NSS code.

The queries that NSS does vary from version to version, as you found in this
case, so there is no exhaustive list we can produce. In general, we use the
standard attributes defined in PKCS#11. The two main exceptions I know of are
for e-mail addresses and for CRLs. Only the Netscape modules understand them.
Other modules shouldn't fail though. It should be noted that PKCS#11 itself
evolves from version to version. New attribute types may get defined in the
future, even in minor revisions of the spec. If any sort of compatibility is to
be maintained between PKCS#11 applications such as NSS, and PKCS#11 drivers,
which may support slightly different minor revisions of PKCS#11, then the driver
should ignore the attributes it doesn't know about. Otherwise the driver would
need to be updated for every minor revision to the spec.
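
For the driver side, the following Python sketch shows one way a module can cope with attribute types it has never heard of, in line with how PKCS#11 describes C_GetAttributeValue: mark the unknown entry's length as -1, keep filling in the rest, and return CKR_ATTRIBUTE_TYPE_INVALID. The function name, the dict-based object and the tuple layout are illustrative assumptions, not the Chrysalis driver's design.

```python
CKR_OK = 0x00
CKR_ATTRIBUTE_TYPE_INVALID = 0x12

# Stands in for (CK_ULONG)-1, the length a module reports for an entry
# it cannot supply.
UNAVAILABLE = -1


def get_attribute_value(obj, template):
    """Fill every entry the object has; flag the rest instead of aborting.

    obj      -- dict mapping attribute name to its bytes value
    template -- list of attribute names requested by the application
    Returns (return code, list of (name, value, length) tuples).
    """
    rv = CKR_OK
    result = []
    for name in template:
        if name in obj:
            value = obj[name]
            result.append((name, value, len(value)))
        else:
            # Unknown attribute type: no value, length -1, keep going.
            result.append((name, None, UNAVAILABLE))
            rv = CKR_ATTRIBUTE_TYPE_INVALID
    return rv, result
```

A driver written this way needs no knowledge of vendor extensions or of attributes added in later revisions of the spec; the application can tell from the return code and the -1 lengths which entries were not understood.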
Returning an error other than CKR_ATTRIBUTE_TYPE_INVALID seems to be a common
problem with token vendors.  I suppose we could always fallback to retrying the
search without CKA_NETSCAPE_EMAIL, no matter what the actual return value was,
and that would only have a performance impact on failed queries.  But this is a
fairly basic mistake, and it doesn't really make sense to hack up our code just
to be compatible with every vendor that doesn't implement the spec correctly.

Regarding the logger, it could crack the templates to show what attributes are
requested (and even what values they received), but that work hasn't been done yet.
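
As an illustration of what "cracking the templates" could look like, here is a small Python sketch that turns a template into readable lines. It is not the existing logger: `describe_template` and the (type, value) pair layout are assumptions made for this example, and the name table covers only the six standard attributes from this bug. The numeric codes are the standard PKCS#11 values.

```python
# Standard PKCS#11 attribute type codes for the attributes in this bug.
ATTRIBUTE_NAMES = {
    0x011: "CKA_VALUE",
    0x080: "CKA_CERTIFICATE_TYPE",
    0x081: "CKA_ISSUER",
    0x082: "CKA_SERIAL_NUMBER",
    0x101: "CKA_SUBJECT",
    0x102: "CKA_ID",
}
# Attribute types with this bit set are vendor extensions.
CKA_VENDOR_DEFINED = 0x80000000


def describe_template(template):
    """Render a list of (attribute type, value or None) pairs as log lines."""
    lines = []
    for attr_type, value in template:
        name = ATTRIBUTE_NAMES.get(attr_type)
        if name is None:
            kind = "vendor-defined" if attr_type & CKA_VENDOR_DEFINED else "unknown"
            name = f"{kind} 0x{attr_type:08X}"
        size = "unset" if value is None else f"{len(value)} bytes"
        lines.append(f"  {name}: {size}")
    return lines
```

Logged before and after the call, output of this kind would show both which attributes were requested and which ones came back with values, without stepping through the code in a debugger.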
The driver shouldn't know about any non-standard attributes, but the driver
should implement the PKCS #11 spec about what to do when it encounters an
unknown attribute. There are specific error codes that should be returned. Our
code is written to respond to those specific error codes. 
I think we should open another defect for the purpose of working around the
incorrect PKCS#11 drivers, if we intend to do so. It isn't clear that we want to
do so yet, but we might want to discuss it at our NSS meeting.

However, the original issue in this bug was a core dump in certutil. 
Bill, could you please list the stack using your core dump, dbx and the
particular certutil binary you were using ? This can be accomplished by running
"dbx ./certutil core" and then typing the "where" command.

Did the certutil core dump occur in all versions of NSS that you tested or only
some ? 
I don't have a license to use dbx on my test machine.  I only tested the version
of certutil that came bundled with that particular version of NES.  In some
cases it core dumped, in others it just reported that no certificates were found.

I've spoken to Chrysalis about this error and they are investigating the PKCS#11
spec to verify the behaviour we're seeing and how/if they are out of spec.
Bill,

If you can't provide the stack, can you provide the cert and the exact certutil
command you are using that causes the core dump ?

Thanks.
As part of the troubleshooting with Chrysalis, I had to update the firmware in
the HSM so I'm re-building the test environment from scratch.  Let me replicate
the problem and report back.
Julien,  I had more success with NES after I upgraded the firmware of the
Chrysalis unit.  You need to re-install the Chrysalis software on strange; it's
available at /u/shadow/src/Chrysalis-Luna-SA/Update/client/.

I'm going to do a clean install of NES and try again to see if I can verify that
their upgrade is working.
Within NES 6.1sp2 I was able to see and use a certificate that was previously
generated on the HSM.  I can't claim this bug as "resolved" since I may have a
new problem with NES 6.1sp2: unable to generate a new keypair. Outlined in
blackflag bug #616182. :(
Using modutil and certutil from NSS 3.7.1 I get slightly different results:
worf 316 # modutil -dbdir . -nocertdb -list
Using database directory ....

Listing of PKCS #11 Modules
-----------------------------------------------------------
  1. NSS Internal PKCS #11 Module
         slots: 2 slots attached
        status: loaded

         slot: NSS Internal Cryptographic Services Version 3.4                
        token: NSS Generic Crypto Services

         slot: NSS User Private Key and Certificate Services                  
        token: NSS Certificate DB

  2. LunaSA
        library name: /usr/lunasa/lib/cryst201s.so
         slots: 1 slot attached
        status: loaded

         slot: LunaNet Slot
        token: CMS-Test

  3. Root Certs
        library name: /opt/local/netscape/nes6.1sp2/alias/libnssckbi.so
         slots: 1 slot attached
        status: loaded

         slot: 
        token: Builtin Object Token
-----------------------------------------------------------

worf 315 # certutil -d . -R -s "CN=worf.netscape.com,O=America Online, C=US" -o
worf.csr -h CMS-Test -a
A random seed must be generated that will be used in the
creation of your key.  One of the easiest ways to create a
random seed is to use the timing of keystrokes on a keyboard.

To begin, type keys on the keyboard until this progress meter
is full.  DO NOT USE THE AUTOREPEAT FUNCTION ON YOUR KEYBOARD!


Continue typing until the progress meter is full:

|************************************************************|

Finished.  Press enter to continue: 
Enter Password or Pin for "CMS-Test":
certutil: unable to generate key(s)
: An I/O error occurred during security authorization.
And using the certutil from NSS3.7 produces what results?
Inconsistent results.  Sometimes the token doesn't respond (times out) and
sometimes it works just fine.

Chrysalis also says that you need to issue the command "salogin -v -o -s 1 -i
4:4 -p <passphrase>" before the web server can access certs on the token.  This
seems odd to me, but isn't a new requirement.  Nevertheless, the command-line
tools and NES give inconsistent results when querying the Chrysalis token.

I suspect some sort of weird interaction between the Chrysalis driver and NSS. 
I've asked Chrysalis to investigate their code to help troubleshoot this.
Bill,

I remember having the problem with the CSR / keypair generation before. You
solved it for me by adding some things into the Luna chrystoki.conf file . Since
you reinstalled the software, perhaps these lines need to be added again ?
Good idea.  Yes, the lines were added to the conf file.  The "trick" seems to be
to run the cryptic "salogin" command before starting up any application that
tries to use the HSM.  I'll be delving into this requirement a bit more now.

As it stands, the latest HSM firmware patch + using the salogin program seem to
be working a bit better.  I'll try running the PKCS#11 logging module to see if
we can track this down any more definitively.
Since updating the LunaSA firmware, I'm not seeing this bug anymore.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → INVALID