certutil dumps core adding certs into a Chrysalis LunaSA HSM

RESOLVED INVALID

Status

NSS
Libraries
P1
normal
RESOLVED INVALID
16 years ago
15 years ago

People

(Reporter: bill, Assigned: Julien Pierre)

Tracking

unspecified
Sun
Solaris

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment)

(Reporter)

Description

16 years ago
certutil (and the "security" CGI program, part of NES 6.1sp2) core dump when I try to install a signed certificate request into the LunaSA HSM.  I can request the cert just fine but when I try to install it I see one of two things:
1) the cert appears to get installed correctly (e.g. there is no warning).  But "manage certs" doesn't actually show the cert in the list.
2) the admin server hangs while the CGI process ("security") dumps core.

You can replicate the problem more easily by trying to install the certificate using the "certutil" program instead.

I'll attach a core dump from the version of certutil bundled with NES 6.1sp2 while I was trying to add the certificate into the HSM's database.
(Reporter)

Comment 1

16 years ago
Created attachment 111438 [details]
core dump from certutil using LunaSA
(Reporter)

Comment 2

16 years ago
The VeriSign intermediate CA certificate is available at:
http://www.verisign.com/support/install/intermediate.html

The instructions to install/configure/use the LunaSA HSM are under separate cover.

Comment 3

16 years ago
Julien, could you take a look at this?  Thanks.
Assignee: wtc → jpierre
Priority: -- → P1
Target Milestone: --- → 3.8
(Reporter)

Comment 4

16 years ago
here's what happens to get the core dump:
- (the CSR has already been generated)
- worf 29 # ls
https-worf.nscp.aoltw.net-worf-cert8.db
https-worf.nscp.aoltw.net-worf-key3.db*
libnssckbi.so*
secmod.db

worf 30 # certutil -d . -L -P https-worf.nscp.aoltw.net-worf-
VeriSign Intermediate CA                                     ,,   
worf 31 # certutil -d . -U -P https-worf.nscp.aoltw.net-worf-

    slot: 
   token: Builtin Object Token

    slot: LunaNet Slot
   token: OpsSec-Test

    slot: NSS Internal Cryptographic Services Version 3.4                
   token: NSS Generic Crypto Services

    slot: NSS User Private Key and Certificate Services                  
   token: NSS Certificate DB
worf 44 # certutil -d . -P https-worf.nscp.aoltw.net-worf- -A -t u,u,u -i
/tmp/worf.cert -n "worf-cert"
Segmentation fault (core dumped)
(Assignee)

Comment 5

16 years ago
Yes, I am already working on it with Bill. However, we are running into some
other basic problems using the HSM.
(Assignee)

Comment 6

16 years ago
I got the driver to run on Solaris.
After enumerating the objects with C_FindObjects, the token returns one object -
the server cert.
Then, NSS calls C_GetAttributeValue() to get various information about it. It does
so in a single PKCS#11 template that queries 7 attributes:
CKA_CERTIFICATE_TYPE
CKA_ID
CKA_VALUE
CKA_ISSUER
CKA_SERIAL_NUMBER
CKA_SUBJECT
CKA_NETSCAPE_EMAIL

The Chrysalis module returns a 0x30 error, which is CKR_DEVICE_ERROR.
I have found that if NSS only queries for the first 6 attributes, then the call
succeeds. What's going on is that the Chrysalis module doesn't know about the
CKA_NETSCAPE_EMAIL extension. The PKCS#11 spec says that in this case it should
return CKR_ATTRIBUTE_TYPE_INVALID (0x12) rather than CKR_DEVICE_ERROR.
If it did, NSS would automatically try the query again without the email in the
template, and it would succeed.

The fact that it works in NES 6.0 / NSS 3.2 is coincidental - we probably did
not query all the attributes at once, but rather in individual
C_GetAttributeValue calls, which is less efficient. Therefore everything worked,
despite the bug in the Chrysalis module. Resolving the problem should be a very
simple fix in the driver; it's only a matter of returning the proper error code.
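
The retry behaviour described above can be illustrated with a small simulation. This is only a sketch, not NSS code: make_token and query_cert are hypothetical names, the token is a stand-in function rather than a real PKCS#11 module, and attributes are represented by name only. The return codes are the ones quoted in this comment.

```python
# Return codes quoted in the comment above.
CKR_OK = 0x00
CKR_ATTRIBUTE_TYPE_INVALID = 0x12
CKR_DEVICE_ERROR = 0x30

# The seven attributes queried in a single template.
TEMPLATE = [
    "CKA_CERTIFICATE_TYPE", "CKA_ID", "CKA_VALUE", "CKA_ISSUER",
    "CKA_SERIAL_NUMBER", "CKA_SUBJECT", "CKA_NETSCAPE_EMAIL",
]
STANDARD = set(TEMPLATE) - {"CKA_NETSCAPE_EMAIL"}


def make_token(unknown_attr_error):
    """Hypothetical stand-in for a module's attribute query.

    The returned function fails with unknown_attr_error whenever the
    template contains an attribute the simulated token does not know.
    """
    def get_attribute_value(template):
        if any(attr not in STANDARD for attr in template):
            return unknown_attr_error
        return CKR_OK
    return get_attribute_value


def query_cert(get_attribute_value):
    """Query all seven attributes; retry without the email only on 0x12."""
    rv = get_attribute_value(TEMPLATE)
    if rv == CKR_ATTRIBUTE_TYPE_INVALID:
        rv = get_attribute_value(TEMPLATE[:-1])
    return rv


compliant = make_token(CKR_ATTRIBUTE_TYPE_INVALID)
buggy = make_token(CKR_DEVICE_ERROR)
print(hex(query_cert(compliant)))  # 0x0
print(hex(query_cert(buggy)))      # 0x30
```

With the spec-conforming error code the second attempt succeeds; with CKR_DEVICE_ERROR the caller has no hint that the template was the problem, so the failure propagates.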

Comment 7

16 years ago
Good job, Julien!

Should we be more tolerant of drivers returning
the wrong error code?  It seems that this is a
common problem and the time we spend debugging
is expensive.
Component: Build → Libraries

Comment 8

16 years ago
Can we fix certutil so that it won't core dump
in this situation?
(Assignee)

Comment 9

16 years ago
I did not experience a core dump.
I had installed the cert with NES, which went fine. The problem was with NSS
enumerating the certs.

As far as our tolerance of the problem: we could try the call again without the
email to see if it succeeds. But this would have a performance impact.
The only viable choice if we want to work around the bug would be for us to
cache the information that the token is buggy, so that we don't try again to
query objects for our extension attributes on the same slot. I.e., this would be
a property of the slot.
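
A rough sketch of what such a per-slot cache could look like follows. It is an illustration under assumptions, not a proposed patch: the Slot class, its rejects_extensions flag and the buggy_token function are all hypothetical, and attributes are represented by name only.

```python
CKR_OK = 0x00
CKR_DEVICE_ERROR = 0x30
EMAIL = "CKA_NETSCAPE_EMAIL"


class Slot:
    """Hypothetical slot wrapper that remembers a failed extension query."""

    def __init__(self, get_attribute_value):
        self._query = get_attribute_value
        self.rejects_extensions = False  # cached per slot
        self.calls = 0                   # number of calls made to the module

    def get_attributes(self, template):
        if self.rejects_extensions:
            # Known-buggy token: never send the extension attribute again.
            template = [a for a in template if a != EMAIL]
        self.calls += 1
        rv = self._query(template)
        if rv != CKR_OK and EMAIL in template:
            # Retry once without the extension, whatever the error code was.
            standard_only = [a for a in template if a != EMAIL]
            self.calls += 1
            rv = self._query(standard_only)
            if rv == CKR_OK:
                # Only the extension was at fault; remember that.
                self.rejects_extensions = True
        return rv


def buggy_token(template):
    """Simulated module that answers 0x30 for any template with the email."""
    return CKR_DEVICE_ERROR if EMAIL in template else CKR_OK


slot = Slot(buggy_token)
template = ["CKA_ID", "CKA_VALUE", EMAIL]
first = slot.get_attributes(template)   # two module calls: fail, then retry
second = slot.get_attributes(template)  # one module call: email is skipped
print(first, second, slot.calls)        # 0 0 3
```

The extra round trip is paid once per slot rather than once per object, which is the point of caching the result.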
(Reporter)

Comment 10

16 years ago
I'd like to discuss this with Chrysalis so they can adjust their driver to be
more tolerant of queries that it doesn't understand.  Can we get a list of the
attributes that NSS submits to the module?  Or even the output of the NSS
debugging you did?
(Assignee)

Comment 11

16 years ago
The PKCS#11 logger I used doesn't display all the information needed for this
sort of debugging. I could see that a C_GetAttributeValue call failed, and the
error returned by the module. However, I didn't have sufficient information on
the arguments in the trace to find out which attributes we were querying. There
was only a pointer value to the template array, but not the content of the array
itself. So I had to actually debug step by step to identify the NSS code.

The queries that NSS does vary from version to version, as you found in this
case, so there is no exhaustive list we can produce. In general, we use the
standard attributes defined in PKCS#11. The two main exceptions I know of are
for e-mail addresses and for CRLs. Only the Netscape modules understand them.
Other modules shouldn't fail though. It should be noted that PKCS#11 itself
evolves from version to version. New attribute types may get defined in the
future, even in minor revisions of the spec. If any sort of compatibility is to
be maintained between PKCS#11 applications such as NSS, and PKCS#11 drivers,
which may support slightly different minor revisions of PKCS#11, then the driver
should ignore the attributes it doesn't know about. Otherwise the driver would
need to be updated for every minor revision to the spec.

Comment 12

16 years ago
Returning an error other than CKR_ATTRIBUTE_TYPE_INVALID seems to be a common
problem with token vendors.  I suppose we could always fall back to retrying the
search without CKA_NETSCAPE_EMAIL, no matter what the actual return value was,
and that would only have a performance impact on failed queries.  But this is a
fairly basic mistake, and it doesn't really make sense to hack up our code just
to be compatible with every vendor that doesn't implement the spec correctly.

Regarding the logger, it could crack the templates to show what attributes are
requested (and even what values they received), but that work hasn't been done yet.
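
As an illustration of what "cracking" a template in the logger might involve, here is a minimal sketch that maps attribute type codes to names before writing the trace line. The function names and the trace format are hypothetical. The numeric codes for the standard attributes are the PKCS#11 values; the vendor-defined code in the example is assumed for illustration only.

```python
CKA_VENDOR_DEFINED = 0x80000000

# Standard PKCS#11 attribute type codes for the attributes in this bug.
NAMES = {
    0x011: "CKA_VALUE",
    0x080: "CKA_CERTIFICATE_TYPE",
    0x081: "CKA_ISSUER",
    0x082: "CKA_SERIAL_NUMBER",
    0x101: "CKA_SUBJECT",
    0x102: "CKA_ID",
}


def describe_template(attr_types):
    """Turn a list of attribute type codes into readable names."""
    out = []
    for attr_type in attr_types:
        if attr_type in NAMES:
            out.append(NAMES[attr_type])
        elif attr_type & CKA_VENDOR_DEFINED:
            out.append("vendor-defined 0x%08X" % attr_type)
        else:
            out.append("unknown 0x%08X" % attr_type)
    return out


def log_call(name, attr_types, rv):
    """Format one trace line: call name, decoded template, return code."""
    decoded = ", ".join(describe_template(attr_types))
    return "%s template=[%s] rv=0x%X" % (name, decoded, rv)


# 0xCE534352 stands in for a vendor extension; the value is an assumption.
print(log_call("C_GetAttributeValue", [0x80, 0x102, 0xCE534352], 0x30))
```

A trace line of this kind would have shown directly that the failing call included a vendor-defined attribute, without stepping through the code.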

Comment 13

16 years ago
The driver shouldn't know about any non-standard attributes, but the driver
should implement the PKCS #11 spec about what to do when it encounters an
unknown attribute. There are specific error codes that should be returned. Our
code is written to respond to those specific error codes. 
(Assignee)

Comment 14

16 years ago
I think we should open another defect for the purpose of working around the
incorrect PKCS#11 drivers, if we intend to do so. It isn't clear that we want to
do so yet, but we might want to discuss it at our NSS meeting.

However, the original issue in this bug was a core dump in certutil. 
Bill, could you please list the stack using your core dump, dbx and the
particular certutil binary you were using ? This can be accomplished by running
"dbx ./certutil core" and then typing the "where" command.

Did the certutil core dump occur in all versions of NSS that you tested or only
some ? 
(Reporter)

Comment 15

16 years ago
I don't have a license to use dbx on my test machine.  I only tested the version
of certutil that came bundled with that particular version of NES.  In some
cases it core dumped, in others it just reported that no certificates were found.

I've spoken to Chrysalis about this error and they are investigating the PKCS#11
spec to verify the behaviour we're seeing and how/if they are out of spec.
(Assignee)

Comment 16

16 years ago
Bill,

If you can't provide the stack, can you provide the cert and the exact certutil
command you are using that causes the core dump ?

Thanks.
(Reporter)

Comment 17

16 years ago
As part of the troubleshooting with Chrysalis, I had to update the firmware in
the HSM so I'm re-building the test environment from scratch.  Let me replicate
the problem and report back.
(Reporter)

Comment 18

15 years ago
Julien,  I had more success with NES after I upgraded the firmware of the
Chrysalis unit.  You need to re-install the Chrysalis software on strange; it's
available at /u/shadow/src/Chrysalis-Luna-SA/Update/client/.

I'm going to do a clean install of NES and try again to see if I can verify that
their upgrade is working.
(Reporter)

Comment 19

15 years ago
within NES 6.1sp2 I was able to see and use a certificate that was previously
generated on the HSM.  I can't claim this bug as "resolved" since I may have a
new problem with NES 6.1sp2: unable to generate a new keypair. Outlined in
blackflag bug #616182. :(
(Reporter)

Comment 20

15 years ago
using modutil and certutil from NSS 3.7.1 I get slightly different results:
worf 316 # modutil -dbdir . -nocertdb -list
Using database directory ....

Listing of PKCS #11 Modules
-----------------------------------------------------------
  1. NSS Internal PKCS #11 Module
         slots: 2 slots attached
        status: loaded

         slot: NSS Internal Cryptographic Services Version 3.4                
        token: NSS Generic Crypto Services

         slot: NSS User Private Key and Certificate Services                  
        token: NSS Certificate DB

  2. LunaSA
        library name: /usr/lunasa/lib/cryst201s.so
         slots: 1 slot attached
        status: loaded

         slot: LunaNet Slot
        token: CMS-Test

  3. Root Certs
        library name: /opt/local/netscape/nes6.1sp2/alias/libnssckbi.so
         slots: 1 slot attached
        status: loaded

         slot: 
        token: Builtin Object Token
-----------------------------------------------------------

worf 315 # certutil -d . -R -s "CN=worf.netscape.com,O=America Online, C=US" -o
worf.csr -h CMS-Test -a
A random seed must be generated that will be used in the
creation of your key.  One of the easiest ways to create a
random seed is to use the timing of keystrokes on a keyboard.

To begin, type keys on the keyboard until this progress meter
is full.  DO NOT USE THE AUTOREPEAT FUNCTION ON YOUR KEYBOARD!


Continue typing until the progress meter is full:

|************************************************************|

Finished.  Press enter to continue: 
Enter Password or Pin for "CMS-Test":
certutil: unable to generate key(s)
: An I/O error occurred during security authorization.

Comment 21

15 years ago
And using the certutil from NSS3.7 produces what results?
(Reporter)

Comment 22

15 years ago
inconsistent results.  Sometimes the token doesn't respond (times out) and
sometimes it works just fine.

Chrysalis also says that you need to issue the command "salogin -v -o -s 1 -i
4:4 -p <passphrase>" before the web server can access certs on the token.  This
seems odd to me, but isn't a new requirement.  Nevertheless, neither the
command-line tools nor NES query the Chrysalis token reliably; the results are
inconsistent.

I suspect some sort of weird interaction between the Chrysalis driver and NSS.
I've asked Chrysalis to investigate their code to help troubleshoot this.
(Assignee)

Comment 23

15 years ago
Bill,

I remember having the problem with the CSR / keypair generation before. You
solved it for me by adding some things into the Luna chrystoki.conf file . Since
you reinstalled the software, perhaps these lines need to be added again ?
(Reporter)

Comment 24

15 years ago
Good idea.  Yes, the lines were added to the conf file.  The "trick" seems to be
to run the cryptic "salogin" command before starting up any application that
tries to use the HSM.  I'll be delving into this requirement a bit more now.

As it stands, the latest HSM firmware patch + using the salogin program seem to
be working a bit better.  I'll try running the PKCS#11 logging module to see if
we can track this down any more definitively.
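
One way to make the "salogin first" step hard to forget is a small start-up wrapper. The sketch below is an assumption about how that could be scripted, not something Chrysalis or NES provides: start_with_salogin and salogin_argv are hypothetical names, the salogin arguments are copied verbatim from the command quoted in comment 22, and "./start" is a placeholder for whatever launches the application.

```python
import subprocess


def salogin_argv(passphrase):
    """Build the salogin command line exactly as quoted in comment 22."""
    return ["salogin", "-v", "-o", "-s", "1", "-i", "4:4", "-p", passphrase]


def start_with_salogin(app_argv, passphrase, run=subprocess.run):
    """Run salogin, then the application; stop if salogin fails.

    With the default runner, check=True raises CalledProcessError when
    salogin exits non-zero, so the application is never started without it.
    """
    run(salogin_argv(passphrase), check=True)
    return run(app_argv, check=True)


# Dry run with a fake runner, so nothing is executed here.
calls = []


def fake_run(argv, check):
    calls.append(argv)


start_with_salogin(["./start"], "secret", run=fake_run)
print(calls[0][0], calls[1][0])  # salogin ./start
```

Passing a different run callable keeps the ordering logic testable on a machine that has no HSM client installed.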
(Reporter)

Comment 25

15 years ago
since updating the LunaSA firmware, I'm not seeing this bug anymore.
Status: NEW → RESOLVED
Last Resolved: 15 years ago
Resolution: --- → INVALID