This is probably a bug in nCipher, not NSS, but I want to be able to track it. nCipher appears to deadlock in a call to C_GetAttributeValue, during a JSS program to test SDR. Here is the stack trace: ff29bd34 lwp_sema_wait (284b0) ff359ab0 _park (284b0, ff37e000, 0, 283f0, 24d98, 0) + 114 ff359778 _swtch (283f0, 283f0, ff37e000, 5, 1000, ff00) + 424 ff35b140 _mutex_adaptive_lock (ff3898ec, 4c00, 1000, fffeffff, 1, 4d58) + 160 ff35bbf0 pthread_mutex_lock (0, ff37e000, f122177c, 2fae38, 26d3, 10008cb) + f8 f13d1a3c PR_Lock (2fae38, 10008cb, 0, 0, 0, f125e790) + 74 f14cfc10 secmodLockMutext (2fae38, 10008cb, f122177c, 2fae38, f1269e80, 204) + 30 f11545f4 NFC__load_key (f125d0a8, 0, f125c948, 2, 10008cb, f1251bc8) + 12c f1153e74 NFC__get_real_key (307be0, 2fbfa0, 1, ffbee7a4, f1251bc8, 10008cb) + 94 f117312c export_sec_key (307be0, f125c948, 2bc5e8, 18, f1251bc8, 10008cb) + 164 f117500c NFC__GetAttributeValueSecretKey (f125d0a8, f125c948, f1251bc8, 1, 0, 18) + 888 f11654e4 NFC_GetAttributeValue (10008cb, f125c948, 0, 1, f125d0a8, f1251bc8) + 2b4 f14e2268 PK11_ReadAttribute (2fac90, 85000465, 11, 0, 2e27c8, 5) + 170 f14e5edc PK11_ExtractKeyValue (2e27b0, ffbeecec, fdeedbf4, fdeedc00, ffbeebf8, 0) + 9c f14e6e08 PK11_GetKeyLength (2e27b0, ffbeecec, ffbeebf8, ffbee6d8, 2bc70, 4c83f8) + 68 fdeb65dc Java_org_mozilla_jss_pkcs11_PK11SymKey_getLength (2bcfc, ffbeecec, 0, ffbeec80, 4f20ec, 0) + 6c fa40c878 ???????? (ffbeecec, f622e8f0, 0, f, 0, 0) fa405b38 ???????? (ffbeed8c, b6, 0, fa4169fc, 4, ffbeec80) fa4058b0 ???????? (ffbeee1c, b6, 0, fa416430, 8, ffbeed18) fa405940 ???????? (ffbeeed4, b6, 0, fa416430, 10, ffbeedb0) fa40588c ???????? (ffbeef74, 0, 0, fa416430, 8, ffbeee68) fa40042c ???????? (ffbef000, ffbef1b0, a, f610d810, 4, ffbeef18) fe1428fc __1cJJavaCallsLcall_helper6FpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v_ (ffbef1a8, ffbef0b0, ffbef0d0, 2bc70, 2bc70, 6000) + 25c fe14b03c __1cKjni_invoke6FpnHJNIEnv__pnJJavaValue_pnI_jobject_nLJNICallType_pnK_jmethodID_pnSJNI_ArgumentPusher_pnGThread__v_ (0, ffbef1a8, 0, 0, e5c20, ffbef18c) + 390 fe291ab4 jni_CallStaticVoidMethod (2bcfc, 2c640, e5c20, 2c650, ffbefc04, ff00) + 190 00012394 main (28afc, 1, e5c20, 2c650, 5, 2c65c) + 1154 00011218 _start (0, 0, 0, 0, 0, 0) + 108
Created attachment 111883 [details] JSS test program that triggers the deadlock This is a JSS test program that triggers the deadlock in nCipher code.
Created attachment 111887 [details] [diff] [review] patch I was able to workaround this problem by rearranging PK11_GetKeyLength so that it first tries to determine the key length from its type, and only if that fails to try to extract the key value. This fixes the bug for certain keys (notably DES3), but not for any variable-length keys (AES or RSA). I think the code is still correct with this change. Whether we actually do this depends how important it is to get nCipher support.
Bob, could you work with Jamie on this bug? Do you think it is acceptable to use the workaround that Jamie proposed in NSS 3.7.1?
Comment on attachment 111887 [details] [diff] [review] patch Make sure key->size == 0 if keyLenghth == CK_UNAVAILABLE_INFORMATION Other than that it looks good. (NOTE: if CK_UNAVAILABLE_INFORMATION == 0, then include a comment that says so, the current code would be fine in this case). bob
Comment on attachment 111887 [details] [diff] [review] patch The 'end' label should be deleted and 'goto end' changed to 'return key->size'.
Comment on attachment 111887 [details] [diff] [review] patch Sorry, I was misreading the diff's the current code is fine. bob r=relyea
I have confirmed that this is a case of double-locking. I put a check in the locking function to see whether the current thread already holds the lock, and this check got tripped right before the process hung.
Created attachment 112233 [details] [diff] [review] cleaned up patch Differences from last patch: 1. Unified instead of context diff. 2. Removed the "end:" label. Instead of "goto end", "return key->size". I tested this latest patch to see that it still fixes the problem.
Created attachment 112255 [details] [diff] [review] Update the comments The comments need to be updated now that the code is moved.
The original code of PK11_GetKeyLength has one (undocumented) property: it fills in key->data. The workaround for this bug loses this property. I am not sure if any code is depending on this property.
Not directly. The data field only gets filled in if the GET_Attribute worked. Code needing data calls PK11_ExtractKey (0r some similiar name) which tries to read the key if data doesn't exist. The only possible dependencies could be if the GetAttribute call could fail after the call to get KeyStrength(), or the change in performance trips over a race condition. Both of these case would involve latent bugs which would then have to be fixed. bob
Apparently nCipher also support the CKA_VALUE_LEN attribute: "Possible workaround - try CKA_VALUE_LEN before CKA_VALUE. Isn't defined for fixed length keys (like DES, 3DES), but as it happens will work with all secret keys in our library - and should work for variable length keys on all tokens, even for sensitive keys."
Given the new information, could you come up with a new patch? Should the ideal order be as follows? CKA_VALUE_LEN CKA_KEY_TYPE CKA_VALUE
Marked the bug fixed because the workaround has been checked in.