In softoken/pkcs11c.c, there is a constant called NUM_MIXERS which is set to 9 . It is used in a loop that includes a number of SHA1 and MD5 calls, used to fill a key_block array . We always go 9 times through this loop, even though this much data may not be required in all cases . It would be good to figure out how much data we need ahead of time, before going into the loop . This may help the SSL performance . When I replaced this loop with a memset on both the client and server side, I got a 2.8% performance improvement in SSL3 restart handshake performance (4190 to 4310 ops/s on the v20z dual opteron). We probably can't get that much because some data is necessary, but it would still be good to reduce the number of hashing calls as much as possible.
I am about to test a patch for this. It looks like we are being particularly wasteful in the TLS case, and TLS's "prf" function is very inefficient, so we should be as consservative as possible in that case.
Created attachment 180329 [details] [diff] [review] patch v1 This patch makes basically two changes. 1) both the MASTER_KEY_DERIVE and the KEY_AND_MAC_DERIVE sets of mechanisms had loops in which they called hash update functions twice, once for the client's random data and again for the server's random data. This patch concatenates the client and server random data into a buffer, so that only one hash update call is needed in the loop, not two. This cuts the number of hash updates significantly (about 33%) for SSL-related key derivation. 2) For the KEY_AND_MAC_DERIVE mechanisms, we previously had determined the worst case, that is, the case that required the largest amount of key material, and always computed that much, often using significantly less than what was generated. Now, the code computes exactly how much it needs and generates only that amount. I found that for RC4_SHA ciphersuites, we only need 4/9 of the amount previously being generated, so this should speed up key derivation considerably. I consider this too risky one week before 3.10 RTM, so I propose this for NSS 3.11.
Comment on attachment 180329 [details] [diff] [review] patch v1 r+ comments apply to the preexisting code and do not need to be addressed before checkin. (however I would look favorably on a patch that implements one or both comments). 1) There's a comment in the code about the similarity between the mixing functions used to generated the ssl3 master secret derive and the ssl3 mac key derive, and that maybe the mixing function should be a small function used by both sections of code. The comment is probably very old, and likely predates the tls code with does use put it's mixing function in a separate function. 2) Derive is *very* long (even for functions in pkcs11c.c). The SSL derive mechanism are crying to be made into separate functions. At some point it would be nice to split these out. (it may be argued that they belong in freebl anyway.. I could go either way). anyway patch approved for checkin.
Checking in pkcs11c.c; /cvsroot/mozilla/security/nss/lib/softoken/pkcs11c.c,v <-- pkcs11c.c new revision: 1.69; previous revision: 1.68
Reopening. Unfortunately, this patch broke the build on HPUX 11. HPUX 11's compiler won't allow a struct member to be initialized with the value of a variable, or with an address of a struct member that is accessed through a pointer. The address of a local or global variable is OK. So, I have a new small patch that will satisfy HPUX, forthcoming.
Created attachment 208991 [details] [diff] [review] Proposed patch for HPUX I want to test build this patch on several platforms before asking for review.
Comment on attachment 208991 [details] [diff] [review] Proposed patch for HPUX HPUX builds with this patch, without errors or warnings in that file. Julien, please review for the trunk. Thanks.
Comment on attachment 208991 [details] [diff] [review] Proposed patch for HPUX The patch looks good. This seems like a really weird restriction in the HP compiler. I wonder what C version/standard forbids this.
Comment on attachment 208991 [details] [diff] [review] Proposed patch for HPUX >+ /* HPUX won't let a structure member be initialized with the >+ * value of a variable, but the address of a local variable. :-/ Is this missing "is OK"? It looks a little strange to only initialize some fields of these structures, especially the SECItem "keyblk". I'm wondering if the compiler initializes to 0 the fields without an initializer.
Comment on attachment 208991 [details] [diff] [review] Proposed patch for HPUX Checked in on trunk. Checking in pkcs11c.c; new revision: 1.70; previous revision: 1.69
I wanna carry this fix back to the 3.11 branch. Any objections?
go for it.
Checked in on branch lib/softoken/pkcs11c.c; new revision: 188.8.131.52; previous revision: 184.108.40.206