Pallab has reported that NSS has many locks that are created but never contended
for. One class of these are a result of creating arenas that are only used by
one thread. There are two obvious cases (borne out by data Pallab collected):
1. Throwaway arenas
There are many instances of the following code in NSS:
tmparena = PORT_NewArena();
/* code that uses the arena */
Since the lifetime of the arena is entirely within the function, it obviously
does not need to be threadsafe. In fact, we shouldn't have to do any allocation
at all for such arenas, as a PLArenaPool can be on the stack.
2. Read-only objects
I suspect many NSS objects only use their arena when they are constructed. For
such objects, the arena does not need to be threadsafe. Experimentally,
SECKEYPrivateKey seems to be such an object, though I haven't verified by code
Fixing (1) would be easy, (2) is a little trickier, because we would have to be
careful about not breaking things.
I can think of two ways to change this behavior for arenas that do not need to
A. Stop using the PORT_ functions and just use NSPR directly.
B. Create special PORT_ functions that don't do any locking.
I would prefer A.
Really? Prefer A? I thought Sun was trying to stop using NSPR entirely.
How about option c
c) Create a new form of PORT_NewArena that takes an argument that specifies
whether or not to create the lock. Then change the other functions to test
the lock pointer before attempting to (un)lock it.
I have analyzed Pallab's data, which I will attach shortly. The data is based
on 3.3.4. I am morphing the bug because it isn't as simple as modifying the use
In a single SSL connection, at startup, NSS (under the webserver) creates 56
uncontended locks. They are categorized as follows:
1) sslSocket (8)
When the socket is not shared among threads, these locks have no contention.
2) Arenas in read-only NSS structures (7)
2a) CERTCertificateList (1)
2b) SECKEYPrivateKey (5)
2c) SECKEYDBKey (1)
Unfortunately, the first two are public types. So while NSS seems to use them
in a read-only manner (once they are created, the arena is not used again), we
don't have control over how applications use them.
3) PKCS #11 session locks (18)
4) PKCS #11 object locks (2)
5) crypto context (PK11Context) locks (10)
The lock is for session multiplexing, in the case where NSS exhausts the
sessions available from a token, and thus has to switch the context between
6) ASN.1 decoding (5)
7) PKCS #5 decoding of algorithm ID (1)
Created attachment 120238 [details]
Pallab's data, call stacks of the creation of uncontended locks
Made a mistake. The first item should be:
1) sslSocket (13)
Another note: changes on the tip would reduce the number of PKCS#11 session
locks to 10, and the number of crypto context locks to 6. Additionally, this
data was obtained from an initial SSL connection, before the PKCS#11 session
freelist is built, so in the long run the average number of new sessions (and
hence new uncontended locks) in a given connections should be 4.
With those considerations, on the tip the total should be about 38.
Some general observations.
A) The ASN.1 locks are contained within SEC_ASN1DecodeItem. The arena is
created to hold the decoder state and is only used within this function, so I
believe this is an arena that doesn't need to be locked.
B) The problem with the read-only objects, as I mentioned, is that they are
public. For Stan, this should not be a problem.
C) The crypto context, PKCS#11 session, PKCS#11 object, and sslSocket locks are
all a result of NSS needing to provide thread safety, even when it may not be used.
I wonder if something can be done about (C) for Stan. Could the softoken act
like its not threadsafe, and rely on the above-PKCS#11 locking we use for such
tokens? Then, NSS could make decisions about when to lock. We could write code
in such a way that when we know sessions won't be exhausted and aren't being
shared among threads, then the associated contexts and sessions wouldn't need
locks, much the same way I proposed of having arenas without locking.
But maybe that would be too difficult, I haven't really thought it out.
What was the test environment? If you only have one thread per SSL socket, then
many locks will not be contended. But SSL is designed to allow "full duplex"
(a.k.a. "two way simultaneous") communication with handshakes going on during
the application data flow. An application that uses these features will need
those other locks.
Over the years, we've had MANY MANY cases where an application had more than one
thread using the same socket, despite many denials from the application
developers. Those situations led to crashes in NSS, and hence to blame of NSS
software. NSS put all those locks in to protect itself against applications
that sometimes use an SSL socket in more than one thread, even when they deny
Prior to the addition of those locks, many application developers who claimed
their applications never had multiple threads acting on the same socket had a
poor opinion of NSS, because they claimed it crashed a lot. After putting in
those locks, their opinion of NSS rose, because it stopped crashing. Of course,
the implication of this historic fact is that the applications did indeed have
multiple threads acting on the same sockets.
I'd rather have NSS have uncontended locks (they shouldn't cost much, right) and
have NSS be perceived as being relatively good quality than to remove those
locks and have NSS be perceived as lower quality.
If there was some way to be absolutely sure that an application really does not
ever use an SSL socket in more than one thread, then I'd agree that we could
eliminate those locks. But until then, ...
Added myself to the CC-list.
Created attachment 120740 [details] [diff] [review]
implement Nelson's suggestion from comment #1
Created attachment 120742 [details] [diff] [review]
use non-locking arena for single-part ASN.1 decodes
Created attachment 120745 [details] [diff] [review]
use non-locking arenas in a couple of places in the softoken
Just to clarify, the patch from comment #9 is for comment #2 case 6
the patch from comment #10 is for comment #2 case 2c
(there were actually two instances of those locks, I miscounted)
P2, at least. Ian, are you going to check this in on the trunk?
Yes. I've been meaning to redo the patches. For the first one (the code for
creating non-locking arenas), I plan to add the thread id of the thread in which
the arena was created as part of the arena, so that it can be asserted that the
non-locking arenas are only used in one thread (this was suggested by Nelson).
For the other two patches, I will add comments warning any and all readers of
the code that the arena must never escape from the function it was created in,
since it is not threadsafe.
Created attachment 123178 [details] [diff] [review]
revised, combined patch
Created attachment 123260 [details]
selfserv stress results
selfserv stress was a wash performance wise, but held up nicely.
I did fullhandshakes for several hours.
Comment on attachment 123178 [details] [diff] [review]
revised, combined patch
Ian, I apologize for taking so long to review this simple patch. Just this
I learned how to search bugzilla for outstanding review requests, and found
r=MisterSSL, with one requested change, and one suggested change.
>@@ -77,12 +77,16 @@
> #endif /* THREADMARK */
> /* The value of this magic must change each time PORTArenaPool changes. */
>-#define ARENAPOOL_MAGIC 0xB8AC9BDF
>+#define ARENAPOOL_MAGIC 0xB8AC9BE0
1. Please keep the magic number ODD. We wanna crash if this is ever used as a
pointer, and an odd value increases the likelihood of a crash.
2. The following lines of code appear in numerous places in your patch.
>+ if (!pool->lock)
>+ PORT_Assert(pool->creating_thread == PR_GetCurrentThread());
>+#endif /* DEBUG */
I suggest replacing those 4 lines with this line:
PORT_Assert(pool->lock || pool->creating_thread ==
Moved to NSS 3.10.
Ian, what's the status of this bug?
Taking Ian's perf bugs for 3.11.
Fixed in NSS 3.11 with introduction of SSL_NOLOCKS socket option.