Open
Bug 951781
Opened 11 years ago
Updated 6 months ago
libssl accesses the NSS certificate database during handshake, causing disk I/O to block network activity
Categories
(NSS :: Libraries, defect, P3)
Tracking
(Not tracked)
NEW
People
(Reporter: briansmith, Unassigned)
Details
(Keywords: main-thread-io, perf, Whiteboard: [snappy])
Gecko does all its network I/O on a thread we call the "socket transport thread." This is the thread on which the SSL handshake is executed. In ssl3_HandleCertificate and maybe other places, CERT_NewTempCertificate and maybe other functions that access the NSS certificate database are called. My understanding is that this can cause disk I/O. We are not supposed to have disk I/O on the socket transport thread.

We can probably solve this problem by making a new version of the async certificate verification API that passes the application SECItems instead of CERTCertificate objects, and by making the DER -> CERTCertificate parsing lazy in other functions like SSL_PeerCertificate. Then, if an application uses the new verification API and avoids functions like SSL_PeerCertificate, it will avoid the disk I/O on the networking thread.
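To illustrate the lazy-parsing idea, here is a minimal sketch in plain C. It is not NSS code: LazyPeerCert, DecodedCert, expensive_decode and the lazy_cert_* functions are hypothetical names invented for this example, and expensive_decode only stands in for the step where a call such as CERT_NewTempCertificate would happen. The handshake path keeps just the raw DER bytes; the costly decode runs only if some caller later asks for the decoded object.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a decoded certificate; NOT NSS's CERTCertificate. */
typedef struct {
    size_t der_len;
} DecodedCert;

/* Hypothetical holder: raw DER is stored eagerly, decoding is deferred. */
typedef struct {
    unsigned char *der;    /* private copy of the DER bytes */
    size_t der_len;
    DecodedCert *decoded;  /* NULL until first requested */
    int decode_count;      /* how many times the expensive step ran */
} LazyPeerCert;

/* Cheap path, suitable for the handshake: copy bytes, touch nothing else. */
static int lazy_cert_init(LazyPeerCert *lc, const unsigned char *der, size_t len)
{
    assert(lc != NULL && der != NULL && len > 0);
    lc->der = malloc(len);
    if (lc->der == NULL) {
        return -1;
    }
    memcpy(lc->der, der, len);
    lc->der_len = len;
    lc->decoded = NULL;
    lc->decode_count = 0;
    return 0;
}

/* Stand-in for the expensive step (assumed to be where database access
 * could occur in a real implementation). */
static DecodedCert *expensive_decode(const unsigned char *der, size_t len)
{
    DecodedCert *c = malloc(sizeof *c);
    (void)der;
    if (c != NULL) {
        c->der_len = len;
    }
    return c;
}

/* Decode on first use and reuse the result afterwards.
 * Not thread-safe; a real version would need locking. */
static const DecodedCert *lazy_cert_get(LazyPeerCert *lc)
{
    if (lc->decoded == NULL) {
        lc->decoded = expensive_decode(lc->der, lc->der_len);
        if (lc->decoded != NULL) {
            lc->decode_count++;
        }
    }
    return lc->decoded;
}

static void lazy_cert_destroy(LazyPeerCert *lc)
{
    free(lc->decoded);
    free(lc->der);
    lc->decoded = NULL;
    lc->der = NULL;
    lc->der_len = 0;
}
```

In this sketch, an application that only ever reads the stored DER bytes never triggers expensive_decode, which is the property the proposal above is after.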
Comment 1•11 years ago
Brian, this would only cause disk I/O in the event of PKCS#11 modules being loaded, and even then, only if they've fired certain events. NSS maintains an in-memory cache of the contents - both of the user DB and of the modules' contents - and will only re-scan if the module has gone through a state change or fired an event.

Indeed, you're at MUCH greater risk from calling *any* NSS function on the socket transport thread, because of the need to go through PKCS#11 modules. If a PKCS#11 module is slow, blocks, or requires physical (e.g. serial or USB) I/O, you'll end up blocked on it. Even if you move the handshake off to a worker, ANY NSS function is at risk of triggering this.

In Chromium's case, we've had to move the entire SSL layer off the IO thread (which handles socket I/O and local IPC) onto a dedicated thread because of this on Linux and ChromeOS. This is because Linux users may have PKCS#11 modules, and on ChromeOS, we have a PKCS#11 module that interacts with the TPM. There are too many code paths in NSS that end up getting blocked on the PK11Slot/Module locks - even if they're doing nothing with the TPM - that we had to move it wholesale.
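The "re-scan only after a state change or event" behaviour can be pictured with a small generation-counter cache. This is only an illustrative sketch in plain C, not how NSS implements its cache: FakeModule, ModuleCache, slow_scan and cached_cert_count are hypothetical names, and the event counter is an assumed mechanism for detecting that a module has changed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a module/token; NOT an NSS or PKCS#11 type. */
typedef struct {
    unsigned long event_generation; /* bumped on each state change or event */
    int cert_count;                 /* what a slow scan would find */
    int scans;                      /* how many slow scans were performed */
} FakeModule;

/* Hypothetical in-memory cache of the module's contents. */
typedef struct {
    int valid;                      /* 0 until the first scan */
    unsigned long seen_generation;  /* generation at the time of last scan */
    int cert_count;                 /* cached scan result */
} ModuleCache;

/* Stand-in for the slow path (disk, serial, or USB I/O in a real module). */
static int slow_scan(FakeModule *m)
{
    assert(m != NULL);
    m->scans++;
    return m->cert_count;
}

/* Serve from memory unless the module reports a newer generation. */
static int cached_cert_count(ModuleCache *c, FakeModule *m)
{
    if (!c->valid || c->seen_generation != m->event_generation) {
        c->cert_count = slow_scan(m);
        c->seen_generation = m->event_generation;
        c->valid = 1;
    }
    return c->cert_count;
}
```

Note that the sketch covers only the caching point. It says nothing about the larger risk described above, where a call blocks on a slow module or on a contended lock regardless of whether the cache is warm.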
Updated•2 years ago
Severity: normal → S3
Updated•6 months ago
Severity: S3 → S4
Priority: -- → P3