Closed Bug 274518 Opened 21 years ago Closed 20 years ago

SSL close layer function is too CPU intensive

Categories

(NSS :: Libraries, enhancement, P1)

3.9.4
enhancement

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: julien.pierre, Assigned: nelson)

Details

Benchmarking of SSL on the server side shows that the SSL layer close function is much more expensive than the base NSPR layer close socket function. I used a dtrace script I wrote to collect data on a running selfserv process doing full handshakes, to measure where the CPU time is going, and I also decomposed the SSL close layer function into its subfunctions. Here is the interesting part of the results.

CPU time (us) - per CPU second:

  PR_PopIOLayer                    3826
  ssl3_FreeKeyPair                 4142
  SECKEY_DestroyPrivateKey         6230
  CERT_DestroyCertificateList      7100
  ssl_DestroyLocks                10150
  ssl_DestroySecurityInfo         16680
  ssl_DestroyGather               28216
  close                           33014
  pt_Close                        40390
  ssl3_DestroySSL3Info            70994
  ssl_DestroySocketContents      150391
  ssl_FreeSocket                 172364
  ssl_DefClose                   224974
  PR_Close                       255919
  Total                         1000665

Please ignore the fact that the total is not exactly 1 million us; a dtrace limitation prevents me from getting it exactly right (long story!), but the data is still accurate overall. Note that all the data above is inclusive.

The table shows that 25.5% of the selfserv process's CPU time went to PR_Close. Only 4% was in pt_Close, the NSPR layer function, and 3.3% in the close system call; the rest is all SSL-specific overhead. It would appear that we have a private copy of the entire server socket configuration in each connection socket, and destroying it is very expensive. Interestingly, the call that makes the copy of the config, SSL_ImportFD (not measured in the run above), is not nearly as expensive as the SSL close call, only about one fifth of its cost, so we should focus on the close first.

One way to get a big boost might be to refcount the model socket's configuration and refer to it from the connection sockets; a copy would only be needed if the socket config actually changes. Of course, this is only an expedient fix and still leaves overhead in many situations, though it may help specific benchmarks (SPECweb99) given the way they use NSS. The proper fix is probably to take a much deeper look at libssl and all the socket structures and come up with a more systematic approach to this problem.
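For context, here is a minimal sketch of the selfserv-style accept loop being measured, assuming NSS/NSPR are already initialized, the model socket is already configured, and serve_one_connection is a placeholder for the handshake and I/O (error handling omitted):

/* Each accepted connection gets its own SSL layer via SSL_ImportFD(),
 * which copies the model socket's configuration; PR_Close() then has
 * to tear all of that per-connection state down again, which is where
 * the 25.5% of CPU time shows up. */
#include "nspr.h"
#include "ssl.h"

static void serve_one_connection(PRFileDesc *ssl_fd);  /* handshake + I/O, not shown */

void accept_loop(PRFileDesc *listen_fd, PRFileDesc *model)
{
    for (;;) {
        PRNetAddr peer;
        PRFileDesc *tcp = PR_Accept(listen_fd, &peer, PR_INTERVAL_NO_TIMEOUT);
        if (!tcp)
            break;

        /* Copies the model's configuration (certs, keys, options, locks,
         * gather buffers, ...) into a fresh per-connection SSL socket. */
        PRFileDesc *ssl_fd = SSL_ImportFD(model, tcp);
        SSL_ResetHandshake(ssl_fd, PR_TRUE /* handshake as server */);

        serve_one_connection(ssl_fd);

        /* Destroys the per-connection copy: ssl_DefClose ->
         * ssl_FreeSocket -> ssl_DestroySocketContents and friends. */
        PR_Close(ssl_fd);
    }
}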
Sorry, the server was doing restart handshakes in the previous test. For full handshakes, the data looks different: the server spends more of its time in crypto and handles fewer sockets, and thus makes fewer calls to PR_Close. But overall, the ratios of time spent in the subfunctions within the close path look the same.

CPU time (us) - per CPU second:

  PR_PopIOLayer                    1220
  ssl3_FreeKeyPair                 1267
  SECKEY_DestroyPrivateKey         2076
  CERT_DestroyCertificateList      2284
  ssl_DestroyLocks                 3405
  ssl_DestroySecurityInfo          5623
  ssl_DestroyGather                7727
  close                           11215
  pt_Close                        13279
  ssl3_DestroySSL3Info            20652
  ssl_DestroySocketContents       42418
  ssl_FreeSocket                  49021
  ssl_DefClose                    65255
  PR_Close                        74620
  Total                         1000696
This is a critical bottleneck for SSL performance with NSS, especially with hardware tokens. Marking P1.
Severity: normal → enhancement
Priority: -- → P1
Target Milestone: --- → 3.10
The call to ssl3_FreeKeyPair destroys the step-down key pair, which gets copied into each connection socket. Once the fix for bug 148452 is in, this function will no longer be called. I found that lock creation/deletion accounts for about 0.7% of the time in a full-handshake test, so recycling SSL socket state objects would be useful to avoid constantly creating and freeing those locks. (For comparison, the cost of the lock operations themselves was 0.3% in the full-handshake test.) Most of the time spent in ssl3_DestroySSL3Info goes to destroying PK11Context objects. Again, we should recycle state objects so that we don't have to free and reallocate the PKCS#11 sessions associated with them.
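A purely illustrative sketch of the "recycle state objects" idea follows. None of these types or functions exist in libssl (a real implementation would hang the cache off the sslSocket path and also park the PK11Context objects); it only shows the free-list pattern that avoids re-creating locks and PKCS#11 sessions on every connection:

#include "nspr.h"

typedef struct RecycledState {
    struct RecycledState *next;
    PRLock *recvLock;        /* stands in for the sslSocket's various locks */
    PRLock *sendLock;
    /* a real version would also keep PK11Context objects alive here */
} RecycledState;

static RecycledState *freeList;      /* guarded by freeListLock */
static PRLock        *freeListLock;  /* assumed created once at library init */

RecycledState *acquire_state(void)
{
    RecycledState *s = NULL;
    PR_Lock(freeListLock);
    if (freeList) {                  /* reuse a retired state block if we can */
        s = freeList;
        freeList = s->next;
    }
    PR_Unlock(freeListLock);
    if (!s) {                        /* nothing to recycle: build a new one */
        s = PR_NEWZAP(RecycledState);
        if (s) {
            s->recvLock = PR_NewLock();
            s->sendLock = PR_NewLock();
        }
    }
    return s;
}

void release_state(RecycledState *s)
{
    /* At close time, park the object instead of PR_DestroyLock()ing its
     * locks and freeing the associated PKCS#11 sessions. */
    PR_Lock(freeListLock);
    s->next = freeList;
    freeList = s;
    PR_Unlock(freeListLock);
}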
Assignee: wtchang → nelson
QA Contact: bishakhabanerjee → jason.m.reid
Planned to be part of the performance work in 3.11
Target Milestone: 3.10 → 3.11
Per our meeting today, this was fixed by the bypass.
Status: NEW → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED