Open Bug 134122 Opened 24 years ago Updated 3 years ago

Need SSL model/listen socket validation

Categories

(NSS :: Libraries, enhancement, P2)


Tracking

(Not tracked)

People

(Reporter: julien.pierre, Unassigned)

Details

This was tested originally with NES, but can probably be reproduced easily with selfserv. The test is to run a server with client auth required on the first handshake, but use a cert database without any CAs trusted for client auth.

NES uses the model socket approach, where a model SSL socket is created with all the options, and then the connection is accepted with PR_Accept, and SSL_ImportFD is used to set the SSL settings of the model to the connection socket. The server actually comes up this way. When connecting with Communicator 4.79, the client displays a "network error". NES logs the following:

[28/Mar/2002:20:43:21] failure ( 4235): Error receiving connection (SSL_ERROR_NO_TRUSTED_SSL_CLIENT_CA - the CA that signed the client certificate is not trusted locally)

Ideally, the expected behavior would be that the server would detect this critical setup failure at startup time, when creating the model socket, rather than after having accepted a connection. There should be a validation call, like SECStatus SSL_ValidateModelSocket(PRFileDesc*) that would tell us whether the SSL socket has critical setup failures, so that the server could decide not to come up. In this case we should detect that the model socket is a server socket, and it has the client auth required bit, and there are no trusted CA certs available, so that no connections will be able to go through. There are probably many other tests that could be performed (such as not enabling any ciphers or protocols, perhaps specifying both client and server bits, etc).

Of course this would be an "instantaneous" check. For this test, we could only look at the available CA certs at the time of the call. The CA certs could live in a hardware token that could be removed later on. So the error in the SSL code path could still occur even if the server checked at startup. I will open a separate bug for that.
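The specific check described above can be sketched in isolation. This is only an illustration under stated assumptions: `ModelConfig`, `CheckResult` and `validate_model_config` are hypothetical names, and a plain struct stands in for whatever state the proposed SSL_ValidateModelSocket would read from the PRFileDesc. It is not NSS code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of a model socket's settings. A real validator
 * would read these from the SSL socket; a plain struct keeps the sketch
 * standalone. */
typedef struct {
    bool   is_server;            /* socket will handshake as a server */
    bool   require_client_auth;  /* client cert required on first handshake */
    size_t trusted_client_cas;   /* CAs trusted for client auth at call time */
} ModelConfig;

/* Hypothetical result codes for this sketch only. */
typedef enum {
    CHECK_OK = 0,
    CHECK_NO_TRUSTED_CLIENT_CA
} CheckResult;

/* Flags the one critical case from the report: a server socket that
 * requires client auth while no CA is trusted for client auth. */
CheckResult validate_model_config(const ModelConfig *cfg)
{
    if (cfg->is_server && cfg->require_client_auth &&
        cfg->trusted_client_cas == 0) {
        return CHECK_NO_TRUSTED_CLIENT_CA;
    }
    return CHECK_OK;
}
```

As noted above, such a check is instantaneous: it reflects only the CA count observed at the time of the call.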
Changed the QA contact to Bishakha.
QA Contact: sonja.mirtitsch → bishakhabanerjee
Priority: -- → P2
Target Milestone: --- → 3.6
This is really an enhancement request. Question: What function would you propose to detect this and fail?
Severity: normal → enhancement
Nelson, I'm not sure exactly what the prototype should be - I prefer to leave that to you, since you would be the one implementing it. In the original text of the bug, I had proposed SECStatus SSL_ValidateModelSocket(PRFileDesc*) . That would work for the specific case, but it probably should be made more flexible and allow for future options. Eg, perhaps it should take some usage to check the socket for, maybe as a bitflag. Though currently all usages are set in the socket as SSL options. Another improvement on that prototype would be a way to return the error or errors that make the SSL socket invalid : eg. all ciphers and/or protocols are disabled, client auth is required, but there are no trusted client CAs, etc.
Mass retarget all my old NSS bugs that were previously targeted at NSS versions that have now been released.
Target Milestone: 3.6 → 3.7
There is a discussion of stepdown handshakes in bug 148452. The purpose of that bug is to prevent the generation of RSA keys for step down when they aren't necessary or wanted. This validation API should check that no export cipher suites are enabled on the socket if the stepdown key is not generated.
Summary: Need SSL model socket validation → Need SSL model/listen socket validation
Moved to target milestone 3.8 because the original NSS 3.7 release has been renamed 3.8.
Target Milestone: 3.7 → 3.8
Remove target milestone of 3.8, since these bugs didn't get into that release.
Target Milestone: 3.8 → ---
QA Contact: bishakhabanerjee → jason.m.reid
Another category of sanity testing this function can do comes from the recent addition of ECC. A server can get configured with an ECC keypair+cert only but all the cipher suites which are usable with this setup disabled. There's no really good way to detect this configuration error at startup today (the mapping from cipher suites to key types is not entirely obvious and while such a map can be maintained in the server, it is suboptimal to have to do so).
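A minimal sketch of the mapping this comment refers to, assuming a hypothetical table that records which server key type each cipher suite needs. `SuiteEntry`, `KeyType` and `key_type_is_usable` are illustrative names, not NSS API; the point is only that the library, which already knows this mapping, could answer the question for the server.

```c
#include <stdbool.h>
#include <stddef.h>

/* Server key types for this sketch (hypothetical enum). */
typedef enum { KEY_RSA, KEY_EC } KeyType;

/* Hypothetical table entry: a cipher suite, the server key type it
 * needs, and whether it is currently enabled on the socket. */
typedef struct {
    const char *name;
    KeyType     server_key;
    bool        enabled;
} SuiteEntry;

/* Returns true if at least one enabled suite can use a key of the
 * given type. A server holding only that key type is misconfigured
 * when this returns false. */
bool key_type_is_usable(const SuiteEntry *suites, size_t n, KeyType key)
{
    for (size_t i = 0; i < n; i++) {
        if (suites[i].enabled && suites[i].server_key == key) {
            return true;
        }
    }
    return false;
}
```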
QA Contact: jason.m.reid → libraries
I reiterate my question from comment 2. This enhancement request needs more input from the requestors (web server folks) concerning what the API should look like. One could imagine a function that takes a single argument, a PRFileDesc * for an SSL socket (perhaps a model socket), and returns a SECStatus, saying (effectively) Yes or No, this socket is (or is not) configured properly. Is that really what you want? If not, please think about what else you want, and how those additional desires affect the function you're requesting.
Nelson,

This RFE is certainly very open-ended. It wasn't my intention when I worked on the web server and filed this bug to dictate the API design. But I can help with the requirements. I hope Jyri chimes in too.

I would say that there are at least a number of critical configuration errors that need to be considered that the web server will always want to know about. In particular, when the server socket is configured in such a way that no client, no matter its configuration, can ever connect to it, then this function should return PR_FALSE and return an error code explaining why. Such cases include, but I'm sure are not limited to:
- requiring client auth but having no trusted root certs for client auth
- not having a single intersection of protocols / cipher suites / server certs / private keys that are compatible with each other

Then, there are some other less broken but still bad cases. Here are some examples:
- turning on SSL2 but disabling SSL2 client hello. When doing this on the client socket, it disables SSL2.
- requesting client auth but having no trusted root certs for client auth. This means client auth will never happen; but non-client auth handshakes can still happen. The server administrator probably cares about that (especially as the web server uses only the request feature, not require, so it can send back an error page to the browser).
- SSL2 is turned on, but no SSL2 cipher suites are on; or vice-versa. But there are other valid SSL3 or TLS combinations turned on. This means libSSL automatically disables SSL2 today. Do we want to error because the SSL2 suites got disabled? I think that the function should provide a way to check that.
- Some ECC cipher suites are on, but without a compatible ECC cert. Other RSA suites and an RSA cert are on. So ECC gets disabled. Again I think the server administrator needs to know that he made some configuration changes that don't make sense.

We should probably distinguish between critical errors and the non-critical ones (things getting turned off). Maybe the checking function could take an integer to pass the level of checking desired. Level 1 would be for the critical errors only, level 5 nitpicking mode - everything gets treated as error. I hope this helps.
Julien has identified a small number of discrete error cases, that (I think) he wants to have separately reported to the caller. There are numerous ways we could do that, some of which are:
A) Return a pointer to a PR_Malloc'ed array of integers, each containing an error code. The value zero would indicate the end of the list. The caller would be responsible to call PR_Free on that list when done with it. Might require defining a bunch of new error codes.
B) Like A, but caller passes in array of ints and maximum number of ints in array. Function returns count of error codes returned in array, zero being no error.
C) Return a bit vector of configuration errors (ICK).
D) Separate functions to check for each of those error cases, each of which checks for a single error and returns a boolean result. This idea requires no new error codes, and avoids arrays of error codes, by not using error codes.

Today:
1) It is not an error to enable SSL2 but have no SSL2 cipher suites enabled.
2) It is not an error to disable SSL2 but have some SSL2 cipher suites enabled.
The consequence of either of these is that SSL2 and all its cipher suites are disabled for the handshake.
3) It is not an error to enable SSL3 and/or TLS, but have no SSL3/TLS cipher suites enabled.
4) It is not an error to disable SSL3 and TLS, but have some SSL3/TLS cipher suites enabled.
The consequence of either of these is that SSL3 and TLS and all their cipher suites are disabled for the handshake.

I think the proposal in comment 10 would report these as errors. Is that what's wanted? Should there be a flag that says whether to treat those as errors or not? Should the check function always detect these errors (when true), and the caller can simply ignore any of those error codes if he doesn't care about them?

What about the multiple handshake situation? Say, for example, that the server does a handshake without requesting client auth in the first handshake, and then a second handshake in which it does request client auth. Would this function get called twice? What if the function says all is well for the first handshake, and says there is an error for the second one? Has that solved any problem for the server? Would we suggest that the server test out a socket for the client auth case, even when it does not (yet) intend to request client auth, just to see if it would work, hypothetically? (I have LOTS more questions like these about ECC because of the curves, but we need to walk before we can run.)

Lots of other potential errors:
a) your cert has expired.
b) you have an incomplete cert chain (which is not an error today).
c) your cert is not valid for the necessary key usages and/or extended KUs.
d) your cert doesn't bear the expected host name (how does it know what host name is expected?).
e) your cert is revoked (do we need to be able to be an OCSP client for this?).
f) you have enabled cipher suites for which you do not have the necessary cert (public key) type.
g) you have enabled cipher suites for which your local PKCS#11 modules do not have access to the necessary "mechanisms".
[Today these last two just silently disable those cipher suites, and the only error is if no cipher suites are left enabled.]

Some day:
h) you have enabled some ECC curves and disabled others, but your EC cert's curve is one of the disabled ones.
i) you have enabled some ECC curves and disabled others, and SOME of your EC certs have disabled curves, but others have enabled curves.
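Option B above (caller-supplied array plus a maximum, with the count returned) might look roughly like the following for cases 1) to 4). Everything here is an assumption made for illustration: the `ProtoConfig` struct, the `CFG_*` codes and `check_protocol_config` do not exist in NSS, and this sketch silently drops codes that do not fit in the array, which the comment does not specify.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical snapshot of protocol and cipher suite settings. */
typedef struct {
    bool ssl2_enabled;
    bool ssl3_tls_enabled;
    int  ssl2_suites;      /* number of SSL2 cipher suites enabled */
    int  ssl3_tls_suites;  /* number of SSL3/TLS cipher suites enabled */
} ProtoConfig;

/* Hypothetical codes, one per case 1) to 4) listed above. */
enum {
    CFG_SSL2_ON_NO_SUITES = 1,
    CFG_SSL2_OFF_SUITES_ON,
    CFG_SSL3_TLS_ON_NO_SUITES,
    CFG_SSL3_TLS_OFF_SUITES_ON
};

/* Append a code if there is room; otherwise drop it. */
static size_t push_code(int *errs, size_t n, size_t max, int code)
{
    if (n < max) {
        errs[n++] = code;
    }
    return n;
}

/* Fills errs with up to max codes and returns how many were written.
 * Zero means none of the four cases applies. */
size_t check_protocol_config(const ProtoConfig *cfg, int *errs, size_t max)
{
    size_t n = 0;
    if (cfg->ssl2_enabled && cfg->ssl2_suites == 0)
        n = push_code(errs, n, max, CFG_SSL2_ON_NO_SUITES);
    if (!cfg->ssl2_enabled && cfg->ssl2_suites > 0)
        n = push_code(errs, n, max, CFG_SSL2_OFF_SUITES_ON);
    if (cfg->ssl3_tls_enabled && cfg->ssl3_tls_suites == 0)
        n = push_code(errs, n, max, CFG_SSL3_TLS_ON_NO_SUITES);
    if (!cfg->ssl3_tls_enabled && cfg->ssl3_tls_suites > 0)
        n = push_code(errs, n, max, CFG_SSL3_TLS_OFF_SUITES_ON);
    return n;
}
```

With this shape the caller can ignore any code it does not care about, which is one possible answer to the question of whether these cases should always be detected.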
Target Milestone: --- → 3.11.1
In web server today the code checks during configuration (when server config file is read in, parsed, and the relevant options contained therein are passed to NSS) a number of error or warning scenarios:
- server cert validity vs. time now
- ssl socket but all of (ssl2,ssl3,tls) are disabled
- (ssl2,ssl3,tls) enabled but all corresponding cipher suites disabled
- no server keypair+cert present
- all cipher suites usable with given server key/cert type are disabled (e.g. only 1 ECC keypair+cert, but all ECC-related suites disabled)

(The primary interest from web server point of view in this validator function is the last item on the above list. The logic to do that is best contained in the library which is most familiar with the cipher suite details, i.e. NSS. If this RFE solved only that case, web server could be happy.)

However in the interest of a generic solution it certainly makes sense to provide a general solution for the other cases listed above (as well as in prior comments). A call that simply returns true/false would not do because
- the server needs to log descriptive messages on the condition(s) found
- the server, or its configuration, may treat some conditions as fatal and others as warnings (though as Julien noted, some conditions are inevitably fatal)

Passing in a threshold (comment #10) is not flexible enough - it's easy to envision that it'll be hard to agree among all products which conditions belong in which bucket. So, the function should not make hardcoded judgment about the conditions it finds. It should return a set of error codes that describe its finding(s). The caller can decide what to do for each. I say "function" but plural functions would work as well (option D above).

> What about the multiple handshake situation?

I don't follow the use case here, can you say more? I envision web server calling this function only once, during server startup, when the config data is processed.
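The division of labor argued for here, where the validator only reports findings and each product decides which are fatal, could be sketched on the caller side as below. The `FIND_*` codes, `Action`, `example_policy` and `should_refuse_startup` are all hypothetical names chosen for illustration; the particular fatal/warning split is just one product's possible choice.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical finding codes a validator might return. */
enum {
    FIND_CERT_EXPIRED = 1,
    FIND_ALL_PROTOCOLS_OFF,
    FIND_NO_USABLE_SUITE_FOR_KEY
};

typedef enum { ACTION_IGNORE, ACTION_WARN, ACTION_FATAL } Action;

/* Policy callback: maps a finding code to what the caller does with it. */
typedef Action (*PolicyFn)(int code);

/* One product's policy; another product could map codes differently. */
Action example_policy(int code)
{
    switch (code) {
    case FIND_ALL_PROTOCOLS_OFF:
    case FIND_NO_USABLE_SUITE_FOR_KEY:
        return ACTION_FATAL;
    case FIND_CERT_EXPIRED:
        return ACTION_WARN;
    default:
        return ACTION_IGNORE;
    }
}

/* Returns true if any finding is fatal under the policy, and stores
 * the number of warning-level findings in *warnings. */
bool should_refuse_startup(const int *findings, size_t n,
                           PolicyFn policy, size_t *warnings)
{
    bool fatal = false;
    *warnings = 0;
    for (size_t i = 0; i < n; i++) {
        Action a = policy(findings[i]);
        if (a == ACTION_FATAL) {
            fatal = true;
        } else if (a == ACTION_WARN) {
            (*warnings)++;
        }
    }
    return fatal;
}
```

Keeping the policy in the caller avoids having to agree across products on which conditions belong in which bucket.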
Nelson,

Re: comment 11, I would suggest that cert expiration check/cert verification be left out of this function; or at least not be made a critical error. This is because we need to keep certain hacks working where the server is unable to verify its own certificate. See bug 321765. Let the server verify its cert on its own if it wants to, using the already-defined public APIs. IMO, this function should solve problems that aren't currently easily solved without knowledge of the internal limitations of NSS.

Let me add one to the list of error/warnings to check: If some DHE cipher suites are enabled on the server socket, which NSS supports on the client-side only, there should be a way for this new function to notify the application of the problem. Currently the cipher suite just gets silently disabled. I believe some products have gone as far as putting checkboxes for them in their server products even though they only work for client-side.

Re: your other questions. My main point was to separate errors into critical and non-critical/warning groups. I don't have a strong opinion on what the API looks like and how it reports the errors/warnings. I suggested using a threshold as input, but Jyri rejected that in comment 12. In order of preference, I like your B proposal best, but I would add that the function should return an error if all the errors don't fit in the array. And the content of each element could be more detailed than just an error code, if needed. Eg. a few bits to indicate severity, plus an error code, plus maybe another ptr field specific to each error type. If you really want to go that far, of course.

Jyri,

Re: comment 12, When the server does client auth in a second handshake, it changes the socket's configuration by setting one of the client auth option bits. I think Nelson is asking whether the server would then revalidate the config again. My guess is no, unless the validation is known to be very cheap. Checking for trusted CAs will be an expensive operation, needing to enumerate all certificates. So, the main thing that we would want to check after such change will be impractical to do on every connection. If the server could predict when its config will trigger a second handshake for client auth based on its configuration, it could create a second model socket and set the client auth option, and validate that model at startup. That 2nd model would never be used to actually import any connection socket. But I think that kind of logic will be hard to implement in the server.
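The variation on proposal B suggested in this comment (richer elements, plus an error when the findings do not all fit) could be sketched as follows. The `Finding` layout, the `Severity` enum and the `finding_list_*` helpers are assumptions for illustration only; using -1 as the overflow signal is likewise an arbitrary choice for this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { SEV_WARNING, SEV_CRITICAL } Severity;

/* Hypothetical element: severity, error code, and an optional pointer
 * whose meaning depends on the code. */
typedef struct {
    Severity    severity;
    int         code;
    const void *detail;
} Finding;

/* Caller-owned storage plus bookkeeping. */
typedef struct {
    Finding *items;
    size_t   capacity;
    size_t   count;
    bool     overflowed;
} FindingList;

void finding_list_init(FindingList *l, Finding *storage, size_t capacity)
{
    l->items = storage;
    l->capacity = capacity;
    l->count = 0;
    l->overflowed = false;
}

/* Records a finding; returns false and marks overflow if the array is full. */
bool finding_list_add(FindingList *l, Severity s, int code, const void *detail)
{
    if (l->count == l->capacity) {
        l->overflowed = true;
        return false;
    }
    Finding *f = &l->items[l->count++];
    f->severity = s;
    f->code = code;
    f->detail = detail;
    return true;
}

/* Number of findings recorded, or -1 if some did not fit. */
long finding_list_result(const FindingList *l)
{
    return l->overflowed ? -1 : (long)l->count;
}
```

On overflow the caller could retry with a larger array, since the findings that did fit are still recorded.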
From ECC interop discussions, another validation scenario came up: TLS_ECDH_RSA_* cipher suites require server ECC keypair, with the server cert signed with CA's RSA key. If server cert is not signed with RSA key, these cipher suites can't be used (A subcase of this is that self-signed won't work...) The validator can check for invalid conditions related to this, e.g.:
- All ciphers disabled except TLS_ECDH_RSA_*
- Applicable server cert(s) not signed with RSA key.

Perhaps there are other subtle cases like this wrt ECC suite/curve/key/cert combinations.
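The TLS_ECDH_RSA_* condition described above reduces to a small predicate over the server certs. This is a sketch with hypothetical types: `ServerCert`, `Alg` and both function names are invented here, and a real check would derive the key and signature algorithms from the actual certificate.

```c
#include <stdbool.h>
#include <stddef.h>

/* Algorithm family of a key or of the key that signed a cert. */
typedef enum { ALG_RSA, ALG_EC } Alg;

/* Hypothetical summary of a server cert: the type of its own key and
 * the type of the key that signed it. */
typedef struct {
    Alg key_alg;
    Alg signer_alg;
} ServerCert;

/* TLS_ECDH_RSA_* needs an EC server key in a cert signed with RSA.
 * A self-signed EC cert fails because its signer is the EC key itself. */
bool cert_supports_ecdh_rsa(const ServerCert *cert)
{
    return cert->key_alg == ALG_EC && cert->signer_alg == ALG_RSA;
}

/* Returns true if any configured cert can serve TLS_ECDH_RSA_* suites. */
bool any_cert_supports_ecdh_rsa(const ServerCert *certs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (cert_supports_ecdh_rsa(&certs[i])) {
            return true;
        }
    }
    return false;
}
```

A validator could report a finding when only TLS_ECDH_RSA_* suites are enabled and the second function returns false.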
Re comment 13: You're right, the web server wouldn't revalidate config at request time, it'd be too expensive. Any amount of validation (within some reason...) can be done at startup though, even if some of it is unnecessary. I could imagine the server doing as you suggest to validate a client auth config at startup and log messages if the config doesn't make sense. Even if it turns out client auth never gets triggered, it is worth it to burn cycles at startup to generate informative messages/warnings. In cases where the server knows it will be used it can take more drastic action (like refusing to start) but it won't always know (since client auth could be triggered dynamically, for example from servlets; the servlet code might not even be installed at the time of startup).
remove target milestone, since the target was missed.
Target Milestone: 3.11.1 → ---
Assignee: nelson → nobody
Severity: normal → S3